Generating a unique primary key - c#

I have a SQL Server database that will contain many tables that all connect, each with a primary key. I have a Dictionary that keeps track of what the primary key field is for each table. My task is to extract data every day from attribute-centric XML files and insert it into a master database. Each XML file has the same schema. I'm doing this by using an XmlReader and importing the data into a DataSet.
I can't use an AutoNumber for the keys. Let's say yesterday's XML file produced a DataTable similar to the following, and it was imported into a database
-------------------------------------
| Key | Column1 | Column2 | Column3 |
|-----------------------------------|
| 0 | dsfsfsd | sdfsrer | sdfsfsf |
|-----------------------------------|
| 1 | dertert | qweqweq | xczxsdf |
|-----------------------------------|
| 2 | prwersd | xzcsdfw | qwefkgs |
-------------------------------------
If today's XML file produces the following DataTable
-------------------------------------
| Key | Column1 | Column2 | Column3 |
|-----------------------------------|
| 0 | sesdfsd | hjghjgh | edrgffb |
|-----------------------------------|
| 1 | wrwerwr | zxcxfsd | pijghjh |
|-----------------------------------|
| 2 | vcbcvbv | vbnvnbn | bnvfgnf |
-------------------------------------
Then, when I go to import the new data into the database using SqlBulkCopy, there will be duplicate keys. My solution is to use DateTime.Now.Ticks to generate unique keys. In theory, this should always produce a unique value.
However, for some reason DateTime.Now.Ticks is not unique. For example, 5 records in a row might all have the key 635387859864435908, and the next 7 records might have the key 635387859864592164, even though I generate the value at a different time for each record. I suspect the cause is that my code calls DateTime.Now.Ticks several times before the clock value actually changes.
Can anyone else think of a better way to generate keys?

DateTime.Now has limited resolution (typically around 10-15 ms on Windows), so calls made within the same clock tick return identical Ticks values. We do something similar to this, and there are two options that we use:
Keep a list of the numbers you've already used on that server and increment when you find that a number has already been used
Convert the field to a string and append a GUID or some other random identifier on the end of it. A GUID can be created with System.Guid.NewGuid().ToString();
Obviously neither of these approaches reduces the risk of collision to zero, but they help to reduce it.
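For the GUID route, here is a minimal sketch of what the target table could look like; the table and column names (dbo.ImportedRows, RowKey) are made up for illustration and are not from the question:
-- RowKey can be filled client-side with Guid.NewGuid() before the SqlBulkCopy,
-- or server-side by the NEWID() default if the key column is left out of the bulk copy mapping.
CREATE TABLE dbo.ImportedRows
(
    RowKey  UNIQUEIDENTIFIER NOT NULL
            CONSTRAINT DF_ImportedRows_RowKey DEFAULT NEWID()
            CONSTRAINT PK_ImportedRows PRIMARY KEY,
    Column1 NVARCHAR(50) NULL,
    Column2 NVARCHAR(50) NULL,
    Column3 NVARCHAR(50) NULL
);
One trade-off to be aware of: random GUIDs fragment a clustered primary key, so NEWSEQUENTIALID() (or keeping the GUID key non-clustered) is the usual mitigation.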

If you have a huge amount of data and you need a unique key for each row, just use a GUID.

You could do something like the following to get a unique id (SQL Fiddle):
SELECT
    CONCAT(YEAR(GETDATE()),
           DATEDIFF(DAY, STR(YEAR(GETDATE()), 4) + '0101', GETDATE()) + 1,
           ROW_NUMBER() OVER (ORDER BY id DESC)) AS UniqueID
FROM supportContacts s
This would work if you only run the query once per day. If you ran it more than once per day you would need to grab the seconds or something else (SQL Fiddle):
SELECT CONCAT(CurrYear, CurrJulian, CurrSeconds, Row) AS UniqueID
FROM
(
    SELECT
        YEAR(GETDATE()) AS CurrYear,
        DATEDIFF(DAY, STR(YEAR(GETDATE()), 4) + '0101', GETDATE()) + 1 AS CurrJulian,
        ROW_NUMBER() OVER (ORDER BY id DESC) AS Row,
        DATEDIFF(SECOND, LEFT(CONVERT(VARCHAR(20), GETDATE(), 126), 10), GETDATE()) AS CurrSeconds
    FROM supportContacts s
) AS m

Related

Multiple incremental series with different prefix in SQL Server?

Here is the scenario:
Config Table:
+--------+-----------+-------+
| Prefix | Separator | Seed |
+--------+-----------+-------+
| A | # | 10000 |
+--------+-----------+-------+
Transaction Table:
+----+----------+------+
| Id | SerialNo | Col3 |
+----+----------+------+
| 1 | A#10000 | |
| 2 | A#10001 | |
+----+----------+------+
The Transaction table has a SerialNo column that holds a sequential number generated based on the configuration table. The configuration table determines the prefix, the separator, and the seed value of the serial number.
In the above example the serial number would start at A#10000 and increment by 1.
But if, after a few months, someone updates the configuration table to have
+--------+-----------+-------+
| Prefix | Separator | Seed |
+--------+-----------+-------+
| B | # | 10000 |
+--------+-----------+-------+
Then the Transaction table is supposed to look something like this:
+----+----------+------+
| Id | SerialNo | Col3 |
+----+----------+------+
| 1 | A#13000 | |
| 2 | B#10001 | |
+----+----------+------+
However, there must be no duplicate serial numbers in the Transaction table at any given point in time.
If someone sets the prefix back to A and the seed to 10000, then the next serial number should not be A#10000 because it already exists; it should be A#13001.
One could simply write a select query with MAX() and CONCAT(), but that could cause concurrency issues. I don't want duplicate serial numbers, and I also want this to be as performance-friendly as possible.
Another solution I could come up with is a Windows service that keeps running and watching the table. Records get inserted with a null serial number and the service fills it in afterwards. That way there are no concurrency issues, but I am not sure how reliable it is, and there will be delays.
There will only be one entry in the configuration table at any given point in time.
You can solve the seed value problem quite easily in SQL Server. When someone updates the seed value back to 10000, do it via a stored procedure. The stored procedure determines what the actual next available value should be, because clearly 10000 could be the wrong value, and then executes DBCC CHECKIDENT with the correct "new_reseed_value". When new records are inserted after that, the server will handle the values correctly again.
Please look at this link for usage of the DBCC CHECKIDENT command: SQL Server DBCC CHECKIDENT
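A minimal sketch of that idea, assuming the numeric part of SerialNo is fed by an identity column on the Transaction table; the object names below (dbo.TransactionTable, dbo.ResetSerialSeed) are hypothetical:
CREATE PROCEDURE dbo.ResetSerialSeed
    @RequestedSeed INT
AS
BEGIN
    DECLARE @NextSeed INT;

    -- Find the highest numeric part already issued (the part after the separator),
    -- so a requested seed can never reintroduce an existing serial number.
    SELECT @NextSeed = ISNULL(MAX(CAST(SUBSTRING(SerialNo, CHARINDEX('#', SerialNo) + 1, 20) AS INT)), 0)
    FROM dbo.TransactionTable;

    IF @RequestedSeed > @NextSeed
        SET @NextSeed = @RequestedSeed;

    -- Reseed the identity that feeds the numeric part; the next insert gets @NextSeed + 1.
    DBCC CHECKIDENT ('dbo.TransactionTable', RESEED, @NextSeed);
END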

sql to check if only entry with that value

I have a table looking as follows: (with a few more irrelevant rows)
| user_id | user_name | fk_role_id | password |
|---------|-----------|------------|----------|
| 1       | us1       | 1          | 1234     |
| 2       | us2       | 2          | 1234     |
| 3       | us3       | 2          | 1234     |
| 4       | us4       | 4          | 1234     |
I need to create an SQL statement that counts the number of entries with an fk_role_id of 1.
If there is more than one user with that fk_role_id, it can delete the user, but if there is only one user with that fk_role_id it should fail, or give an error message stating that the user is the last one with that fk_role_id and therefore can't be deleted.
So far I have not found anything close to this that works, so hopefully someone here can help.
SQL Server (2008 onwards):
with CTE as
(
    select MT.*, row_number() over (partition by fk_role_id order by user_id) as rn
    from MyTable MT
)
delete
from CTE
where rn > 1
Please try this statement:
DELETE TOP (1) FROM table_name
WHERE fk_role_id IN (SELECT fk_role_id
                     FROM table_name
                     GROUP BY fk_role_id
                     HAVING COUNT(fk_role_id) > 1)
Each time the statement is executed it deletes one row, until only one record with that fk_role_id is left. For the last record the HAVING condition fails, so that row will not be deleted from your table.
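If you specifically want the statement to fail when the targeted user is the last one with that fk_role_id, a minimal sketch of the guard could look like this (assuming the table is called users; adjust the names to your schema):
IF (SELECT COUNT(*) FROM users WHERE fk_role_id = 1) > 1
    DELETE TOP (1) FROM users WHERE fk_role_id = 1;  -- safe: at least one row remains
ELSE
    RAISERROR('Cannot delete: this is the last user with fk_role_id = 1.', 16, 1);
Wrapping the check and the delete in one transaction (or deleting a specific user_id instead of TOP (1)) would make it safer under concurrent deletes.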

Oracle update statement on overly indexed table

I am struggling with a simple update statement in Oracle. The update itself has not changed in forever but the table has grown massively and the performance is now unacceptable.
Here is the low down:
70 columns
27 indexes (which I am not under any circumstances allowed to reduce)
50M rows
Update statement is just hitting one table.
Update statement:
update TABLE_NAME
set NAME = 'User input string',
    NO = NO,
    PLANNED_START_DATE = TO_DATE('3/2/2016', 'dd/mm/yyyy'),
    PLANNED_END_DATE = TO_DATE('3/2/2016', 'dd/mm/yyyy')
WHERE ID = 999999 /* pk on the table */
Execution Plan:
==================
Plan hash value: 2165476569
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 1 | 245 | 1 (0)| 00:00:01 |
| 1 | UPDATE | TABLE_NAME | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| TABLE_NAME | 1 | 245 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PK_INDEX | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("ID"=35133238)
==================================================
The update statement originates in a C# application but I am free to change the statement there.
Select statements still perform well thanks to all the indexes but as I see it that is exactly what is wrong with the update - it has to go update all the indexes.
We are licensed for partitioning but this table is NOT partitioned.
How can I improve the performance of this update statement without altering the table or its indexes?
Are you sure that column ID is the primary key? And is the primary key backed by a unique index? In that case the CBO would use an INDEX UNIQUE SCAN. In your plan the CBO expected 188 rows using the filter ID (primary key) = value and used an INDEX RANGE SCAN.

How will I create multiple type IDs in one column (database design)?

I am designing a SQL Server database and I have to reference multiple FKs in one column.
So I have these tables, and I create a menu in one table:
Table 1          Table 2         Table 3
| Pages      |   | Jobs      |   | News       |
|------------|   |-----------|   |------------|
| Pageid     |   | Jobid     |   | NewsId     |
| PageName   |   | JobName   |   | NewsTitle  |
| MenuName   |   | MenuName  |   | MenuName   |
My aim is to reference these tables in one column.
I have a table for this scenario:
| MenuGroup
|------------
| menuGroupId
| MenuName
| RecordeId
So how will I achieve a normalized database design?
Sol 1 (fixed number of columns):
This is the most standard and normalized solution. You can create a new table with nullable columns, as suggested by @Tim (a DDL sketch follows after the column list below).
| JoiningTable
|------------
| Id
| PageId
| JobId
| NewsId
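A minimal DDL sketch of this layout, assuming the Pages, Jobs and News tables exist with the key columns shown in the question (the constraint names are made up):
CREATE TABLE JoiningTable
(
    Id      INT IDENTITY(1,1) PRIMARY KEY,
    PageId  INT NULL REFERENCES Pages(Pageid),
    JobId   INT NULL REFERENCES Jobs(Jobid),
    NewsId  INT NULL REFERENCES News(NewsId),
    -- Optional guard: exactly one of the three references must be set.
    CONSTRAINT CK_JoiningTable_OneRef CHECK
    (
        (CASE WHEN PageId IS NULL THEN 0 ELSE 1 END
       + CASE WHEN JobId  IS NULL THEN 0 ELSE 1 END
       + CASE WHEN NewsId IS NULL THEN 0 ELSE 1 END) = 1
    )
);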
Sol 2 (dynamic number of columns):
I do not consider this a good approach, since referential integrity is lost here, but for a dynamic number of columns I don't have any other solution except this one.
Type:
|------------
| TypeId
| Name
JoiningTable
|------------
| Id
| JoiningId
| TypeId (news,job,pages etc etc)
You can collapse these two tables into one by replacing the TypeId with a type field directly in JoiningTable.
NoSQL may also be a solution, but I have no experience working with NoSQL, so I cannot recommend anything about it.
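For completeness, a sketch of this second layout (again, the names are illustrative; note there is deliberately no foreign key on JoiningId, which is exactly the referential integrity that gets lost):
CREATE TABLE [Type]
(
    TypeId INT IDENTITY(1,1) PRIMARY KEY,
    Name   NVARCHAR(50) NOT NULL        -- 'Page', 'Job', 'News', ...
);

CREATE TABLE JoiningTable
(
    Id        INT IDENTITY(1,1) PRIMARY KEY,
    JoiningId INT NOT NULL,              -- Pageid, Jobid or NewsId, depending on TypeId
    TypeId    INT NOT NULL REFERENCES [Type](TypeId)
);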
I suggest you create another table that has the three tables' IDs as foreign keys, then use the primary key of this new table in the MenuGroup table; you can use LEFT JOIN to reach the individual tables through the new table.
Having multiple FKs in a single table isn't bad; try using the MenuName column as the FK so you won't have to create an extra field in your database.

Saving an array of strings to an SQL column

I want to create a "sessions" table in an SQL database for a school project. Each session should have:
Session ID
Lecturer name
Time and date
Module name
Course name
List of student IDs
List of student statuses (present, absent, late)
How can I represent such a thing?
Is it better to create ONE table to represent all sessions, where each session would be one row, and have an array of strings in each column that represents the names, IDs, and statuses of the students?
OR
Create a new table for each new session?
Which is better? Please explain briefly how to do it.
Bear in mind that I would need to insert/delete/update/view each table from a C# Windows application, and the maximum expected number of sessions is just 100.
Also, I am using SQL Server 2012 with a C# Windows application developed in Visual Studio 2012.
Thanks
I think you should have 3 tables:
Students
------------------------------------
| StudentID | FirstName | LastName |
|-----------|-----------|----------|
| 4456 | John | Doe |
| 6678 | Billy | Bob |
------------------------------------
Here StudentID is Primary Key
Sessions
---------------------------------------------------------------------
| SessionID | Lecturer | DateTime | Module | Course |
|-----------|----------|----------|------------------|--------------|
| 1 | Mr.Joe | 524523461| Natural Sciences | Oceanography |
---------------------------------------------------------------------
Here dateTime would be a Unix Timestamp and SessionId is Primary Key
SessionAttendance
-------------------------------------
| SessionID | StudentID | Status |
|-----------|-----------|-----------|
| 1 | 4456 | 'Late' |
| 1 | 6678 | 'Present' |
-------------------------------------
Here SessionID and StudentID together form a composite primary key (and each is also a foreign key to its own table)
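A minimal DDL sketch of the three tables; the data types are assumptions, since the question doesn't specify them:
CREATE TABLE Students
(
    StudentID INT PRIMARY KEY,
    FirstName NVARCHAR(50) NOT NULL,
    LastName  NVARCHAR(50) NOT NULL
);

CREATE TABLE Sessions
(
    SessionID  INT IDENTITY(1,1) PRIMARY KEY,
    Lecturer   NVARCHAR(100) NOT NULL,
    [DateTime] DATETIME2 NOT NULL,       -- or an INT Unix timestamp, as in the example
    Module     NVARCHAR(100) NOT NULL,
    Course     NVARCHAR(100) NOT NULL
);

CREATE TABLE SessionAttendance
(
    SessionID INT NOT NULL REFERENCES Sessions(SessionID),
    StudentID INT NOT NULL REFERENCES Students(StudentID),
    Status    VARCHAR(10) NOT NULL,      -- 'Present', 'Absent', 'Late'
    PRIMARY KEY (SessionID, StudentID)
);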
Reason
Here you don't need to parse lists of attendance and statuses. The queries may get a little bigger, but it will save you a lot of parsing code.
Example query:
SELECT SessionID
FROM SessionAttendance
WHERE StudentID = (SELECT StudentID
                   FROM Students
                   WHERE FirstName = 'John' AND LastName = 'Doe')
  AND Status = 'Late';
It will get all the sessions in which John Doe was late. Simple, right?
Other Comments
You cannot store arrays of information in a database column; at best it must be a single delimited string that you parse yourself.
In your application, I recommend you always keep the StudentID together with the FirstName and LastName, because it will make queries easier and keeps the names in one place (so you can change a name in one place and it changes everywhere else).
