C# Generate a Unique Integer Number of a Specific Length Within a Specific Range

I am developing an API in .NET Core where I have to save an integer number in a SQL table. The number must be 9 digits long, starting from 000000001, and it must never repeat in the future. I also want this number in memory because I have to use it for other purposes. After doing some searching, one of the most common solutions is using DateTime.Now.Ticks and trimming its length, but the problem is that when concurrent HTTP requests arrive, the tick values might be the same.
One solution is to apply a lock on the method and release it once the data is saved in the database, but this will slow the application down, as locks are expensive.
A second solution is to introduce a new table with an initial counter value of 1: on every HTTP request, within a unit of work, read the value from the table, increment it by one, save it, and then process the next request. But again there is a performance hit, so it is not an optimal solution.
So, is there any other solution that is faster and less expensive?
Thanks in advance
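One note on the counter-table idea (the second solution above): the read-increment-save round trip can be collapsed into a single atomic statement, which removes the need for an application-level lock and most of the performance hit. A minimal sketch, assuming a one-row counter table named dbo.NumberCounter (an illustrative name, not from the original post):
CREATE TABLE dbo.NumberCounter (CurrentValue INT NOT NULL);
INSERT INTO dbo.NumberCounter VALUES (0);

-- Atomically increment and return the new value in one statement;
-- concurrent requests each receive a distinct number.
UPDATE dbo.NumberCounter
SET CurrentValue = CurrentValue + 1
OUTPUT inserted.CurrentValue;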

I think you can create a computed column in combination with an ID (auto-increment). The auto-increment ID will give you a unique number, and the computed column will pad that number to the specific length.
An example:
CREATE TABLE [dbo].[EmployeeMaster](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [PreFix] [varchar](50) NOT NULL,
    -- Pad the identity value to 9 digits; the padding string must be
    -- at least 8 zeros, otherwise ID = 1 yields only 8 digits.
    [EmployeeNo] AS ([PreFix] + RIGHT('000000000' + CAST(ID AS VARCHAR(9)), 9)) PERSISTED,
    [EmployeeName] VARCHAR(50),
    CONSTRAINT [PK_AutoInc] PRIMARY KEY ([ID] ASC)
)
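If you also need the number in memory before the row is saved, a SEQUENCE object is another option worth considering (a sketch assuming SQL Server 2012 or later; the sequence name is illustrative). The application fetches the next value first and can then use it both in the INSERT and for any other purpose:
CREATE SEQUENCE dbo.ApiNumberSeq AS INT START WITH 1 INCREMENT BY 1;

-- Fetch the next unique value, then pad it to 9 digits.
DECLARE @Next INT = NEXT VALUE FOR dbo.ApiNumberSeq;
SELECT FORMAT(@Next, '000000000') AS ApiNumber;
Sequences hand out values atomically, so concurrent requests never receive the same number.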

Related

Performance impact of using GUID in SQL Server

I have tried searching this before asking, but every result I have found mentions GUIDs as a PK, which is not the case here.
I have a database that uses INT as the PK on all tables. However, the data is accessed via API calls, and a requirement was that the INT value not be returned or used in any API. Therefore I thought of adding an extra column to the tables containing a GUID.
Now my question is: if I index the GUID column, what kind of performance impact will this have? Would it be positive or negative? Bear in mind the GUID is NOT a PK or FK.
I think you are on the right track, but don't take it from me...
In the comments section of one of Kimberly Tripp's articles, she responds to a comment that advocates the opposite of your position; she disagrees and argues for the same solution you are proposing (a nonclustered indexed GUID with a clustered int/bigint primary key).
Herman:
If the GUID is the intrinsic identifier for the entity being modelled (i.e. used by selects) then it should be the clustered primary key without question. The reason is that adding a surrogate identity key (of int or bigint) and demoting the GUID primary key to a column with an index/unique constraint requires 2 indexes to be maintained and slows down, in my experience, by a factor of 2.
Kimberly Tripp
Hey there Herman – Actually, I disagree. For point-based queries using a nonclustered index does not add a significant amount of costly IOs. And, the maintenance of a nonclustered index that’s heavily fragmented is a lot cheaper than the required maintenance on a heavily fragmented clustered index. Additionally, the GUID might make your nonclustered indexes unnecessarily wide – making them take: more log space, more disk space, more cache as well as adding time on insert and access (especially in larger queries/joins). So, while you might not feel like an arbitrary/surrogate key is useful (because you never directly query against it) it can be incredibly efficient to use indirectly through your nonclustered indexes. There’s definitely an element of “it depends” here but if you have even just a few nonclustered indexes then it’s likely to be more beneficial than negative and often significantly so.
Cheers,
kt ~ GUIDs as PRIMARY KEYs and/or the clustering key - Kimberly L. Tripp
This should be fine. Of course, you have the normal impact of any extra index and any extra column taking up more space, so data modifications will be a bit slower. Using a GUID to locate a record is slightly slower than using an integer. Unless you have a very high-throughput application, these are probably not important considerations.
One key point is that the GUID column should not be clustered. This is very important because GUIDs are random, but primary keys are ordered. If a GUID were used for a clustered index, almost every insert would go between two existing records, requiring a lot of movement of data. By contrast, an identity column as a clustered index always inserts at the end of the data.
I am guessing that your references on GUIDs have discussed this issue.
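To make the layout both answers describe concrete, here is a minimal sketch (table and column names are illustrative): a clustered int identity primary key, with the API-facing GUID in a nonclustered unique index.
CREATE TABLE dbo.Customer (
    Id       INT IDENTITY(1,1) NOT NULL,
    PublicId UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID(),
    Name     VARCHAR(100) NOT NULL,
    CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (Id)
);

-- Supports point lookups by the GUID the API exposes, while inserts
-- still append to the end of the clustered index.
CREATE UNIQUE NONCLUSTERED INDEX UX_Customer_PublicId
    ON dbo.Customer (PublicId);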

Primary key constraint duplicate key exceptions with time series data and DATETIME2

I have a database table as follows:
CREATE TABLE some_table
(
    price FLOAT NOT NULL,
    size FLOAT NOT NULL,
    retrieved DATETIME2 DEFAULT SYSUTCDATETIME(),
    runner_id INT NOT NULL,
    FOREIGN KEY (runner_id) REFERENCES runner(id),
    PRIMARY KEY (retrieved, price, size, runner_id)
);
CREATE INDEX some_table_index ON some_table (runner_id);
This table is populated by sets of price/size data retrieved from a web service, and the data is essentially time-series in nature. As far as I can tell (and I have put some comparison logic in my code to make sure), price and size are never duplicated within a single set of entries retrieved from the web service. They may, however, be duplicated in subsequent requests for price/size data related to the same runner.
I am getting intermittent primary key constraint duplicate key exceptions even though I am forming my key from a high-resolution datetime value as well as the rest of the table columns. At this stage I am considering dropping the composite key in favor of an auto-generated primary key. Can anyone suggest why this might be happening based on the table schema? I consider it unlikely that I am trying to insert two separate sets of price/size data with duplicate values simultaneously, given the nature of the code and the resolution of the datetime value. I guess it is possible, though, since I am using asynchronous methods to interact with the database and web service.
Thanks
Is each runner_id inserting multiple rows into the table in bulk? It's possible that rows with the same price and size are processed less than 100 nanoseconds apart; SYSUTCDATETIME() would then return identical values, and the composite key would no longer be unique.
SQL Server obtains the date and time values by using the GetSystemTimeAsFileTime() Windows API. The accuracy depends on the computer hardware and version of Windows on which the instance of SQL Server is running. The precision of this API is fixed at 100 nanoseconds. The accuracy can be determined by using the GetSystemTimeAdjustment() Windows API.
https://msdn.microsoft.com/en-us/library/bb630387.aspx
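Given the answer above, one way out is the auto-generated primary key the question already mentions: a surrogate identity key makes timestamp collisions harmless, while a nonclustered index keeps the original lookup pattern fast. A sketch of that restructuring (not necessarily the only fix):
CREATE TABLE some_table
(
    id        BIGINT IDENTITY(1,1) PRIMARY KEY,  -- surrogate key; duplicates impossible
    price     FLOAT NOT NULL,
    size      FLOAT NOT NULL,
    retrieved DATETIME2 DEFAULT SYSUTCDATETIME(),
    runner_id INT NOT NULL,
    FOREIGN KEY (runner_id) REFERENCES runner(id)
);

-- Supports runner-plus-time-range queries without enforcing
-- uniqueness on the timestamp.
CREATE INDEX some_table_index ON some_table (runner_id, retrieved);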

SQL Server 24/7 insert/delete/select performance

I am building a C# application that inserts 2000 records every second using bulk insert.
The database version is SQL Server 2008 R2.
The application calls a stored procedure that deletes records more than 2 hours old, in chunks, using TOP (10000). This is performed after each insert.
The end user selects records to view in a diagram using date ranges and a selection of 2 to 10 parameter IDs.
Since the application will run 24/7 with no downtime, I am concerned about performance issues.
Partitioning is not an option, since the customer doesn't have Enterprise edition.
Is the clustered index definition good?
Is it necessary to implement any index recreation/reindexing to increase performance, given that rows are inserted at one end of the table and removed at the other end?
What about updating statistics: is that still an issue in 2008 R2?
I use OPTION (RECOMPILE) to avoid using outdated query plans in the select; is that a good approach?
Are there any table hints that can speed up the SELECT?
Any suggestions around locking strategies?
In addition to the scenario above, I have 3 more tables that work in the same way with different timeframes. One inserts every 20 seconds and deletes rows older than 1 week, another inserts every minute and deletes rows older than six weeks, and the last inserts every 5 minutes and deletes rows older than 3 years.
CREATE TABLE [dbo].[BufferShort](
    [DateTime] [datetime2](2) NOT NULL,
    [ParameterId] [int] NOT NULL,
    [BufferStateId] [smallint] NOT NULL,
    [Value] [real] NOT NULL,
    CONSTRAINT [PK_BufferShort] PRIMARY KEY CLUSTERED
    (
        [DateTime] ASC,
        [ParameterId] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
ALTER PROCEDURE [dbo].[DeleteFromBufferShort]
    @DateTime DATETIME,
    @BufferSizeInHours INT
AS
BEGIN
    -- Delete in chunks of 10000, joining to BufferStates to restrict
    -- the delete to the relevant buffer states.
    DELETE TOP (10000) BufferShort
    FROM BufferShort
    JOIN BufferStates ON BufferShort.BufferStateId = BufferStates.BufferStateId
    WHERE BufferShort.[DateTime] < @DateTime
        AND (BufferStates.BufferStateType = 'A' OR BufferStates.Deleted = 'True')
    RETURN 0
END
ALTER PROCEDURE [dbo].[SelectFromBufferShortWithParameterList]
    @DateTimeFrom DATETIME2(2),
    @DateTimeTo DATETIME2(2),
    @ParameterList VARCHAR(MAX)
AS
BEGIN
    SET NOCOUNT ON;
    -- Split ParameterList into a temporary table
    SELECT * INTO #TempTable FROM dbo.splitString(@ParameterList, ',');

    SELECT *
    FROM BufferShort Datapoints
    JOIN Parameters P ON P.ParameterId = Datapoints.ParameterId
    JOIN #TempTable TT ON TT.Token = P.ElementReference
    WHERE Datapoints.[DateTime] BETWEEN @DateTimeFrom AND @DateTimeTo
    ORDER BY [DateTime]
    OPTION (RECOMPILE)
    RETURN 0
END
This is a classic case of penny wise, pound foolish. You are inserting 150 million records per day and you are not using Enterprise edition.
The main reason not to use a clustered index is that the machine cannot keep up with the quantity of rows being inserted; otherwise you should always use one. The decision of whether to use a clustered index is usually argued between those who believe that every table should have a clustered index and those who believe that perhaps one or two percent of tables should not. (I don't have time to engage in a 'religious' type debate about this; just research the web.) I always go with a clustered index unless the inserts on a table are failing.
I would not use the STATISTICS_NORECOMPUTE clause; I would only turn recomputation off if inserts are failing. Please see Kimberly Tripp's (an MVP and a real SQL Server expert) article at http://sqlmag.com/blog/statisticsnorecompute-when-would-anyone-want-use-it.
I would also not use OPTION (RECOMPILE) unless you see queries not using the right indexes (or join types) in the actual query plan. If your query is executed many times per minute or second, this can have an unnecessary impact on the performance of your machine.
The clustered index definition seems good as long as all queries specify at least the leading DateTime column. The index will also maximize insert speed, assuming the times are incremental, as well as reduce fragmentation, so you shouldn't need to reorganize often.
If you have only the clustered index on this table, I wouldn't expect you to need to update stats frequently, because there is no other data access path. If you have other indexes and complex queries, verify that the index is branded ascending with the query below; you may need to update stats frequently if it is not branded ascending and you have complex queries:
DBCC TRACEON(2388);
DBCC SHOW_STATISTICS('dbo.BufferShort', 'PK_BufferShort');
DBCC TRACEOFF(2388);
For the @ParameterList, consider a table-valued parameter instead, specifying a primary key of Token on the table type (a sketch follows at the end of this answer).
I would suggest you introduce the RECOMPILE hint only if needed; I suspect you will get a stable plan with a clustered index seek without it.
If you have blocking problems, consider altering the database to specify the READ_COMMITTED_SNAPSHOT option so that row versioning instead of blocking is used for read consistency. Note that this will add 14 bytes of row overhead and use tempdb more heavily, but the concurrency benefits might outweigh the costs.
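Here is the table-valued-parameter idea sketched out (the type, procedure, and parameter names are illustrative; the join mirrors the original query):
CREATE TYPE dbo.ParameterList AS TABLE (Token VARCHAR(50) PRIMARY KEY);
GO

CREATE PROCEDURE dbo.SelectFromBufferShortWithTvp
    @DateTimeFrom DATETIME2(2),
    @DateTimeTo DATETIME2(2),
    @ParameterList dbo.ParameterList READONLY
AS
BEGIN
    SET NOCOUNT ON;
    -- Join directly against the TVP; no string splitting or temp table needed.
    SELECT Datapoints.*
    FROM BufferShort Datapoints
    JOIN Parameters P ON P.ParameterId = Datapoints.ParameterId
    JOIN @ParameterList TT ON TT.Token = P.ElementReference
    WHERE Datapoints.[DateTime] BETWEEN @DateTimeFrom AND @DateTimeTo
    ORDER BY Datapoints.[DateTime];
END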

Size of a PK int in SQL Server / ASP.NET MVC 4

Currently, my primary key data type is an int, not null, and it is defined as follows in the application:
public virtual int Id { get; set; }
I am a bit worried, though: since int has a limited range, at some point it will not be possible to add new rows because the primary key will run out of values.
How should this problem be approached? I was thinking about a) simply changing the data type to long, etc., or b) if possible, removing the primary key, since it is not used at any time in the application.
Don't remove your primary key; you need it to identify records in your database. If you expect more rows than int can handle, you can:
Make it bigint
Make it uniqueidentifier
Make it a composite key (made up of two or more fields)
Unless you're dealing with very small tables (10 rows), you never want to go without a primary key. The primary key dramatically affects performance, as it provides your initial unique and clustered index. Unique indexes play a vital role in maintaining data integrity. Clustered indexes play a vital role in allowing SQL Server to perform index and row seeks instead of scans. Basically, does it have to load one row or all of the rows?
Changing the data type will affect your primary index size and row size, as well as the size of any index placed on the table. Unless you're worried about exceeding 2,147,483,647 rows in the near future, I would stick with an INT. Every data type has a limited range.
Do you really think you'll get above 2,147,483,647 rows? I doubt it. I wouldn't worry about it.
If you, at some point, begin to reach the limit, it should be trivial to change it to a bigint.
It depends on how big you're expecting this table to become; take a look at the reference page for SQL Server data types for supported ranges, and you can answer the question about the need to change the data type of the PK for yourself.
If the key is really never used (not even as a foreign key), then removing it is entirely reasonable.
You should always have a primary key, so I wouldn't remove that. However, do you really think you're going to exceed the limit of 2,147,483,647 rows in your table?
If it's really a concern, you could just change your data type to a bigint.
Here is also a limits sheet on what SQL Server can handle; it may help you get a fix on what you need to plan for.
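If the limit ever does become real, the change itself is mechanical, although it rewrites the table, and any foreign keys referencing the column must be altered first. A sketch with illustrative names:
-- The PK constraint must be dropped before the column type can change.
ALTER TABLE dbo.SomeTable DROP CONSTRAINT PK_SomeTable;
ALTER TABLE dbo.SomeTable ALTER COLUMN Id BIGINT NOT NULL;
ALTER TABLE dbo.SomeTable ADD CONSTRAINT PK_SomeTable PRIMARY KEY CLUSTERED (Id);
On the application side, the property's type would then change from int to long.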

Modification in Database due to use of GUID (uniqueidentifier)

The application I built has gone live, and we are facing some very specific response-time problems in particular tables.
In short, response times in some of the tables that have 5k rows are very poor, and these tables will grow in size.
Some of these tables (e.g. the Order Header table) have a uniqueidentifier as the PK. We figure that this may be the reason for the poor response times.
On studying the situation, we have identified the following options:
Convert the index of the primary key in the table OrderHeader to a non-clustered one.
Use newsequentialid() as the default value for the PK instead of newid()
Convert the PK to a bigint
We feel that option 2 is ideal, since option 3 would require big-ticket changes.
But to implement it we need to move some of our processing from the insert stored procedures to triggers. This is because we need to capture the PK from the OrderHeader table, and there is no way we can use
SELECT @OrderID = NEWSEQUENTIALID()
within the insert stored procedure, whereas if we move the processing to a trigger we can use
SELECT OrderID FROM inserted
Now for the questions:
Will converting the PK from newid() to newsequentialid() result in a performance gain?
Will converting the index of the PK to a nonclustered one, while retaining uniqueidentifier as the data type and newid() for generating the PK, solve our problems?
If you have faced a similar sort of situation, please do provide helpful advice.
Thanks a ton in advance, people
Romi
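One note on the trigger workaround: NEWSEQUENTIALID() can indeed only appear in a DEFAULT constraint, but the generated value can still be captured inside the insert procedure itself, without a trigger, by using the OUTPUT clause. A sketch with illustrative column names, assuming OrderID defaults to NEWSEQUENTIALID():
DECLARE @NewIds TABLE (OrderID UNIQUEIDENTIFIER);
DECLARE @OrderID UNIQUEIDENTIFIER;

-- OrderID is omitted from the column list, so it takes its
-- NEWSEQUENTIALID() default; OUTPUT captures the generated value.
INSERT INTO dbo.OrderHeader (CustomerName)
OUTPUT inserted.OrderID INTO @NewIds
VALUES ('Acme');

SELECT @OrderID = OrderID FROM @NewIds;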
Convert the index of the primary key in the table OrderHeader to a non-clustered one.
Seems like a good option to do regardless of what else you do. If your table is clustered on your primary key and the latter is a GUID, it means you're constantly writing somewhere in the middle of the table instead of appending new rows to the end of it. That alone will result in a performance hit.
Prefer to cluster your table on an index that's actually useful for sorting; ideally something on a date field, or, less ideally (but still very useful), a title/name, etc. A sketch of that change follows.
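(Constraint, table, and column names below are illustrative, and foreign-key dependencies on the PK are ignored for brevity.)
-- Re-create the PK as nonclustered, then cluster on a date column.
ALTER TABLE dbo.OrderHeader DROP CONSTRAINT PK_OrderHeader;
ALTER TABLE dbo.OrderHeader
    ADD CONSTRAINT PK_OrderHeader PRIMARY KEY NONCLUSTERED (OrderID);
CREATE CLUSTERED INDEX CX_OrderHeader_OrderDate
    ON dbo.OrderHeader (OrderDate);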
Move the clustered index off the GUID column and onto some other combination of columns (your most often run range search, for instance)
Please post your table structure and index definitions, and problem query(s)
Before you make any changes: you need to measure and determine where your actual bottleneck is.
One of the common reasons for a GUID primary key is generating the IDs in a client layer, but you do not mention doing this.
Also, are your statistics up to date? Do you rebuild indexes regularly?
