SQL Server - formatted identity column

I would like to have a primary key column in a table that is formatted as FOO-BAR-[identity number], for example:
FOO-BAR-1
FOO-BAR-2
FOO-BAR-3
FOO-BAR-4
FOO-BAR-5
Can SQL Server do this? Or do I have to use C# to manage the sequence? If that's the case, how can I get the next [identity number] part using Entity Framework?
Thanks
EDIT:
I need to do this because this column represents a unique identifier of a notice sent out to customers.
FOO will be a constant string.
BAR will differ depending on the type of the notice (either Detection, Warning or Enforcement).
So is it better to have just an int identity column and append the values in the business logic layer in C#?

If you want this 'composite' field in your reports, I suggest you:
Use an INT IDENTITY field as the PK in the table.
Create a view for this table. In this view you can additionally generate the field you want from your strings and types.
Use this view in your reports.
But I still think there is a BIG problem with the DB design. I hope you'll try to redesign it using normalization.
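The view-based approach above could look roughly like the following. This is only a sketch: the table name dbo.Notice, the view name dbo.NoticeWithReference and the column names are assumptions made up for illustration; only the FOO prefix and the three notice types come from the question.

```sql
-- Hypothetical base table: a plain INT IDENTITY primary key plus the notice type.
CREATE TABLE dbo.Notice (
    NoticeID   int IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Notice PRIMARY KEY,
    NoticeType varchar(20) NOT NULL
        CONSTRAINT CK_Notice_Type
        CHECK (NoticeType IN ('Detection', 'Warning', 'Enforcement'))
);
GO

-- The view builds the formatted reference on the fly; nothing formatted is stored.
CREATE VIEW dbo.NoticeWithReference
AS
SELECT NoticeID,
       NoticeType,
       'FOO-' + NoticeType + '-' + CAST(NoticeID AS varchar(11)) AS NoticeReference
FROM dbo.Notice;
GO

-- Reports read from the view instead of the table.
SELECT NoticeReference, NoticeType
FROM dbo.NoticeWithReference;
```

Keeping the key as a plain integer means joins and foreign keys stay cheap, and the display format can change later without touching stored data.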

You can set anything as the PK in a table. But in this instance I would make the IDENTITY column just an auto-incrementing int and append FOO-BAR- to it manually in the SQL, BLL, or UI, depending on why it's being used. If there is a business reason for FOO and BAR, then you should also store these as values in your DB row. You can then create a key in the DB across the two or three columns, depending on why you're actually using the values.
But IMO there is rarely a real reason to concatenate an ID in such a fashion and store it that way in the DB. Then again, I really only use an int for my IDs.

Another option would be to use what an old team I used to be on called a codes and value table. We didn't use it for precisely this (we used it in lieu of auto-incrementing identities to prevent environment mismatches for some key tables), but what you could do is this:
Create a table that has a row for each of your categories. Two (or more) columns in the row - minimum of category name and next number.
When you insert a record in the other table, you'll run a stored proc to get the next available identity number for that category, increment the number in the codes and values table by 1, and concatenate the category and number together for your insert.
However, if your main table is a high-volume table with lots of inserts, it's possible you could wind up with numbers out of sequence.
In any event, even if it's not high volume, I think you'd be better off to reexamine why you want to do this, and see if there's another, better way to do it (such as having the business layer or UI do it, as others have suggested).
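A minimal sketch of such a codes-and-values table and its stored procedure is shown below. All object names (dbo.NoticeCounter, dbo.GetNextNoticeReference, the column names) are hypothetical, and the FOO prefix is taken from the question. The single UPDATE statement reads and increments the counter in one step, which avoids two callers receiving the same number.

```sql
-- Hypothetical counter table: one row per category.
CREATE TABLE dbo.NoticeCounter (
    Category   varchar(20) NOT NULL
        CONSTRAINT PK_NoticeCounter PRIMARY KEY,
    NextNumber int NOT NULL
        CONSTRAINT DF_NoticeCounter_NextNumber DEFAULT 1
);
GO

CREATE PROCEDURE dbo.GetNextNoticeReference
    @Category  varchar(20),
    @Reference varchar(40) OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @Next int;

    -- @Next receives the value from before the update,
    -- and the stored counter is incremented in the same statement.
    UPDATE dbo.NoticeCounter
    SET @Next = NextNumber,
        NextNumber = NextNumber + 1
    WHERE Category = @Category;

    -- If the category row does not exist, @Next stays NULL and so does @Reference.
    SET @Reference = 'FOO-' + @Category + '-' + CAST(@Next AS varchar(11));
END;
GO

-- Usage: seed a category, then ask for the next reference.
INSERT INTO dbo.NoticeCounter (Category) VALUES ('Warning');
DECLARE @Ref varchar(40);
EXEC dbo.GetNextNoticeReference @Category = 'Warning', @Reference = @Ref OUTPUT;
SELECT @Ref AS NextReference;
```

Note that a number handed out this way is consumed even if the caller's insert later fails, so gaps in the sequence are still possible.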

It is quite possible by using computed column like this:
CREATE TABLE #test (
    id INT IDENTITY UNIQUE CLUSTERED,
    pk AS CONCAT('FOO-BAR-', id) PERSISTED PRIMARY KEY NONCLUSTERED,
    name NVARCHAR(20)
)
INSERT INTO #test (name) VALUES (N'one'), (N'two'), (N'three')
SELECT id, pk, name FROM #test
DROP TABLE #test
Note that pk is set to NONCLUSTERED on purpose because it is of VARCHAR type, while the IDENTITY field, which will be unique anyway, is set to UNIQUE CLUSTERED.

Related

How to handle invalid user input table name

I am writing a C# WinForms program which includes a user input textbox, the value of which will be used to create a table. I have been thinking about what the best way to handle invalid T-SQL table names is (though this can be extended to many other situations). Currently the only method I can think of would be to check the input string for any violations of valid table names individually, though this seems long winded and could be prone to missing certain characters for example due to my own ignorance of what is a violation and what is not.
I feel like there should be a better way of doing this but have been unable to find anything in my search so far. Can anyone help point me in the right direction?

As I already told you in a comment, you should not do this...
If you do, you might use something like this:
USE master;
GO
CREATE DATABASE dbTest;
GO
USE dbTest;
GO
CREATE TABLE UserTables(ID INT IDENTITY CONSTRAINT PK_UserTables PRIMARY KEY
,UserInput NVARCHAR(500) NOT NULL CONSTRAINT UQ_UserInput UNIQUE);
GO
INSERT INTO UserTables VALUES(N'blah')
,(N'invalid !%$& <<& >< $')
,(N'silly 💖');
GO
SELECT * FROM UserTables;
/*
ID UserInput
1 blah
2 invalid !%$& <<& >< $
3 silly 💖
*/
GO
USE master;
GO
DROP DATABASE dbTest;
GO
You would then create your tables as Table1, Table2 and so on.
Whenever a user enters a string, you look it up in this table, pick the ID and build the table's name by concatenating the word Table with the ID.
There are better approaches!
You should think about a fixed schema instead. With user-defined tables you will have to define columns (how many, which type, how to name them?). Querying this will be hell, because there is nothing to rely on...
One approach is a classical n:m mapping
A User table (UserID, Name, ...)
A test table (TestID, TestName, TestType, ...)
The mapping table (ID, UserID, TestID, Result VARCHAR(MAX))
Depending on what you need you might add a table
question table (QuestionID, QuestionText ...)
Then use a mapping to bind questions to tests and another mapping to bind answers to such mapped questions.
Another approach is to store the result as a generic container (XML or JSON). This keeps your tables slim, but you need to know the XML's structure in order to query it.
Many ways to skin a rabbit...
UPDATE
You ask for an explanation...
The main advantage of a relational database is the pre-known structure.
Precompiled queries, cached results, statistics and indexes all demand known structures.
Data integrity is ensured with constraints, foreign keys and so on. All this demands known names, known types(!) and known relations.
User-specific table names, and even worse, generically defined structures, do not allow for any join or other typical RDBMS operation. The only approach is to create each and every statement dynamically (string building).
The rule of thumb is: whenever you think you have to create several objects of the same kind but with different names, you should rethink the design. It is bad to store Phone1, Phone2 and Phone3. It is better to have a side table with a UserID and a Phone column (classical 1:n). It is bad to have SalesCustomerA, SalesCustomerB; better to use a Customer table and bind its ID into a general Sales table as an FK.
You see what I mean? What belongs together should live in one single table. If you need separation, add columns to your table and use them for grouping and filtering.
Just imagine you want to do some statistical evaluation of all your user test tables. How would you gather the data into one big pool, if you cannot rely on some structure in common?
I hope this makes it clear...
If you still want to stick to your idea, you should give my code sample a try. It allows you to map any silly string to a secure and easy-to-handle table name.

Lots can go wrong with users entering table names. A bunch of whacked-out names is a maintenance nightmare. A user should not even be aware of the table name. It is also a security risk, as the program now has to have database owner authority. You want to limit users to the minimum authority.
This is similar to Shnugo's answer but with a composite primary key, which will prevent duplicate userID, testID pairs.
user
ID int identity PK
varchar(100) fName
varchar(100) lName
test
ID int identity PK
varchar(100) name
userTest
int userID PK FK to User
int testID PK FK to Test
int score
select t.name, u.fName, u.lName, ut.score
from userTest ut
join test t
on t.ID = ut.testID
join [user] u -- user is a reserved word in T-SQL, so it needs brackets
on u.ID = ut.userID
order by t.name, u.lName, u.fName

Editing duplicate values in a database

I have a DataGridView pulling some items from my database. What I want to achieve is to be able to edit the pack size or the bar_code fields. I know how to update values in a database, but how would I go about it if the data is the same? In many instances a bar code has multiple pack sizes related to the one bar code number. Let's say I have the below screenshot. A data entry error was made and the bar_code and PackSize columns are exactly the same in two rows. I want to change the first bar code to "1234". How would I achieve this? I can't say update barcode to 'textBox1.Text' where bar_code = '771313166386' because it would then change both rows. How do I go about focusing on only one row of data at a time?

You can try using this query to update only the first row:
UPDATE TOP (1) my_table
SET bar_code = '1234'
WHERE bar_code = '771313166386'
You should have an auto-increment id column or a Primary key in your table.

I'd suggest you handle the logic of data duplicate manipulation at the backend rather than pulling them inside the grid and handle it there.
The following query will help you retrieve the duplicate records based on the mentioned columns. You can change it to UPDATE or DELETE as per your requirement.
-- Using a CTE and a ranking function
;With CTE
As
(
    Select
        Product,
        Description,
        BarCode,
        PackSize,
        Row_Number() Over(Partition By Product, BarCode, PackSize Order By Product) As RowNum
    From YourTable
)
Select * From CTE
-- Where RowNum > 1;
Hope this is helpful :)
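As a sketch of the DELETE variant mentioned above, the same CTE can be used as the delete target so that only the extra copies are removed. YourTable and its column names are the placeholders from the query above, not names from the question.

```sql
-- Number the rows within each group of identical Product/BarCode/PackSize values.
;WITH CTE AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY Product, BarCode, PackSize
                              ORDER BY Product) AS RowNum
    FROM YourTable
)
-- Deleting through the CTE removes the underlying rows;
-- the first row of each group (RowNum = 1) is kept.
DELETE FROM CTE
WHERE RowNum > 1;
```

It is worth running the SELECT version first to check which rows would be affected.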

This might not help you directly in your answer. But, it is important to mention that your table design is incorrect. You should ensure the data integrity by creating a primary key in your table.
So when you need to update a product you have only one row to update.
Then you can add more tables and use foreign key references between them.

You need to represent the products uniquely. Judging by your sample data, I guess there isn't any primary key on your table.
What you can do is specify a unique constraint on the relevant columns to ensure that this type of data entry error cannot happen.
If you cannot come up with a list of columns that uniquely identifies the rows, you can use a surrogate key by adding an identity column and then, while updating, always use a condition such as where thisIdentityColumn=value.
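The surrogate-key suggestion could be sketched as follows. The table name my_table and the column RowID are assumptions for illustration; bar_code is the column named in the question.

```sql
-- Add an identity column to the existing table and make it the primary key.
-- Existing rows are numbered automatically.
ALTER TABLE my_table
    ADD RowID int IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_my_table PRIMARY KEY;
GO

-- Update exactly one row by its surrogate key, even if other rows
-- hold the same bar_code and pack size.
UPDATE my_table
SET bar_code = '1234'
WHERE RowID = 1;  -- hypothetical RowID of the row selected in the grid
```

With RowID included in the grid's query, the application can pass the selected row's key back as a parameter instead of matching on the duplicated values.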

"A data entry error was made and the bar_code and PackSize columns are the exact same"
I think this is the key. Essentially, the exact duplicates are unintentional, and the rows should be unique. Further it looks like bar_code + pack_size is your primary key (subject to data being entered correctly).
So, when you do an update, simply update the first row found that matches a bar_code and a pack_size. If it isn't unique, then the update should ensure that you are one step closer to unique rows in the database.
If you need a non-verbal answer, let me know.

Getting the Average in ROWS c#

I have a SQL Server database with a table that has these columns:
1stAP_TB, 2ndAP_TB, 3rdAP_TB, 4thAP_TB, 1steng_TB, 2ndeng_TB, 3rdeng_TB, 4theng_TB
All of them are in the same row, and the numbers are worked out individually for each column. Now I need to know how to get the average of 1stAP_TB, 2ndAP_TB, 3rdAP_TB and 4thAP_TB when they are in a row.
Also, multiple rows of data will be saved in the database. I am using the C# programming language.

Try below method
create table aveexample
(a1stAP_TB int,
a2ndAP_TB int,
a3rdAP_TB int,
a4thAP_TB int,
a1steng_TB int,
a2ndeng_TB int,
a3rdeng_TB int,
a4theng_TB int
)
Sample data
insert into aveexample values(1,2,3,4,5,6,7,8)
insert into aveexample values(11,22,33,44,55,66,77,78)
insert into aveexample values(2,3,1,4,10,10,45,5)
Method 1
-- cast to decimal first, otherwise AVG over int columns returns a truncated int
select *, (select AVG(CAST(totaldata as decimal(10,2)))
from (values(a1stAP_TB),
(a2ndAP_TB),(a3rdAP_TB),(a4thAP_TB),(a1steng_TB),
(a2ndeng_TB),(a3rdeng_TB),(a4theng_TB)) total(totaldata)) as average
from aveexample
Method 2
-- divide by 8.0 rather than 8 to avoid integer division
select ((a1stAP_TB)+
(a2ndAP_TB)+(a3rdAP_TB)+(a4thAP_TB)+(a1steng_TB)+
(a2ndeng_TB)+(a3rdeng_TB)+(a4theng_TB))/8.0 as Average
from aveexample

It is difficult to give concrete advice given the very limited description in the question, but from the description and comments so far, it seems to me like the database needs to be redesigned to better fit your requirements. First, you have no ID field, so there is no way to differentiate one row from the next. Then, what you are left with is a series of repeated values. The clue here is that you have "1st", "2nd", "3rd" in the column names. That's probably a sign that those columns need to be moved into rows of a related table. It may not instantly seem to be the best approach, but this is called "First Normal Form" and is a typical best practice with SQL databases. See also Database Normalization Basics.
It seems to me that what you have here is some entity (which you haven't mentioned in your question) that has a number of values associated with it. The 'entity' here should be given a unique ID and then all of the values for that entity stored with its ID.
You might have a table with the following columns:
CREATE TABLE MyItems (
ID int NOT NULL,
Sequence int NOT NULL,
Value int NOT NULL,
CONSTRAINT PK_MyValues_ID_Sequence PRIMARY KEY
(ID,Sequence)
)
Note: ID + sequence forms the unique primary key for the table and makes every row unique. This also lets you keep track of what order the items were added in. This may or may not be important to you but every table should probably have a unique primary key.
Your data table would then look something like this (the example represents two different entities, the first having 4 values and the second having 3 values):
It's difficult to show a sensible example without knowing more about the application and what it does... but with this table design you have a basis from which to add values one at a time, as you said you needed, and a way to query them back. You can use grouping to produce things like totals and averages, or you can do that in code by iterating over the results of a query or in a LINQ statement.
You can then compute the average for an entity of a given ID using a LINQ query along the lines of:
var average = MyItems.Where(p=>p.ID == 1).Average(q=>q.Value);
As an example of the flexibility of this sort of approach, you could just as easily compute the average of every second value entered across the entire database:
var averageOfSecondItems = MyItems.Where(p => p.Sequence == 2).Average(q => q.Value);
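The grouping mentioned earlier can also be done directly in SQL. The following is only a sketch against the MyItems table defined above; the inserted rows are hypothetical sample data (one entity with four values, one with three), not data from the question.

```sql
-- Hypothetical sample rows: entity 1 has four values, entity 2 has three.
INSERT INTO MyItems (ID, [Sequence], [Value]) VALUES
    (1, 1, 10), (1, 2, 20), (1, 3, 30), (1, 4, 40),
    (2, 1, 5),  (2, 2, 15), (2, 3, 25);

-- One result row per entity with its total and average.
-- The cast avoids the integer truncation that AVG applies to int columns.
SELECT ID,
       SUM([Value]) AS Total,
       AVG(CAST([Value] AS decimal(10, 2))) AS Average
FROM MyItems
GROUP BY ID;
```

Because every value is its own row, adding a fifth or sixth value for an entity needs no schema change, and the same query keeps working.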
The example I've shown deals with one type of value. In your question it appears that you might have two different types of value. There are several ways you could handle that - for example you could add another column to the table if the values are always entered in pairs, or you could create a second table to hold the separate values. Again, it's hard to make a recommendation based on the limited information given.
If putting your data into First Normal Form seems like a lot of work, then your application might be a better fit for a document database ("NoSQL" database), but that is really a different question. In the question, a SQL database was specified so I've concentrated on that.

Primary key violation error in sql server 2008

I have created two threads in C# and I am calling two separate functions in parallel. Both functions read the last ID from the XYZ table and insert a new record with the value ID+1. The ID column is the primary key. When I execute both functions I get a primary key violation error. Both functions use the query below:
insert into XYZ values((SELECT max(ID)+1 from XYZ),'Name')
It seems that both functions read the value at the same time and try to insert with the same value.
How can I solve this problem?

Let the database handle selecting the ID for you. It's obvious from your code above that what you really want is an auto-incrementing integer ID column, which the database can definitely handle doing for you. So set up your table properly and instead of your current insert statement, do this:
insert into XYZ values('Name')
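If your database table is already set up, I believe you can issue a statement similar to the following (note that this is MySQL syntax; SQL Server uses IDENTITY instead and cannot add it to an existing column with ALTER TABLE):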
alter table your_table modify column you_table_id int(size) auto_increment
Finally, if none of these solutions are adequate for whatever reason (including, as you indicated in the comments section, inability to edit the table schema), then you can do as one of the other users suggested in the comments and create a synchronized method to find the next ID. You would basically create a static method that returns an int, issue your select id statement in that static method, and use the returned result to insert your next record into the table. Since this method would not guarantee a successful insert (because external applications can also insert into the same table), you would also have to catch exceptions and retry on failure.

Set ID column to be "Identity" column. Then, you can execute your queries as:
insert into XYZ values('Name')
I think that you can't use ALTER TABLE to change a column to be Identity after the column is created. Use Management Studio to set this column to be Identity. If your table has many rows, this can be a long-running process, because it will actually copy your data to a new table (it performs a table re-creation).
Most likely that option is disabled in your Management Studio. To enable it, open Tools->Options->Designers and uncheck the option "Prevent saving changes that require table re-creation". Depending on your table size, you will probably have to set a timeout, too. Your table will be locked during that time.

A solution for such problems is to generate the ID using some kind of sequence.
For example, in SQL Server (2012 and later) you can create a sequence using the command below:
CREATE SEQUENCE Test.CountBy1
START WITH 1
INCREMENT BY 1 ;
GO
Then in C#, you can retrieve the next value from the Test.CountBy1 sequence and assign it to the ID before inserting the record.
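Fetching the next number from the sequence might look like this. It is a sketch that assumes the Test.CountBy1 sequence above has been created and that XYZ has the columns ID and Name (the column names are inferred from the question's insert statement, so treat them as assumptions).

```sql
-- Option 1: fetch the value first, e.g. to return it to the C# caller.
DECLARE @NextID bigint = NEXT VALUE FOR Test.CountBy1;

INSERT INTO XYZ (ID, Name)
VALUES (@NextID, 'Name');

-- Option 2: let the insert draw the value itself.
INSERT INTO XYZ (ID, Name)
VALUES (NEXT VALUE FOR Test.CountBy1, 'Name');
```

Each call to NEXT VALUE FOR hands out a distinct number even when several sessions call it concurrently, which is what removes the race in max(ID)+1. The sequence should start above the current maximum ID in the table.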

It sounds like you want a higher transaction isolation level or more restrictive locking.
I don't use these features too often, so hopefully somebody will suggest an edit if I'm wrong, but you want one of these:
-- specify the strictest isolation level
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
insert into XYZ values((SELECT max(ID)+1 from XYZ),'Name')
or
-- make locks exclusive so other transactions cannot access the same rows
insert into XYZ values((SELECT max(ID)+1 from XYZ WITH (XLOCK)),'Name')
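A commonly used variant of the locking idea, shown here only as a sketch, wraps the read and the insert in one explicit transaction and takes an update lock that is held until commit. It assumes XYZ has exactly the two columns used in the question, with ID first.

```sql
DECLARE @NextID int;

BEGIN TRANSACTION;

-- UPDLOCK makes a second session wait here instead of reading the same max(ID);
-- HOLDLOCK keeps the lock on the range until the transaction ends.
SELECT @NextID = ISNULL(MAX(ID), 0) + 1
FROM XYZ WITH (UPDLOCK, HOLDLOCK);

INSERT INTO XYZ VALUES (@NextID, 'Name');

COMMIT TRANSACTION;
```

This serializes the inserts, so it trades throughput for correctness; an identity column or a sequence avoids that cost.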

Modification in Database due to use of GUID (uniqueidentifier)

The application I have completed has gone live and we are facing some very specific problems as far as response time is concerned in specific tables.
In short, response time in some of the tables that have 5k rows is very slow. And these tables will grow in size.
Some of these tables (e.g. the Order Header table) have a uniqueidentifier as the PK. We suspect that this may be the reason for the slow response time.
On studying the situation we have decided the following options
Convert the index of the primary key in the table OrderHeader to a non-clustered one.
Use newsequentialid() as the default value for the PK instead of newid()
Convert the PK to a bigint
We feel that option number 2 is ideal since option number 3 will require big ticket changes.
But to implement that we need to move some of our processing in the insert stored procedures to triggers. This is because we need to trap the PK from the OrderHeader table and there is no way we can use
Select @OrderID = newsequentialid() within the insert stored procedure.
Whereas if we move the processing to a trigger we can use
select OrderID from inserted
Now for the questions:
Will converting the PK from newid() to newsequentialid() result in a performance gain?
Will converting the index of the PK to a non-clustered one, while retaining both uniqueidentifier as the data type for the PK and newid() for generating it, solve our problems?
If you have faced a similar situation, please do provide helpful advice.
Thanks a ton in advance, people
Romi

Convert the index of the primary key in the table OrderHeader to a non-clustered one.
Seems like a good option to do regardless of what you do. If your table is clustered using your pkey and the latter is a UUID, it means you're constantly writing somewhere in the middle of the table instead of appending new rows to the end of it. That alone will result in a performance hit.
Prefer to cluster your table using an index that's actually useful for sorting; ideally something on a date field, less ideally (but still very useful) a title/name, etc.
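As a rough sketch of what this could look like, the table below keeps the GUID as a non-clustered primary key and clusters on a date column instead. The OrderDate column and all constraint and index names are assumptions, since the question does not show the table structure. The sketch also uses NEWSEQUENTIALID() as a column default and reads the generated key back with the OUTPUT clause, which is one way to get the new ID inside a stored procedure.

```sql
CREATE TABLE dbo.OrderHeader (
    OrderID   uniqueidentifier NOT NULL
        CONSTRAINT DF_OrderHeader_OrderID DEFAULT NEWSEQUENTIALID()
        CONSTRAINT PK_OrderHeader PRIMARY KEY NONCLUSTERED,
    OrderDate datetime NOT NULL  -- hypothetical column used for clustering
);

-- Rows are physically ordered by date, so new orders are appended at the end.
CREATE CLUSTERED INDEX CIX_OrderHeader_OrderDate
    ON dbo.OrderHeader (OrderDate);

-- Capture the generated key at insert time.
DECLARE @NewIds TABLE (OrderID uniqueidentifier);

INSERT INTO dbo.OrderHeader (OrderDate)
OUTPUT inserted.OrderID INTO @NewIds (OrderID)
VALUES (GETDATE());

SELECT OrderID FROM @NewIds;
```

Whether this helps in practice depends on the actual queries, so it is worth measuring before and after the change.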

Move the clustered index off the GUID column and onto some other combination of columns (your most often run range search, for instance).
Please post your table structure, index definitions, and the problem queries.
Before you make any changes, you need to measure and determine where your actual bottleneck is.
One of the common reasons for a GUID primary key is generating these IDs in a client layer, but you do not mention this.
Also, are your statistics up to date? Do you rebuild indexes regularly?
