Implement C# migration code in SQL - c#

I am new to SQL and today I was assigned an important task: creating a migration script for data in a table. As I understand it, a migration script copies data from table A and moves it to other tables B, C and so on. This seems to be common when the database design changes constantly and the team wants to preserve data.
My task:
I have a JobOffer table with a CityId field. Now the team wants to delete that field, and to preserve the information they will add the CityId to the Address table and connect both tables using an intermediary table called Location (this allows a JobOffer to have several Addresses).
I have no idea how to perform this task. An analogy in C# of what I intend is this:
foreach (var row in JobOffer)
{
    int addressId;
    if (!Address.Contains(row.CityId))
    {
        addressId = Address.add(row.CityId);
        Location.add(row.JobOfferId, addressId);
    }
    else
    {
        Location.add(row.JobOfferId, Address.get(row.CityId));
    }
}
How do I do it in SQL?

You need three tables - one for the candidates, one for the addresses (locations) and one that links the two. The third table is necessary because what you described is a many to many relationship. A single candidate may have multiple locations and a single location may house multiple candidates.
When I built something similar to yours, it took two scans of the input data:
The first checked whether I had all the locations. If any were missing, I inserted them into the locations table.
The second scan inserted data into the candidate and candidatelocs tables. At this point I knew for sure that I had an address for every candidate in the locations table.
Here is a description of the tables:
create table candidate (candidateid int identity primary key, idate datetime default getdate(), name varchar(200))
create table candidatelocs (candidateid int, locid int)
create table locations ( locid int identity primary key, city varchar(500), state varchar(500))
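Applied to the tables from the question, the same two-pass idea could be sketched in T-SQL roughly as follows. This is only a sketch under assumptions: the column name AddressId, the assumption that it is an IDENTITY column, and the assumption that every other Address column is nullable are not from the question.

```sql
BEGIN TRANSACTION;

-- Pass 1: add an Address row for every CityId that does not have one yet.
INSERT INTO Address (CityId)
SELECT DISTINCT jo.CityId
FROM JobOffer AS jo
WHERE jo.CityId IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM Address AS a WHERE a.CityId = jo.CityId);

-- Pass 2: link each job offer to the address of its city.
-- MIN() picks a single address in case several rows share the same CityId.
INSERT INTO Location (JobOfferId, AddressId)
SELECT jo.JobOfferId, MIN(a.AddressId)
FROM JobOffer AS jo
INNER JOIN Address AS a ON a.CityId = jo.CityId
GROUP BY jo.JobOfferId;

COMMIT TRANSACTION;

-- Only after the copied data has been verified:
-- ALTER TABLE JobOffer DROP COLUMN CityId;
```

Both statements are set-based, so no row-by-row loop like the C# analogy is needed. If JobOffer.CityId is part of a foreign key or index, that would probably have to be dropped before the column itself.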

Related

How to handle invalid user input table name

I am writing a C# WinForms program which includes a user input textbox, the value of which will be used to create a table. I have been thinking about what the best way to handle invalid T-SQL table names is (though this can be extended to many other situations). Currently the only method I can think of would be to check the input string for any violations of valid table names individually, though this seems long winded and could be prone to missing certain characters for example due to my own ignorance of what is a violation and what is not.
I feel like there should be a better way of doing this but have been unable to find anything in my search so far. Can anyone help point me in the right direction?
As I told you in a comment already, you should not do this...
You might use something like this
USE master;
GO
CREATE DATABASE dbTest;
GO
USE dbTest;
GO
CREATE TABLE UserTables(ID INT IDENTITY CONSTRAINT PK_UserTables PRIMARY KEY
,UserInput NVARCHAR(500) NOT NULL CONSTRAINT UQ_UserInput UNIQUE);
GO
INSERT INTO UserTables VALUES(N'blah')
,(N'invalid !%$& <<& >< $')
,(N'silly 💖');
GO
SELECT * FROM UserTables;
/*
ID UserInput
1 blah
2 invalid !%$& <<& >< $
3 silly 💖
*/
GO
USE master;
GO
DROP DATABASE dbTest;
GO
You would then create your tables as Table1, Table2 and so on.
Whenever a user enters his string, you visit the table, pick the ID and create the table's name by concatenating the word Table with the ID.
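A minimal sketch of that lookup, assuming the UserTables table from the sample above. The variable names and the column list of the created table (ID, Payload) are made up for illustration. QUOTENAME is not strictly needed for a generated name like Table3, but it is a cheap safety net whenever a name is concatenated into dynamic SQL.

```sql
DECLARE @UserInput NVARCHAR(500) = N'some user text';
DECLARE @ID INT;

-- Reuse the existing row if the user has entered this string before.
SELECT @ID = ID FROM UserTables WHERE UserInput = @UserInput;

IF @ID IS NULL
BEGIN
    INSERT INTO UserTables (UserInput) VALUES (@UserInput);
    SET @ID = SCOPE_IDENTITY();
END;

-- The real table name never contains the user's text, only the numeric ID.
DECLARE @TableName SYSNAME = CONCAT(N'Table', @ID);
DECLARE @sql NVARCHAR(MAX) =
    N'CREATE TABLE ' + QUOTENAME(@TableName)
    + N' (ID INT IDENTITY PRIMARY KEY, Payload NVARCHAR(MAX));';

EXEC sys.sp_executesql @sql;
```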
There are better approaches!
But you should think of a fixed schema. You will have to define columns (how many, which type, how to name them?). You will be in hell when you have to query this. Nothing to rely on...
One approach is a classical n:m mapping
A User table (UserID, Name, ...)
A test table (TestID, TestName, TestType, ...)
The mapping table (ID, UserID, TestID, Result VARCHAR(MAX))
Depending on what you need you might add a table
question table (QuestionID, QuestionText ...)
Then use a mapping to bind questions to tests and another mapping to bind answers to such mapped questions.
Another approach would be to store the result as a generic container (XML or JSON). This keeps your tables slim, but you need to know the XML's structure in order to query it.
Many ways to skin a rabbit...
UPDATE
You ask for an explanation...
The main advantage of a relational database is the pre-known structure.
Precompiled queries, cached results, statistics and indexes all demand known structures.
Data integrity is ensured with constraints, foreign keys and so on. All this demands known names, known types(!) and known relations.
User-specific table names, and even worse, generically defined structures, do not allow for any join or other typical RDBMS operation. The only approach is to create each and every statement dynamically (string building).
The rule of thumb is: whenever you think you have to create several objects of the same kind but with different names, you should rethink the design. It is bad to store Phone1, Phone2 and Phone3. It is better to have a side table with a UserID and a Phone column (classical 1:n). It is bad to have SalesCustomerA and SalesCustomerB; better to use a Customer table and bind its ID into a general Sales table as an FK.
You see what I mean? What belongs together should live in one single table. If you need separation, add columns to your table and use them for grouping and filtering.
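As a small illustration of the Phone1/Phone2/Phone3 point, the side table might look like this. The table name UserPhone and the column sizes are assumptions; in a real schema UserID would also get a FOREIGN KEY to the user table.

```sql
-- One row per phone number instead of one column per phone number.
CREATE TABLE UserPhone (
    UserID INT         NOT NULL,
    Phone  VARCHAR(30) NOT NULL,
    CONSTRAINT PK_UserPhone PRIMARY KEY (UserID, Phone)
);

INSERT INTO UserPhone (UserID, Phone)
VALUES (1, '555-0100'), (1, '555-0101'), (2, '555-0199');

-- Any number of phones per user, and they can be filtered and counted normally.
SELECT UserID, COUNT(*) AS PhoneCount
FROM UserPhone
GROUP BY UserID;
```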
Just imagine you want to do some statistical evaluation of all your user test tables. How would you gather the data into one big pool, if you cannot rely on some structure in common?
I hope this makes it clear...
If you still want to stick to your idea, you should give my code sample a try. It allows you to map any silly string to a secure and easy-to-handle table name.
Lots can go wrong with users entering table names. A bunch of whacked-out names is a maintenance nightmare. A user should not even be aware of the table name. It is also a security risk, as the program now needs database owner authority. You want to limit users to the minimum authority.
Similar to Shnugo but with composite primary key. This will prevent duplicate userID, testID.
user
ID int identity PK
varchar(100) fName
varchar(100) lName
test
ID int identity PK
varchar(100) name
userTest
int userID PK FK to User
int testID PK FK to Test
int score
select t.name, u.fName, u.lName, ut.score
from userTest ut
join test t
on t.ID = ut.testID
join [user] u
on u.ID = ut.userID
order by t.name, u.lName, u.fName
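The table outline above could be written as T-SQL along these lines. This is a sketch: the constraint name is invented, and the user table is written as [user] because USER is a reserved word in SQL Server.

```sql
CREATE TABLE [user] (
    ID    INT IDENTITY PRIMARY KEY,
    fName VARCHAR(100),
    lName VARCHAR(100)
);

CREATE TABLE test (
    ID   INT IDENTITY PRIMARY KEY,
    name VARCHAR(100)
);

CREATE TABLE userTest (
    userID INT NOT NULL REFERENCES [user] (ID),
    testID INT NOT NULL REFERENCES test (ID),
    score  INT,
    -- Composite key: the same user cannot be linked to the same test twice.
    CONSTRAINT PK_userTest PRIMARY KEY (userID, testID)
);
```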

C# Entity Framework and Linq - Insert into database using entity model

I am trying to create a program to export Excel content/data to a database created in SQL Server 2014. I already have the data (variables) I want to insert into the database. Now I am having some problems with the database diagram, in other words how I should build it, and how I can insert those values into it.
This is supposed to be a school schedule that another (independent) program will query, so this program is just for the database management.
Right now I only have one table for tests, but I know I need to create relations between tables.
Original Table (Fields):
(PK) Id
StartTime
EndTime
Teacher
Class
Room
Subject
DayWeek
So now I want to create independent tables, which in my head would be:
Rooms (Id, Room)
Classes (Id, Class)
Teachers (Id, Teacher)
Subjects (Id, Subject)
So the original fields would be replaced by those tables in a one to many relationship if I am not wrong.
So the question is, I don't know how to insert with the relationship, because if there is already one Room/Teacher/Subject/Class with the same name as my variable, I will not insert into the respective table.
May some one help this newbie ^^?
Thanks and sorry for my bad english.
Edit:
Thanks for your answer, but I guess that isn't my problem.
So my database would be like this (any suggestion is welcome to improve the database structure):
Tables (and Fields):
Schedule (Id (PK), StartTime, EndTime, DayWeek, RoomId (FK), ClassId (FK), SubjectId (FK), TeacherId (FK))
Rooms (RoomId (PK), RoomName)
Classes (ClassId (PK), ClassName)
Subjects (SubjectId (PK), SubjectName)
Teachers (TeacherId (PK), TeacherName)
So the table Schedule has many relationships to different tables (one to many, if I am not wrong).
To be more specific, the whole database is empty, with those tables and relationships created.
I am filling those tables with data from some Excel files. I have already read them into variables; concretely, I have these values from an Excel file: StartTime, EndTime, DayWeek, Room, Class, Subject, Teacher.
My problem is that I want to insert these values into the Schedule table, since that is the table I want to get information from, but for that I also need to insert data into the "foreign tables". So can you try to help me?
Since I don't have much SQL knowledge, my idea is to select the Id of the matching record in each foreign table (e.g. look up the room/teacher/subject/class name read from Excel in the respective table). If it exists, I take its Id and use it for the insert into the Schedule table; otherwise I insert the value into the foreign table first and use the new Id. Is this way of thinking right, or is there an easier way?
This database will only be written once per year, since I will insert all the teachers' schedules into it.
The original fields would be replaced by the Id of the other table (foreign key).
If you want to insert into the original table, you must already have the data in the foreign tables: the room, teacher, class, etc.
The code will look like this, for example:
context.Fields.Add(new Field { StartTime = ..., ..., TeacherId = 5, ClassId = 6, ... });
Original Table (Fields):
(PK) Id
StartTime
EndTime
TeacherId
ClassId
RoomId
SubjectId
DayWeek
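The "look up the Id, insert if missing" idea from the question can also be sketched directly in T-SQL. The sketch below uses the Schedule/Rooms layout from the question's edit and makes several assumptions: the key columns are IDENTITY columns, RoomName is unique, and the variable types and sample values are invented. Only the Rooms lookup is written out; Classes, Subjects and Teachers would follow the same pattern.

```sql
-- Values read from one Excel row (sample values; the types are assumptions).
DECLARE @StartTime TIME = '08:30', @EndTime TIME = '10:00', @DayWeek INT = 1;
DECLARE @Room NVARCHAR(100) = N'B12';

-- Get-or-create: reuse the existing room, otherwise insert it and take the new Id.
DECLARE @RoomId INT = (SELECT RoomId FROM Rooms WHERE RoomName = @Room);
IF @RoomId IS NULL
BEGIN
    INSERT INTO Rooms (RoomName) VALUES (@Room);
    SET @RoomId = SCOPE_IDENTITY();
END;

-- Placeholders: these would be resolved exactly like @RoomId above.
DECLARE @ClassId INT = 1, @SubjectId INT = 1, @TeacherId INT = 1;

INSERT INTO Schedule (StartTime, EndTime, DayWeek, RoomId, ClassId, SubjectId, TeacherId)
VALUES (@StartTime, @EndTime, @DayWeek, @RoomId, @ClassId, @SubjectId, @TeacherId);
```

Since the database is written only once per year by a single program, concurrent inserts of the same room name are unlikely to be a concern here; with several writers, a unique constraint on RoomName would be a sensible safeguard.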

insert a row in a table that links with PK autoincrement of another table

I have two tables
contact table
contactID (PK auto increment)
FirstName
LastName
Address
etc..
Patient table
PatientID
contactID (FK)
How can I add the contact info for Patient first, then link that contactID to Patient table
when the contactID is autoincrement (therefore not known until after the row is created)
I also have other tables
-Doctor, nurse etc
that also links to contact table..
Teacher table
TeacherID
contactID (FK)
So therefore all the contact details are located in one table.
Is this a good database design?
Or is it better to put contact info for each entity in its own table?
So like this..
Patient table
PatientID (PK auto increment)
FirstName
LastName
Address
Doctor table
DoctorID (PK auto increment)
FirstName
LastName
Address
In terms of programming, it is easier to just have one insert statement.
eg.
INSERT INTO Patient VALUES(Id, @Firstname, @lastname, @Address)
But I do like having the contact table separated (since it normalizes the data). The issue is that the contactID is not known until after the row is inserted, and it probably needs two insert statements (which I am not sure how to do).
=======
Reply to EDIT 4
With the login table, would you still have a userid(int PK) column?
E.g
Login table
UserId (int PK), Username, Password..
Username should be unique
You must first create the Contact and then once you know its primary key then create the Patient and reference the contact with the PK you now know. Or if the FK in the Patient table is nullable you can create the Patient first with NULL as the ContactId, create the contact and then update the Patient but I wouldn't do it like this.
The idea of foreign key constraints is that the row being referenced MUST exist therefore the row being referenced must exist BEFORE the row referencing it.
If you really need to be able to have the same Contact for multiple Patients then I think it's good db design. If the relationship is actually one-to-one, then you don't need to separate them into two tables. Given your examples, it might be that what you need is a Person table where you can put all the common properties of Doctors, Teachers and Patients.
EDIT:
I think inheritance is what you are really after. There are a few styles of implementing inheritance in a relational DB, but here's one example.
(Diagram: Person database design)
PersonId in Nurse and Doctor are foreign keys referencing Person table but they are also the primary keys of those tables.
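Since the diagram is not reproduced here, a rough T-SQL sketch of such a design might look like the following. Person.FirstName and Nurse.IsRegistered appear in the insert example in this answer; the Doctor.Specialty column and all column sizes are hypothetical.

```sql
CREATE TABLE Person (
    PersonId  INT IDENTITY PRIMARY KEY,
    FirstName NVARCHAR(100) NOT NULL
);

-- PersonId is both the primary key and a foreign key to Person,
-- so each Nurse row belongs to exactly one Person row.
CREATE TABLE Nurse (
    PersonId     INT NOT NULL PRIMARY KEY REFERENCES Person (PersonId),
    IsRegistered BIT NOT NULL
);

CREATE TABLE Doctor (
    PersonId  INT NOT NULL PRIMARY KEY REFERENCES Person (PersonId),
    Specialty NVARCHAR(100) NULL  -- hypothetical doctor-specific column
);
```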
To insert a Nurse row, you could do it like this (SQL Server):
INSERT INTO Person(FirstName) VALUES('Test nurse')
-- Same batch as the insert above, so SCOPE_IDENTITY() returns the new PersonId
INSERT INTO Nurse(PersonId, IsRegistered) VALUES(SCOPE_IDENTITY(), 1)
GO
EDIT2:
Google reveals that the MySQL equivalent of SCOPE_IDENTITY() is LAST_INSERT_ID() (see the MySQL documentation).
EDIT3:
I wouldn't separate doctors and nurses into their own tables in a way that duplicates columns. Doing a select without inner joins would probably be more efficient, but performance shouldn't be the only criterion, especially if the performance difference isn't that notable. There will be many occasions when you just need the common person data, so you don't always have to do the joins anyway. Having every person in the same table makes it possible to look for a person in a single table. Having common properties in a single table also allows you to have a doctor who is also a patient without duplicating any data. Later, if you want to have more common attributes, you'd need to add them to each "derived" table too, and I assure you that one day you or someone else will forget to add the properties to one of the tables.
If for some reason you are still worried about performance and are willing to sacrifice normalization to gain performance, another possibility is to have all person columns in the same table, maybe with a type column to distinguish them, and just accept a lot of null columns, so that all the nurse columns are null for doctors and so on. You can read about inheritance implementation strategies to get an idea, even though you aren't using Entity Framework.
EDIT4:
Even if you don't have any nurse-specific columns at the moment, I would still create a table for them if it's even slightly possible that there will be some in the future. Doing an inner join is a pretty good way to find the nurses, or you could do it in the WHERE clause (there are probably a billion ways to do this). You could have a type column in the Person table, but that would prevent the same person being a doctor and a patient at the same time. Also, in my opinion separate tables are more "strict" and clearer for (future) developers.
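For illustration, two of those ways to find the nurses, assuming Person and Nurse tables keyed on PersonId as in the insert example earlier in this answer:

```sql
-- Variant 1: inner join, only persons that have a Nurse row are returned.
SELECT p.PersonId, p.FirstName
FROM Person AS p
INNER JOIN Nurse AS n ON n.PersonId = p.PersonId;

-- Variant 2: the same filter expressed in the WHERE clause.
SELECT p.PersonId, p.FirstName
FROM Person AS p
WHERE EXISTS (SELECT 1 FROM Nurse AS n WHERE n.PersonId = p.PersonId);
```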
I would probably make PersonId nullable in the User table since you might have users that are not actual people in the organization. For example administrators or similar service users. Think about in terms of real world entities (forget about foreign keys and nullables), is every user absolutely part of the organization? But all this is up to you and the requirements of the software. Database design should begin with an entity relationship design where you figure out the real world relationships without considering how they will be mapped to a relational database. This helps you to figure out what the actual requirements are.

Records for Sales Person

I am designing a database and a C# app in which a record gets saved to the database. Say we have three salespeople, and each should be assigned records in strict rotation so that they all get to work on an equal number of records.
What I have done so far is create one table called Records and one called SalesPerson. Records would have the salesperson id as a foreign key and another column that says which agent it is assigned to, and this column would be incremented.
Do you think this is a good design? If not, can you give me any ideas?
To do this I would use the analytical functions ROW_NUMBER and NTILE (assuming your RDBMS supports them). This way you can give each available salesperson a pseudo id counting up from 1, then randomly give each unassigned record one of these pseudo ids, which spreads the records equally between the salespeople. Using pseudo ids rather than actual ids means it does not matter if the SalesPersonID values are not contiguous. For example:
-- CREATE SOME SAMPLE DATA
DECLARE @SalesPerson TABLE (SalesPersonID INT IDENTITY(1, 1) NOT NULL PRIMARY KEY, Name VARCHAR(50) NOT NULL, Active BIT NOT NULL)
DECLARE @Record TABLE (RecordID INT IDENTITY(1, 1) NOT NULL PRIMARY KEY, SalesPersonFK INT NULL, SomeOtherInfo VARCHAR(100))
INSERT @SalesPerson VALUES ('TEST1', 1), ('TEST2', 0), ('TEST3', 1), ('TEST4', 1);
INSERT @Record (SomeOtherInfo)
SELECT Name
FROM sys.all_objects
With this sample data, the first step is to find the number of available salespeople to allocate records to:
DECLARE @Count INT = (SELECT COUNT(*) FROM @SalesPerson WHERE Active = 1)
Next, use CTEs to contain the window functions (as they can't be used in join clauses):
;WITH Records AS
(   SELECT *,
        NTILE(@Count) OVER(ORDER BY NEWID()) [PseudoSalesPersonID]
    FROM @Record
    WHERE SalesPersonFK IS NULL -- UNALLOCATED RECORDS
), SalesPeople AS
(   SELECT SalesPersonID,
        ROW_NUMBER() OVER (ORDER BY SalesPersonID) [RowNumber]
    FROM @SalesPerson
    WHERE Active = 1 -- ACTIVE SALES PEOPLE
)
Finally, update the Records CTE with the actual SalesPersonID rather than the pseudo id (this UPDATE directly follows the CTE definitions above, in the same statement):
UPDATE Records
SET SalesPersonFK = SalesPeople.SalesPersonID
FROM Records
INNER JOIN SalesPeople
    ON PseudoSalesPersonID = RowNumber
This is quite confusing, as I suspect you're using the database term 'record' as well as an object/entity 'Record'.
The simple concept of having a unique identifier in one table that also features as a foreign key in another table is fine though, yes. It avoids redundancy.
Basics of normalisation
It's mostly as DeeMac said. But if your Record is an object (i.e. it has all the work details, or it's a sale or a transaction), then you need to separate that table. Have a table Record with all the details of that particular object. Have another table Salesman with all the details about the salesperson. (In a good design, you would only add the business-related attributes of the position to this table. All the personal details would go in a different table.)
Now for your problem, you can build two separate tables. One would be Record_Assignment where you will assign a Record to a Salesman. This table will hold all the active jobs. Another table will be Archived_Record_Assignment which will hold all the past jobs. You move all the completed jobs here.
For equal assignment of work, you said you want circular assignment. I am not sure if you want to spread work among all available salespeople or only a certain number. Usually assignments are given by team. Create a table (say, SalesTeam) with the Salesman ids of the salespeople you want to assign the jobs to (add a team id if you have multiple teams working on their own assigned work areas or customers; that's usually the case). When you want to assign a new job, query the Record_Assignment table for the last record, get its Salesman id and assign the job to the next salesman in the SalesTeam table. The assignment will be done through business logic (coding).
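The "next salesman after the last assignment" step could be sketched in T-SQL like this. The column names (AssignmentId as an IDENTITY column, RecordId, SalesmanId) and the sample record id are assumptions, not part of the description above.

```sql
DECLARE @RecordId INT = 42;  -- the new job to assign (sample value)

-- Who received the most recent assignment? NULL if nothing is assigned yet.
DECLARE @Last INT = (SELECT TOP (1) SalesmanId
                     FROM Record_Assignment
                     ORDER BY AssignmentId DESC);

-- Next salesman in id order; wrap around to the first one at the end of the list.
DECLARE @Next INT = COALESCE(
    (SELECT MIN(SalesmanId) FROM SalesTeam WHERE SalesmanId > @Last),
    (SELECT MIN(SalesmanId) FROM SalesTeam));

INSERT INTO Record_Assignment (RecordId, SalesmanId)
VALUES (@RecordId, @Next);
```

Two caveats with this sketch: if two jobs are assigned at the same moment, both could pick the same salesman unless the steps run in a suitably isolated transaction, and if the most recent assignment has already been moved to Archived_Record_Assignment, the rotation continues from whatever is left in the active table.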
I am not fully aware of your scenario. These are all my speculations so if you see something off according to your scenario, let me know.
Good Luck!

SQL Server - formatted identity column

I would like to have a primary key column in a table that is formatted as FOO-BAR-[identity number], for example:
FOO-BAR-1
FOO-BAR-2
FOO-BAR-3
FOO-BAR-4
FOO-BAR-5
Can SQL Server do this? Or do I have to use C# to manage the sequence? If that's the case, how can I get the next [identity number] part using Entity Framework?
Thanks
EDIT:
I need to do this because this column represents a unique identifier of a notice sent out to customers.
FOO will be a constant string
BAR will be different depending on the type of the notice (either Detection, Warning or Enforcement)
So is it better to have just an int identity column and append the values in Business Logic Layer in C#?
If you want this 'composited' field in your reports, I suggest that you:
Use an INT IDENTITY field as the PK in the table.
Create a view for this table. In this view you can additionally generate the field that you want using your strings and types.
Use this view in your reports.
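The steps above could be sketched as follows. The table, column and view names (Notice, NoticeType, NoticeView, NoticeCode) are assumptions; the three notice types come from the question, and 'FOO' stands for the constant prefix.

```sql
CREATE TABLE Notice (
    NoticeID   INT IDENTITY PRIMARY KEY,
    NoticeType VARCHAR(20) NOT NULL
        CHECK (NoticeType IN ('Detection', 'Warning', 'Enforcement'))
);
GO

-- The formatted identifier is generated on read and never stored.
CREATE VIEW NoticeView AS
SELECT NoticeID,
       NoticeType,
       CONCAT('FOO-', NoticeType, '-', NoticeID) AS NoticeCode
FROM Notice;
GO

INSERT INTO Notice (NoticeType) VALUES ('Warning'), ('Detection');
SELECT NoticeCode FROM NoticeView;
```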
But I still think, that there is BIG problem with DB design. I hope you'll try to redesign using normalization.
You can set anything as the PK in a table. But in this instance I would set the IDENTITY to just an auto-incrementing int and manually append FOO-BAR- to it in the SQL, BLL, or UI, depending on why it's being used. If there is a business reason for FOO and BAR, then you should also store these as values in your DB row. You can then create a key in the DB across the two or three columns, depending on why you're actually using the values.
But IMO I really don't think there is ever a real reason to concatenate an ID in such a fashion and store it as such in the DB. But then again I really only use an int as my ID's.
Another option would be to use what an old team I used to be on called a codes and value table. We didn't use it for precisely this (we used it in lieu of auto-incrementing identities to prevent environment mismatches for some key tables), but what you could do is this:
Create a table that has a row for each of your categories. Two (or more) columns in the row - minimum of category name and next number.
When you insert a record in the other table, you'll run a stored proc to get the next available identity number for that category, increment the number in the codes and values table by 1, and concatenate the category and number together for your insert.
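A minimal sketch of such a codes-and-values table and stored procedure. All names here (CodeCounter, GetNextCode, the 'FOO-' prefix handling) are made up for illustration. The single UPDATE both reads and increments the counter, so two concurrent callers should not receive the same number.

```sql
CREATE TABLE CodeCounter (
    Category   VARCHAR(20) NOT NULL PRIMARY KEY,
    NextNumber INT         NOT NULL
);
INSERT INTO CodeCounter (Category, NextNumber)
VALUES ('Detection', 1), ('Warning', 1), ('Enforcement', 1);
GO

CREATE PROCEDURE GetNextCode
    @Category VARCHAR(20),
    @Code     VARCHAR(50) OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @Taken TABLE (Number INT);

    -- Increment the counter and capture the value it had before the update.
    UPDATE CodeCounter
    SET NextNumber = NextNumber + 1
    OUTPUT deleted.NextNumber INTO @Taken
    WHERE Category = @Category;

    -- @Code stays NULL if the category does not exist.
    SELECT @Code = CONCAT('FOO-', @Category, '-', Number) FROM @Taken;
END;
GO

DECLARE @NewCode VARCHAR(50);
EXEC GetNextCode @Category = 'Warning', @Code = @NewCode OUTPUT;
SELECT @NewCode AS NewCode;
```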
However, if your main table is a high-volume table with lots of inserts, it's possible you could wind up with stuff out of sequence.
In any event, even if it's not high volume, I think you'd be better off to reexamine why you want to do this, and see if there's another, better way to do it (such as having the business layer or UI do it, as others have suggested).
It is quite possible by using computed column like this:
CREATE TABLE #test (
id INT IDENTITY UNIQUE CLUSTERED,
pk AS CONCAT('FOO-BAR-', id) PERSISTED PRIMARY KEY NONCLUSTERED,
name NVARCHAR(20)
)
INSERT INTO #test (name) VALUES (N'one'), (N'two'), (N'three')
SELECT id, pk, name FROM #test
DROP TABLE #test
Note that pk is set to NONCLUSTERED on purpose because it is of VARCHAR type, while the IDENTITY field, which will be unique anyway, is set to UNIQUE CLUSTERED.
