When should I consider representing the primary-key as classes?
Should we only represent primary keys as classes when a table uses composite-key?
For example:
public class PrimaryKey
{ ... ... ...}
Then
private PrimaryKey _parentID;
public PrimaryKey ParentID
{
get { return _parentID; }
set { _parentID = value; }
}
And
public void Delete(PrimaryKey id)
{...}
When should I consider storing data as comma-separated values in a column in a DB table rather than storing them in different columns?
When should I consider representing the table-id columns as classes?
This is much more difficult to answer without knowing your application architecture. If you use an ORM such as NHibernate or LINQ to SQL, it will create classes for you automatically.
In general - if your primary key is a composite and has meaning in your domain, create a class for it.
If it is not a composite, there is no need for a class.
If it has no meaning in your domain it is difficult to justify a class (if creating one, I would probably go with a struct instead of a class, as it would be a value type). The only reason I would use one is if the key needs to be used as is in your code and the constituent parts do not normally need to be accessed separately.
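To make the "value type" idea concrete, here is a minimal sketch in Python rather than C# (a frozen dataclass is only a rough analogue of a C# struct). The `OrderLineKey` name and its two fields are hypothetical and do not come from the question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderLineKey:
    """Hypothetical composite key made of two parts."""
    order_id: int
    line_no: int


# Frozen dataclasses compare by value and are hashable,
# so a key can be passed around and used as a dictionary key as is.
lines = {
    OrderLineKey(1001, 1): "widget",
    OrderLineKey(1001, 2): "gadget",
}

key = OrderLineKey(1001, 2)
found = lines[key]  # lookup by value, not by object identity
```

The constituent parts stay accessible (`key.order_id`), but most code can treat the key as one opaque value.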
When should I consider storing data as comma-separated values in a column in a DB table rather than storing them in different columns?
Never. You should keep your tables normalized, with different columns and tables for the different data. Using comma-separated values is bad practice in this regard, especially considering the fairly poor text manipulation support in SQL.
When should I consider representing
the table-id columns as classes?
What do you mean?
When should I consider storing data as
comma-separated values in a column in
a DB table rather than storing them in
different columns?
According to Normalization Rules you shouldn't store multiple values in the same column.
Comma separated values in fields (or any other similar trick) is poor practice and often resorted to either as a stopgap measure (i.e. you find out that you needed a multi-value when it's too late to change the data model) or some legacy cruft.
By having multivalues "encrypted" in a field you lose all the benefits of having an RDBMS model: in particular you make finding/sorting/comparing values in the comma-separated field(s) hard if not impossible to use along with the rest of your data.
In almost all cases one should store each comma-delimited value in its own column. This enables SQL selects to filter rows by specific comma-delimited values.
E.g., if 12 comma-delimited values were available for each table row representing sales per month (Jan-Dec) then storing the numbers in 12 columns enables manipulation; such as: return all rows where August sales > $100,000.00. Had one 'stuffed' all 12 values into a single column then all rows would need to be returned and the column parsed for each row to pull out the August figure.
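The difference can be sketched with SQLite from Python. The table and column names are hypothetical, and only three months are modelled to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_by_column (region TEXT, jul REAL, aug REAL, sep REAL)")
conn.execute("CREATE TABLE sales_stuffed (region TEXT, months TEXT)")

rows = [
    ("North", 90000.0, 120000.0, 80000.0),
    ("South", 110000.0, 95000.0, 70000.0),
]
conn.executemany("INSERT INTO sales_by_column VALUES (?, ?, ?, ?)", rows)
conn.executemany(
    "INSERT INTO sales_stuffed VALUES (?, ?)",
    [(r[0], ",".join(str(v) for v in r[1:])) for r in rows],
)

# One column per month: the database filters the rows itself.
by_column = [
    r[0]
    for r in conn.execute("SELECT region FROM sales_by_column WHERE aug > 100000")
]

# Stuffed column: every row is fetched and parsed in the application.
stuffed = [
    region
    for region, months in conn.execute("SELECT region, months FROM sales_stuffed")
    if float(months.split(",")[1]) > 100000  # position 1 is August here
]
```

Both lists contain the same region, but only the first query lets the database do the work.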
One example where 'stuffing' may be considered is where the values in a comma-delimited data set cannot be related or compared with data from another row.
E.g., in a multiple-choice questionnaire the answer will be one or more options. Given a question whose correct answer is OPTION A & OPTION B and a second question whose correct answer is OPTION B & OPTION C, one could consider stuffing each answer into a single column. In this example one could store "A,B" and "B,C", provided one accepts that there is never a need to compare answers with each other.
The only situation in which I might consider using a custom class as the 'primary key' is when using an O/RM against a legacy database that uses composite keys.
On the first, I am not sure what you are looking for. I create business objects as classes that map to my data layer, which is typically a datatable containing the data.
The answer to the second question is never. There are very few situations where I would store a comma-separated list instead of creating a normalized data structure.
Related
I have to create a database structure. I have a question about foreign keys and good practice:
I have a table which must have a field that can be two different string values, either "A" or "B".
It cannot be anything else (therefore, I cannot use a string type field).
What is the best way to design this table:
1) create an int field which is a foreign key to another table with just two records, one for the string "A" and one for the string "B"
2) create an int field then, in my application, create an enumeration such as this
public enum StringAllowedValues
{
A = 1,
B
}
3) ???
In advance, thanks for your time.
Edit: 13 minutes later and I get all this awesome feedback. Thank you all for the ideas and insight.
Many database engines support enumerations as a data type. And there are, indeed, cases where an enumeration is the right design solution.
However...
There are two requirements which may decide that a foreign key to a separate table is better.
The first is: it may be necessary to increase the number of valid options in that column. In most cases, you want to do this without a software deployment; enumerations are "baked in", so in this case, a table into which you can write new data is much more efficient.
The second is: the application needs to reason about the values in this column, in ways that may go beyond "A" or "B". For instance, "A" may be greater/older/more expensive than "B", or there is some other attribute to A that you want to present to the end user, or A is short-hand for something.
In this case, it is much better to explicitly model this as columns in a table, instead of baking this knowledge into your queries.
In 30 years of working with databases, I personally have never found a case where an enumeration was the right decision....
Create a secondary table with the meanings of these integer codes. There's nothing that compels you to JOIN that in, but if you need to that data is there. Within your C# code you can still use an enum to look things up but try to keep that in sync with what's in the database, or vice-versa. One of those should be authoritative.
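One way to keep the two in sync is a small check at start-up. This is only a sketch in Python with SQLite: the `allowed_values` table and the `enum_matches_table` helper are hypothetical names, and the enum mirrors the one in the question.

```python
import sqlite3
from enum import IntEnum


class StringAllowedValues(IntEnum):
    A = 1
    B = 2


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE allowed_values (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)
conn.executemany(
    "INSERT INTO allowed_values (id, name) VALUES (?, ?)", [(1, "A"), (2, "B")]
)


def enum_matches_table(connection):
    """Return True when the enum members and the lookup rows agree exactly."""
    in_db = dict(connection.execute("SELECT id, name FROM allowed_values"))
    in_code = {member.value: member.name for member in StringAllowedValues}
    return in_db == in_code


in_sync = enum_matches_table(conn)
```

If someone later adds a row to the table without updating the enum, the check returns False, which is a cue to decide which side is authoritative.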
In practice you'll often find that short strings are easier to work with than rigid enums. In the 1990s when computers were slow and disk space scarce you had to do things like this to get reasonable performance. Now it's not really an issue even on tables with hundreds of millions of rows.
I have a table that holds constant values. Is it better to keep this table in my database (which is SQL), or to have an enum in my code and delete the table?
My table has only 2 columns and at most 20 rows. The rows are fixed and are filled once, the first time I run the application.
I would suggest creating an enum for your case. Since the values are fixed (and I am assuming that the table is not going to change very often), you can use an enum. A table in the database means an extra hit on the database and a database connection, both of which you can skip if you use an enum.
Also, a lot may depend on how many operations you are going to perform on your values. For example, it's tedious to work out which distinct enum values appear in your table, whereas with the table approach it would be a simple select distinct. So you may have to look at your needs and the operations you will perform on these values.
As far as the performance is concerned you can look at: Enum Fields VS Varchar VS Int + Joined table: What is Faster?
As you can see, ENUM and VARCHAR results are almost the same, but join
query performance is 30% lower. Also note the times themselves –
traversing about same amount of rows full table scan performs about 25
times better than accessing rows via index (for the case when data
fits in memory!)
So, if you have an application and you need to have some table field
with a small set of possible values, I’d still suggest you to use
ENUM, but now we can see that performance hit may not be as large as
you expect. Though again a lot depends on your data and queries.
That depends on your needs.
You may want to translate the enum values (if you are showing them in a GUI) and order a set of records based on the translated values. For example: imagine you have an Employees table and a Position column. If the record set is big, and you want to sort or order by the translated Position column, then you have to keep the enum values + translations in the database.
Otherwise KISS and have it in code. You will spare time on asking database for values.
It depends on the nature of those constants.
If they are low-level system constants that should never change (like pi=3.1415), it is better to keep them only on the code side, in some config file. Also, if performance is critical and you use them very often (on almost every request), it is better to keep them on the code side.
If they are constants (maybe business constants) that can change in the future, it is OK to put them in a table; then you have more flexibility to change them (for instance from an admin panel).
It really depends on what you actually need.
With Enum
It is faster to access
It is bound to that one application (you can share it through a reference, but that is not as clean as using a DB).
You can use it in a switch statement.
An enum usually does not care about its underlying value, and that value is limited to integral types.
With DB
It is slower, because you have to make a connection and run a query.
The data can be shared widely.
You can set the value to be anything (any type, any value).
So, if you will use it in only one application, an enum is good enough. But if several applications are going to use it, then a DB would be the better option.
I am wondering which method is the best way to store a list of integers in a SQL column.
.....i.e. "1,2,3,4,6,7"
EDIT: These values represent other IDs in SQL tables. The row would look like
[1] [2]
id, listOfOtherIDs
The choices I have researched so far are:
A varchar of separated value that are "explode-able" i.e. by commas or tabs
An XML containing all the values individually
Using individual rows for each value.
Which method is the best method to use?
Thanks,
Ian
A single element of a record can only refer to one value; it's a basic database design principle.
You will have to change the database's design: use a single row for each value.
You might want to read up on normalization.
As is shown here in the description of the first normal form:
First normal form states that at every row and column intersection in the table, there exists a single value, and never a list of values. For example, you cannot have a field named Price in which you place more than one Price. If you think of each intersection of rows and columns as a cell, each cell can hold only one value.
While Jeroen's answer is valid for "multi-valued" attributes, there are genuine situations where multiple comma-separated values may actually represent one large value. Things like path data (on a map), an integer sequence, a list of prime factors and many more could well be stored in a comma-separated varchar. I think it is better to explain what exactly you are storing and how you need to retrieve and use that value.
EDIT:
Looking at your edit, if by IDs you mean PK of another table, then this sounds like a genuine M-N relation between this table and the one whose IDs you're storing. This stuff should really be stored in a separate gerund, which BTW is a table that would have the PK of each of these tables as FKs, thus linking the related rows of both tables. So Jeroen's answer very well suits your situation.
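Such a linking table can be sketched with SQLite from Python. The table names `parent`, `other` and `parent_other` are hypothetical, and the IDs are the ones from the question's example list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript(
    """
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE other  (id INTEGER PRIMARY KEY);
    -- One row per (parent, other) pair instead of a comma-separated list.
    CREATE TABLE parent_other (
        parent_id INTEGER NOT NULL REFERENCES parent(id),
        other_id  INTEGER NOT NULL REFERENCES other(id),
        PRIMARY KEY (parent_id, other_id)
    );
    """
)

other_ids = (1, 2, 3, 4, 6, 7)
conn.execute("INSERT INTO parent (id) VALUES (1)")
conn.executemany("INSERT INTO other (id) VALUES (?)", [(i,) for i in other_ids])
conn.executemany(
    "INSERT INTO parent_other (parent_id, other_id) VALUES (1, ?)",
    [(i,) for i in other_ids],
)

# All IDs linked to parent 1.
linked = [
    r[0]
    for r in conn.execute(
        "SELECT other_id FROM parent_other WHERE parent_id = ? ORDER BY other_id",
        (1,),
    )
]

# The reverse lookup is just as easy, which a stuffed column makes hard.
parents_of_6 = [
    r[0]
    for r in conn.execute("SELECT parent_id FROM parent_other WHERE other_id = ?", (6,))
]
```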
I have this problem and I don't know what is the best solution for it.
I have a table called Employees with a column called LastWork. This column should only contain custom values that I choose, for example:
value 1
value 2
and I want the user to select the value from ComboBox control so I have 2 ideas for it but I don't know what is the best for it.
A - add these value to Combobox as string in Items property and store them as string in DB.
B - create separate table in my db called for example 'LastWork' with 2 columns 'LastWorkID', 'LastWorkName' and insert my values in it, and then I can add binding source control and I can use data bound items to store the id as integer in my main table and show the LastWorkName for users.
I prefer to use the B method because in some forms I have DataGridView control with edit permission, and I want to display Combobox in it instead of Textbox to select from these custom values.
I hope you understood my questions.
Normally data normalization is a good thing, so I too would go with your option B.
By having a separate table and a foreign key relationship to it, you can enforce data integrity; easily get a list of all available (not just all selected) options; have a single place in which to change the text of an option (what if someone decides to call it "value one" instead of "value 1", for example?); and so on and so forth.
These might not be huge benefits in a small application and with only two possible options, but we all know that applications very often tend to grow in scope over time.
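The integrity and rename benefits can be sketched with SQLite from Python. The `LastWork` table and its columns follow the question; the `Employees` columns other than `LastWorkID` and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.execute(
    "CREATE TABLE LastWork (LastWorkID INTEGER PRIMARY KEY, LastWorkName TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE Employees ("
    " EmployeeID INTEGER PRIMARY KEY,"
    " Name TEXT NOT NULL,"
    " LastWorkID INTEGER NOT NULL REFERENCES LastWork(LastWorkID))"
)
conn.executemany("INSERT INTO LastWork VALUES (?, ?)", [(1, "value 1"), (2, "value 2")])
conn.execute("INSERT INTO Employees VALUES (1, 'Alice', 1)")

# Data integrity: an ID that is not in the lookup table is rejected.
try:
    conn.execute("INSERT INTO Employees VALUES (2, 'Bob', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# All available options (not just the selected ones), e.g. for a ComboBox.
options = list(
    conn.execute("SELECT LastWorkID, LastWorkName FROM LastWork ORDER BY LastWorkID")
)

# Renaming an option happens in one place; employee rows are untouched.
conn.execute("UPDATE LastWork SET LastWorkName = 'value one' WHERE LastWorkID = 1")
name = conn.execute(
    "SELECT w.LastWorkName FROM Employees e"
    " JOIN LastWork w ON w.LastWorkID = e.LastWorkID"
    " WHERE e.EmployeeID = 1"
).fetchone()[0]
```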
In a normalized database, your "option B" is usually the way to go because it eliminates duplicate data. It will potentially introduce an additional join into your queries when you need the name (and not just the ID), but it also allows you to rename lookup names easily without altering their underlying IDs.
For performance reasons, it's often a good idea to cache lookup values such as you describe in the business tier so that your lookup table is not hit over and over again (such as when building many rows of a grid).
I would always save them in the DB. If you have to localize your app, this helps a lot. Additionally, it lets you apply the referential integrity checks of the database.
I have an application in which I need to query life tables (for insurance calculation).
I was thinking about using XML to store the data, but thought it was a little big, but maybe a little small for using a full-fledged database. So I chose to use SQLite.
In my application, I have enums defining a few different things. For example, GENDER.Male, GENDER.Female. and JOBTYPE.BlueCollar, JOBTYPE.WhiteCollar. etc etc.
I have some methods that look like this: (example)
FindLifeExpectancy(int age, GENDER gender);
FindDeathRate(int age, JOBTYPE jobType);
So my question is: How do you model enums in a database? I don't think it is best practice to use 0 or 1 in the database to store JOBTYPE because that would be meaningless to anyone looking at it. But if you used nvarchar, to store "BlueCollar", there would be a lot of duplicate data.
I don't think GENDER or JOBTYPE should have an entire class, or be a part of the entity model, because of the little information they provide.
How is this normally done?
Thanks.
I prefer to statically map my enums in my program to a lookup table in my database. I rarely actually use the lookup table to do a join. As an example I might have the following tables:
Gender
GenderID  Name
1         Male
2         Female
Accounts
AccountID  GenderID  FirstName  LastName
1          1         Andrew     Siemer
2          2         Jessica    Siemer
And in code I would then have my enum defined with the appropriate mapping
public enum Gender
{
Male = 1,
Female = 2
}
Then I can use my enum in code and when I need to use the enum in a LINQ to SQL query I just get its physical value like this
int genderValue = (int)Gender.Male;
This method may make some folks out there a bit queasy though, given that you have just coupled your code to values in your database! But this method makes working with your code and the data that backs that code much easier. Generally, if someone swaps out the ID of a lookup table, you are gonna be hosed in some way or another, given that it is mapped across your database anyhow! I prefer the readability and ubiquitous nature of this design though.
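The same static mapping can be sketched in Python with SQLite (the answer itself uses C# and LINQ to SQL, so this is only an analogue). The tables and rows follow the example above; the query is an assumption about how the enum would be used.

```python
import sqlite3
from enum import IntEnum


class Gender(IntEnum):
    # Values are deliberately tied to the GenderID rows in the lookup table.
    Male = 1
    Female = 2


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Gender (GenderID INTEGER PRIMARY KEY, Name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE Accounts ("
    " AccountID INTEGER PRIMARY KEY,"
    " GenderID INTEGER NOT NULL REFERENCES Gender(GenderID),"
    " FirstName TEXT, LastName TEXT)"
)
conn.executemany("INSERT INTO Gender VALUES (?, ?)", [(1, "Male"), (2, "Female")])
conn.executemany(
    "INSERT INTO Accounts VALUES (?, ?, ?, ?)",
    [(1, 1, "Andrew", "Siemer"), (2, 2, "Jessica", "Siemer")],
)

# Code reads in terms of the enum; only its integer value reaches the query.
first_names = [
    r[0]
    for r in conn.execute(
        "SELECT FirstName FROM Accounts WHERE GenderID = ?", (int(Gender.Female),)
    )
]

# Going the other way: turn a stored ID back into the enum member.
stored_id = conn.execute("SELECT GenderID FROM Accounts WHERE AccountID = 1").fetchone()[0]
gender = Gender(stored_id)
```

No join against the lookup table is needed for either direction, which matches the "rarely actually use the lookup table to do a join" remark.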
While it's unlikely that you will be adding a new gender, I wouldn't be so sure about the jobtype enum. I'd have used a separate table for both, and have foreign keys to this table every where I need to reference them. The schema will be extensible, the database will automatically check that only possible values are saved in the referencing tables.
The SQL equivalent of 'enums' are lookup tables. These are tables with two (sometimes more) columns:
a code, typically short, numeric or character (ex: 'R', 'S', 'M'...)
a text definition (ex: 'Retired', 'Student', 'Military'...)
extra columns can be used to store definitions, or alternate versions of the text (for example a short abbreviation for columnar reports)
The short code is the type of value stored in the database, avoiding the replication you mentioned. For relatively established categories (say Male/Female), you may just use a code, without 'documenting' it in a lookup table.
If you have very many different codes, it may be preferable to keep their lookup in a single SQL table, rather than having a proliferation of dozens of tables. You can simply add a column that is the "category", which itself is a code, designating the nature of the group of codes defined in this category ("marital status", "employment", "education"...)
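A single shared lookup table might look like the following sketch, in Python with SQLite. The `lookup` table, its column names and the marital-status rows are hypothetical; the employment codes are the ones given above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lookup ("
    " category    TEXT NOT NULL,"
    " code        TEXT NOT NULL,"
    " description TEXT NOT NULL,"
    " PRIMARY KEY (category, code))"  # a code only has to be unique per category
)
conn.executemany(
    "INSERT INTO lookup VALUES (?, ?, ?)",
    [
        ("employment", "R", "Retired"),
        ("employment", "S", "Student"),
        ("employment", "M", "Military"),
        ("marital status", "S", "Single"),
        ("marital status", "M", "Married"),
    ],
)


def options(category):
    """Code/description pairs for one category, e.g. to fill a drop-down."""
    return list(
        conn.execute(
            "SELECT code, description FROM lookup"
            " WHERE category = ? ORDER BY description",
            (category,),
        )
    )


def describe(category, code):
    """Clear text for a code found in the database, or None if unknown."""
    row = conn.execute(
        "SELECT description FROM lookup WHERE category = ? AND code = ?",
        (category, code),
    ).fetchone()
    return row[0] if row else None
```

Because the same code can mean different things in different categories, every lookup has to supply the category along with the code.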
The info from the lookup tables can be used to populate drop-downs and such in the UI, whereby the end user sees the clear text but the application can use the code to query the database. It is also used in the reverse direction, to produce the clear text for codes found in the database, for displaying result lists and such.
A JOIN construct at the level of SQL is a convenient way to relate the lookup table and the main table. For example:
SELECT Name, Dob, M.MaritalStatus
FROM tblCustomers C
LEFT OUTER JOIN tblMaritalLkup M ON C.MStatus = M.Code
WHERE ...