I have a table that holds constant values. Is it better to keep this table in my (SQL) database, or to replace it with an enum in my code and delete the table?
The table has only 2 columns and at most 20 rows. The rows are fixed and are filled once, the first time I run the application.
I would suggest creating an enum for your case. Since the values are fixed (and I am assuming the table is not going to change very often), an enum is a good fit. Keeping the values in a table means an extra database hit, and a database connection, both of which you can skip by using an enum.
A lot also depends on how many operations you are going to perform on these values. For example, getting the distinct values is tedious with an enum, whereas with the table approach it is a simple SELECT DISTINCT. So look at your needs and at the operations you will perform on these values.
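To make the enum side of this trade-off concrete, here is a minimal Python sketch. The OrderStatus enum and its members are hypothetical stand-ins for the 20 fixed rows in the question; the point is only that lookups happen in process, with no connection or query.

```python
from enum import Enum


class OrderStatus(Enum):
    """Hypothetical fixed values that would otherwise live in a 2-column table."""
    NEW = 1
    PAID = 2
    SHIPPED = 3


def all_status_names():
    # Iterating an Enum yields its members in definition order, no database round trip.
    return [status.name for status in OrderStatus]


def status_from_id(status_id):
    # Look a member up by its stored integer value; raises ValueError for an unknown id.
    return OrderStatus(status_id)
```

Anything beyond this kind of lookup (filtering, joining, reporting) is where the table approach starts to pay off.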
As far as the performance is concerned you can look at: Enum Fields VS Varchar VS Int + Joined table: What is Faster?
As you can see, ENUM and VARCHAR results are almost the same, but join
query performance is 30% lower. Also note the times themselves –
traversing about same amount of rows full table scan performs about 25
times better than accessing rows via index (for the case when data
fits in memory!)
So, if you have an application and you need to have some table field
with a small set of possible values, I’d still suggest you to use
ENUM, but now we can see that performance hit may not be as large as
you expect. Though again a lot depends on your data and queries.
That depends on your needs.
You may want to translate the enum values (if you are showing them in a GUI) and order a set of records by the translated values. For example: imagine you have an Employees table with a Position column. If the record set is big and you want to sort by the translated position, then you have to keep the enum values plus their translations in the database.
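A minimal sketch of that case, using Python's sqlite3 module with an in-memory database. The table names, column names and sample rows are assumptions made up for illustration; the idea is only that the ORDER BY can run in the database because the translations are stored there.

```python
import sqlite3


def employees_sorted_by_translated_position(lang):
    """Return (employee, translated position) pairs ordered by the translated label."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE position_translation (
            position_id INTEGER, lang TEXT, label TEXT,
            PRIMARY KEY (position_id, lang));
        CREATE TABLE employee (name TEXT, position_id INTEGER);
        INSERT INTO position_translation VALUES
            (1, 'en', 'Manager'),          (2, 'en', 'Engineer'),
            (1, 'de', 'Abteilungsleiter'), (2, 'de', 'Ingenieur');
        INSERT INTO employee VALUES ('Ann', 1), ('Bob', 2);
    """)
    rows = conn.execute(
        """SELECT e.name, t.label
           FROM employee e
           JOIN position_translation t
             ON t.position_id = e.position_id AND t.lang = ?
           ORDER BY t.label""",
        (lang,),
    ).fetchall()
    conn.close()
    return rows
```

The same employees come back in a different order depending on the language, which is hard to do efficiently if the translations exist only in application code.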
Otherwise KISS and keep it in code. You will save the time spent asking the database for the values.
It depends on the nature of those constants.
If they are low-level system constants that should never change (like pi = 3.1415), it is better to keep them only in code or in a config file. Likewise, if performance is critical and you use them very often (on almost every request), it is better to keep them in code.
If they are constants (perhaps business constants) that can change in the future, it is OK to put them in a table. You then have more flexibility to change them (for instance from an admin panel).
It really depends on what you actually need.
With Enum
It is faster to access
It is bound to one application. (You can share it by referencing the assembly that defines it, but that is not as clean as using the DB.)
You can use it in a switch statement.
An enum usually does not care about its underlying value, and that value is limited to an integer type.
With DB
It is slower, because you have to open a connection and run a query.
The data can be shared widely.
You can set the value to be anything (any type any value).
So, if you will use it in only one application, an enum is good enough. But if several applications are going to use it, the DB is the better option.
Just out of curiosity, how exactly does SELECT * FROM table WHERE column = "something" work?
Is the underlying principle same as that of a for/foreach loop with an if condition like:
for (iterator)
{
if(condition)
//print results
}
If I am dealing with, say, 100 records, will there be any considerable performance difference between the two approaches in getting the data I want?
SQL is a 4th generation language, which makes it very different from programming languages. Instead of telling the computer how to do something (loop through rows, compare columns), you tell the computer what to do (get the rows matching a condition).
The DBMS may or may not use a loop. It could as well use hashes and buckets, pre-sort a data set, whatever. It is free to choose.
On the technical side, you can provide an index in the database, so the DBMS can look up the keys to access the rows quickly (like quickly finding names in a telephone book). This gives the DBMS an additional option for how to access the data, but it is still free to use a completely different approach, e.g. read the whole table sequentially.
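One way to watch the DBMS make that choice is to ask it for its plan. The sketch below uses SQLite through Python's sqlite3 module; the people table, its columns and the index name are made up for illustration. SQLite is only one engine, and the exact wording of the plan text varies between versions, so treat the comments as typical rather than guaranteed.

```python
import sqlite3


def query_plan(conn, sql):
    # Each EXPLAIN QUERY PLAN row ends with a human-readable detail string.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people (name, city) VALUES (?, ?)",
    [(f"name{i}", f"city{i % 10}") for i in range(100)],
)

sql = "SELECT * FROM people WHERE name = 'name42'"

# Without an index the only option is to read every row (reported as a scan).
plan_without_index = query_plan(conn, sql)

# With an index on the filtered column the planner can search the index instead.
conn.execute("CREATE INDEX idx_people_name ON people (name)")
plan_with_index = query_plan(conn, sql)

# The query text and its result are identical in both cases; only the plan differs.
matching_rows = conn.execute(sql).fetchall()
```

The query never says how to find the rows. Adding the index changes the plan without changing the SQL, which is exactly the difference from a hand-written loop.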
I have to create a database structure. I have a question about foreign keys and good practice:
I have a table which must have a field that can be two different string values, either "A" or "B".
It cannot be anything else (therefore, I cannot use a plain string field).
What is the best way to design this table:
1) create an int field which is a foreign key to another table with just two records, one for the string "A" and one for the string "B"
2) create an int field then, in my application, create an enumeration such as this
public enum StringAllowedValues
{
A = 1,
B
}
3) ???
In advance, thanks for your time.
Edit: 13 minutes later and I get all this awesome feedback. Thank you all for the ideas and insight.
Many database engines support enumerations as a data type. And there are, indeed, cases where an enumeration is the right design solution.
However...
There are two requirements which may decide that a foreign key to a separate table is better.
The first is: it may be necessary to increase the number of valid options in that column. In most cases, you want to do this without a software deployment; enumerations are "baked in", so in this case, a table into which you can write new data is much more efficient.
The second is: the application needs to reason about the values in this column, in ways that may go beyond "A" or "B". For instance, "A" may be greater/older/more expensive than "B", or there is some other attribute to A that you want to present to the end user, or A is short-hand for something.
In this case, it is much better to explicitly model this as columns in a table, instead of baking this knowledge into your queries.
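A minimal sketch of option 1 with such extra attributes, using SQLite through Python's sqlite3 module. The grade and item tables, the sort_rank and description columns, and the sample rows are all hypothetical; the foreign key is what restricts the field to "A" or "B", and the extra columns carry the knowledge that would otherwise be baked into queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE grade (
        id          INTEGER PRIMARY KEY,
        code        TEXT NOT NULL UNIQUE,   -- 'A' or 'B'
        sort_rank   INTEGER NOT NULL,       -- hypothetical attribute: how the codes compare
        description TEXT NOT NULL           -- hypothetical attribute: text shown to the user
    );
    INSERT INTO grade VALUES (1, 'A', 10, 'Premium'), (2, 'B', 20, 'Standard');

    CREATE TABLE item (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        grade_id INTEGER NOT NULL REFERENCES grade(id)
    );
    INSERT INTO item (name, grade_id) VALUES ('widget', 2), ('gadget', 1);
""")


def try_insert(name, grade_id):
    """Insert an item; return False when the grade does not exist in the lookup table."""
    try:
        with conn:
            conn.execute("INSERT INTO item (name, grade_id) VALUES (?, ?)", (name, grade_id))
        return True
    except sqlite3.IntegrityError:
        return False


# Ordering uses an attribute of the lookup row instead of logic hidden in the query.
items_by_rank = conn.execute(
    """SELECT i.name, g.code
       FROM item i JOIN grade g ON g.id = i.grade_id
       ORDER BY g.sort_rank"""
).fetchall()
```

Adding a third option later is then a single INSERT into grade, with no software deployment.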
In 30 years of working with databases, I personally have never found a case where an enumeration was the right decision....
Create a secondary table with the meanings of these integer codes. There's nothing that compels you to JOIN that in, but if you need to that data is there. Within your C# code you can still use an enum to look things up but try to keep that in sync with what's in the database, or vice-versa. One of those should be authoritative.
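Keeping the two in sync can be checked mechanically, for example at application start-up or in a test. The sketch below is in Python rather than C#, and assumes a hypothetical lookup table named allowed_value with id and name columns; here the database is treated as authoritative and the enum is compared against it.

```python
import sqlite3
from enum import IntEnum


class StringAllowedValues(IntEnum):
    """Mirror of the enum in the question, with B's implicit value written out."""
    A = 1
    B = 2


def enum_matches_table(conn):
    """Return True when the lookup table and the enum hold the same (id, name) pairs."""
    in_db = set(conn.execute("SELECT id, name FROM allowed_value"))
    in_code = {(member.value, member.name) for member in StringAllowedValues}
    return in_db == in_code


def demo_connection():
    # Hypothetical lookup table seeded with the same two rows as the enum.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE allowed_value (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.executemany("INSERT INTO allowed_value VALUES (?, ?)", [(1, "A"), (2, "B")])
    return conn
```

If someone adds a row to the table without updating the code (or the reverse), the check fails early instead of surfacing as a confusing bug later.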
In practice you'll often find that short strings are easier to work with than rigid enums. In the 1990s when computers were slow and disk space scarce you had to do things like this to get reasonable performance. Now it's not really an issue even on tables with hundreds of millions of rows.
Suppose I have one table that holds blogs.
The schema looks like :
ID (int)| Title (varchar 50) | Value (longtext) | Images (longtext)| ....
In the Images field I store an XML-serialized list of the images that are associated with the blog.
Should I use another table for this purpose?
Yes, you should put the images in another table. Having several values in the same field indicates denormalized data and makes it hard to work with the database.
As with all rules, there are exceptions where it makes sense to put XML with multiple values in one field in the database. The first requirement is that:
The data should always be read and written together. There is no need to read or update just one of the values.
If that is fulfilled, there can be a number of reasons to put the data together in one field:
Storage efficiency, if space has proved to be a problem.
Retrieval efficiency, if performance has proved to be a problem.
Schema flexibility, where one XML field can eliminate tens or hundreds of different tables.
I would certainly use another table. If you use XML, what happens when you need to go through and update the references to all images? Would you rather just do an Update blog_images Set ..., or parse the XML for each row, make the update, and then regenerate the updated XML for each?
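A minimal sketch of the separate-table version, using SQLite through Python's sqlite3 module. The blog_images table name comes from the answer above; its columns, the sample URLs and the host names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE blog_images (
        id      INTEGER PRIMARY KEY,
        blog_id INTEGER NOT NULL REFERENCES blog(id),
        url     TEXT NOT NULL
    );
    INSERT INTO blog VALUES (1, 'First post'), (2, 'Second post');
    INSERT INTO blog_images (blog_id, url) VALUES
        (1, 'http://old.example.com/a.png'),
        (1, 'http://old.example.com/b.png'),
        (2, 'http://old.example.com/c.png');
""")

# One statement rewrites every image reference; no XML parsing or re-serialising.
with conn:
    updated = conn.execute(
        "UPDATE blog_images SET url = replace(url, 'old.example.com', 'cdn.example.com')"
    ).rowcount

# Images for one blog are fetched only when they are actually needed.
first_post_images = [
    row[0]
    for row in conn.execute("SELECT url FROM blog_images WHERE blog_id = 1 ORDER BY id")
]
```

With the XML column, the same change means reading every blog row, editing the document in application code, and writing it back.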
Well, it is a bit "inner platform", but it will work. A separate table would allow better image querying, although on some RDBMS platforms this could also be achieved via an XML-type column and SQL/XML.
If this data only has to be opaque storage, then maybe. However, keep in mind you'll generally have to bring back the entire XML to the app-tier to do anything interesting with it (or: depending on platform, use SQL/XML, but I advise against this, as the DB isn't the place to do such processing in most cases).
My advice in all other cases: separate table.
That depends on whether you'd need to query on the actual image data itself. If you see a possible need to query on certain images, or images with certain attributes, then it would probably be best to store that image data in a different way.
Otherwise, leave it the way it is.
But remember, only include the fields in your SELECT when you need them.
Should I use another table for this purpose?
Not necessarily. You just have to ensure that you are not selecting the images field in your queries when you don't need it. But if you wanted to normalize your schema, you could use another table and perform a join when you need the images.
I have people and places data as:
Person entity has
IList<DateRangePlaces> each having
IList<Place> of possible places
Schedule day pattern, e.g. 10 days available, 4 unavailable
Within a particular DateRangePlaces date range, the Schedule pattern determines whether the person can go to a particular place or not.
Place entity has
IList<DateRangeTiming> each defining opening/closing times within each date range
Overlapping date ranges work as LIFO: for each day that is already covered by an earlier definition, the newer timing definition takes precedence.
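As a minimal sketch of that LIFO rule in Python: the tuple layout (start, end, opens, closes) and the sample dates are assumptions, standing in for the DateRangeTiming entries kept in definition order.

```python
from datetime import date, time

# Hypothetical DateRangeTiming entries, in the order they were defined.
timings = [
    (date(2024, 1, 1), date(2024, 12, 31), time(9), time(17)),   # defined first
    (date(2024, 7, 1), date(2024, 8, 31), time(10), time(20)),   # defined later, overlaps
]


def timing_for_day(timings, day):
    """Return (opens, closes) from the most recently defined range covering `day`."""
    # Walk backwards so a later definition wins over an earlier one (LIFO).
    for start, end, opens, closes in reversed(timings):
        if start <= day <= end:
            return opens, closes
    return None  # no range covers this day
```

A day in July resolves to the later summer definition, while a day in March falls back to the first one.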
The problem
Now I need to do something like this (in pseudo code):
for each Place
{
for each Day between minimum and maximum date in IList<DateRangeTiming>
{
get a set of People applicable for Place and on Day
}
}
This means that number of steps to execute my task is approx.:
∑(places)( ∑(days) × ∑(people) )
This to my understanding is
O(x × y_x × z)
and likely approximates to this algorithm complexity:
O(n^3)
I'm not an expert in theory so you can freely correct my assumptions. What is true is that this kind of complexity is definitely not acceptable especially given the fact that I will be operating over long date ranges with many places and people.
From the formula approximation we can see that people set would be iterated lots of times. Hence I would like to optimize at least this part. To ease things a bit I changed
Person.IList<DateRangePlaces>.IList<Place>
to
Person.IList<DateRangePlaces>.IDictionary<int, Place>
which would give me a faster result whether a person can go to some place on particular date, because I would only check whether Place.Id is present in the dictionary versus IList.Where() LINQ clause that would have to scan the whole list each and every time.
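The difference between the two lookups can be sketched in Python, where a dict plays the role of IDictionary<int, Place>. The place records here are made up for illustration.

```python
# Hypothetical places a person may visit within one date range.
places_list = [{"id": i, "name": f"place{i}"} for i in range(1000)]

# The same data keyed by Place.Id, built once up front.
places_by_id = {place["id"]: place for place in places_list}


def can_visit_scan(place_id):
    # List version: checks elements one by one, O(n) per question asked.
    return any(place["id"] == place_id for place in places_list)


def can_visit_lookup(place_id):
    # Dictionary version: one hash lookup, O(1) on average per question asked.
    return place_id in places_by_id
```

Both functions give the same answer. The gain only matters because the check runs inside the nested place/day/person loops, where it is repeated many times.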
Question
Can you suggest any additional optimizations I could implement into my algorithm to make it faster or even make it less complex in terms of the big O notation?
Which memory structure types would you use where and why (lists, dictionaries, stacks, queues...) to improve performance?
Addendum: The whole problem is even more complex
There are additional complexities that I didn't mention, since I wanted to keep the question simple and clear. There are also:
Place.IList<Permission>
Person.IList<DateRangePermission>
So places require particular permissions, and people have time-limited permission grants that expire.
Additional to that, there's also
Person.IList<DateRangeTimingRestriction>
which lists the only times at which a person can go somewhere during a particular date range. And
Person.IList<DateRangePlacePriorities>
Which defines place prioritization for a particular date range.
While getting the applicable people, I also have to calculate a certain factor for every person and every place, related to the:
number of places that a person can visit on particular day
person's place priority factor on that particular day
All these are the reasons why I decided to rather manipulate this data in memory than using a very complex stored procedure that would also be doing multiple table scans to get factors per person and place and day.
I think such a stored procedure would be way too complex to handle and maintain. So I would rather get all the data first (putting it into appropriate memory structures to aid performance) and then work on it in memory.
I suggest using a relational database and writing a stored procedure to retrieve the "set of People applicable for Place and on Day".
The stored procedure approach would be neither complex nor difficult to maintain if the model is architected properly. Additionally, relational databases have primary keys and indexing to avoid table scans.
The only way to speed things up using collections would be:
1. Change the collection type. You could use a KeyedCollection, IDictionary<> or even a disconnected recordset. Disconnected recordsets also give you the ability to set foreign keys to child recordsets, however I think this would be a fairly complex pattern to use.
2. Maintain a collection within a collection. This is basically the same concept as a parent/child relationship with a foreign key. The object references will only be pointers to the original object's memory space or, if you're using a keyed collection, you could simply store the index of the other collection.
3. Maintain boolean properties that allow you to skip iterations if true or false. For example, as you build your entities, set a boolean of "HasPlaceXPermission". If the value is false, you know not to retrieve information related to place X.
4. Maintain flags. Flags can be a very good optimization technique when used properly. Similar to #3, flags can be used to determine permissions very quickly, for example if ((person.PlacePermissions & (Place.Colorado | Place.Florida)) != 0) // do date/time scan on Colorado and Florida, else don't.
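The flags idea can be sketched in Python with enum.IntFlag, which behaves much like a C# [Flags] enum. The Place members and the helper name has_any are hypothetical.

```python
from enum import IntFlag


class Place(IntFlag):
    """Hypothetical places, one bit each so they can be combined into a mask."""
    COLORADO = 1
    FLORIDA = 2
    TEXAS = 4


def has_any(person_permissions, wanted):
    # A non-zero intersection means the person holds at least one wanted permission.
    return (person_permissions & wanted) != 0


def has_all(person_permissions, wanted):
    # Every wanted bit must be present in the person's mask.
    return (person_permissions & wanted) == wanted
```

A single bitwise AND replaces a search through a permission list, so people with no matching bit can be skipped before the more expensive date/time scan.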
It's difficult to know which collection types I would use based upon the information you have provided, I would need a larger scope of the application to determine that architecturally. For example, where is the data stored, how is it retrieved, how is it prepared and how is it presented? Knowing how the application is architected would help to determine its optimization points.
You can't avoid O(n^2) as the minimal iteration you need is to pass every Place and every Date element to find a match for a given Person.
I think the best way is to use a DB such as SQL Server and run your query in SQL as a stored procedure.
The date range is presumably fairly limited, perhaps never more than a few years. Call it constant. When you say that, for each of those combinations, you need to "get a set of people applicable", then it's pretty clear: if you really do need to get all that data, then you can't improve the complexity of your solution, because you need to return a result for each combination.
Don't worry about complexity unless you're having trouble scaling with large numbers of people. Ordinary profiling is the place to start if you're having performance problems. O(#locations * #people) is not so bad.
I want to allow the user to add columns to a table in the UI.
The UI: Columns Name:______ Columns Type: Number/String/Date
My Question is how to build the SQL tables and C# objects so the implementation will be efficient and scalable.
My thought is to build two SQL tables:
TBL 1 - ColumnsDefinition:
ColId, ColName, ColType[Text]
TBL 2 - ColumnsValues:
RowId, ColId, Value [Text]
I want the solution to be efficient in DB space,
and I want to allow the user to sort the dynamic columns.
I work on .NET 3.5 / SQL Server 2008.
Thanks.
I believe that is essentially how the WebParts.SqlPersonalizationProvider works, which doesn't necessarily mean it's the best, but does mean that after some smart people thought about it for a while, that's what they came up with.
Sorting on a given field will be a bit tricky, particularly if the field text needs a non-text sort order (i.e., if you want "2" to come before "10").
I'd suggest that from C#, you do one query on ColumnsDefinition, and based on that, choose one of several different queries for selecting/sorting the data.
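A minimal sketch of that two-step approach, written in Python against SQLite rather than C# against SQL Server. The table and column names follow the question; the sample rows, the ORDER_EXPRESSIONS mapping and the assumption that dates are stored as ISO 8601 text are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ColumnsDefinition (ColId INTEGER PRIMARY KEY, ColName TEXT, ColType TEXT);
    CREATE TABLE ColumnsValues (RowId INTEGER, ColId INTEGER, Value TEXT);
    INSERT INTO ColumnsDefinition VALUES (1, 'Quantity', 'Number'), (2, 'Label', 'String');
    INSERT INTO ColumnsValues VALUES
        (1, 1, '10'),   (2, 1, '2'),     (3, 1, '33'),
        (1, 2, 'pear'), (2, 2, 'apple'), (3, 2, 'fig');
""")

# One ORDER BY expression per declared column type (a fixed whitelist, never user input).
ORDER_EXPRESSIONS = {
    "Number": "CAST(Value AS REAL)",  # makes '2' sort before '10'
    "String": "Value",
    "Date": "Value",                  # assumes dates are stored as ISO 8601 text
}


def row_ids_sorted_by(col_id):
    """Look up the column's type first, then pick the matching sort query."""
    (col_type,) = conn.execute(
        "SELECT ColType FROM ColumnsDefinition WHERE ColId = ?", (col_id,)
    ).fetchone()
    order_by = ORDER_EXPRESSIONS[col_type]
    sql = f"SELECT RowId FROM ColumnsValues WHERE ColId = ? ORDER BY {order_by}"
    return [row[0] for row in conn.execute(sql, (col_id,))]
```

Sorting the Quantity column as plain text would put '10' before '2'; casting it according to the declared type gives the numeric order instead.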
Add a DefaultValue to your ColumnsDefinition. Only add a value in ColumnsValues if the value is not the default value. This will speed things up a lot.
The thing I hate about this kind of system is that it is very difficult to transfer changes between dev/stage/production, because you have to keep both the structure and the content of the tables in sync.