I am not very proficient at SQL yet. I'm learning, but it's a slow process. I am working on a project at work which stores a good deal of information in a SQL Server database. In one of the tables, ContactInformation, we're experiencing an error when an attempt to modify an entry fails because a nonclustered index composed of all of the address information exceeds the 900-byte key-size limit. I've used sys.dm_db_index_usage_stats to verify that modifying an entry in the table leads to 3 user_seeks and 1 user_update.
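For reference, the check looked roughly like this (a sketch; the dbo.ContactInformation name and the current database are assumed):

-- Per-index seeks/updates since the last service restart:
SELECT i.name, s.user_seeks, s.user_scans, s.user_lookups, s.user_updates
FROM sys.dm_db_index_usage_stats AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id
 AND i.index_id  = s.index_id
WHERE s.database_id = DB_ID()
  AND s.object_id   = OBJECT_ID('dbo.ContactInformation');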
The C# code does not reference the index directly. It executes a single DbCommand consisting of an Update-type stored procedure call with 19 parameters. My thoughts are either to eliminate the index, or to break the DbCommand up into multiple updates with fewer parameters each, in the hope of having a smaller index to work with.
I am a bit at sea due to my lack of experience. I welcome any advice on which way to turn next.
The Index consists of the following:
| Name | Data Type | Size |
|----------------------|---------------|------|
| ContactInformationID | int | 4 |
| CompanyID | smallint | 2 |
| Address1 | nvarchar(420) | 840 |
| Address2 | nvarchar(420) | 840 |
| City | nvarchar(420) | 840 |
| State | nvarchar(220) | 440 |
| PostalCode | nvarchar(120) | 240 |
| Country | nvarchar(220) | 440 |
Yes, most of the columns are oversized. We apparently inherited this database from a different project. Our software limits most of the columns to no more than 100 characters, although there are some outliers.
The index size limit applies only to the key columns, and it applies to all B-tree-based storage modes (both nonclustered and clustered indexes). The limit exists to guarantee a certain degree of tree fan-out and thereby bound the tree height.
If you don't need to seek on columns such as Address1 and Address2 (considering that they might be null as well), make those columns included columns.
The index key should never be longer than the shortest key prefix that results in a unique index. Every column after that point never helps compared to making it an included column.
If ContactInformationID is unique, which I have a feeling it very well could be, then having any other fields in the index is pointless.
Such an index is useful only for queries where the value of ContactInformationID is present as a query parameter, and when it is, the rest of the fields are immaterial.
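A minimal sketch of that suggestion, assuming ContactInformationID really is unique and using the table/column names from the question; the key stays far under 900 bytes while the wide address columns remain available at the leaf level for covering queries:

-- Key is only the (assumed) unique ID; the wide address columns are INCLUDEd,
-- so they do not count against the 900-byte key limit.
CREATE UNIQUE NONCLUSTERED INDEX IX_ContactInformation_ID
    ON dbo.ContactInformation (ContactInformationID)
    INCLUDE (CompanyID, Address1, Address2, City, State, PostalCode, Country);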
Here is the scenario:
Config Table:
+--------+-----------+-------+
| Prefix | Separator | Seed |
+--------+-----------+-------+
| A | # | 10000 |
+--------+-----------+-------+
Transaction Table:
+----+----------+------+
| Id | SerialNo | Col3 |
+----+----------+------+
| 1 | A#10000 | |
| 2 | A#10001 | |
+----+----------+------+
The Transaction table has a SerialNo column holding a sequential number generated based on the configuration table, which determines the prefix, separator, and seed value of the serial number.
In the above example the serial number would start at A#10000 and increment by 1.
But if, after a few months, someone updates the configuration table to have
+--------+-----------+-------+
| Prefix | Separator | Seed |
+--------+-----------+-------+
| B | # | 10000 |
+--------+-----------+-------+
Then the Transaction table is supposed to look something like this:
+----+----------+------+
| Id | SerialNo | Col3 |
+----+----------+------+
| 1 | A#13000 | |
| 2 | B#10001 | |
+----+----------+------+
However, there must never be duplicate serial numbers in the Transaction table at any given point in time.
If someone sets the prefix back to A and the seed to 10000, then the next serial number should not be A#10000, because that already exists; it should be A#13001.
One could simply write a SELECT query with MAX() and CONCAT(), but that could cause issues with concurrency. I don't want duplicate serial numbers, and I'd also like this to be as performance-friendly as possible.
Another solution I could come up with is to create a Windows service that keeps running and watching the table. Records would get inserted with NULL as the serial number, and the Windows service would fill the serial number in. That way there would be no concurrency issues, but I am not sure how reliable it is, and there would be delays.
There will only be one entry in the configuration table at any given point in time.
You can solve the seed-value problem quite easily in SQL Server. When someone sets the seed value back to 10000, do it via a stored procedure. The stored procedure determines what the actual next available value should be (because clearly 10000 could be the wrong value) and then executes DBCC CHECKIDENT with the correct "new_reseed_value". When new records are inserted, the server will then handle the values correctly again.
Please look at this link for usage of the DBCC CHECKIDENT command: SQL Server DBCC CHECKIDENT
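A minimal sketch of such a procedure, assuming the numeric part of SerialNo is fed by an identity column on the Transaction table; the procedure name and the parsing logic are illustrative, not from the question:

-- Re-seed so the next generated serial can never collide with an existing one.
CREATE PROCEDURE dbo.ResetSerialConfig
    @Prefix    nvarchar(10),
    @Separator nvarchar(5),
    @Seed      int
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @MaxUsed int;

    -- Highest numeric part already issued for this prefix/separator.
    SELECT @MaxUsed = MAX(CAST(
               SUBSTRING(SerialNo, LEN(@Prefix + @Separator) + 1, 20) AS int))
    FROM dbo.[Transaction]
    WHERE SerialNo LIKE @Prefix + @Separator + '%';

    -- If the requested seed would collide, skip past the existing serials
    -- (the next insert then gets @Seed + 1, e.g. A#13001 in the example).
    IF @MaxUsed IS NOT NULL AND @MaxUsed >= @Seed
        SET @Seed = @MaxUsed;

    UPDATE dbo.Config
    SET Prefix = @Prefix, Separator = @Separator, Seed = @Seed;

    DBCC CHECKIDENT ('dbo.[Transaction]', RESEED, @Seed);
END;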
I am struggling with a simple update statement in Oracle. The update itself has not changed in forever but the table has grown massively and the performance is now unacceptable.
Here is the lowdown:
70 columns
27 indexes (which I am not under any circumstances allowed to reduce)
50M rows
Update statement is just hitting one table.
Update statement:
update TABLE_NAME
set NAME = 'User input string',
    NO = NO,
    PLANNED_START_DATE = TO_DATE('3/2/2016','dd/mm/yyyy'),
    PLANNED_END_DATE = TO_DATE('3/2/2016','dd/mm/yyyy')
WHERE ID = 999999 /* pk on the table */
Execution Plan:
==================
Plan hash value: 2165476569
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 1 | 245 | 1 (0)| 00:00:01 |
| 1 | UPDATE | TABLE_NAME | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| TABLE_NAME | 1 | 245 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PK_INDEX | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("ID"=35133238)
==================================================
The update statement originates in a C# application but I am free to change the statement there.
Select statements still perform well thanks to all the indexes, but as I see it, that is exactly what is wrong with the update: it has to go and update all of the indexes.
We are licensed for partitioning but this table is NOT partitioned.
How can I improve the performance of this update statement without altering the table or its indexes?
Are you sure that column ID is the primary key? And is the primary key based on a unique index? In that case the CBO would use an INDEX UNIQUE SCAN. In your plan the CBO expected 188 rows using the filter ID (primary key) = value and used an INDEX RANGE SCAN.
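If in doubt, the data dictionary will tell you (a sketch; 'TABLE_NAME' is the placeholder from the question):

-- Which constraint is the primary key, and which index enforces it?
SELECT c.constraint_name, c.index_name, cc.column_name
FROM   user_constraints  c
JOIN   user_cons_columns cc ON cc.constraint_name = c.constraint_name
WHERE  c.table_name      = 'TABLE_NAME'
AND    c.constraint_type = 'P';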
I am designing a database table that will hold the column headers of many differently-formatted Excel files. I need to do this because I ultimately need to know the "format" of the Excel files that will be created dynamically upon the user needing certain reports.
I am wondering if there is a common practice/pattern for doing this, i.e. what format is best for essentially storing a Dictionary<key, value> in a database? Maybe XML? Individual rows with a two-column design (index, value)? In other words, how do I store a Dictionary or List in a database?
Say my Excel file looks like this:
| FirstName | LastName | PhoneNo |
I need to store the three cells, i.e. their corresponding index and value, e.g. [0, "FirstName"], [1, "LastName"], [2, "PhoneNo"].
I am thinking this can be stored in my business object as a List<int, string> ColumnHeaders, but I am not sure how best to store it in the database (SQL Server). When I have worked with List<> objects in the past, they usually corresponded to rows in the database, and it does not make sense (at the moment at least) to store these column headers one per row, i.e. something like this:
ID | ProjectID | ColIndex | ColValue
---+-----------+----------+----------
 1 |        32 |        0 | FirstName
 2 |        32 |        1 | LastName
 3 |        32 |        2 | PhoneNo
 4 |        54 |        0 | Name
 5 |        54 |        1 | City
 6 |        54 |        2 | State
 7 |        54 |        3 | Country
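For reference, as DDL that design would be something like this sketch (names and sizes are assumed from the sample data):

-- One row per column header; (ProjectID, ColIndex) identifies a cell position.
CREATE TABLE dbo.ColumnHeader
(
    ID        int IDENTITY(1,1) PRIMARY KEY,
    ProjectID int           NOT NULL,
    ColIndex  int           NOT NULL,
    ColValue  nvarchar(100) NOT NULL,
    CONSTRAINT UQ_ColumnHeader UNIQUE (ProjectID, ColIndex)
);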
Any suggestions/tips?
Preamble
I have been investigating a concept and what I am posting below is a cut-down version of what I have been trying. If you look at it and think "That doesn't make any sense to do it that way", it is probably because it doesn't make any sense - there may be more efficient ways of doing this. I just wanted to try this out because it looks interesting.
What I am attempting to do is to perform arbitrary calculations using custom CLR aggregations in SQL, via a Reverse-Polish-like implementation. I'm using the data:
K | Amt | Instruction | Order
--+-----+-------------+------
A | 100 | Push | 1
A | 1 | Multiply | 2
A | 10 | Push | 3
A | 2 | Multiply | 4
A | | Add | 5
A | 1 | Push | 6
A | 3 | Multiply | 7
A | | Add | 8
The result of the calculation should be 123 ( = (100 * 1) + (10 * 2) + (1 * 3) ).
Using the following SQL (and the CLR functions ReversePolishAggregate and ToReversePolishArguments* that I have written) I can get the correct result:
SELECT K
, dbo.ReversePolishAggregate(dbo.ToReversePolishArguments(Instruction, Amt))
FROM dbo.ReversePolishCalculation
GROUP BY K;
The Problem
I want to generalise the solution more by putting the instructions and order in a separate table so that I can create calculations on arbitrary data. For example, I was thinking of a table like this:
Item | Type | Amount
-----+----------+-------
A | Budgeted | 10
A | Actual | 12
B | Actual | 20
B | Budgeted | 18
and joining it to a calculation table like this
Type | Instruction | Order
---------+-------------+------
Budgeted | Push | 1
Actual | Minus | 2
to calculate whether each item is over or under budget. The important consideration is that minus is non-commutative, so I need to specify the order to ensure that the actual amount is subtracted from the budgeted amount, not the other way around. I expected that I would be able to do this with the ORDER BY clause inside the OVER clause of the aggregation (and then a little more tweaking of that result):
SELECT K
, dbo.[ReversePolishAggregate](
dbo.[ToReversePolishArguments](Instruction, Amt))
OVER (PARTITION BY K ORDER by [Order])
FROM dbo.ReversePolishCalculation;
However I get the error:
Incorrect syntax near the keyword 'ORDER'.
I have checked the syntax by running the following SQL statement
SELECT K
, SUM(Amt) OVER (PARTITION BY K ORDER BY [Order])
FROM dbo.ReversePolishCalculation;
This works fine (it parses and runs, although I'm not sure that the result is meaningful), so I am left assuming that this is a problem with custom CLR aggregations or functions.
My Questions
Is this supposed to work? Is there any documentation saying explicitly that this is not supported? Have I got the syntax right?
I'm using Visual Studio 2012 Premium, SQL Server Management Studio 2012 and .NET framework 4.0.
* I created 2 CLR functions as a work-around to not being able to pass multiple arguments into a single aggregation function - see this article.
EDIT: This post looks like it is not supported, but I was hoping for something a little more official.
As an answer:
Officially from Microsoft, No.
http://connect.microsoft.com/SQLServer/feedback/details/611026/add-support-for-over-order-by-for-clr-aggregate-functions
It doesn't appear to have made it into SQL Server 2014 (or the 2016 CTP 3) either; there is no mention of it among the Transact-SQL changes:
http://msdn.microsoft.com/en-us/library/bb510411.aspx#TSQL
I need an idea. I have an app, a WinForms app with multiple tabs. There are a bunch of people using it, but none of them needs all the tabs, just a couple of them. I've reached the point where this is hard to handle from the source code, so I need a way to easily manage the permissions. The best would be to use an SQL table for this, as I also have to give another guy the possibility to modify the rights. I think it would be fine to simply remove the tabs, by creating an SQL table like the one below and, at program startup, running a query something like this:
select tabid from table where loggedinuser = 0
and then just loop through the result and remove all of them:
foreach (string tabid in tabids)
{
    tabControl1.TabPages.RemoveByKey(tabid);
}
table:
+----------+----------+-------+-------+-------+
| tabid | name | user1 | user2 | user3 |
+----------+----------+-------+-------+-------+
| tabPage1 | project1 | 0 | 1 | 0 |
+----------+----------+-------+-------+-------+
| tabPage2 | project2 | 1 | 0 | 1 |
+----------+----------+-------+-------+-------+
| tabPage3 | project3 | 1 | 0 | 0 |
+----------+----------+-------+-------+-------+
However, I somehow feel that this is not an elegant solution, especially because you have to create a new column each time a new user is added. Do you have any idea how to solve this?
I think this is an issue of the database's design, and a basic one; perhaps you need to improve your understanding of SQL databases, particularly relationships and primary/foreign keys. You shouldn't add new columns but new rows.
You need a table for the users, one for the tabs and one to connect the two. Such as this:
User:
+---------+------+
| user_id | name |
+---------+------+
| 1 | John |
+---------+------+
| 2 | Jane |
+---------+------+
Tab:
+--------+----------+
| tab_id | title |
+--------+----------+
| 1 | Articles |
+--------+----------+
| 2 | Products |
+--------+----------+
UserTab:
+---------+--------+---------+
| user_id | tab_id | enabled |
+---------+--------+---------+
| 1 | 1 | 1 |
+---------+--------+---------+
| 1 | 2 | 0 |
+---------+--------+---------+
| 2 | 1 | 0 |
+---------+--------+---------+
| 2 | 2 | 1 |
+---------+--------+---------+
In this example, John can only access Articles and Jane can only access Products.
You should get the ID of the current user and get the entries from UserTab, then remove the tabs that correspond to the IDs for which enabled=0.
You also should make a "default" choice for when the right combination of user and tab doesn't exist in the UserTab table: either display the tab by default or hide it by default.
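A sketch of that lookup, assuming the table and column names above and the hide-by-default choice (a missing UserTab row counts as disabled):

-- Tabs to remove for the current user: disabled, or no UserTab row at all.
SELECT t.tab_id, t.title
FROM Tab AS t
LEFT JOIN UserTab AS ut
       ON ut.tab_id = t.tab_id
      AND ut.user_id = @user_id
WHERE COALESCE(ut.enabled, 0) = 0;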
If you do it through SQL, a simple data model could be:
USER TABLE would have fields user_id, username, ... all the user-related fields you wish
ROLE TABLE would have fields role_id, role_name
USERROLE TABLE would have fields f_user_id, f_role_id (both foreign keys)
Each record (line) in this table links a user to a role, so a user can have many roles, and a role can be assigned to many users. That's called a many-to-many relationship.
ROLERIGHT TABLE would have fields f_role_id, tabid
Each record (line) in this table links a role to a tab that this role has access to. That means if a role should access all tabs and you've got 10 tabs, you'll have 10 lines with the same role_id and a different tabid from 1 to 10. It is also a many-to-many relationship.
This is quite a common database pattern for access-rights management. Now what you have to do is define the various roles and assign them to the different users. If a new user comes in and should have the same rights as another user, you just assign him the same role(s) as that user. Depending on the complexity and the number of possible tab/user combinations, you will either have many roles with few rights each, or a few roles with access to several tabs. The latter would probably be the case for a limited number of users, but the good thing is that you can easily scale up without changing the model, only the data.
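A minimal sketch of that model in SQL Server syntax (the names follow the answer; data types and sizes are assumptions):

CREATE TABLE [USER]    -- USER is a reserved word, hence the brackets
(
    user_id  int IDENTITY(1,1) PRIMARY KEY,
    username nvarchar(100) NOT NULL
);

CREATE TABLE ROLE
(
    role_id   int IDENTITY(1,1) PRIMARY KEY,
    role_name nvarchar(100) NOT NULL
);

-- A user can have many roles; a role can be assigned to many users.
CREATE TABLE USERROLE
(
    f_user_id int NOT NULL REFERENCES [USER](user_id),
    f_role_id int NOT NULL REFERENCES ROLE(role_id),
    PRIMARY KEY (f_user_id, f_role_id)
);

-- A role grants access to many tabs; a tab can be granted by many roles.
CREATE TABLE ROLERIGHT
(
    f_role_id int NOT NULL REFERENCES ROLE(role_id),
    tabid     nvarchar(50) NOT NULL,
    PRIMARY KEY (f_role_id, tabid)
);

At startup the application then needs a single query to find the tabs the logged-in user may keep:

SELECT DISTINCT rr.tabid
FROM USERROLE AS ur
JOIN ROLERIGHT AS rr ON rr.f_role_id = ur.f_role_id
WHERE ur.f_user_id = @user_id;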