Performance issue with SQL Server stored procedure - c#

I used the ANTS profiler to identify the remaining bottleneck in my C# application: the SQL Server stored procedure. I am using SQL Server 2008. Can anybody here help me increase performance, or give me pointers as to what I can do to make it better or more performant?
First, here's the procedure:
PROCEDURE [dbo].[readerSimilarity]
    -- Add the parameters for the stored procedure here
    @id int,
    @type int
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    -- Insert statements for procedure here
    IF (@type=1) --by Article
        SELECT id1, id2, similarity_byArticle FROM similarity WHERE (id1 = @id OR id2 = @id)
        AND similarity_byArticle != 0
    ELSE IF (@type=2) --by Parent
        SELECT id1, id2, similarity_byParent FROM similarity WHERE (id1 = @id OR id2 = @id)
        AND similarity_byParent != 0
    ELSE IF (@type=3) --by Child
        SELECT id1, id2, similarity_byChild FROM similarity WHERE (id1 = @id OR id2 = @id)
        AND similarity_byChild != 0
    ELSE IF (@type=4) --combined
        SELECT id1, id2, similarity_combined FROM similarity WHERE (id1 = @id OR id2 = @id)
        AND similarity_combined != 0
END
The table 'similarity' consists of two ids (id1 and id2) and a number of columns that store double values. The constraint is that id1 < id2.
Column Data
----- ----
ID1 PK, Indexed
ID2 PK, Indexed
The table contains 28.5 million entries.
Stored Procedure Background
The job of the stored procedure is to get all the rows that have the parameter id in either id1 or id2. Additionally, the column specified by the type-parameter cannot be zero.
The stored procedure is called once for each id. Each call takes only ~1.6 ms, but that adds up over 17,000 calls (roughly 27 seconds in total).
The processor is running at only 25%, which seems to be because the application is waiting for the procedure call to return.
Do you see any way to speed things up?
Calling the Stored Procedure C# Code Snippet
private HashSet<NodeClustering> AddNeighbourNodes(int id)
{
HashSet<NodeClustering> resultSet = new HashSet<NodeClustering>();
HashSet<nodeConnection> simSet = _graphDataLoader.LoadEdgesOfNode(id);
foreach (nodeConnection s in simSet)
{
int connectedId = s.id1;
if (connectedId == id)
connectedId = s.id2;
// if the corresponding node doesn't exist yet, add it to the graph
if (!_setNodes.ContainsKey(connectedId))
{
NodeClustering nodeToAdd = CreateNode(connectedId);
GraphAddOuter(nodeToAdd);
ChangeWeightIntoCluster(nodeToAdd.id, s.weight);
_bFlowOuter += s.weight;
resultSet.Add(nodeToAdd);
}
}
// the nodes in the result set have been added
// to the outernodes -> add to the outernodes count
_setNodes[id].countEdges2Outside += resultSet.Count;
return resultSet;
}
C# Code Background Information
This method is called each time a new id is added to the cluster. It gets all the connected nodes of that id (they are connected, when there is an entry in the db with id1=id or id2=id) via
_graphDataLoader.LoadEdgesOfNode(id);
Then it checks all the connected ids and if they are not loaded yet:
if (!_setNodes.ContainsKey(connectedId))
it loads them:
CreateNode(connectedId);
The Method:
_graphDataLoader.LoadEdgesOfNode(id);
is called again, this time with the connectedId.
I need this to get all the connections of the new nodes with those nodes that are already in the set.
I could probably collect the ids of all the nodes I need to add and call my stored procedure only once with a list of those ids.
Ideas
I could probably load the connected ids connection at once via something like
SELECT id1, id2, similarity_byArticle FROM similarity WHERE
(id1 = @id OR id2 = @id OR
id1 IN (SELECT id1 FROM similarity WHERE id2 = @id) OR
id2 IN (SELECT id1 FROM similarity WHERE id2 = @id) OR
id1 IN (SELECT id2 FROM similarity WHERE id1 = @id) OR
id2 IN (SELECT id2 FROM similarity WHERE id1 = @id))
AND similarity_byArticle != 0
but then I would get more entries than I'd need, because I would get them for already loaded nodes too (which from my tests would make up around 75% of the call).
Questions
How can I speed up the Stored Procedure?
Can I do it differently, is there a more performant way?
Can I use a List<int> as a SP-Parameter?
Any other thoughts?

If it runs that quickly, your problem is probably in the sheer number of repeated calls to the procedure. Is there a way that you could modify the stored procedure and code to return all the results the app needs in a single call?
Optimizing a query that runs in less than 2ms is probably not a fruitful effort. I doubt you will be able to shave more than fractions of a millisecond with query tweaks.
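One way to return everything in a single call on SQL Server 2008 is a table-valued parameter. The following is only a sketch under assumptions: the type name dbo.IdList and the procedure name readerSimilarityBatch are made up for illustration, and only the "by Article" case is shown.

```sql
-- Hypothetical table type holding the ids to look up (name is an assumption).
CREATE TYPE dbo.IdList AS TABLE (id int NOT NULL PRIMARY KEY);
GO
-- Hypothetical batched variant of readerSimilarity (by Article only).
CREATE PROCEDURE dbo.readerSimilarityBatch
    @ids dbo.IdList READONLY
AS
BEGIN
    SET NOCOUNT ON;
    -- Rows whose id1 is in the list
    SELECT s.id1, s.id2, s.similarity_byArticle
    FROM similarity AS s
    JOIN @ids AS i ON i.id = s.id1
    WHERE s.similarity_byArticle <> 0
    UNION
    -- Rows whose id2 is in the list; UNION removes rows matched on both sides
    SELECT s.id1, s.id2, s.similarity_byArticle
    FROM similarity AS s
    JOIN @ids AS i ON i.id = s.id2
    WHERE s.similarity_byArticle <> 0;
END
GO
-- Example call from T-SQL with three sample ids
DECLARE @batch dbo.IdList;
INSERT INTO @batch (id) VALUES (17), (42), (99);
EXEC dbo.readerSimilarityBatch @ids = @batch;
```

From C#, a parameter of this kind is normally sent as a DataTable with SqlDbType.Structured, so the application can collect the ids it has not loaded yet and fetch their edges in one round trip.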

I'd try to change the application to only call this one time per ID, but if that is not possible, try this (make sure that there is an index on similarity.id1 and another index on similarity.id2):
PROCEDURE [dbo].[readerSimilarity]
    -- Add the parameters for the stored procedure here
    @id int,
    @type int
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    -- Insert statements for procedure here
    IF @type=1 --by Article
    BEGIN
        SELECT
            id1, id2,similarity_byArticle
        FROM similarity
        WHERE id1 = @id AND similarity_byArticle!=0
        UNION
        SELECT
            id1, id2,similarity_byArticle
        FROM similarity
        WHERE id2 = @id AND similarity_byArticle!=0
    END
    ELSE IF @type=2 --by Parent
    BEGIN
        SELECT
            id1, id2,similarity_byParent
        FROM similarity
        WHERE id1 = @id AND similarity_byParent!=0
        UNION
        SELECT
            id1, id2,similarity_byParent
        FROM similarity
        WHERE id2 = @id AND similarity_byParent!=0
    END
    ELSE IF @type=3 --by Child
    BEGIN
        SELECT
            id1, id2,similarity_byChild
        FROM similarity
        WHERE id1 = @id AND similarity_byChild!=0
        UNION
        SELECT
            id1, id2,similarity_byChild
        FROM similarity
        WHERE id2 = @id AND similarity_byChild!=0
    END
    ELSE IF @type=4 --combined
    BEGIN
        SELECT
            id1, id2,similarity_combined
        FROM similarity
        WHERE id1 = @id AND similarity_combined!=0
        UNION
        SELECT
            id1, id2,similarity_combined
        FROM similarity
        WHERE id2 = @id AND similarity_combined!=0
    END
END
GO
EDIT based on OP's latest comment:
"The whole graph is stored in the MSSQL-Database and I load it successively with the procedure into some Dictionary structures"
You need to redesign your load process. You should call the database just once to load all of this data. Since the IDs are already in a database table, you can use a join in this query to get the proper IDs from the other table. Edit your question to add the schema of the tables that contain the IDs to graph, and how they relate to the code already posted. Once a single query returns all the data, it will be much faster than 17,000 separate calls.

Pass all the ids into the stored proc at once, using a delimited list (use a comma, a slash, or whatever you like; I use a pipe character [|]).
Add the User defined function (UDF) listed below to your database. It will convert a delimited list into a table which you can join to your similarity table. Then in your actual stored proc, you can write...
Create Procedure GetSimilarityIDs
@IdValues Text -- @IdValues is pipe-delimited [|] list of Id Values
As
Set NoCount On
Declare @IDs Table
(rowNum Integer Primary Key Identity Not Null,
Id Integer Not Null)
Insert Into @IDs(Id)
Select Cast(sVal As Integer)
From dbo.ParseTextString(@IdValues, '|') -- specify delimiter
-- ---------------------------------------------------------
-- similarity has no Id column, so match the list against either endpoint;
-- Distinct avoids returning a row twice when both of its ids are in the list
Select Distinct s.id1, s.id2, s.similarity_byArticle
From similarity s Join @IDs i On i.Id = s.id1 Or i.Id = s.id2
Where s.similarity_byArticle <> 0
Return 0
-- ***********************************************************
The below code is to create the generic function UDF that can parse any text string into a table of string values...:
Create FUNCTION [dbo].[ParseTextString] (@S Text, @delim VarChar(5))
Returns @tOut Table
(ValNum Integer Identity Primary Key,
sVal VarChar(8000))
As
Begin
    Declare @dLLen TinyInt -- Length of delimiter
    Declare @sWin VarChar(8000) -- Will Contain Window into text string
    Declare @wLen Integer -- Length of Window
    Declare @wLast TinyInt -- Boolean to indicate processing Last Window
    Declare @wPos Integer -- Start Position of Window within Text String
    Declare @sVal VarChar(8000) -- String Data to insert into output Table
    Declare @BtchSiz Integer -- Maximum Size of Window
    Set @BtchSiz = 7900 -- (Reset to smaller values to test routine)
    Declare @dPos Integer -- Position within Window of next Delimiter
    Declare @Strt Integer -- Start Position of each data value within Window
    -- -------------------------------------------------------------------------
    If @delim is Null Set @delim = '|'
    If DataLength(@S) = 0 Or
       Substring(@S, 1, @BtchSiz) = @delim Return
    -- ---------------------------
    Select @dLLen = Len(@delim),
           @Strt = 1, @wPos = 1,
           @sWin = Substring(@S, 1, @BtchSiz)
    Select @wLen = Len(@sWin),
           @wLast = Case When Len(@sWin) = @BtchSiz
                         Then 0 Else 1 End,
           @dPos = CharIndex(@delim, @sWin, @Strt)
    -- ------------------------------------
    While @Strt <= @wLen
    Begin
        If @dPos = 0 -- No More delimiters in window
        Begin
            If @wLast = 1 Set @dPos = @wLen + 1
            Else
            Begin
                Set @wPos = @wPos + @Strt - 1
                Set @sWin = Substring(@S, @wPos, @BtchSiz)
                -- ----------------------------------------
                Select @wLen = Len(@sWin), @Strt = 1,
                       @wLast = Case When Len(@sWin) = @BtchSiz
                                     Then 0 Else 1 End,
                       @dPos = CharIndex(@delim, @sWin, 1)
                If @dPos = 0 Set @dPos = @wLen + 1
            End
        End
        -- -------------------------------
        Set @sVal = LTrim(Substring(@sWin, @Strt, @dPos - @Strt))
        Insert @tOut (sVal) Values (@sVal)
        -- -------------------------------
        -- Move @Strt to char after last delimiter
        Set @Strt = @dPos + @dLLen
        Set @dPos = CharIndex(@delim, @sWin, @Strt)
    End
    Return
End

First create a view
CREATE VIEW ViewArticles
AS
-- a view cannot reference a variable such as @id, so only the non-zero filter lives here
SELECT id1, id2, similarity_byArticle
FROM similarity
WHERE similarity_byArticle != 0
In your code populate all the needed ids into a table.
Create a function which takes the table of ids as a parameter. A table parameter needs a user-defined table type; dbo.IdTable below is an assumed name.
CREATE TYPE dbo.IdTable AS TABLE (Id int NOT NULL PRIMARY KEY)
GO
CREATE FUNCTION
SelectArticles
(
@Ids dbo.IdTable READONLY
)
RETURNS TABLE
AS
RETURN
(
SELECT id1, id2, similarity_byArticle FROM ViewArticles
INNER JOIN @Ids I ON I.Id = id1
UNION
SELECT id1, id2, similarity_byArticle FROM ViewArticles
INNER JOIN @Ids I ON I.Id = id2
)

Related

Batch delete operation procedure not working

I have a stored procedure which looks like following:
alter procedure [dbo].[zsp_deleteEndedItems]
(
    @ItemIDList nvarchar(max)
)
as
delete from
SearchedUserItems
WHERE EXISTS (SELECT 1 FROM dbo.SplitStringProduction(@ItemIDList,',') S1 WHERE ItemID=S1.val)
The parameter IDList is passed like following:
124125125,125125125...etc etc
And the split string function look like following:
ALTER FUNCTION [dbo].[SplitStringProduction]
(
    @string nvarchar(max),
    @delimiter nvarchar(5)
) RETURNS @t TABLE
(
    val nvarchar(500)
)
AS
BEGIN
    declare @xml xml
    set @xml = N'<root><r>' + replace(@string,@delimiter,'</r><r>') + '</r></root>'
    insert into @t(val)
    select
        r.value('.','varchar(500)') as item
    from @xml.nodes('//root/r') as records(r)
    RETURN
END
This is supposed to delete all items from table "SearcheduserItems" under the IDs:
124125125 and 125125125
But for some reason after I do a select to check it out:
select * from SearchedUserItems
where itemid in('124125125','125125125')
The records are still there...
What am I doing wrong here? Can someone help me out?
As mentioned in the comments, a different option would be to use a table type parameter. This makes a couple of assumptions (some commented), however, should get you on the right path:
CREATE TYPE dbo.IDList AS TABLE (ItemID int NOT NULL); --Assumed int datatype;
GO
ALTER PROC dbo.zsp_deleteEndedItems @ItemIDList dbo.IDList READONLY AS
    DELETE SUI
    FROM dbo.SearchedUserItems SUI
    JOIN @ItemIDList IDL ON SUI.ItemID = IDL.ItemID;
GO
--Example of usage
DECLARE @ItemList dbo.IDList;
INSERT INTO @ItemList
VALUES(123456),(123457),(123458);
EXEC dbo.zsp_deleteEndedItems @ItemList;
GO
In regards to the question of an inline table value function, one such example is the below, which I quickly wrote up, that provides a tally table of the next 1000 numbers:
CREATE FUNCTION dbo.NextThousand (@Start int)
RETURNS TABLE
AS RETURN
    WITH N AS(
        SELECT N
        FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)
    )
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 + @Start AS I
    FROM N N1 --10
    CROSS JOIN N N2 --100
    CROSS JOIN N N3; --1,000
GO
The important thing about an iTVF is that it has only one statement, and that is the RETURN statement. Declaring a table variable as the return type, inserting data into it, and returning that variable turns it into a multi-statement TVF, which performs far slower.
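To make the contrast concrete, here is a sketch of how the XML-based splitter from the question could be written as an inline TVF. The name dbo.SplitStringInline is hypothetical, and the approach assumes the list contains no characters that are special in XML (such as & or <), which holds for plain numeric ids.

```sql
-- Hypothetical inline version of SplitStringProduction: a single RETURN statement,
-- with no table variable declared as the return type.
CREATE FUNCTION dbo.SplitStringInline
(
    @string nvarchar(max),
    @delimiter nvarchar(5)
)
RETURNS TABLE
AS RETURN
    SELECT r.value('.', 'nvarchar(500)') AS val
    FROM (
        -- Turn 'a,b,c' into the XML fragment <r>a</r><r>b</r><r>c</r>
        SELECT CAST(N'<r>' + REPLACE(@string, @delimiter, N'</r><r>') + N'</r>' AS xml) AS x
    ) AS src
    CROSS APPLY src.x.nodes('/r') AS records(r);
GO
-- Example call using the ids from the question
SELECT val FROM dbo.SplitStringInline(N'124125125,125125125', N',');
```

Because the body is one SELECT, the optimizer can expand it into the calling query instead of materializing a table variable first.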

SQL While loop or c# linq foreach?

I have a stored procedure, written in SQL, that processes credit card payments. It uses a SQL WHILE loop to go through half a million payment records. This is horribly slow, because for each payment record I have to loop through another set of data and update a bunch of tables accordingly. That means a WHILE loop inside a WHILE loop in SQL.
I have to rewrite this in C# to improve the performance. We are using LINQ and Entity Framework. Can anybody tell me whether this is likely to help?
Thanks
Updating my question with modified stored procedure. I have replaced the references to my tables and removed some part of the code. But let me know if this is not clear.
CREATE PROCEDURE [dbo].[USP_ProcessPayments]
(
    @UserKey int
)
AS
BEGIN
    SET NOCOUNT ON;
    CREATE TABLE #TempUserPayments
    (
        -- with Columns declared
    )
    INSERT INTO #TempUserPayments
    SELECT DISTINCT up.*, ROW_NUMBER() OVER(ORDER BY up.UserKey DESC, up.DateReceived) AS RN --generate row number
    FROM UserPayment up
    where up.UserKey = @UserKey
    CREATE INDEX UP_1 on #TempUserPayments (RN);
    DECLARE @LOOPCOUNTER INT, @ROWCOUNTER INT, @REQUESTEDAMOUNT MONEY, @UPDATEDREQUESTEDAMOUNT MONEY, @AMOUNT MONEY,
            @AMOUNTNOTAPPLIED MONEY, @APPLIEDAMOUNT MONEY, @SYSTEMKEY INT, @LOOPCOUNTER_PR INT, @ROWCOUNTER_PR INT, @IS_UPDATE BIT
    SELECT @ROWCOUNTER = COUNT(*) from #TempUserPayments;
    SET @LOOPCOUNTER = 1
    WHILE (@LOOPCOUNTER <= @ROWCOUNTER)
    BEGIN
        SELECT *,
            ROW_NUMBER() OVER (ORDER BY mp.Month DESC, mp.PayPriority) AS RN_PR
        INTO #TempPR
        FROM Requests pr
        INNER JOIN Premium mp ON mp.PremiumKey = pr.PremiumKey
        --Joining a couple of other tables here
        ORDER BY mp.[Month] DESC, mp.PayPriority
        CREATE INDEX PR_1 on #TempPR (RN_PR);
        SELECT @SYSTEMKEY = SystemKey FROM #TempUserPayments WHERE RN = @LOOPCOUNTER;
        SELECT @ROWCOUNTER_PR = COUNT(*) from #TempPR;
        SET @LOOPCOUNTER_PR = 1
        WHILE (@LOOPCOUNTER_PR <= @ROWCOUNTER_PR)
        BEGIN
            SET @IS_UPDATE = 0;
            IF(@APPLIEDAMOUNT IS NULL OR (@REQUESTEDAMOUNT <> @APPLIEDAMOUNT AND @APPLIEDAMOUNT < @REQUESTEDAMOUNT))
            BEGIN
                IF(@SYSTEMKEY = 1 OR @SYSTEMKEY = 3)
                BEGIN
                    --Check all the conditions here
                    SET @IS_UPDATE = 1;
                END
                IF(@IS_UPDATE = 1)
                BEGIN
                    --Update a bunch of requests tables ---
                    INSERT INTO userpaid
                    (
                        --Column names --
                    )
                    VALUES
                    (
                        --Values --
                    )
                END
            END
            -- END
            SET @LOOPCOUNTER_PR = @LOOPCOUNTER_PR + 1;
        END
        SET @LOOPCOUNTER = @LOOPCOUNTER + 1;
        DROP TABLE #TempPR;
    END
    DROP TABLE #TempUserPayments;
END

ADO.NET stored procedure for inserting not working

I have a database which stores information about a library (books, authors & categories).
But I can't get my stored procedure to work for inserting data. The stored procedure itself executes fine, but when I perform a test, it simply doesn't add anything to the database. Can anyone see what I'm missing?
This is my stored procedure (for category):
USE MyLibrary
GO
IF EXISTS (SELECT 1 FROM sysobjects WHERE name = 'CategoryInsert' AND TYPE = 'P')
BEGIN
    DROP PROC CategoryInsert
END
GO
CREATE PROCEDURE CategoryInsert
(
    @Id int out,
    @Name nvarchar(255),
    @InsertedBy nvarchar(120),
    @InsertedOn datetime
)
AS
DECLARE @CurrentId int
SELECT @CurrentId = Id FROM Category WHERE lower(@Name) = lower(@Name)
IF @CurrentId IS NOT NULL
BEGIN
    SET @Id = -100
    RETURN
END
INSERT INTO Category
(
    Name,
    InsertedBy,
    InsertedOn
)
VALUES
(
    @Name,
    @InsertedBy,
    @InsertedOn
)
SET @Id = SCOPE_IDENTITY()
GO
This is my test:
USE MyLibrary
GO
DECLARE @NewId int
DECLARE @date datetime
SET @date = getdate()
EXEC CategoryInsert @NewId, 'Testing', 'AL', @date
SELECT @NewId
GO
This line:
SELECT @CurrentId = Id FROM Category WHERE lower(@Name) = lower(@Name)
IF @CurrentId IS NOT NULL
The equality check will always return true because you're comparing the parameter with itself, which is essentially WHERE 1 = 1. As long as Category contains at least one row, @CurrentId will therefore always have a value, and your stored procedure will always return before the INSERT happens.
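A sketch of the two changes, assuming the Category table has the Id and Name columns used in the question. The duplicate check compares the Name column with the parameter. The OUTPUT keyword on the test call is not mentioned in the answer above; it is added here because without it the caller's variable stays NULL even when the procedure sets @Id.

```sql
-- Duplicate check as it could look inside CategoryInsert:
-- compare the Name column (not the parameter) with @Name.
DECLARE @Name nvarchar(255), @CurrentId int
SET @Name = N'Testing'
SELECT @CurrentId = Id FROM Category WHERE lower(Name) = lower(@Name)
SELECT @CurrentId AS ExistingId -- NULL when no category has that name
GO
-- Test call: OUTPUT lets @NewId receive the value assigned to @Id in the procedure.
DECLARE @NewId int
DECLARE @date datetime
SET @date = getdate()
EXEC CategoryInsert @NewId OUTPUT, 'Testing', 'AL', @date
SELECT @NewId
GO
```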

using stored procedure in entity framework

I'm trying to add the following sproc to Entity Framework. After adding it via "Update from Model", the Model Browser shows the sproc under both "Function Imports" and "Stored Procedures/Functions". Using "Edit" from the Function Imports dialog, however, "Get Column Information" does not work, and I am unable to determine a return type for the collection.
The output from the sproc is a temporary table but I do define the columns being returned. Is this sproc the problem? Am I missing a step in setting up the EF?
ALTER PROCEDURE [dbo].[addrApproxSP]
    -- Add the parameters for the stored procedure here
    @frontage bigint = 0,
    @housedir varchar(1) = 0,
    @streetnum bigint = 0,
    @streetdir varchar(1) = 0,
    @distance bigint = 0
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    -- Set Variables
    DECLARE @lowfront bigint,
            @highfront bigint,
            @lowstreet bigint,
            @highstreet bigint,
            @currname varchar(25),
            @streetcur varchar(25),
            @whereclause varchar(40),
            @fixstreet varchar(25),
            @pos int,
            @piece varchar(500)
    --Set variables to proper values in range
    Set @lowfront = @frontage - @distance
    set @highfront = @frontage + @distance
    set @lowstreet = @streetnum - @distance
    set @highstreet = @streetnum + @distance
    -- Process for Street Names that are Numeric Values
    -- Create Temp Table
    CREATE TABLE #StreetNames
    (streetname varchar(25))
    -- SELECT StreetName and put in Temp table
    INSERT #StreetNames(streetname)
    select distinct streetname
    from ADDR_STREETCOORD
    where begincoord between @lowstreet and @highstreet
    and STREETDIR = @streetdir
    and lowhouse > @lowfront and HIGHHOUSE < @highfront
    and HOUSEDIR = @housedir
    union
    select distinct streetname
    from ADDR_STREETCOORD
    where lowhouse > @lowstreet and HIGHHOUSE < @highstreet
    and HOUSEDIR = @housedir
    and begincoord between @lowfront and @highfront
    and STREETDIR = @streetdir
    -- Check each Streetname and those that are a coordinate (ex: "1000 S") change to "1000"
    CREATE TABLE #FixStreets(streetname varchar(25))
    DECLARE curStreet CURSOR FOR SELECT streetname FROM #StreetNames
    OPEN curStreet
    FETCH NEXT FROM curStreet INTO @fixstreet
    WHILE @@FETCH_STATUS = 0
    BEGIN
        --insert code here
        INSERT #FixStreets(streetname)
        --Call Function to: parse the street name
        --if the first part isnumeric then insert that
        --if not numeric then keep as is
        select [dbo].fnParseCoordinate(@fixstreet)
        FETCH NEXT FROM curStreet INTO @fixstreet
    END
    CLOSE curStreet
    DEALLOCATE curStreet
    --select * from #FixStreets
    --create a temp table to store the results of each street name in a single table
    -- in order to return the results as a single table
    --For Each street name search its frontage range values
    --loop through each streetname to get the parcels matching those streets
    CREATE TABLE #AllResults(Parcel varchar(14),Prop_locat varchar(50))
    DECLARE curName CURSOR FOR SELECT streetname FROM #FixStreets
    OPEN curName
    FETCH NEXT FROM curName INTO @streetcur
    WHILE @@FETCH_STATUS = 0
    BEGIN
        --insert code here
        INSERT #AllResults(Parcel, Prop_locat)
        select parcel_id,prop_locat
        from ADDR_ParcelsWWW
        where StreetName = @streetcur
        and predir = @housedir
        and housefrom between @lowfront and @highfront
        union
        select parcel_id,LOCATOR_ADDRESS
        from ADDR_MASTERADDRESS
        where StreetName = @streetcur
        and predir = @housedir
        and housefrom between @lowfront and @highfront
        FETCH NEXT FROM curName INTO @streetcur
    END
    CLOSE curName
    DEALLOCATE curName
    --Select the results of all the tables
    select Parcel, prop_locat from #AllResults
END
When EF polls your SP, it does this first: SET FMTONLY ON
This can screw things up if you use temp tables in your SP, which it looks like you're doing.
Try explicitly setting this at the beginning of your SP: SET FMTONLY OFF
That should allow EF to detect your columns.
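As a minimal sketch of where the statement goes, here is a stripped-down procedure (dbo.FmtOnlyDemo is a made-up name) that returns rows from a temp table shaped like #AllResults in the question. One caveat worth keeping in mind: with FMTONLY switched off, the designer's metadata request actually executes the procedure body, so this is best reserved for procedures that are safe to run with default parameters.

```sql
-- Hypothetical demo procedure; only the placement of SET FMTONLY OFF matters here.
CREATE PROCEDURE dbo.FmtOnlyDemo
AS
BEGIN
    SET NOCOUNT ON;
    -- Undo the SET FMTONLY ON issued by the caller so the temp table really gets created
    SET FMTONLY OFF;
    CREATE TABLE #AllResults (Parcel varchar(14), Prop_locat varchar(50));
    -- The final SELECT defines the column metadata the caller sees
    SELECT Parcel, Prop_locat FROM #AllResults;
END
GO
```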

SQL Server (2008) Pass ArrayList or String to SP for IN()

I was wondering how I can pass either an ArrayList, List<int> or StringBuilder comma delimited list to a stored procedure such that I find a list of IDs using IN():
@myList varchar(50)
SELECT *
FROM tbl
WHERE Id IN (@myList)
In C# I am currently building the list as a comma-delimited string; however, when using nvarchar(50), for example, as the type for the param in the stored procedure, I get an error because it can't convert '1,2,3' to the int that IN() expects.
Any ideas? Much appreciated.
Pete
You could use a User Defined function such as
CREATE function [dbo].[csl_to_table] ( @list nvarchar(MAX) )
RETURNS @list_table TABLE ([id] INT)
AS
BEGIN
    DECLARE @index INT,
            @start_index INT,
            @id INT
    SELECT @index = 1
    SELECT @start_index = 1
    WHILE @index <= DATALENGTH(@list)
    BEGIN
        IF SUBSTRING(@list,@index,1) = ','
        BEGIN
            SELECT @id = CAST(SUBSTRING(@list, @start_index, @index - @start_index ) AS INT)
            INSERT @list_table ([id]) VALUES (@id)
            SELECT @start_index = @index + 1
        END
        SELECT @index = @index + 1
    END
    SELECT @id = CAST(SUBSTRING(@list, @start_index, @index - @start_index ) AS INT)
    INSERT @list_table ([id]) VALUES (@id)
    RETURN
END
Which accepts an nvarchar comma separated list of ids and returns a table of those ids as ints. You can then join on the returned table in your stored procedure like so -
DECLARE @passed_in_ids TABLE (id INT)
INSERT INTO @passed_in_ids (id)
SELECT
    id
FROM
    [dbo].[csl_to_table] (@your_passed_in_csl)
SELECT *
FROM
    myTable
INNER JOIN
    @passed_in_ids ids
ON
    myTable.id = ids.id
In SQL 2008 there are table-valued-parameters, that make a friendly alternative to parsing CSV; see here for an example.
Otherwise, another option is xml - the xml data type in SQL Server allows you to read this pretty easily (although it takes more transfer bytes).
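A small sketch of the xml option, reusing tbl and Id from the question. The element names ids and id are arbitrary choices made for this example; the caller would build the same shape of document from its list.

```sql
-- The list arrives as an xml value instead of a comma-separated string
DECLARE @myList xml;
SET @myList = N'<ids><id>1</id><id>2</id><id>3</id></ids>';

SELECT t.*
FROM tbl AS t
JOIN (
    -- Shred each <id> element into an int row
    SELECT n.value('.', 'int') AS Id
    FROM @myList.nodes('/ids/id') AS x(n)
) AS ids
    ON ids.Id = t.Id;
```

In a stored procedure, @myList would be declared as a parameter of type xml rather than as a local variable.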
