How do I select a few random values from a DataTable column - C#

I have a DataTable with a column studentid. It has 1000 records. I need to select 30 random ids and insert them into a database table. Then I need to exclude those 30 ids, select another 30 random ids, and so on until all 1000 records have been used.
In every iteration I will be given the number of ids to pick, so only that many ids should be selected (the 30 is not constant; it may be 30, 25, 23, 24...).

This may get you started:
--Create a temporary table
CREATE TABLE #temp (id INTEGER)
go
--Insert 30 random ids into the #temp table, excluding any ids that were previously picked.
-- run this line as many times as needed.
INSERT INTO #temp select top 30 id from [student] where [id] not in (select [id] from #temp) order by newid()

Create an array with the 1,000 student ids and shuffle the array. Then just start at the beginning of the array and go forward.
If you have to be persistent, you can write the contents of the array to a temporary table and step through it sequentially.
Or, you could do:
SELECT id FROM table
ORDER BY RAND()
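and write it to a temp table (I don't remember the exact syntax; SELECT INTO, probably). That puts all of the ids into a table in random order, and then you can pick them out 30 at a time, starting at the beginning. Note that ORDER BY RAND() shuffles rows in MySQL; in SQL Server, RAND() is evaluated once per query, so use ORDER BY NEWID() there.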
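The shuffle-then-walk idea above can be sketched as follows. This is a minimal illustration in Python rather than C#, and the function name random_batches, the seed parameter, and the sample batch sizes are assumptions made for the example, not part of the question:

```python
import random


def random_batches(ids, batch_sizes, seed=None):
    """Yield non-overlapping random batches of ids.

    batch_sizes lists how many ids each iteration asks for; the last
    batch may be shorter if the ids run out.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)          # one shuffle up front
    start = 0
    for size in batch_sizes:
        if start >= len(shuffled):
            break                  # every id has been handed out
        yield shuffled[start:start + size]
        start += size              # step forward, never backward


student_ids = range(1, 1001)       # stands in for the studentid column
batches = list(random_batches(student_ids, [30, 25, 23, 24], seed=42))
for batch in batches:
    print(len(batch), batch[:5])
```

Because the list is shuffled once and only read forward, no id can appear in two batches, so there is no need for a NOT IN check on each iteration.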

The use of some dynamic SQL may be tolerable here.
IF ( OBJECT_ID( 'tempdb.dbo.#t_Student' ) IS NULL )
BEGIN
CREATE TABLE #t_Student
(
StudentID INTEGER
);
SET NOCOUNT ON;
DECLARE @i INTEGER;
SET @i = 0;
WHILE ( @i < 1000 )
BEGIN
INSERT INTO #t_Student ( StudentID )
VALUES ( @i );
SET @i = @i + 1;
END;
SET NOCOUNT OFF;
END;
IF ( OBJECT_ID( 'tempdb.dbo.#t_Processed' ) IS NOT NULL )
BEGIN
DROP TABLE #t_Processed;
END;
DECLARE @Random INTEGER,
@LowerBound INTEGER,
@UpperBound INTEGER,
@SQL NVARCHAR( MAX );
SET @LowerBound = 1;
SET @UpperBound = 30;
CREATE TABLE #t_Processed
(
StudentID INTEGER
);
WHILE ( ( SELECT COUNT( 1 )
FROM dbo.#t_Processed ) <
( SELECT COUNT( 1 )
FROM dbo.#t_Student ) )
BEGIN
SET @Random = ROUND( ( ( @UpperBound - @LowerBound ) * RAND() + @LowerBound ), 0 );
SET @SQL = '
SELECT TOP ' + LEFT( @Random, 10 ) + ' StudentID
FROM #t_Student
WHERE StudentID NOT IN ( SELECT StudentID
FROM #t_Processed )
ORDER BY NEWID();';
INSERT INTO #t_Processed ( StudentID )
EXECUTE dbo.sp_executesql @statement = @SQL;
END;
GO

Related

Batch delete operation procedure not working

I have a stored procedure that looks like this:
alter procedure [dbo].[zsp_deleteEndedItems]
(
@ItemIDList nvarchar(max)
)
as
delete from
SearchedUserItems
WHERE EXISTS (SELECT 1 FROM dbo.SplitStringProduction(@ItemIDList,',') S1 WHERE ItemID=S1.val)
The parameter @ItemIDList is passed like this:
124125125,125125125...etc etc
And the split string function looks like this:
ALTER FUNCTION [dbo].[SplitStringProduction]
(
@string nvarchar(max),
@delimiter nvarchar(5)
) RETURNS @t TABLE
(
val nvarchar(500)
)
AS
BEGIN
declare @xml xml
set @xml = N'<root><r>' + replace(@string,@delimiter,'</r><r>') + '</r></root>'
insert into @t(val)
select
r.value('.','varchar(500)') as item
from @xml.nodes('//root/r') as records(r)
RETURN
END
This is supposed to delete all items from table "SearcheduserItems" under the IDs:
124125125 and 125125125
But for some reason after I do a select to check it out:
select * from SearchedUserItems
where itemid in('124125125','125125125')
The records are still there...
What am I doing wrong here? Can someone help me out?
As mentioned in the comments, a different option is to use a table-valued parameter. This makes a couple of assumptions (some noted in comments), but it should get you on the right path:
CREATE TYPE dbo.IDList AS TABLE (ItemID int NOT NULL); --Assumed int datatype;
GO
ALTER PROC dbo.zsp_deleteEndedItems @ItemIDList dbo.IDList READONLY AS
DELETE SUI
FROM dbo.SearchedUserItems SUI
JOIN @ItemIDList IDL ON SUI.ItemID = IDL.ItemID;
GO
--Example of usage
DECLARE @ItemList dbo.IDList;
INSERT INTO @ItemList
VALUES(123456),(123457),(123458);
EXEC dbo.zsp_deleteEndedItems @ItemList;
GO
In regards to the question of an inline table value function, one such example is the below, which I quickly wrote up, that provides a tally table of the next 1000 numbers:
CREATE FUNCTION dbo.NextThousand (@Start int)
RETURNS TABLE
AS RETURN
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)
)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 + @Start AS I
FROM N N1 --10
CROSS JOIN N N2 --100
CROSS JOIN N N3; --1,000
GO
The important thing about an iTVF is that it has only one statement, and that is the RETURN statement. Declaring the table as a return-type variable, inserting data into it, and returning that variable turns it into a multi-statement TVF, which generally performs far slower.

Is it possible to transpose a table's entries and store it into a temp table in SQL Server?

I would like to transpose the data from my table and build some plots in Power BI.
Here is how I fill my database from my application:
using (SqlCommand cmd = connect.CreateCommand())
{
cmd.CommandText = @"INSERT INTO PoD_NewPriceList_Data
(ID, Product_Barcode, Product_Name,
Store_Price, Internet_Price, InsertDate)
VALUES (@ID, @Product_Barcode, @Product_Name,
@Store_Price, @Internet_Price, @InsertDate)";
cmd.Parameters.Add("Product_Barcode", SqlDbType.NVarChar).Value = barcode;
cmd.Parameters.Add("Product_Name", SqlDbType.NVarChar).Value = PriceList.name;
cmd.Parameters.Add("Store_Price", SqlDbType.Float).Value = Convert.ToDouble(storePrice, CultureInfo.InvariantCulture);
cmd.Parameters.Add("Internet_Price", SqlDbType.Float).Value = Convert.ToDouble(PriceList.price, CultureInfo.InvariantCulture);
cmd.Parameters.Add("InsertDate", SqlDbType.DateTime).Value = InsertDate.AddDays(2);
cmd.Parameters.Add("ID", SqlDbType.Int).Value = barcode.GetHashCode();
result = result && (cmd.ExecuteNonQuery() > 0);
}
And in SQL Server Management Studio, here is what my table looks like:
SELECT
[ID], [Product_Barcode], [Product_Name],
[Store_Price], [Internet_Price], [InsertDate]
FROM
[dbo].[PoD_NewPriceList_Data]
and I get the following output:
The main issue is that, to create the plots as requested in Power BI, I need my data to look as follows:
F5321
Product_Name Sony Xperia...
Store_Price 399
Internet_Price 327.51
InsertDate 2017.04.27
Any help would be well appreciated.
Check and modify this SQL script. I use a table variable @t; replace it with your table name [PoD_NewPriceList_Data].
DECLARE @t TABLE (
id int,
product_barcode varchar(max),
product_name varchar(max),
store_price int,
internet_price decimal(10, 2), -- plain decimal defaults to scale 0 and would round 255.1 to 255
insert_date date
)
INSERT INTO @t VALUES (1,'F5321', 'Sony Xperia', 399, 255.1, '2017-04-25')
INSERT INTO @t VALUES (2,'F5833', 'Sony Xperia XZ', 458, 398.2, '2017-04-26')
INSERT INTO @t VALUES (3,'F5121', 'Sony Xperia XA Rose', 161, 155.6, '2017-04-27')
IF OBJECT_ID ('tempdb..#Unpivoted') IS NOT NULL
DROP TABLE #Unpivoted
IF OBJECT_ID ('tempdb..#Transposed') IS NOT NULL
DROP TABLE #Transposed
/* Unpivot table to get rows instead of columns */
SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT 0)) as rn
INTO #Unpivoted
FROM (SELECT product_barcode, product_name,
CAST(store_price as varchar(max)) store_price,
CAST(internet_price as varchar(max)) internet_price,
CAST(insert_date as varchar(max)) as insert_date
FROM @t) src
UNPIVOT (
value FOR field IN (
product_barcode, product_name, store_price, internet_price, insert_date
)
) unpiv
CREATE TABLE #Transposed
(Field varchar(50) PRIMARY KEY NOT NULL )
DECLARE @SQL NVARCHAR(MAX)
SELECT @SQL = STUFF((
SELECT 'ALTER TABLE #Transposed ADD item' +
RIGHT('000' + CAST(sv.number AS VARCHAR(3)), 3) + ' varchar(max) '
FROM [master].dbo.spt_values sv
WHERE sv.[type] = 'p'
AND sv.number BETWEEN 1 AND (SELECT COUNT(*) FROM @t)
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 0, '')
Exec(@SQL) /* Dynamically create columns */
INSERT INTO #Transposed (Field) SELECT DISTINCT Field FROM #Unpivoted
/*populate field names*/
DECLARE @fieldCount int = (SELECT COUNT(*) FROM #Transposed)
/* using rn to filter proper record from transposed table */
SELECT @SQL = STUFF((
SELECT '
UPDATE #Transposed SET item' + RIGHT('000' + CAST(sv.number AS VARCHAR(3)), 3)
+ ' = up.value FROM #Transposed t CROSS APPLY
( SELECT TOP 1 u.value FROM #unpivoted u WHERE u.field = t.field AND u.rn > '
+ CAST((sv.number-1)*@fieldCount AS VARCHAR(10)) + ' ORDER BY rn) up '
FROM [master].dbo.spt_values sv
WHERE sv.[type] = 'p'
AND sv.number BETWEEN 1 AND (SELECT COUNT(*) FROM @t)
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 0, '')
Exec(@SQL) /*Dynamically fill in values */
SELECT t.* FROM #Transposed t
OUTER APPLY (SELECT TOP 1 rn FROM #Unpivoted u WHERE u.field=t.field) up
ORDER BY up.rn ASC /* add a link to Unpivoted to fix the item order */
DROP TABLE #Unpivoted
DROP TABLE #Transposed
It does what you need in several steps:
1. Converts columns to rows with UNPIVOT. Note that you have to CAST all the values to exactly the same type. It also adds a row number, which is used later to pick the right row when filling in values.
2. Creates a temp table with a dynamic number of columns, corresponding to the number of rows.
3. Fills the column names into rows of the dynamically created table.
4. Fills the values into the dynamically created table.
Credits to this answer and this answer.
Of course the number of columns is limited here, so if you try to convert many rows into columns, you get:
Cannot create a row of size 8066 which is greater than the allowable
maximum row size of 8060.
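To make the target shape concrete, here is a small sketch of the same transposition done in application code instead of T-SQL. It is written in Python only for brevity, reuses the sample rows from the script above, and the function name transpose is an assumption for the example:

```python
# Sample rows copied from the INSERT statements in the script above.
rows = [
    {"product_barcode": "F5321", "product_name": "Sony Xperia",
     "store_price": 399, "internet_price": 255.1, "insert_date": "2017-04-25"},
    {"product_barcode": "F5833", "product_name": "Sony Xperia XZ",
     "store_price": 458, "internet_price": 398.2, "insert_date": "2017-04-26"},
    {"product_barcode": "F5121", "product_name": "Sony Xperia XA Rose",
     "store_price": 161, "internet_price": 155.6, "insert_date": "2017-04-27"},
]
fields = ["product_barcode", "product_name", "store_price",
          "internet_price", "insert_date"]


def transpose(rows, fields):
    """Return one output row per field: [field, value of row 1, value of row 2, ...].

    Values are converted to strings, mirroring the CAST to a single type
    that UNPIVOT requires.
    """
    return [[field] + [str(row[field]) for row in rows] for field in fields]


transposed = transpose(rows, fields)
for line in transposed:
    print(line)
```

Each source row becomes one extra column, which is why the T-SQL version needs dynamic SQL: the number of output columns is only known at run time.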

SQL While loop or c# linq foreach?

I have a credit card payment processing stored procedure written in SQL. It uses a SQL WHILE loop to go through half a million payment records. This is horribly slow, because for each of these payment records I have to loop through another set of data and update a bunch of tables accordingly. That means a WHILE loop inside a WHILE loop in SQL.
I have to rewrite this in C# to improve the performance. We are using LINQ and Entity Framework. Can anybody tell me whether this is going to help or not?
Thanks
Updating my question with modified stored procedure. I have replaced the references to my tables and removed some part of the code. But let me know if this is not clear.
CREATE PROCEDURE [dbo].[USP_ProcessPayments]
(
@UserKey int
)
AS
BEGIN
SET NOCOUNT ON;
CREATE TABLE #TempUserPayments
(
-- with Columns declared
)
INSERT INTO #TempUserPayments
SELECT DISTINCT up.*, ROW_NUMBER() OVER(ORDER BY up.UserKey DESC, up.DateReceived) AS RN --generate row number
FROM UserPayment up
where up.UserKey = @UserKey
CREATE INDEX UP_1 on #TempUserPayments (RN);
DECLARE @LOOPCOUNTER INT, @ROWCOUNTER INT, @REQUESTEDAMOUNT MONEY, @UPDATEDREQUESTEDAMOUNT MONEY, @AMOUNT MONEY,
@AMOUNTNOTAPPLIED MONEY, @APPLIEDAMOUNT MONEY, @SYSTEMKEY INT, @LOOPCOUNTER_PR INT, @ROWCOUNTER_PR INT, @IS_UPDATE BIT
SELECT @ROWCOUNTER = COUNT(*) from #TempUserPayments;
SET @LOOPCOUNTER = 1
WHILE (@LOOPCOUNTER <= @ROWCOUNTER)
BEGIN
SELECT *,
ROW_NUMBER() OVER (ORDER BY mp.Month DESC, mp.PayPriority) AS RN_PR
INTO #TempPR
FROM Requests pr
INNER JOIN Premium mp ON mp.PremiumKey = pr.PremiumKey
--Joining a couple of other tables here
ORDER BY mp.[Month] DESC, mp.PayPriority
CREATE INDEX PR_1 on #TempPR (RN_PR);
SELECT @SYSTEMKEY = SystemKey FROM #TempUserPayments WHERE RN = @LOOPCOUNTER;
SELECT @ROWCOUNTER_PR = COUNT(*) from #TempPR;
SET @LOOPCOUNTER_PR = 1
WHILE (@LOOPCOUNTER_PR <= @ROWCOUNTER_PR)
BEGIN
SET @IS_UPDATE = 0;
IF(@APPLIEDAMOUNT IS NULL OR (@REQUESTEDAMOUNT <> @APPLIEDAMOUNT AND @APPLIEDAMOUNT < @REQUESTEDAMOUNT))
BEGIN
IF(@SYSTEMKEY = 1 OR @SYSTEMKEY = 3)
BEGIN
--Check all the conditions here
SET @IS_UPDATE = 1;
END
IF(@IS_UPDATE = 1)
BEGIN
--Update a bunch of request tables ---
INSERT INTO userpaid
(
--Column names --
)
VALUES
(
--Values --
)
END
END
-- END
SET @LOOPCOUNTER_PR = @LOOPCOUNTER_PR + 1;
END
SET @LOOPCOUNTER = @LOOPCOUNTER + 1;
DROP TABLE #TempPR;
END
DROP TABLE #TempUserPayments;
END

Performance issue with SQL Server stored procedure

I used the ANTS profiler to identify the remaining bottleneck in my C# application: the SQL Server stored procedure. I am using SQL Server 2008. Can anybody here help me increase performance, or give me pointers as to what I can do to make it better or more performant?
First, here's the procedure:
PROCEDURE [dbo].[readerSimilarity]
-- Add the parameters for the stored procedure here
@id int,
@type int
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
IF (@type=1) --by Article
SELECT id1, id2, similarity_byArticle FROM similarity WHERE (id1 = @id OR id2 = @id)
AND similarity_byArticle != 0
ELSE IF (@type=2) --by Parent
SELECT id1, id2, similarity_byParent FROM similarity WHERE (id1 = @id OR id2 = @id)
AND similarity_byParent != 0
ELSE IF (@type=3) --by Child
SELECT id1, id2, similarity_byChild FROM similarity WHERE (id1 = @id OR id2 = @id)
AND similarity_byChild != 0
ELSE IF (@type=4) --combined
SELECT id1, id2, similarity_combined FROM similarity WHERE (id1 = @id OR id2 = @id)
AND similarity_combined != 0
END
The table 'similarity' consists of two ids (id1 and id2) and a number of columns that store double values. The constraint is that id1 < id2.
Column Data
----- ----
ID1 PK, Indexed
ID2 PK, Indexed
The table contains 28.5 million entries.
Stored Procedure Background
The job of the stored procedure is to get all the rows that have the parameter id in either id1 or id2. Additionally, the column specified by the type-parameter cannot be zero.
The stored procedure is called multiple times for different ids. Although each call takes only ~1.6 ms, it adds up when the procedure is called 17,000 times.
The processor is running at only 25%, which seems to be because the application is waiting for the procedure call to return.
Do you see any way to speed things up?
Calling the Stored Procedure C# Code Snippet
private HashSet<NodeClustering> AddNeighbourNodes(int id)
{
HashSet<NodeClustering> resultSet = new HashSet<NodeClustering>();
HashSet<nodeConnection> simSet = _graphDataLoader.LoadEdgesOfNode(id);
foreach (nodeConnection s in simSet)
{
int connectedId = s.id1;
if (connectedId == id)
connectedId = s.id2;
// if the corresponding node doesn't exist yet, add it to the graph
if (!_setNodes.ContainsKey(connectedId))
{
NodeClustering nodeToAdd = CreateNode(connectedId);
GraphAddOuter(nodeToAdd);
ChangeWeightIntoCluster(nodeToAdd.id, s.weight);
_bFlowOuter += s.weight;
resultSet.Add(nodeToAdd);
}
}
// the nodes in the result set have been added
// to the outernodes -> add to the outernodes count
_setNodes[id].countEdges2Outside += resultSet.Count;
return resultSet;
}
C# Code Background Information
This method is called each time a new id is added to the cluster. It gets all the connected nodes of that id (they are connected, when there is an entry in the db with id1=id or id2=id) via
_graphDataLoader.LoadEdgesOfNode(id);
Then it checks all the connected ids and if they are not loaded yet:
if (!_setNodes.ContainsKey(connectedId))
It Loads them:
CreateNode(connectedId);
The Method:
_graphDataLoader.LoadEdgesOfNode(id);
is called again, this time with the connectedId.
I need this to get all the connections between the new nodes and the nodes that are already in the set.
I could probably collect the ids of all the nodes I need to add and call my stored procedure only once with a list of the ids.
Ideas
I could probably load the connected ids connection at once via something like
SELECT id1, id2, similarity_byArticle FROM similarity WHERE
(id1 = @id OR id2 = @id OR
id1 IN (SELECT id1 FROM similarity WHERE id2 = @id) OR
id2 IN (SELECT id1 FROM similarity WHERE id2 = @id) OR
id1 IN (SELECT id2 FROM similarity WHERE id1 = @id) OR
id2 IN (SELECT id2 FROM similarity WHERE id1 = @id))
AND similarity_byArticle != 0
but then I would get more entries than I'd need, because I would get them for already loaded nodes too (which from my tests would make up around 75% of the call).
Questions
How can I speed up the Stored Procedure?
Can I do it differently, is there a more performant way?
Can I use a List<int> as a SP-Parameter?
Any other thoughts?
If it runs that quickly, your problem is probably in the sheer number of repeated calls to the procedure. Is there a way that you could modify the stored procedure and code to return all the results the app needs in a single call?
Optimizing a query that runs in less than 2ms is probably not a fruitful effort. I doubt you will be able to shave more than fractions of a millisecond with query tweaks.
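A quick back-of-the-envelope check, using only the two figures from the question, shows why the call count rather than the query itself is the thing to attack. The batch size below is a hypothetical value picked for illustration, not something measured:

```python
import math

calls = 17_000        # number of procedure calls, from the question
ms_per_call = 1.6     # approximate time per call, from the question

# Total wall-clock time spent waiting on the procedure.
total_seconds = calls * ms_per_call / 1000
print(f"time spent in calls: {total_seconds:.1f} s")

# Hypothetical: if ids were sent in batches, how many round trips remain?
batch_size = 500      # assumption for illustration only
round_trips = math.ceil(calls / batch_size)
print(f"round trips with batches of {batch_size}: {round_trips}")
```

How long a batched call would take is not known from the question, so the sketch only counts round trips; the per-call overhead is paid once per round trip instead of once per id.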
I'd try to change the application to only call this one time per ID, but if that is not possible, try this (make sure that there is an index on similarity.id1 and another index on similarity.id2):
PROCEDURE [dbo].[readerSimilarity]
-- Add the parameters for the stored procedure here
@id int,
@type int
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
IF @type=1 --by Article
BEGIN
SELECT
id1, id2,similarity_byArticle
FROM similarity
WHERE id1 = @id AND similarity_byArticle!=0
UNION
SELECT
id1, id2,similarity_byArticle
FROM similarity
WHERE id2 = @id AND similarity_byArticle!=0
END
ELSE IF @type=2 --by Parent
BEGIN
SELECT
id1, id2,similarity_byParent
FROM similarity
WHERE id1 = @id AND similarity_byParent!=0
UNION
SELECT
id1, id2,similarity_byParent
FROM similarity
WHERE id2 = @id AND similarity_byParent!=0
END
ELSE IF @type=3 --by Child
BEGIN
SELECT
id1, id2,similarity_byChild
FROM similarity
WHERE id1 = @id AND similarity_byChild!=0
UNION
SELECT
id1, id2,similarity_byChild
FROM similarity
WHERE id2 = @id AND similarity_byChild!=0
END
ELSE IF @type=4 --combined
BEGIN
SELECT
id1, id2,similarity_combined
FROM similarity
WHERE id1 = @id AND similarity_combined!=0
UNION
SELECT
id1, id2,similarity_combined
FROM similarity
WHERE id2 = @id AND similarity_combined!=0
END
END
GO
EDIT based on OP's latest comment:
The whole graph is stored in the
MSSQL-Database and I load it
successively with the procedure into
some Dictionary structures
You need to redesign your load process. You should call the database just one time to load all of this data. Since the IDs are already in a database table, you can use a join in this query to get the proper IDs from the other table. Edit your question with the schema of the table that contains the IDs to graph, and how it relates to the code already posted. Once you get a single query to return all the data, it will be much faster than 17,000 calls that each fetch the rows for a single ID.
Pass all the ids into the stored proc at once, using a delimited list (use a comma, a slash, or whatever; I use a pipe character [|]).
Add the User defined function (UDF) listed below to your database. It will convert a delimited list into a table which you can join to your similarity table. Then in your actual stored proc, you can write...
Create Procedure GetSimilarityIDs
@IdValues Text -- @IdValues is pipe-delimited [|] list of Id Values
As
Set NoCount On
Declare @IDs Table
(rowNum Integer Primary Key Identity Not Null,
Id Integer Not Null)
Insert Into @IDs(Id)
Select Cast(sVal As Integer)
From dbo.ParseTextString(@IdValues, '|') -- specify delimiter
-- ---------------------------------------------------------
Select Distinct id1, id2, similarity_byArticle
From similarity s Join @IDs i On i.Id = s.id1 Or i.Id = s.id2 -- similarity has no Id column, so match either endpoint
Where similarity_byArticle <> 0
Return 0
-- ***********************************************************
The below code is to create the generic function UDF that can parse any text string into a table of string values...:
Create FUNCTION [dbo].[ParseTextString] (@S Text, @delim VarChar(5))
Returns @tOut Table
(ValNum Integer Identity Primary Key,
sVal VarChar(8000))
As
Begin
Declare @dLLen TinyInt -- Length of delimiter
Declare @sWin VarChar(8000) -- Will Contain Window into text string
Declare @wLen Integer -- Length of Window
Declare @wLast TinyInt -- Boolean to indicate processing Last Window
Declare @wPos Integer -- Start Position of Window within Text String
Declare @sVal VarChar(8000) -- String Data to insert into output Table
Declare @BtchSiz Integer -- Maximum Size of Window
Set @BtchSiz = 7900 -- (Reset to smaller values to test routine)
Declare @dPos Integer -- Position within Window of next Delimiter
Declare @Strt Integer -- Start Position of each data value within Window
-- -------------------------------------------------------------------------
If @delim is Null Set @delim = '|'
If DataLength(@S) = 0 Or
Substring(@S, 1, @BtchSiz) = @delim Return
-- ---------------------------
Select @dLLen = Len(@delim),
@Strt = 1, @wPos = 1,
@sWin = Substring(@S, 1, @BtchSiz)
Select @wLen = Len(@sWin),
@wLast = Case When Len(@sWin) = @BtchSiz
Then 0 Else 1 End,
@dPos = CharIndex(@delim, @sWin, @Strt)
-- ------------------------------------
While @Strt <= @wLen
Begin
If @dPos = 0 -- No More delimiters in window
Begin
If @wLast = 1 Set @dPos = @wLen + 1
Else
Begin
Set @wPos = @wPos + @Strt - 1
Set @sWin = Substring(@S, @wPos, @BtchSiz)
-- ----------------------------------------
Select @wLen = Len(@sWin), @Strt = 1,
@wLast = Case When Len(@sWin) = @BtchSiz
Then 0 Else 1 End,
@dPos = CharIndex(@delim, @sWin, 1)
If @dPos = 0 Set @dPos = @wLen + 1
End
End
-- -------------------------------
Set @sVal = LTrim(Substring(@sWin, @Strt, @dPos - @Strt))
Insert @tOut (sVal) Values (@sVal)
-- -------------------------------
-- Move @Strt to char after last delimiter
Set @Strt = @dPos + @dLLen
Set @dPos = CharIndex(@delim, @sWin, @Strt)
End
Return
End
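Stripped of the windowing needed for the Text type, the idea in this answer is: split the delimited list into ids, then keep every similarity row that touches one of them. A minimal Python sketch of that logic follows; the function names and the sample rows are made up for illustration and are not part of the answer:

```python
def parse_id_list(text, delim="|"):
    """Split a delimited string into ints, ignoring empty segments."""
    return [int(part) for part in text.split(delim) if part.strip()]


def edges_touching(edges, ids):
    """Keep (id1, id2, similarity) rows with nonzero similarity
    where either endpoint is in ids."""
    wanted = set(ids)   # set lookup plays the role of the join
    return [e for e in edges
            if e[2] != 0 and (e[0] in wanted or e[1] in wanted)]


# Hypothetical sample rows: (id1, id2, similarity_byArticle), with id1 < id2.
edges = [(1, 2, 0.5), (1, 3, 0.0), (2, 4, 0.9), (5, 6, 0.7)]
ids = parse_id_list("1|4|")
selected = edges_touching(edges, ids)
print(ids)
print(selected)
```

The point of the design is that the whole id list travels to the database in one call, and the matching is done set-wise rather than once per id.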
First create a view
CREATE VIEW ViewArticles
AS
SELECT id1, id2, similarity_byArticle
FROM similarity
WHERE similarity_byArticle != 0 -- a view cannot reference @id; the filtering by id happens in the function that joins on the ids
In your code populate all the needed ids into a table.
Create a function which takes all the ids table as parameter.
CREATE FUNCTION
SelectArticles
(
@Ids TABLE
)
RETURNS TABLE
AS
RETURN
(
SELECT id1, id2, similarity_byArticle FROM ViewArticles
INNER JOIN @Ids I ON I.Id = id1
UNION
SELECT id1, id2, similarity_byArticle FROM ViewArticles
INNER JOIN @Ids I ON I.Id = id2
)

SQL Server (2008) Pass ArrayList or String to SP for IN()

I was wondering how I can pass an ArrayList, a List<int>, or a comma-delimited list built with StringBuilder to a stored procedure so that I can find a list of IDs using IN():
@myList varchar(50)
SELECT *
FROM tbl
WHERE Id IN (@myList)
In C# I am currently building the list as a comma-delimited string; however, when using nvarchar(50), for example, as the type of the parameter in the stored procedure, I get an error because it can't convert '1,2,3' to the int it expects inside the IN().
Any ideas? Much appreciated.
Pete
You could use a User Defined function such as
CREATE function [dbo].[csl_to_table] ( @list nvarchar(MAX) )
RETURNS @list_table TABLE ([id] INT)
AS
BEGIN
DECLARE @index INT,
@start_index INT,
@id INT
SELECT @index = 1
SELECT @start_index = 1
-- LEN counts characters; DATALENGTH would count bytes (two per nvarchar character)
WHILE @index <= LEN(@list)
BEGIN
IF SUBSTRING(@list,@index,1) = ','
BEGIN
SELECT @id = CAST(SUBSTRING(@list, @start_index, @index - @start_index ) AS INT)
INSERT @list_table ([id]) VALUES (@id)
SELECT @start_index = @index + 1
END
SELECT @index = @index + 1
END
SELECT @id = CAST(SUBSTRING(@list, @start_index, @index - @start_index ) AS INT)
INSERT @list_table ([id]) VALUES (@id)
RETURN
END
This accepts an nvarchar comma-separated list of ids and returns a table of those ids as ints. You can then join on the returned table in your stored procedure like so:
DECLARE @passed_in_ids TABLE (id INT)
INSERT INTO @passed_in_ids (id)
SELECT
id
FROM
[dbo].[csl_to_table] (@your_passed_in_csl)
SELECT *
FROM
myTable
INNER JOIN
@passed_in_ids ids
ON
myTable.id = ids.id
In SQL Server 2008 there are table-valued parameters, which make a friendly alternative to parsing CSV; see here for an example.
Otherwise, another option is XML: the xml data type in SQL Server allows you to read this pretty easily (although it takes more transfer bytes).
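The underlying problem is that a single parameter is always a single value, so IN (@myList) compares Id against one string instead of three numbers. The sketch below illustrates one client-side way around that: generating one placeholder per id. Note that this is a different technique from the answers above, and it uses Python with an in-memory SQLite table (table contents and the function name select_by_ids are invented for the example) purely so that it is runnable; with SQL Server you would generate @p0, @p1, ... parameters instead:

```python
import sqlite3

# In-memory stand-in for the table in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(1, 6)])


def select_by_ids(conn, ids):
    """Run an IN() query with one bound parameter per id."""
    ids = [int(i) for i in ids]            # reject anything that is not a number
    if not ids:
        return []                          # IN () with no items is not valid SQL
    placeholders = ", ".join("?" for _ in ids)
    sql = f"SELECT Id, Name FROM tbl WHERE Id IN ({placeholders}) ORDER BY Id"
    return conn.execute(sql, ids).fetchall()


result = select_by_ids(conn, "1,2,3".split(","))
print(result)
```

Only the placeholders are interpolated into the SQL text; the values themselves are still bound as parameters, so the query stays safe against injection.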
