How to write a recursive query with 2 tables in SQL Server


I have a table named Table0. A sample query and its result:
Select process from Table0 where Name like '%Aswini%'
Process
-------
112
778
756
Each of these process values must then be looked up in the table below.
Table name: Table1
A sample query and its result:
Select [Exec], stepid, condition
from Table1
where [Exec] = 112
Exec stepid condition
-----------------------
112 2233 0
112 2354 0
445 3455 0
The structure of the second table, Table2, is as follows:
Select stepid, processid
from Table2
where stepid = 2233
Stepid processid
-----------------
2233 445
2354 566
3455 556
The stepid from Table1 is the input to stepid in Table2, and the processid from Table2 is in turn the input to Exec in Table1. I have to follow this chain recursively for as long as condition is 0, until the lookup returns no rows; the last processid found is the final parent ID.
I have not worked with CTEs, so I used a simple join to get one level of the result:
select b.processid
from Table1 a
inner join Table2 b on a.stepid = b.stepid
where a.condition = 0
and a.[Exec] = 112 -- parent from Table0
The above query gives me the parent of Exec 112 if it satisfies the condition.
I then have to feed that parent back into the query and execute it again.
I can achieve this in C# by running the query in a loop, but I want to do it in SQL Server alone. Is this achievable?
Edited
When I execute the CTE I get the below result
Process Parent
112 445
112 566
112 445
112 566
If the initial process has 2 Exec rows, the final process/parent result is repeated twice (once per Exec row). Why is this happening? Each result should be displayed only once.

A solution without a cursor (which I personally prefer):
WITH [CTE] AS
(
SELECT
T1.[Exec] AS [process],
1 AS [n],
T1.[Exec],
T1.[Exec] AS [parent]
FROM
[Table1] AS T1
UNION ALL
SELECT
C.[process],
C.[n] + 1,
T1.[Exec],
T2.[processid]
FROM
[CTE] AS C
INNER JOIN [Table1] AS T1 ON T1.[Exec] = C.[parent]
INNER JOIN [Table2] AS T2 ON T2.[stepid] = T1.[stepid]
)
SELECT C.[process], C.[parent]
FROM [CTE] AS C
WHERE C.[n] = (SELECT MAX([n]) FROM [CTE] WHERE [process] = C.[process])
Explanation:
The anchor part of the common table expression (the SELECT query before the UNION ALL) defines the starting point of the operation. In this case, it simply selects all data from Table1 and it has four fields:
process will contain the value of the process (Exec value) of which the parent should be determined.
n will contain a sequence number, starting with 1.
Exec will contain a "shifting" value for joining records in the next "recursive" part of the common table expression.
parent starts out equal to the Exec value itself; the recursive part later replaces it with the corresponding processid field from Table2, which represents the direct parent of the Exec value.
This anchor expression will produce the following data:
process n Exec parent
112 1 112 112
445 1 445 445
The recursive part of the common table expression (the SELECT query after the UNION ALL) keeps adding records to the CTE from Table1 (where its Exec value equals the parent value of the previous CTE record) and Table2 (related with Table1 on the stepid fields). Those newly added records in the CTE will have the following field values:
process will be copied from the previous CTE record.
n will be increased by 1.
Exec will get the Exec value of the joined Table1 record (equal to the previous CTE record's parent value).
parent will get the corresponding processid value from Table2, where its stepid value equals Table1's stepid value.
The entire CTE will yield the following results:
process n Exec parent
112 1 112 112
112 2 112 445
112 3 445 556
445 1 445 445
445 2 445 556
The main query (below the CTE) will select only the process and parent fields for each "last" record in the CTE (where the value of n is the largest value for that specific process value, which is determined using a subquery).
This produces the following end result:
process parent
445 556
112 556
Hope this helps a little.
Edit regarding the update to the question about the third table, Table0:
Assuming that your query SELECT [process] FROM [Table0] WHERE [Name] LIKE '%Aswini%' returns the processes for which the query above should return results, only the WHERE clause of the main query above needs to be changed.
Previous WHERE-clause:
WHERE C.[n] = (SELECT MAX([n]) FROM [CTE] WHERE [process] = C.[process])
Updated WHERE-clause:
WHERE
C.[n] = (SELECT MAX([n]) FROM [CTE] WHERE [process] = C.[process]) AND
C.[process] IN (SELECT [process] FROM [Table0] WHERE [Name] LIKE '%Aswini%')
Edit regarding possible duplicates when processes have more than one parent:
In case a process has more than one parent (??), the above query produces duplicates. To eliminate the duplicates and to provide a more robust way of determining the topmost parent of a process, the following modifications were made:
1. The anchor part of the CTE puts the actual parent of a process in the parent field by joining Table1 to Table2. This join should be a left join, so that processes without parents (if possible) will be included in the results too; their parent value will be equal to their own process id.
2. The recursive part of the CTE should only add parents for processes that have an actual parent (where field process is not equal to parent). This is to avoid infinite loops in the recursion (if possible).
3. The main query should filter out all records where the value of the parent field is also used in another result record as the value of the exec field for the same base process (the value in the process field). In that case, the parent field is not the final parent value, and that other result record is a more fitting candidate for containing the actual parent.
In other words: if process A has parent B, and process B has parent C, there are three related results in the CTE: (A, A, B), (A, B, C), and (B, B, C). Result (A, A, B) is invalid, because a more fitting candidate (A, B, C) is available in the results too. The final results should include (A, C) and (B, C), but not (A, B).
This logic is implemented using a subquery in an EXISTS operator in the WHERE clause, but it could also be realized using a LEFT JOIN on the CTE itself.
4. Because of the upgraded logic described in point 3, the column n of the CTE is not used anymore and has been removed.
5. In case of a "diamond pattern" in the data (process A has parents B and C, and both processes B and C have parent D), a DISTINCT is used in the main query's SELECT clause to avoid duplicates of (A, D).
The final query would look like this:
WITH [CTE] AS
(
SELECT
T1.[exec] AS [process],
T1.[exec],
COALESCE(T2.[processid], T1.[exec]) AS [parent]
FROM
[Table1] AS T1
LEFT JOIN [Table2] AS T2 ON T2.[stepid] = T1.[stepid]
UNION ALL
SELECT
C.[process],
T1.[exec],
T2.[processid]
FROM
[CTE] AS C
INNER JOIN [Table1] AS T1 ON T1.[exec] = C.[parent]
INNER JOIN [Table2] AS T2 ON T2.[stepid] = T1.[stepid]
WHERE
C.[parent] <> C.[process]
)
SELECT DISTINCT C.[process], C.[parent]
FROM [CTE] AS C
WHERE
NOT EXISTS (SELECT 1 FROM [CTE]
WHERE [process] = C.[process] AND [exec] = C.[parent])
AND C.[process] IN (SELECT [process] FROM [Table0] WHERE [name] LIKE '%Aswini%')
I hope this works well enough for you.
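For anyone who wants to try the final query without a SQL Server instance, here is a minimal sketch that runs the same logic in SQLite through Python's sqlite3 module. It is not the T-SQL above verbatim: SQLite wants WITH RECURSIVE, the column exec_id stands in for [Exec], and the name values in Table0 are made up, because the question only shows the LIKE filter. The remaining rows are the sample data from the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Table0 (name TEXT, process INTEGER);
    CREATE TABLE Table1 (exec_id INTEGER, stepid INTEGER, condition INTEGER);
    CREATE TABLE Table2 (stepid INTEGER, processid INTEGER);
    -- the name values are hypothetical; the process ids come from the question
    INSERT INTO Table0 VALUES ('Aswini', 112), ('Aswini', 778), ('Aswini', 756);
    INSERT INTO Table1 VALUES (112, 2233, 0), (112, 2354, 0), (445, 3455, 0);
    INSERT INTO Table2 VALUES (2233, 445), (2354, 566), (3455, 556);
""")

TOPMOST_PARENTS = """
WITH RECURSIVE cte(process, exec_id, parent) AS (
    -- anchor: every process with its direct parent (or itself if it has none)
    SELECT t1.exec_id, t1.exec_id, COALESCE(t2.processid, t1.exec_id)
    FROM Table1 AS t1
    LEFT JOIN Table2 AS t2 ON t2.stepid = t1.stepid
    UNION ALL
    -- recursive step: climb from the current parent to its own parent
    SELECT c.process, t1.exec_id, t2.processid
    FROM cte AS c
    JOIN Table1 AS t1 ON t1.exec_id = c.parent
    JOIN Table2 AS t2 ON t2.stepid = t1.stepid
    WHERE c.parent <> c.process
)
SELECT DISTINCT c.process, c.parent
FROM cte AS c
WHERE NOT EXISTS (SELECT 1 FROM cte AS x
                  WHERE x.process = c.process AND x.exec_id = c.parent)
  AND c.process IN (SELECT process FROM Table0 WHERE name LIKE '%Aswini%')
ORDER BY c.process, c.parent
"""

rows = con.execute(TOPMOST_PARENTS).fetchall()
print(rows)
```

With that sample data, process 112 ends up with two topmost parents: 556 (reached through 445) and 566. Processes 778 and 756 produce no rows because they never appear in Table1.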

You could try a cursor: load the rows of the first table into the cursor and collect the process ids by fetching until the end of the cursor.
-- INT is assumed from the sample data
DECLARE @exec INT, @stepid INT;
DECLARE f_cursor CURSOR FOR
Select [Exec], stepid
from Table1;
OPEN f_cursor;
FETCH NEXT FROM f_cursor
INTO @exec, @stepid;
WHILE @@FETCH_STATUS = 0
BEGIN
select b.processid
from Table1 a
inner join Table2 b on a.stepid = b.stepid
where a.condition = 0
and a.[Exec] = @exec;
-- store the processid somewhere for later use
-- fetch the next row; without this the loop never ends
FETCH NEXT FROM f_cursor
INTO @exec, @stepid;
END
CLOSE f_cursor;
DEALLOCATE f_cursor;

Related

How do I retrieve non-empty & non-duplicate data from the database?

I have picked out these columns from my table (the rest are irrelevant to the problem).
ID | Generic Name
-----+---------------
001 | Cetirizine
002 | Cetirizine
003 |
004 | Paracetamol
I want my combo box to display only a single entry Cetirizine (or any data that has been duplicated) and no empty generic names (some data have no generic names).
I've tried:
select
Item_GenName
from
ItemMasterlistTable
where
nullif(convert(varchar, Item_GenName), '') is not null
but it only achieves the no empty data part.
I've tried using DISTINCT, but it doesn't work and somebody suggested JOIN but I don't think it works since I'm only using 1 table.
I've also tried:
SELECT
MIN(Item_ID) AS Item_ID, Item_GenName
FROM
ItemMasterlistTable
GROUP BY
Item_GenName
but there's always an error:
The text, ntext, and image data types cannot be compared or sorted, except when using IS NULL or LIKE operator.

The following query should return only distinct, non-empty Item_GenNames:
SELECT DISTINCT Item_GenName
FROM ItemMasterlistTable
-- Item_GenName is of type text, so DATALENGTH is used in place of IS NOT NULL and <> ''
WHERE datalength(Item_GenName) != 0
You said you tried DISTINCT and it did not work, so I want to clarify:
The DISTINCT keyword returns unique records over the complete set of columns in your SELECT list. If you include the ID column in your SELECT list, even a DISTINCT selection will return your duplicate Item_GenNames, because each combined ID / Item_GenName record is unique. Include only Item_GenName in your SELECT list to guarantee distinct values for this column.
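To see the effect of the SELECT list on DISTINCT, here is a small sketch using Python's sqlite3 module with the four sample rows from the question. It only illustrates the DISTINCT behaviour: SQLite has no text/ntext restriction, so a plain <> '' comparison is used here instead of DATALENGTH, and the column types are assumptions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ItemMasterlistTable (Item_ID TEXT, Item_GenName TEXT)")
con.executemany(
    "INSERT INTO ItemMasterlistTable VALUES (?, ?)",
    [("001", "Cetirizine"), ("002", "Cetirizine"), ("003", ""), ("004", "Paracetamol")],
)

NOT_EMPTY = "Item_GenName IS NOT NULL AND Item_GenName <> ''"

# DISTINCT over the name column only: each name appears once
names = [row[0] for row in con.execute(
    f"SELECT DISTINCT Item_GenName FROM ItemMasterlistTable WHERE {NOT_EMPTY} "
    "ORDER BY Item_GenName"
)]

# DISTINCT over (ID, name): the IDs differ, so both Cetirizine rows survive
with_id = con.execute(
    f"SELECT DISTINCT Item_ID, Item_GenName FROM ItemMasterlistTable WHERE {NOT_EMPTY}"
).fetchall()

print(names)
print(len(with_id))
```

The first query returns two names, while the second still returns three rows because the pair (Item_ID, Item_GenName) is unique for every row.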

The following query might be useful.
declare @tab table (ID varchar(10), Generic_Name varchar(100))
insert into @tab
select '001', 'Cetirizine'
union
select '002', 'Cetirizine'
union
select '003', ''
union
select '004', 'Paracetamol'
select MIN(substring(ID, 1, 10)) ID, substring(Generic_Name, 1, 1000) Generic_Name
from @tab
where substring(Generic_Name, 1, 1) <> ''
group by substring(Generic_Name, 1, 1000)

You can try this query
Select distinct Item_GenName FROM(
Select * FROM ItemMasterlistTable where Item_GenName <> ''
)t
The inner query removes the empty records, and the outer query returns the distinct records from the inner output.

How to insert Duplicated data only into an Event Log?

I need help with a SQL query problem. I have a query that deletes the duplicates, but I also need to record the duplicated data being deleted in an EventLog, and I have no idea how to do that. Below is an example of my Student table. From the table below, you can see that only Alpha and Bravo are duplicated.
id Name Age Group
-----------------------
1 Alpha 11 A
2 Bravo 12 A
3 Alpha 11 B
4 Bravo 12 B
5 Delta 11 B
As I am copying data from Group A to Group B, I need to find & delete the duplicated data in group B. Below is my query on deleting duplicates from Group B.
DELETE Student WHERE id
IN (SELECT tb.id
FROM Student AS ta
JOIN Student AS tb ON ta.name=tb.name AND ta.age=tb.age
WHERE ta.GroupName='A' AND tb.GroupName='B')
Here is an example of my EventLog and what I want the query I execute to produce.
id Name Age Group Status
------------------------------------------
1 Alpha 11 B Delete
2 Bravo 11 B Delete
Instead of inserting the entire Group B data into the eventlog, is there any query that can just insert the Duplicated Data into the event log?

If we are talking about Microsoft SQL Server, the key is the OUTPUT clause; more details here: https://msdn.microsoft.com/en-us/library/ms177564.aspx
declare @Student table
( id int, name nvarchar(20), age int,"groupname" char(1))
insert into @Student values (1, 'Alpha' , 11, 'A' ),
(2, 'Bravo' , 12, 'A'),
(3 ,'Alpha' , 11 , 'B'),
(4 ,'Bravo' ,12 , 'B'),
(5 ,'Delta' ,11 , 'B')
declare @Event table
( id int, name nvarchar(20), age int,"groupname" char(1),"Status" nvarchar(20))
select * from @Student
DELETE @Student
output deleted.*, 'Deleted' into @Event
WHERE id
IN (SELECT tb.id
FROM @Student AS ta
JOIN @Student AS tb ON ta.name=tb.name AND ta.age=tb.age
WHERE ta.GroupName='A' AND tb.GroupName='B')
select * from @Event
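If you want to experiment with the idea outside SQL Server, the sketch below uses Python's sqlite3 module. Note that it swaps in a different technique: it does not use OUTPUT (which is T-SQL specific) but logs the duplicate rows first and then deletes them inside the same transaction, so the two steps succeed or fail together. The table layout and the GroupName column name follow the question; everything else is an assumption.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Student (id INTEGER PRIMARY KEY, Name TEXT, Age INTEGER, GroupName TEXT);
    CREATE TABLE EventLog (id INTEGER, Name TEXT, Age INTEGER, GroupName TEXT, Status TEXT);
    INSERT INTO Student VALUES
        (1, 'Alpha', 11, 'A'), (2, 'Bravo', 12, 'A'),
        (3, 'Alpha', 11, 'B'), (4, 'Bravo', 12, 'B'), (5, 'Delta', 11, 'B');
""")

# ids of Group B rows that duplicate a Group A row (same subquery as the DELETE)
DUPLICATES_IN_B = """
    SELECT tb.id
    FROM Student AS ta
    JOIN Student AS tb ON ta.Name = tb.Name AND ta.Age = tb.Age
    WHERE ta.GroupName = 'A' AND tb.GroupName = 'B'
"""

with con:  # one transaction: commit both statements, or roll both back on error
    con.execute(
        "INSERT INTO EventLog "
        "SELECT id, Name, Age, GroupName, 'Delete' FROM Student "
        f"WHERE id IN ({DUPLICATES_IN_B})"
    )
    con.execute(f"DELETE FROM Student WHERE id IN ({DUPLICATES_IN_B})")

logged = con.execute("SELECT * FROM EventLog ORDER BY id").fetchall()
remaining = [row[0] for row in con.execute("SELECT id FROM Student ORDER BY id")]
print(logged)
print(remaining)
```

Only the two duplicated Group B rows (ids 3 and 4) end up in the log; Delta stays in Student because it has no counterpart in Group A.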

Run this before the DELETE above. I'm not sure how you decide which row is the duplicate, but you can use ROW_NUMBER to number the rows within each Name/Age pair, with the non-duplicate as 1, and then insert everything with a row number > 1.
; WITH cte AS
(
SELECT Name
,Age
,[Group]
,STATUS = 'Delete'
-- ORDER BY id keeps the lowest id (the Group A row in the sample) as row 1
,RID = ROW_NUMBER ( ) OVER ( PARTITION BY Name,Age ORDER BY id)
-- no self-join needed: ROW_NUMBER already groups the rows by Name and Age
FROM Student
)
-- the column list assumes EventLog.id is an identity column
INSERT INTO EventLog (Name, Age, [Group], Status)
SELECT Name,Age,[Group],'Delete'
FROM cte
WHERE RID > 1

You need to create a basic AFTER DELETE trigger on the student table. It runs after every delete on that table and inserts the deleted records into log_table.
create trigger deleted_records
on student_table
after delete
as
begin
insert into log_table
select d.id, d.Name, d.Age, d.[Group], 'DELETED'
from DELETED d;
end

Strange order of line insertion

I have a stored procedure that inserts a line in a table. This table has an auto incremented int primary key and a datetime2 column named CreationDate. I am calling it in a for loop via my C# code, and the loop is inside a transaction scope.
I ran the program twice: the first time the for loop ran 6 iterations, and the second time it ran 2. When I executed this SELECT in SQL Server, I got a strange result:
SELECT TOP 8
RequestId, CreationDate
FROM
PickupRequest
ORDER BY
CreationDate DESC
What I don't understand is the order of insertion: for example, the line with Id=58001 should have been inserted after the one with Id=58002, but this is not the case. Is that because I put my loop in a transaction scope? Or is the precision of datetime2 not enough?

It is a question of speed and statement scope as well...
Try this:
--This will create a @numbers table with 1 million numbers:
DECLARE @numbers TABLE(Nbr BIGINT);
WITH N(N) AS
(SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1)
,MoreN(N) AS
(SELECT 1 FROM N AS N1 CROSS JOIN N AS N2 CROSS JOIN N AS N3 CROSS JOIN N AS N4 CROSS JOIN N AS N5 CROSS JOIN N AS N6)
INSERT INTO @numbers(Nbr)
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM MoreN;
--This is a dummy table for inserts:
CREATE TABLE Dummy(ID INT IDENTITY,CreationDate DATETIME);
--Play around with the value for @Count. You can insert 1 million rows in one go. Although this runs a while, all rows will have the same datetime value:
--Use a small number here and below: still the same time value
--Use a big count here and a small one below: the second insert will show a slightly later value
DECLARE @Count INT = 1000;
INSERT INTO Dummy (CreationDate)
SELECT GETDATE()
FROM (SELECT TOP(@Count) 1 FROM @numbers) AS X(Y);
--A second insert
SET @Count = 10;
INSERT INTO Dummy (CreationDate)
SELECT GETDATE()
FROM (SELECT TOP(@Count) 1 FROM @numbers) AS X(Y);
SELECT * FROM Dummy;
--Clean up
GO
DROP TABLE Dummy;

You did your insertions pretty fast so the actual CreationDate values inserted in one program run had the same values. In case you're using datetime type, all the insertions may well occur in one millisecond. So ORDER BY CreationDate DESC by itself does not guarantee the select order to be that of insertion.
To get the desired order you need to sort by the RequestId as well:
SELECT TOP 8 RequestId, CreationDate
FROM PickupRequest
ORDER BY CreationDate DESC, RequestId DESC
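A quick way to convince yourself that the tie-breaker matters is the sketch below, written with Python's sqlite3 module rather than SQL Server. The timestamps and ids are made-up values chosen so that three rows share exactly the same CreationDate; LIMIT stands in for TOP.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE PickupRequest (RequestId INTEGER PRIMARY KEY, CreationDate TEXT)")

same_instant = "2020-01-01 10:00:00.123"  # hypothetical: three inserts in the same tick
con.executemany(
    "INSERT INTO PickupRequest VALUES (?, ?)",
    [
        (58000, "2020-01-01 09:59:59.000"),
        (58002, same_instant),
        (58001, same_instant),
        (58003, same_instant),
    ],
)

# CreationDate alone cannot order the three tied rows; RequestId breaks the tie
ids = [row[0] for row in con.execute(
    "SELECT RequestId FROM PickupRequest "
    "ORDER BY CreationDate DESC, RequestId DESC LIMIT 8"
)]
print(ids)
```

Without the second sort key, the relative order of the three tied rows is whatever the engine happens to produce; with it, the result is deterministic.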

How to set an integer value to one if a record exist in database C# Sql Query

getName_as_Rows is an array which contains some names.
I want to set an int value to 1 if a matching record is found in the database.
for(int i = 0; i<100; i++)
{
using (var command = new SqlCommand("select some column from some table where column = @Value", con1))
{
command.Parameters.AddWithValue("@Value", getName_as_Rows[i]);
con1.Open();
command.ExecuteNonQuery();
}
}
I am looking for:
bool recordexist;
If the above record exists, then recordexist should be 1, else 0, within the loop.
I have to do some other stuff if the record exists.

To avoid making N queries to the database, which can be very expensive in terms of processing, network traffic and so forth, I suggest joining only once, using a trick I learned. First, you need a function in your database that splits a string into a table.
CREATE FUNCTION [DelimitedSplit8K]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 0 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT 0 UNION ALL
SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT t.N+1
FROM cteTally t
WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1),
Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000))
FROM cteStart s
GO
Second, concatenate your 100 values into one delimited string:
"Value1", "Value 2", "Value 3"....
In SQL Server you can then simply join the split values with your table:
SELECT somecolumn FROM sometable t
INNER JOIN [DelimitedSplit8K](@DelimitedString, ',') v ON v.Item = t.somecolumn
This way you look up 100 strings at a time with only one query.

Use var result = command.ExecuteScalar() and check whether result != null.
But a better option than looping would be to use a single SELECT statement such as
SELECT COUNT(*) FROM TABLE WHERE COLUMNVAL >= 0 AND COLUMNVAL < 100
and run ExecuteScalar on that; if the value is > 0, set your variable to 1.
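The "one round trip instead of a loop" idea can be sketched as follows. This is Python with the stdlib sqlite3 module rather than C# and SqlCommand, and the table contents and the names list are made up for illustration; the placeholder list plays the role of the parameterized @Value.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sometable (somecolumn TEXT)")
con.executemany("INSERT INTO sometable VALUES (?)", [("alice",), ("carol",)])

names = ["alice", "bob", "carol"]  # hypothetical stand-in for getName_as_Rows

# one "?" per name keeps the query parameterized; only placeholders are interpolated
placeholders = ", ".join("?" for _ in names)
found = {row[0] for row in con.execute(
    f"SELECT DISTINCT somecolumn FROM sometable WHERE somecolumn IN ({placeholders})",
    names,
)}

# per-name flag, equivalent to setting recordexist inside the original loop
record_exists = {name: name in found for name in names}
print(record_exists)
```

The database is queried once, and the per-name flags are then derived in memory, so the "other stuff" for existing records can still be done name by name.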

Selecting multiple row from one row in SQL

I have the following output with me from multiple tables
id b c b e b g
abc 2 123 3 321 7 876
abd 2 456 3 452 7 234
abe 2 0 3 123 7 121
abf 2 NULL 3 535 7 1212
Now I want to insert these values into another table and the insert query for a single command is as follows:
insert into resulttable values (id,b,c), (id,b,e) etc.
For that I need to do a select such that it gives me
id,b,c
id,b,e etc
I don't mind getting rid of b too, as it can be selected using a C# query.
How can I achieve this using a single query in SQL? Again, please note that this is not a table; it is the output of a query over different tables.
My query should look as follows; from the above I need to do something like:
select b.a, b.c
union all
select b.d,b.e from (select a,c,d,e from <set of join>) b
But unfortunately that does not work

INSERT resulttable
SELECT id, b, c
FROM original
UNION
SELECT id, b, e
FROM original
Your example has several columns named 'b' which isn't allowed...

Here, #tmporigin refers to your original query that produces the data in the question. Just replace the table name with a subquery.
insert into resulttable
select
o.id,
case a.n when 1 then b1 when 2 then b2 else b3 end,
case a.n when 1 then c when 2 then e else g end
from #tmporigin o
cross join (select 1 as n union all select 2 union all select 3) a
The original answer is below. It uses a CTE with UNION ALL, which requires the CTE to be evaluated 3 times.
"I have the following output with me from multiple tables"
So set that query up as a Common Table Expression:
;WITH CTE AS (
-- the query that produces that output
)
select id,b1,c from CTE
union all
select id,b2,e from CTE
union all
select id,b3,g from CTE
NOTE - Contrary to popular belief, your CTE while conveniently written once, is run thrice in the above query, once for each of the union all parts.
NOTE ALSO that if you actually name 3 columns "b" (literally), there is no way to identify which b you are referring to in anything that tries to reference the results - in fact SQL Server will not let you use the query in a CTE or subquery.
The following example shows how to perform the above and, if you show the execution plan, reveals that the CTE is run 3 times. The lines between ---- BELOW HERE and ---- ABOVE HERE are a mock of the original query that produces the output in the question.
if object_id('tempdb..#eav') is not null drop table #eav
;
create table #eav (id char(3), b int, v int)
insert #eav select 'abc', 2, 123
insert #eav select 'abc', 3, 321
insert #eav select 'abc', 7, 876
insert #eav select 'abd', 2, 456
insert #eav select 'abd', 3, 452
insert #eav select 'abd', 7, 234
insert #eav select 'abe', 2, 0
insert #eav select 'abe', 3, 123
insert #eav select 'abe', 7, 121
insert #eav select 'abf', 3, 535
insert #eav select 'abf', 7, 1212
;with cte as (
---- BELOW HERE
select id.id, b1, b1.v c, b2, b2.v e, b3, b3.v g
from
(select distinct id, 2 as b1, 3 as b2, 7 as b3 from #eav) id
left join #eav b1 on b1.b=id.b1 and b1.id=id.id
left join #eav b2 on b2.b=id.b2 and b2.id=id.id
left join #eav b3 on b3.b=id.b3 and b3.id=id.id
---- ABOVE HERE
)
select b1, c from cte
union all
select b2, e from cte
union all
select b3, g from cte
order by b1
You would be better off storing the data into a temp table before doing the union all select.

Instead of this which does not work as you know
select b.a, b.c
union all
select b.d,b.e from (select a,c,d,e from <set of join>) b
You can do this. Union with repeated sub-select
select b.a, b.c from (select a,c,d,e from <set of join>) b
union all
select b.d, b.e from (select a,c,d,e from <set of join>) b
Or this. Repeated use of cte.
with cte as
(select a,c,d,e from <set of join>)
select b.a, b.c from cte b
union all
select b.d, b.e from cte b
Or use a table variable.
declare @T table (a int, c int, d int, e int)
insert into @T
select a,c,d,e from <set of join>
select b.a, b.c from @T b
union all
select b.d, b.e from @T b
This code is not tested so there might be any number of typos in there.

I'm not sure if I understood your problem correctly, but I have been using something like this for some time:
let's say we have a table
ID Val1 Val2
1 A B
2 C D
to obtain a result like
ID Val
1 A
1 B
2 C
2 D
You can use a query :
select ID, case when i=1 then Val1 when i=2 then Val2 end as Val
from table
left join ( select 1 as i union all select 2 as i ) table_i on i=i
which simply joins the table with a subquery containing two values and creates a Cartesian product. In effect, every row is doubled (or multiplied by however many values the subquery has). You can vary the number of values depending on how many versions of each row you need. Depending on the value of i, Val will be Val1 or Val2 from the original table. If you look at the execution plan, there will be a warning that the join has no join predicate (because of i=i), but that is OK - it is what we want.
This makes the queries a bit large (in terms of text) because of all the case when branches, but they are quite easy to read if formatted right. I needed it for stupid tables like "BigID, smallID1, smallID2...smallID11", where the data was spread across many columns for reasons I don't know.
Hope it helps.
Oh, I use a static table with 10000 numbers, so i just use
join tab10k on i<=10
for 10x row.
I apologize for stupid formatting, I'm new here.
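For reference, the row-multiplying trick described above can be tried with this small sketch in Python's sqlite3 module. The table name wide is made up, the two rows are the sample from this answer, and an explicit CROSS JOIN is used instead of the always-true i=i predicate.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE wide (ID INTEGER, Val1 TEXT, Val2 TEXT)")  # hypothetical name
con.executemany("INSERT INTO wide VALUES (?, ?, ?)", [(1, "A", "B"), (2, "C", "D")])

rows = con.execute("""
    SELECT w.ID,
           CASE n.i WHEN 1 THEN w.Val1 WHEN 2 THEN w.Val2 END AS Val
    FROM wide AS w
    CROSS JOIN (SELECT 1 AS i UNION ALL SELECT 2) AS n  -- one copy of each row per i
    ORDER BY w.ID, n.i
""").fetchall()
print(rows)
```

Each source row appears once per value of i, and the CASE expression picks the column that belongs to that copy.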
