Matching all columns with all search phrases


I want to let a user search through all the columns in a table for a set of phrases defined in a textbox (split terms with whitespace).
The first thing that came to mind was finding a way in SQL to concatenate all the columns and then apply the LIKE operator (once per phrase) to the result.
The other solution I thought of was writing an algorithm that takes all the search phrases and matches them against all the columns.
So I ended up with the following:
String [] columns = {"col1", "col2", "col3", "col4"};
String [] phrases = textBox.Text.Split(' ');
I then took all the possible combinations of columns and phrases and put them into a WHERE-clause format for SQL. The result was
"(col1 LIKE '%phrase1%' AND col1 LIKE '%phrase2%') OR
(col1 LIKE '%phrase1%' AND col2 LIKE '%phrase2%') OR
(col1 LIKE '%phrase2%' AND col2 LIKE '%phrase1%') OR
(col2 LIKE '%phrase1%' AND col3 LIKE '%phrase2%')"
The above is just an example snippet of the output. The number of conditions created by this algorithm is given by
conditions=columns^(phrases+1)
I observed that 2 search phrases still give good performance, but more than that decreases performance drastically.
What is the best practice when searching all the columns for the same data?

Edwin,
I didn't know you were using Oracle. My solution uses SQL Server. Hopefully you will get the gist of it and can translate it into PL/SQL.
Hopefully this is useful to you.
I am populating the #search temp table manually. You will need to populate it from the user's input, for example with a split function that takes the delimited string and returns a table.
IF OBJECT_ID('tempdb..#keywords') IS NOT NULL
    DROP TABLE #keywords;
IF OBJECT_ID('tempdb..#search') IS NOT NULL
    DROP TABLE #search;
DECLARE @search_count INT;
-- Populate #search with all my search strings
SELECT *
INTO #search
FROM (
    SELECT '%ST%' AS Search
    UNION ALL
    SELECT '%CL%'
) T1;
SELECT @search_count = COUNT(*)
FROM #search;
PRINT @search_count;
-- Populate my #keywords table with all column values from my table with table id and values
-- I just did a select id, value union with all fields
SELECT *
INTO #keywords
FROM (
    SELECT client_id AS id
        ,First_name AS keyword
    FROM [CLIENT]
    UNION
    SELECT client_id
        ,last_name
    FROM [CLIENT]
) AS T1;
-- see what is in there
SELECT *
FROM #search;
SELECT *
FROM #keywords;
-- I am doing a count(distinct #search.Search). This will get me a count,
-- so if I put in 3 search values my count should equal 3 and that tells me all search strings have been found
SELECT #keywords.id
    ,COUNT(DISTINCT #search.Search)
FROM #keywords
INNER JOIN #search ON #keywords.keyword LIKE #search.Search
GROUP BY #keywords.id
HAVING COUNT(DISTINCT #search.Search) = @search_count;
SELECT *
FROM [CLIENT]
WHERE [CLIENT].client_id IN (
    SELECT #keywords.id
    FROM #keywords
    INNER JOIN #search ON #keywords.keyword LIKE #search.Search
    GROUP BY #keywords.id
    HAVING COUNT(DISTINCT #search.Search) = @search_count
);
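
The counting idea above can be tried out without a SQL Server instance. What follows is only a minimal sketch in Python using the standard-library sqlite3 module. The client table layout, the sample names and the helper find_ids_matching_all are assumptions made for illustration and are not part of the original answer. Note that SQLite's LIKE is case-insensitive for ASCII by default.

```python
import sqlite3


def find_ids_matching_all(conn, patterns):
    """Return ids whose column values, taken together, match every LIKE pattern."""
    cur = conn.cursor()
    # Temp table of search patterns, mirroring #search in the T-SQL version.
    cur.execute("DROP TABLE IF EXISTS temp.search")
    cur.execute("CREATE TEMP TABLE search (pattern TEXT)")
    cur.executemany("INSERT INTO search (pattern) VALUES (?)", [(p,) for p in patterns])
    rows = cur.execute(
        """
        WITH keywords AS (
            -- One (id, keyword) row per searchable column value, mirroring #keywords.
            SELECT client_id AS id, first_name AS keyword FROM client
            UNION
            SELECT client_id, last_name FROM client
        )
        SELECT k.id
        FROM keywords AS k
        JOIN search AS s ON k.keyword LIKE s.pattern
        GROUP BY k.id
        -- Keep only ids for which every distinct pattern was found somewhere.
        HAVING COUNT(DISTINCT s.pattern) = (SELECT COUNT(DISTINCT pattern) FROM search)
        ORDER BY k.id
        """
    ).fetchall()
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE client (client_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO client VALUES (?, ?, ?)",
    [(1, "Steve", "Clark"), (2, "Stella", "Jones"), (3, "Anna", "McClure")],
)
matches = find_ids_matching_all(conn, ["%ST%", "%CL%"])
print(matches)
```

Only the first sample client has both patterns somewhere in its columns; the other two each match just one pattern and are filtered out by the HAVING clause.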

You could create a stored procedure or function in PL/SQL to dynamically search the table for the search terms and then bring back the primary key and column of any matches. The code sample below should be enough to tailor to your requirements.
create table text_table(
col1 varchar2(32),
col2 varchar2(32),
col3 varchar2(32),
col4 varchar2(32),
col5 varchar2(32),
pk varchar2(32)
);
insert into text_table(col1, col2, col3, col4, col5, pk)
values ('the','quick','brown','fox','jumped', '1');
insert into text_table(col1, col2, col3, col4, col5, pk)
values ('over','the','lazy','dog','!', '2');
commit;
declare
rc sys_refcursor;
cursor_num number;
col_count number;
desc_tab dbms_sql.desc_tab;
vs_column_value varchar2(4000);
search_terms dbms_sql.varchar2a;
matching_cols dbms_sql.varchar2a;
empty dbms_sql.varchar2a;
key_value varchar2(32);
begin
--words to search for (i.e. from the text box)
search_terms(1) := 'fox';
search_terms(2) := 'box';
open rc for select * from text_table;
--Get the cursor number
cursor_num := dbms_sql.to_cursor_number(rc);
--Get the column definitions
dbms_sql.describe_columns(cursor_num, col_count, desc_tab);
--You must define the columns first
for i in 1..col_count loop
dbms_sql.define_column(cursor_num, i, vs_column_value, 4000);
end loop;
--loop through the rows
while ( dbms_sql.fetch_rows(cursor_num) > 0 ) loop
matching_cols := empty;
for i in 1 .. col_count loop --loop across the cols
--Get the column value
dbms_sql.column_value(cursor_num, i, vs_column_value);
--Get the value of the primary key based on the column name
if (desc_tab(i).col_name = 'PK') then
key_value := vs_column_value;
end if;
--Scan the search terms array for a match
for j in 1..search_terms.count loop
if (vs_column_value like '%'||search_terms(j)||'%') then
matching_cols(nvl(matching_cols.last,0) + 1) := desc_tab(i).col_name;
end if;
end loop;
end loop;
--Print the result matches
if matching_cols.last is not null then
for i in 1..matching_cols.last loop
dbms_output.put_line('Primary Key: '|| key_value||'. Matching Column: '||matching_cols(i));
end loop;
end if;
end loop;
end;
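
For comparison with the combination-based WHERE clause in the question: each phrase only has to match at least one column, so the same condition can be written as an AND over phrases of an OR over columns. That grows as columns × phrases instead of exponentially. The sketch below is in Python with the standard-library sqlite3 module and is only an illustration. build_where and the table t are hypothetical names, the column names are assumed to come from a trusted fixed list, and the phrases are bound as parameters instead of being pasted into the SQL text.

```python
import sqlite3


def build_where(columns, phrases):
    """Build '(c1 LIKE ? OR c2 LIKE ?) AND (...)' plus the matching parameter list."""
    groups, params = [], []
    for phrase in phrases:
        # Each phrase must appear in at least one of the columns.
        groups.append("(" + " OR ".join(f"{col} LIKE ?" for col in columns) + ")")
        params.extend([f"%{phrase}%"] * len(columns))
    if not groups:
        # No phrases: match every row.
        return "1=1", []
    return " AND ".join(groups), params


columns = ["col1", "col2", "col3", "col4"]  # trusted, fixed list of column names
phrases = "quick fox".split()               # stands in for the text box input

where, params = build_where(columns, phrases)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pk INTEGER, col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [(1, "the", "quick", "brown", "fox"), (2, "over", "the", "lazy", "dog")],
)
found = [row[0] for row in conn.execute(f"SELECT pk FROM t WHERE {where}", params)]
print(where)
print(found)
```

With 4 columns and 2 phrases this produces 8 LIKE conditions, and adding a phrase adds only 4 more.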

Related

How can I sort counted DataTable rows according to specific parameter

I am trying to use dataTable.Rows.Count, but sort the result based on a specific parameter.
That parameter being "Column1" in my DataTable. So that the output gives me the rows pertaining to that distinct value only.
My DataTable is in a View, but I need to use the sorted int values in a ViewModel.
I am able to count the rows with public static int o { get; set; } and a
dt.Rows.Count in my DataTable. I then grab the value by instantiating my View in my ViewModel, with int numberOfRows = ViewName.o;.
But that gives me the total number of rows, whereas I need the number of rows per distinct value in "Column1".
My question is, where and how can I do the required sorting?
Because when I've gone as far as to count them and add them to an int (in my ViewModel), there's no way to know what row they used to represent, right?
And if I try to sort in the DataTable (in the View) somehow, I don't know how to reference the distinct values.
They might vary from time to time as the program is used, so I can't hard-code it.
A comment suggested using a query instead, so I am adding my stored procedure to help with implementing a solution:
SET NOCOUNT ON
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[myProcedure]
@param myUserDefinedTableType READONLY
AS
BEGIN TRANSACTION
INSERT INTO [dbo].[myTable] (/* list of columns */)
SELECT [Column1], /* this is the column I need to sort by */
-- more columns
-- the rest of the columns
/* I do aggregations on my columns here. I am reducing several thousand rows to 20 summarized rows, so after this line I can no longer get ALL the rows per "Column1", only the summarized rows. How can I count the rows BEFORE I do the aggregations? */
FROM @param
GROUP BY [Column2], [Column1]
ORDER BY [Column2], [Column1]
-- Some UPDATE clauses
COMMIT TRANSACTION

I believe the wisest choice is to do this on the database side.
Assuming you're using SQL Server, the query should be:
SELECT *, COUNT(*) OVER (PARTITION BY Column1) AS c
FROM Table1
ORDER BY c
This query returns the data in your table "Table1", plus a column "c" that holds, for each row, the number of rows sharing that row's "Column1" value.
Finally, it sorts the rows by the column "c", as you requested.
EDIT
To complete this task, I will use a Common Table Expression:
-- Code before INSERT...
;WITH CTE1 AS (
SELECT *,
COUNT(*) OVER (PARTITION BY [Column1]) AS c
FROM @param
)
INSERT INTO [dbo].[myTable] (/* list of columns - must add the column c */)
SELECT [Column1],
[Column2],
[c],
-- aggregated columns
FROM CTE1
GROUP BY [Column2], [Column1], c
-- Code after INSERT...
In the Common Table Expression "CTE1" I select all the values in @param, adding a column "c" with the count per Column1.
Note: if you have 5 rows with the same value of Column1, but two different values in Column2, in myTable you will have two rows (because of the GROUP BY [Column2], [Column1]), both with c=5.
If you instead want the count grouped by Column1 and Column2, declare c as follows: COUNT(*) OVER (PARTITION BY [Column1], [Column2]) AS c.
I hope that was clear. If not, I'm happy to explain it in a different way.
EXAMPLE
CREATE TABLE myTable (col1 VARCHAR(50), col2 INT, col3 INT, c INT)
CREATE TYPE myUserDefinedTableType AS TABLE (column1 VARCHAR(50), column2 INT, column3 INT)
DECLARE @param myUserDefinedTableType
INSERT INTO @param VALUES ('A', 1, 4), ('A', 2, 3), ('A', 2, 6), ('B', 2, 3)
;WITH CTE1 AS (
SELECT *, COUNT(*) OVER (PARTITION BY [Column1]) AS c
FROM @param
)
INSERT INTO [myTable]([col1], [col2], [col3], [c])
SELECT [column1], [column2],
-- aggregated columns
MAX([column3]),
-- count
[c]
FROM CTE1
GROUP BY [column2], [column1], c
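
To see what the example produces, the same four rows can be run through an equivalent query in SQLite. This is only a sketch in Python with the standard-library sqlite3 module. It assumes the bundled SQLite is version 3.25 or later (needed for window functions), and it uses an ordinary table named param in place of the table-valued parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE param (column1 TEXT, column2 INTEGER, column3 INTEGER)")
conn.executemany(
    "INSERT INTO param VALUES (?, ?, ?)",
    [("A", 1, 4), ("A", 2, 3), ("A", 2, 6), ("B", 2, 3)],
)

summary = conn.execute(
    """
    WITH cte1 AS (
        -- c is counted per column1 BEFORE the rows are aggregated.
        SELECT *, COUNT(*) OVER (PARTITION BY column1) AS c
        FROM param
    )
    SELECT column1, column2, MAX(column3), c
    FROM cte1
    GROUP BY column2, column1, c
    ORDER BY column1, column2
    """
).fetchall()

for row in summary:
    print(row)
```

The three 'A' rows collapse into two summarized rows, and both keep c = 3, the number of source rows with column1 = 'A'. The single 'B' row keeps c = 1.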

DataView dv = dt.DefaultView;
dv.Sort = "SName ASC"; // your column name
DataTable dtSorted = dv.ToTable();
DataTable dtDistinct = dv.ToTable(true, "SName", "Surl"); // return distinct rows

Batch delete operation procedure not working

I have a stored procedure which looks like the following:
alter procedure [dbo].[zsp_deleteEndedItems]
(
@ItemIDList nvarchar(max)
)
as
delete from
SearchedUserItems
WHERE EXISTS (SELECT 1 FROM dbo.SplitStringProduction(@ItemIDList,',') S1 WHERE ItemID=S1.val)
The parameter @ItemIDList is passed like this:
124125125,125125125...etc etc
And the split string function looks like this:
ALTER FUNCTION [dbo].[SplitStringProduction]
(
@string nvarchar(max),
@delimiter nvarchar(5)
) RETURNS @t TABLE
(
val nvarchar(500)
)
AS
BEGIN
declare @xml xml
set @xml = N'<root><r>' + replace(@string,@delimiter,'</r><r>') + '</r></root>'
insert into @t(val)
select
r.value('.','varchar(500)') as item
from @xml.nodes('//root/r') as records(r)
RETURN
END
This is supposed to delete all items from the table "SearchedUserItems" with the IDs:
124125125 and 125125125
But for some reason, when I run a select afterwards to check:
select * from SearchedUserItems
where itemid in('124125125','125125125')
The records are still there...
What am I doing wrong here? Can someone help me out?

As mentioned in the comments, a different option would be to use a table type parameter. This makes a couple of assumptions (some commented), however, should get you on the right path:
CREATE TYPE dbo.IDList AS TABLE (ItemID int NOT NULL); --Assumed int datatype;
GO
ALTER PROC dbo.zsp_deleteEndedItems @ItemIDList dbo.IDList READONLY AS
DELETE SUI
FROM dbo.SearchedUserItems SUI
JOIN @ItemIDList IDL ON SUI.ItemID = IDL.ItemID;
GO
--Example of usage
DECLARE @ItemList dbo.IDList;
INSERT INTO @ItemList
VALUES(123456),(123457),(123458);
EXEC dbo.zsp_deleteEndedItems @ItemList;
GO
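
The same shape, a table of IDs joined against the target table instead of a delimited string, can be sketched from client code. The example below uses Python with the standard-library sqlite3 module purely as an illustration. The helper delete_items, the temp table id_list and the integer ItemID column are assumptions, not part of the original procedure.

```python
import sqlite3


def delete_items(conn, item_ids):
    """Delete rows whose ItemID is in item_ids and return how many were removed."""
    cur = conn.cursor()
    # The temp table plays the role of the table-valued parameter.
    cur.execute("CREATE TEMP TABLE IF NOT EXISTS id_list (ItemID INTEGER NOT NULL)")
    cur.execute("DELETE FROM id_list")
    cur.executemany("INSERT INTO id_list (ItemID) VALUES (?)", [(i,) for i in item_ids])
    cur.execute(
        "DELETE FROM SearchedUserItems WHERE ItemID IN (SELECT ItemID FROM id_list)"
    )
    deleted = cur.rowcount
    conn.commit()
    return deleted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SearchedUserItems (ItemID INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO SearchedUserItems VALUES (?)",
    [(124125125,), (125125125,), (999,)],
)

removed = delete_items(conn, [124125125, 125125125])
remaining = [r[0] for r in conn.execute("SELECT ItemID FROM SearchedUserItems")]
print(removed, remaining)
```

Returning the affected row count makes it easy to notice when a call deletes nothing, which is the symptom described in the question.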
With regard to the question about an inline table-valued function (iTVF), one such example is the function below, which I quickly wrote up; it provides a tally table of the next 1,000 numbers:
CREATE FUNCTION dbo.NextThousand (@Start int)
RETURNS TABLE
AS RETURN
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)
)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 + @Start AS I
FROM N N1 --10
CROSS JOIN N N2 --100
CROSS JOIN N N3; --1,000
GO
The important thing about an iTVF is that it has only one statement, and that is the RETURN statement. Declaring the table as a return-type variable, inserting data into it, and returning that variable turns it into a multi-statement TVF, which performs far slower.

Many SQL rows into one

I've got a stored procedure which joins a number of tables to produce a large resultset which is then returned to my application. The application in turn loops through the results and combines rows on a particular ID and chooses data per row to include in a new object. This is perhaps easiest to explain using an example:
Inspection, Desc, Value
1, Description1, 3
1, Description2, 2
1, Description3, 5
This is in code turned into
Inspection, Description1, Description2, Description3
1, 3, 2, 5
The point of this is to have one row per inspection item with item description as headers and value as the cell value for inspection row and header. This is then exported to Excel.
The question is: how do I do this in SQL Server, as in expanding my SP to return a lot fewer but "wider" rows with a lot more columns?
Another complication is that one inspection may have rows which another one lacks; in that case the solution is to add an empty value or a '-'.
P.S. This is using SQL Server 2012.

If you are using SQL Server 2005 or later, you can use a pivot like this:
Test data
DECLARE @tbl TABLE(Inspection INT, [Desc] VARCHAR(100), Value INT)
INSERT INTO @tbl
VALUES
(1,'Description1', 3),
(1,'Description2', 2),
(1,'Description3', 5)
Query
SELECT
*
FROM
(
SELECT
tbl.Inspection,
tbl.[Desc],
tbl.Value
FROM
@tbl AS tbl
) AS tbl
PIVOT
(
SUM(Value)
FOR [Desc] IN ([Description1],[Description2],[Description3])
) AS pvt
Result:
Inspection, Description1, Description2, Description3
1, 3, 2, 5
Edit
As juharr said in the comment:
The resulting column names (values in the table) must be known when building the query, which might require another initial query to get them.
Edit 2
If you are not using SQL Server 2005 or later, or want an alternative approach, see the following query:
SELECT
tbl.Inspection,
SUM(CASE WHEN [Desc]='Description1' THEN tbl.Value ELSE 0 END) AS Description1,
SUM(CASE WHEN [Desc]='Description2' THEN tbl.Value ELSE 0 END) AS Description2,
SUM(CASE WHEN [Desc]='Description3' THEN tbl.Value ELSE 0 END) AS Description3
FROM
@tbl AS tbl
GROUP BY
tbl.Inspection
This does not require a pivot and can be used on most RDBMSs out there.

You should use the SQL Server PIVOT operator, which converts rows into columns. This example is an easy place to start.

If you'd like to do this dynamically, without having to know what all of the Desc values are, you can build your pivot query and run it with EXEC() or EXECUTE sp_executesql:
DECLARE @Columns NVARCHAR(MAX),
@Sql NVARCHAR(MAX)
--Build your column headers based on Distinct Desc values
SELECT @Columns = COALESCE(@Columns + ',', '') + QUOTENAME([Desc])
FROM (SELECT DISTINCT [Desc] FROM tbl) t
ORDER BY [Desc]
--Build your pivot query
SET @Sql = '
SELECT
*
FROM
tbl
PIVOT
(
MAX([Value])
FOR [Desc] IN (' + @Columns + ')
) p
'
EXEC(@Sql)
If you want '-' for NULL values, you'll need to create another variable to hold the conversion expressions for the SELECT part of your SQL.
DECLARE @Columns NVARCHAR(MAX),
@Sql NVARCHAR(MAX),
@ColumnAliases NVARCHAR(MAX)
--Build your pivot columns based on Distinct Desc values
SELECT @Columns = COALESCE(@Columns + ',', '') + QUOTENAME([Desc])
FROM (SELECT DISTINCT [Desc] FROM tbl) t
ORDER BY [Desc]
--Build your column headers, replacing NULL with -
SELECT @ColumnAliases = COALESCE(@ColumnAliases + ',', '')
+ 'COALESCE(CONVERT(VARCHAR,' + QUOTENAME([Desc]) + '),''-'') AS ' + QUOTENAME([Desc])
FROM (SELECT DISTINCT [Desc] FROM tbl) t
ORDER BY [Desc]
--Build your pivot query
SET @Sql = '
SELECT
Inspection,'
+ @ColumnAliases + '
FROM
tbl
PIVOT
(
MAX([Value])
FOR [Desc] IN (' + @Columns + ')
) p
'
EXEC(@Sql)
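
The two-step pattern above (first query the distinct Desc values, then generate one output column per value) can also be sketched outside SQL Server. The example below uses Python with the standard-library sqlite3 module. SQLite has no PIVOT operator, so this sketch swaps in conditional aggregation. The helpers quote_ident and pivot_inspections and the extra sample row for inspection 2 are assumptions added for illustration.

```python
import sqlite3

# One generated column per description; a missing value becomes '-'.
CASE_TEMPLATE = """COALESCE(MAX(CASE WHEN "Desc" = ? THEN Value END), '-') AS {alias}"""


def quote_ident(name):
    """Quote an identifier for SQLite, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def pivot_inspections(conn):
    """Return (headers, rows): one row per inspection, one column per Desc value."""
    # Step 1: discover the column headers from the data.
    descs = [r[0] for r in conn.execute('SELECT DISTINCT "Desc" FROM tbl ORDER BY "Desc"')]
    # Step 2: build the select list; the Desc values themselves are bound as parameters.
    select_list = ["Inspection"] + [CASE_TEMPLATE.format(alias=quote_ident(d)) for d in descs]
    sql = "SELECT " + ", ".join(select_list) + " FROM tbl GROUP BY Inspection ORDER BY Inspection"
    cur = conn.execute(sql, descs)
    headers = [col[0] for col in cur.description]
    return headers, cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tbl (Inspection INTEGER, "Desc" TEXT, Value INTEGER)')
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (1, "Description1", 3),
        (1, "Description2", 2),
        (1, "Description3", 5),
        (2, "Description1", 7),  # inspection 2 lacks Description2 and Description3
    ],
)

headers, rows = pivot_inspections(conn)
print(headers)
for row in rows:
    print(row)
```

Quoting the generated column aliases serves the same purpose as QUOTENAME in the T-SQL version: a description containing spaces or punctuation still yields a valid column name.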

IN Clause with WHERE clause not getting proper results in SQL Server

SELECT Col1, Col2, Col3, Col4
FROM Table1
WHERE User1 = @Owner
AND group1 = @Group
AND date1 BETWEEN @startDate AND @endDate
AND Mail LIKE @email
AND def IN (CASE @InvoiceMethod -- Problem is here
WHEN ''
THEN def
ELSE (@InvoiceMethod)
END)
This is a piece of code from the stored procedure. When I execute it, it returns no rows, even though there are rows to return. The problem is with the IN clause: if I don't pass anything to the IN clause, i.e. @InvoiceMethod is null, then I get rows.
If I pass anything to @InvoiceMethod, I get no rows.
The value in @InvoiceMethod is 'A','B'
I tried many combinations like 'A','B' or "A","B" without any results.
How do I pass values to the IN clause, and in which format?
Please help me out with this.
I modified the stored procedure to the following:
DECLARE @tmpt TABLE (value NVARCHAR(5) NOT NULL)
SET @InvoiceCount = (SELECT COUNT(*) FROM dbo.fnSplit(@InvoiceMethod, ','))
SET @tempVar = 1;
WHILE @tempVar <= (@InvoiceCount)
BEGIN
INSERT INTO @tmpt (value)
VALUES (@InvoiceMethod); -- Here I need to insert an array of values into the temp table, like invoicemethod[0], invoicemethod[1] and invoicemethod[2], depending on @InvoiceCount
SET @tempVar = @tempVar + 1;
END
--DECLARE @tmpt TABLE (value NVARCHAR(5) NOT NULL)
--INSERT INTO @tmpt (value) VALUES (@InvoiceMethod);
SELECT Col1,Col2,Col3,Col4
FROM Table1
WHERE User1 = @Owner
AND group1 = @Group
AND date1 BETWEEN @startDate AND @endDate
AND Mail LIKE @email
AND def IN (SELECT value FROM @tmpt)
But I am still not getting the results I expected :(

IMO, passing a list of filter values for a column as a comma-separated string isn't a good way to approach this problem, as it almost encourages a dynamic SQL approach (i.e. where you EXEC a built SQL string which pastes in @InvoiceMethod as a string).
Instead, SQL Server 2008 has table-valued parameters (and prior to this, you could use XML), which allow you to pass structured data into a procedure in a table format.
You then just need to join to this table parameter to effect the 1..N valued IN () filtering.
CREATE TYPE ttInvoiceMethods AS TABLE
(
Method VARCHAR(20)
);
GO
CREATE PROCEDURE dbo.SomeProc
(
@InvoiceMethod ttInvoiceMethods READONLY, -- ... Other Params here
)
AS
BEGIN
SELECT Col1, Col2, ...
FROM Table1
INNER JOIN @InvoiceMethod im
ON Table1.def = im.Method -- Join here (a table variable needs an alias to reference its columns)
WHERE User1 = @Owner
... Other Filters here
END
Have a look here for a similar solution with a fiddle.
Edit
The optional parameter (@InvoiceMethod = '') can be handled by replacing the JOIN to the TVP with a subquery:
WHERE
-- ... Other filters
AND (Table1.def IN (SELECT Method FROM @InvoiceMethod)
OR NOT EXISTS (SELECT 1 FROM @InvoiceMethod))
A TVP cannot be NULL. If you don't bind to it in C# at all, it arrives as an empty table, which the NOT EXISTS check treats as "no filter".

I think a single variable representing multiple comma-separated values is not allowed in the IN clause. You should either use string functions (split and join) or go with the temp table solution. I prefer the second.
Use a temporary table (here, a table variable) to store your values and then pass it to your IN clause:
DECLARE @tmpt TABLE (value NVARCHAR(5) NOT NULL)
INSERT INTO @tmpt .........
...
...
SELECT Col1,Col2,Col3,Col4
FROM Table1
WHERE User1 = @Owner
AND group1 = @Group
AND date1 BETWEEN @startDate AND @endDate
AND Mail LIKE @email
AND def IN (SELECT value FROM @tmpt)

I used a split function to resolve the issue. Modified SQL query:
SELECT Col1, Col2, Col3, Col4
FROM Table1
WHERE User1 = @Owner
AND group1 = @Group
AND date1 BETWEEN @startDate AND @endDate
AND Mail LIKE @email
AND def IN (SELECT * FROM sptFunction(@InvoiceMethod,',')) -- Problem was here (solved by using a split function)

How to use a set of strings in a WHERE statement of SQL?

Sorry, I am not sure how to title the question well. I want to select a few records in SQL where a particular column matches one of a set of values.
Example: I have a table student with columns ID and name. ID has the values 1,2,3,4,5,6. Name has A,B,C,D,E,F.
I want to return C,D,E WHERE ID=[3,4,5].
I tried
SELECT FROM student WHERE ID=2,3,4
It gives an error; ID=2,3,4 behaves like ID='2,3,4' and the list is read as a single value. I am confused.
Also, in my case the set of IDs is held in a stored procedure variable, @ID:
SELECT * FROM STUDENT WHERE ID=@ID
@ID above is a string variable holding the set {1,2,3}. Any help would be appreciated.

Try this:
SELECT * FROM student WHERE ID IN (2,3,4)
Syntax:
test_expression IN
( subquery | expression [ ,...n ]
)
Read more about IN operator here.

WHERE ID=2,3,4 is invalid SQL syntax, and WHERE ID='2,3,4' compares ID against the single string '2,3,4'.
It looks like you can use IN (Transact-SQL) in your situation:
Determines whether a specified value matches any value in a subquery or a list.
SELECT * FROM student WHERE ID IN (2, 3, 4)
You might also take a look at Jeff's question, Parameterize an SQL IN clause.

If you are passing @ID as a variable with a comma-separated list of ids, WHERE ID IN (@ID) will not work.
I think the best thing would be to use a table-valued function to split them first and then query the table. Please check here for a Split() function.
Usage:
SELECT * FROM STUDENT
WHERE ID IN (
SELECT items FROM dbo.Split(@ID, ',') --Split function here
)

If you want to filter on multiple values in a SELECT, you should use "in ()":
SELECT * FROM student WHERE ID in (2,3,4)
OR
SELECT * FROM student WHERE ID between 2 and 4
OR
SELECT * FROM student WHERE ID = 2 OR ID = 3 OR ID = 4
In this case, take the first one.
The last one is verbose and harder to maintain, and is not recommended in this scenario.

Please check this out:
Select * from Student where Id IN ('2','3','4')
and, as dynamic SQL:
'Select Username from Student where ID IN (' + @Id + ')'
where @Id = '2,3,4'

Select * from Student where Id='2'
union all
Select * from Student where Id='3'
union all
Select * from Student where Id='4'

Based on your comment below, you don't want to convert ID to an int. Instead, use LIKE to compare (this assumes @ID contains no spaces around the commas):
SELECT * from STUDENT
WHERE ',' + @ID + ',' LIKE '%,' + CAST(ID as NVARCHAR(255)) + ',%';
However, this query cannot use an index on ID. If you want the query to use an index, then use dynamic SQL:
DECLARE @query NVARCHAR(max) = 'SELECT * FROM STUDENT WHERE ID IN ('+ @ID +')';
EXEC sp_executesql @query;
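
Concatenating the ID list straight into the SQL text, as the dynamic SQL above does, is open to SQL injection if the string comes from user input. When the list is assembled in client code, one alternative is to generate one placeholder per value and bind the values as parameters. The sketch below shows that idea in Python with the standard-library sqlite3 module. The helper students_by_ids and the sample student table are assumptions for illustration only.

```python
import sqlite3


def students_by_ids(conn, id_csv):
    """Look up students for a comma-separated id string such as '3,4,5'."""
    # int() rejects anything that is not a number, so no SQL text can sneak in.
    ids = [int(part) for part in id_csv.split(",") if part.strip()]
    if not ids:
        return []
    # One '?' per value; the values themselves are bound, never concatenated.
    placeholders = ", ".join("?" for _ in ids)
    sql = f"SELECT ID, name FROM student WHERE ID IN ({placeholders}) ORDER BY ID"
    return conn.execute(sql, ids).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (ID INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"), (6, "F")],
)

result = students_by_ids(conn, "3,4,5")
print(result)
```

Because the IN list is still a plain list of values on ID, the database remains free to use an index on that column.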

Since you are using a stored procedure that only does an equality comparison (i.e. ID = 1), you would either have to execute three queries, one per value, after splitting the comma-separated input,
or you can add a new procedure together with a custom function on the server, using the following SQL:
CREATE FUNCTION dbo.myparameter_to_list (@parameter VARCHAR(MAX)) RETURNS @myOutput TABLE (mytempVal VARCHAR(40))
AS
BEGIN
DECLARE @TempTable TABLE
(
mytempVal VARCHAR(40)
)
DECLARE @par VARCHAR(MAX), @MySplittedValue VARCHAR(40), @PositionOfComma INT
-- Append a trailing comma so the loop also picks up the last value
SET @par = LTRIM(RTRIM(@parameter)) + ','
SET @PositionOfComma = CHARINDEX(',', @par, 1)
IF REPLACE(@par, ',', '') <> ''
BEGIN
WHILE @PositionOfComma > 0
BEGIN
SET @MySplittedValue = LTRIM(RTRIM(LEFT(@par, @PositionOfComma - 1)))
IF @MySplittedValue <> ''
BEGIN
INSERT INTO @TempTable (mytempVal) VALUES (@MySplittedValue) --Use conversion if needed
END
SET @par = RIGHT(@par, LEN(@par) - @PositionOfComma)
SET @PositionOfComma = CHARINDEX(',', @par, 1)
END
END
INSERT @myOutput
SELECT mytempVal
FROM @TempTable
RETURN
END
In your stored procedure you would use it like this:
CREATE PROCEDURE StudentSelectFromSet
@Ids VARCHAR(MAX)
AS
SELECT * FROM student Stud
WHERE Stud.Id IN (SELECT mytempVal FROM dbo.myparameter_to_list(@Ids))
and then execute this new procedure the same way you were calling the earlier one.
