Preventing duplication from SQL Query, ASP.net

Preventing duplication from SQL Query, ASP.net - c#

I have the below SQL query using the Query Builder in Visual Studio. As you can see the same user is duplicated 3 times, this is due to the user having 3 different skills. How can I merge the 3 skills together in the SQL query or in a ListView control so that it only displays one result instead of 3 and that the user has their 3 skills listed?
SELECT users.role_id, users.username, users.first_name, users.last_name, users.description, roles.role_id, roles.role, skills.skill_id, skills.user_id, skills.skill
FROM users
INNER JOIN roles ON users.role_id = roles.role_id
INNER JOIN skills ON users.user_id = skills.user_id
WHERE (users.role_id = 3)

Use For XML Path(''), Type. It is a bit of a hack, because you're really creating an XML string without a root and fashioning odd elements, but it works well. Be sure to include the Type bit, otherwise the XML trick will attempt to convert special characters, like < and & into their XML escape sequences (here is an example).
Here is a simplified version of your problem in a SQL Fiddle. Below is the relevant Select snippet.
SELECT users.user_id, users.first_name,
STUFF(
(SELECT ', ' + skill
FROM skills
WHERE users.user_id = skills.user_id
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(MAX)')
, 1, 2, '') AS skill_list
FROM users

Try using Stuff and For Xml
Here's the Fiddle:
http://sqlfiddle.com/#!6/fcf71/5
See if it helps, it's just a sample so you will have to change the column names.

You can use PIVOT on the Skill then group those skills into one column.
To make it simple, I test it with some sample data like the following:
CREATE SCHEMA _Test
CREATE TABLE _Test.SkillSet(SkillId INT IDENTITY(1,1) PRIMARY KEY, SkillName NVARCHAR(64))
INSERT INTO _Test.SkillSet(SkillName) VALUES('C/C++')
INSERT INTO _Test.SkillSet(SkillName) VALUES('C#')
INSERT INTO _Test.SkillSet(SkillName) VALUES('Java')
CREATE TABLE _Test.Employees(EmpId INT IDENTITY(1,1) PRIMARY KEY, FullName NVARCHAR(256))
INSERT INTO _Test.Employees(FullName) VALUES('Philip Hatt')
INSERT INTO _Test.Employees(FullName) VALUES('John Rosh')
CREATE TABLE _Test.Employee_Skill(EmpId INT FOREIGN KEY REFERENCES _Test.Employees(EmpId), SkillId INT FOREIGN KEY REFERENCES _Test.SkillSet(SkillId))
INSERT INTO _Test.Employee_Skill(EmpId, SkillId) VALUES(1, 1)
INSERT INTO _Test.Employee_Skill(EmpId, SkillId) VALUES(1, 2)
INSERT INTO _Test.Employee_Skill(EmpId, SkillId) VALUES(1, 3)
INSERT INTO _Test.Employee_Skill(EmpId, SkillId) VALUES(2, 2)
INSERT INTO _Test.Employee_Skill(EmpId, SkillId) VALUES(2, 3)
WITH tEmpSkill
AS
(SELECT A.EmpId, A.FullName, C.SkillName
FROM _Test.SkillSet C RIGHT JOIN
(
_Test.Employees A LEFT JOIN _Test.Employee_Skill B
ON A.EmpId = B.EmpId
)
ON B.SkillId = C.SkillId
)
SELECT * FROM tEmpSkill
PIVOT(COUNT(SkillName) FOR SkillName IN([C/C++], [C#], [Java])) AS Competency
The query above gives me an intermediate result
PIVOT RESULT
Now you can easily make a string containing all the skills needed for each employee. You can also search for some articles to use the PIVOT with unknown number of columns (skill sets), which may better serve your need.
Hope this can help.

Related

What is a fastest way to get words from T-SQL datatable?

I have a SQL Server 2008 R2 datatable dbo.Forum_Posts with columns Subject (nvarchar(255)) andBody (nvarchar(max)).
I would like to get all words with length >= 3 from columns Subject and Body and insert them into datatable dbo.Search_Word (column Word, nvarchar(100)) and datatable dbo.SearchItem (column Title (nvarchar(200)).
I also want to get new generated SearchWordsID (primary key, autoincrement, int) from dbo.Search_Word, and SearchItemID (primary key, autoincrement,int) from dbo.SearchItem, and insert them into datatable dbo.SearchItemWord (columns SearchWordsID (foreign key,int, not null) and SearchItemID (foreign key,int,not null).
What is a fastest way to do this in T-SQL? Or I have to use C#? Thank you in advance for any help.

As requested, this will keep the ID's. So you will get a DISTINCT list of works BY id.
Slightly different approach than the first answer, but easily achieved via the Outer Apply
**
You must edit the initial query Select KeyID=[YourKeyID],Words=[YourField1]+' '+[YourField2] from [YourTable]
**
Declare #String varchar(max) = ''
Declare #Delimeter varchar(25) = ' '
-- Generate and Strip special characters
Declare #StripChar table (Chr varchar(10));Insert Into #StripChar values ('.'),(','),('/'),('('),(')'),(':') -- Add/Remove as needed
-- Generate Base Data and Expand via Outer Apply
Declare #XML xml
Set #XML = (
Select A.KeyID
,B.Word
From ( Select KeyID=[YourKeyID],Words=[YourField1]+' '+[YourField2] from [YourTable]) A
Outer Apply (
Select Word=split.a.value('.', 'varchar(150)')
From (Select Cast ('<x>' + Replace(A.Words, #Delimeter, '</x><x>')+ '</x>' AS XML) AS Data) AS A
Cross Apply data.nodes ('/x') AS Split(a)
) B
For XML RAW)
-- Convert XML to varchar(max) for Global Search & Replace (could be promoted to Outer Appy)
Select #String = Replace(Replace(cast(#XML as varchar(max)),Chr,' '),' ',' ') From #StripChar
Select #XML = cast(#String as XML)
Select Distinct
KeyID = t.col.value('#KeyID', 'int')
,Word = t.col.value('#Word', 'varchar(150)')
From #XML.nodes('/row') AS t (col)
Where Len(t.col.value('#Word', 'varchar(150)'))>3
Order By 1
Returns
KetID Word
0 UNDEF
0 Undefined
1 HIER
1 System
2 Control
2 UNDEF
3 JOBCONTROL
3 Market
3 Performance
...
87 Analyitics
87 Market
87 UNDEF
88 Branches
88 FDIC
88 UNDEF
...

You're going to need T-SQL to do the inserting into your tables. Your biggest challenge is going to be splitting the posts into words.
My suggestion would be to read the posts into C#, split each post into words (you can use the Split method to split on spaces or punctuation), filter the collection of words, and then execute your inserts from C#.
You can avoid using T-SQL directly if you use Entity Framework or a similar ORM.
Don't try to use T-SQL to split your posts into words unless you really want a totally SQL solution and are willing to take time to perfect it. And, yes, it will be slow: T-SQL is not fast at string operations.
You can also investigate full text indexing, which I believe has support for search keywords.

Perhaps this will help
Declare #String varchar(max) = ''
Declare #Delimeter varchar(25) = ' '
Select #String = #String + ' '+Words
From (
Select Words=[YourField1]+' '+[YourField2] from [YourTable]
) A
-- Generate and Strip special characters
Declare #StripChar table (Chr varchar(10));Insert Into #StripChar values ('.'),(','),('/'),('('),(')'),(':') -- Add/Remove as needed
Select #String = Replace(Replace(#String,Chr,' '),' ',' ') From #StripChar
-- Convert String into XML and Split Delimited String
Declare #Table Table (RowNr int Identity(1,1), String varchar(100))
Declare #XML xml = Cast('<x>' + Replace(#String,#Delimeter,'</x><x>')+'</x>' as XML)
Insert Into #Table Select String.value('.', 'varchar(max)') From #XML.nodes('x') as T(String)
-- Generate Final Resuls
Select Distinct String
From #Table
Where Len(String)>3
Order By 1
Returns (sample)
String
------------------
Access
Active
Adminstrators
Alternate
Analyitics
Applications
Branches
Cappelletti
City
Class
Code
Comments
Contact
Control
Daily
Data
Date
Definition
Deleted
Down
Email
FDIC
Variables
Weekly

Getting the first unused Id from a table?

In our database we have a table that lacks an identity column. There is an Id column, but it is manually populated when a record is inputted. Any item with an Id over 90,000 is reserved and is populated globally across all customer databases.
I'm building a tool to handle bulk insertions into this table using Entity Framework. I need to figure out what the most efficient method of finding the first available Id is (under 90,000) on the fly without iterating over every single row. It is highly likely that in many of the databases, someone has simply selected a random number that wasn't taken and used it to insert the row.
What is my best recourse?
Edit
After seeing some of the solutions listed, I attempted to replicate the SQL logic in Linq. I doubt it's perfect, but it seems incredibly fast and efficient.
var availableIds = Enumerable.Range(1, 89999)
.Except(db.Table.Where(n => n.Id <= 89999)
.Select(n => n.TagAssociationTypeID))
.ToList();

Have you considered something like:
SELECT
min(RN) AS FirstAvailableID
FROM (
SELECT
row_number() OVER (ORDER BY Id) AS RN,
Id
FROM
YourTable
) x
WHERE
RN <> Id

To answer your implied question of how do you get a list of available numbers to use: Easy, make a list of all possible numbers then delete the ones that are in use.
--This builds a list of numbers from 1 to 89999
SELECT TOP (89999) n = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
INTO #AvialableNumbers
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);
CREATE UNIQUE CLUSTERED INDEX n ON #AvialableNumbers(n)
--Start a seralizeable transaction so we can be sure no one uses a number
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
begin transaction
--Remove numbers that are in use.
delete #AvialableNumbers where n in (select Id from YourTable)
/*
Do your insert here using numbers from #AvialableNumbers
*/
commit transaction
Here is how you would do it via Entity framework
using(var context = new YourContext(connectionString))
using(var transaction = context.Database.BeginTransaction(IsolationLevel.Serializable))
{
var query = #"
SELECT TOP (89999) n = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
INTO #AvialableNumbers
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);
CREATE UNIQUE CLUSTERED INDEX n ON #AvialableNumbers(n)
--Remove numbers that are in use.
delete #AvialableNumbers where n in (select Id from YourTable)
--Select the numbers out to the result set.
select n from #AvialableNumbers order by n
drop table #AvialableNumbers
";
List<int> availableIDs = context.Database.SqlQuery<int>(query).ToList();
/*
Use the list of IDs here
*/
context.SaveChanges();
transaction.Commit();
}

Need an approach to get the record id from a table with hierarchy data

I need some help in getting a record id from a table. Basically I'm working on project where the folders and files of a particular path are stored in database.
It includes a desktop and windows service applications and I'm using a SDF file for database and handling data operations using ADO.NET using C#
This is my folders table
As you can see, its a hierarchy table and FolderId is an identity column.
Now suppose my data is as
And I have a path as "E:\Books\WCF\Examples.pdf". Now how can I get the FolderId of "Examples.pdf" file from above table.
I came up with following approaches
Approach 1:
To get all the records from Folders table which match with the folder name as "WCF" along with their complete hierarchy by writing a recursive method. So I will get following data
Now in my code, I will be comparing this hierarchy column with above folder path of the pdf to get the FolderId.
Approach 2:
Taking each and every folder from the pdf path, I will generate dynamic query which looks something like this
select FolderId from Folders where Name='WCF' and
ParentFolderId in (select FolderId from Folders where Name='Books' and
ParentFolderId in (select FolderId from Folders where Name='E:'))
Based on my two approaches, which one should I prefer. Performance is a crucial factor and the Folders table may have more than a million records. Feel free to suggest any better approach.

Test Data
CREATE TABLE Table_Name50
(ID INT, Name nvarchar(100), ParentID INT)
GO
INSERT INTO Table_Name50(ID, Name,ParentID)
VALUES
( 1 ,N'E:\',NULL),
( 2, N'Books', 1),
( 3, N'History', 2),
( 4, N'Biology', 2),
( 5, N'Vidoes', 1)
GO
Query
to pull the full path for a given ID
DECLARE #ID INT = 3
;with CompleteData
as
(
Select ID, ParentId from Table_Name50
UNION
Select Child.ParentID Id, Parent.ParentID ParentId From Table_Name50 Child
Left Outer Join Table_Name50 Parent
on Child.ParentID = parent.ID
WHERE
parent.ParentID IS NULL
),
ChildHierarchyData(ID,ParentID, Level)
as
(
Select ID,ParentID, 0 as Level from CompleteData Where ID = #ID --<-- Your Parameter
union all
Select CompleteData.ID, CompleteData.ParentID, ChildHierarchyData.Level +1 from CompleteData
INNER Join ChildHierarchyData
on ChildHierarchyData.ParentID = CompleteData.ID
),
Concatinated(result)
as
(
Select Cast((select Cast(Name as nvarchar) + '\' [data()]
from ChildHierarchyData CD INNER JOIN Table_Name50 tbl
ON CD.ID = tbl.ID
Order By Level Desc
FOR XML Path('')) as Nvarchar(max))
)
select Left(result, len(result)-1) as Result from Concatinated
Result
E:\\ Books\ History

EF6 fails to import stored procedure

This is a simplified version of a stored procedure
ALTER PROCEDURE [dbo].[StoredProc1]
(
#PageIndex INT = 1,
#RecordCount INT = 20,
#Gender NVARCHAR(10) = NULL
)
AS
BEGIN
SET NOCOUNT ON ;
WITH tmp1 AS
(
SELECT u.UserId, MIN(cl.ResultField) AS BestResult
FROM [Users] u
INNER JOIN Table1 tbl1 ON tbl1.UserId = u.UserId
WHERE (#Gender IS NULL OR u.Gender = #Gender)
GROUP BY u.UserID
ORDER BY BestResult
OFFSET #PageIndex * #RecordCount ROWS
FETCH NEXT #RecordCount ROWS ONLY
)
SELECT t.UserId, t.BestResult, AVG(cl.ResultField) AS Average
INTO #TmpAverage
FROM tmp1 t
INNER JOIN Table1 tbl1 ON tbl1.UserId = t.UserId
GROUP BY t.UserID, t.BestResult
ORDER BY Average
SELECT u.UserId, u.Name, u.Gender, t.BestResult, t.Average
FROM #tmpAverage t
INNER JOIN Users u on u.UserId = t.UserId
DROP TABLE #TmpAverage
END
When I use EF6 to load the stored procedure, and then go to the "Edit Function Import" dialog, no columns are displayed there. Even after I ask to Retrieve the Columns, I get the message that the SP does not return columns. When I execute the SP from SMMS, I get the expected [UserId, Name, Gender, BestResult, Average] list of records.
Any idea how can I tweak the stored procedure or EF6 to make it work?
Thanks in advance

Thanks to the comments above, the answer is that unfortunately EF6 does not cope well with TMP tables on stored procedures.
One way around is the following:
1) Comment out all temp table calls inside the Stored Procedure.
2) Change the Stored Procedure to return a Fake a result with the same exact column's names that match the expected result
3) Import the Stored Procedure into EF6
4) Double Click on the Function Imports/Stored procedure name
5) On the Edit Function Import dialog, retrieve the Columns and the create a New Complex Type that will match the fake columns
6) CTRL+Save in order to generate all the C# code
7) Re-Update the Stored Procedure by removing the fake result set and un-comment the code with the Temp tables.
That should do the job.
P.S. Special thanks for the helpers that pointed me to the right place !!!

Sometimes it works to add the following statement to the stored proc:
set fmtonly off
But still, dont leave this statement in - only use it while generating the result set

special select query

I have 3 tables in my sql database like these :
Documents : (DocID, FileName) //list of all docs that were attached to items
Items : (ItemID, ...) //list of all items
DocumentRelation : (DocID, ItemID) //the relation between docs and items
In my winform application I have showed all records of Items table in a grid view and let user to select several rows of it and then if he press EditAll button another grid view should fill by file name of documents that are related to these selected items but not all of them,
Just each of documents which have relation with ALL selected items
Is there any query (sql or linq) to select these documents?

Try something like:
string query;
foreach (Item in SelectedItems)
{
query += "select DocID from DocumentRelation where ItemID =" + Item.Id;
query += "INTERSECT";
}
query -= "INTERSECT";
And exec the Query;

Take one string and keep on adding itemid comma separated in that,like 1,2,3 and then write query like
declare ItemID varchar(50);
set ItemID='1,2,3';
select FileName
from documents
Left Join DocumentRelation on Documents.DocId = DocumentRelation.DocId
where
DocumentRelation.ItemID in (select * from > dbo.SplitString(ItemID))
and then make one function in database like below
ALTER FUNCTION [dbo].[SplitString] (#OrderList varchar(1000))
RETURNS #ParsedList table (OrderID varchar(1000) )
AS BEGIN
IF #OrderList = ''
BEGIN
set #OrderList='Null'
end
DECLARE #OrderID varchar(1000), #Pos int
SET #OrderList = LTRIM(RTRIM(#OrderList))+ ','
SET #Pos = CHARINDEX(',', #OrderList, 1)
IF REPLACE(#OrderList, ',', '') <''
BEGIN
WHILE #Pos 0
BEGIN
SET #OrderID = LTRIM(RTRIM(LEFT(#OrderList, #Pos - 1)))
IF #OrderID < ''
BEGIN
INSERT INTO #ParsedList (OrderID)
VALUES (CAST(#OrderID AS varchar(1000)))
--Use Appropriate conversion
END
SET #OrderList = RIGHT(#OrderList, LEN(#OrderList) - #Pos)
SET #Pos = CHARINDEX(',', #OrderList, 1)
END
END
RETURN
END

Linq
var td =
from s in Items
join r in DocumentRelation on s.ItemID equals r.ItemID
join k in Documents on k.DocID equals r.DocID
where Coll.Contains (s.ItemID) //Here Coll is the collection of ItemID which you can store when the users click on the grid view row
select new
{
FileName=k.FileName,
DocumentID= k.DocId
};
You can loop through td collection and bind to your grid view
SQL
create a stored proc to get the relevant documents for the itemID selected from the grid view and paramterize your in clause
select k.FileName,k.DocId from Items as s inner join
DocumentRelation as r on
s.ItemID=r.ItemID and r.ItemId in (pass the above coll containing selected ItemIds as an input the SP)
inner join Documents as k
on k.DocId=r.DocIk
You can get the information on how to parametrize your sql query

Here's one approach. I'll let you figure out how you want to supply the list of items as arguments. And I also assume that (DocID, ItemID) is a primary key in the relations table. The having condition is what enforces your requirement that all select items are related to the list of documents you're seeking.
;with ItemsSelected as (
select i.ItemID
from Items as i
where i.ItemID in (<list of selected ItemIDs>)
)
select dr.DocID
from DocumentRelation as dr
where dr.ItemID in (select ItemID from ItemsSelected)
group by dr.DocID
having count(dr.ItemID) = (select count(*) from ItemsSelected);
EDIT
As far as I can tell, the accepted answer is equivalent to the solution here despite OP's comment below.
I did some quick tests with a very long series of intersect queries and confirmed that you can indeed expect that approach to become gradually slower with an increasing number of selected items. But a much worse problem was the time taken just to compile the queries. I tried this on a very fast server and found that that step took about eight seconds when roughly one hundred intersects were concatenated.
SQL Fiddle didn't let me do anywhere near as many before producing this error (and taking more than ten seconds in the process): The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.
There are several possible methods of passing a list of arguments to SQL Server. Assuming that you prefer the dynamic query solution I'd argue that this version is still better while also noting that there is a SQL Server limit on the number of values inside the in.
There are plenty of ways to have this stuff blow up.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Preventing duplication from SQL Query, ASP.net - c#

Try using Stuff and For Xml Here's the Fiddle: http://sqlfiddle.com/#!6/fcf71/5 See if it helps, it's just a sample so you will have to change the column names.

Related

What is a fastest way to get words from T-SQL datatable?

Getting the first unused Id from a table?

Need an approach to get the record id from a table with hierarchy data

EF6 fails to import stored procedure

special select query

Categories

Resources