I have 3 tables in my SQL database like these:
Documents : (DocID, FileName) //list of all docs that were attached to items
Items : (ItemID, ...) //list of all items
DocumentRelation : (DocID, ItemID) //the relation between docs and items
In my WinForms application I show all records of the Items table in a grid view and let the user select several rows. When the user presses the EditAll button, another grid view should be filled with the file names of the documents related to the selected items, but not every such document:
just the documents that have a relation with ALL selected items.
Is there any query (SQL or LINQ) to select these documents?
Try something like:
string query = "";
foreach (var item in SelectedItems)
{
    query += "select DocID from DocumentRelation where ItemID = " + item.Id;
    query += " INTERSECT ";
}
// trim the trailing INTERSECT
query = query.Substring(0, query.LastIndexOf(" INTERSECT "));
And execute the query.
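A parameterized variant of the same idea avoids splicing the IDs into the SQL text. A minimal sketch, assuming an open SqlConnection named conn and the selected IDs collected into a List<int> named selectedIds:
using System.Collections.Generic;
using System.Data.SqlClient;

// Build one parameterized SELECT per selected item and INTERSECT them,
// so only DocIDs present for every ItemID survive.
var parts = new List<string>();
using (var cmd = new SqlCommand { Connection = conn })
{
    for (int i = 0; i < selectedIds.Count; i++)
    {
        parts.Add("SELECT DocID FROM DocumentRelation WHERE ItemID = @p" + i);
        cmd.Parameters.AddWithValue("@p" + i, selectedIds[i]);
    }
    cmd.CommandText = string.Join(" INTERSECT ", parts);

    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            int docId = reader.GetInt32(0); // a DocID related to ALL selected items
        }
    }
}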
Take one string and keep appending the item IDs to it comma-separated, like 1,2,3, and then write a query like:
declare @ItemID varchar(50);
set @ItemID = '1,2,3';

select FileName
from Documents
left join DocumentRelation on Documents.DocID = DocumentRelation.DocID
where
    DocumentRelation.ItemID in (select * from dbo.SplitString(@ItemID))
and then create a function in the database like the one below:
ALTER FUNCTION [dbo].[SplitString] (@OrderList varchar(1000))
RETURNS @ParsedList table (OrderID varchar(1000))
AS
BEGIN
    IF @OrderList = ''
    BEGIN
        SET @OrderList = 'Null'
    END

    DECLARE @OrderID varchar(1000), @Pos int

    SET @OrderList = LTRIM(RTRIM(@OrderList)) + ','
    SET @Pos = CHARINDEX(',', @OrderList, 1)

    IF REPLACE(@OrderList, ',', '') <> ''
    BEGIN
        WHILE @Pos > 0
        BEGIN
            SET @OrderID = LTRIM(RTRIM(LEFT(@OrderList, @Pos - 1)))
            IF @OrderID <> ''
            BEGIN
                INSERT INTO @ParsedList (OrderID)
                VALUES (CAST(@OrderID AS varchar(1000)))
                --Use appropriate conversion
            END
            SET @OrderList = RIGHT(@OrderList, LEN(@OrderList) - @Pos)
            SET @Pos = CHARINDEX(',', @OrderList, 1)
        END
    END
    RETURN
END
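As a side note, on SQL Server 2016 and later the built-in STRING_SPLIT function can replace this helper entirely. A sketch of the same query from C# (assumes an open SqlConnection named conn; STRING_SPLIT returns a single column named value):
using System.Data.SqlClient;

// STRING_SPLIT is built in from SQL Server 2016 onward.
string sql = @"
    SELECT d.FileName
    FROM Documents d
    INNER JOIN DocumentRelation dr ON d.DocID = dr.DocID
    WHERE dr.ItemID IN (SELECT value FROM STRING_SPLIT(@ItemIDs, ','));";

using (var cmd = new SqlCommand(sql, conn))
{
    cmd.Parameters.AddWithValue("@ItemIDs", "1,2,3"); // the comma-separated list
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            string fileName = reader.GetString(0);
        }
    }
}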
Linq
var td =
    from s in Items
    join r in DocumentRelation on s.ItemID equals r.ItemID
    join k in Documents on r.DocID equals k.DocID
    where Coll.Contains(s.ItemID) // Coll is the collection of ItemIDs you store when the user clicks grid view rows
    select new
    {
        FileName = k.FileName,
        DocumentID = k.DocID
    };
You can loop through the td collection and bind it to your grid view.
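Note that this join returns documents related to any of the selected items. If you need only the documents related to ALL selected items, as the question asks, one sketch is to group the relations by document and compare counts (assumes Coll is a List<int>):
// Documents whose set of related items covers every selected ItemID.
var docIds = (from r in DocumentRelation
              where Coll.Contains(r.ItemID)
              group r by r.DocID into g
              where g.Select(r => r.ItemID).Distinct().Count() == Coll.Count
              select g.Key).ToList();

var td = from k in Documents
         where docIds.Contains(k.DocID)
         select new { FileName = k.FileName, DocumentID = k.DocID };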
SQL
Create a stored proc to get the relevant documents for the ItemIDs selected from the grid view, and parameterize your IN clause:
select k.FileName, k.DocID
from Items as s
inner join DocumentRelation as r
    on s.ItemID = r.ItemID
    and r.ItemID in (pass the collection of selected ItemIDs as an input to the SP)
inner join Documents as k
    on k.DocID = r.DocID
You can find information elsewhere on how to parameterize your SQL query.
Here's one approach. I'll let you figure out how you want to supply the list of items as arguments. I also assume that (DocID, ItemID) is a primary key in the relations table. The HAVING condition is what enforces your requirement that every selected item is related to each document returned.
;with ItemsSelected as (
select i.ItemID
from Items as i
where i.ItemID in (<list of selected ItemIDs>)
)
select dr.DocID
from DocumentRelation as dr
where dr.ItemID in (select ItemID from ItemsSelected)
group by dr.DocID
having count(dr.ItemID) = (select count(*) from ItemsSelected);
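One way to supply the selected ItemIDs from C# is to generate one parameter per value. A sketch, assuming an open SqlConnection named conn and a List<int> named selectedIds:
using System.Data.SqlClient;
using System.Linq;

// One parameter per selected ItemID; only the placeholder names are spliced
// into the SQL, the values themselves travel as parameters.
var paramNames = selectedIds.Select((id, i) => "@p" + i).ToArray();
string sql = @"
    ;with ItemsSelected as (
        select i.ItemID
        from Items as i
        where i.ItemID in (" + string.Join(",", paramNames) + @")
    )
    select dr.DocID
    from DocumentRelation as dr
    where dr.ItemID in (select ItemID from ItemsSelected)
    group by dr.DocID
    having count(dr.ItemID) = (select count(*) from ItemsSelected);";

using (var cmd = new SqlCommand(sql, conn))
{
    for (int i = 0; i < selectedIds.Count; i++)
        cmd.Parameters.AddWithValue(paramNames[i], selectedIds[i]);
    // ExecuteReader and collect the DocIDs as usual.
}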
EDIT
As far as I can tell, the accepted answer is equivalent to the solution here, despite the OP's comment below.
I did some quick tests with a very long series of intersect queries and confirmed that you can indeed expect that approach to become gradually slower with an increasing number of selected items. But a much worse problem was the time taken just to compile the queries. I tried this on a very fast server and found that that step took about eight seconds when roughly one hundred intersects were concatenated.
SQL Fiddle didn't let me do anywhere near as many before producing this error (and taking more than ten seconds in the process): The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.
There are several possible methods of passing a list of arguments to SQL Server. Assuming that you prefer the dynamic query solution, I'd argue that this version is still better, while also noting that there is a SQL Server limit on the number of values inside an IN clause.
There are plenty of ways to have this stuff blow up.
Related
I have a situation where, on a dashboard for pending approvals, I am trying to show certain items as follows:
Item 1 [Count]
Item 2 [Count]
Item 3 [Count]
The [Count] shows the number of items pending approval. Clicking each of these items shows the records from an associated table.
The way these counts are derived is very complex, and I wish to avoid making duplicate queries for the count, for example query #1 as
SELECT COUNT(*)
FROM tableName
and then query #2 as
SELECT ColumnA, ColumnB, ColumnC
FROM tableName
Since these queries are being read into my C# application, until now I've been doing the following
var onlyCount = true;
var subQuery = onlyCount? "COUNT(*)": "ColumnA, ColumnB, ColumnC";
var query = $"SELECT {subQuery} FROM tableName";
But with an ever-growing list of columns to manage, this makes the code look ugly. With calculated data in the select list, and CASEs and IIFs in the statement, the above solution is no longer "maintainable". Is something like the query demonstrated below even possible with a SELECT?
DECLARE @CountOnly AS BIT = 1

SELECT
    CASE
        WHEN @CountOnly = 1
            THEN COUNT(*)
        ELSE ColumnA, ColumnB, ColumnC
    END
FROM
    tableName
Has anyone ever faced such a scenario? Or could you point me in a direction where this can be handled better?
Side note: the above query is passed to a SqlDataReader to fetch the data and show it to the end user.
You may want to use something like this:
DECLARE @CountOnly AS BIT = 1

IF (@CountOnly = 1)
BEGIN
    SELECT COUNT(*)
    FROM MyTable
END
ELSE
BEGIN
    SELECT ColumnA, ColumnB, ColumnC
    FROM MyTable
END
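On the C# side you can then branch on the same flag when reading, since each branch returns a different shape. A sketch, assuming the batch above is held in a string named sql with @CountOnly supplied as a parameter instead of the DECLARE, and an open SqlConnection named conn:
using System.Data.SqlClient;

bool countOnly = true;
using (var cmd = new SqlCommand(sql, conn))
{
    cmd.Parameters.AddWithValue("@CountOnly", countOnly);
    using (var reader = cmd.ExecuteReader())
    {
        if (countOnly)
        {
            reader.Read();
            int count = reader.GetInt32(0);   // the single COUNT(*) value
        }
        else
        {
            while (reader.Read())
            {
                var a = reader["ColumnA"];    // full column list
                var b = reader["ColumnB"];
                var c = reader["ColumnC"];
            }
        }
    }
}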
I have code where I build a SQL query from the table model like this:
string sql = string.Format("SELECT * FROM {0}...", tableName...);
and then:
IEnumerable<T> r = dbConn.Connection.Query<T>(sql...);
The thing is, if I want to get the total rowsCount I have to run another query without the WHERE clause (of course I can get a count on "r", but when there is a WHERE clause that gives the filtered count, and I want the total count).
So I want to remove the second query. I did this in the SQL query to get rowsCount:
string sql = string.Format("SELECT *, count(*) over() rowsCount FROM {0}...", tableName...);
I can get the rowsCount with this query, but since none of my models has a rowsCount property I can't access it. Are there any suggestions on how I should do this?
Edit:
The first query has a paging filter using OFFSET and LIMIT, so I want the total count, not the count of the filtered query.
I'm looking to see if there is a way to avoid two separate queries and get the results as well as the rowsCount with just one query.
You will have to do 2 SQL queries.
There is nothing stopping you from running them in one SQL block or calling a stored proc with output parameters, so you don't have to make two calls, but you will need at least two queries.
https://www.sqlservertutorial.net/sql-server-stored-procedures/stored-procedure-output-parameters/
If you are worried about the performance of the total count, just make sure you have an index on the smallest column in that table and it should be mega fast.
The example below returns a dataset and an output parameter in one call:
DROP PROCEDURE IF EXISTS dbo.DataSetAndOutput
GO

CREATE PROCEDURE dbo.DataSetAndOutput
    @YourId BIGINT,
    @CountRecords INT OUTPUT
AS
BEGIN
    SELECT * FROM YourTable
    WHERE Id = @YourId

    SET @CountRecords = (SELECT COUNT(YourId) FROM YourTable)
END
GO

-- Test the output
DECLARE @ResultCount INT
EXEC dbo.DataSetAndOutput @YourId = 252452, @CountRecords = @ResultCount OUTPUT
SELECT @ResultCount AS TheCount
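Calling this from C# might look like the sketch below (assumes an open SqlConnection named conn). Note that the output parameter is only populated after the data reader has been closed:
using System.Data;
using System.Data.SqlClient;

using (var cmd = new SqlCommand("dbo.DataSetAndOutput", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.AddWithValue("@YourId", 252452L);
    var countParam = cmd.Parameters.Add("@CountRecords", SqlDbType.Int);
    countParam.Direction = ParameterDirection.Output;

    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // map the rows of the result set here
        }
    }   // reader closed: the output parameter is populated from here on

    int totalCount = (int)countParam.Value;
}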
I have a method in my controller which returns a PagedList of Products to my category-page view (based on the current page number and page size the user has selected) from a SQL Server stored procedure, like below:
var products = _dbContext.EntityFromSql<Product>("ProductLoad",
pCategoryIds,
pManufacturerId,
pOrderBy,
pageIndex,
pageSize).ToList(); // returns products using the selected ordering
var totalRecords = pTotalRecords.Value != DBNull.Value ? Convert.ToInt32(pTotalRecords.Value) : 0;
var allPrd= new PagedList<Product>(products, pageIndex, pageSize, totalRecords);
An example of the parameters sent to the stored procedure:
("ProductLoad",
[1,10,13],
[653],
"Lowest Price",
2,
64) // returns the second page of 64 products with those category IDs and brand IDs, sorted from lowest to highest price
It's working fine, but what I am trying to do is always send products with 0 quantity to the end of the list.
For example:
if I had 10k products and 2k of them have 0 quantity, I need to show the 8k in-stock products first and the 2k unavailable products at the end of the list.
What I have tried so far is loading all products first (without page size and page index), then moving the zero-quantity products to the end of the list, and finally building the PagedList with a fixed page size:
("ProductLoad",
[1,10,13],
[653],
"Lowest Price",
0,
10000) // fixed page size means loading all products
var zeroQty= from p in products
where p.StockQuantity==0
select p;
var zeroQtyList= zeroQty.ToList();
products = products.Except(zeroQtyList).ToList();
products.AddRange(zeroQtyList);
var totalRecords = pTotalRecords.Value != DBNull.Value ? Convert.ToInt32(pTotalRecords.Value) : 0;
var allPrd= new PagedList<Product>(products, pageIndex, 64, totalRecords);
This does send all zero-quantity products to the end of the list.
But it always loads all products, which is not a good idea and certainly not an optimized approach; users sometimes get page-loading timeouts. Because the category page shows 64 products at each page index, every time a user opens a page on the website all products are loaded, which delays the page load.
Is there any way to solve this problem (have a PagedList which contains all products with quantity above zero first and the 0-quantity products second) without changing the stored procedure? (fixing the page-loading delays)
P.S.: The reason I avoid changing the stored procedure is that it already has too many joins, temp tables, unions and ORDER BYs.
Any help would be appreciated.
You will need to use the ROW_NUMBER function in your stored procedure.
This is an example of how I have done this before. Hopefully you will be able to adapt it to your SP.
--temp table to hold the message IDs we are looking for
CREATE TABLE #ids (
MessageId UNIQUEIDENTIFIER
,RowNum INT PRIMARY KEY
);
--insert all message IDs that match the search criteria, with row number
INSERT INTO #ids
SELECT m.[MessageId]
,ROW_NUMBER() OVER (ORDER BY CreatedUTC DESC)
FROM [dbo].[Message] m WITH (NOLOCK)
WHERE ....
DECLARE @total INT;
SELECT @total = COUNT(1) FROM #ids;

--determine which records we want to select
--note: @skip and @take are parameters of the procedure
IF @take IS NULL
    SET @take = @total;

DECLARE @startRow INT, @endRow INT;
SET @startRow = @skip + 1;
SET @endRow = @skip + @take;

-- select the messages within the requested range
SELECT m.* FROM [dbo].[Message] m WITH (NOLOCK)
INNER JOIN #ids i ON m.MessageId = i.MessageId
WHERE i.RowNum BETWEEN @startRow AND @endRow;
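Adapted to the product case in the question, the key change is the ORDER BY inside ROW_NUMBER: sort on an in-stock flag first, then apply the existing ordering. A sketch only; dbo.Product, StockQuantity and Price are assumed names:
// Zero-quantity rows get flag 1, so they are numbered after all in-stock
// rows and land on the last pages without any other change to the paging.
string sql = @"
    INSERT INTO #ids
    SELECT p.ProductId,
           ROW_NUMBER() OVER (ORDER BY
               CASE WHEN p.StockQuantity = 0 THEN 1 ELSE 0 END,
               p.Price ASC)
    FROM dbo.Product p
    WHERE ...;";   // keep the original search criteria in place of the ...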
OrderByDescending could be useful to fix it, like below:
List<Product> SortedList = products.OrderByDescending(o => o.StockQuantity).ToList();
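Note that ordering by the quantity itself also reorders the in-stock products by their stock levels. If the goal is only to push zero-quantity rows to the end, sorting on a boolean key is a gentler sketch (OrderBy in LINQ to Objects is stable, so ties keep their relative order):
using System.Collections.Generic;
using System.Linq;

// false (in stock) sorts before true (zero quantity); the relative order
// of the in-stock products is preserved.
List<Product> sorted = products
    .OrderBy(p => p.StockQuantity == 0)
    .ToList();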
In our database we have a table that lacks an identity column. There is an Id column, but it is manually populated when a record is inputted. Any item with an Id over 90,000 is reserved and is populated globally across all customer databases.
I'm building a tool to handle bulk insertions into this table using Entity Framework. I need to figure out what the most efficient method of finding the first available Id is (under 90,000) on the fly without iterating over every single row. It is highly likely that in many of the databases, someone has simply selected a random number that wasn't taken and used it to insert the row.
What is my best recourse?
Edit
After seeing some of the solutions listed, I attempted to replicate the SQL logic in Linq. I doubt it's perfect, but it seems incredibly fast and efficient.
var availableIds = Enumerable.Range(1, 89999)
.Except(db.Table.Where(n => n.Id <= 89999)
.Select(n => n.Id))
.ToList();
Have you considered something like:
SELECT
min(RN) AS FirstAvailableID
FROM (
SELECT
row_number() OVER (ORDER BY Id) AS RN,
Id
FROM
YourTable
) x
WHERE
RN <> Id
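A sketch of consuming this from Entity Framework 6, reusing the question's db context. I also filter out the reserved range per the question; when the table has no gaps the query returns NULL, so fall back to max + 1:
using System.Linq;

// SqlQuery maps the single nullable column to int?.
int? firstGap = db.Database.SqlQuery<int?>(@"
    SELECT min(RN)
    FROM (
        SELECT row_number() OVER (ORDER BY Id) AS RN, Id
        FROM YourTable
        WHERE Id < 90000
    ) x
    WHERE RN <> Id;").Single();

// no gap found: use one past the current maximum (or 1 for an empty table)
int nextId = firstGap
    ?? (db.Table.Where(t => t.Id < 90000).Max(t => (int?)t.Id) ?? 0) + 1;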
To answer your implied question of how to get a list of available numbers to use: easy, make a list of all possible numbers, then delete the ones that are in use.
--This builds a list of numbers from 1 to 89999
SELECT TOP (89999) n = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
INTO #AvailableNumbers
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);
CREATE UNIQUE CLUSTERED INDEX n ON #AvailableNumbers(n)
--Start a serializable transaction so we can be sure no one takes a number
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
begin transaction
--Remove numbers that are in use.
delete #AvailableNumbers where n in (select Id from YourTable)
/*
Do your insert here using numbers from #AvailableNumbers
*/
commit transaction
Here is how you would do it via Entity Framework:
using(var context = new YourContext(connectionString))
using(var transaction = context.Database.BeginTransaction(IsolationLevel.Serializable))
{
var query = @"
SELECT TOP (89999) n = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
INTO #AvailableNumbers
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);
CREATE UNIQUE CLUSTERED INDEX n ON #AvailableNumbers(n)
--Remove numbers that are in use.
delete #AvailableNumbers where n in (select Id from YourTable)
--Select the numbers out to the result set.
select n from #AvailableNumbers order by n
drop table #AvailableNumbers
";
List<int> availableIDs = context.Database.SqlQuery<int>(query).ToList();
/*
Use the list of IDs here
*/
context.SaveChanges();
transaction.Commit();
}
Take a look at this pseudo-schema (please note this is a simplification, so please try not to comment too heavily on the "advisability" of the schema itself). Assume indexes are in place on the FKs.
TABLE Lookup (
Lookup_ID int not null PK
Name nvarchar(255) not null
)
TABLE Document (
Document_ID int not null PK
Previous_ID int null FK REFERENCES Document(Document_ID)
)
TABLE Document_Lookup (
Document_ID int not null FK REFERENCES Document(Document_ID)
Lookup_ID int not null FK REFERENCES Lookup(Lookup_ID)
)
Volumes: Document, 4 million rows, of which 90% have a null Previous_ID value; Lookup, 6,000 rows; an average of 20 lookups attached to each document, giving Document_Lookup 80 million rows.
Now, in a .NET service I have a structure to represent a Lookup row like this:
struct Lookup
{
public int ID;
public string Name;
public List<int> DocumentIDs;
}
and that the Lookup rows are stored in a Dictionary<int, Lookup> where the key is the lookup ID. An important point here is that this dictionary should only contain entries whose Lookup is referenced by at least one document, i.e. the DocumentIDs list should have Count > 0.
My task is to populate this dictionary efficiently. So the simple approach would be:
SELECT dl.Lookup_ID, l.Name, dl.Document_ID
FROM Document_Lookup dl
INNER JOIN Lookup l ON l.Lookup_ID = dl.Lookup_ID
INNER JOIN Document d ON d.Document_ID = dl.Document_ID
WHERE d.Previous_ID IS NULL
ORDER BY dl.Lookup_ID, dl.Document_ID
This could then be used to populate the dictionary fairly efficiently.
The question: does the underlying rowset delivery (TDS?) perform any optimisation? It seems to me that queries which de-normalise data are very common, so the probability that field values don't change from one row to the next is high; it would therefore make sense to optimise the stream by not re-sending field values that haven't changed. Does anyone know whether such an optimisation is in place? (The optimisation does not appear to exist.)
What more sophisticated query could I use to eliminate the duplication (I'm thinking specifically of the repeated Name value)? I've heard of such a thing as a "nested rowset"; can that sort of thing be generated? Would it be more performant? How would I access it in .NET?
I would perform two queries: one to populate the Lookup dictionary, then a second to populate the dictionary lists. I would then add code to knock out the unused Lookup entries. But imagine I got my predictions wrong and Lookup ended up being 1 million rows with only a quarter actually referenced by any document?
As long as the names are relatively short in practice, the optimisation may not be necessary.
The easiest optimisation is to split it into two queries, one to get the names, the other to get the Document_ID list. (can be in the other order if it makes it easier to populate your data structures).
Example:
/*First get the name of the Lookup*/
select distinct dl.Lookup_ID, l.Name
FROM Document_Lookup dl
INNER JOIN Lookup l ON l.Lookup_ID = dl.Lookup_ID
INNER JOIN Document d ON d.Document_ID = dl.Document_ID
WHERE d.Previous_ID IS NULL
ORDER BY dl.Lookup_ID
/*Now get the list of Document_IDs for each*/
SELECT dl.Lookup_ID, dl.Document_ID
FROM Document_Lookup dl
INNER JOIN Lookup l ON l.Lookup_ID = dl.Lookup_ID
INNER JOIN Document d ON d.Document_ID = dl.Document_ID
WHERE d.Previous_ID IS NULL
ORDER BY dl.Lookup_ID, dl.Document_ID
There are also various tricks you could use to massage these into a single table, but I suggest these are not worthwhile.
The hierarchical rowsets you are thinking of come from the MSDASHAPE OLEDB provider. They can do what you are suggesting, but would restrict you to using the OLEDB provider for SQL Server, which may not be what you want.
Finally, consider careful use of XML.
For example:
select
    l.lookup_ID as "@l",
    l.name as "@n",
    (
        select dl.Document_ID as "node()", ' ' as "node()"
        from Document_Lookup dl
        where dl.lookup_ID = l.lookup_ID
        for xml path(''), type
    ) as "*"
from Lookup l
where l.lookup_ID in (select dl.lookup_ID from Document_Lookup dl)
for xml path('dl')
returns:
<dl l="1" n="One">1 2 </dl>
<dl l="2" n="Two">2 </dl>
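On the .NET side, one way to consume that shape is ExecuteXmlReader plus XElement (a sketch; assumes a SqlCommand named cmd that holds the FOR XML query):
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

using (var xmlReader = cmd.ExecuteXmlReader())
{
    xmlReader.MoveToContent();
    while (!xmlReader.EOF)
    {
        if (xmlReader.NodeType == XmlNodeType.Element && xmlReader.Name == "dl")
        {
            // ReadFrom consumes the element and advances the reader past it
            var dl = (XElement)XNode.ReadFrom(xmlReader);
            int lookupId = (int)dl.Attribute("l");
            string name = (string)dl.Attribute("n");
            var documentIds = dl.Value
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(int.Parse)
                .ToList();
        }
        else
        {
            xmlReader.Read();
        }
    }
}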
When you're asking about "nested rowsets", are you referring to using the DbDataReader.NextResult() method?
If your query has two "outputs" (two SELECT statements that return separate result sets), you can loop through the first using DbDataReader.Read(), and when that returns false you can call DbDataReader.NextResult() and then use DbDataReader.Read() again to continue.
var reader = cmd.ExecuteReader();
while (reader.Read())
{
    // load data from the first result set
}
if (reader.NextResult())
{
    while (reader.Read())
    {
        // look up the matching record from the first result set
        // and load data from the second result set
    }
}
I've done this frequently to reduce duplicate data in a similar situation and it works really well:
SELECT * FROM tableA WHERE [condition]
SELECT * FROM tableB WHERE EXISTS (SELECT * FROM tableA WHERE [condition] AND tableB.FK = tableA.PK)
Disclaimer: I haven't tried this with a resultset as large as you're describing.
The downside of this is that you'll need a way to map the second result set back to the first, using a hashtable or ordered list.
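For the Lookup case in the question, that mapping could look like the sketch below (assumes the first SELECT returns Lookup_ID and Name, and the second returns Lookup_ID, Document_ID pairs, as in the two-query answer above):
using System.Collections.Generic;
using System.Data.SqlClient;

var lookups = new Dictionary<int, Lookup>();
using (var reader = cmd.ExecuteReader())
{
    // first result set: one row per lookup
    while (reader.Read())
    {
        var l = new Lookup
        {
            ID = reader.GetInt32(0),
            Name = reader.GetString(1),
            DocumentIDs = new List<int>()
        };
        lookups[l.ID] = l;
    }

    // second result set: one row per (lookup, document) pair
    if (reader.NextResult())
    {
        while (reader.Read())
            lookups[reader.GetInt32(0)].DocumentIDs.Add(reader.GetInt32(1));
    }
}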