In our database we have a table that lacks an identity column. There is an Id column, but it is manually populated when a record is inputted. Any item with an Id over 90,000 is reserved and is populated globally across all customer databases.
I'm building a tool to handle bulk insertions into this table using Entity Framework. I need to figure out what the most efficient method of finding the first available Id is (under 90,000) on the fly without iterating over every single row. It is highly likely that in many of the databases, someone has simply selected a random number that wasn't taken and used it to insert the row.
What is my best recourse?
Edit
After seeing some of the solutions listed, I attempted to replicate the SQL logic in Linq. I doubt it's perfect, but it seems incredibly fast and efficient.
var availableIds = Enumerable.Range(1, 89999)
.Except(db.Table.Where(n => n.Id <= 89999)
.Select(n => n.TagAssociationTypeID))
.ToList();
Have you considered something like:
SELECT
min(RN) AS FirstAvailableID
FROM (
SELECT
row_number() OVER (ORDER BY Id) AS RN,
Id
FROM
YourTable
) x
WHERE
RN <> Id
To answer your implied question of how do you get a list of available numbers to use: Easy, make a list of all possible numbers then delete the ones that are in use.
--This builds a list of numbers from 1 to 89999
SELECT TOP (89999) n = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
INTO #AvialableNumbers
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);
CREATE UNIQUE CLUSTERED INDEX n ON #AvialableNumbers(n)
--Start a seralizeable transaction so we can be sure no one uses a number
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
begin transaction
--Remove numbers that are in use.
delete #AvialableNumbers where n in (select Id from YourTable)
/*
Do your insert here using numbers from #AvialableNumbers
*/
commit transaction
Here is how you would do it via Entity framework
using(var context = new YourContext(connectionString))
using(var transaction = context.Database.BeginTransaction(IsolationLevel.Serializable))
{
var query = #"
SELECT TOP (89999) n = CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id]))
INTO #AvialableNumbers
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);
CREATE UNIQUE CLUSTERED INDEX n ON #AvialableNumbers(n)
--Remove numbers that are in use.
delete #AvialableNumbers where n in (select Id from YourTable)
--Select the numbers out to the result set.
select n from #AvialableNumbers order by n
drop table #AvialableNumbers
";
List<int> availableIDs = context.Database.SqlQuery<int>(query).ToList();
/*
Use the list of IDs here
*/
context.SaveChanges();
transaction.Commit();
}
Related
Given this query, if I want to pull the rank of a specific individual where I know there $name and $score and return the rows above/below that rank (say +/- 4), how would I go about doing that?
$query = "SELECT #curRank := #curRank + 1 AS Rank,
uniqueID,
name,
score
FROM scores, (SELECT #curRank := 0) r
ORDER by score DESC";
I'm coding in php, using MySQL and C# in Unity. My game is making a call to the server and running the php code. Goal is to echo the information and parse the information back in the game.
Any help would be much appreciated :)
Based off of your :=, I'm assuming you are using PostgreSQL, correct? I'm more familiar with the T-SQL syntax; but regardless, both PostgreSQL and T-SQL have windowing functions. You could implement something similar to the following (I left out variables for you to fill-in):
$query = "WITH scoreOrder
AS
(
SELECT uniqueID,
name,
score,
ROW_NUMBER() OVER (ORDER BY score DESC, uniqueID DESC) AS RowNum
FROM scores
ORDER BY uniqueID DESC
)
SELECT ns.*
FROM scoreOrder ms --Your matching score
INNER JOIN scoreOrder ns --Your nearby scores
ON ms.name = /* your name variable */
AND ms.score = /* your score variable */
AND ns.RowNum BETWEEN ms.RowNum - /* your offset */ and ms.RowNum + /* your offset */";
Explanation: First, we're creating a common table expression called scoreOrder, and projecting a RowNum column for your scores. In short, ROW_NUMBER() OVER (ORDER BY score DESC, uniqueID DESC) is just saying, "I am returning the row number of this record ordered by score and uniqueID, both descending and in that order." Then, you join that CTE with itself... ms will be your score that you match with, and you join that with ns where the ns.RowNum will be between your ms.RowNum, plus or minus your offset.
There are a ton of other windowing functions. Here are some others that could be more or less appropriate for your scenario:
ROW_NUMBER() - the rownumber of the record
RANK() - the rank of the record, duplicating in ties and includes
gaps (i.e., if 2nd place ties, you would have 1st, 2nd, 2nd,
4th)
DENSE_RANK() - same as rank, except that it fills in the gaps
(i.e., if 2nd place ties, you would have 1st, 2nd, 2nd, 3rd)
For more info, check the PostgreSQL documentation on windowing functions and their tutorial
Update:
Unfornately, MySQL does not support windowing functions or common table expressions. In your scenario, you will have to put the results of your previous query into a temp table, then doing a similar join as demonstrated above. So for example...
CREATE TEMPORARY TABLE IF NOT EXISTS allRankings AS
(
SELECT #curRank := #curRank + 1 AS Rank,
uniqueID,
name,
score
FROM scores, (SELECT #curRank := 0) r
ORDER by score DESC, uniqueID
);
SELECT r.*
FROM allRankings r
INNER JOIN allRankings myRank
ON r.Rank BETWEEN myRank.Rank - <your offset> AND myRank.Rank + <your offset>
AND myRank.name = <your name>
AND myRank.score = <your score>
ORDER by r.Rank;
Here is a SQLFiddle link for an example. (I'm not using a temp table on SQLFiddle because you have to build tables in the Build Schema window).
I have a stored procedure that inserts a line in a table. This table has an auto incremented int primary key and a datetime2 column named CreationDate. I am calling it in a for loop via my C# code, and the loop is inside a transaction scope.
I run the program twice, first time with a for loop that turned 6 times and second time with a for loop that turned 2 times. When I executed this select on sql server I got a strange result
SELECT TOP 8
RequestId, CreationDate
FROM
PickupRequest
ORDER BY
CreationDate DESC
What I didn't get is the order of insertion: for example the line with Id=58001 has to be inserted after that with Id=58002 but this is not the case. Is that because I put my loop in a transaction scoope? or the precision in the datetime2 is not enough?
It is a question of speed and statement scope as well...
Try this:
--This will create a #numbers table with 1 mio numbers:
DECLARE #numbers TABLE(Nbr BIGINT);
WITH N(N) AS
(SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1)
,MoreN(N) AS
(SELECT 1 FROM N AS N1 CROSS JOIN N AS N2 CROSS JOIN N AS N3 CROSS JOIN N AS N4 CROSS JOIN N AS N5 CROSS JOIN N AS N6)
INSERT INTO #numbers(Nbr)
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM MoreN;
--This is a dummy table for inserts:
CREATE TABLE Dummy(ID INT IDENTITY,CreationDate DATETIME);
--Play around with the value for #Count. You can insert 1 mio rows in one go. Although this runs a while, all will have the same datetime value:
--Use a small number here and below, still the same time value
--Use a big count here and a small below will show a slightly later value for the second insert
DECLARE #Count INT = 1000;
INSERT INTO Dummy (CreationDate)
SELECT GETDATE()
FROM (SELECT TOP(#Count) 1 FROM #numbers) AS X(Y);
--A second insert
SET #Count = 10;
INSERT INTO Dummy (CreationDate)
SELECT GETDATE()
FROM (SELECT TOP(#Count) 1 FROM #numbers) AS X(Y);
SELECT * FROM Dummy;
--Clean up
GO
DROP TABLE Dummy;
You did your insertions pretty fast so the actual CreationDate values inserted in one program run had the same values. In case you're using datetime type, all the insertions may well occur in one millisecond. So ORDER BY CreationDate DESC by itself does not guarantee the select order to be that of insertion.
To get the desired order you need to sort by the RequestId as well:
SELECT TOP 8 RequestId, CreationDate
FROM PickupRequest
ORDER BY CreationDate DESC, RequestId DESC
I have an SQL query which I want to call from LINQ to SQL in asp.net application.
SELECT TOP 5 *
FROM (SELECT SongId,
DateInserted,
ROW_NUMBER()
OVER(
PARTITION BY SongId
ORDER BY DateInserted DESC) rn
FROM DownloadHistory) t
WHERE t.rn = 1
ORDER BY DateInserted DESC
I don't know whether its possible or not through linq to sql, if not then please provide any other way around.
I think you'd have to change the SQL partition to a Linq group-by. (Effectively all the partition does is group by song, and select the newest row for each group.) So something like this:
IEnumerable<DownloadHistory> top5Results = DownloadHistory
// group by SongId
.GroupBy(row => row.SongId)
// for each group, select the newest row
.Select(grp =>
grp.OrderByDescending(historyItem => historyItem.DateInserted)
.FirstOrDefault()
)
// get the newest 5 from the results of the newest-1-per-song partition
.OrderByDescending(historyItem => historyItem.DateInserted)
.Take(5);
Although McGarnagle answer solves the problem, but when i see the execution plan for the two queries, it was really amazing to see that linq to sql was really too slow as compare to native sql queries. See the generated query for the above linq to sql:
--It took 99% of the two execution
SELECT TOP (5) [t3].[SongId], [t3].[DateInserted]
FROM (
SELECT [t0].[SongId]
FROM [dbo].[DownloadHistory] AS [t0]
GROUP BY [t0].[SongId]
) AS [t1]
OUTER APPLY (
SELECT TOP (1) [t2].[SongId], [t2].[DateInserted]
FROM [dbo].[DownloadHistory] AS [t2]
WHERE [t1].[SongId] = [t2].[SongId]
ORDER BY [t2].[DateInserted] DESC
) AS [t3]
ORDER BY [t3].[DateInserted] DESC
--It took 1% of the two execution
SELECT TOP 5 t.SongId,t.DateInserted
FROM (SELECT SongId,
DateInserted,
ROW_NUMBER()
OVER(
PARTITION BY SongId
ORDER BY DateInserted DESC) rn
FROM DownloadHistory) t
WHERE t.rn = 1
ORDER BY DateInserted DESC
I have 3 tables in my sql database like these :
Documents : (DocID, FileName) //list of all docs that were attached to items
Items : (ItemID, ...) //list of all items
DocumentRelation : (DocID, ItemID) //the relation between docs and items
In my winform application I have showed all records of Items table in a grid view and let user to select several rows of it and then if he press EditAll button another grid view should fill by file name of documents that are related to these selected items but not all of them,
Just each of documents which have relation with ALL selected items
Is there any query (sql or linq) to select these documents?
Try something like:
string query;
foreach (Item in SelectedItems)
{
query += "select DocID from DocumentRelation where ItemID =" + Item.Id;
query += "INTERSECT";
}
query -= "INTERSECT";
And exec the Query;
Take one string and keep on adding itemid comma separated in that,like 1,2,3 and then write query like
declare ItemID varchar(50);
set ItemID='1,2,3';
select FileName
from documents
Left Join DocumentRelation on Documents.DocId = DocumentRelation.DocId
where
DocumentRelation.ItemID in (select * from > dbo.SplitString(ItemID))
and then make one function in database like below
ALTER FUNCTION [dbo].[SplitString] (#OrderList varchar(1000))
RETURNS #ParsedList table (OrderID varchar(1000) )
AS BEGIN
IF #OrderList = ''
BEGIN
set #OrderList='Null'
end
DECLARE #OrderID varchar(1000), #Pos int
SET #OrderList = LTRIM(RTRIM(#OrderList))+ ','
SET #Pos = CHARINDEX(',', #OrderList, 1)
IF REPLACE(#OrderList, ',', '') <''
BEGIN
WHILE #Pos 0
BEGIN
SET #OrderID = LTRIM(RTRIM(LEFT(#OrderList, #Pos - 1)))
IF #OrderID < ''
BEGIN
INSERT INTO #ParsedList (OrderID)
VALUES (CAST(#OrderID AS varchar(1000)))
--Use Appropriate conversion
END
SET #OrderList = RIGHT(#OrderList, LEN(#OrderList) - #Pos)
SET #Pos = CHARINDEX(',', #OrderList, 1)
END
END
RETURN
END
Linq
var td =
from s in Items
join r in DocumentRelation on s.ItemID equals r.ItemID
join k in Documents on k.DocID equals r.DocID
where Coll.Contains (s.ItemID) //Here Coll is the collection of ItemID which you can store when the users click on the grid view row
select new
{
FileName=k.FileName,
DocumentID= k.DocId
};
You can loop through td collection and bind to your grid view
SQL
create a stored proc to get the relevant documents for the itemID selected from the grid view and paramterize your in clause
select k.FileName,k.DocId from Items as s inner join
DocumentRelation as r on
s.ItemID=r.ItemID and r.ItemId in (pass the above coll containing selected ItemIds as an input the SP)
inner join Documents as k
on k.DocId=r.DocIk
You can get the information on how to parametrize your sql query
Here's one approach. I'll let you figure out how you want to supply the list of items as arguments. And I also assume that (DocID, ItemID) is a primary key in the relations table. The having condition is what enforces your requirement that all select items are related to the list of documents you're seeking.
;with ItemsSelected as (
select i.ItemID
from Items as i
where i.ItemID in (<list of selected ItemIDs>)
)
select dr.DocID
from DocumentRelation as dr
where dr.ItemID in (select ItemID from ItemsSelected)
group by dr.DocID
having count(dr.ItemID) = (select count(*) from ItemsSelected);
EDIT
As far as I can tell, the accepted answer is equivalent to the solution here despite OP's comment below.
I did some quick tests with a very long series of intersect queries and confirmed that you can indeed expect that approach to become gradually slower with an increasing number of selected items. But a much worse problem was the time taken just to compile the queries. I tried this on a very fast server and found that that step took about eight seconds when roughly one hundred intersects were concatenated.
SQL Fiddle didn't let me do anywhere near as many before producing this error (and taking more than ten seconds in the process): The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.
There are several possible methods of passing a list of arguments to SQL Server. Assuming that you prefer the dynamic query solution I'd argue that this version is still better while also noting that there is a SQL Server limit on the number of values inside the in.
There are plenty of ways to have this stuff blow up.
i am trying to show the last order for the a specific customer on a grid view , what i did is showing all orders for the customer but i need the last order
here is my SQL code
SELECT orders.order_id, orders.order_date,
orders.payment_type, orders.cardnumber, packages.Package_name,
orders.package_id, packages.package_price
FROM orders INNER JOIN packages ON orders.package_id = packages.Package_ID
WHERE (orders.username = #username )
#username get its value from a cookie , now how can i choose the last order only for a cookie value " Tony " for example ?
To generalize (and fix a little bit) Mitch's answer, you need to use SELECT clause embellished with TOP(#N) and ORDER BY ... DESC. Note that I use TOP(#N), not TOP N, which means you can pass it as an argument to the stored procedure and return, say, not 1 but N last orders:
CREATE STORED PROCEDURE ...
#N int
...
SELECT TOP(#N) ...
ORDER BY ... DESC
SELECT top 1
orders.order_id,
orders.order_date,
orders.payment_type,
orders.cardnumber,
packages.Package_name,
orders.package_id,
packages.package_price
FROM orders
INNER JOIN packages ON orders.package_id = packages.Package_ID
WHERE (orders.username = #username )
ORDER BY orders.order_date DESC
In fact assuming orders.order_id is an Identity column:
SELECT top 1
orders.order_id,
orders.order_date,
orders.payment_type,
orders.cardnumber,
packages.Package_name,
orders.package_id,
packages.package_price
FROM orders
INNER JOIN packages ON orders.package_id = packages.Package_ID
WHERE (orders.username = #username )
ORDER BY orders.order_id DESC