I have a time series with null values. I want to replace each null value with the most recent non-null value. From what I've researched, Oracle SQL can easily accomplish this using LAST_VALUE with IGNORE NULLS. Is there a similar way to accomplish this in SQL Server 2016? Otherwise I'm just going to code it in C#, but I felt that using SQL would be faster, cleaner, and easier.
Sec SCORE
1 Null
2 Null
3 5
4 Null
5 8
6 7
7 Null
Should be replaced with:
Sec SCORE
1 Null
2 Null
3 5
4 5
5 8
6 7
7 7
You can do this with two cumulative operations:
select t.*,
coalesce(score, max(score) over (partition by maxid)) as newscore
from (select t.*,
max(case when score is not null then id end) over (order by id) as maxid
from t
) t;
The innermost subquery gets the most recent id where there is a value. The outermost one "spreads" that value to the subsequent rows.
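To make the two steps easier to follow, here is a minimal Python sketch of the same logic on the sample data from the question. The names `rows`, `maxids` and `filled` are made up for illustration, and `0` stands in for "no value seen yet" (the SQL version yields NULL there); it is not a replacement for the query, just a way to trace it.

```python
from itertools import accumulate

# Hypothetical sample rows mirroring the question: (id, score); None stands for NULL.
rows = [(1, None), (2, None), (3, 5), (4, None), (5, 8), (6, 7), (7, None)]

# Step 1 (inner query): running max of the ids that actually carry a score.
# 0 is an assumed stand-in for "no value seen yet".
maxids = list(accumulate(
    (rid if score is not None else 0 for rid, score in rows), max))

# Step 2 (outer query): every row looks up the score belonging to its maxid.
score_by_id = {rid: score for rid, score in rows if score is not None}
filled = [(rid, score_by_id.get(maxid)) for (rid, _), maxid in zip(rows, maxids)]

for rid, score in filled:
    print(rid, score)
```

Rows 1 and 2 stay empty because no earlier value exists, which matches the expected output in the question.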
If you actually want to update the table, you can incorporate this easily into an update. But, Oracle cannot do that (easily), so I'm guessing this is not necessary....
If performance is an issue, I suggest the solution from this article:
The Last non NULL Puzzle
The author's final solution, while dense, performs excellently, with a linear query plan and no joins. Here is an example implementation I've used, which carries the last known value of each attribute through a Type 2 SCD staging table. In this staging table, NULL represents no update, and '*** DELETED ***' represents an explicit set to NULL. The following cleans this up to resemble an actual SCD record:
WITH [SampleNumbered] AS (
SELECT *, ROW_NUMBER() OVER ( PARTITION BY [SampleId] ORDER BY [StartDatetime] ) AS [RowNumber]
FROM [dbo].[SampleDimStage]
), [SamplePrep] AS (
SELECT [SampleId]
, [StartDatetime]
, CAST([RowNumber] AS BINARY(8)) + CAST([SampleGroupId] AS VARBINARY(255)) AS [BinarySampleGroupId]
, CAST([RowNumber] AS BINARY(8)) + CAST([SampleStatusCode] AS VARBINARY(255)) AS [BinarySampleStatusCode]
FROM [SampleNumbered]
), [SampleCleanUp] AS (
SELECT [SampleId]
, [StartDatetime]
, CAST(SUBSTRING(MAX([BinarySampleGroupId]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] )
, 9, 255) AS VARCHAR(255)) AS [LastSampleGroupId]
, CAST(SUBSTRING(MAX([BinarySampleStatusCode]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] )
, 9, 255) AS VARCHAR(255)) AS [LastSampleStatusCode]
, LEAD([StartDatetime]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] ) AS [EndDatetime]
FROM [SamplePrep]
)
SELECT CAST([SampleId] AS NUMERIC(18)) AS [SampleId]
, CAST(NULLIF([sc].[LastSampleGroupId],'*** DELETED ***') AS NUMERIC(18)) AS [GroupId]
, CAST(NULLIF([sc].[LastSampleStatusCode],'*** DELETED ***') AS CHAR(3)) AS [SampleStatusCode]
, [StartDatetime]
, [sc].[EndDatetime]
FROM [SampleCleanUp] [sc];
If your sort key is some sort of integer, you can completely skip the first CTE and cast that directly to binary.
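The trick in that query is that a fixed-width binary row number is glued in front of each value, so a running MAX picks the latest non-NULL entry, and the prefix is then cut off again. A rough Python sketch of that idea, assuming made-up sample values and using an empty byte string where SQL would have NULL:

```python
import struct
from itertools import accumulate

# Hypothetical column values, already in sort order; None stands for NULL.
values = ["A", None, None, "B", None]

def tag(row_number, value):
    """Prefix the value with an 8-byte big-endian row number."""
    if value is None:
        # Sorts below every tagged value, similar to MAX ignoring NULLs.
        return b""
    return struct.pack(">q", row_number) + value.encode("utf-8")

tagged = [tag(i, v) for i, v in enumerate(values, start=1)]

# Running max: big-endian prefixes compare in row-number order.
running_max = list(accumulate(tagged, max))

# Drop the 8-byte prefix again (SUBSTRING(..., 9, 255) in the query).
carried = [m[8:].decode("utf-8") if m else None for m in running_max]
print(carried)
```

The prefix has to be fixed width and big-endian so that byte-wise comparison agrees with the numeric order of the row numbers.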
I have been working on an ASP.NET MVC project where we use an Informix DB and Entity Framework for our queries. The thing is that, depending on which DB the application is connected to, some LINQ queries are translated to different SQL queries.
That is, when I connect to DB 1, the query works and is translated more or less like this:
Opened connection asynchronously at 26/9/2019 12:48:27 +03:00
SELECT SKIP 0 FIRST 25
...
FROM ( SELECT ...
FROM ( SELECT
...
FROM LATERAL (SELECT
... ) AS Project1
LEFT OUTER JOIN LATERAL (SELECT FIRST 1 Project2.C1 AS C1
FROM LATERAL ( SELECT
...
) AS Project2
ORDER BY ... ASC ) AS Limit1 ON 1 = 1
) AS Project3
) AS Project3
ORDER BY ...
-- p__linq__0: 'M' (Type = String, Size = 1)
-- p__linq__1: '1/1/2018 00:00:00' (Type = DateTime, Size = 16)
-- p__linq__2: '1/1/2019 00:00:00' (Type = DateTime, Size = 16)
-- Executing asynchronously at 26/9/2019 12:48:27 +03:00
Using the exact same code, I restart the application and connect to DB 2, and the same LINQ expression is translated to the following SQL query, which fails:
Opened connection asynchronously at 26/9/2019 12:41:00 +03:00
SELECT SKIP 0 FIRST 25
...
FROM ( SELECT ...
FROM ( SELECT
...
FROM (SELECT
... ) AS Project1
LEFT OUTER JOIN (SELECT FIRST 1 Project2.C1 AS C1
FROM ( SELECT
...
) AS Project2
ORDER BY ... ASC ) AS Limit1 ON 1 = 1
) AS Project3
) AS Project3
ORDER BY ...
-- p__linq__0: 'M' (Type = String, Size = 1)
-- p__linq__1: '1/1/2018 00:00:00' (Type = DateTime, Size = 16)
-- p__linq__2: '1/1/2019 00:00:00' (Type = DateTime, Size = 16)
-- Executing asynchronously at 26/9/2019 12:41:00 +03:00
-- Failed in 403 ms with error: ERROR [IX000] [IBM][IDS/UNIX64] Column (...) not found in any table in the query (or SLV is undefined).
You can see that the second query is missing the LATERAL keyword. Is it possible that merely connecting to a different DB affects the LINQ-to-SQL translation?
Edit to answer questions:
@Fildor the DBs are not the exact same version:
- DB1 is IBM Informix Dynamic Server Version 12.10.FC6WE
- DB2 is IBM Informix Dynamic Server Version 12.10.FC10
@Corak as far as I know, the DB schema regarding the missing column is the same in both DBs. Since I cannot be 100% sure, though, could that be the cause? All columns are there; if there is any difference, it will be in foreign keys, for example. The thing is that the two queries are exactly the same, with the only difference being the LATERAL keyword. Going by IBM's documentation of the LATERAL keyword, it makes sense to me that without it the "missing" column cannot be found in the subquery.
I have a database table with a column that holds values from 1 to 999.
But it has some gaps, e.g. 1,2,3,4,5,6,11,15 etc...
What would be the best way to get the "next number" from this table?
Thanks in advance for your help
One way to do this is to get the prior row for every row, and then check where the sequence makes a jump.
This will not perform great, and it is NOT SAFE when more than one user is adding new rows!
declare @t table (number int)
insert into @t values (1), (2), (3), (4), (5), (6), (11), (12)
select top 1
(select top 1 t2.number + 1 from @t t2 where t2.number < t.number order by t2.number desc) as prior
from @t t
where number <> (select top 1 t2.number + 1 from @t t2 where t2.number < t.number order by t2.number desc)
order by t.number
The result would be 7
Another option is this
select top 1
t.number + 1
from @t t
left join @t t2 on t.number = t2.number - 1
where t2.number is null
order by t.number
This method might even be faster than Robin's solution.
EDIT
As Daniel pointed out in a comment, this will never return 1 in case the gap happens to be the first row.
To fix this, we can retrieve a value for the first missing row, and add it to our result by use of a union.
select top 1 number
from ( -- no TOP here: the outer TOP 1 ... ORDER BY picks the smallest candidate
select t.number + 1 as number
from @t t
left join @t t2 on t.number = t2.number - 1
where t2.number is null
union
select 1 as number
from @t t
where not exists (select 1 from @t t3 where t3.number = 1)
) t
order by t.number
Since the extra query can only retrieve exactly one row by an index, this should not affect performance much.
You can use a CTE to generate the numbers and then get the first one that does not match a record.
This works fine, since you mentioned that it is not a large table:
I have a datatable which has columns with values 1 till 999
Regarding the other answers, both are much faster than this on large tables, but neither of them will return the correct value (1) if your input starts at 2 or greater.
I don't know the purpose of this request, but be aware that when values are calculated this way, two users working at the same time can get the same value. That can be an issue, especially if you want to use this value as part of a primary key or unique index.
;with numbers as (
SELECT 1 as nrstart, MAX(yourcolumn) as nrend FROM yourTable
UNION ALL
SELECT nrstart+1, nrend FROM numbers
WHERE nrstart <= nrend
)
SELECT TOP 1 nrstart
FROM numbers
WHERE NOT EXISTS (SELECT 1 FROM yourTable WHERE yourcolumn = numbers.nrstart)
ORDER BY nrStart
OPTION (MAXRECURSION 0);
You want the first number, where that number plus one is not in the table.
SELECT TOP 1 (Number + 1) FROM myTable a WHERE NOT EXISTS
(SELECT * FROM myTable b WHERE b.Number = a.Number + 1)
ORDER By Number
As mentioned in various comments, this sort of thing should be done in a transaction if there's any risk of a second user filling the gap while the first is looking for it.
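For comparison, here is a small Python sketch of the same "first gap" logic outside the database. The function name `next_free_number` and its `start` parameter are made up for illustration; it also covers the case discussed above where the sequence does not start at 1. The concurrency caveat still applies: two callers can get the same answer unless the lookup and the insert are protected together.

```python
def next_free_number(used, start=1):
    """Return the smallest integer >= start that is not in `used`."""
    taken = set(used)
    candidate = start
    # Walk upward until a number is found that is not taken yet.
    while candidate in taken:
        candidate += 1
    return candidate

print(next_free_number([1, 2, 3, 4, 5, 6, 11, 12]))
```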
I'm working on a project where efficiency of the search functionality is critical.
I have several flag columns (like enum flags in C#). Searching on this data is super fast (3 milliseconds round trip), but I've come a cropper now that I have to perform group counts.
So, I have an item 'A' that contains Red (1), White (8) and blue (64) so the 'Colours' column holds the number 73.
To search I can search for items with red with this
Declare @colour int
set @colour = 1
Select *
From Items
Where (Colour & @colour) > 0
That works great. Now I have to group it (also super fast)
So if I have 8 items in total, 5 contain red, 3 contain white and 7 contain blue the results would look like:
Colour Qty
------------------
1 5
8 3
64 7 ( I don't have to worry about the name )
So: Is there any way I can take the number 73 and bitwise split it into groups?
(Part 2: How do I translate that into Linq to SQL?)
Any advice would be appreciated
Thanks ^_^
Ok - I think I've worked out the best solution:
I tried a view with a cte:
with cte as (
select cast(1 as bigint) as flag, 1 pow
union all
select POWER(cast(2 as bigint),pow), pow + 1
from cte
where flag < POWER(cast(2 as bigint),62)
)
, cte2 as (
select flag from cte
union select -9223372036854775808
)
but that was too slow, so now I have made it into a static table. I join with a bitwise '&':
select fv.Flag, Count(*) as Qty
From FlagValues fv
inner join Items i on (fv.Flag & i.Colour) <> 0
group by fv.Flag
Much faster ^_^
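If it helps to see what that join computes, here is a rough Python sketch of the same per-flag counting. The function name `count_by_flag`, the `bits` parameter and the sample values are assumptions for illustration only; the sample was chosen to reproduce the 5 / 3 / 7 counts from the question.

```python
from collections import Counter

def count_by_flag(colour_values, bits=63):
    """Count, for each single-bit flag, how many values have that bit set."""
    counts = Counter()
    for value in colour_values:
        for bit in range(bits):
            flag = 1 << bit
            if value & flag:
                counts[flag] += 1
    return dict(counts)

# Hypothetical data: 8 items, 5 with red (1), 3 with white (8), 7 with blue (64).
items = [73, 73, 73, 65, 65, 64, 64, 0]
print(count_by_flag(items))
```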
I am being asked to build a table with a bunch of bit fields to toggle a series of options. This is a perfect place for a c# flag enum to just combine all of these bits into a single int "options" field in the table.
Problem is that there are things other than c# that will need to read and query off these flags in the sql queries. I could find nothing that seems to be able to deconstruct an int into a series of flag values. What I'm looking for (I think) is something that converts the int back to binary and then picks the n'th value and reports it as a bool of a specific option. I could do this manually but I'm worried about the performance hit plus the "going around your #ss to get to your elbow" effect of just trying to avoid a litany of bit columns with an over complicated solution.
This is a perfect place for a c# flag enum to just combine all of
these bits into a single int "options" field in the table.
No, it is not. It is the perfect place to show you are not aware that bit columns in SQL Server are optimized in storage (up to 8 bit columns share a single byte, so 16 of them take 2 bytes). And it is the perfect place for you to create a maintenance nightmare and demonstrate that you can use an antipattern.
I could find nothing that seems to be able to deconstruct an int into a series of flag
values
There is nothing for that. It is an antipattern.
Bit columns. That is how SQL wants it.
I think what you mean is that you have, say, 10 options in a flags enum in C# and you want to save the combination to the database as an int value,
e.g.
[Flags]
public enum Options
{
    None = 0,
    OP1 = 1,
    OP2 = 2,
    OP3 = 4,
    ...
}
If you need to check this with something other than C#, you just need a bitwise operation to see which options are set. This can be done in SQL using the & operator, and I guess most languages have some implementation of bitwise operators as well.
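As an illustration of reading such a value from a language other than C#, here is a minimal Python sketch. The `Options` members mirror the enum above, and the helper name `enabled` is made up; treat it as one possible way to do it, not the only one.

```python
from enum import IntFlag

class Options(IntFlag):
    # Mirrors the example enum; values must be distinct powers of two.
    NONE = 0
    OP1 = 1
    OP2 = 2
    OP3 = 4

def enabled(stored):
    """Return the names of all options whose bit is set in the stored int."""
    return [
        opt.name
        for opt in Options
        # Skip the zero member, then test the bit with a bitwise AND.
        if opt.value and (stored & opt.value) == opt.value
    ]

print(enabled(3))
```

The same `(stored & value) = value` test is what a SQL query would use against the int column.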
For fun, I'm going to answer your question directly. Over large data, this will not perform well, but neither will many bit columns.
I think the answer for your overall design to have correct database design, not have a bunch of bit columns, and still perform well is to create a 1 to many table off of your main table, where a record existence means that the option is enabled. This can then be indexed well and perform well. You can also follow a similar construct of lookup table that I'm using below, combined with SUM() across the multiple rows in the child table to convert the multiple records back to your C# enum. Updating the table can also be accomplished with a well crafted MERGE statement. The code for that solution is a little drawn out however and outside the scope of your original question.
CREATE TABLE dbo.Data
(
ID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
Name varchar(50) NOT NULL,
Flags int NOT NULL DEFAULT(0)
);
CREATE TABLE dbo.Flag
(
Value int NOT NULL PRIMARY KEY,
Name varchar(50) NOT NULL
);
INSERT INTO dbo.Flag
SELECT 1, 'Option A'
UNION ALL
SELECT 2, 'Option B'
UNION ALL
SELECT 4, 'Option C';
INSERT INTO dbo.Data (Name, Flags)
SELECT 'George', 0
UNION ALL
SELECT 'Bob', 1
UNION ALL
SELECT 'Bill', 3;
-- for C#
SELECT
d.ID,
d.Name,
d.Flags
FROM
dbo.Data d;
-- for other callers vertical
SELECT
d.ID,
d.Name,
d.Flags,
f.Name AS Flag,
f.Value
FROM
dbo.Data d
LEFT JOIN dbo.Flag f ON
d.Flags & f.Value = f.Value;
-- for other callers horizontal
SELECT
p.Name,
p.[Option A],
p.[Option B],
p.[Option C]
FROM
(
SELECT
d.Name,
f.Name AS Flag,
f.Value
FROM
dbo.Data d
LEFT JOIN dbo.Flag f ON
d.Flags & f.Value = f.Value
) d
PIVOT
(
COUNT(d.Value)
FOR d.Flag IN ([Option A], [Option B], [Option C])
) p;
SELECT
d.ID,
d.Name,
d.Flags
FROM
dbo.Data d
WHERE
EXISTS
(
SELECT *
FROM
dbo.Flag f
WHERE
d.Flags & f.Value = f.Value AND
f.Name IN ('Option A', 'Option B')
);
I have a table as follows,
TypeID Name Date
-------------------------------
1 Carrot 1-1-2013
1 Beetroot 1-1-2013
1 Beans 1-1-2013
2 cabbage 1-1-2013
2 potato 1-1-2013
2 tomato 1-1-2013
2 onion 1-1-2013
If I need 2 rows, it should return 2 rows from TypeId 1 and 2 rows from TypeId 2. If I need 4 rows, I have to get 4 rows from TypeId 1 and 4 rows from TypeId 2,
but TypeId 1 has only 3 rows, so we should get only 3 rows for TypeId 1.
How can I do that? Shall I add a row number?
For SQL Server;
EDIT: Your question changed slightly;
If you want a maximum of x items per category, you can use ROW_NUMBER();
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY TypeID ORDER BY Name) rn FROM Table1
)
SELECT TypeID, Name, [Date] FROM cte
WHERE rn <=3 -- here is where your x goes
ORDER BY TypeID;
An SQLfiddle to test with.
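If the rows are already in application code, the same "at most x per TypeID" idea can be sketched in Python as follows. The sample `rows` are taken from the question's table (dates omitted), and the function name `top_n_per_type` is made up for illustration.

```python
from itertools import groupby, islice
from operator import itemgetter

# (TypeID, Name) pairs from the question's table; dates left out for brevity.
rows = [
    (1, "Carrot"), (1, "Beetroot"), (1, "Beans"),
    (2, "cabbage"), (2, "potato"), (2, "tomato"), (2, "onion"),
]

def top_n_per_type(rows, n):
    """Keep at most n rows per TypeID, ordered by name within each type."""
    ordered = sorted(rows, key=itemgetter(0, 1))
    result = []
    for _, group in groupby(ordered, key=itemgetter(0)):
        # islice stops early, so a type with fewer than n rows just yields them all.
        result.extend(islice(group, n))
    return result

print(top_n_per_type(rows, 2))
```

Asking for 4 rows returns all 3 rows of TypeID 1 plus 4 rows of TypeID 2, which is the behaviour described in the question.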
You can write your query to order by the TypeID.
Then, if you're using SQL, you could use SELECT TOP N or LIMIT N (depending on the DB), or with TSQL and SQL Server, use TOP(N) to take the top N rows.
If you're using a LINQ based ORM from your C# code, then you can use Take(N), which automatically creates the appropriate query based on the provider details to limit the number of results.
I think you should use a query to select your 3 rows from type 1.....and then the however many rows from type 2 and then add the results together.
;With CTE(TypeID,Name,Date,RowNo)
AS
(
select *, ROW_NUMBER() OVER(PARTITION BY TypeID ORDER BY TypeID) from TableVEG
)
Select Top(@noofRows*2) * from CTE where RowNo<=@noofRows order by RowNo
The above query worked.. Thank u all... :-)