How to Convert Sub Query to Joins for Fast Result? - c#

I want to convert Sub Queries into joins to improve performance.
The following sub-queries take to long to load.
SELECT c.tank_name, c.fuel_type, c.capacity, c.tank_id,
(SELECT TOP 1 b.Level
from Microframe.dbo.TrackMessages b
where b.IMEI = a.IMEI
AND b.Timestamp >= #Start
order by b.Timestamp ) AS Level,
(select top 1 b.Timestamp
from Microframe.dbo.TrackMessages b
where b.IMEI = a.IMEI
AND b.Timestamp >= #Start
order by b.Timestamp ) AS TimeStamp,
(SELECT top 1 b.Temp
from Microframe.dbo.TrackMessages b
where b.IMEI = a.IMEI
AND b.Timestamp >= #Start
order by b.Timestamp ) AS Temp
FROM GatexServerDB.dbo.device as a
JOIN GatexReportsDB.dbo.tbl_static_tank_info as c ON c.tank_id = a.owner_id
WHERE c.client_id = 65
AND a.IMEI IS NOT NULL
AND c.tank_id IN ({Tanks})

You can move the subquery to the FROM clause and use CROSS APPLY. Since you seem to be dealing with IoT data though, you should investigate T-SQL's ranking, windowing and analytic functions. Performance will depend heavily on the table's indexes.
Given these tables :
create table #TrackMessages (
Message_ID bigint primary key,
imei nvarchar(50) ,
[timestamp] datetime2,
Level int,
temp numeric(5,2)
);
create table #device (
imei nvarchar(50) primary key,
owner_id int
);
create table #tbl_static_tank_info (
tank_id int not null primary key,
tank_name nvarchar(20),
fuel_type nvarchar(20),
capacity numeric(9,2),
owner_id int,
client_id int
)
And indexes :
create nonclustered index IX_MSG_IMEI_Time on #TrackMessages (imei,timestamp) include(level,temp) ;
create INDEX IX_Device_OwnerID on #device (Owner_ID)
create INDEX IX_Tank_Client on #tbl_static_tank_info (Client_ID);
create INDEX IX_Tank_Owner on #tbl_static_tank_info (Owner_ID);
The TOP 1 query would look like this :
SELECT c.tank_name, c.fuel_type, c.capacity, c.tank_id,
Level,
TimeStamp,
Temp
FROM #device as a
inner JOIN #tbl_static_tank_info as c ON c.tank_id = a.owner_id
cross apply (SELECT top 1 imei,Temp,Level,timestamp
from #TrackMessages b
where b.IMEI = a.imei
AND b.Timestamp >= #start
order by b.Timestamp ) msg
WHERE c.client_id = 65
AND a.IMEI IS NOT NULL
AND c.tank_id IN (1,5,7)
If there is a 1-M relation between tanks, devices and messages, the FIRST_VALUE analytic function can be used to return the first record ber device, without using a subquery :
SELECT c.tank_name, c.fuel_type, c.capacity, c.tank_id,
first_value(Temp) over (partition by b.imei order by timestamp) as temp,
first_value(Level) over (partition by b.imei order by timestamp) as level,
min(timestamp) over (partition by b.imei) as timestamp
from #TrackMessages b
inner join #device as a on b.IMEI = a.imei
inner JOIN #tbl_static_tank_info as c ON c.tank_id = a.owner_id
WHERE c.client_id = 65
AND a.IMEI IS NOT NULL
AND c.tank_id IN (1,5,7)
Performance will depend heavily on the indexes, the table statistics and whether the index and OVER order matches.
This query can be modified to return both the first and last value per device using LAST_VALUE :
SELECT c.tank_name, c.fuel_type, c.capacity, c.tank_id,
first_value(Temp) over (partition by b.imei order by timestamp) as StartTemp,
first_value(Level) over (partition by b.imei order by timestamp) as StartLevel,
min(timestamp) over (partition by b.imei) as StartTime,
last_value(Temp) over (partition by b.imei order by timestamp) as EndTemp,
lastt_value(Level) over (partition by b.imei order by timestamp) as EndLevel,
max(timestamp) over (partition by b.imei) as EndTime
from #TrackMessages b
inner join #device as a on b.IMEI = a.imei
inner JOIN #tbl_static_tank_info as c ON c.tank_id = a.owner_id
WHERE c.client_id = 65
AND a.IMEI IS NOT NULL
AND c.tank_id IN (1,5,7)
the server would have to sort the measurements both by ascending timestamp order (that's what the IX_MSG_IMEI_Time index already does) and descending order.

Here's a solution with CROSS APPLY which is like a function you can declare on the go and use it as a joining clause. You can change CROSS APPLY to OUTER APPLY if the returning set might not exist, in this case if there might not be any record on TrackMessages for a particular IMEI (will return NULL values).
SELECT
c.tank_name,
c.fuel_type,
c.capacity,
c.tank_id,
T.Level,
T.Timestamp,
T.Temp
FROM GatexServerDB.dbo.device as a
JOIN GatexReportsDB.dbo.tbl_static_tank_info as c ON c.tank_id = a.owner_id
CROSS APPLY (
SELECT TOP 1 -- Retrieve only the first record
-- And return as many columns as you need
b.Level,
b.Timestamp,
b.Temp
FROM
Microframe.dbo.TrackMessages AS b
WHERE
a.IMEI = b.IMEI AND -- With matching IMEI
b.Timestamp >= #Start
ORDER BY
b.Timestamp) T -- Ordered by Timestamp
WHERE c.client_id = 65
AND a.IMEI IS NOT NULL
AND c.tank_id IN ({Tanks})
However I believe the key point here would be indexes on your tables. If you are already sure that the problem is the subquery, then make sure that TrackMessages has the following index:
CREATE NONCLUSTERED INDEX NCI_TrackMessages_IMEI_TimeStamp ON Microframe.dbo.TrackMessages (IMEI, Timestamp)
Indexes have pros and cons, make sure to check them out before creating or dropping one.

Not having the structures, my guees for the solution is:
WITH CTE AS
(SELECT B.IMEI,
b.Level,
b.Timetamp,
b.Temp,
ROW_NUMBER() OVER (PARTITION BY b.IMEI ORDER BY Timestamp) AS Row
FROM Microframe.dbo.TrackMessages b
WHERE b.Timestamp >= #Start
)
SELECT c.tank_name, c.fuel_type, c.capacity, c.tank_id,
CTE.Level, CTE.Timestamp, CTE.Temp
FROM GatexServerDB.dbo.device as a
INNER JOIN GatexReportsDB.dbo.tbl_static_tank_info as c ON c.tank_id = a.owner_id
INNER JOIN CTE ON CTE.IMEI = a.IMEI
WHERE c.client_id = 65
AND a.IMEI IS NOT NULL
AND c.tank_id IN ({Tanks})
AND CTE.Row = 1;
I cannot test it but it should be very close to the solution. Please, confirm if it works.

You can compare and go with either of the below solutions
The JOIN way where ordering is done via row-number windowing function
SELECT * FROM
(
SELECT
c.tank_name,
c.fuel_type,
c.capacity,
c.tank_id,
Level=b.Level,
TimeStamp=b.Timestamp,
Temp=b.Temp,
r=Row_number() over ( order by b.timestamp)
FROM GatexServerDB.dbo.device as a
JOIN GatexReportsDB.dbo.tbl_static_tank_info as c
ON c.tank_id = a.owner_id
JOIN Microframe.dbo.TrackMessages as b
ON b.IMEI = a.IMEI AND b.Timestamp >= #Start
WHERE c.client_id = 65
AND a.IMEI IS NOT NULL
AND c.tank_id IN ({Tanks})
)T
where r=1
or the CROSS APPLY way like below
SELECT * FROM
(
SELECT
c.tank_name, c.fuel_type, c.capacity, c.tank_id
FROM GatexServerDB.dbo.device as a
JOIN GatexReportsDB.dbo.tbl_static_tank_info as c
ON c.tank_id = a.owner_id
AND c.client_id = 65
AND a.IMEI IS NOT NULL
AND c.tank_id IN ({Tanks})
) A
CROSS APPLY
(
SELECT
TOP 1
b.Level, b.Timestamp,b.Temp
FROM Microframe.dbo.TrackMessages b
WHERE b.IMEI = a.IMEI
AND b.Timestamp >= #Start
ORDER BY b.Timestamp
)D

Related

Output transactions that exceed certain amount within 3 days

In a month, seperate by AccountNo, all transaction that exceeds 10,000 amount within 3 days must be outputted.
This is a sample table:
AccountNo-----Date------------Amount
1-------------5/1/17----------8000
1-------------5/3/17----------1000
1-------------5/4/17----------1000
1-------------5/6/17----------1000
2-------------5/7/17----------3000
2-------------5/10/17---------2000
2-------------5/13/17---------2000
2-------------5/13/17---------3000
3-------------5/14/17---------3000
3-------------5/15/17---------3000
3-------------5/16/17---------9000
4-------------5/17/17---------1000
5-------------5/18/17---------1000
5-------------5/19/17---------1000
5-------------5/20/17---------1000
The result must be:
AccountNo-----Date------------Amount
1-------------5/1/17----------8000
1-------------5/3/17----------1000
1-------------5/4/17----------1000
2-------------5/7/17----------3000
2-------------5/10/17---------2000
2-------------5/13/17---------2000
2-------------5/13/17---------3000
3-------------5/14/17---------3000
3-------------5/15/17---------3000
3-------------5/16/17---------9000
There is a code given to me, but it's not completely working yet.
Select A.*
From YourTable A
Cross Apply (
Select RT1 = sum(case when [Date] <= B2.TstDate then [Amount] else 0 end)
,RT2 = sum(case when [Date] >= B2.TstDate then [Amount] else 0 end)
From YourTable B1
Cross Join (Select TstDate=A.[Date]) B2
Where [Date] between DateAdd(DAY,-2,A.[Date]) and DateAdd(DAY,2,A.[Date])
and Year([Date])=Year(TstDate)
and Month([Date])=Month(TstDate)
) B
Where RT1>=10000 or RT2>=10000
Common table expression: TransInfo
TransInfo is a table with account, year, month, day, amount
The main query:
The [Tmiddle] query adds a row_number based on amount per day per month per accountno
The [Tmiddle] is joined again with TransInfo to limit the results to top 3 of a month AS [Touter] where the top 3 is >= 10.000
The outer most query combines the result with a table again to gain the complete transaction information again
In short:
Filter data on top 3 amounts per day per month per year per account
Check if sum of amount >= 10.000
Show results
Query:
WITH [TransInfo] ([AccountNo], [Year], [Month], [Day], [Amount], [Rownumber])
AS
(
SELECT [AccountNo]
,[Year]
,[Month]
,[Day]
,[Amount]
,ROW_NUMBER() OVER
(
PARTITION BY [AccountNo],
[Year],
[Month]
ORDER BY [Amount] DESC
) AS [Rownumber]
FROM
(
SELECT [AccountNo]
,DATEPART(YEAR, [Date]) AS [Year]
,DATEPART(MONTH, [Date]) AS [Month]
,DATEPART(DAY, [Date]) AS [Day]
,SUM([Amount]) AS [Amount]
FROM [Test].[dbo].[Data]
GROUP BY [AccountNo],
DATEPART(MONTH, [Date]),
DATEPART(YEAR, [Date]),
DATEPART(DAY, [Date])
) AS [Tinner]
)
SELECT [Data].[AccountNo]
,[Data].[Date]
,[Data].[Amount]
FROM [Test].[dbo].[Data]
INNER JOIN
(
SELECT [TransInfo].[AccountNo]
,[TransInfo].[Year]
,[TransInfo].[Month]
,[TransInfo].[Day]
,[TransInfo].[Amount]
FROM [TransInfo]
INNER JOIN
(
SELECT [AccountNo]
,[Year]
,[Month]
FROM [TransInfo]
WHERE [TransInfo].[Rownumber] <= 3
GROUP BY [TransInfo].[AccountNo],
[TransInfo].[Year],
[TransInfo].[Month]
HAVING SUM ([TransInfo].[Amount]) >= 10000
) AS [Tmiddle]
ON [Tmiddle].[AccountNo] = [TransInfo].[AccountNo]
AND [Tmiddle].[Year] = [TransInfo].[Year]
AND [Tmiddle].[Month] = [TransInfo].[Month]
WHERE [TransInfo].[Rownumber] <= 3
) AS [Touter]
ON [Data].[AccountNo] = [TOuter].[AccountNo]
AND DATEPART(YEAR, [Data].[Date]) = [TOuter].[Year]
AND DATEPART(MONTH, [Data].[Date]) = [TOuter].[Month]
AND DATEPART(DAY, [Data].[Date]) = [TOuter].[Day]
Result: Left is query result, right is complete table
Does this work as expected? You may need to change the 2's to 3's depending on what you mean by 'within 3 days'.
SELECT DISTINCT C.*
FROM YourTable C
INNER JOIN (
SELECT a.AccountNo, a.Date
FROM YourTable a
INNER JOIN YourTable b
ON a.AccountNo = b.AccountNo
AND DATEADD(DAY, 2, a.Date) <= b.Date AND b.Date >= a.Date
GROUP BY a.AccountNo, a.Date
HAVING SUM(b.Amount) > 10000
) d
ON C.AccountNo = d.AccountNo
AND DATEADD(DAY, 2, D.Date) <= C.Date AND C.Date >= D.Date

how to compare a string delimited string to a column value in sql without considering sequence

How to compare a string delimited string to a column value in sql without considering sequence?
Suppose I have a value in sql column [fruits] - mango, apple, cherry... I have list in asp.net C# cherry, mango, apple... I want to write sql query such that it can match sql table without order.
I suggest that you look at the fabulous answers in this SO question
How to split a comma-separated value to columns
That said, your solution should be pass each column which contains words to this function and then store it in a table along with a column ID.
So "mango,apple,cherry" becomes a table with values
ColdID Value
_______________
1 mango
1 apple
1 cherry
Now order the tables by ColID ASC, Value ASC and compare both the tables.
This should do it.
DECLARE #str NVARCHAR(MAX)
, #Delim NVARCHAR(255)
SELECT #str = 'cherry,mango,peach,apple'
SELECT #Delim = ','
CREATE TABLE #Fruits ( Fruit VARCHAR(255) )
INSERT INTO #Fruits
( Fruit )
VALUES ( 'cherry' ),
( 'Mango' ),
( 'Apple' ) ,
( 'Banana' )
;WITH lv0 AS (SELECT 0 g UNION ALL SELECT 0)
,lv1 AS (SELECT 0 g FROM lv0 a CROSS JOIN lv0 b) -- 4
,lv2 AS (SELECT 0 g FROM lv1 a CROSS JOIN lv1 b) -- 16
,lv3 AS (SELECT 0 g FROM lv2 a CROSS JOIN lv2 b) -- 256
,lv4 AS (SELECT 0 g FROM lv3 a CROSS JOIN lv3 b) -- 65,536
,lv5 AS (SELECT 0 g FROM lv4 a CROSS JOIN lv4 b) -- 4,294,967,296
,Tally_CTE (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM lv5)
SELECT SUBSTRING(#str, N, CHARINDEX(#Delim, #str + #Delim, N) - N) AS Item
INTO #StrTable
FROM Tally_CTE
WHERE N BETWEEN 1 AND DATALENGTH(#str) + DATALENGTH(#Delim)
AND SUBSTRING(#Delim + #str, N, LEN(#Delim)) = #Delim;
--#############################################################################
-- in both
--#############################################################################
SELECT *
FROM #Fruits F
JOIN #StrTable ST ON F.Fruit = ST.Item
--#############################################################################
-- in table but not string
--#############################################################################
SELECT *
FROM #Fruits F
LEFT JOIN #StrTable ST ON ST.Item = F.Fruit
WHERE ST.Item IS NULL
--#############################################################################
-- in string but not table
--#############################################################################
SELECT *
FROM #StrTable ST
LEFT JOIN #Fruits F ON ST.Item = F.Fruit
WHERE F.Fruit IS NULL
GO
DROP TABLE #Fruits
DROP TABLE #StrTable
You can use string_split function to do this. I tested this on SQL Server 2017 ctp 2.0 but it should work on 2016 too.
drop table if exists dbo.Fruits;
create table dbo.Fruits (
Fruits varchar(100)
);
insert into dbo.Fruits (Fruits)
values ('cherry,mango,apple'), ('peanut,cherry,mango'),
('apple,cherry,mango')
declare #str varchar(100) = 'apple,mango,cherry';
select
tt.Fruits
, COUNT(tt.value) as Value01
, COUNT(app.value) as Value02
from (
select
*
from dbo.Fruits f
outer apply string_split (f.Fruits, ',') t
) tt
left join string_split (#str, ',') app on tt.value = app.value
group by tt.Fruits

Entity Framework SUM CASE not optimized

I'm trying to write a simple SQL query in LinQ, and no matter how hard I try, I always get a complex query.
Here is the SQL I am trying to achieve (this is not what I'm getting):
SELECT
ClearingAccounts.ID,
SUM(CASE WHEN Payments.StatusID = 1 THEN Payments.TotalAmount ELSE 0 END) AS Sum1,
SUM(CASE WHEN DirectDebits.StatusID = 2 THEN DirectDebits.TotalAmount ELSE 0 END) AS Sum2,
SUM(CASE WHEN Payments.StatusID = 2 THEN Payments.TotalAmount ELSE 0 END) AS Sum3,
SUM(CASE WHEN DirectDebits.StatusID = 1 THEN DirectDebits.TotalAmount ELSE 0 END) AS Sum4
FROM ClearingAccounts
LEFT JOIN Payments ON Payments.ClearingAccountID = ClearingAccounts.ID
LEFT JOIN DirectDebits ON DirectDebits.ClearingAccountID = ClearingAccounts.ID
GROUP BY ClearingAccounts.ID
Here is the code:
from clearingAccount in clearingAccounts
let payments = clearingAccount.Payments
let directDebits = clearingAccount.DirectDebits
select new
{
ID = clearingAccount.ID,
Sum1 = payments.Sum(p => p.StatusID == 1 ? p.TotalAmount : 0),
Sum2 = directDebits.Sum(p => p.StatusID == 2 ? p.TotalAmount : 0),
Sum3 = payments.Sum(p => p.StatusID == 2 ? p.TotalAmount : 0),
Sum4 = directDebits.Sum(p => p.StatusID == 1 ? p.TotalAmount : 0),
}
The generated query gets the data from the respective table for each sum, so four times. I'm not sure if it's even possible to optimize this?
EDIT Here the is generated query:
SELECT
[Project5].[ID] AS [ID],
[Project5].[C1] AS [C1],
[Project5].[C2] AS [C2],
[Project5].[C3] AS [C3],
[Project5].[C4] AS [C4]
FROM ( SELECT
[Project4].[ID] AS [ID],
[Project4].[C1] AS [C1],
[Project4].[C2] AS [C2],
[Project4].[C3] AS [C3],
(SELECT
SUM([Filter5].[A1]) AS [A1]
FROM ( SELECT
CASE WHEN (1 = [Extent5].[StatusID]) THEN [Extent5].[TotalAmount] ELSE cast(0 as decimal(18)) END AS [A1]
FROM [dbo].[DirectDebits] AS [Extent5]
WHERE [Project4].[ID] = [Extent5].[ClearingAccountID]
) AS [Filter5]) AS [C4]
FROM ( SELECT
[Project3].[ID] AS [ID],
[Project3].[C1] AS [C1],
[Project3].[C2] AS [C2],
(SELECT
SUM([Filter4].[A1]) AS [A1]
FROM ( SELECT
CASE WHEN (2 = [Extent4].[StatusID]) THEN [Extent4].[TotalAmount] ELSE cast(0 as decimal(18)) END AS [A1]
FROM [dbo].[Payments] AS [Extent4]
WHERE [Project3].[ID] = [Extent4].[ClearingAccountID]
) AS [Filter4]) AS [C3]
FROM ( SELECT
[Project2].[ID] AS [ID],
[Project2].[C1] AS [C1],
(SELECT
SUM([Filter3].[A1]) AS [A1]
FROM ( SELECT
CASE WHEN (2 = [Extent3].[StatusID]) THEN [Extent3].[TotalAmount] ELSE cast(0 as decimal(18)) END AS [A1]
FROM [dbo].[DirectDebits] AS [Extent3]
WHERE [Project2].[ID] = [Extent3].[ClearingAccountID]
) AS [Filter3]) AS [C2]
FROM ( SELECT
[Project1].[ID] AS [ID],
(SELECT
SUM([Filter2].[A1]) AS [A1]
FROM ( SELECT
CASE WHEN (1 = [Extent2].[StatusID]) THEN [Extent2].[TotalAmount] ELSE cast(0 as decimal(18)) END AS [A1]
FROM [dbo].[Payments] AS [Extent2]
WHERE [Project1].[ID] = [Extent2].[ClearingAccountID]
) AS [Filter2]) AS [C1]
FROM ( SELECT
[Extent1].[ID] AS [ID]
FROM [dbo].[ClearingAccounts] AS [Extent1]
WHERE ([Extent1].[CustomerID] = 3) AND ([Extent1].[Deleted] <> 1)
) AS [Project1]
) AS [Project2]
) AS [Project3]
) AS [Project4]
) AS [Project5]
Edit
Note that as per #usr's comment, that your original Sql Query is broken. By LEFT OUTER joining on two independent tables, and then grouping on the common join key, as soon as one of the DirectDebits or Payments tables returns more than one row, you will erroneously duplicate the TotalAmount value in the 'other' SUMmed colums (and vice versa). e.g. If a given ClearingAccount has 3 DirectDebits and 4 Payments, you will get a total of 12 rows (whereas you should be summing 3 and 4 rows independently for the two tables). A better Sql Query would be:
WITH ctePayments AS
(
SELECT
ClearingAccounts.ID,
-- Note the ELSE 0 projection isn't required as nulls are eliminated from aggregates
SUM(CASE WHEN Payments.StatusID = 1 THEN Payments.TotalAmount END) AS Sum1,
SUM(CASE WHEN Payments.StatusID = 2 THEN Payments.TotalAmount END) AS Sum3
FROM ClearingAccounts
INNER JOIN Payments ON Payments.ClearingAccountID = ClearingAccounts.ID
GROUP BY ClearingAccounts.ID
),
cteDirectDebits AS
(
SELECT
ClearingAccounts.ID,
SUM(CASE WHEN DirectDebits.StatusID = 2 THEN DirectDebits.TotalAmount END) AS Sum2,
SUM(CASE WHEN DirectDebits.StatusID = 1 THEN DirectDebits.TotalAmount END) AS Sum4
FROM ClearingAccounts
INNER JOIN DirectDebits ON DirectDebits.ClearingAccountID = ClearingAccounts.ID
GROUP BY ClearingAccounts.ID
)
SELECT ca.ID, COALESCE(p.Sum1, 0) AS Sum1, COALESCE(d.Sum2, 0) AS Sum2,
COALESCE(p.Sum3, 0) AS Sum3, COALESCE(d.Sum4, 0) AS Sum4
FROM
ClearingAccounts ca
LEFT OUTER JOIN ctePayments p
ON ca.ID = p.ID
LEFT OUTER JOIN cteDirectDebits d
ON ca.ID = d.ID;
-- GROUP BY not required, since we have already guaranteed at most one row
-- per joined table in the CTE's, assuming ClearingAccounts.ID is unique;
You'll want to fix and test this with test cases before you even contemplate conversion to LINQ.
Old Answer(s)
The Sql construct:
SELECT SUM(CASE WHEN ... THEN 1 ELSE 0 END) AS Something
when applied in a SELECT list, is a common hack 'alternative' to pivot data from the 'greater' select into columns which meet the projection criteria (and hence the zero if not matched) . It isn't really a sum at all, its a 'matched' count.
With regards to optimizing the Sql generated, another alternative would be to materialize the data after joining and grouping (and of course, if there is a predicate WHERE clause, apply that in Sql too via IQueryable), and then do the conditional summation in memory:
var result2 = Db.ClearingAccounts
.Include(c => c.Payments)
.Include(c => c.DirectDebits)
.GroupBy(c => c.Id)
.ToList() // or any other means to force materialization here.
.ToDictionary(
grp => grp.Key,
grp => new
{
PaymentsByStatus = grp.SelectMany(x => x.Payments)
.GroupBy(p => p.StatusId),
DirectDebitByStatus = grp.SelectMany(x => x.Payments)
.GroupBy(p => p.StatusId),
})
.Select(ca => new
{
ID = ca.Key,
Sum1 = ca.Value.PaymentsByStatus.Where(pbs => pbs.Key == 1)
.Select(pbs => pbs.Select(x => x.TotalAmount).Sum()),
Sum2 = ca.Value.DirectDebitByStatus.Where(pbs => pbs.Key == 2)
.Select(ddbs => ddbs.Select(x => x.TotalAmount).Sum()),
Sum3 = ca.Value.PaymentsByStatus.Where(pbs => pbs.Key == 2)
.Select(pbs => pbs.Select(x => x.TotalAmount).Sum()),
Sum4 = ca.Value.DirectDebitByStatus.Where(pbs => pbs.Key == 1)
.Select(ddbs => ddbs.Select(x => x.TotalAmount).Sum())
});
However, personally, I would leave this pivot projection directly in Sql, and then use something like SqlQuery to then deserialize the result back from Sql
directly into the final Entity type.
1)
Add AsNoTracking in EF to avoid tracking changes.
Check that you have indexes on the columns you are using for the JOINs. Especially the column that you are using to group by. Profile the query and optimize it. EF has also overhead over a stored procedure.
or
2) If you cannot find a way to make it as fast as you need, create a stored procedure and call it from EF. Even the same query will be faster.

Limit results from multiple individual tables in a single LINQ-to-Entities query. Resultant T-SQL is wrong

I need to query multiple tables with one query, and I need to limit the results from each table individually.
An example ...
I have a ContentItem, Retailer, and Product table.
ContentItem has a Type (int) field that corresponds to an enum of content types like "Retailer" and "Product." I am filtering ContentItem using this field for each sub-subquery.
ContentItem has an Id (pkey) field.
Retailer and Product have an Id (pkey) field. Id is also an FK to ContentItem.Id.
I can select from all three tables with a LEFT JOIN query. From there, I can then limit the total number of rows returned, let's say 6 rows total.
What I want to do is limit the number of rows returned from Retailer and Product individually. This way, I will have 12 rows (max) total: 6 from Retailer, and 6 from Product.
I can already accomplish this with SQL, but I am having a difficult time getting LINQ-to-Entities to "do the right thing."
Here's my SQL
SELECT * From
(
(SELECT * FROM (SELECT * FROM [dbo].[ContentItem] WHERE Type = 0 ORDER BY ContentItem.mtime OFFSET 0 ROWS FETCH NEXT 6 ROWS ONLY) Retailers)
UNION ALL
(SELECT * FROM (SELECT * FROM [dbo].[ContentItem] WHERE Type = 1 ORDER BY ContentItem.mtime OFFSET 0 ROWS FETCH NEXT 6 ROWS ONLY) Brands)
UNION ALL
(SELECT * FROM (SELECT * FROM [dbo].[ContentItem] WHERE Type = 2 ORDER BY ContentItem.mtime OFFSET 0 ROWS FETCH NEXT 6 ROWS ONLY) Products)
UNION ALL
(SELECT * FROM (SELECT * FROM [dbo].[ContentItem] WHERE Type = 3 ORDER BY ContentItem.mtime OFFSET 0 ROWS FETCH NEXT 6 ROWS ONLY) Certifications)
UNION ALL
(SELECT * FROM (SELECT * FROM [dbo].[ContentItem] WHERE Type = 4 ORDER BY ContentItem.mtime OFFSET 0 ROWS FETCH NEXT 6 ROWS ONLY) Claims)
) as ContentItem
LEFT JOIN [dbo].[Retailer] ON (Retailer.Id = ContentItem.Id)
LEFT JOIN [dbo].[Brand] ON (Brand.Id = ContentItem.Id)
LEFT JOIN [dbo].[Product] ON (Product.Id = ContentItem.Id)
LEFT JOIN [dbo].[Certification] ON (Certification.Id = ContentItem.Id)
LEFT JOIN [dbo].[Claim] ON (Claim.Id = ContentItem.Id);
Here's one of my many iterations of LINQ queries (which is not returning the desired result).
var queryRetailers = contentItemModel
.Where(contentItem => contentItem.Type == ContentTypeEnum.Retailer)
.OrderByDescending(o => o.mtime).Skip(skip).Take(take).Select(o => new { Id = o.Id });
var queryBrands = contentItemModel
.Where(contentItem => contentItem.Type == ContentTypeEnum.Brand)
.OrderByDescending(o => o.mtime).Skip(skip).Take(take).Select(o => new { Id = o.Id });
var queryProducts = contentItemModel
.Where(contentItem => contentItem.Type == ContentTypeEnum.Product)
.OrderByDescending(o => o.mtime).Skip(skip).Take(take).Select(o => new { Id = o.Id });
var queryCertifications = contentItemModel
.Where(contentItem => contentItem.Type == ContentTypeEnum.Certification)
.OrderByDescending(o => o.mtime).Skip(skip).Take(take).Select(o => new { Id = o.Id });
var queryClaims = contentItemModel
.Where(contentItem => contentItem.Type == ContentTypeEnum.Claim)
.OrderByDescending(o => o.mtime).Skip(skip).Take(take).Select(o => new { Id = o.Id });
var query = from contentItem in
queryRetailers
.Concat(queryBrands)
.Concat(queryProducts)
.Concat(queryCertifications)
.Concat(queryClaims)
join item in context.Retailer on contentItem.Id equals item.Id into retailerGroup
from retailer in retailerGroup.DefaultIfEmpty(null)
join item in context.Brand on contentItem.Id equals item.Id into brandGroup
from brand in brandGroup.DefaultIfEmpty(null)
join item in context.Product on contentItem.Id equals item.Id into productGroup
from product in productGroup.DefaultIfEmpty(null)
join item in context.Certification on contentItem.Id equals item.Id into certificationGroup
from certification in certificationGroup.DefaultIfEmpty(null)
join item in context.Claim on contentItem.Id equals item.Id into claimGroup
from claim in claimGroup.DefaultIfEmpty(null)
select new
{
contentItem,
retailer,
brand,
product,
certification,
claim
};
var results = query.ToList();
This query returns SQL that essentially "nests" my UNION ALL statements, and the server returns all rows from the database.
SELECT
[Distinct4].[C1] AS [C1],
[Distinct4].[C2] AS [C2],
[Extent6].[Id] AS [Id],
[Extent6].[RowVersion] AS [RowVersion],
[Extent6].[ctime] AS [ctime],
[Extent6].[mtime] AS [mtime],
[Extent7].[Id] AS [Id1],
[Extent7].[Recommended] AS [Recommended],
[Extent7].[RowVersion] AS [RowVersion1],
[Extent7].[ctime] AS [ctime1],
[Extent7].[mtime] AS [mtime1],
[Extent8].[Id] AS [Id2],
[Extent8].[OverrideGrade] AS [OverrideGrade],
[Extent8].[PlantBased] AS [PlantBased],
[Extent8].[Recommended] AS [Recommended1],
[Extent8].[RowVersion] AS [RowVersion2],
[Extent8].[ctime] AS [ctime2],
[Extent8].[mtime] AS [mtime2],
[Extent8].[Brand_Id] AS [Brand_Id],
[Extent8].[Grade_Name] AS [Grade_Name],
[Extent8].[Grade_Value] AS [Grade_Value],
[Extent9].[Id] AS [Id3],
[Extent9].[RowVersion] AS [RowVersion3],
[Extent9].[ctime] AS [ctime3],
[Extent9].[mtime] AS [mtime3],
[Extent9].[Grade_Name] AS [Grade_Name1],
[Extent9].[Grade_Value] AS [Grade_Value1],
[Extent10].[Id] AS [Id4],
[Extent10].[RowVersion] AS [RowVersion4],
[Extent10].[ctime] AS [ctime4],
[Extent10].[mtime] AS [mtime4],
[Extent10].[Grade_Name] AS [Grade_Name2],
[Extent10].[Grade_Value] AS [Grade_Value2]
FROM (SELECT DISTINCT
[UnionAll4].[C1] AS [C1],
[UnionAll4].[C2] AS [C2]
FROM (SELECT
[Distinct3].[C1] AS [C1],
[Distinct3].[C2] AS [C2]
FROM ( SELECT DISTINCT
[UnionAll3].[C1] AS [C1],
[UnionAll3].[C2] AS [C2]
FROM (SELECT
[Distinct2].[C1] AS [C1],
[Distinct2].[C2] AS [C2]
FROM ( SELECT DISTINCT
[UnionAll2].[C1] AS [C1],
[UnionAll2].[C2] AS [C2]
FROM (SELECT
[Distinct1].[C1] AS [C1],
[Distinct1].[C2] AS [C2]
FROM ( SELECT DISTINCT
[UnionAll1].[C1] AS [C1],
[UnionAll1].[Id] AS [C2]
FROM (SELECT TOP (1000)
[Project1].[C1] AS [C1],
[Project1].[Id] AS [Id]
FROM ( SELECT [Project1].[Id] AS [Id], [Project1].[mtime] AS [mtime], [Project1].[C1] AS [C1], row_number() OVER (ORDER BY [Project1].[mtime] DESC) AS [row_number]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[mtime] AS [mtime],
1 AS [C1]
FROM [dbo].[ContentItem] AS [Extent1]
WHERE 0 = CAST( [Extent1].[Type] AS int)
) AS [Project1]
) AS [Project1]
WHERE [Project1].[row_number] > 0
ORDER BY [Project1].[mtime] DESC
UNION ALL
SELECT TOP (1000)
[Project3].[C1] AS [C1],
[Project3].[Id] AS [Id]
FROM ( SELECT [Project3].[Id] AS [Id], [Project3].[mtime] AS [mtime], [Project3].[C1] AS [C1], row_number() OVER (ORDER BY [Project3].[mtime] DESC) AS [row_number]
FROM ( SELECT
[Extent2].[Id] AS [Id],
[Extent2].[mtime] AS [mtime],
1 AS [C1]
FROM [dbo].[ContentItem] AS [Extent2]
WHERE 1 = CAST( [Extent2].[Type] AS int)
) AS [Project3]
) AS [Project3]
WHERE [Project3].[row_number] > 0
ORDER BY [Project3].[mtime] DESC) AS [UnionAll1]
) AS [Distinct1]
UNION ALL
SELECT TOP (1000)
[Project7].[C1] AS [C1],
[Project7].[Id] AS [Id]
FROM ( SELECT [Project7].[Id] AS [Id], [Project7].[mtime] AS [mtime], [Project7].[C1] AS [C1], row_number() OVER (ORDER BY [Project7].[mtime] DESC) AS [row_number]
FROM ( SELECT
[Extent3].[Id] AS [Id],
[Extent3].[mtime] AS [mtime],
1 AS [C1]
FROM [dbo].[ContentItem] AS [Extent3]
WHERE 2 = CAST( [Extent3].[Type] AS int)
) AS [Project7]
) AS [Project7]
WHERE [Project7].[row_number] > 0
ORDER BY [Project7].[mtime] DESC) AS [UnionAll2]
) AS [Distinct2]
UNION ALL
SELECT TOP (1000)
[Project11].[C1] AS [C1],
[Project11].[Id] AS [Id]
FROM ( SELECT [Project11].[Id] AS [Id], [Project11].[mtime] AS [mtime], [Project11].[C1] AS [C1], row_number() OVER (ORDER BY [Project11].[mtime] DESC) AS [row_number]
FROM ( SELECT
[Extent4].[Id] AS [Id],
[Extent4].[mtime] AS [mtime],
1 AS [C1]
FROM [dbo].[ContentItem] AS [Extent4]
WHERE 3 = CAST( [Extent4].[Type] AS int)
) AS [Project11]
) AS [Project11]
WHERE [Project11].[row_number] > 0
ORDER BY [Project11].[mtime] DESC) AS [UnionAll3]
) AS [Distinct3]
UNION ALL
SELECT TOP (1000)
[Project15].[C1] AS [C1],
[Project15].[Id] AS [Id]
FROM ( SELECT [Project15].[Id] AS [Id], [Project15].[mtime] AS [mtime], [Project15].[C1] AS [C1], row_number() OVER (ORDER BY [Project15].[mtime] DESC) AS [row_number]
FROM ( SELECT
[Extent5].[Id] AS [Id],
[Extent5].[mtime] AS [mtime],
1 AS [C1]
FROM [dbo].[ContentItem] AS [Extent5]
WHERE 4 = CAST( [Extent5].[Type] AS int)
) AS [Project15]
) AS [Project15]
WHERE [Project15].[row_number] > 0
ORDER BY [Project15].[mtime] DESC) AS [UnionAll4] ) AS [Distinct4]
LEFT OUTER JOIN [dbo].[Retailer] AS [Extent6] ON [Distinct4].[C2] = [Extent6].[Id]
LEFT OUTER JOIN [dbo].[Brand] AS [Extent7] ON [Distinct4].[C2] = [Extent7].[Id]
LEFT OUTER JOIN [dbo].[Product] AS [Extent8] ON [Distinct4].[C2] = [Extent8].[Id]
LEFT OUTER JOIN [dbo].[Certification] AS [Extent9] ON [Distinct4].[C2] = [Extent9].[Id]
LEFT OUTER JOIN [dbo].[Claim] AS [Extent10] ON [Distinct4].[C2] = [Extent10].[Id]
So my overall questions are:
1) Is there a simpler SQL query I can execute to get the same results? I know that T-SQL doesn't support offsets per table in a subquery, hence the subquery wrapping.
2) If there isn't, what am I doing wrong in my LINQ query? Is this even possible with LINQ?
I wanted to add the SQL from #radar here all nice and formatted. It at least appears to be an elegant solution to avoid the sub-subqueries, and still accomplishes the offset/fetch.
SELECT *
FROM (SELECT
[ContentItem].*,
row_number() OVER ( PARTITION BY Type ORDER BY ContentItem.mtime ) as rn
FROM [dbo].[ContentItem]
LEFT JOIN [dbo].[Retailer] ON (Retailer.Id = ContentItem.Id)
LEFT JOIN [dbo].[Brand] ON (Brand.Id = ContentItem.Id)
LEFT JOIN [dbo].[Product] ON (Product.Id = ContentItem.Id)
LEFT JOIN [dbo].[Certification] ON (Certification.Id = ContentItem.Id)
LEFT JOIN [dbo].[Claim] ON (Claim.Id = ContentItem.Id)
) as x
WHERE x.rn >= a AND x.rn <= b;
a is the lower threshold (offset) and b is the upper threshold (fetch-ish). The only catch is that b now equals fetch + a instead of just fetch. The first set of results would be WHERE x.rn >= 0 AND x.rn <= 6, the second set WHERE x.rn >= 6 AND x.rn <= 12, third WHERE x.rn >= 12 AND x.rn <= 18, and so on.
As you are looking simpler SQL, you can use row_number analytic function, which would be faster
You need to try and see as there are many left joins and also proper index need to exists in these tables.
select *
from (
select *, row_number() over ( partition by Type order by ContentItem.mtime ) as rn
from [dbo].[ContentItem]
LEFT JOIN [dbo].[Retailer] ON (Retailer.Id = ContentItem.Id)
LEFT JOIN [dbo].[Brand] ON (Brand.Id = ContentItem.Id)
LEFT JOIN [dbo].[Product] ON (Product.Id = ContentItem.Id)
LEFT JOIN [dbo].[Certification] ON (Certification.Id = ContentItem.Id)
LEFT JOIN [dbo].[Claim] ON (Claim.Id = ContentItem.Id);
)
where rn <= 6
Well, it appears that I'm an idiot. That TOP(1000) call should have tipped me off. I assumed that my take variable was set to 6 but it was, in fact, set to 1000. Turns out my giant LINQ query works as expected, but the nested UNION ALL statements threw me off.
Still, I'm going to investigate #radar's answer further. It's hard to argue with better performance.

Querying in advance with linq2sql for data with gaps

Hi i have table: Values with ValueId, Timestamp , Value and BelongTo. Each 15 minutes there is insreted new row into that table with new value, current timestamp and specific BelongTo field. And now i want to find gaps i mean values where one after another has timestamp more then 15 minutes.
I was trying this:
var gaps = from p1 in db.T_Values
join p2 in db.T_Values on p1.TimeStamp.AddMinutes(15) equals p2.TimeStamp
into grups where !grups.Any() select new {p1};
and it works but i don't know if this is optimall, what do you think? and i don't know how can i add where p1.BelongTo == 1. Cos this query looks for all data.
Jon told
var gaps = from p1 in db.T_Values
where p1.BelongTo == 1
where !db.T_Values.Any(p2 => p1.TimeStamp.AddMinutes(15) == p2.Timestamp)
select p1;
Jon this last query is translated to:
exec sp_executesql N'SELECT [t0].[ValueID], [t0].[TimeStamp], [t0].[Value],
[t0].[BelongTo], [t0].[Type]
FROM [dbo].[T_Values] AS [t0]
WHERE (NOT (EXISTS(
SELECT NULL AS [EMPTY]
FROM [dbo].[T_Values] AS [t1]
WHERE DATEADD(ms, (CONVERT(BigInt,#p0 * 60000)) % 86400000,
DATEADD(day, (CONVERT(BigInt,#p0 * 60000)) / 86400000, [t0].[TimeStamp])) = [t1].[TimeStamp]
))) AND ([t0].[BelongTo] = #p1)',N'#p0 float,#p1 int',#p0=15,#p1=1
and it works unless all rows have the same belongTo, when there are rows with BelongTo with many diferent values then i've noticed I need to add to sql:and [t1].BelongTo = 1 which should finally look like this
N'SELECT [t0].[ValueID], [t0].[TimeStamp], [t0].[Value], [t0].[BelongTo], [t0].[Type]
FROM [dbo].[T_Values] AS [t0]
WHERE (NOT (EXISTS(
SELECT NULL AS [EMPTY]
FROM [dbo].[T_Values] AS [t1]
WHERE DATEADD(ms, (CONVERT(BigInt,#p0 * 60000)) % 86400000,
DATEADD(day, (CONVERT(BigInt,#p0 * 60000)) / 86400000, [t0].[TimeStamp])) = [t1].[TimeStamp]
and [t1].BelongTo = 1
))) AND ([t0].[BelongTo] = #p1)',N'#p0 float,#p1 int',#p0=15,#p1=1
other words:
SELECT TimeStamp
FROM [dbo].[T_Values] AS [t0]
WHERE NOT( (EXISTS (SELECT NULL AS [EMPTY]
FROM [dbo].[T_Values] AS [t1]
WHERE DATEADD(MINUTE, 15, [t0].[TimeStamp]) = [t1].[TimeStamp])))
AND ([t0].[BelongTo] = 1)
shoud change to
SELECT TimeStamp
FROM [dbo].[T_Values] AS [t0]
WHERE NOT( (EXISTS (SELECT NULL AS [EMPTY]
FROM [dbo].[T_Values] AS [t1]
WHERE DATEADD(MINUTE, 15, [t0].[TimeStamp]) = [t1].[TimeStamp] and [t1].BelongTo=1)))
AND ([t0].[BelongTo] = 1)
but I am still thinking how can I add this to linkq
Well adding the extra where clause is easy (and I'll remove the pointless anonymous type at the same time):
var gaps = from p1 in db.T_Values
where p1.BelongTo == 1
join p2 in db.T_Values
on p1.TimeStamp.AddMinutes(15) equals p2.TimeStamp
into grups
where !grups.Any()
select p1;
I'm not sure why you're grouping though... I would have thought this would be simpler:
var gaps = from p1 in db.T_Values
where p1.BelongTo == 1
where !db.T_Values.Any(p2 => p1.TimeStamp.AddMinutes(15) == p2.Timestamp)
select p1;
As for performance - look at the generated SQL and how it looks in SQL profiler.
EDIT: If you need the BelongTo check in both versions (makes sense) I'd suggest this:
var sequence = db.T_Values.Where(p => p.BelongTo == 1);
var gaps = from p1 in sequence
where !sequence.Any(p2 => p1.TimeStamp.AddMinutes(15) == p2.Timestamp)
select p1;
How about
var gaps = dbT_Values.Take(dbT_Values.Count()-1)
.Select((p, index) => new {P1 = p, P2 = dbT_Values.ElementAt(index + 1)})
.Where(p => p.P1.BelongsTo == 1 && p.P1.TimeStamp.AddMinutes(15).Equals(p.P2.TimeStamp)).Select(p => p.P1);

Categories

Resources