Make EF generated query efficient - c#

I need to copy records from Table1 (the source table) to Table2 (the target table) when a record does not yet exist in Table2 and a few other conditions hold.
I am using Entity Framework for this. With the query below, which fetches the records from the source table, I noticed that Entity Framework uses a LEFT JOIN against the target table, which is very slow. When I replaced it with a hand-written NOT EXISTS query, it ran fast.
So how can I get NOT EXISTS in the scenario below?
You can also see two inner joins to the same table. Why is that?
In general, how can I override the query EF generates?
This could be done by mapping a stored procedure, but I am looking for a solution without stored procedure mapping.
The query I used to fetch the records:
var records = dal.SourceTransactions.Where((o) =>
o.Policy.Quote.Type == "1"
&& (o.TransactionType == 1 || o.TransactionType == 2 || o.TransactionType == 3 || o.TransactionType == 4)
&& o.TransactionDate < System.DateTime.Now &&
o.TargetTransaction == null);
generated EF query:
{SELECT
[Filter1].[ID1] AS [ID],
[Filter1].[TransactionDate] AS [TransactionDate],
[Filter1].[TransactionType] AS [TransactionType],
[Filter1].[PolicyId] AS [PolicyId]
FROM (SELECT [Extent1].[ID] AS [ID1], [Extent1].[TransactionDate] AS [TransactionDate], [Extent1].[TransactionType] AS [TransactionType], [Extent1].[PolicyId] AS [PolicyId]
FROM [dbo].[SourceTransactions] AS [Extent1]
INNER JOIN [dbo].[Policies] AS [Extent2] ON [Extent1].[PolicyId] = [Extent2].[ID]
INNER JOIN [dbo].[Quotes] AS [Extent3] ON [Extent2].[QuoteId] = [Extent3].[ID]
INNER JOIN [dbo].[Quotes] AS [Extent4] ON [Extent2].[QuoteId] = [Extent4].[ID]
WHERE ([Extent1].[TransactionType] IN (1,2,3,4)) AND
([Extent1].[TransactionDate] < (SysDateTime())) AND (N'1' = [Extent3].[Type]) AND ([Extent4].[Type] IS NOT NULL) )
AS [Filter1]
LEFT OUTER JOIN [dbo].[TargetTransactions] AS [Extent5] ON [Filter1].[ID1] = [Extent5].[SourceTransactionID]
WHERE [Extent5].[SourceTransactionID] IS NULL}

You can force an EXISTS query like so:
dal.SourceTransactions
.Where(o => o.Policy.Quote.Type == "1"
&& (o.TransactionType == 1
|| o.TransactionType == 2
|| o.TransactionType == 3
|| o.TransactionType == 4)
&& o.TransactionDate < System.DateTime.Now
&& !dal.TargetTransactions
.Any(t => t.SourceTransactionID == o.ID));
This explicitly creates a subquery on TargetTransactions. I assume that dal is a context instance, so it also exposes TargetTransactions.
It is unfortunate that you have to massage EF into producing the best query. This may be one of the areas where EF 6 has improved, so it may be worth a try.
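To make the shape of that filter concrete, here is a minimal sketch that runs the same "no matching target row" test over in-memory collections with LINQ to Objects. The record types, the AntiJoinSketch class and the PendingIds method are hypothetical stand-ins, not part of the original model, and the Policy.Quote.Type condition is left out to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the two entity types in the question.
public record SourceTransaction(int ID, int TransactionType, DateTime TransactionDate);
public record TargetTransaction(int SourceTransactionID);

public static class AntiJoinSketch
{
    // Returns the IDs of source rows that have no matching target row.
    public static List<int> PendingIds(
        IEnumerable<SourceTransaction> sources,
        IEnumerable<TargetTransaction> targets,
        DateTime now)
    {
        var allowedTypes = new[] { 1, 2, 3, 4 };
        return sources
            .Where(s => allowedTypes.Contains(s.TransactionType)
                     && s.TransactionDate < now
                     // Same shape as the !Any(...) subquery in the answer.
                     && !targets.Any(t => t.SourceTransactionID == s.ID))
            .Select(s => s.ID)
            .ToList();
    }
}
```

Against a real context you would pass dal.SourceTransactions and dal.TargetTransactions instead of lists. Entity Framework normally turns Contains over a local array into an IN list and !Any(...) into NOT EXISTS, but it is worth checking the generated SQL for your EF version.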

Related

Change alias in ToQueryString() method EF Core 6

I need to run raw SQL queries, without LINQ and EF Core, in a project whose database was written and is managed with EF Core and LINQ.
testdb.Test.Include(a => a.Childs)
.Select(x => new { x.Name, Child = x.Childs.FirstOrDefault()})
.Where(a => a.Name == "" && a.Name == "").ToQueryString();
The result of this command is:
SELECT [t].[Name], [t1].[Id], [t1].[Name], [t1].[ParentId], [t1].[Test]
FROM [rest].[test2] AS [t]
LEFT JOIN (
SELECT [t0].[Id], [t0].[Name], [t0].[ParentId], [t0].[Test]
FROM (
SELECT [t2].[Id], [t2].[Name], [t2].[ParentId], [t2].[Test], ROW_NUMBER()
OVER(PARTITION BY [t2].[ParentId] ORDER BY [t2].[Id]) AS [row]
FROM [rest].[test] AS [t2]
) AS [t0] WHERE [t0].[row] <= 1
) AS [t1] ON [t].[Id] = [t1].[ParentId]
The aliases in the result do not match the navigation property names. I want to use FOR JSON AUTO to get the result as JSON and then deserialize it into a C# object.
The problem is that I can't change the aliases in EF Core...

LINQ generating Nested/Sub queries

I am using ASP.NET and Entity Framework with SQL Server as the database, and I am running into a strange issue.
I have this code:
var pricingInfo = (from price in invDB.Pricing.AsNoTracking()
join priceDtl in invDB.PricingDetail.AsNoTracking() on price.PricingId equals priceDtl.PricingId
join tagDtl in invDB.PricingTagDetail.AsNoTracking() on priceDtl.PricingDetailId equals tagDtl.PricingDetailId
join item in invDB.Item.AsNoTracking() on tagDtl.ItemId equals item.ItemId
join party in invDB.Party.AsNoTracking() on tagDtl.PartyId equals party.PartyId
join brd in invDB.Brand.AsNoTracking() on tagDtl.BrandId equals brd.BrandId into t from brand in t.DefaultIfEmpty()
where tagDtl.AvailableQuantity > 0m && price.PricingNo == printNumber
select new
{
TagNo = tagDtl.TagNo,
SellingRate = tagDtl.SellingRate,
Quantity = tagDtl.AvailableQuantity ?? 0m,
ItemCode = item.Name,
UOMId = priceDtl.UOMId,
Brand = brand.BrandCode,
Supplier = party.PartyCode,
Offer = tagDtl.Offer
}).ToList();
This generates the SQL query below, with a subquery that lacks the where condition, so it pulls the full set of records from a large volume of data. This results in heavy memory consumption and performance issues.
SELECT
[Filter1].[PricingId1] AS [PricingId],
[Filter1].[TagNo] AS [TagNo],
[Filter1].[SellingRate1] AS [SellingRate],
CASE WHEN ([Filter1].[AvailableQuantity] IS NULL) THEN cast(0 as decimal(18)) ELSE [Filter1].[AvailableQuantity] END AS [C1],
[Filter1].[Name] AS [Name],
[Filter1].[UOMId 1] AS [UOMId ],
[Extent6].[BrandCode] AS [BrandCode],
[Filter1].[PartyCode] AS [PartyCode],
[Filter1].[Offer] AS [Offer]
FROM
(
SELECT [Extent1].[PricingId] AS [PricingId1], [Extent1].[PricingNo] AS [PricingNo], [Extent2].[UnitOfMeasurementId] AS [UnitOfMeasurementId1], [Extent3].[TagNo] AS [TagNo], [Extent3].[BrandId] AS [BrandId1], [Extent3].[SellingRate] AS [SellingRate1], [Extent3].[AvailableQuantity] AS [AvailableQuantity], [Extent3].[Offer] AS [Offer], [Extent4].[Name] AS [Name], [Extent5].[PartyCode] AS [PartyCode]
FROM [PanERP].[Pricing] AS [Extent1]
INNER JOIN [PanERP].[PricingDetail] AS [Extent2] ON [Extent1].[PricingId] = [Extent2].[PricingId]
INNER JOIN [PanERP].[PricingTagDetail] AS [Extent3] ON [Extent2].[PricingDetailId] = [Extent3].[PricingDetailId]
INNER JOIN [PanERP].[Item] AS [Extent4] ON [Extent3].[ItemId] = [Extent4].[ItemId]
INNER JOIN [PanERP].[Party] AS [Extent5] ON [Extent3].[PartyId] = [Extent5].[PartyId]
WHERE [Extent3].[AvailableQuantity] > cast(0 as decimal(18))
) AS [Filter1]
LEFT OUTER JOIN [PanERP].[Brand] AS [Extent6] ON [Filter1].[BrandId1] = [Extent6].[BrandId]
WHERE ([Filter1].[PricingNo] = @p__linq__0) OR (([Filter1].[PricingNo] IS NULL) AND (@p__linq__0 IS NULL))
But when I replace the constant in the condition
where tagDtl.AvailableQuantity > 0m
with a variable, EF generates a different SQL query, without the nested SELECT statement.
Here is the modified code:
decimal availableQuantity = 0m;
var pricingInfo = (from price in invDB.Pricing.AsNoTracking()
join priceDtl in invDB.PricingDetail.AsNoTracking() on price.PricingId equals priceDtl.PricingId
join tagDtl in invDB.PricingTagDetail.AsNoTracking() on priceDtl.PricingDetailId equals tagDtl.PricingDetailId
join item in invDB.Item.AsNoTracking() on tagDtl.ItemId equals item.ItemId
join party in invDB.Party.AsNoTracking() on tagDtl.PartyId equals party.PartyId
join brd in invDB.Brand.AsNoTracking() on tagDtl.BrandId equals brd.BrandId into t from brand in t.DefaultIfEmpty()
where tagDtl.AvailableQuantity > availableQuantity && price.PricingNo == printNumber
select new
{
TagNo = tagDtl.TagNo,
SellingRate = tagDtl.SellingRate,
Quantity = tagDtl.AvailableQuantity ?? availableQuantity,
ItemCode = item.Name,
UOMId = priceDtl.UOMId,
Brand = brand.BrandCode,
Supplier = party.PartyCode,
Offer = tagDtl.Offer
}).ToList();
And here is the SQL query without the nested SELECT statement:
SELECT
[Extent1].[PricingId] AS [PricingId],
[Extent3].[TagNo] AS [TagNo],
[Extent3].[SellingRate] AS [SellingRate],
CASE WHEN ([Extent3].[AvailableQuantity] IS NULL) THEN cast(0 as decimal(18)) ELSE [Extent3].[AvailableQuantity] END AS [C1],
[Extent4].[Name] AS [Name],
[Extent2].[UOMId ] AS [UOMId ],
[Extent6].[BrandCode] AS [BrandCode],
[Extent5].[PartyCode] AS [PartyCode],
[Extent3].[Offer] AS [Offer]
FROM [PanERP].[Pricing] AS [Extent1]
INNER JOIN [PanERP].[PricingDetail] AS [Extent2] ON [Extent1].[PricingId] = [Extent2].[PricingId]
INNER JOIN [PanERP].[PricingTagDetail] AS [Extent3] ON [Extent2].[PricingDetailId] = [Extent3].[PricingDetailId]
INNER JOIN [PanERP].[Item] AS [Extent4] ON [Extent3].[ItemId] = [Extent4].[ItemId]
INNER JOIN [PanERP].[Party] AS [Extent5] ON [Extent3].[PartyId] = [Extent5].[PartyId]
LEFT OUTER JOIN [PanERP].[Brand] AS [Extent6] ON [Extent3].[BrandId] = [Extent6].[BrandId]
WHERE ([Extent3].[AvailableQuantity] > @p__linq__0) AND (([Extent1].[PricingNo] = @p__linq__1) OR (([Extent1].[PricingNo] IS NULL) AND (@p__linq__1 IS NULL)))
If I move the where condition onto the source set as a lambda expression, like this
from price in inventoryDb.Pricing.AsNoTracking().Where(c =>
c.PricingNo == printNumber)
then it also works fine.
Why is LINQ generating a nested SELECT? How can we avoid this?
Thanks in advance for your answers.
Well, I think you have answered your own question in your comments. I will just try to clarify what is going on.
When you use a hard-coded constant, like 0m, the framework translates it into SQL keeping the value as a constant:
WHERE [Extent3].[AvailableQuantity] > cast(0 as decimal(18))
When you use a local variable, like “availableQuantity”, the framework creates a parameter:
([Extent3].[AvailableQuantity] > @p__linq__0)
I might be wrong, but as I see it, this is done to preserve the programmer's intent when writing the code (constant = constant, variable = parameter).
And what about the subquery?
This is query optimization logic (probably a poor one, at least in this scenario). When you run a query with parameters, you might run it several times and SQL Server can reuse the same execution plan, making the query faster; when you use constants, each distinct query needs to be compiled again (if you check SQL Server Activity Monitor, you will see that queries with parameters are treated as the same query, regardless of the parameter values).
So, in my opinion (sorry, I could not find any documentation about it), Entity Framework is trying to separate the query into two parts: the outer, generic one that uses parameters, and the inner, specific one that uses constants.
I would be happy if anyone could complement this with some Microsoft documentation on the subject…
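The constant-versus-variable distinction is visible before any SQL is produced, in the expression tree the C# compiler hands to the LINQ provider. The sketch below is only an illustration built on System.Linq.Expressions; the class and method names are hypothetical, and it uses a non-nullable decimal to keep the tree simple. It says nothing about how Entity Framework decides on the subquery, only about what the provider receives as input.

```csharp
using System;
using System.Linq.Expressions;

public static class ConstantVersusVariable
{
    // Reports the kind of node on the right-hand side of a comparison lambda.
    public static ExpressionType RightNodeType(Expression<Func<decimal, bool>> predicate)
    {
        var comparison = (BinaryExpression)predicate.Body;
        return comparison.Right.NodeType;
    }

    public static ExpressionType Literal()
    {
        // The literal is embedded in the tree as a constant node.
        return RightNodeType(q => q > 0m);
    }

    public static ExpressionType CapturedLocal()
    {
        decimal availableQuantity = 0m;
        // The local is captured in a closure, so the tree holds a member access
        // on the closure object instead of the value itself.
        return RightNodeType(q => q > availableQuantity);
    }
}
```

Literal() should report ExpressionType.Constant and CapturedLocal() should report ExpressionType.MemberAccess. A provider can therefore tell the two cases apart, which is consistent with the constant being inlined and the variable becoming a parameter.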

LINQ - select statement in the selected column

I want to convert the following query to LINQ:
SELECT TOP 100 S.TxID,
ToEmail,
[Subject],
ProcessedDate,
[Status] = (CASE WHEN EXISTS (SELECT TxID FROM TxBounceTracking
WHERE TxID = S.TxID)
THEN 'Bounced'
WHEN EXISTS (SELECT TxID FROM TxOpenTracking
WHERE TxID = S.TxID)
THEN 'Opened'
ELSE 'Sent' END)
FROM TxSubmissions S
WHERE S.UserID = @UserID
AND ProcessedDate BETWEEN @StartDate AND @EndDate
ORDER BY ProcessedDate DESC
The following code is the LINQ query I converted it to:
v = (from a in dc.TxSubmissions
where a.ProcessedDate >= datefrom && a.ProcessedDate <= dateto && a.UserID == userId
let bounce = (from up in dc.TxBounceTrackings where up.TxID == a.TxID select up)
let track = (from up in dc.TxOpenTrackings where up.TxID == a.TxID select up)
select new { a.TxID, a.ToEmail, a.Subject,
Status = bounce.Count() > 0 ? "Bounced" : track.Count() > 0 ? "Opened" : "Sent",
a.ProcessedDate });
However, this LINQ query is too slow because of the bounce and track tables. How should I change it so that it selects only one row, matching the subquery in the SQL above:
SELECT TxID FROM TxOpenTracking WHERE TxID = S.TxID
in my selected column, so that it executes faster?
Note that the table contains one million records, which is why it lags.
As you don't care about readability, because the query will be generated by EF anyway, you can try joining those two tables (it looks like TxID is an FK or a PK/FK).
More about JOIN vs SUB-QUERY here: Join vs. sub-query
Basically your SQL looks like this:
SELECT TOP 100 S.TxID, ToEmail, [Subject], ProcessedDate,
[Status] = (CASE WHEN BT.TxID IS NOT NULL
THEN 'Bounced'
WHEN OP.TxID IS NOT NULL
THEN 'Opened'
ELSE 'Sent' END)
FROM TxSubmissions S
LEFT JOIN TxBounceTracking BT ON S.TxID = BT.TxID
LEFT JOIN TxOpenTracking OP ON S.TxID = OP.TxID
WHERE S.UserID = @UserID
AND ProcessedDate BETWEEN @StartDate AND @EndDate
ORDER BY ProcessedDate DESC
Then you can try to convert it to LINQ, something like:
v = (from subs in dc.TxSubmissions.Where(sub => sub.ProcessedDate >= datefrom && sub.ProcessedDate <= dateto && sub.UserID == userId)
from bts in dc.TxBounceTrackings.Where(bt => bt.TxID == subs.TxID).DefaultIfEmpty()
from ots in dc.TxOpenTrackings.Where(ot => ot.TxID == subs.TxID).DefaultIfEmpty()
select new { });
More about left joins in LINQ here: LEFT JOIN in LINQ to entities?
Also, if you remove DefaultIfEmpty() you'll get an inner join.
You should also look at the generated SQL in both cases.
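Another option that stays closer to the original SQL is to keep the subqueries but ask whether a row exists with Any() instead of counting rows with Count() > 0. The sketch below runs over in-memory lists with LINQ to Objects; the record types, the StatusSketch class and the Latest method are hypothetical stand-ins for the real tables and are not taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the tables; only the columns used here.
public record TxSubmission(int TxID, int UserID, string ToEmail, string Subject, DateTime ProcessedDate);
public record TxTracking(int TxID);
public record TxStatusRow(int TxID, string ToEmail, string Subject, DateTime ProcessedDate, string Status);

public static class StatusSketch
{
    public static List<TxStatusRow> Latest(
        IEnumerable<TxSubmission> submissions,
        IEnumerable<TxTracking> bounces,
        IEnumerable<TxTracking> opens,
        int userId, DateTime dateFrom, DateTime dateTo)
    {
        return (from s in submissions
                where s.UserID == userId
                   && s.ProcessedDate >= dateFrom && s.ProcessedDate <= dateTo
                orderby s.ProcessedDate descending
                select new TxStatusRow(
                    s.TxID, s.ToEmail, s.Subject, s.ProcessedDate,
                    // Any() only asks whether a row exists; it does not count rows.
                    bounces.Any(b => b.TxID == s.TxID) ? "Bounced"
                    : opens.Any(o => o.TxID == s.TxID) ? "Opened"
                    : "Sent"))
               .Take(100)   // mirrors TOP 100
               .ToList();
    }
}
```

With a real data context you would query dc.TxSubmissions, dc.TxBounceTrackings and dc.TxOpenTrackings and project into an anonymous type, since LINQ providers often cannot translate a constructor call with arguments. Any() is normally translated to EXISTS, which matches the CASE WHEN EXISTS shape of the original query, but confirm that in the generated SQL.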

Linq to Entities complexity

I'm starting with Linq to entities and maybe someone could shed some light.
I have two tables - Vizite (parent table) and AngajatiVizite (child table). I'm using Database first, so I have created the relationship between them using Vizite.Id and AngajatiVizite.IdVizita.
I need to get the rows from Vizite plus one more bit field, which must be 0 if the DataStart or DataEnd field is null or the count of child records in AngajatiVizite is zero. That is, if the Vizite row has zero child records or either of those Data#### fields is null, the calculated field is 0.
So far so good, the linq I'm using works properly. The syntax I have used is this one:
var list = ctx.Vizite
.OrderBy(p => p.DataEnd != null && p.DataStart != null && p.AngajatiVizite.Count > 0)
.ThenBy(p => p.Data)
.Select(p => new
{
p.Id,
p.Numar,
p.Data,
p.DataStart,
p.DataEnd,
Programat = p.DataEnd != null && p.DataStart != null && p.AngajatiVizite.Count > 0
})
.ToList();
The SQL command generated by LINQ is extremely complex, and I don't understand why it has to be that complex or what difference it makes.
What I'm getting from linq is this:
SELECT
[Project6].[Numar] AS [Numar],
[Project6].[Id] AS [Id],
[Project6].[Data] AS [Data],
[Project6].[DataStart] AS [DataStart],
[Project6].[DataEnd] AS [DataEnd],
[Project6].[C2] AS [C1]
FROM ( SELECT
[Project5].[C1] AS [C1],
[Project5].[Id] AS [Id],
[Project5].[Numar] AS [Numar],
[Project5].[Data] AS [Data],
[Project5].[DataStart] AS [DataStart],
[Project5].[DataEnd] AS [DataEnd],
CASE WHEN ([Project5].[C2] > 0) THEN cast(1 as bit) WHEN ( NOT ([Project5].[C3] > 0)) THEN cast(0 as bit) END AS [C2]
FROM ( SELECT
[Project4].[C1] AS [C1],
[Project4].[Id] AS [Id],
[Project4].[Numar] AS [Numar],
[Project4].[Data] AS [Data],
[Project4].[DataStart] AS [DataStart],
[Project4].[DataEnd] AS [DataEnd],
[Project4].[C2] AS [C2],
(SELECT
COUNT(1) AS [A1]
FROM [dbo].[AngajatiVizite] AS [Extent5]
WHERE [Project4].[Id] = [Extent5].[IdVizita]) AS [C3]
FROM ( SELECT
[Project3].[C1] AS [C1],
[Project3].[Id] AS [Id],
[Project3].[Numar] AS [Numar],
[Project3].[Data] AS [Data],
[Project3].[DataStart] AS [DataStart],
[Project3].[DataEnd] AS [DataEnd],
(SELECT
COUNT(1) AS [A1]
FROM [dbo].[AngajatiVizite] AS [Extent4]
WHERE [Project3].[Id] = [Extent4].[IdVizita]) AS [C2]
FROM ( SELECT
CASE WHEN ([Project2].[C1] > 0) THEN cast(1 as bit) WHEN ( NOT ([Project2].[C2] > 0)) THEN cast(0 as bit) END AS [C1],
[Project2].[Id] AS [Id],
[Project2].[Numar] AS [Numar],
[Project2].[Data] AS [Data],
[Project2].[DataStart] AS [DataStart],
[Project2].[DataEnd] AS [DataEnd]
FROM ( SELECT
[Project1].[Id] AS [Id],
[Project1].[Numar] AS [Numar],
[Project1].[Data] AS [Data],
[Project1].[DataStart] AS [DataStart],
[Project1].[DataEnd] AS [DataEnd],
[Project1].[C1] AS [C1],
(SELECT
COUNT(1) AS [A1]
FROM [dbo].[AngajatiVizite] AS [Extent3]
WHERE [Project1].[Id] = [Extent3].[IdVizita]) AS [C2]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Numar] AS [Numar],
[Extent1].[Data] AS [Data],
[Extent1].[DataStart] AS [DataStart],
[Extent1].[DataEnd] AS [DataEnd],
(SELECT
COUNT(1) AS [A1]
FROM [dbo].[AngajatiVizite] AS [Extent2]
WHERE [Extent1].[Id] = [Extent2].[IdVizita]) AS [C1]
FROM [dbo].[Vizite] AS [Extent1]
) AS [Project1]
) AS [Project2]
) AS [Project3]
) AS [Project4]
) AS [Project5]
) AS [Project6]
when all I needed was actually this:
Select
Vizite.Id
, Vizite.Numar
, Vizite.Data
, Vizite.DataStart
, Vizite.DataEnd
, Case
When DataStart Is Not Null And DataEnd Is Not Null And (Select Count(Id) From AngajatiVizite Where Vizite.Id = AngajatiVizite.IdVizita) > 0 Then 1
Else 0
End As Programat
From Vizite
Order By Programat, Data
Can anyone please explain why the generated SQL is so complex that it is almost impossible to figure out just by reading it?
Thank you
Entity Framework doesn't build handsome queries, that's a fact; and it's a nuisance sometimes, because it can be really hard to trace logged SQL back to LINQ statements.
It would be a problem, though, if the query plan optimizer didn't know how to handle them. Fortunately, where SQL Server is concerned, the EF team has managed to make the generated SQL easier to optimize in each release since EF5. So generally you shouldn't worry about it too much, and should only start looking into it when performance is worse than can reasonably be expected.
There are some rules of thumb, though. One of them is to calculate computed values only once. This is where the let keyword comes in handy:
var list = (from p in ctx.Vizite
let Programat = p.DataEnd != null && p.DataStart != null
&& p.AngajatiVizite.Count > 0
orderby Programat, p.Data
select new
{
p.Id,
p.Numar,
p.Data,
p.DataStart,
p.DataEnd,
Programat
}).ToList();
This works well in LINQ query syntax. In fluent (method) syntax you can do the exact same thing, but that requires two subsequent Select statements.
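As a rough sketch of the two-Select form, the example below computes the flag once in a first Select, orders by it, and shapes the result in a second Select. It runs over an in-memory list with LINQ to Objects; the Vizita record, its ChildCount property (standing in for AngajatiVizite.Count) and the LetInMethodSyntax class are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the Vizite entity.
public record Vizita(int Id, DateTime Data, DateTime? DataStart, DateTime? DataEnd, int ChildCount);

public static class LetInMethodSyntax
{
    public static List<(int Id, bool Programat)> Ordered(IEnumerable<Vizita> vizite)
    {
        return vizite
            // First Select: compute the flag once, next to the row (what `let` does).
            .Select(p => new
            {
                Row = p,
                Programat = p.DataEnd != null && p.DataStart != null && p.ChildCount > 0
            })
            .OrderBy(x => x.Programat)
            .ThenBy(x => x.Row.Data)
            // Second Select: shape the final result, reusing the computed flag.
            .Select(x => (x.Row.Id, x.Programat))
            .ToList();
    }
}
```

In an Entity Framework query the final projection would be an anonymous type rather than a tuple, because tuple literals are not allowed in expression trees.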
What happens to the complexity of your SQL statement if you do the following?
var list = ctx.Vizite
.Select(p => new
{
p.Id,
p.Numar,
p.Data,
p.DataStart,
p.DataEnd,
Programat =
p.DataEnd != null && p.DataStart != null && p.AngajatiVizite.Count > 0
})
.OrderBy(p => p.Programat)
.ThenBy(p => p.Data)
.ToList();
My hope is that by not repeating p.DataEnd != null && p.DataStart != null && p.AngajatiVizite.Count > 0, and by moving the OrderBy and ThenBy after the Select, you'll get a simpler query.
Edit
To potentially simplify the SQL even further, you could opt to do some of the work after obtaining the raw data from the database:
var list = ctx.Vizite
.Select(p => new
{
p.Id,
p.Numar,
p.Data,
p.DataStart,
p.DataEnd,
AngajatiViziteCount = p.AngajatiVizite.Count
})
.AsEnumerable() // do the rest of the work using LINQ to objects
.OrderBy(p => p.DataEnd != null && p.DataStart != null && p.AngajatiViziteCount > 0)
.ThenBy(p => p.Data)
.ToList();

How to write linq query to prevent duplicates joins?

I have a query that searches for all accommodations in an order, sorted by day. When I check on the server which query is executed, I see multiple joins to the same table on the same keys.
var parcourt = this.DataService.From<OrderItem>()
.Where(i => i.OrderId == orderId && i.Product.ProductTypeId == (int)ProductTypes.Accommodation)
.OrderBy(i => i.DayNumber)
.ThenBy(i => i.OrderItemId)
.Select(i => new
{
i.OrderItemId,
i.DayNumber,
i.Product.Establishment.Address,
i.Product.Establishment.Coordinates
});
If you check the resulting SQL (as shown by ToTraceString), you can see two joins on the Products and Establishments tables.
SELECT
[Project1].[OrderItemId] AS [OrderItemId],
[Project1].[DayNumber] AS [DayNumber],
[Project1].[Address] AS [Address],
[Project1].[EstablishmentId] AS [EstablishmentId],
[Project1].[Latitude] AS [Latitude],
[Project1].[Longitude] AS [Longitude]
FROM ( SELECT
[Extent1].[OrderItemId] AS [OrderItemId],
[Extent1].[DayNumber] AS [DayNumber],
[Extent4].[Address] AS [Address],
[Extent5].[EstablishmentId] AS [EstablishmentId],
[Extent5].[Latitude] AS [Latitude],
[Extent5].[Longitude] AS [Longitude]
FROM [dbo].[OrderItems] AS [Extent1]
INNER JOIN [dbo].[Products] AS [Extent2] ON [Extent1].[ProductId] = [Extent2].[ProductId]
LEFT OUTER JOIN [dbo].[Products] AS [Extent3] ON [Extent1].[ProductId] = [Extent3].[ProductId]
LEFT OUTER JOIN [dbo].[Establishments] AS [Extent4] ON [Extent3].[EstablishmentId] = [Extent4].[EstablishmentId]
LEFT OUTER JOIN [dbo].[Establishments] AS [Extent5] ON [Extent3].[EstablishmentId] = [Extent5].[EstablishmentId]
WHERE (1 = [Extent2].[ProductTypeId]) AND ([Extent1].[OrderId] = @p__linq__0)
) AS [Project1]
ORDER BY [Project1].[DayNumber] ASC, [Project1].[OrderItemId] ASC
How can I prevent this LINQ to Entities query from joining twice on the same table? How can I rewrite the query to avoid this situation?
The table structure is as follows (simplified): [diagram not available]
Could you try this query? I think that if you write all your joins explicitly, EF will not create joins automatically.
var parcourt = (from i in this.DataService.OrderItem
join p in this.DataService.Product on i.ProductId equals p.ProductId
join e in this.DataService.Establishments on p.EstablishmentId equals e.EstablishmentId
where i.OrderId == orderId && p.ProductTypeId == (int)ProductTypes.Accommodation
orderby i.DayNumber, i.OrderItemId
select new
{
i.OrderItemId,
i.DayNumber,
e.Address,
e.Coordinates
});
