LINQ to entities slow query - c#

I have a serious problem with SQL Server query execution time that I've been debugging for a long time with no success.
Basically I'm generating a report where 'Order' statistics are grouped by order and shown to the user. The problem is that most of the time the query runs reasonably fast, but occasionally performance plummets and causes timeouts on the server.
As far as I can tell, the occasional poor performance seems to be caused by parameter sniffing in SQL Server. My relations have widely varying numbers of related rows: one parent row might have 10,000 related rows while the next has only 1. I think this causes the query optimizer in some cases to ignore indexes completely, which results in really poor performance.
I have no idea how to approach fixing this. I either have to optimize the query below somehow or come up with some way to force the query optimizer to use indexes every time. Stored procedures are not an option in this project, unfortunately.
I've tried issuing an independent request for every 'Order', but as there are over 1000 orders in the system, that causes massive slowness and really isn't an option. The closest I've gotten to a reasonable execution time is the query below, which in turn seems to suffer from the parameter sniffing problem.
result = (from ord in db.Orders
join comm in db.Comments.Where(i =>
i.UserId == userId &&
i.Created >= startDate &&
i.Created < endDate &&
i.UserGroupId == channelId &&
i.Campaign.CampaignCountryId == countryId &&
(i.CommentStatus.Name == CommentStatus.Approved || i.CommentStatus.Name == CommentStatus.Pending))
on ord.OrderId equals comm.OrderId into Comments
join motif in db.Motifs.Where(i =>
i.UserId == userId &&
i.Created > startDate &&
i.Created < endDate &&
i.UserGroupId == channelId && i.Campaign.CampaignCountryId == countryId)
on ord.OrderId equals motif.OrderId into Motifs
where ord.EndDate > startDate
select new ReportRow()
{
OrderName = ord.Name,
OrderId = ord.OrderId,
ChannelId = channelId,
Comments = Comments.Count(c => c.CommentStatus.Name == CommentStatus.Approved),
PendingComments = Comments.Count(c => c.CommentStatus.Name == CommentStatus.Pending),
Motifs = Motifs.Count(),
UniqueMotifs = Motifs.GroupBy(c => c.Uin).Count(),
ApprovedValue = ((decimal?)Motifs.GroupBy(c => c.Uin).Select(c => c.FirstOrDefault()).Sum(c => c.Value) ?? 0) + ((decimal?)Comments.Where(c => c.CommentStatus.Name == CommentStatus.Approved).Sum(c => c.Value) ?? 0),
PendingValue = ((decimal?)Comments.Where(c => c.CommentStatus.Name == CommentStatus.Pending).Sum(c => c.Value) ?? 0)
}).ToList();
return result;
Any help and ideas about how to make my reporting run reasonably fast every time would be greatly appreciated - no matter if it's query optimizing itself or some awesome ideas for reporting in SQL in general.
I'm using Azure SQL if that makes any difference.
Also note that when I'm running the generated query from the LINQ above in SSMS I get good query execution time on every single run so the DB design shouldn't be a problem here albeit it might not be the most efficient solution anyway.

Maybe not an answer for you, just a thought. You could create a view for your report and then query the view to get your results. From what you are saying, this would ensure that the query runs fine in SQL every time.
You use views much like tables and can run any query against them.
Check this post for some tips on consuming views in EF.

Related

Multiple LINQ to SQL queries. Enumeration optimisation

I have a list of ~ 15,000 'team's that need an individual linq query to return results.
Namely - [Select the last 10 games for 'team']
public IEnumerable<ResultsByDate> SelectLast10Games(DateTime date, string team)
{
return
( from e in db.E0s
where e.DateFormatted < date &&
(e.HomeTeam == team || e.AwayTeam == team)
orderby e.Date descending
select new ResultsByDate
{
Date = e.Date,
HomeTeam = e.HomeTeam,
AwayTeam = e.AwayTeam,
HomeGoals = e.FTHG,
AwayGoals = e.FTAG
}
).Take(10);
}
This query is probably fine; it seems fast enough when called 15,000 times.
My real issue is that I have to enumerate each query and this really kills the performance.
For each of these queries I need to run a method on the 10 results and hence the queries need enumerating.
The question is how can I avoid 15,000 enumerations?
I thought about placing each of the results into a big list and then calling .ToList() or whatever's best, but adding to a List enumerates as it goes along so this doesn't seem viable.
Is there a way to combine all 15,000 LINQ queries into one giant LINQ query, such as:
public IEnumerable<ResultsByDate> SelectLast10Games(DateTime date, List<string> Teams)
{
foreach(var team in Teams)
{ var query =
(from e in db.E0s
where e.DateFormatted < date &&
(e.HomeTeam == team || e.AwayTeam == team)
orderby e.Date descending
select new ResultsByDate
{
Date = e.Date,
HomeTeam = e.HomeTeam,
AwayTeam = e.AwayTeam,
HomeGoals = e.FTHG,
AwayGoals = e.FTAG
}
).Take(10);
}
}
So this would return one huge result set that I can then enumerate in one go and work from there?
I have tried, but I can't seem to get the LINQ loop correct (if it's even possible, and the best way to fix my issue).
The whole program takes ~ 29 minutes to complete. Without the enumeration it's around 30 seconds, which is not amazing but satisfactory given the criteria.
Thanks!
This can be accomplished using Teams.Select(team => ..)
var query = Teams
.Select(team =>
db.E0s
.Where(e => e.DateFormatted < date && (e.HomeTeam == team || e.AwayTeam == team))
.OrderByDescending(e => e.Date)
.Select(
e =>
new ResultsByDate {
Date = e.Date,
HomeTeam = e.HomeTeam,
AwayTeam = e.AwayTeam,
HomeGoals = e.FTHG,
AwayGoals = e.FTAG
}
)
.Take(10)
);
If you're looking for the best performance for heavy querying, you should consider using a SQL stored procedure and calling it through ADO.NET, Dapper or Entity Framework (listed in order from the optimal to the trivial). My recommendation is Dapper. This will speed up your query, especially if the table is indexed correctly.
To feed 15k parameters to the server efficiently, you can use a table-valued parameter (TVP):
http://blog.mikecouturier.com/2010/01/sql-2008-tvp-table-valued-parameters.html
"My real issue is that I have to enumerate each query and this really kills the performance."
Unless you enumerate the result, there is no call to the server. So no wonder it is fast without enumeration. But that does not mean that the enumeration is the problem.
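To illustrate the point about enumeration, here is a minimal sketch using LINQ to Objects rather than a real database. The Expensive method and the Calls counter are hypothetical stand-ins for the server round trip; they are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the (hypothetical) expensive step has run.
    public static int Calls;

    // Stand-in for the work a database call would do.
    private static int Expensive(int x)
    {
        Calls++;
        return x * 2;
    }

    public static List<int> Run()
    {
        Calls = 0;
        var source = new[] { 1, 2, 3 };

        // Building the query does no work yet.
        IEnumerable<int> query = source.Select(Expensive);
        int callsBeforeEnumeration = Calls; // still 0 at this point

        // Enumerating (ToList, foreach, Count, ...) is what triggers the work.
        List<int> results = query.ToList();

        Console.WriteLine($"before: {callsBeforeEnumeration}, after: {Calls}");
        return results;
    }
}
```

Calling DeferredExecutionDemo.Run() prints "before: 0, after: 3". The 30 second run without enumeration is therefore most likely measuring only the construction of the queries, not their execution.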

Relatively complex Query in EntitySpaces returns different results than when executed for real

I've got (what I consider to be) a relatively complex SQL query to select some data for a new feature for an app I maintain at work. Here is the query.
SELECT pchtq.[sourceappid],
pchtq.[transactioncode],
pchtq.[accountid],
pchtq.[year],
pchtq.[filingdatetime],
pchtq.[amount],
pchtq.[note],
pchtq.Status,
pchtq.SourceRecordID,
utaq.fullname,
utaq.username
FROM [database].[dbo].[transactions1] pchtq
INNER JOIN [database].[dbo].[useraccounts] utaq
ON pchtq.[accountid] = utaq.[accountid]
WHERE
pchtq.status = 'failed'
AND pchtq.[recordcreatedatetime] >= '01/01/2015'
AND pchtq.[recordcreatedatetime] <= '05/11/2016'
and Not exists
(
select TransactionID
from
Database.dbo.Transactions2 eft
where
CAST(eft.TransactionID as nvarchar(50)) = pchtq.SourceRecordID
and eft.status IN( 'paid', 'audit' )
)
In entityspaces, I have it written like this:
pchtq
.Select(
pchtq.SourceAppID,
pchtq.TransactionCode,
pchtq.AccountID,
pchtq.Year,
pchtq.FilingDateTime,
pchtq.Amount,
pchtq.Note
).InnerJoin(utaq).On(pchtq.AccountID == utaq.AccountID)
.Where(pchtq.RecordCreateDateTime >= request.StartDate
&& pchtq.RecordCreateDateTime <= request.EndDate
&& pchtq.Status == "FAILED")
.NotExists(
eftq.Select(
eftq.EFileTransactionID
).Where(eftq.Status.In("Paid", "Audit")
&& Convert.ToString(eftq.TransactionID) == pchtq.SourceRecordID));
However, when I run it in the ES app (using pchtc.Load(pchtq)), I get about 7500 rows, whereas when I run the SQL query, I get about 1500.
What's going wrong here?
Maybe it's the fact that the strings you have differ in case? "FAILED" vs "failed" and "Audit" vs "audit"?
It's the only difference I can see honestly.
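Whether the difference in case actually matters depends on the collation of the columns involved, which the question does not state. As a rough analogy only, this sketch shows the two behaviours in plain C#; the class and method names are made up for illustration and have nothing to do with EntitySpaces.

```csharp
using System;

public static class CaseComparisonDemo
{
    // Behaves like a case-sensitive collation: "FAILED" and "failed" differ.
    public static bool OrdinalEquals(string a, string b) =>
        string.Equals(a, b, StringComparison.Ordinal);

    // Behaves like a case-insensitive collation: "FAILED" and "failed" match.
    public static bool IgnoreCaseEquals(string a, string b) =>
        string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
}
```

If the database compares case-insensitively, the casing alone would not explain the different row counts, so it is worth checking the collation before changing the literals.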

very slow IOrderedQueryable ToList()

I have a query that returns parents with filtered children:
Context.ContextOptions.LazyLoadingEnabled = false;
var query1 = (from p in Context.Partners
where p.PartnerCategory.Category == "03"
|| p.PartnerCategory.Category == "02"
select new
{
p,
m = from m in p.Milk
where m.Date >= beginDate
&& m.Date <= endDate
&& m.MilkStorageId == milkStorageId
select m,
e = p.ExtraCodes,
ms = from ms in p.ExtraCodes
select ms.MilkStorage,
mp = from mp in p.MilkPeriods
where mp.Date >= beginDate
&& mp.Date <= endDate
select mp
}).Where(
p =>
p.p.ExtraCodes.Select(ex => ex.MilkStorageId).Contains(
milkStorageId) ).OrderBy(p => p.p.Name);
var partners = query1.AsEnumerable().ToList();
The query returns 200 records, and converting from IOrderedQueryable with ToList() is very slow. Why?
After profiling the query in SQL Server Management Studio, I noticed that it executes in 1 second and returns 2035 records.
There could be a number of reasons for this, and without any profiler information it's just guesswork. Even highly educated guesswork by someone who knows the code and domain well is often wrong.
You should profile the code, and since the bottleneck is likely in the DB, get the command text as @Likurg suggests and profile that in the DB. It's likely that you are missing one or more indexes.
There are a few things you could do to the query itself as well, if for nothing else than to make it easier to understand and potentially faster.
E.g.
p.p.ExtraCodes.Select(ex => ex.MilkStorageId).Contains(milkStorageId)
is really
p.p.ExtraCodes.Any(ex => ex.MilkStorageId == milkStorageId)
and could be moved to the first where clause, potentially lowering the number of anonymously typed objects you create. That said, the most likely cause is that one of the many fields you use in your comparisons lacks an index, potentially resulting in a table scan for each element in the result set.
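A small sketch, using in-memory collections, of why the two forms are interchangeable. The ExtraCode class here is a simplified, hypothetical stand-in for the real entity, and this says nothing about which SQL a given LINQ provider generates for either form.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusContains
{
    // Hypothetical, simplified stand-in for the ExtraCode entity.
    public sealed class ExtraCode
    {
        public int MilkStorageId { get; set; }
    }

    // Original form: project the ids, then look for the value.
    public static bool ViaContains(IEnumerable<ExtraCode> codes, int milkStorageId) =>
        codes.Select(ex => ex.MilkStorageId).Contains(milkStorageId);

    // Suggested form: ask directly whether any element matches.
    public static bool ViaAny(IEnumerable<ExtraCode> codes, int milkStorageId) =>
        codes.Any(ex => ex.MilkStorageId == milkStorageId);
}
```

Both methods return the same answer for any input; the Any form simply states the intent directly.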
Some of the fields where an index might speed things up are
p.p.Name
m.Date
m.MilkStorageId
mp.Date
PartnerCategory.Category
It is slow because ToList is the point at which the query actually executes. This is called deferred execution.
You may see: LINQ and Deferred Execution
I don't think you need AsEnumerable when converting it to a list; you can do it directly:
var partners = query1.ToList();
First, look at the generated query using
Context.GetCommand(query1).CommandText;
Then run that command against the database and check with the profiler how many records it reads.

A C# Linq to Sql query that uses SUM, Case When, Group by, outer join, aggregate and defaults

I've been searching for possible solutions and attempting this for several hours without luck. Any help will be greatly appreciated.
I've got a Sql statement which I'm trying to put together as a C# LINQ query.
Here is the working SQL:
SELECT up.UserProfileID
,up.FirstName
,up.LastName
,SUM(CASE WHEN ul.CompletionDate IS NULL THEN 0
ELSE ISNULL(ul.Score, 0)
END) AS TotalScore
FROM dbo.UserProfile up
LEFT OUTER JOIN dbo.UserLearning ul ON up.UserProfileID = ul.UserProfileID
WHERE up.ManagerUserProfileID IS NULL
GROUP BY up.UserProfileID, up.FirstName, up.LastName
I've tried several different ways but seem to end up with either a statement that doesn't return what I want or one that doesn't execute successfully.
My current (non-working) code looks something like this:
var pd = from up in db.UserProfiles
join ul in db.UserLearnings on up.UserProfileID equals ul.UserProfileID into temp
from upJOINul in temp.DefaultIfEmpty(new UserLearning() { Score = 0 })
where up.ManagerUserProfileID.Equals(null)
group up by new
{
UserProfileID = up.UserProfileID,
FirstName = up.FirstName,
LastName = up.LastName,
TotalScore = up.UserLearnings.Sum(u => u.Score)
};
Thank you for any help
After several more attempts and further use of Google I finally managed to get a working solution. I hope it'll be of use to someone else.
var pd = db.UserProfiles.AsEnumerable()
.Where(up => up.ManagerUserProfileID.Equals(null))
.Select(up => new
{
UserProfileID = up.UserProfileID,
FirstName = up.FirstName,
LastName = up.LastName,
TotalScore = up.UserLearnings
.Where(ul => ul.CompletionDate.HasValue && ul.Score.HasValue)
.DefaultIfEmpty()
.Sum(ul => ul != null && ul.Score.HasValue ? ul.Score : 0)
});
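Note that the AsEnumerable() call above moves the rest of the query into memory. As an alternative shape, here is a sketch of the same aggregation written as a group join, which mirrors the LEFT OUTER JOIN plus SUM(CASE ...) of the SQL. It runs against in-memory lists; the record types are simplified, hypothetical stand-ins for the real entities, and whether a particular LINQ provider translates this shape into a single SQL statement is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TotalScoreDemo
{
    // Hypothetical, simplified stand-ins for the UserProfile and UserLearning rows.
    public sealed record UserProfile(int UserProfileID, string FirstName, string LastName, int? ManagerUserProfileID);
    public sealed record UserLearning(int UserProfileID, DateTime? CompletionDate, int? Score);
    public sealed record ProfileScore(int UserProfileID, string FirstName, string LastName, int TotalScore);

    public static List<ProfileScore> Totals(
        IEnumerable<UserProfile> profiles,
        IEnumerable<UserLearning> learnings) =>
        (from up in profiles
         where up.ManagerUserProfileID == null
         // "into rows" makes this a group join: a profile without learning rows
         // still appears, with an empty group (the LEFT OUTER JOIN behaviour).
         join ul in learnings on up.UserProfileID equals ul.UserProfileID into rows
         select new ProfileScore(
             up.UserProfileID,
             up.FirstName,
             up.LastName,
             // SUM(CASE WHEN CompletionDate IS NULL THEN 0 ELSE ISNULL(Score, 0) END)
             rows.Sum(ul => ul.CompletionDate.HasValue ? (ul.Score ?? 0) : 0)))
        .ToList();
}
```

Because each profile already produces exactly one row with its own group, no explicit group by is needed; the sum over an empty group is 0, which covers the users without any learning rows.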
Not what you asked for, but if you have a working complex SQL query that is fairly static, put it in a stored proc and drag that SP onto your LINQ DataContext.
The LINQ provider has to compile your query to SQL every time it's called, and that takes time and server CPU cycles. If it's a complex query, it can eat up significant resources. It may also miss some optimizations you can do with straight SQL.
Unless of course there is a purpose to it.
If you have an ORM problem, grab the actual SQL commands, take a look at them, and compare them with what you want to achieve. Can you show the generated SQL as well, so we can find the difference more easily?

C# - Linq-To-SQL - Issue with queries

I am thoroughly frustrated right now. I am having an issue with LINQ-To-SQL. About 80% of the time, it works great and I love it. The other 20% of the time, the query that L2S creates returns the correct data, but when actually running it from code, it doesn't return anything. I am about to pull my hair out. I am hoping somebody can see a problem or has heard of this before. Google searching isn't returning much of anything.
Here is the linq query...
var query = from e in DataLayerGlobals.GetInstance().db.MILLERTIMECARDs
where e.deleted_by == -1
&& e.LNAME == lastName
&& e.FNAME == firstName
&& e.TIMECARDDATE == startDate.ToString("MM/dd/yyyy")
group e by e.LNAME into g
select new EmployeeHours
{
ContractHours = g.Sum(e => e.HRSCONTRACT),
MillerHours = g.Sum(e => e.HRSSHOWRAIN + e.HRSOTHER),
TravelHours = g.Sum(e => e.HRSTRAVEL)
};
This is the generated query....
SELECT SUM([t0].[HRSCONTRACT]) AS [ContractHours],
SUM([t0].[HRSSHOWRAIN] + [t0].[HRSOTHER]) AS [MillerHours],
SUM([t0].[HRSTRAVEL]) AS [TravelHours]
FROM [dbo].[MILLERTIMECARD] AS [t0]
WHERE ([t0].[deleted_by] = @p0)
AND ([t0].[LNAME] = @p1)
AND ([t0].[FNAME] = @p2)
AND ([t0].[TIMECARDDATE] = @p3)
GROUP BY [t0].[LNAME]
Now when I plug in the EXACT same values that the linq query is using into the generated query, I get the correct data. When I let the code run, I get nothing.
Any ideas?
What type is TIMECARDDATE? Date, datetime, datetime2, smalldatetime, datetimeoffset or character?
Any chance local date/time settings are messing up the date comparison of startDate.ToString(...)? Since you're sending @p3 as a string, 01/02/2009 may mean Feb 1st or January 2nd, depending on the date/time setting on the server.
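To make the ambiguity concrete, here is a short sketch. It also covers a related detail on the client side: in a .NET custom format string, "/" stands for the current culture's date separator, so ToString("MM/dd/yyyy") without an explicit culture can produce something other than slashes on some machines. The class and method names are made up for illustration.

```csharp
using System;
using System.Globalization;

public static class DateAmbiguityDemo
{
    // Reads the text as month/day/year.
    public static DateTime AsMonthFirst(string text) =>
        DateTime.ParseExact(text, "MM/dd/yyyy", CultureInfo.InvariantCulture);

    // Reads the very same text as day/month/year.
    public static DateTime AsDayFirst(string text) =>
        DateTime.ParseExact(text, "dd/MM/yyyy", CultureInfo.InvariantCulture);

    // Formats with the invariant culture so "/" is always emitted as a literal slash.
    public static string ToFixedFormat(DateTime date) =>
        date.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture);
}
```

AsMonthFirst("01/02/2009") gives January 2nd, while AsDayFirst("01/02/2009") gives February 1st. If TIMECARDDATE turns out to be a date type rather than a character column, comparing DateTime values instead of strings would avoid the issue entirely; that is an assumption, since the column type is not stated.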
My instinct is telling me that you need to be pulling out DataLayerGlobals.GetInstance().db.MILLERTIMECARDs into an IQueryable variable and executing your Linq query against that, although there really should be no difference at all (other than maybe better readability).
You can check the results of the IQueryable variable first, before running the Linq query against it.
To extend this concept a bit further, you can create a series of IQueryable variables that each store the results of a Linq query using each individual condition in the original query. In this way, you should be able to isolate the condition that is failing.
I'd also have a look at the LNAME & FNAME data types. If they're NCHAR/NVARCHAR you may need to Trim the records, e.g.
var query = from e in DataLayerGlobals.GetInstance().db.MILLERTIMECARDs
where e.deleted_by == -1
&& e.LNAME.Trim() == lastName
&& e.FNAME.Trim() == firstName
&& e.TIMECARDDATE == startDate.ToString("MM/dd/yyyy")
group e by e.LNAME into g
select new EmployeeHours
{
ContractHours = g.Sum(e => e.HRSCONTRACT),
MillerHours = g.Sum(e => e.HRSSHOWRAIN + e.HRSOTHER),
TravelHours = g.Sum(e => e.HRSTRAVEL)
};
