Curious slowness of EF vs SQL - C#

In a heavily multi-threaded scenario, I have problems with a particular EF query. It's generally cheap and fast:
Context.MyEntity
.Any(se => se.SameEntity.Field == someValue
&& se.AnotherEntity.Field == anotherValue
&& se.SimpleField == simpleValue
// few more simple predicates with fields on the main entity
);
This compiles into a very reasonable SQL query:
SELECT
CASE WHEN ( EXISTS (SELECT
1 AS [C1]
FROM (SELECT [Extent1].[Field1] AS [Field1]
FROM [dbo].[MyEntity] AS [Extent1]
INNER JOIN [dbo].[SameEntity] AS [Extent2] ON [Extent1].[SameEntity_Id] = [Extent2].[Id]
WHERE (N'123' = [Extent2].[SimpleField]) AND (123 = [Extent1].[AnotherEntity_Id]) AND -- further simple predicates here -- ) AS [Filter1]
INNER JOIN [dbo].[AnotherEntity] AS [Extent3] ON [Filter1].[AnotherEntity_Id1] = [Extent3].[Id]
WHERE N'123' = [Extent3].[SimpleField]
)) THEN cast(1 as bit) ELSE cast(0 as bit) END AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable1]
The query generally has an optimal query plan, uses the right indices and returns in tens of milliseconds, which is completely acceptable.
However, when a critical number of threads (<=40) starts executing this query, its performance drops to tens of seconds.
There are no locks in the database, no queries are writing data to these tables, and it reproduces very well with a database that's practically isolated from any other operations. The DB resides on the same physical machine. I originally believed the machine was not overloaded at any point, i.e. had plenty of spare CPU, memory and other resources, but as the update below shows, the CPU is in fact heavily loaded by this operation.
Now what's really bizarre is that when I replace the EF Any() call with Context.Database.ExecuteSqlCommand() running the copy-pasted SQL (also using parameters), the problem magically disappears. Again, this reproduces very reliably: replacing the Any() call with copy-pasted SQL improves performance by 2-3 orders of magnitude.
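For reference, here is a minimal sketch of what the raw-SQL replacement might look like. Since ExecuteSqlCommand() returns an affected-row count, the existence check below goes through Database.SqlQuery instead; the table, column and parameter names are illustrative, not the actual schema:
using System.Data.SqlClient; // for SqlParameter (EF6 on the full framework assumed)

bool exists = Context.Database.SqlQuery<int>(
    @"SELECT CASE WHEN EXISTS (
          SELECT 1
          FROM dbo.MyEntity e
          JOIN dbo.SameEntity s ON e.SameEntity_Id = s.Id
          JOIN dbo.AnotherEntity a ON e.AnotherEntity_Id = a.Id
          WHERE s.Field = @sameValue
            AND a.Field = @anotherValue
            AND e.SimpleField = @simpleValue)
      THEN 1 ELSE 0 END",
    new SqlParameter("@sameValue", someValue),
    new SqlParameter("@anotherValue", anotherValue),
    new SqlParameter("@simpleValue", simpleValue))
    .Single() == 1;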
A sampling profiler (dotTrace) attached to the process shows that the threads all seem to spend their time in the same method call (the profiling snapshot is not reproduced here).
Is there anything I've missed, or did we hit some ADO.NET / SQL Server corner case?
MORE CONTEXT
The code running this query is a Hangfire job. For the purpose of the test, a script queues a lot of jobs and up to 40 threads keep processing them. Each job uses a separate DbContext instance, and the context itself is not used heavily. There are a few more queries before and after the problematic one and they take the expected time to execute.
We use many different Hangfire jobs for similar purposes and they behave as expected. So does this one, except that it gets slow under high concurrency (of the exact same jobs), and just switching this particular query to raw SQL fixes the problem.
The profiling snapshot is representative: all the threads slow down on this particular method call and spend the vast majority of their time on it.
UPDATE
I'm currently re-running a lot of those checks for sanity and errors. The setup that reproduces the problem easily is on a remote machine to which I can't connect with VS for debugging.
One of the checks showed that my previous statement about free CPU was false: the CPU was not entirely maxed out, but multiple cores were in fact running at full capacity for the whole duration of the long-running jobs.
Re-checking everything again and will come back with updates here.

Can you try the query as shown below and see whether there is any performance improvement?
Context.MyEntity.AsNoTracking()
.Any(se => se.SameEntity.Field == someValue
&& se.AnotherEntity.Field == anotherValue
&& se.SimpleField == simpleValue
);

Check if you are reusing the context in a loop. Doing so may create many objects during your performance test and give the garbage collector a lot of work to do.
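If context reuse turns out to be the issue, here is a minimal sketch of the per-job pattern; the MyDbContext name and the job input shape are illustrative:
// One short-lived context per iteration, disposed promptly, so tracked entities
// and change-tracker state don't accumulate across jobs.
foreach (var input in jobInputs)
{
    using (var context = new MyDbContext())
    {
        bool exists = context.MyEntity
            .AsNoTracking()
            .Any(se => se.SimpleField == input.SimpleValue);
        // ... rest of the job ...
    }
}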

Faulty initial assumptions. The SQL in the question was obtained by pasting the code into LINQPad and having it generate the SQL.
After attaching an SQL profiler to the actual DB in use, it showed slightly different SQL involving outer joins, which are suboptimal and didn't have a proper index in place.
It remains a mystery why LINQPad generated different SQL even though it uses the same EntityFramework.dll, but the original problem is resolved and all that remains is to optimize the query.
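For anyone hitting something similar: EF6 can log the SQL it actually sends, which avoids relying on LINQPad's translation. A minimal sketch (the context name and logging target are just examples):
using System.Diagnostics;

using (var context = new MyContext()) // MyContext stands in for the real context type
{
    // Mirrors every command EF sends to the database into the debug output.
    context.Database.Log = sql => Debug.WriteLine(sql);
    bool exists = context.MyEntity.Any(se => se.SimpleField == simpleValue);
}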
Many thanks for everyone involved.

Related

LINQ Search nvarchar(MAX) column extremely slow using .Contains()

I have a .NET Core API and I am trying to search 4.4 million records using .Contains(). This is obviously extremely slow - 26 seconds. I am only querying one column, which is the name of the record. How is this problem generally solved when dealing with millions of records?
I have never worked with millions of records before so apart from the obvious altering of the .Select and .Take, I haven't tried anything too drastic. I have spent many hours on this though.
The other filters included in the .Where are only used when a user chooses to use them on the front end - The real problem is just searching by CompanyName.
Note: I am using .ToArray() when returning the results.
I have indexes in the database but cannot add one for CompanyName as it is nvarchar(MAX).
I have also looked at the execution plan and it doesn't really show anything out of the ordinary.
query = _context.Companies
    .Where(c =>
        c.CompanyName.Contains(paging.SearchCriteria.companyNameFilter.ToUpper())
        && c.CompanyNumber.StartsWith(
            string.IsNullOrEmpty(paging.SearchCriteria.companyNumberFilter)
                ? paging.SearchCriteria.companyNumberFilter.ToUpper()
                : "")
        && c.IncorporationDate > paging.SearchCriteria.companyIncorperatedGreaterFilter
        && c.IncorporationDate < paging.SearchCriteria.companyIncorperatedLessThanFilter)
    .Select(x => new Company()
    {
        CompanyName = x.CompanyName,
        IncorporationDate = x.IncorporationDate,
        CompanyNumber = x.CompanyNumber
    })
    .Take(10);
I expect the query to take around 1-2 seconds, as a LIKE query I execute in SSMS takes about 1-2 seconds.
Here is the code being submitted to DB:
Microsoft.EntityFrameworkCore.Database.Command: Information: Executing DbCommand [Parameters=[@__p_4='?' (DbType = Int32), @__ToUpper_0='?' (Size = 4000), @__p_1='?' (Size = 4000), @__paging_SearchCriteria_companyIncorperatedGreaterFilter_2='?' (DbType = DateTime2), @__paging_SearchCriteria_companyIncorperatedLessThanFilter_3='?' (DbType = DateTime2), @__p_5='?' (DbType = Int32)], CommandType='Text', CommandTimeout='30']
SELECT [t].[CompanyName], [t].[IncorporationDate], [t].[CompanyNumber]
FROM (
SELECT TOP(@__p_4) [c].[CompanyName], [c].[IncorporationDate], [c].[CompanyNumber], [c].[ID]
FROM [Companies] AS [c]
WHERE (((((@__ToUpper_0 = N'') AND @__ToUpper_0 IS NOT NULL) OR (CHARINDEX(@__ToUpper_0, [c].[CompanyName]) > 0)) AND (((@__p_1 = N'') AND @__p_1 IS NOT NULL) OR ([c].[CompanyNumber] IS NOT NULL AND (@__p_1 IS NOT NULL AND (([c].[CompanyNumber] LIKE [c].[CompanyNumber] + N'%') AND (((LEFT([c].[CompanyNumber], LEN(@__p_1)) = @__p_1) AND (LEFT([c].[CompanyNumber], LEN(@__p_1)) IS NOT NULL AND @__p_1 IS NOT NULL)) OR (LEFT([c].[CompanyNumber], LEN(@__p_1)) IS NULL AND @__p_1 IS NULL))))))) AND ([c].[IncorporationDate] > @__paging_SearchCriteria_companyIncorperatedGreaterFilter_2)) AND ([c].[IncorporationDate] < @__paging_SearchCriteria_companyIncorperatedLessThanFilter_3)
) AS [t]
ORDER BY [t].[IncorporationDate] DESC
OFFSET @__p_5 ROWS FETCH NEXT @__p_4 ROWS ONLY
SOLVED! With the help of both answers!
In the end, as suggested, I tried full-text searching, which was lightning fast but compromised the accuracy of the search results. In order to filter those results more accurately, I applied .Contains to the query after the full-text search.
Here is the code that works. Hopefully this helps others.
//query = _context.Companies
//.Where(c => c.CompanyName.StartsWith(paging.SearchCriteria.companyNameFilter.ToUpper())
//&& c.CompanyNumber.StartsWith(string.IsNullOrEmpty(paging.SearchCriteria.companyNumberFilter) ? paging.SearchCriteria.companyNumberFilter.ToUpper() : "")
//&& c.IncorporationDate > paging.SearchCriteria.companyIncorperatedGreaterFilter && c.IncorporationDate < paging.SearchCriteria.companyIncorperatedLessThanFilter)
//.Select(x => new Company() { CompanyName = x.CompanyName, IncorporationDate = x.IncorporationDate, CompanyNumber = x.CompanyNumber }).Take(10);
query = _context.Companies.Where(c => EF.Functions.FreeText(c.CompanyName, paging.SearchCriteria.companyNameFilter.ToUpper()));
query = query.Where(x => x.CompanyName.Contains(paging.SearchCriteria.companyNameFilter.ToUpper()));
(I temporarily excluded the other filters for simplicity)
When you run the query in SSMS, its plan is probably cached for subsequent calls; the original run probably took a similar time to the EF query. That said, there are disadvantages to parametrised queries - while a parametrised query lets you reuse execution plans, it also means the cached plan isn't necessarily the best one for the actual parameter values you're running with right now.
For example, if you specify a CompanyNumber (which is easy to find in an index thanks to the StartsWith), you can filter the data by CompanyNumber first, making the name search trivial (I assume CompanyNumber is unique, so you either get 0 records or the single record with that CompanyNumber). This might not be possible for the parametrised query if its execution plan was optimized for looking up by name.
But in the end, Contains is a performance killer. It needs to read every single byte of data in your table's CompanyName field, which usually means it has to read every single row and process much of its data. Searching by a substring looks deceptively simple, but it always carries heavy penalties - its complexity is linear with respect to data size.
One option is to find a way to avoid the Contains. Users often ask for features they don't actually need. StartsWith might work just as well for most of the cases. But that's a business decision, of course.
Another option would be to find a way to reduce the query as much as possible before you apply the Contains filter - if you only allow searching by company name together with other filters that narrow the search down, you can save the DB server a lot of work. This may be tricky, and it ties back to the execution plan reuse issue above - you might want some way to avoid having the same execution plan for two queries that are wildly different; an easy way in EF is to build the query up dynamically rather than trying for one expression:
IQueryable<Company> query = _context.Companies;
if (!string.IsNullOrEmpty(paging.SearchCriteria.companyNameFilter))
query = query.Where(c => c.CompanyName.Contains(paging.SearchCriteria.companyNameFilter));
if (!string.IsNullOrEmpty(paging.SearchCriteria.companyNumberFilter))
query = query.Where(c => c.CompanyNumber.StartsWith(paging.SearchCriteria.companyNumberFilter));
// etc. for the rest of the query
This means that you actually have multiple parametrised queries that can each have their own execution plan, more in line with what the query actually does. For some extreme cases, it might also be worthwhile to completely prevent execution plan caching (this is often useful in reports).
The final option is using full-text search. You can find plenty of tutorials on how to make this work. It works essentially by splitting the unformatted string data into individual words or phrases and indexing those. This means that a search for "hello world" doesn't necessarily return all the records that have "hello world" in the name, and it might also return records that have something other than "hello world" in the name. Think Google Search rather than Contains. This can often be a great method for human-written text, but it can be very confusing for a user who doesn't understand why you'd return search results that are completely different from what they were searching for. It also often doesn't work well if you need to do partial searches (e.g. searching for "Computer" might return "Computer, Inc.", but searching for "Comp" might return nothing).
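If partial words are the sticking point, SQL Server's CONTAINS predicate also supports prefix terms. A hedged sketch using EF Core's EF.Functions.Contains - this still requires a full-text index on CompanyName, and the filter property names follow the question's code:
// EF.Functions.Contains is provided by the SQL Server provider (Microsoft.EntityFrameworkCore).
// A prefix term like "Comp*" matches words starting with "Comp"; the whole search
// condition is built client-side so it reaches SQL Server as a single parameter.
var pattern = "\"" + paging.SearchCriteria.companyNameFilter + "*\"";
var partial = _context.Companies
    .Where(c => EF.Functions.Contains(c.CompanyName, pattern));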
The first option is likely the fastest and closest to what users expect, though it can't search in the middle of the string. The second option is the most correct, and might make your query substantially faster, especially in the most common cases with good statistics. The third option is probably about as fast as the first one, but it can be tricky to set up properly and confusing for your users. It does also give you more powerful ways to query the text data (e.g. using wildcards).
Welcome to Stack Overflow. It looks like you are suffering from at least one of the three problems below, in your code and your architecture.
First: indexing
You've mentioned that this column cannot be indexed, but there is support in SQL Server for full-text indexing at the very least.
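As a hedged sketch of what enabling that could look like from an EF Core migration - the catalog name, table name and primary-key index name (PK_Companies) are assumptions about this schema:
using Microsoft.EntityFrameworkCore.Migrations;

public partial class AddCompanyNameFullText : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        // Full-text DDL cannot run inside the migration's transaction.
        migrationBuilder.Sql(
            "CREATE FULLTEXT CATALOG CompanyCatalog AS DEFAULT;",
            suppressTransaction: true);
        migrationBuilder.Sql(
            "CREATE FULLTEXT INDEX ON dbo.Companies(CompanyName) " +
            "KEY INDEX PK_Companies ON CompanyCatalog;",
            suppressTransaction: true);
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.Sql("DROP FULLTEXT INDEX ON dbo.Companies;", suppressTransaction: true);
        migrationBuilder.Sql("DROP FULLTEXT CATALOG CompanyCatalog;", suppressTransaction: true);
    }
}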
.Contains
This method isn't exactly suitable for the size of operation you're performing. If possible, perhaps as a last resort, consider moving to a parameterized query. For now, however, it looks like you want to keep your business logic in the .NET code rather than spreading it into SQL, and that's a worthy plan.
c.IncorporationDate
Date comparison can be a little costly in SQL Server. Once you're dealing with so many millions of rows you might get a lot of performance benefit from correctly partitioned tables and indexes.
Consider whether or not these rows can change at all. Something named IncorporationDate sounds like it definitely should not be changed. I suspect you may want to leverage that after reading the rest of these.

Parameterized Sybase query: How to recompile?

I've got a parameterized search query that doesn't perform too well, and I think it's because of parameter sniffing. I'd like to do something like the OPTION(RECOMPILE) mentioned in this answer, but I'm not sure what the Sybase equivalent is.
Is there a Sybase equivalent of OPTION(RECOMPILE)? Or would I need to switch to a stored procedure to get that functionality?
NOTE: I have no idea what 'parameter sniffing' is so fwiw ...
A few scenarios I can imagine (off the top of my head) that could explain poor performance for a query, where performance could be improved by forcing a (re)compile of the query:
1 - DDL changes (eg, column datatype change, index creation/modification) for a table, or updated stats, could lead to a better compilation plan; in the good ol' days it was necessary to run sp_recompile table_name after such a change, but in recent years (last 6-8 years?) this should be automatically performed under the covers; soooo, if you're running a (relatively) old version of ASE, and assuming a DDL/stats modification to a referenced table, it may be necessary to have the table's owner run sp_recompile table-name after such a DDL change
2 - with ASE 15/16 it's not uncommon for the DBA to configure the dataserver with statement cache enabled; this allows the dataserver to create light weight procedures (LWPs; aka 'temp' procedure) for queries that are oft repeated (the objective being to eliminate the costly compilation overhead for look-alike queries); the downside to using statement cache is that a difference in parameter values that could lead to large variations in record counts can cause follow-on queries to obtain/re-use a 'poor' query plan associated with a previous copy of the query; in this scenario the SQL developer can run set statement_cache off prior to running a query ... this will override any statement cache settings at the dataserver level and allow the query to be compiled with the current set of SARGs/parameters ... with the trade-off being that you'll now incur the overhead of compiling every query submitted to the dataserver
3 - if the application is using prepared statements to submit 'parameterized' queries, the process of 'preparing' the statement typically signals the dataserver to create a LWP (aka 'temp' procedure); in this scenario the first invocation of the prepared statement will be compiled and stored in procedure cache, with follow-on invocations (by the app) of the prepared statement re-using the query plan from the first invocation; again, one benefit is the elimination of costly compilation overhead for queries #2 through #n, but the downside of re-using 'poor' query plan if the parameter values can lead to a large variation in the number of records affected; the only 'fix' I can think of for this scenario is for the application to be recoded to not use prepared statements ... and if the dataserver is configured with statement cache enabled, then make sure 'set statement_cache off' is issued before submitting (non-prepared) statements to the dataserver; and of course the obvious downside is that each query will now incur the overhead of compilation
NOTES:
set statement_cache {on|off} is a session-level setting; you only need to issue it once to enable (on) or disable (off) statement cache for the rest of the session
If you know you have 2 (or 3/4/5) different sets of parameters that could lead to different optimization plans for the same query, you can trick the dataserver's statement cache lookup process by submitting slightly different versions of the same query; to the dataserver these will look like 'different' queries and thus be matched with different query plans in statement cache; with this little coding technique you could still benefit from the use of statement cache (or application-specific prepared statements) while limiting (eliminating?) the re-use of 'poor' query plans; for example, while these three queries are logically identical, the dataserver will treat them as 'different' queries because of the different table aliases ...
select count(*) from sysobjects where id = 1
select count(*) from sysobjects o where id = 1
select count(*) from sysobjects o1 where id = 1
... and if using application-side prepared statements then you would create/manage 3 different prepared statements
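As a rough illustration of scenario 3 and the set statement_cache off workaround from the application side, written against the generic ADO.NET interfaces (the concrete Sybase provider types and its @-style parameter syntax are assumptions here):
using System.Data;

static int CountUnprepared(IDbConnection conn, int id)
{
    // Disable statement cache for this session so each submission is compiled
    // against its current parameter values (trade-off: compile cost on every query).
    using (var setCmd = conn.CreateCommand())
    {
        setCmd.CommandText = "set statement_cache off";
        setCmd.ExecuteNonQuery();
    }

    using (var cmd = conn.CreateCommand())
    {
        cmd.CommandText = "select count(*) from sysobjects where id = @id";
        var p = cmd.CreateParameter();
        p.ParameterName = "@id";
        p.Value = id;
        cmd.Parameters.Add(p);
        // Note: cmd.Prepare() is deliberately not called, to avoid the LWP / plan-reuse path.
        return (int)cmd.ExecuteScalar();
    }
}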

LINQ to Entities query takes long to compile, SQL runs fast

I'm working on a piece of code, written by a coworker, that interfaces with a CRM application our company uses. There are two LINQ to Entities queries in this piece of code that get executed many times in our application, and I've been asked to optimize them because one of them is really slow.
These are the queries:
The first query compiles pretty much instantly. It gets relation information from the CRM database, filtering by a list of relation IDs given by the application:
from relation in context.ADRELATION
where ((relationIds.Contains(relation.FIDADRELATION)) && (relation.FLDELETED != -1))
join addressTable in context.ADDRESS on relation.FIDADDRESS equals addressTable.FIDADDRESS
into temporaryAddressTable
from address in temporaryAddressTable.DefaultIfEmpty()
join mailAddressTable in context.ADDRESS on relation.FIDMAILADDRESS equals
mailAddressTable.FIDADDRESS into temporaryMailAddressTable
from mailAddress in temporaryMailAddressTable.DefaultIfEmpty()
select new { Relation = relation, Address = address, MailAddress = mailAddress };
The second query takes about 4-5 seconds to compile; it retrieves information about people from the database (again filtered by a list of IDs):
from role in context.ROLE
join relationTable in context.ADRELATION on role.FIDADRELATION equals relationTable.FIDADRELATION into temporaryRelationTable
from relation in temporaryRelationTable.DefaultIfEmpty()
join personTable in context.PERSON on role.FIDPERS equals personTable.FIDPERS into temporaryPersonTable
from person in temporaryPersonTable.DefaultIfEmpty()
join nationalityTable in context.TBNATION on person.FIDTBNATION equals nationalityTable.FIDTBNATION into temporaryNationalities
from nationality in temporaryNationalities.DefaultIfEmpty()
join titelTable in context.TBTITLE on person.FIDTBTITLE equals titelTable.FIDTBTITLE into temporaryTitles
from title in temporaryTitles.DefaultIfEmpty()
join suffixTable in context.TBSUFFIX on person.FIDTBSUFFIX equals suffixTable.FIDTBSUFFIX into temporarySuffixes
from suffix in temporarySuffixes.DefaultIfEmpty()
where ((rolIds.Contains(role.FIDROLE)) && (relation.FLDELETED != -1))
select new { Role = role, Person = person, relation = relation, Nationality = nationality, Title = title.FTXTBTITLE, Suffix = suffix.FTXTBSUFFIX };
I set up SQL Profiler, captured the SQL from both queries, and ran it in SQL Server Management Studio. Both queries ran very fast, even with a large (~1000) number of IDs. So the problem seems to lie in the compilation of the LINQ query.
I have tried to use a compiled query, but since those can only contain primitive parameters, I had to strip out the part with the filter and apply that after the Invoke() call, so I'm not sure if that helps much. Also, since this code runs in a WCF service operation, I'm not sure if the compiled query will even still exist on subsequent calls.
Finally, I tried selecting only a single column in the second query. While this obviously won't give me the information I need, I figured it would be faster than the ~200 columns we're selecting now. No such luck: it still took 4-5 seconds.
I'm not a LINQ guru at all, so I can barely follow this code (I have a feeling it's not written optimally, but can't put my finger on it). Could anyone give me a hint as to why this problem might be occurring?
The only solution I have left is to manually select all the information instead of joining all these tables. I'd then end up with about 5-6 queries. Not too bad I guess, but since I'm not dealing with horribly inefficient SQL here (or at least an acceptable level of inefficiency), I was hoping to prevent that.
Thanks in advance, hope I made things clear. If not, feel free to ask and I'll provide additional details.
Edit:
I ended up adding associations on my entity framework (the target database didn't have foreign keys specified) and rewriting the query thusly:
context.ROLE.Where(role => rolIds.Contains(role.FIDROLE) && role.Relation.FLDELETED != -1)
.Select(role => new
{
ContactId = role.FIDROLE,
Person = role.Person,
Nationality = role.Person.Nationality.FTXTBNATION,
Title = role.Person.Title.FTXTBTITLE,
Suffix = role.Person.Suffix.FTXTBSUFFIX
});
Seems a lot more readable and it's faster too.
Thanks for the suggestions, I will definitely keep the one about making multiple compiled queries for different numbers of arguments in mind!
Gabriel's answer is correct: use a compiled query.
It looks like you are compiling it again for every WCF request, which of course defeats the purpose of one-time initialization. Instead, put the compiled query into a static field.
Edit:
Do this: Send maximum load to your service and pause the debugger 10 times. Look at the call stack. Did it stop more often in L2S code or in ADO.NET code? This will tell you if the problem is still with L2S or with SQL Server.
Next, let's fix the filter. We need to push it back into the compiled query. This is only possible by transforming this:
rolIds.Contains(role.FIDROLE)
to this:
role.FIDROLE == rolIds_0 || role.FIDROLE == rolIds_1 || ...
You need a new compiled query for every cardinality of rolIds. This is nasty, but it is necessary to get it to compile. In my project, I have automated this task but you can do a one-off solution here.
I guess most queries will have very few role IDs, so you can materialize 10 compiled queries for cardinalities 1-10, and if the cardinality exceeds 10 you fall back to client-side filtering.
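A rough sketch of those two ideas combined - static fields plus one compiled query per cardinality. MyEntities stands in for the actual ObjectContext type, only the role filter is shown, and the Relation navigation property is the one added in the question's edit:
// System.Data.Objects.CompiledQuery (ObjectContext-based Entity Framework assumed).
static readonly Func<MyEntities, int, IQueryable<ROLE>> RolesById1 =
    CompiledQuery.Compile((MyEntities ctx, int id0) =>
        ctx.ROLE.Where(r => r.FIDROLE == id0 && r.Relation.FLDELETED != -1));

static readonly Func<MyEntities, int, int, IQueryable<ROLE>> RolesById2 =
    CompiledQuery.Compile((MyEntities ctx, int id0, int id1) =>
        ctx.ROLE.Where(r => (r.FIDROLE == id0 || r.FIDROLE == id1)
                            && r.Relation.FLDELETED != -1));

// At the call site: pick the delegate matching rolIds.Count, and fall back to
// client-side filtering when the count exceeds the compiled variants.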
If you decide to keep the query inside the code, you could compile it. You still have to compile the query once when you run your app, but all subsequent calls will use that already compiled query. You can take a look at MSDN here: http://msdn.microsoft.com/en-us/library/bb399335.aspx.
Another option would be to use a stored procedure and call the procedure from your code. Hence no compile time.

When does compile queries of LINQ to SQL improve performance

I was referring to an article which focuses on Speeding up LINQ to SQL Queries. One of the techniques it mentions is "Use Compiled Queries" and explains how to use it.
I wanted to see the performance improvement of compiled queries, so I tried the same example provided by the author. I used the Northwind database as the data context, tried both normal execution and compiled-query execution, and compared them in LINQPad.
First I tried executing the query without using CompileQuery. It took 2.065 seconds.
var oo = from o in Orders
where o.OrderDetails.Any (p => p.UnitPrice > 100)
select o;
oo.Dump ("Order items with unit price more than $100");
var oo1 = from o in Orders
where o.OrderDetails.Any (p => p.UnitPrice > 10)
select o;
oo1.Dump ("Order items with unit price more than $10");
Second, the queries using CompiledQuery. They took 2.100 seconds.
var oo = CompiledQuery.Compile ((TypedDataContext dc, decimal unitPrice) =>
from o in Orders
where o.OrderDetails.Any (p => p.UnitPrice > unitPrice)
select o
);
oo (this, 100).Dump ("Order items with unit price more than $100");
oo (this, 10).Dump ("Order items with unit price more than $10");
Re-executing them several times showed that the time taken by both approaches is almost identical.
Here we see only two query executions for each method. I also tried making 10 queries for each of them, but both approaches completed in around 7 seconds.
Does pre-compiling the queries really improve performance, or am I getting the usage wrong?
Thank you for your time and consideration.
Edit:
After reading the accepted answer, readers may also want to go through this article which nicely explains how compiled queries improve performance.
Bear in mind that there are two main pieces of a LINQ query that can be particularly expensive:
Compiling the LINQ expressions into an SQL Statement.
Running the SQL Statement and retrieving the results
In your case, you have a relatively simple query, and either a very slow database connection, some very large data sets, or tables that are not indexed in an optimal way to run this particular query. Or maybe a combination of all three.
So compared to the amount of time it is taking to produce the SQL for your query (maybe 10-50 milliseconds), the second step is taking so much time (~1000 ms) that you can hardly notice the difference.
You would see significant improvements if the following conditions are all true:
your LINQ query is complex,
you have a fast connection to your database,
the SQL query itself runs quickly on that database, and
the result set is small enough that it gets transferred back from the database relatively quickly.
In practice, I've had queries that can take upwards of 500ms to compile, but only a few milliseconds to actually run. These are usually the cases where I focus on precompiling queries.
One good way to know ahead of time what kind of performance gains you can expect from precompiled queries is to time the second instance of your query using a Stopwatch object, and then run the generated SQL directly using LINQPad's Analyze SQL feature. If the SQL query returns quickly but the LINQ query takes a long time, that's a good candidate for precompiling.
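For example, a rough LINQPad-style sketch of that timing approach (Dump and the compiled delegate oo are as defined in the question; it assumes each variant has already been run once, so one-off setup cost is excluded):
var sw = System.Diagnostics.Stopwatch.StartNew();
// Non-compiled: the expression tree is translated to SQL again on this run.
var plain = (from o in Orders
             where o.OrderDetails.Any (p => p.UnitPrice > 100)
             select o).ToList();
sw.Stop();
sw.ElapsedMilliseconds.Dump ("Non-compiled, second run (ms)");

sw.Restart();
// Compiled: translation was paid once inside CompiledQuery.Compile; only execution remains.
var compiled = oo (this, 100).ToList();
sw.Stop();
sw.ElapsedMilliseconds.Dump ("Compiled, second run (ms)");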

LINQ Query incredibly slow - why?

I got a very simple LINQ query:
List<table> list = ( from t in ctx.table
where
t.test == someString
&& t.date >= dateStartInt
&& t.date <= dateEndInt
select t ).ToList<table>();
The table which gets queried has got about 30 million rows, but the columns test and date are indexed.
When it should return around 5000 rows it takes several minutes to complete.
I also checked the SQL command which LINQ generates.
If I run that command on the SQL Server it takes 2 seconds to complete.
What's the problem with LINQ here?
It's just a very simple query without any joins.
That's the query SQL Profiler shows:
exec sp_executesql N'SELECT [t0].[test]
FROM [dbo].[table] AS [t0]
WHERE ([t0].[test] IN (@p0)) AND ([t0].[date] >= @p1)
AND ([t0].[date] <= @p2)',
N'@p0 nvarchar(12),@p1 int,@p2 int',@p0=N'123test',@p1=110801,@p2=110804
EDIT:
It's really weird. While testing I noticed that it's much faster now. The LINQ query now takes 3 seconds for around 20000 rows, which is quite ok.
What's even more confusing:
It's the same behaviour on our production server. An hour ago it was really slow, now it's fast again. As I was testing on the development server, I didn't change anything on the production server. The only thing I can think of being a problem is that both servers are virtualized and share the SAN with lots of other servers.
How can I find out if that's the problem?
Before blaming LINQ, first find out where the actual delay is taking place:
Sending the query
Execution of the query
Receiving the results
Transforming the results into local types
Binding/showing the result in the UI
And any other events tied to this process
Then start blaming LINQ ;)
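For instance, a rough way to separate server-side execution from materialization using the question's query (approximate, since Count() issues a different statement than fetching the rows):
var sw = System.Diagnostics.Stopwatch.StartNew();
var query = from t in ctx.table
            where t.test == someString
               && t.date >= dateStartInt
               && t.date <= dateEndInt
            select t;

int rowCount = query.Count();      // mostly server-side work, tiny result set
sw.Stop();
Console.WriteLine("Count(): {0} rows in {1} ms", rowCount, sw.ElapsedMilliseconds);

sw.Restart();
List<table> list = query.ToList(); // execution + transfer + materialization into entities
sw.Stop();
Console.WriteLine("ToList(): {0} ms", sw.ElapsedMilliseconds);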
If I had to guess, I would say "parameter sniffing" is likely, i.e. it has built and cached a query plan based on one set of parameters which is very suboptimal for your current parameter values. You can tackle this with OPTION (OPTIMIZE FOR UNKNOWN) in regular TSQL, but there is no LINQ-to-SQL / EF way of exposing this.
My plan would be:
use profiling to prove that the time is being lost in the query (as opposed to materialization etc)
once confirmed, consider using direct TSQL methods to invoke the query
For example, with LINQ-to-SQL, ctx.ExecuteQuery<YourType>(tsql, arg0, ...) can be used to throw raw TSQL at the server (with parameters as {0} etc, like string.Format). Personally, I'd lean towards "dapper" instead - very similar usage, but a faster materializer (but it doesn't support EntityRef<> etc for lazy-loading values - which is usually a bad thing anyway as it leads to N+1).
i.e. (with dapper)
List<table> list = ctx.Connection.Query<table>(@"
select * from table
where test = @someString
and date >= @dateStartInt
and date <= @dateEndInt
OPTION (OPTIMIZE FOR UNKNOWN)",
new {someString, dateStartInt, dateEndInt}).ToList();
or (LINQ-to-SQL):
List<table> list = ctx.ExecuteQuery<table>(@"
select * from table
where test = {0}
and date >= {1}
and date <= {2}
OPTION (OPTIMIZE FOR UNKNOWN)",
someString, dateStartInt, dateEndInt).ToList();
