Azure SQL DB hanging on to connection until EF times out - c#

I have an intermittent issue that I can't seem to get to the bottom of, whereby a .NET Framework 4.6 MVC API (C#) makes an Entity Framework 6 (runtime v4.0.30319) stored procedure call that takes a long time to respond.
The stored procedure is hosted on Azure SQL DB.
It is an unremarkable procedure; it interrogates a couple of tables to give a nutritionist their clients' latest data and website interactions.
ALTER PROCEDURE [PORTAL].[GetNutritionistCustomers]
    @id_usernut varchar(20)
AS
WITH Activity AS (
    SELECT LastActivity = MAX(ExecutionDate),
           ded.UserName
    FROM portal.DailyExecutionData ded
    WHERE ded.ExecutionDate < GETDATE()
    GROUP BY ded.UserName
),
Logging AS (
    SELECT LastLogin = MAX(l.LoggedOnDate),
           l.UserName
    FROM PORTAL.Logins l
    WHERE l.LoginType = 'Direct Client'
    GROUP BY l.UserName
)
SELECT unc.CompanyCode,
       a.ACCOUNT,
       ad.ADDRESS1,
       un.ID_UserNut,
       ueu.ExpirationDate,
       Expired = CAST(CASE WHEN ueu.ExpirationDate < GETDATE() THEN 1 ELSE 0 END AS BIT),
       LastActive = la.LastActivity,
       l.LastLogin
FROM UK_Nutritionist un
     JOIN UK_NutCompany unc ON un.ID_UserNut = unc.ID_UserNut
     JOIN UK_EnabledUsers ueu ON unc.CompanyCode = ueu.UserName
     JOIN Klogix.ACCOUNT a ON ueu.ID_User = a.ACCOUNTID
     LEFT JOIN Klogix.ADDRESS ad ON a.ADDRESSID = ad.ADDRESSID AND ad.IsPrimary = 1
     LEFT JOIN Activity la ON ueu.UserName = la.UserName
     LEFT JOIN Logging l ON ueu.UserName = l.UserName
WHERE un.ID_UserNut = @id_usernut
Run in SSMS or Azure Data Studio, this procedure takes on average about 50-100 ms to complete. This is consistent; the tables are well indexed and the query plan is all index seeks. There are no Key or RID Lookup red flags.
Having investigated with Profiler, I've determined what happens: EF creates a connection and calls the procedure. I can see the RPC enter, EF reaches its connection timeout period, and only then do we get RPC completed; the data returns to EF, and the code spins out a list of POCOs to return to the caller as JSON.
[HttpGet]
[Route("Customers")]
public HttpResponseMessage Get()
{
    try
    {
        if (IsTokenInvalid(Request))
            return Request.CreateResponse(HttpStatusCode.Unauthorized);

        var nutritionistCustomers = new ExcdbEntities().GetNutritionistCustomers(TokenPayload.Nutritionist)
            .Select(x => new NutritionistCustomers()
            {
                Account = x.ACCOUNT,
                Address = x.ADDRESS1,
                CompanyCode = x.CompanyCode,
                ExpirationDate = x.ExpirationDate,
                expired = x.Expired,
                LastActive = x.LastActive,
                LastLogin = x.LastLogin
            }).ToList();

        return Request.CreateResponse(HttpStatusCode.OK, GenerateResponse(nutritionistCustomers));
    }
    catch (Exception e)
    {
        return Request.CreateResponse(HttpStatusCode.InternalServerError, GenerateResponse(e));
    }
}
If I change the timeout on the connection, then the amount of time SQL Server waits before releasing the data changes with it.
I thought it might be something to do with the Azure App Service that hosts the API, but it turns out that running it in debug in Visual Studio exhibits the same issue.
The short-term fix is to recompile the stored procedure, after which the data returns immediately. But at some undetermined point in the future the same procedure will suddenly manifest this behaviour again.
Only this procedure does this; all other EF interactions, whether LINQ queries against tables or views, and all other procedure calls, behave well.
This has happened before in a now-legacy system that had the same architecture, though it was a different procedure on a different database.
Whilst shortening the timeout is a workaround, it's a system-wide value, and even 10 seconds, enough to cope with system hiccups, is far too long for this customer-selection screen, which should be instantaneous.
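As an aside, EF6 does let you scope the timeout per context rather than relying on the system-wide value; a minimal sketch, assuming the ExcdbEntities context from above:
using (var ctx = new ExcdbEntities())
{
    // Overrides the command timeout (in seconds) for this context only.
    // This only bounds the damage, it doesn't fix the plan.
    ctx.Database.CommandTimeout = 10;
    var customers = ctx.GetNutritionistCustomers(TokenPayload.Nutritionist).ToList();
}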
Also I would rather fix the underlying problem, if anyone has any clue what might be going on.
I have considered adding OPTION (RECOMPILE) to the statement, but it just feels wrong to do so.
If anyone else has experienced this and has any insight I'd be most grateful, as it's driving me to distraction.

In the end this turned out to be a concatenation issue.
EF sets CONCAT_NULL_YIELDS_NULL to OFF in its context, and for whatever reason this caused a truly terrible performance issue. (SQL Server caches plans per the connection's SET options, which would explain why the plan looked fine in SSMS while the EF connection got a far worse one.)
For what I can only call a legacy issue left by inept previous employees, a date returned by the procedure did some inexplicable chopping up and concatenation of various dateparts of a date.
I replaced this by, you know, just returning the date (and amending the EF object, obviously), and hey presto, no more performance issue.
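For anyone wanting to verify something similar on their own system, the mismatch can be reproduced outside EF. A rough sketch, reusing the connection string from the API with a hypothetical nutritionist id; compare timings with and without the SET statement:
using (var cnn = new SqlConnection(connectionString))
{
    cnn.Open();

    // Run the proc under the session option EF was using.
    using (var set = new SqlCommand("SET CONCAT_NULL_YIELDS_NULL OFF;", cnn))
        set.ExecuteNonQuery();

    using (var cmd = new SqlCommand("PORTAL.GetNutritionistCustomers", cnn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("@id_usernut", "NUT0001"); // hypothetical id
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read()) { /* drain and time this loop */ }
        }
    }
}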

Related

Timeout when using AliasToBean with stored procedure

I have some code that executes a stored procedure to fill a DTO, used to display a table of data to a user.
The stored procedure will output, for example, 20 records with 17 columns, pulling from 9 tables, in roughly 1.3 seconds. The stored procedure works perfectly.
In my C# code, I have a DTO with all the same columns as the stored procedure, same data types. I run the following code to fill the DTO:
public IEnumerable<AgentActivityOrdersShortDTO> GetAgentActivityOrdersReport(DateTime currentDate, DateTime ordersBeforeDate, AgentActivityReportType reportType)
{
    var listOfOperations = _oms.GetSPSOperationsFromBalOperation((int)_ctx.TenantInfo.Operation, true)
        .Select(o => o.SPSOperationMapping)
        .ToList();
    // Builds a comma-separated id list, e.g. "1,2,3," (note the trailing comma).
    var stringOfOperations = listOfOperations.Aggregate("", (current, o) => current + (o.ToString() + ","));

    IList<AgentActivityOrdersShortDTO> results = null;
    switch (reportType)
    {
        case AgentActivityReportType.ListingAgent:
            results = _sdsp.CurrentSession.CreateSQLQuery("exec GetActivityReport_OrderData_LA :CurrentDate, :OrdersBeforeDate, :OperationIds")
                .SetParameter("CurrentDate", currentDate)
                .SetParameter("OrdersBeforeDate", ordersBeforeDate)
                .SetParameter("OperationIds", stringOfOperations)
                .SetResultTransformer(Transformers.AliasToBean(typeof(AgentActivityOrdersShortDTO)))
                .List<AgentActivityOrdersShortDTO>()
                .ToList();
            break;
        case AgentActivityReportType.SellingAgent:
            results = _sdsp.CurrentSession.CreateSQLQuery("exec GetActivityReport_OrderData_SB :CurrentDate, :OrdersBeforeDate, :OperationIds")
                .SetParameter("CurrentDate", currentDate)
                .SetParameter("OrdersBeforeDate", ordersBeforeDate)
                .SetParameter("OperationIds", stringOfOperations)
                .SetResultTransformer(Transformers.AliasToBean(typeof(AgentActivityOrdersShortDTO)))
                .List<AgentActivityOrdersShortDTO>()
                .ToList();
            break;
    }
    return results.ToList();
}
Some days this runs with no issues. Other days I get what I'm experiencing today:
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
It seems to be intermittent, and I've occasionally "fixed" it by dropping and recreating the stored procedure, or by a flush-and-fill on the indexes of the database. But these aren't long-term solutions.
Thoughts or ideas on how to possibly fix this would be great.
If dropping and re-creating the stored procs "fixes" it, then that points to a recompile of the query plan solving the issue, and a less-than-optimal plan having been cached.
To start with, try adding WITH RECOMPILE to the two stored procs called by the code.
Check out parameter sniffing articles for background.
eg: http://www.brentozar.com/archive/2013/06/the-elephant-and-the-mouse-or-parameter-sniffing-in-sql-server/
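If altering the procs isn't convenient, a fresh plan can also be requested at the call site; a sketch against the first branch of the code above, using plain EXEC ... WITH RECOMPILE so no schema change is needed:
results = _sdsp.CurrentSession
    .CreateSQLQuery("exec GetActivityReport_OrderData_LA :CurrentDate, :OrdersBeforeDate, :OperationIds WITH RECOMPILE")
    .SetParameter("CurrentDate", currentDate)
    .SetParameter("OrdersBeforeDate", ordersBeforeDate)
    .SetParameter("OperationIds", stringOfOperations)
    .SetResultTransformer(Transformers.AliasToBean(typeof(AgentActivityOrdersShortDTO)))
    .List<AgentActivityOrdersShortDTO>()
    .ToList();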

NHibernate: Object hierarchy and performance

I've a database with a Customer table. Each of these customers has a foreign key to an Installation table, which in turn has a foreign key to an Address table (tables renamed for simplicity).
In NHibernate I'm trying to query the Customer table like this:
ISession session = tx.Session;
var customers = session.QueryOver<Customer>().Where(x => x.Country == country);
var installations = customers.JoinQueryOver(x => x.Installation, JoinType.LeftOuterJoin);
var addresses = installations.JoinQueryOver(x => x.Address, JoinType.LeftOuterJoin);
if (installationType != null)
{
    installations.Where(x => x.Type == installationType);
}
return customers.TransformUsing(new DistinctRootEntityResultTransformer()).List<Customer>();
This results in a SQL query similar to the following (captured by NHibernate Profiler):
SELECT *
FROM Customer this_
left outer join Installation installati1_
on this_.InstallationId = installati1_.Id
left outer join Address address2_
on installati1_.AddressId = address2_.Id
WHERE this_.CountryId = 4
and installati1_.TypeId = 1
When I execute the above SQL query in Microsoft SQL Server Management Studio it completes in about 5 seconds but returns ~200,000 records. Nevertheless it takes a very long time to retrieve the List when running the code; I've been waiting over 10 minutes without any results. The debug log indicated that a lot of objects are constructed and initialized because of the object hierarchy. Is there a way to fix this performance issue?
I'm not sure what you are trying to do, but loading and saving 200,000 records through any OR mapper is not feasible. 200,000 objects will take a lot of memory and time to be created. Depending on what you want to do, loading them in pages or making an update query directly on the database (stored procedure or named query) can fix your performance. Paging can be done like this:
criteria.SetFirstResult(START).SetMaxResults(PAGESIZE);
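Applied to the QueryOver from the question, a page fetch would look something like this (PAGE and PAGESIZE are placeholders):
// Fetch one page of customers instead of all ~200,000 rows.
var pageOfCustomers = session.QueryOver<Customer>()
    .Where(x => x.Country == country)
    .Skip(PAGE * PAGESIZE)
    .Take(PAGESIZE)
    .List<Customer>();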
NHibernate Profiler shows two times in the duration column, x/y, with x being the time to execute the query and y the time to initialize the objects. The first step is to determine where the problem lies. If the query is slow, get the actual query sent to the database using SQL Profiler (assuming SQL Server) and check its performance in SSMS.
However, I suspect your issue may be the logging level. If you have the logging level set to DEBUG, NHibernate will generate very verbose logs and this will significantly impact performance.
Even if you could get it to perform well with 200,000 records, that's more than you can display to the user in a meaningful way. You should use paging/filtering to reduce the size of the result set.

LINQ to Entities query takes long to compile, SQL runs fast

I'm working on a piece of code, written by a coworker, that interfaces with a CRM application our company uses. There are two LINQ to Entities queries in this piece of code that get executed many times in our application, and I've been asked to optimize them because one of them is really slow.
These are the queries:
First query, this one compiles pretty much instantly. It gets relation information from the CRM database, filtering by a list of relation IDs given by the application:
from relation in context.ADRELATION
where relationIds.Contains(relation.FIDADRELATION) && (relation.FLDELETED != -1)
join addressTable in context.ADDRESS
    on relation.FIDADDRESS equals addressTable.FIDADDRESS into temporaryAddressTable
from address in temporaryAddressTable.DefaultIfEmpty()
join mailAddressTable in context.ADDRESS
    on relation.FIDMAILADDRESS equals mailAddressTable.FIDADDRESS into temporaryMailAddressTable
from mailAddress in temporaryMailAddressTable.DefaultIfEmpty()
select new { Relation = relation, Address = address, MailAddress = mailAddress };
The second query, which takes about 4-5 seconds to compile, gets information about people from the database (again filtered by a list of IDs):
from role in context.ROLE
join relationTable in context.ADRELATION
    on role.FIDADRELATION equals relationTable.FIDADRELATION into temporaryRelationTable
from relation in temporaryRelationTable.DefaultIfEmpty()
join personTable in context.PERSON
    on role.FIDPERS equals personTable.FIDPERS into temporaryPersonTable
from person in temporaryPersonTable.DefaultIfEmpty()
join nationalityTable in context.TBNATION
    on person.FIDTBNATION equals nationalityTable.FIDTBNATION into temporaryNationalities
from nationality in temporaryNationalities.DefaultIfEmpty()
join titleTable in context.TBTITLE
    on person.FIDTBTITLE equals titleTable.FIDTBTITLE into temporaryTitles
from title in temporaryTitles.DefaultIfEmpty()
join suffixTable in context.TBSUFFIX
    on person.FIDTBSUFFIX equals suffixTable.FIDTBSUFFIX into temporarySuffixes
from suffix in temporarySuffixes.DefaultIfEmpty()
where rolIds.Contains(role.FIDROLE) && (relation.FLDELETED != -1)
select new { Role = role, Person = person, Relation = relation, Nationality = nationality, Title = title.FTXTBTITLE, Suffix = suffix.FTXTBSUFFIX };
I've set up the SQL Profiler and took the SQL from both queries, then ran it in SQL Server Management Studio. Both queries ran very fast, even with a large (~1000) number of IDs. So the problem seems to lie in the compilation of the LINQ query.
I have tried to use a compiled query, but since those can only contain primitive parameters, I had to strip out the part with the filter and apply that after the Invoke() call, so I'm not sure if that helps much. Also, since this code runs in a WCF service operation, I'm not sure if the compiled query will even still exist on subsequent calls.
Finally, what I tried was to select only a single column in the second query. While this obviously won't give me the information I need, I figured it would be faster than the ~200 columns we're selecting now. No such luck: it still took 4-5 seconds.
I'm not a LINQ guru at all, so I can barely follow this code (I have a feeling it's not written optimally, but can't put my finger on it). Could anyone give me a hint as to why this problem might be occurring?
The only solution I have left is to manually select all the information instead of joining all these tables. I'd then end up with about 5-6 queries. Not too bad, I guess, but since the SQL itself isn't horribly inefficient (or is at least at an acceptable level of inefficiency), I was hoping to prevent that.
Thanks in advance, hope I made things clear. If not, feel free to ask and I'll provide additional details.
Edit:
I ended up adding associations on my entity framework (the target database didn't have foreign keys specified) and rewriting the query thusly:
context.ROLE.Where(role => rolIds.Contains(role.FIDROLE) && role.Relation.FLDELETED != -1)
    .Select(role => new
    {
        ContactId = role.FIDROLE,
        Person = role.Person,
        Nationality = role.Person.Nationality.FTXTBNATION,
        Title = role.Person.Title.FTXTBTITLE,
        Suffix = role.Person.Suffix.FTXTBSUFFIX
    });
Seems a lot more readable and it's faster too.
Thanks for the suggestions, I will definitely keep the one about making multiple compiled queries for different numbers of arguments in mind!
Gabriel's answer is correct: use a compiled query.
It looks like you are compiling it again for every WCF request, which of course defeats the purpose of one-time initialization. Instead, put the compiled query into a static field.
Edit:
Do this: Send maximum load to your service and pause the debugger 10 times. Look at the call stack. Did it stop more often in L2S code or in ADO.NET code? This will tell you if the problem is still with L2S or with SQL Server.
Next, let's fix the filter. We need to push it back into the compiled query. This is only possible by transforming this:
rolIds.Contains(role.FIDROLE)
to this:
role.FIDROLE == rolIds_0 || role.FIDROLE == rolIds_1 || ...
You need a new compiled query for every cardinality of rolIds. This is nasty, but it is necessary to get it to compile. In my project, I have automated this task but you can do a one-off solution here.
I guess most queries will have very few role-ids, so you can materialize 10 compiled queries for cardinalities 1-10, and if the cardinality exceeds 10 you fall back to client-side filtering.
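For illustration, the static-field shape looks roughly like this (ObjectContext-era EF; the context name is hypothetical, the entity names are from the question):
// Compiled once per AppDomain; every WCF request reuses the same delegate.
static readonly Func<MyEntities, int, IQueryable<ROLE>> RoleById =
    System.Data.Objects.CompiledQuery.Compile(
        (MyEntities ctx, int roleId) =>
            ctx.ROLE.Where(r => r.FIDROLE == roleId));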
If you decide to keep the query inside the code, you could compile it. You still have to compile the query once when you run your app, but all subsequent calls will use that already-compiled query. You can take a look at the MSDN help here: http://msdn.microsoft.com/en-us/library/bb399335.aspx.
Another option would be to use a stored procedure and call the procedure from your code. Hence no compile time.

LINQ Query incredibly slow - why?

I got a very simple LINQ query:
List<table> list = (from t in ctx.table
                    where t.test == someString
                       && t.date >= dateStartInt
                       && t.date <= dateEndInt
                    select t).ToList<table>();
The table which gets queried has got about 30 million rows, but the columns test and date are indexed.
When it should return around 5000 rows it takes several minutes to complete.
I also checked the SQL command which LINQ generates.
If I run that command on the SQL Server it takes 2 seconds to complete.
What's the problem with LINQ here?
It's just a very simple query without any joins.
That's the query SQL Profiler shows:
exec sp_executesql N'SELECT [t0].[test]
FROM [dbo].[table] AS [t0]
WHERE ([t0].[test] IN (@p0)) AND ([t0].[date] >= @p1)
AND ([t0].[date] <= @p2)',
N'@p0 nvarchar(12),@p1 int,@p2 int',@p0=N'123test',@p1=110801,@p2=110804
EDIT:
It's really weird. While testing I noticed that it's much faster now. The LINQ query now takes 3 seconds for around 20000 rows, which is quite ok.
What's even more confusing:
It's the same behaviour on our production server. An hour ago it was really slow, now it's fast again. As I was testing on the development server, I didn't change anything on the production server. The only thing I can think of being a problem is that both servers are virtualized and share the SAN with lots of other servers.
How can I find out if that's the problem?
Before blaming LINQ, first find out where the actual delay is taking place:
Sending the query
Execution of the query
Receiving the results
Transforming the results into local types
Binding/showing the result in the UI
And any other events tied to this process
Then start blaming LINQ ;)
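A crude way to separate those stages is to time the enumeration on its own. Deferred execution means nothing hits the server until ToList, so a sketch like this isolates send + execute + materialize from query construction; compare the result with the 2 seconds you measured in SSMS:
var sw = System.Diagnostics.Stopwatch.StartNew();
var query = from t in ctx.table
            where t.test == someString
               && t.date >= dateStartInt
               && t.date <= dateEndInt
            select t;
Console.WriteLine("build: {0} ms", sw.ElapsedMilliseconds);   // ~0, nothing sent yet

sw.Restart();
var list = query.ToList();   // send + execute + materialize
Console.WriteLine("execute + materialize: {0} ms", sw.ElapsedMilliseconds);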
If I had to guess, I would say "parameter sniffing" is likely, i.e. it has built and cached a query plan based on one set of parameters which is very suboptimal for your current parameter values. You can tackle this with OPTION (OPTIMIZE FOR UNKNOWN) in regular TSQL, but there is no LINQ-to-SQL / EF way of exposing this.
My plan would be:
use profiling to prove that the time is being lost in the query (as opposed to materialization etc)
once confirmed, consider using direct TSQL methods to invoke
For example, with LINQ-to-SQL, ctx.ExecuteQuery<YourType>(tsql, arg0, ...) can be used to throw raw TSQL at the server (with parameters as {0} etc, like string.Format). Personally, I'd lean towards "dapper" instead - very similar usage, but a faster materializer (but it doesn't support EntityRef<> etc for lazy-loading values - which is usually a bad thing anyway as it leads to N+1).
i.e. (with dapper)
List<table> list = ctx.Query<table>(@"
    select * from table
    where test = @someString
    and date >= @dateStartInt
    and date <= @dateEndInt
    OPTION (OPTIMIZE FOR UNKNOWN)",
    new { someString, dateStartInt, dateEndInt }).ToList();
or (LINQ-to-SQL):
List<table> list = ctx.ExecuteQuery<table>(@"
    select * from table
    where test = {0}
    and date >= {1}
    and date <= {2}
    OPTION (OPTIMIZE FOR UNKNOWN)",
    someString, dateStartInt, dateEndInt).ToList();

Improve .NET/MSSQL select & update performance

I'd like to increase performance of very simple select and update queries of .NET & MSSQL 2k8.
My queries always select or update a single row. The DB tables have indexes on the columns I query on.
My test .NET code looks like this:
public static MyData GetMyData(int accountID, string symbol)
{
    using (var cnn = new SqlConnection(connectionString))
    {
        cnn.Open();
        var cmd = new SqlCommand("MyData_Get", cnn);
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add(CreateInputParam("@AccountID", SqlDbType.Int, accountID));
        cmd.Parameters.Add(CreateInputParam("@Symbol", SqlDbType.VarChar, symbol));
        SqlDataReader reader = cmd.ExecuteReader();
        while (reader.Read())
        {
            var myData = new MyData();
            myData.ID = (int)reader["ID"];
            myData.A = (int)reader["A"];
            myData.B = reader["B"].ToString();
            myData.C = (int)reader["C"];
            myData.D = Convert.ToDouble(reader["D"]);
            myData.E = Convert.ToDouble(reader["E"]);
            myData.F = Convert.ToDouble(reader["F"]);
            return myData;
        }
    }
    return null; // no matching row
}
and the according stored procedure looks like this:
PROCEDURE [dbo].[MyData_Get]
    @AccountID int,
    @Symbol varchar(25)
AS
BEGIN
    SET NOCOUNT ON;
    SELECT p.ID, p.A, p.B, p.C, p.D, p.E, p.F
    FROM [MyData] AS p
    WHERE p.AccountID = @AccountID AND p.Symbol = @Symbol
END
What I'm seeing is that when I run GetMyData in a loop, querying MyData objects, I'm not exceeding about ~310 transactions/sec. I was hoping to achieve well over 1000 transactions/sec.
On the SQL Server side I'm not really sure what I can improve for such a simple query.
ANTS profiler shows me that on the .NET side, as expected, the bottleneck is cnn.Open and cmd.ExecuteReader; however, I have no idea how I could significantly improve my .NET code.
I've seen benchmarks though where people seem to easily achieve 10s of thousands transactions/sec.
Any advice on how I can significantly improve the performance for this scenario would be greatly appreciated!
Thanks,
Tom
EDIT:
Per MrLink's recommendation, adding TOP 1 to the SELECT query improved performance from about 310 to about 585 transactions/sec.
EDIT 2:
Arash N suggested adding WITH (NOLOCK) to the SELECT query, and that dramatically improved the performance! I'm now seeing around 2,500 transactions/sec.
EDIT 3:
Another slight optimization that I just did on the .NET side gained me another 150 transactions/sec: changing while (reader.Read()) to if (reader.Read()) surprisingly made quite a difference. On average I'm now seeing 2,719 transactions/sec.
Try using WITH (NOLOCK) in your SELECT statement to increase performance. This selects the row without locking it.
SELECT p.ID, p.A, p.B, p.C, p.D, p.E, p.F FROM [MyData] AS p WITH (NOLOCK) WHERE p.AccountID = @AccountID AND p.Symbol = @Symbol
Some things to consider.
First, you're not closing the server connection (cnn.Close();). Eventually it will get closed by the garbage collector, but until that happens you're creating a brand-new connection to the database every time rather than collecting one from the connection pool.
Second, do you have an index in SQL Server covering the AccountID and Symbol columns?
Third, while AccountID being an int is nice and fast, the Symbol column being varchar(25) is always going to be much slower. Can you change this to an int flag?
Make sure your database connections are actually pooling. If you are seeing a bottleneck in cnn.Open, there would seem to be a good chance they are not getting pooled.
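Pooling is on by default in System.Data.SqlClient, but it's worth ruling out a connection string that disables it; the values here are illustrative, not recommendations:
// A pooled connection string; Min Pool Size keeps warm connections around
// so cnn.Open() normally just fetches one from the pool.
var connectionString =
    "Data Source=myServer;Initial Catalog=myDb;Integrated Security=True;" +
    "Pooling=true;Min Pool Size=10;Max Pool Size=100";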
I was hoping to achieve well over a 1000 transactions/sec [when running GetMyData in a loop]
What you are asking for is for GetMyData to run in less than 1ms - this is just pointless optimisation! At the bare minimum this method involves a round trip to the database server (possibly involving network access) - you wouldn't be able to make this method much faster if your query was SELECT 1.
If you have a genuine requirement to make more requests per second then the answer is either to use multiple threads or to buy a faster PC.
There is absolutely nothing wrong with your code - I'm not sure where you have seen people managing 10,000+ transactions per second, but I'm sure this must have involved multiple concurrent clients accessing the same database server rather than a single thread managing to execute queries in less than a 10th of a ms!
Is your method called frequently? Could you batch your requests so you can open your connection, create your parameters, get the result and reuse them for several queries before closing the whole thing up again?
If the data is not frequently invalidated (updated) I would implement a cache layer. This is one of the most effective ways (if used correctly) to gain performance.
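A minimal cache-aside sketch over the GetMyData method from the question, using System.Runtime.Caching; the 30-second expiry is arbitrary:
// using System.Runtime.Caching;
public static MyData GetMyDataCached(int accountID, string symbol)
{
    var key = accountID + "|" + symbol;
    var cached = MemoryCache.Default.Get(key) as MyData;
    if (cached != null)
        return cached;

    var data = GetMyData(accountID, symbol);
    if (data != null)
        MemoryCache.Default.Set(key, data, DateTimeOffset.Now.AddSeconds(30));
    return data;
}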
You could use output parameters instead of a select since there's always a single row.
You could also create the SqlCommand ahead of time and re-use it. If you are executing a lot of queries in a single thread you can keep executing it on the same connection. If not, you could create a pool of them or do cmdTemplate.Clone() and set the Connection.
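The output-parameter idea would look roughly like this; MyData_GetOut is a hypothetical variant of the proc that returns each column as an OUTPUT parameter, and cnn is an already-open connection:
// One round trip and no result set to stream, just parameter values coming back.
using (var cmd = new SqlCommand("MyData_GetOut", cnn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.Add("@AccountID", SqlDbType.Int).Value = accountID;
    cmd.Parameters.Add("@Symbol", SqlDbType.VarChar, 25).Value = symbol;
    var id = cmd.Parameters.Add("@ID", SqlDbType.Int);
    id.Direction = ParameterDirection.Output;
    var d = cmd.Parameters.Add("@D", SqlDbType.Float);
    d.Direction = ParameterDirection.Output;
    // ...one OUTPUT parameter per remaining column
    cmd.ExecuteNonQuery();
    var myData = new MyData { ID = (int)id.Value, D = (double)d.Value };
}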
Try re-using the Command and Prepare'ing it first.
I can't say that it will definitely help, but it seems worth a try.
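A reuse sketch, assuming a single thread issuing many lookups; for a stored procedure the Prepare call is close to a no-op, so reusing the connection and command is the real win here ('requests' is a hypothetical collection):
using (var cnn = new SqlConnection(connectionString))
{
    cnn.Open();
    using (var cmd = new SqlCommand("MyData_Get", cnn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add("@AccountID", SqlDbType.Int);
        cmd.Parameters.Add("@Symbol", SqlDbType.VarChar, 25);
        cmd.Prepare();

        foreach (var request in requests)
        {
            cmd.Parameters["@AccountID"].Value = request.AccountID;
            cmd.Parameters["@Symbol"].Value = request.Symbol;
            using (var reader = cmd.ExecuteReader())
            {
                if (reader.Read())
                {
                    // map the row exactly as in GetMyData above
                }
            }
        }
    }
}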
In no particular order...
Have you (or your DBAs) examined the execution plan your stored procedure is getting? Has SQL Server cached a bogus execution plan (either due to oddball parameters or old stats)?
How often are statistics updated on the database?
Do you use temp tables in your stored procedure? If so, are they created up front? If not, you'll be doing a lot of recompiles, as creating a temp table invalidates the execution plan.
Are you using connection pooling? Opening/closing a SQL Server connection is an expensive operation.
Is your table clustered on AccountID and Symbol?
Finally...
Is there a reason you're hitting this table by account and symbol rather than, say, just retrieving all the data for an entire account in one fell swoop?
