Functions are slow when querying Azure Hyperscale secondary replica - c#

I have an ASP.NET Core application using EF and an Azure SQL database. We recently migrated the database to the Hyperscale service tier. The database has 2 vCores and 2 secondary replicas. When we have a function query a secondary replica (either by modifying the connection string to include ApplicationIntent=READONLY; or by using a new services.AddDbContext() from our Startup.cs), we find that functions take 20-30x longer to execute.
For instance, this function:
public async Task<List<StaffWorkMuchModel>> ExemptStaffWorkMuchPerWeek(int quarterId, int facilityId) {
    using (var dbConnection = (IDbConnection) _serviceProvider.GetService(typeof(IDbConnection))) {
        dbConnection.ConnectionString += "ApplicationIntent=READONLY;";
        dbConnection.Open();
        return (await dbConnection.QueryAsync<StaffWorkMuchModel>("ExemptStaffWorkMuchPerWeek", new {
            id_qtr = quarterId,
            id_fac = facilityId
        }, commandType: CommandType.StoredProcedure, commandTimeout: 150)).ToList();
    }
}
We have tried to query the secondary replica directly using SQL Server Management Studio and have found that the queries all return in less than a second. Also, when we add breakpoints in our code, it seems like the queries are returning results immediately. Most of the pages we are having issues with use ajax to call 4+ functions very similar to the one above. It almost seems like they are not running asynchronously.
This same code runs great when we comment out:
dbConnection.ConnectionString += "ApplicationIntent=READONLY;";
Any idea what could be causing all of our secondary replica functions to load so slowly?

Related

Strange timeouts with READ UNCOMMITTED

In our application, we need exclusive access to a certain place. It is a legacy application with no communication between client and server; all clients communicate directly with the database.
Edit: This is an implementation of a semaphore for SQL Server.
What we were using, and what worked with no problems, was this:
using (var sqlConnection = new SqlConnection(Parameters.ConnectionStringUncommited))
{
    sqlConnection.Open();
    using (var sqlTransaction = sqlConnection.BeginTransaction(IsolationLevel.ReadUncommitted))
    {
        using (var sqlCommand = new DstSqlCommand($"SELECT COUNT(*) FROM Locks WHERE Id = {IDTOCHECK}", sqlConnection, sqlTransaction))
        {
            if ((int)sqlCommand.ExecuteScalar() > 0)
                ShowMessage("Locked");
            else
            {
                sqlCommand.CommandText = $"INSERT INTO Locks(Id) VALUES({IDTOCHECK})";
                sqlCommand.ExecuteNonQuery();
                // Here we do a lot of work; the lock can be held for a long time, even minutes, because we show windows and things
                // We never commit the transaction, we just roll it back
                sqlTransaction.Rollback();
            }
        }
    }
}
This was working perfectly with our version using SQL Server 2008 R2 and .NET Framework 3.5. We just upgraded to SQL Server 2014 and .NET Framework 4.8, and we now sometimes get timeouts in the SELECT COUNT(*)... statement. The problem only happens at our customers' sites, so debugging is very hard.
I really don't get it: the transaction has an isolation level of read uncommitted and the code has not changed. What is really happening here?
Just to update this: with sp_getapplock I have no problems, so I am marking this as the answer.
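For readers wondering what the sp_getapplock approach might look like from C#, here is a minimal sketch, not the asker's actual code. It assumes an already open connection. The AppLock class, the "Locks:<id>" resource naming and the timeout parameter are made up for illustration. The parameter names and return codes (0 or 1 when the lock is granted, a negative value otherwise) are the ones documented for sp_getapplock.

```csharp
using System;
using System.Data;
using System.Data.Common;

public static class AppLock
{
    // sp_getapplock returns 0 or 1 when the lock is granted and a negative value otherwise.
    public static bool IsGranted(int returnCode)
    {
        return returnCode >= 0;
    }

    // Tries to take an exclusive, session-owned application lock on an open connection.
    // The resource name format ("Locks:<id>") is a hypothetical convention.
    public static bool TryAcquire(DbConnection openConnection, int idToCheck, int timeoutMs)
    {
        using (DbCommand command = openConnection.CreateCommand())
        {
            command.CommandText = "sp_getapplock";
            command.CommandType = CommandType.StoredProcedure;
            AddParameter(command, "@Resource", "Locks:" + idToCheck);
            AddParameter(command, "@LockMode", "Exclusive");
            AddParameter(command, "@LockOwner", "Session");
            AddParameter(command, "@LockTimeout", timeoutMs);

            // The stored procedure reports success or failure through its return value.
            DbParameter result = command.CreateParameter();
            result.ParameterName = "@ReturnValue";
            result.DbType = DbType.Int32;
            result.Direction = ParameterDirection.ReturnValue;
            command.Parameters.Add(result);

            command.ExecuteNonQuery();
            return IsGranted((int)result.Value);
        }
    }

    private static void AddParameter(DbCommand command, string name, object value)
    {
        DbParameter parameter = command.CreateParameter();
        parameter.ParameterName = name;
        parameter.Value = value;
        command.Parameters.Add(parameter);
    }
}
```

A session-owned lock is held until sp_releaseapplock is called or the session ends, so the same connection would have to stay open for as long as the lock is needed, but no open transaction is required while the windows are shown.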

Hangfire causing locks in SQL Server

We are using Hangfire 1.7.2 within our ASP.NET Web project with SQL Server 2016. We have around 150 sites on our server, with each site using Hangfire 1.7.2. We noticed that when we upgraded these sites to use Hangfire, the DB server collapsed. Checking the DB logs, we found multiple locking queries. We identified one RPC event, “sys.sp_getapplock;1”, in all the blocking sessions. It seems like Hangfire is locking our DB, rendering the whole DB unusable. We noticed 670+ locking queries because of Hangfire.
This could possibly be due to these properties we setup:
SlidingInvisibilityTimeout = TimeSpan.FromMinutes(30),
QueuePollInterval = TimeSpan.FromHours(5)
Each site has around 20 background jobs, a few of them run every minute, whereas others every hour, every 6 hours and some once a day.
I have searched the documentation but could not find anything which could explain these two properties or how to set them to avoid DB locks.
Looking for some help on this.
EDIT: The following queries are executed every second:
exec sp_executesql N'select count(*) from [HangFire].[Set] with (readcommittedlock, forceseek) where [Key] = @key',N'@key nvarchar(4000)',@key=N'retries'
select distinct(Queue) from [HangFire].JobQueue with (nolock)
exec sp_executesql N'select count(*) from [HangFire].[Set] with (readcommittedlock, forceseek) where [Key] = @key',N'@key nvarchar(4000)',@key=N'retries'
irrespective of the various combinations of timespan values we set. Here is the code of the GetHangfireServers method we are using:
public static IEnumerable<IDisposable> GetHangfireServers()
{
    // Reference for GlobalConfiguration.Configuration: http://docs.hangfire.io/en/latest/getting-started/index.html
    // Reference for UseSqlServerStorage: http://docs.hangfire.io/en/latest/configuration/using-sql-server.html#configuring-the-polling-interval
    GlobalConfiguration.Configuration
        .SetDataCompatibilityLevel(CompatibilityLevel.Version_170)
        .UseSimpleAssemblyNameTypeSerializer()
        .UseRecommendedSerializerSettings()
        .UseSqlServerStorage(ConfigurationManager.ConnectionStrings["abc"]
            .ConnectionString, new SqlServerStorageOptions
            {
                CommandBatchMaxTimeout = TimeSpan.FromMinutes(5),
                SlidingInvisibilityTimeout = TimeSpan.FromMinutes(30),
                QueuePollInterval = TimeSpan.FromHours(5), // Hangfire will poll after 5 hrs to check failed jobs.
                UseRecommendedIsolationLevel = true,
                UsePageLocksOnDequeue = true,
                DisableGlobalLocks = true
            });
    // Reference: https://docs.hangfire.io/en/latest/background-processing/configuring-degree-of-parallelism.html
    var options = new BackgroundJobServerOptions
    {
        WorkerCount = 5
    };
    var server = new BackgroundJobServer(options);
    yield return server;
}
The worker count is set just to 5.
There are just 4 jobs and even those are completed (SELECT * FROM [HangFire].[State]):
Do you have any idea why Hangfire is issuing so many queries every second?
We faced this issue in one of our projects. The Hangfire dashboard is pretty read-heavy, and it polls the Hangfire DB very frequently to refresh job status.
The best solution that worked for us was to have a dedicated Hangfire database.
That way you isolate the application queries from the Hangfire queries, and your application queries won't be affected by the Hangfire server and dashboard queries.
There is a newer configuration option called SlidingInvisibilityTimeout for SqlServerStorage that causes these database locks as part of the newer non-transactional message fetching algorithm. It is meant for long-running jobs that may cause transaction log backups to error out (because a database transaction stays active for the duration of the long-running job).
.UseSqlServerStorage(
    "connection_string",
    new SqlServerStorageOptions { SlidingInvisibilityTimeout = TimeSpan.FromMinutes(5) });
Our DBA did not like the database locks, so I just removed this SlidingInvisibilityTimeout option to use the old transactional based message fetching algorithm since I didn't have any long running jobs in my queue.
Whether you enable this option or not is dependent on your situation. You may want to consider moving your queue database outside of your application database if it isn't already and enable the SlidingInvisibilityTimeout option. If your DBA can't live with the locks even if the queue is a separate database, then maybe you could refactor your tasks into many more smaller tasks that are shorter lived. Just some ideas.
https://www.hangfire.io/blog/2017/06/16/hangfire-1.6.14.html
SqlServerStorage runs Install.sql which takes an exclusive schema lock on the Hangfire-schema.
DECLARE @SchemaLockResult INT;
EXEC @SchemaLockResult = sp_getapplock @Resource = '$(HangFireSchema):SchemaLock',
    @LockMode = 'Exclusive'
From the Hangfire documentation:
"SQL Server objects are installed automatically from the SqlServerStorage constructor by executing statements
described in the Install.sql file (which is located under the tools folder in the NuGet package). Which contains
the migration script, so new versions of Hangfire with schema changes can be installed seamlessly, without your
intervention."
If you don't want to run this script every time, you could set SqlServerStorageOptions.PrepareSchemaIfNecessary to false.
var options = new SqlServerStorageOptions
{
    PrepareSchemaIfNecessary = false
};
var sqlServerStorage = new SqlServerStorage(connectionstring, options);
Instead run the Install.sql manually by using this line:
SqlServerObjectsInstaller.Install(connection);

Deadlock when parallel DB call [duplicate]

When creating a report I have to execute 3 queries that involve separate entities of the same context. Because they are quite heavy, I decided to use .ToListAsync() in order to have them run in parallel but, to my surprise, I get an exception out of it...
What is the correct way to perform queries in parallel using EF 6? Should I manually start new Tasks?
Edit 1
The code is basically
using(var MyCtx = new MyCtx())
{
    var r1 = MyCtx.E1.Where(bla bla bla).ToListAsync();
    var r2 = MyCtx.E2.Where(ble ble ble).ToListAsync();
    var r3 = MyCtx.E3.Where(ble ble ble).ToListAsync();
    Task.WhenAll(r1,r2,r3);
    DoSomething(r1.Result, r2.Result, r3.Result);
}
The problem is this:
EF doesn't support processing multiple requests through the same DbContext object. If your second asynchronous request on the same DbContext instance starts before the first request finishes (and that's the whole point), you'll get an error message that your request is processing against an open DataReader.
Source: https://visualstudiomagazine.com/articles/2014/04/01/async-processing.aspx
You will need to modify your code to something like this:
async Task<List<E1Entity>> GetE1Data()
{
    using(var MyCtx = new MyCtx())
    {
        return await MyCtx.E1.Where(bla bla bla).ToListAsync();
    }
}
async Task<List<E2Entity>> GetE2Data()
{
    using(var MyCtx = new MyCtx())
    {
        return await MyCtx.E2.Where(bla bla bla).ToListAsync();
    }
}
async Task DoSomething()
{
    var t1 = GetE1Data();
    var t2 = GetE2Data();
    await Task.WhenAll(t1,t2);
    DoSomething(t1.Result, t2.Result);
}
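The one-context-per-operation idea can also be tried out without a database. The following is only a sketch under stated assumptions: FakeContext is a hypothetical stand-in for a DbContext (it simply throws if a second operation starts before the first finishes), and Report, LoadAsync and BuildAsync are illustrative names, not EF APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a DbContext: it rejects overlapping operations.
public sealed class FakeContext : IDisposable
{
    private int _busy;

    public async Task<List<int>> QueryAsync(int seed)
    {
        if (Interlocked.Exchange(ref _busy, 1) == 1)
            throw new InvalidOperationException("A second operation started on this context.");
        try
        {
            await Task.Delay(50); // simulated database round trip
            return new List<int> { seed, seed + 1 };
        }
        finally
        {
            Interlocked.Exchange(ref _busy, 0);
        }
    }

    public void Dispose() { }
}

public static class Report
{
    // Each call creates and disposes its own context, so calls can overlap safely.
    private static async Task<List<int>> LoadAsync(int seed)
    {
        using (var context = new FakeContext())
        {
            return await context.QueryAsync(seed);
        }
    }

    public static async Task<int> BuildAsync()
    {
        // Start all three queries first, then await them together.
        Task<List<int>> t1 = LoadAsync(10);
        Task<List<int>> t2 = LoadAsync(20);
        Task<List<int>> t3 = LoadAsync(30);
        List<int>[] results = await Task.WhenAll(t1, t2, t3);
        return results[0].Count + results[1].Count + results[2].Count;
    }
}
```

Awaiting Task.WhenAll (rather than calling it and discarding the returned task) is what guarantees that all results are available before they are read.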
As a matter of interest, when using EF Core with Oracle, multiple parallel operations like the post here using a single DB context work without issue (despite Microsoft's documentation). The limitation is in the Microsoft.EntityFrameworkCore.SqlServer.dll driver, and is not a generalized EF issue. The corresponding Oracle.EntityFrameworkCore.dll driver doesn't have this limitation.
Check out https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql/enabling-multiple-active-result-sets
From the documentation:
Statement interleaving of SELECT and BULK INSERT statements is
allowed. However, data manipulation language (DML) and data definition
language (DDL) statements execute atomically.
Then your above code works and you get the performance benefits for reading data.

How to avoid .NET Connection Pool timeouts when inserting 37k rows

I'm trying to figure out the best way to batch insert about 37k rows into my SQL Server using Dapper.
My problem is that when I use Parallel.ForEach, the number of connections to the database increases over a short period of time, finally hitting nearly or about 100, which gives connection pool errors. If I force the max degree of parallelism, then it hits that max number and stays there.
Setting the max degree feels wrong.
It currently is doing about 10-20 inserts a second. This is also in a simple Console App, so there's no other database activity besides what's happening in my Parallel.ForEach loop.
Is using Parallel.ForEach the incorrect thing in this case because this is not CPU-bound?
Should I be using async/await? If so, what's stopping this from doing hundreds of db calls in one go?
Sample code which is basically what I'm doing.
var items = GetItemsFromSomewhere(); // Returns 37K items.
Parallel.ForEach(items, item =>
{
    using (var sqlConnection = new SqlConnection(_connectionString))
    {
        var result = sqlConnection.Execute(myQuery, new { ... } );
    }
});
My (incorrect) understanding of this was that there should only be about 8 or so connections to the db at any time. The Connection Pool will release the connection (which remains instantiated in the Connection Pool, waiting to be used). And if the Execute takes .. I dunno .. let's say even 1 second (the longest running time for an insert was about 500ms .. and that's 1 in every 100 or so) ... that's ok .. that thread is blocked and chills until the Execute completes. Then the scope completes (and Dispose is auto called) and the connection is closed. With the connection closed, the Parallel.ForEach then grabs the next item in the collection, goes to the connection pool and grabs a spare connection (remember - we just closed one, a split second ago) ... rinse, repeat.
Is this wrong?
Notes:
.NET 4.5
Sql 2012
Console app.
Using Dapper.NET for sql code.
First of all: if it is about performance, use SqlBulkCopy. This works with SQL Server. If you are using other database servers, they might have their own SqlBulkCopy-like solution (Oracle has one).
SqlBulkCopy works like a bulk select: one statement opens one connection and streams all the data from the server to the client. With an insert, it works the other way around: it streams all the new records from the client to the server.
See: https://msdn.microsoft.com/en-us/library/ex21zs8x(v=vs.110).aspx
If you insist on using parallelism, you might want to consider the following code:
void BulkInsert<T>(object p)
{
    IEnumerator<T> e = (IEnumerator<T>)p;
    using (var sqlConnection = new SqlConnection(_connectionString))
    {
        while(true)
        {
            T item;
            lock(e)
            {
                if (!e.MoveNext())
                    return;
                item = e.Current;
            }
            var result = sqlConnection.Execute(myQuery, new { ... } );
        }
    }
}
Now create your own threads and invoke this method on these threads with one and the same parameter: the iterator which runs through your collection. Each thread opens its own connection once, starts inserting, and closes the connection after all items are inserted. This solution uses as many connections as the number of threads you created.
PS: Multiple variants of above code are possible . You could call it from background threads, from Tasks, etc. I hope you get the point.
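As one such variant, and as a possible answer to the "what's stopping this from doing hundreds of db calls in one go" part of the question, here is a hedged sketch of async/await throttled with a SemaphoreSlim. ThrottledInserts, RunAsync and the insertAsync delegate are illustrative names; in a real application the delegate could open a SqlConnection and call Dapper's ExecuteAsync, which is an assumption about how it would be wired up, not code from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledInserts
{
    // Runs insertAsync for every item, with at most maxConcurrency calls in flight at once.
    public static async Task RunAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> insertAsync)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync();   // wait for a free slot without blocking a thread
                try
                {
                    await insertAsync(item);
                }
                finally
                {
                    gate.Release();       // free the slot even if the insert failed
                }
            }).ToList();

            await Task.WhenAll(tasks);
        }
    }
}
```

With maxConcurrency set below the pool's maximum size, the number of connections in use at any moment should stay bounded by that value, since each slot holds at most one open connection.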
You should use SqlBulkCopy instead of inserting one by one. Faster and more efficient.
https://msdn.microsoft.com/en-us/library/ex21zs8x(v=vs.110).aspx
credits to the answer owner
Sql Bulk Copy/Insert in C#

C# Multithreaded application and SQL connections help

I need some advice regarding an application I wrote. The issues I am having are due to my DAL and connections to my SQL Server 2008 database not being closed, however I have looked at my code and each connection is always being closed.
The application is a multithreaded application that retrieves a set of records and while it processes a record it updates information about it.
Here is the flow:
The administrator has the ability to set the number of threads to run and how many records per thread to pull.
Here is the code that runs after they click start:
Adapters are abstractions over my DAL. Here is a sample of what they look like:
public class UserDetailsAdapter: IDataAdapter<UserDetails>
{
    private IUserDetailFactory _factory;
    public UserDetailsAdapter()
    {
        _factory = new CampaignFactory();
    }
    public UserDetails FindById(int id){
        return _factory.FindById(id);
    }
}
As soon as the _factory is called it processes the SQL and immediately closes the connection.
Code For Threaded App:
private int _recordsPerthread;
private int _threadCount;
public void RunDetails()
{
    // Create an adapter instance that is an abstraction
    // of the data factory layer
    var adapter = new UserDetailsAdapter();
    for (var i = 1; i <= _threadCount; i++)
    {
        // This adapter makes a call to the database to pull X records and
        // set a lock field so the next set of records that are pulled are different.
        var details = adapter.FindTopDetailsInQueue(_recordsPerthread);
        if (details != null)
        {
            var parameters = new ArrayList {i, details};
            ThreadPool.QueueUserWorkItem(ThreadWorker, parameters);
        }
        else
        {
            break;
        }
    }
}
private void ThreadWorker(object parametersList)
{
    var parms = (ArrayList) parametersList;
    var threadCount = (int) parms[0];
    var details = (List<UserDetails>) parms[1];
    var adapter = new DetailsAdapter();
    // We keep running until there are no records left in the database
    while (!_noRecordsInPool)
    {
        foreach (var detail in details)
        {
            var userAdapter = new UserAdapter();
            var domainAdapter = new DomainAdapter();
            var user = userAdapter.FindById(detail.UserId);
            var domain = domainAdapter.FindById(detail.DomainId);
            //...do some work here......
            adapter.Update(detail);
        }
        if (!_noRecordsInPool)
        {
            details = adapter.FindTopDetailsInQueue(_recordsPerthread);
            if (details == null || details.Count <= 0)
            {
                _noRecordsInPool = true;
                break;
            }
        }
    }
}
The app crashes because there seem to be connection issues to the database. Looking in my log files for the DAL I am seeing this:
Timeout expired. The timeout period
elapsed prior to obtaining a
connection from the pool. This may
have occurred because all pooled
connections were in use and max pool
size was reached
When I run this in one thread it works fine. I am guessing that when I run this in multiple threads I am making too many connections to the DB. Any thoughts on how I can keep this running in multiple threads and make sure the database doesn’t give me any errors?
Update:
I am thinking my issues may be deadlocks in my database. Here is the SQL code that is running when I get a deadlock error:
WITH cte AS (
    SELECT TOP (@topCount) *
    FROM dbo.UserDetails WITH (READPAST)
    WHERE IsLocked = 0)
UPDATE cte
SET IsLocked = 1
OUTPUT INSERTED.*;
I have never had issues with this code before (in other applications). I reorganized my indexes as they were 99% fragmented. That didn't help. I am at a loss here.
I'm confused as to where in your code connections get opened, but you probably want your data adapters to implement IDisposable (making sure to close the pooled connection as you leave the using scope) and wrap your code in using blocks:
using (var adapter = new UserDetailsAdapter())
{
    for (var i = 1; i <= _threadCount; i++)
    {
        [..]
    }
} // adapter leaves scope here; connection is implicitly marked as no longer necessary
ADO.NET uses connection pooling, so there's no need to (and it can be counter-productive to) explicitly open and close connections.
It is not clear to me how you actually connect to the database. The adapter must reference a connection.
How do you actually initialize that connection?
If you use a new adapter for each thread, you must use a new connection for each adapter.
I am not too familiar with your environment, but I am certain that you really need a lot of open connections before your DB starts complaining about it!
Well, after doing some research I found that there might be a bug in SQL server 2008 and running parallel queries. I’ll have to dig up the link where I found the discussion on this, but I ended up running this on my server:
sp_configure 'max degree of parallelism', 1;
GO
RECONFIGURE WITH OVERRIDE;
GO
This can decrease your server performance, overall, so it may not be an option for some people, but it worked great for me.
For some queries I added the OPTION (MAXDOP n) query hint (n being the number of processors to utilize) so they can run more efficiently. It did help a bit.
Secondly, I found out that my DAL’s Dispose method was using the GC.SuppressFinalize method. So my finally sections were not firing properly in my DAL and were not closing out my connections.
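To illustrate the Dispose point, here is a minimal sketch assuming a hypothetical adapter that owns a single connection-like resource (OwningAdapter is not a type from the question). The cleanup happens in Dispose itself; GC.SuppressFinalize only tells the runtime that no finalizer needs to run for this object afterwards, it does not close anything on its own.

```csharp
using System;

// Hypothetical adapter that owns one connection-like resource.
public sealed class OwningAdapter : IDisposable
{
    private readonly IDisposable _connection;
    private bool _disposed;

    public OwningAdapter(IDisposable connection)
    {
        if (connection == null) throw new ArgumentNullException("connection");
        _connection = connection;
    }

    public void Dispose()
    {
        if (_disposed) return;       // safe to call more than once
        _connection.Dispose();       // release the owned resource deterministically
        _disposed = true;
        GC.SuppressFinalize(this);   // only skips finalization; the cleanup is the line above
    }
}
```

Used inside a using block, such an adapter releases its resource as soon as the block ends, regardless of when the garbage collector runs.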
Thanks to all who gave their input!
