When creating a report I have to execute 3 queries that involve separate entities of the same context. Because they are quite heavy, I decided to use .ToListAsync() in order to have them run in parallel but, to my surprise, I get an exception out of it...
What is the correct way to perform queries in parallel using EF 6? Should I manually start new Tasks?
Edit 1
The code is basically
using(var MyCtx = new MyCtx())
{
    var r1 = MyCtx.E1.Where(bla bla bla).ToListAsync();
    var r2 = MyCtx.E2.Where(ble ble ble).ToListAsync();
    var r3 = MyCtx.E3.Where(ble ble ble).ToListAsync();
    Task.WhenAll(r1,r2,r3);
    DoSomething(r1.Result, r2.Result, r3.Result);
}
The problem is this:
EF doesn't support processing multiple requests through the same DbContext object. If your second asynchronous request on the same DbContext instance starts before the first request finishes (and that's the whole point), you'll get an error message that your request is processing against an open DataReader.
Source: https://visualstudiomagazine.com/articles/2014/04/01/async-processing.aspx
You will need to modify your code to something like this:
async Task<List<E1Entity>> GetE1Data()
{
    using(var MyCtx = new MyCtx())
    {
        return await MyCtx.E1.Where(bla bla bla).ToListAsync();
    }
}
async Task<List<E2Entity>> GetE2Data()
{
    using(var MyCtx = new MyCtx())
    {
        return await MyCtx.E2.Where(bla bla bla).ToListAsync();
    }
}
async Task DoSomething()
{
    var t1 = GetE1Data();
    var t2 = GetE2Data();
    await Task.WhenAll(t1,t2);
    DoSomething(t1.Result, t2.Result);
}
As a matter of interest, when using EF Core with Oracle, multiple parallel operations like the post here using a single DB context work without issue (despite Microsoft's documentation). The limitation is in the Microsoft.EntityFrameworkCore.SqlServer.dll driver, and is not a generalized EF issue. The corresponding Oracle.EntityFrameworkCore.dll driver doesn't have this limitation.
Check out https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql/enabling-multiple-active-result-sets
From the documentation:
Statement interleaving of SELECT and BULK INSERT statements is
allowed. However, data manipulation language (DML) and data definition
language (DDL) statements execute atomically.
With MARS enabled in the connection string (MultipleActiveResultSets=True), your code above works and you get the performance benefits for reading data.
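MARS is opt-in, so the quoted behaviour only applies once it is switched on for the connection. A minimal sketch, assuming System.Data.SqlClient; the server and database names are placeholders and not taken from the question:

```csharp
using System;
using System.Data.SqlClient;

public static class MarsExample
{
    // Builds a connection string with MARS switched on.
    // "localhost" and "MyDb" are hypothetical placeholder values.
    public static string BuildConnectionString()
    {
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = "localhost",
            InitialCatalog = "MyDb",
            IntegratedSecurity = true,
            MultipleActiveResultSets = true // the MARS switch
        };
        return builder.ConnectionString;
    }
}
```

MARS only lifts the open-DataReader restriction on the connection. The EF documentation still describes DbContext as not thread-safe, so it is worth testing whether overlapping operations on one context behave as expected before relying on this.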
Related
For the past few days I have been looking for a way to make my code fully asynchronous, so that when it is called by a REST API I'll get an immediate response while the process runs in the background.
To do that I simply used
tasks.Add(Task<bool>.Run( () => WholeProcessFunc(parameter) ))
where WholeProcessFunc is the function that makes all the calculations (it may be computationally intensive).
It works as expected; however, I read that it is not optimal to wrap the whole process in a Task.Run.
My code needs to run several Entity Framework queries, each of which depends on the result of the previous one, and it also contains a foreach loop.
For instance, I can't work out what the best practice is for making a function like this async:
public async Task<List<float>> func()
{
    List<float> acsi = new List<float>();
    using (var db = new EFContext())
    {
        long[] ids = await db.table1.Join(db.table2 /*,...*/)
            .Where(/*...*/)
            .Select(/*...*/).ToArrayAsync();
        foreach (long id in ids)
        {
            var all = db.table1.Join(/*...*/)
                .Where(/*...*/);
            float acsi_temp = await all.OrderByDescending(/*...*/)
                .Select(/*...*/).FirstAsync();
            if (acsi_temp < 0) { break; }
            acsi.Add(acsi_temp);
        }
    }
    return acsi;
}
In particular I have difficulties with the foreach loop and the fact that the result of one query is used in the next.
Finally, there is the break statement, which I don't know how to translate. I read about cancellation tokens; could that be the way?
Is wrapping this whole function in a Task.Run a solid solution?
For the past few days I have been looking for a way to make my code fully asynchronous, so that when it is called by a REST API I'll get an immediate response while the process runs in the background.
Well, that's one meaning of the word "asynchronous". Unfortunately, it's completely different from the kind of "asynchronous" that async/await provides. async yields to the thread pool, not to the client (browser).
It works as expected; however, I read that it is not optimal to wrap the whole process in a Task.Run.
It only seems to work as expected. It's likely that once your web site gets higher load, it will start to fail. It's certain that once your web site gets busier and you do things like rolling upgrades, it will start to fail.
Is wrapping this whole function in a Task.Run a solid solution?
Not at all. Fire-and-forget is inherently dangerous.
A proper solution should be a basic distributed architecture:
A durable queue, such as an Azure Queue or RabbitMQ (if properly configured to be durable).
An independent processor, such as an Azure Function or Win32 Service.
Then the ASP.NET app will encode the work to be done into a queue message, enqueue that to the durable queue, and then return. Some time later, the processor will retrieve the message from that queue and do the actual work.
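The enqueue-then-return flow described above can be sketched as follows. IWorkQueue, InMemoryWorkQueue and ReportFlow are hypothetical names invented for this illustration; the in-memory queue is only a stand-in so the sample runs on its own, and it is not durable, which is exactly the property a real queue would add:

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical abstraction over a durable queue (Azure Queue, RabbitMQ, ...).
public interface IWorkQueue
{
    void Enqueue(string message);
    bool TryDequeue(out string message);
}

// In-memory stand-in. NOT durable: messages are lost when the process exits.
public sealed class InMemoryWorkQueue : IWorkQueue
{
    private readonly ConcurrentQueue<string> _messages = new ConcurrentQueue<string>();
    public void Enqueue(string message) => _messages.Enqueue(message);
    public bool TryDequeue(out string message) => _messages.TryDequeue(out message);
}

public static class ReportFlow
{
    // Web side: encode the request, enqueue it, and return immediately.
    public static string AcceptRequest(IWorkQueue queue, int parameter)
    {
        var jobId = Guid.NewGuid().ToString("N");
        queue.Enqueue(jobId + ":" + parameter);
        return jobId; // the HTTP response would carry this id back to the caller
    }

    // Processor side: in the real design this runs in a separate process.
    public static int ProcessPending(IWorkQueue queue, Action<string, int> doWork)
    {
        var processed = 0;
        while (queue.TryDequeue(out var message))
        {
            var parts = message.Split(':');
            doWork(parts[0], int.Parse(parts[1]));
            processed++;
        }
        return processed;
    }
}
```

The point of the split is that the web request never runs the heavy work itself; it only records that the work is needed.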
You can rewrite your code to return an IAsyncEnumerable<...>, so that the caller can process the results as they are obtained. In an ASP.NET 5 MVC endpoint, this includes writing serialised JSON to the browser:
public async IAsyncEnumerable<float> func()
{
    using (var db = new EFContext())
    {
        //...
        foreach (long id in ids)
        {
            //...
            if(acsi_temp<0) { yield break; }
            yield return acsi_temp;
        }
    }
}
public async Task<IActionResult> ControllerAction(){
    if (...)
        return NotFound();
    return Ok(func());
}
Note that if your endpoint is itself an async IAsyncEnumerable coroutine, then in ASP.NET 5 your headers would be flushed before your action even started, giving you no way to return any HTTP error codes.
For performance, though, you should try to rework your queries so that you can fetch all the data up front.
I have an ASP.NET Core application using EF and an Azure SQL database. We recently migrated the database to the Hyperscale service tier. The database has 2 vCores and 2 secondary replicas. When we have a function query a secondary replica (either by modifying the connection string to include ApplicationIntent=READONLY; or by using a new services.AddDbContext() from our Startup.cs), we find that functions take 20-30x longer to execute.
For instance, this function:
public async Task<List<StaffWorkMuchModel>> ExemptStaffWorkMuchPerWeek(int quarterId, int facilityId) {
    using (var dbConnection = (IDbConnection) _serviceProvider.GetService(typeof(IDbConnection))) {
        dbConnection.ConnectionString += "ApplicationIntent=READONLY;";
        dbConnection.Open();
        return (await dbConnection.QueryAsync<StaffWorkMuchModel>("ExemptStaffWorkMuchPerWeek", new {
            id_qtr = quarterId,
            id_fac = facilityId
        }, commandType: CommandType.StoredProcedure, commandTimeout: 150)).ToList();
    }
}
We have tried to query the secondary replica directly using SQL Server Management Studio and have found that the queries all return in less than a second. Also, when we add breakpoints in our code, it seems like the queries are returning results immediately. Most of the pages we are having issues with use ajax to call 4+ functions very similar to the one above. It almost seems like they are not running asynchronously.
This same code runs great when we comment out:
dbConnection.ConnectionString += "ApplicationIntent=READONLY;";
Any idea what could be causing all of our secondary replica functions to load so slowly?
I've written some code to write some data to SQL Server, using Dapper. I don't need to wait for this write to complete before continuing other work, so I want to use Task.Run() to make this asynchronous.
I have using statements for calling this in the rest of my system:
using (IDataAccess ida = new DAL())
{
    ida.WriteMessageToDB(id, routingKey, msgBody);
}
My DAL automatically checks the dbConnection.State when the using statement is run, and attempts a simple fix if it's closed. This works just fine for any non-async/TPL select calls.
However, when I throw a load of writes at it at the same time, the Task.Run() code falls over because the connection is closed for some of them. Essentially, I think the parallel nature of the code meant the connection was being closed by other tasks.
I 'fixed' this by checking Connection.State and opening the connection within the Task.Run() code, and this appears to have 'solved' the problem. Like so:
Task.Run(() =>
{
    if (dbConnection.State == ConnectionState.Closed)
    {
        dbConnection.Open();
    }
    if (dbConnection.State == ConnectionState.Open)
    {
        *Dapper SQL String and Execute Commands*
    }
});
When I run SELECT * FROM sys.dm_exec_connections from SSMS after this, I see a lot more connections. Is that to be expected?
Now as I understand it:
Dapper doesn't deal with connection pooling
SQL Server should automatically deal with connection pooling?
Is there anything wrong with this solution? Or a better way of doing it? I'd like to use connection pooling for obvious reasons, and as painlessly as possible.
Thanks in advance.
Thanks Juharr - I've upvoted your reply.
For reference to others, I changed the write function to use await and Dapper's async API:
private async Task WriteMessageToDB(Guid id, string tableName, string jsonString)
{
    string sql = *Redacted*
    await dbConnection.ExecuteScalarAsync<int>(sql, new { ID = id, Body = jsonString });
}
And then created a new task in the caller that monitors the outcome.
This is working consistently under load, and I'm not seeing excessive new connections being created either.
I am consuming a web service provided to me by a vendor in a C# application. This application calls a web method in a loop, and that slows down performance. To get the complete set of results, it takes more than an hour.
Can I apply multi threading on my side to consume this web service in multiple threads and combine the results together?
Is there any better approach to retrieve data in minutes instead of hours?
First of all, you have to make sure your vendor actually supports this and does not prohibit it (which is quite possible).
The code itself is fairly straightforward, using a method such as Parallel.For.
Simple Example (google.com):
Parallel.For(0, norequests,
    i => {
        //Code that does your request goes here
    } );
Explanation:
In a Parallel.For loop, iterations can run in parallel on multiple threads (as the name implies) rather than one after another, which could potentially provide a very significant increase in performance.
Further reading:
MSDN on Parallel.For loops
You should really ask your vendor. We can only speculate about why it takes that long or if firing multiple requests will actually yield the same results as the one that takes long.
Basically, sending one request getting one response should beat the multi-threaded variant because it should be easier to optimize on the servers side.
If you want to know why this is not the case with the current version of the service, ask the vendor.
This is only a sample of calling web services in parallel:
private void TestParallelForeach()
{
    string[] uris = {"http://192.168.1.2", "http://192.168.1.3", "http://192.168.1.4"};
    var results = new List<string>();
    var syncObj = new object();
    Parallel.ForEach(uris, uri =>
    {
        using (var webClient = new WebClient())
        {
            webClient.Encoding = Encoding.UTF8;
            try
            {
                var result = webClient.DownloadString(uri);
                lock (syncObj)
                {
                    results.Add(result);
                }
            }
            catch (Exception ex)
            {
                // Do error handling here...
            }
        }
    });
    // Do with "results" here....
}
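Because web service calls are I/O-bound, an alternative to Parallel.ForEach is to start the calls asynchronously and cap how many are in flight at once. This is a different technique from the sample above, sketched under assumptions: ThrottledCalls and RunAsync are hypothetical names, and the fetch delegate stands in for the vendor call, so nothing here reflects the vendor's actual API or limits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledCalls
{
    // Runs fetch for every input with at most maxConcurrency calls in flight.
    // Results come back in the same order as the inputs.
    public static async Task<TResult[]> RunAsync<TInput, TResult>(
        IEnumerable<TInput> inputs,
        Func<TInput, Task<TResult>> fetch,
        int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = inputs.Select(async input =>
            {
                await gate.WaitAsync();          // wait for a free slot
                try { return await fetch(input); }
                finally { gate.Release(); }      // free the slot even on failure
            }).ToList(); // ToList creates every task now; the gate limits how many proceed

            return await Task.WhenAll(tasks);
        }
    }
}
```

A caller would pass its own request function, for example `await ThrottledCalls.RunAsync(ids, id => CallVendorAsync(id), 4)`, where CallVendorAsync is whatever async wrapper exists around the web method. The concurrency cap is the knob to agree with the vendor.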
I just tried to make part of my ASP.NET MVC application asynchronous, but even after reading and trying out a lot I don't really understand the async-await pattern, and I hope someone can give me a hint.
Basically I have the following:
A JavaScript call to my controller which fetches a partial view for a chart (this happens several times after page load, for a lot of charts)
// Load content of one chart
my.loadChartContent = function (data, callback) {
    $.post("/Dashboard/GetChartContent/", data, function (datain) {
        if (isFunction(callback))
            callback(datain);
    });
};
A controller action which calls a database method in another class
public ActionResult GetChartContent(int id, bool isDraft = false)
{
    //do something
    //...
    var chartdata = _dataService.GetDataForChart(chart, isDraft, _user.Id); //long running query
    //do something with chartdata
    return View(chartdata);
}
The data class (_dataService) which fetches the data from the database with a SqlDataReader and loads a DataTable with that data.
The problem is that although the JavaScript is executed asynchronously, the controller actions seem to be blocked until a result from the DataService class returns. I would like to start all queries to the database and wait for the results asynchronously, so that long-running queries don't block shorter ones. (In SQL Server Profiler I see the queries as Begin-End, Begin-End, Begin-End, but it should be Begin-Begin-Begin, End-End-End.)
Where should I use async-await? Is it enough to use it (somehow) for my controller action or is it necessary to make the whole "call-chain" asynchronous?
Update:
When I use SqlConnection.OpenAsync and ExecuteReaderAsync the code never finishes... and I don't get why.
public async Task<Query> GetSqlServerData(Query query)
{
    var dt = new DataTable();
    var con = new SqlConnection(query.ConnectionString);
    await con.OpenAsync();
    var cmd = new SqlCommand(query.SelectStatement, con);
    var datareader = await cmd.ExecuteReaderAsync();
    dt.Load(datareader);
    con.Close();
    query.Result = dt;
    return query;
}
Thank you all for your answers. But the real answer is way simpler than I thought. Stephen pointed me in the right direction with the sentence
All HTTP calls are naturally asynchronous.
I knew that, and generally it's true, but obviously not when you're using sessions, because ASP.NET MVC waits for each request (from one user) to complete in order to synchronize the session. You can find a better explanation here: http://www.stefanprodan.eu/2012/02/parallel-processing-of-concurrent-ajax-requests-in-asp-net-mvc/
So - just decorating my controller with ...
[SessionState(System.Web.SessionState.SessionStateBehavior.ReadOnly)]
...did it and now I have the result I wanted - simultaneous queries on my SQL Server and a lot faster response time.
@Stephen - it's more than 2 simultaneous requests => http://www.browserscope.org/?category=network
The problem is that although the JavaScript is executed asynchronously, the controller actions seem to be blocked until a result from the DataService class returns. I would like to start all queries to the database and wait for the results asynchronously, so that long-running queries don't block shorter ones.
The term "asynchronous" can be applied a few different ways. In JavaScript, everything is asynchronous. All HTTP calls are naturally asynchronous.
If you're not seeing any SQL overlapping, then you should make sure that you're calling loadChartContent multiple times (not calling it once at a time chained through callbacks or anything like that). The browser will limit you to two simultaneous requests, but you should see two requests at a time hitting your SQL Server.
Making your server side async won't help you, because async doesn't change the HTTP protocol (as I describe on my blog). Even if you make your database access async, your controller action will still have to wait for them to complete, and the browser will still limit you to two outstanding requests. Server-side async is useful for scaling your web servers, but if your web server is talking to a single SQL Server backend, then there's no point in scaling your web server because your web server is not the determining factor in your scalability.
On a side note, I suspect the reason your async code never finishes is because you're using Result or Wait; I explain this on my blog.
So, back to your original problem: if you want to start all the queries to the database, then you'll need to change your API to be "chunky" instead of "chatty". I.e., add a GetChartContents action which takes multiple ids and runs all of those queries in parallel. I'd recommend using async database methods with something like this:
public async Task<ActionResult> GetChartContents(int[] ids, bool isDraft = false)
{
    var charts = ...;
    var chartTasks = Enumerable.Range(0, ids.Length)
        .Select(i => _dataService.GetDataForChartAsync(charts[i], isDraft, _user.Id))
        .ToArray();
    var results = await Task.WhenAll(chartTasks);
    ...
    return View(results);
}
I think this tutorial gives a pretty good starting point for what you're trying to do.
http://www.asp.net/mvc/tutorials/mvc-4/using-asynchronous-methods-in-aspnet-mvc-4
If you have:
public async Task<Query> GetSqlServerData(Query query)
{
    var dt = new DataTable();
    var con = new SqlConnection(query.ConnectionString);
    await con.OpenAsync();
    var cmd = new SqlCommand(query.SelectStatement, con);
    var datareader = await cmd.ExecuteReaderAsync();
    dt.Load(datareader);
    con.Close();
    query.Result = dt;
    return query;
}
Then use:
query1 = await GetSqlServerData(query1);
query2 = await GetSqlServerData(query2);
query3 = await GetSqlServerData(query3);
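Awaiting each call in turn, as above, runs the queries one after another. Since GetSqlServerData opens its own SqlConnection per call, a variant that starts all three first and awaits them together should let them overlap. A minimal sketch of that pattern, with a simulated query (FakeQueryAsync, a hypothetical stand-in) in place of the database call so that it runs on its own:

```csharp
using System;
using System.Threading.Tasks;

public static class ConcurrentQueries
{
    // Stand-in for GetSqlServerData: waits for delayMs, then returns its name.
    public static async Task<string> FakeQueryAsync(string name, int delayMs)
    {
        await Task.Delay(delayMs);
        return name;
    }

    public static async Task<string[]> RunAllAsync()
    {
        // Calling the methods starts the work; nothing is awaited yet.
        Task<string> t1 = FakeQueryAsync("query1", 300);
        Task<string> t2 = FakeQueryAsync("query2", 300);
        Task<string> t3 = FakeQueryAsync("query3", 300);

        // Await all three together, so the waits overlap instead of adding up.
        // Results are returned in the order the tasks were passed in.
        return await Task.WhenAll(t1, t2, t3);
    }
}
```

With the real method the same shape applies: keep the three tasks returned by GetSqlServerData, then `await Task.WhenAll(...)` on them.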