I have a controller method that runs multiple tasks simultaneously, and I use await Task.WhenAll() to ensure all tasks finish before returning the response. Each task runs a SQL query and does some other work, but I run into an error:
An unhandled exception has occurred while executing the request.
System.InvalidOperationException: ExecuteReader requires an open and available Connection. The connection's current state is open.
My code is set up as follows:
using (var connection = new SqlConnection(connectionString))
{
List<Task> tasks = new List<Task>();
tasks.Add(Task.Run(async () =>
{
connection.Query("exec dbo.DOCFUNCTION @TAB_ID, @DriveItemID, @WebURL, @EditLink", new
{
TAB_ID = queryResult.TAB_ID,
DriveItemID = UploadResponse.Id,
WebURL = UploadResponse.WebUrl,
EditLink = SharedLinkResponse.Link.WebUrl,
}); //run other code inside task
}));
tasks.Add(Task.Run(async () =>
{
connection.Query("exec dbo.DOCFUNCTION @TAB_ID, @DriveItemID, @WebURL, @EditLink", new
{
TAB_ID = queryResult.TAB_ID,
DriveItemID = UploadResponse.Id,
WebURL = UploadResponse.WebUrl,
EditLink = SharedLinkResponse.Link.WebUrl,
});//run other code inside task
}));
tasks.Add(Task.Run(async () =>
{
var queryResult = connection.QuerySingle(query, new { jsonData = jsonData, certificationJSON = certificationJSON, UserID = UserID });//run other code inside task
}));
await Task.WhenAll(tasks);
}
If I remove the tasks.Add() calls and just use await Task.Run() for each task, everything works fine. It's only when I try to run them concurrently that I get this error. I am not sure what the issue could be. Am I not allowed to make multiple queries from separate threads that could occur close together in time?
I'm asking this more because I have no idea of why the way I solved the issue works.
Here's the method that gives me the error (notice the return line):
public async Task<IEnumerable<ContractServiceResponse>> GetContractServices(Guid contractId)
{
var services = await _translationManager.GetCachedResource<Service>();
var rates = await _translationManager.GetCachedResource<Rate>();
var contractServices = await _dbContext.ContractService
.Where(cs => cs.ContractId == contractId)
.ToListAsync();
var serviceCenterBillableConcepts = await _billableConceptService.GetServiceCenterBillableConcepts(services.Select(s => s.Id).Distinct());
var contractBillableConcepts = await _billableConceptService.GetContractBillableConcepts(contractId);
var servicesResponse = contractServices.Select(async cs => new ContractServiceResponse
{
Id = cs.ServiceId,
Service = services.FirstOrDefault(s => s.Id == cs.ServiceId).Translation,
Enabled = cs.Billable,
ExceptionReason = cs.BillableExceptionReason,
BillableConcepts = await AddContractBillableConceptsToServices(cs.ServiceId, contractBillableConcepts, serviceCenterBillableConcepts, rates)
});
return await Task.WhenAll(servicesResponse);
}
And here's the code that works (notice the return line).
public async Task<IEnumerable<ContractServiceResponse>> GetContractServices(Guid contractId)
{
var services = await _translationManager.GetCachedResource<Service>();
var rates = await _translationManager.GetCachedResource<Rate>();
var contractServices = await _dbContext.ContractService
.Where(cs => cs.ContractId == contractId)
.ToListAsync();
var serviceCenterBillableConcepts = await _billableConceptService.GetServiceCenterBillableConcepts(services.Select(s => s.Id).Distinct());
var contractBillableConcepts = await _billableConceptService.GetContractBillableConcepts(contractId);
var servicesResponse = contractServices.Select(async cs => new ContractServiceResponse
{
Id = cs.ServiceId,
Service = services.FirstOrDefault(s => s.Id == cs.ServiceId).Translation,
Enabled = cs.Billable,
ExceptionReason = cs.BillableExceptionReason,
BillableConcepts = await AddContractBillableConceptsToServices(cs.ServiceId, contractBillableConcepts, serviceCenterBillableConcepts, rates)
});
return servicesResponse.Select(t => t.Result);
}
Why does the second one work when the first one doesn't?
Thanks in advance.
First, this has nothing to do with SqlDataReader or any database. Databases don't return HTTP status codes. The 409 is returned by whatever is called by await AddContractBillableConceptsToServices.
The first example executes as many calls as there are items in contractServices, all at the same time. If there are 100 items, it will execute 100 requests. All those requests try to allocate and use memory, access databases, or use other limited resources at the same time, causing blocking, queueing and possibly deadlocks. Making 100 concurrent calls could easily result in 100x degradation, if not an outright crash of the service.
That's why production services always implement throttling and queuing, and, if a client is ill-behaved, will throw a 409 and block it for a while.
The second version runs sequentially. t.Result blocks until the task t completes, so the second version is equivalent to:
foreach (var cs in contractServices)
{
    var t = AddContractBillableConceptsToServices(cs.ServiceId, contractBillableConcepts, serviceCenterBillableConcepts, rates);
    // Block the calling thread until this call has finished before starting the next one
    t.Wait();
    yield return new ContractServiceResponse
    {
        BillableConcepts = t.Result
    };
}
Execute only N requests at a time
A real solution is to execute only a limited number of requests at a time, using, e.g., Parallel.ForEachAsync:
var results = new ConcurrentQueue<ContractServiceResponse>();
// The body receives the item and a CancellationToken; the loop itself must be awaited
await Parallel.ForEachAsync(contractServices, async (cs, cancellationToken) =>
{
    var concepts = await AddContractBillableConceptsToServices(cs.ServiceId, contractBillableConcepts, serviceCenterBillableConcepts, rates);
    var response = new ContractServiceResponse
    {
        ...
        BillableConcepts = concepts
    };
    results.Enqueue(response);
});
By default, ForEachAsync will make as many concurrent calls as there are cores. This can be changed through the ParallelOptions.MaxDegreeOfParallelism property.
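As a rough illustration of that limit, the sketch below caps Parallel.ForEachAsync at three concurrent iterations and records the highest concurrency it observes. It assumes .NET 6 or later. The Task.Delay call is only a stand-in for a real request, and the names running, maxRunning and gate are illustrative, not part of the original code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

var results = new ConcurrentQueue<int>();
var gate = new object();
int running = 0, maxRunning = 0;

// Allow at most three iterations to be in flight at once.
var options = new ParallelOptions { MaxDegreeOfParallelism = 3 };

await Parallel.ForEachAsync(Enumerable.Range(1, 12), options, async (item, cancellationToken) =>
{
    lock (gate) { running++; maxRunning = Math.Max(maxRunning, running); }

    // Stand-in for a real asynchronous request (assumption for this sketch).
    await Task.Delay(20, cancellationToken);

    lock (gate) { running--; }
    results.Enqueue(item * 2);
});

Console.WriteLine($"processed {results.Count} items, max concurrency seen: {maxRunning}");
```

The results arrive in completion order, not input order, so sort them afterwards if the order matters.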
You're executing everything in sequence. servicesResponse.Select(t => t.Result) won't do anything until the IEnumerable is enumerated by the consumer.
Task.WhenAll() will run the tasks concurrently. Your issue is that the code you're calling does not play nicely with concurrent execution.
To solve this, don't use WhenAll; just await each call in turn.
Alternatively, you can pinpoint the code that breaks when it is run concurrently and implement some kind of locking mechanism using a SemaphoreSlim.
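The difference between the two versions can be seen in a small experiment. In this sketch, WorkAsync is a hypothetical stand-in for AddContractBillableConceptsToServices, and the counters exist only to observe how many calls overlap. It assumes a console app (no synchronization context), where blocking on .Result does not deadlock.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

var gate = new object();
int started = 0, running = 0, maxRunning = 0;

// Hypothetical stand-in for the real async call.
async Task<int> WorkAsync(int id)
{
    lock (gate) { started++; running++; maxRunning = Math.Max(maxRunning, running); }
    await Task.Delay(50);
    lock (gate) { running--; }
    return id * 10;
}

var ids = new[] { 1, 2, 3, 4 };

// Select is lazy: no call has been started at this point.
var lazy = ids.Select(id => WorkAsync(id));
int startedBeforeEnumeration = started;

// Enumerating with .Result starts one call, blocks until it finishes, then starts the next.
var sequential = lazy.Select(t => t.Result).ToList();
int maxSequential = maxRunning;

// Task.WhenAll enumerates the whole sequence first, so every call is started up front.
maxRunning = 0;
int[] concurrent = await Task.WhenAll(ids.Select(id => WorkAsync(id)));
int maxConcurrent = maxRunning;

Console.WriteLine($"sequential max: {maxSequential}, concurrent max: {maxConcurrent}");
```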
I have a cron job that queries a database table and gets about half a million records back. I need to loop through all of that data and send API POSTs to a third-party API. In general this works fine, but processing takes forever (10 hours), and I need a way to speed it up. I've been trying to use a list of Task with SemaphoreSlim, but I'm running into issues (it doesn't like that my API call returns a Task). Does anyone have a solution that won't exhaust the VM's memory?
Current code looks something like:
foreach(var data in dataList)
{
try
{
var response = await _apiService.PostData(data);
_logger.Trace(response.Message);
} catch//
}
But I'm trying to do this and getting the syntax wrong:
var tasks = new List<Task<DataObj>>();
var throttler = new SemaphoreSlim(10);
foreach(var data in dataList)
{
await throttler.WaitAsync();
tasks.Add(Task.Run(async () => {
try
{
var response = await _apiService.PostData(data);
_logger.Trace(response.Message);
}
finally
{
throttler.Release();
}
}));
}
Your list is of type Task<DataObj>, but your async lambda doesn't return anything, so its return type is Task. To fix the syntax, just return the value:
var response = await _apiService.PostData(data);
_logger.Trace(response.Message);
return response;
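To see why the return statement matters, here is a minimal sketch of how the compiler types the two kinds of async lambda. The variable names and the value 42 are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// No return statement: the lambda is a Func<Task>, so Task.Run returns a plain Task.
Task withoutResult = Task.Run(async () =>
{
    await Task.Delay(10);
});

// Returning a value makes the lambda a Func<Task<int>>, so Task.Run returns Task<int>.
Task<int> withResult = Task.Run(async () =>
{
    await Task.Delay(10);
    return 42;
});

var tasks = new List<Task<int>>();
// tasks.Add(withoutResult); // would not compile: cannot convert Task to Task<int>
tasks.Add(withResult);

await withoutResult;
int[] values = await Task.WhenAll(tasks);
Console.WriteLine(values[0]);
```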
As others have noted in the comments, I also recommend not using Task.Run here. A local async method would work fine:
var tasks = new List<Task<DataObj>>();
var throttler = new SemaphoreSlim(10);
foreach(var data in dataList)
{
tasks.Add(ThrottledPostData(data));
}
var results = await Task.WhenAll(tasks);
async Task<DataObj> ThrottledPostData(Data data)
{
await throttler.WaitAsync();
try
{
var response = await _apiService.PostData(data);
_logger.Trace(response.Message);
return response;
}
finally
{
throttler.Release();
}
}
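For anyone who wants to try the pattern in isolation, below is a self-contained sketch of the same throttling idea. PostDataStub is a hypothetical replacement for _apiService.PostData, the limit of 3 is arbitrary, and the counters are only there to confirm that the semaphore caps the concurrency.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

var throttler = new SemaphoreSlim(3);
var gate = new object();
int running = 0, maxRunning = 0;

// Hypothetical stand-in for the real API call.
async Task<string> PostDataStub(int data)
{
    lock (gate) { running++; maxRunning = Math.Max(maxRunning, running); }
    await Task.Delay(20);
    lock (gate) { running--; }
    return $"posted {data}";
}

async Task<string> ThrottledPostData(int data)
{
    // Wait for a free slot; at most three calls get past this point at once.
    await throttler.WaitAsync();
    try
    {
        return await PostDataStub(data);
    }
    finally
    {
        throttler.Release();
    }
}

// Task.WhenAll returns the results in the same order as the input tasks.
string[] results = await Task.WhenAll(Enumerable.Range(1, 10).Select(d => ThrottledPostData(d)));
Console.WriteLine($"{results.Length} posts, max concurrency seen: {maxRunning}");
```

Note that this still creates one task per item up front, so with half a million records it may be worth processing the data in batches.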
I have an issue with an endpoint blocking calls from other endpoints in my app. When we call this endpoint, it blocks all other API calls from executing, and they have to wait until it has finished.
public async Task<ActionResult> GrantAccesstoUsers()
{
// other operations
var grantResult = await
this._workSpaceProvider.GrantUserAccessAsync(this.CurrentUser.Id).ConfigureAwait(false);
return this.Ok(result);
}
The GrantUserAccessAsync method calls a set of tasks that run in parallel.
public async Task<List<WorkspaceDetail>> GrantUserAccessAsync(string currentUser)
{
var responselist = new List<WorkspaceDetail>();
try
{
// calling these prematurely to be reused once threads are created
// none expensive calls
var properlyNamedWorkSpaces = await this._helper.GetProperlyNamedWorkspacesAsync(true).ConfigureAwait(false);
var dbGroups = await this._reportCatalogProvider.GetWorkspaceFromCatalog().ConfigureAwait(false);
var catalogInfo = await this._clientServiceHelper.GetDatabaseConfigurationAsync("our-service").ConfigureAwait(false);
if (properlyNamedWorkSpaces != null && properlyNamedWorkSpaces.Count > 0)
{
// these methods returns tasks for parallel processing
var grantUserContributorAccessTaskList = await this.GrantUserContributorAccessTaskList(properlyNamedWorkSpaces, currentUser, dbGroups, catalogInfo).ConfigureAwait(false);
var grantUserAdminAccessTaskList = await this.GrantUserAdminAccessTaskList(properlyNamedWorkSpaces, currentUser, dbGroups, catalogInfo).ConfigureAwait(false);
var removeInvalidUserAndSPNTaskList = await this.RemoveAccessRightsToWorkspaceTaskList(properlyNamedWorkSpaces, dbGroups, currentUser, catalogInfo).ConfigureAwait(false);
var tasklist = new List<Task<WorkspaceDetail>>();
tasklist.AddRange(grantUserContributorAccessTaskList);
tasklist.AddRange(grantUserAdminAccessTaskList);
tasklist.AddRange(removeInvalidUserAndSPNTaskList);
// Start running Parallel Task
Parallel.ForEach(tasklist, task =>
{
Task.Delay(this._config.CurrentValue.PacingDelay);
task.Start();
});
// Get All Client Workspace Processing Results
var clientWorkspaceProcessingResult = await Task.WhenAll(tasklist).ConfigureAwait(false);
// Populate result
responselist.AddRange(clientWorkspaceProcessingResult.ToList());
}
}
catch (Exception)
{
throw;
}
return responselist;
}
These methods are basically identical in structure and they look like this:
private async Task<List<Task<WorkspaceDetail>>> GrantUserContributorAccessTaskList(List<Group> workspaces, string currentUser, List<WorkspaceManagement> dbGroups, DatabaseConfig catalogInfo)
{
var tasklist = new List<Task<WorkspaceDetail>>();
foreach (var workspace in workspaces)
{
tasklist.Add(new Task<WorkspaceDetail>(() =>
this.GrantContributorAccessToUsers(workspace, currentUser, dbGroups, catalogInfo).Result));
// i added a delay here because we encountered an issue before in production and this seems to solve the problem. this is set to 4ms.
Task.Delay(this._config.CurrentValue.DelayInMiliseconds);
}
return tasklist;
}
The other methods called here look like this:
private async Task<WorkspaceDetail> GrantContributorAccessToUsers(Group workspace, string currentUser, List<Data.ReportCatalogDB.WorkspaceManagement> dbGroups, DatabaseConfig catalogInfo)
{
// This prevents other thread or task to start and prevents exceeding the number of threads allowed
await this._batchProcessor.WaitAsync().ConfigureAwait(false);
var result = new WorkspaceDetail();
try
{
var contributorAccessresult = await this.helper.GrantContributorAccessToUsersAsync(workspace, this._powerBIConfig.CurrentValue.SPNUsers).ConfigureAwait(false);
if (contributorAccessresult != null
&& contributorAccessresult.Count > 0)
{
// do something
}
else
{
// do something
}
// this is done to reuse the call that is being executed in the helper above. it's an expensive call from an external endpoint so we opted to reuse what was used in the initial call, instead of calling it again for this process
var syncWorkspaceAccessToDb = await this.SyncWorkspaceAccessAsync(currentUser, workspace.Id, contributorAccessresult, dbGroups, catalogInfo).ConfigureAwait(false);
foreach (var dbResponse in syncWorkspaceAccessToDb) {
result.ResponseMessage += dbResponse.ResponseMessage;
}
}
catch (Exception ex)
{
this._loghelper.LogEvent(this._logger, logEvent, OperationType.GrantContributorAccessToWorkspaceManager, LogEventStatus.FAIL);
}
finally
{
this._batchProcessor.Release();
}
return result;
}
The last method called writes the record in a database table:
private async Task<List<WorkspaceDetail>> SyncWorkspaceAccessAsync(string currentUser,
Guid workspaceId,
List<GroupUser> groupUsers,
List<WorkspaceManagement> dbGroups,
DatabaseConfig catalogInfo) {
var result = new List<WorkspaceDetail>();
var tasklist = new List<Task<WorkspaceDetail>>();
// get active workspace details from the db
var workspace = dbGroups.Where(x => x.PowerBIGroupId == workspaceId).FirstOrDefault();
try
{
// to auto dispose the provider, we are creating this for each instance because
// having only one instance creates an error when the other task starts running
using (var contextProvider = this._contextFactory.GetReportCatalogProvider(
catalogInfo.Server,
catalogInfo.Database,
catalogInfo.Username,
catalogInfo.Password,
this._dbPolicy))
{
if (workspace != null)
{
// get current group users in the db from the workspace object
var currentDbGroupUsers = workspace.WorkspaceAccess.Where(w => w.Id == workspace.Id
&& w.IsDeleted == false).ToList();
#region identify to process
#region users to add
// identify users to add
var usersToAdd = groupUsers.Where(g => !currentDbGroupUsers.Any(w => w.Id == workspace.Id ))
.Select(g => new WorkspaceAccess
{
// class properties
}).ToList();
#endregion
var addTasks = await this.AddWorkspaceAccessToDbTask(catalogProvider, usersToAdd, workspace.PowerBIGroupId, workspace.WorkspaceName).ConfigureAwait(false);
tasklist.AddRange(addTasks);
// this is a potential fix that i did, hoping adding another parallel thread can solve the problem
Parallel.ForEach(tasklist, new ParallelOptions { MaxDegreeOfParallelism = this._config.CurrentValue.MaxDegreeOfParallelism }, task =>
{
Task.Delay(this._config.CurrentValue.PacingDelay);
task.Start();
});
var processResult = await Task.WhenAll(tasklist).ConfigureAwait(false);
// Populate result
result.AddRange(processResult.ToList());
}
}
}
catch (Exception ex)
{
// handle error
}
return result;
}
I have already tried some potential solutions. For example, the methods here were previously written with Task.FromResult instead of async, so I changed that. The reference is this thread:
Using Task.FromResult v/s await in C#
I also thought it might be similar to an issue we faced before, when multiple db context connections were being created while running multiple parallel tasks. Adding a small delay to the tasks helped then, but it didn't solve this problem.
Task.Delay(this._config.CurrentValue.DelayInMiliseconds);
Any help would be much appreciated.
I assume your this._batchProcessor is an instance of SemaphoreSlim. If your other endpoints also call
await this._batchProcessor.WaitAsync()
they cannot proceed until the semaphore is released.
Another thing I'd like to mention: please avoid using Parallel.ForEach with async/await. Parallel.ForEach is not designed to work with async/await; here is a good answer explaining why you should avoid using them together: Nesting await in Parallel.ForEach
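A quick way to see the problem is to pass an async lambda to Parallel.ForEach and check what has finished when the loop returns. This is only a sketch with made-up counter names; the 200 ms delay stands in for real asynchronous work.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

int finishedInParallelForEach = 0;

// The async lambda is converted to an Action<int> (async void), so each invocation
// returns at its first await and Parallel.ForEach does not wait for the rest.
Parallel.ForEach(Enumerable.Range(0, 5), async i =>
{
    await Task.Delay(200);
    Interlocked.Increment(ref finishedInParallelForEach);
});
int observedAfterLoop = Volatile.Read(ref finishedInParallelForEach);

int finishedWithWhenAll = 0;

// Collecting the tasks and awaiting Task.WhenAll waits for the asynchronous work itself.
await Task.WhenAll(Enumerable.Range(0, 5).Select(async i =>
{
    await Task.Delay(200);
    Interlocked.Increment(ref finishedWithWhenAll);
}));

Console.WriteLine($"after Parallel.ForEach: {observedAfterLoop}, after WhenAll: {finishedWithWhenAll}");
```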
I have a solution that creates multiple I/O based tasks and I'm using Task.WhenAny() to manage these tasks. But often many of the tasks will fail due to network issue or request throttling etc.
I can't seem to find a solution that enables me to successfully retry failed tasks when using a Task.WhenAny() approach.
Here is what I'm doing:
var tasks = new List<Task<MyType>>();
foreach(var item in someCollection)
{
tasks.Add(GetSomethingAsync());
}
while (tasks.Count > 0)
{
var child = await Task.WhenAny(tasks);
tasks.Remove(child);
???
}
So the above structure works for completing tasks, but I haven't found a way to handle and retry failing tasks. The await Task.WhenAny throws an AggregateException rather than allowing me to inspect the task status. Once in the exception handler, I no longer have any way to retry the failed task.
I believe it would be easier to retry within the tasks, and then replace the Task.WhenAny-in-a-loop antipattern with Task.WhenAll.
E.g., using Polly:
var tasks = new List<Task<MyType>>();
var policy = ...; // See Polly documentation
foreach(var item in someCollection)
tasks.Add(policy.ExecuteAsync(() => GetSomethingAsync()));
await Task.WhenAll(tasks);
or, more succinctly:
var policy = ...; // See Polly documentation
var tasks = someCollection.Select(item => policy.ExecuteAsync(() => GetSomethingAsync()));
await Task.WhenAll(tasks);
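As a side note on the loop in the question: as far as I know, awaiting Task.WhenAny itself does not throw when the task it returns has faulted. The exception surfaces only when that returned task is awaited or its Result is read, so the status can be inspected first. The sketch below uses two made-up methods, Fails and Works, to show this.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical operations: one fails quickly, one succeeds a little later.
async Task<int> Fails() { await Task.Delay(10); throw new InvalidOperationException("boom"); }
async Task<int> Works() { await Task.Delay(100); return 7; }

var tasks = new List<Task<int>> { Fails(), Works() };
var outcomes = new List<string>();

while (tasks.Count > 0)
{
    // WhenAny completes with the finished task, whether it succeeded or faulted.
    Task<int> finished = await Task.WhenAny(tasks);
    tasks.Remove(finished);

    if (finished.IsFaulted)
        outcomes.Add("faulted: " + finished.Exception?.InnerException?.Message);
    else
        outcomes.Add("result: " + finished.Result);
}

foreach (var outcome in outcomes)
    Console.WriteLine(outcome);
```

Retrying inside each task, as suggested above, is still simpler than rebuilding the list by hand.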
If you don't want to use the Polly library for some reason, you could use the Retry method below. It accepts a task factory, and keeps creating and then awaiting a task until it completes successfully or maxAttempts has been reached:
public static async Task<TResult> Retry<TResult>(Func<Task<TResult>> taskFactory,
int maxAttempts)
{
int failedAttempts = 0;
while (true)
{
try
{
var task = taskFactory();
return await task.ConfigureAwait(false);
}
catch
{
failedAttempts++;
if (failedAttempts >= maxAttempts) throw;
}
}
}
You could then use this method to download (for example) some web pages.
string[] urls =
{
"https://stackoverflow.com",
"https://superuser.com",
//"https://no-such.url",
};
var httpClient = new HttpClient();
var tasks = urls.Select(url => Retry(async () =>
{
return (Url: url, Html: await httpClient.GetStringAsync(url));
}, maxAttempts: 5));
var results = await Task.WhenAll(tasks);
foreach (var result in results)
{
Console.WriteLine($"Url: {result.Url}, {result.Html.Length:#,0} chars");
}
Output:
Url: https://stackoverflow.com, 112,276 chars
Url: https://superuser.com, 122,784 chars
If you uncomment the third url then instead of these results an HttpRequestException will be thrown, after five failed attempts.
The Task.WhenAll method will wait for the completion of all tasks before propagating the error. In case it is preferable to report the error as soon as possible, you can find solutions in this question.
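That waiting behavior is easy to observe. In the sketch below, FailFast and SlowSuccess are made-up methods: one throws after about 10 ms, the other succeeds after about 300 ms. The exception reaches the catch block only once the slow task has also finished.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical operations for the demonstration.
async Task<int> FailFast() { await Task.Delay(10); throw new InvalidOperationException("fast failure"); }
async Task<int> SlowSuccess() { await Task.Delay(300); return 1; }

var stopwatch = Stopwatch.StartNew();
Task<int> slow = SlowSuccess();
Task<int[]> all = Task.WhenAll(FailFast(), slow);

string caught = "";
try
{
    await all;
}
catch (InvalidOperationException ex)
{
    // Reached only after every task passed to WhenAll has completed.
    caught = ex.Message;
}

long elapsedMs = stopwatch.ElapsedMilliseconds;
bool slowAlreadyDone = slow.IsCompletedSuccessfully;

Console.WriteLine($"caught '{caught}' after {elapsedMs} ms; slow task done: {slowAlreadyDone}");
```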
I am having issues with processing Async lambda in ActionBlock.
var state = new ConcurrentBag<UsersData>();
var getData = new ActionBlock<IEnumerable<int>>(async (userIdsBatch) =>
{
var query = _queryBuilder.GetQuery(userIdsBatch);
string response = await _handler.GetResponseAsync(query).ConfigureAwait(false);
UsersData userData = new Parser().GetUserData(response);
state.Add(userData);
},
new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = _MAX_DEGREE_OF_PARALLELISM // 1
});
foreach (IEnumerable<int> userIdsBatch in GetUserBatches(userIds, 10))
{
getData.Post(userIdsBatch);
}
getData.Complete();
await getData.Completion.ConfigureAwait(false);
// merge the states for different batches synchronously.
The ActionBlock is exiting before the async call within it completes. A few of the tasks that were produced are thrown away, so not all of them complete.
Is there a way to wait for all the tasks to complete before synchronously merging the results?