What is wrong with that optimistic concurrency worker implementation? - c#

I have tried to implement an optimistic concurrency 'worker'.
The goal is to read batches of data from the same database table (a single table with no relations) with multiple parallel workers. This seemed to work so far: I get optimistic concurrency exceptions here and there, catch them, and retry.
So far so good, and the function that gets the data runs reliably on my local setup. After moving the application to a test environment, however, I get a strange timeout exception which, even if caught, ends the async function (it breaks out of the while loop). Does someone see a flaw in the implementation? What could cause the timeout? What could cause the end of the async function?
public async IAsyncEnumerable<List<WorkItem>> LoadBatchedWorkload([EnumeratorCancellation] CancellationToken token, int batchSize, int runID)
{
    DataContext context = null;
    try
    {
        context = GetNewContext(); // create a new dbContext
        List<WorkItem> workItems;
        bool loadSuccessInner;
        while (true)
        {
            if (token.IsCancellationRequested) break;
            loadSuccessInner = false;
            context.Dispose();
            context = GetNewContext(); // create a new dbContext
            RunState currentRunState = context.Runs.Where(a => a.Id == runID).First().Status;
            try
            {
                // Error happens on the following line: Microsoft.Data.SqlClient.SqlException: Timeout
                workItems = context.WorkItems.Where(a => a.State == ProcessState.ToProcess).Take(batchSize).ToList();
                loadSuccessInner = true;
            }
            catch (Exception ex)
            {
                workItems = new List<WorkItem>();
            }
            if (workItems.Count == 0 && loadSuccessInner)
            {
                break;
            }
            //... update to a different RunState
            //... if set successful yield the result
            //... else cleanup and retry
        }
    }
    finally
    {
        if (context != null) context.Dispose();
    }
}
I verified that Entity Framework (here with the MS SQL Server provider) runs the query fully server-side; it translates to a simple query like this: SELECT TOP 10 field_1, field_2 FROM WorkItems WHERE field_2 = 0
The query usually takes <1ms, and the timeout is left at the default of 30s.
I verified that no cancellation requests are fired.
This also happens when there is only a single worker and no one else is accessing the database. I'm aware that a timeout can happen when the resource is busy or blocked, but until now I have never seen a timeout on any other query.

(I'll update this answer whenever more information is being provided.)
Does someone see a flaw in the implementation?
Generally, your code looks fine.
What could cause the end of the async function?
Nothing in the code you showed should normally be an issue. Start by putting another try-catch block inside the loop to ensure that no other exceptions are thrown anywhere else (especially later, in the code not shown):
public async IAsyncEnumerable<List<WorkItem>> LoadBatchedWorkload([EnumeratorCancellation] CancellationToken token, int batchSize, int runID)
{
    DataContext context = null;
    try
    {
        context = GetNewContext();
        List<WorkItem> workItems;
        bool loadSuccessInner;
        while (true)
        {
            try
            {
                // ... (the inner loop code)
            }
            catch (Exception e)
            {
                // TODO: Log the exception here using your favorite method.
                throw;
            }
        }
    }
    finally
    {
        if (context != null) context.Dispose();
    }
}
Take a look at your log and make sure that it does not show any exceptions being thrown. Then additionally log every possible exit condition (break and return) from the loop, to find out how and why the code exits the loop.
If there are no other break or return statements in your code, then the only way the code can exit from the loop is if zero workItems are successfully returned from the database.
What could cause the timeout?
Make sure that any Task-returning/async methods you call are awaited.
To track down where the exceptions are actually coming from, you should deploy a Debug build with pdb files to get a full stack trace with source code line references.
You can also implement a DbCommandInterceptor and trace failing commands on your own:
public class TracingCommandInterceptor : DbCommandInterceptor
{
    public override void CommandFailed(DbCommand command, CommandErrorEventData eventData)
    {
        LogException(eventData);
    }

    public override Task CommandFailedAsync(DbCommand command, CommandErrorEventData eventData, CancellationToken cancellationToken = new CancellationToken())
    {
        LogException(eventData);
        return Task.CompletedTask;
    }

    private static void LogException(CommandErrorEventData eventData)
    {
        if (eventData.Exception is SqlException sqlException)
        {
            // -2 = Timeout error
            // See https://learn.microsoft.com/en-us/previous-versions/sql/sql-server-2008-r2/cc645611(v=sql.105)?redirectedfrom=MSDN
            if (sqlException.Number == -2)
            {
                var stackTrace = new StackTrace();
                var stackTraceText = stackTrace.ToString();
                // TODO: Do some logging here and output the stackTraceText
                // and other helpful information like the command text etc.
                // -->
            }
        }
    }
}

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder.UseLoggerFactory(LoggingFactory);
    optionsBuilder.UseSqlServer(connectionString);
    optionsBuilder.EnableSensitiveDataLogging();
    optionsBuilder.EnableDetailedErrors();
    // Add the command interceptor.
    optionsBuilder.AddInterceptors(new TracingCommandInterceptor());
    base.OnConfiguring(optionsBuilder);
}
Additionally logging the command text of the failed command in the interceptor is also a good idea.

Related

Entity Framework Exception in async code is not caught

I have a service hosted on Azure that runs in multiple instances simultaneously; it's async calls all the way up. Deep in the call chain, there is a method that saves data to the database. I have code to handle the exception thrown when multiple instances try to write the same data to the database, but the exception is never caught.
The service hosted on Azure:
public async Task Start(CancellationToken cancellationToken = default)
{
// Do something ....
await processedInventoryRepository.Commit(invenotyData).ConfigureAwait(false);
}
The repository that saves data to the database:
public class Repository
{
public async Task Commit(InventoryData data)
{
try
{
await SaveHardware(data.Hardware).ConfigureAwait(false); ;
await SaveProduct(data.Product).ConfigureAwait(false); ;
await SaveInstall(data.Installs).ConfigureAwait(false); ;
}
// Exception handle not handled here
catch (DbUpdateException ex)
{
var innerException = (SqlException)ex.InnerException;
if (innerException != null && (innerException.Number == 2627 || innerException.Number == 2601))
{
// log the error;
}
}
}
private async Task SaveInstall(DbContext _context, Install installs)
{
_context.Installs.Add(installs);
await _context.SaveChangesAsync().ConfigureAwait(false);
}
private async Task SaveProduct(DbContext _context, Porduct product)
{
try
{
if (!_context.Products.Any(p => p.Id == product.Id))
{
_context.Products.Add(product);
await _context.SaveChangesAsync().ConfigureAwait(false);
}
}
// Exception not handled here
catch (DbUpdateException ex)
{
var innerException = (SqlException)ex.InnerException;
if (innerException != null && (innerException.Number == 2627 || innerException.Number == 2601))
{
// log the error
}
}
}
There is nothing special about the data being saved to the database, but it is worth mentioning that Install has a foreign key to Product.
The exception happens in the SaveProduct method, which is expected, as multiple instances are racing to add the same lookups. That's why the catch clause is placed there: to ignore this error and let execution carry on. _context.Products.Add(product) is OK, but the exception is thrown when SaveChanges is called. Neither of those catch blocks works.
It's hard to strip the code down to something that replicates the problem. I had the same code in a Web API and triggered it using Postman; it works there, as there is no exception. But in Azure, once multiple instances are running, the exceptions happen all the time.
The exceptions are OK, but I just can't figure out how to handle them. Thanks in advance.

Handling domain errors in MassTransit

I'm wondering how I should handle domain exceptions properly.
Should all of my consumer's code be wrapped in a try/catch block, or should I just throw an exception, which will be handled by the appropriate FaultConsumer?
Consider these two samples:
Example-1 - the whole operation is wrapped in a try...catch block.
public async Task Consume(ConsumeContext<CreateOrder> context)
{
try
{
//Consumer that creates order
var order = new Order();
var product = store.GetProduct(command.ProductId); // check if requested product exists
if (product is null)
{
throw new DomainException(OperationCodes.ProductNotExist);
}
order.AddProduct(product);
store.SaveOrder(order);
context.Publish<OrderCreated>(new OrderCreated
{
OrderId = order.Id;
});
}
catch (Exception exception)
{
if (exception is DomainException domainException)
{
context.Publish<CreateOrderRejected>(new CreateOrderRejected
{
ErrorCode = domainException.Code;
});
}
}
}
Example-2 - MassTransit handles the DomainException by pushing the message into the CreateOrder_error queue. Another service subscribes to this queue and processes the message once it is published there.
public async Task Consume(ConsumeContext<CreateOrder> context)
{
//Consumer that creates order
var order = new Order();
var product = store.GetProduct(command.ProductId); // check if requested product exists
if (product is null)
{
throw new DomainException(OperationCodes.ProductNotExist);
}
order.AddProduct(product);
store.SaveOrder(order);
context.Publish<OrderCreated>(new OrderCreated
{
OrderId = order.Id;
});
}
Which approach is better?
I know that I can use Request/Response and get information about the error immediately, but in my case it must be done via a message broker.
In your first example, you are handling a domain condition (in your example, a product not existing in the catalog) by producing an event that the order was rejected for an unknown product. This makes complete sense.
Now, if the database query to check the product couldn't connect to the database, that's a temporary situation that may resolve itself, and thus using a retry or scheduled redelivery makes sense - to try again before giving up entirely. Those are exceptions you would want to throw.
But the business exception you'd want to catch, and handle by publishing an event.
public async Task Consume(ConsumeContext<CreateOrder> context)
{
    try
    {
        var order = new Order();
        // Check if the requested product exists.
        var product = store.GetProduct(context.Message.ProductId);
        if (product is null)
        {
            throw new DomainException(OperationCodes.ProductNotExist);
        }
        order.AddProduct(product);
        store.SaveOrder(order);
        await context.Publish<OrderCreated>(new OrderCreated
        {
            OrderId = order.Id
        });
    }
    catch (DomainException exception)
    {
        // Business failure: report it as an event instead of faulting the message.
        await context.Publish<CreateOrderRejected>(new CreateOrderRejected
        {
            ErrorCode = exception.Code
        });
    }
}
My take on this is that you seem to be heading into the fire-and-forget commands mess. Of course, it is very context-specific, since there are scenarios, especially integration scenarios, where you don't have a user on the other side sitting and wondering whether their command was eventually executed and what the outcome was.
So, for integration scenarios, I concur with Chris' answer: publishing a domain exception event makes perfect sense.
For user-interaction scenarios, however, I'd rather suggest using request-response, which can return different kinds of responses, such as a positive and a negative response, as described in the documentation. Here is the snippet from the docs:
Service side:
public class CheckOrderStatusConsumer :
IConsumer<CheckOrderStatus>
{
public async Task Consume(ConsumeContext<CheckOrderStatus> context)
{
var order = await _orderRepository.Get(context.Message.OrderId);
if (order == null)
await context.RespondAsync<OrderNotFound>(context.Message);
else
await context.RespondAsync<OrderStatusResult>(new
{
OrderId = order.Id,
order.Timestamp,
order.StatusCode,
order.StatusText
});
}
}
Client side:
var (statusResponse,notFoundResponse) = await client.GetResponse<OrderStatusResult, OrderNotFound>(new { OrderId = id});
// both tuple values are Task<Response<T>>, need to find out which one completed
if(statusResponse.IsCompletedSuccessfully)
{
var orderStatus = await statusResponse;
// do something
}
else
{
var notFound = await notFoundResponse;
// do something else
}

EF SqlQuery throws not supported MultipleActiveResultSets exception if previous operation timeouts out

I have the following logic:
public async Task UpdateData(DbContext context)
{
    try
    {
        await LongUpdate(context);
    }
    catch (Exception e)
    {
        try
        {
            await context.Database.ExecuteSqlCommandAsync($@"update d set d.UpdatedAt = GETDATE() from SomeTable d where id > 11");
        }
        catch (Exception ex)
        {
            throw;
        }
    }
}

// this operation takes about 1 minute
private static async Task<int> LongUpdate(DbContext context)
{
    context.Database.CommandTimeout = 5; // change this to 15 to see MultipleActiveResultSets exception
    return await context.Database.SqlQuery<int>($@"update otherTable set UpdatedAt = GETDATE();SELECT @@ROWCOUNT").FirstOrDefaultAsync();
}
As presented above, there are two update operations, both awaited.
LongUpdate takes more than a minute.
When the timeout is set to 5s:
LongUpdate throws a timeout exception and the second update is executed successfully.
When I increase the timeout to 15s or more:
LongUpdate throws a timeout exception, but the second update immediately throws: System.InvalidOperationException: The connection does not support MultipleActiveResultSets.
Shouldn't await prevent this exception?
Why does this depend on the timeout value?
According to the EF docs, the Database property should not be used the way you use it. Because I think this approach is incorrect, there is little point in analysing what exactly is happening. All your db operations should go through the database context, using DbSet<T> and a SaveChanges or SaveChangesAsync call on the DbContext after changing the data sets. Of course you can execute raw SQL, but in a different way, like this:
public static IList<StockQuote> GetLast(this DbSet<StockQuote> dataSet, int stockId)
{
IList<StockQuote> lastQuote = dataSet.FromSqlRaw("SELECT * FROM stockquote WHERE StockId = {0} ORDER BY Timestamp DESC LIMIT 1", new object[] { stockId })
.ToList();
return lastQuote;
}
To create a DbContext (in the example below, for MySQL) with a command timeout, you could use something like this:
public static class ServiceCollectionExtension
{
public static IServiceCollection ConfigureMySqlServerDbContext<TContext>(this IServiceCollection serviceCollection, string connectionString,
ILoggerFactory loggerFactory, int timeout = 600)
where TContext : DbContext
{
return serviceCollection.AddDbContext<TContext>(options => options.UseQueryTrackingBehavior(QueryTrackingBehavior.TrackAll)
.UseLoggerFactory(loggerFactory)
.UseMySql(connectionString, ServerVersion.AutoDetect(connectionString), sqlOptions => sqlOptions.CommandTimeout(timeout))
.UseLazyLoadingProxies());
}
}
Then just call services.ConfigureMySqlServerDbContext<ModelContext>(Settings.ConnectionString, loggerFactory);
I think that if you change your approach, you will get rid of the exceptions.
Shouldn’t await prevent this exception?
It depends on your pattern. We need to ensure that all access is sequential. In other words, a second asynchronous request on the same DbContext instance shouldn't start before the first request finishes (and that's the whole point). Although this is typically achieved by using the await keyword on each async operation, in some cases that is not enough. In your case, the first part of the LongUpdate method, context.Database.SqlQuery<int>(), is not an async method itself; it provides results synchronously to FirstOrDefaultAsync(). I don't think this is a problem with EF's async behavior.
Why does it depend on the timeout value?
After a certain amount of time, the SQL query execution enters a critical state that it cannot leave without spending more time than the CommandTimeout you set; your code moves on anyway, and the exception occurs.
Note that, according to Performance considerations for EF 4, 5, and 6, applications with IO-related contention benefit the most from asynchronous queries and save operations. The page EF async methods are slower than non-async lists some noteworthy points.
The command timeout is distinct from the connection timeout. A value set with this API for the command timeout overrides any value set in the connection string. The Database.CommandTimeout property gets or sets the timeout value, in seconds, for all context operations.
private static async Task<int> LongUpdate(DbContext context)
{
    context.Database.CommandTimeout = 5; // change this to 15 to see MultipleActiveResultSets exception
    return await context.Database.SqlQuery<int>($@"update otherTable set UpdatedAt = GETDATE();SELECT @@ROWCOUNT").FirstOrDefaultAsync();
}
Here you set CommandTimeout. If your query does not finish within 5 seconds, a timeout exception is thrown. After that you try to execute another query in the catch block, but you use the same context, which has already timed out, so it throws System.InvalidOperationException.
So to fix this, you have to initialize your context again.
public async Task UpdateData(DbContext context)
{
    try
    {
        await LongUpdate(context);
    }
    catch (Exception e)
    {
        try
        {
            context = new MyContext(); // initialize your DbContext here.
            await context.Database.ExecuteSqlCommandAsync($@"update d set d.UpdatedAt = GETDATE() from SomeTable d where id > 11");
        }
        catch (Exception ex)
        {
            throw;
        }
    }
}

Exceptions are just ignored in async code block

Before I used Nito.MVVM, I used plain async/await; it threw an aggregate exception, which I could inspect to see what went wrong. But since switching to Nito, my exceptions are ignored: the program jumps out of the async code block and continues executing. I know that it catches exceptions, because when I put a breakpoint on the catch(Exception ex) line it breaks there, but with ex = null. I know that NotifyTask has properties to check whether an exception was thrown, but wherever I put the check, it runs while the Task is still incomplete, not when I need it.
View model:
public FileExplorerPageViewModel(INavigationService navigationService)
{
_navigationService = navigationService;
_manager = new FileExplorerManager();
Files = NotifyTask.Create(GetFilesAsync("UniorDev", "GitRemote/GitRemote"));
}
Private method:
private async Task<ObservableCollection<FileExplorerModel>> GetFilesAsync(string login, string reposName)
{
return new ObservableCollection<FileExplorerModel>(await _manager.GetFilesAsync(login, reposName));
}
Manager method (where the exception is thrown):
public async Task<List<FileExplorerModel>> GetFilesAsync(string login, string reposName)
{
//try
//{
var gitHubFiles = await GetGitHubFilesAsync(login, reposName);
var gitRemoteFiles = new List<FileExplorerModel>();
foreach ( var file in gitHubFiles )
{
if ( file.Type == ContentType.Symlink || file.Type == ContentType.Submodule ) continue;
var model = new FileExplorerModel
{
Name = file.Name,
FileType = file.Type.ToString()
};
if ( model.IsFolder )
{
var nextFiles = await GetGitHubFilesAsync(login, reposName);
var count = nextFiles.Count;
}
model.FileSize = file.Size.ToString();
gitRemoteFiles.Add(model);
}
return gitRemoteFiles;
//}
//catch ( WebException ex )
//{
// throw new Exception("Something wrong with internet connection, try to On Internet " + ex.Message);
//}
//catch ( Exception ex )
//{
// throw new Exception("Getting ExplorerFiles from github failed! " + ex.Message);
//}
}
With or without try/catch, the effect is the same. This behavior occurs everywhere I use NotifyTask.
Update
There is no event that fires when an exception occurs, but there is a PropertyChanged event, so I used it and added this code:
private void FilesOnPropertyChanged(object sender, PropertyChangedEventArgs propertyChangedEventArgs)
{
throw new Exception("EXCEPTION");
bool failed;
if ( Files.IsFaulted )
failed = true;
}
And the exception does not fire.
I added a throw in the App class (the main class) and it fired. Exceptions that come from XAML also fire. So maybe it does not fire when it comes from a view model, or something else is going on. I have no idea. I would be very happy for some help with this.
Update
We dealt with exception = null, but the question still stands. What I want to add is that I rarely see this issue when the app starts launching on a physical device. I read some info about it, and it doesn't seem to be related, but maybe it is:
I'm not entirely sure what your desired behavior is, but here's some information I hope you find useful.
NotifyTask is a data-bindable wrapper around Task. That's really all it does. So, if its Task faults with an exception, then it will update its own data-bindable properties regarding that exception. NotifyTask is intended for use when you want the UI to respond to a task completing, e.g., show a spinner while the task is in progress, an error message if the task faults, and data if the task completes successfully.
If you want your application to respond to the task faulting (with code, not just a UI update), then you should use try/catch like you have commented out in GetFilesAsync. NotifyTask doesn't change how those exceptions work; they should work just fine.
I know that it catch exceptions because when I put a breakpoint on catch(Exception ex) line it breaks here but with ex = null.
That's not possible. I suggest you try it again.
I know that NotifyTask has properties to check if an exception was thrown but where I put it, it checks when Task is uncompleted, not when I need it.
If you really want to (asynchronously) wait for the task to complete and then check for exceptions, then you can do so like this:
await Files.TaskCompleted;
var ex = Files.InnerException;
Or, if you just want to re-raise the exception:
await Files.Task;
Though I must say this usage is extremely unusual. The much more proper thing to do is to have a try/catch within your GetFilesAsync.
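As a minimal sketch of the point that a faulted task stores its exception and only rethrows it when awaited, the following uses a plain Task instead of NotifyTask. The class FaultedTaskDemo and the method LoadAsync are hypothetical stand-ins for something like GetFilesAsync; they are not part of Nito.Mvvm.

```csharp
using System;
using System.Threading.Tasks;

public static class FaultedTaskDemo
{
    // Hypothetical stand-in for a method such as GetFilesAsync that fails.
    public static async Task<int> LoadAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("load failed");
    }

    // Starting the task without awaiting it does not throw at the call site;
    // the exception is stored on the task and has to be inspected explicitly.
    public static async Task<string> ObserveWithoutRethrowAsync()
    {
        Task<int> task = LoadAsync();
        // Task.WhenAny completes when the input completes and does not
        // rethrow the input's exception.
        await Task.WhenAny(task);
        return task.IsFaulted
            ? task.Exception?.InnerException?.Message ?? "unknown"
            : "ok";
    }

    // Awaiting the task directly rethrows the original exception,
    // so an ordinary try/catch around the await handles it.
    public static async Task<string> ObserveWithAwaitAsync()
    {
        try
        {
            await LoadAsync();
            return "ok";
        }
        catch (InvalidOperationException ex)
        {
            return ex.Message;
        }
    }
}
```

Both methods end up with the same message; the difference is only whether the exception is read from the task or caught at the await.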

How to in case of timeout to execute method again and again until it completes successfully?

I have asp.net application. All business logic in business layer.
Here is the example of the method
public void DoSomething()
{
PersonClass pc = new PersonClass();
pc.CreatePerson();
pc.AssignBasicTask();
pc.ChangePersonsStatus();
pc.CreateDefaultSettings();
}
What happens once in a while is that one of the sub-methods times out, so the process can end up incomplete.
What I have in mind to make sure all steps complete properly is:
public void DoSomething()
{
PersonClass pc = new PersonClass();
var error = null;
error = pc.CreatePerson();
if(error != timeout exception)
error = pc.AssignBasicTask();
else
return to step above
if(error != timeout exception)
error = pc.ChangePersonsStatus();
else
return to step above
if(error != timeout exception)
error = pc.CreateDefaultSettings();
else
return to step above
}
But it's just an idea; I doubt it's the proper way to handle this.
Of course, this can be done more or less elegantly, with different options for timing out or giving up, but an easy way to achieve what you want would be to define a retry method which keeps retrying an action until it succeeds:
public static class RetryUtility
{
    // Members of a static class must themselves be static.
    public static T RetryUntilSuccess<T>(Func<T> action)
    {
        while (true)
        {
            try
            {
                return action();
            }
            catch
            {
                // Swallowing exceptions is BAD, BAD, BAD. You should AT LEAST log it.
            }
        }
    }

    public static void RetryUntilSuccess(Action action)
    {
        // Trick to allow a void method being passed in without duplicating the implementation.
        RetryUntilSuccess(() => { action(); return true; });
    }
}
Then do
RetryUtility.RetryUntilSuccess(() => pc.CreatePerson());
RetryUtility.RetryUntilSuccess(() => pc.AssignBasicTask());
RetryUtility.RetryUntilSuccess(() => pc.ChangePersonsStatus());
RetryUtility.RetryUntilSuccess(() => pc.CreateDefaultSettings());
I must urge you to think about what to do if the method keeps failing: you could be creating an infinite loop. Perhaps it should give up after N retries or back off with an exponentially increasing retry delay. You will need to define that, since we cannot know enough about your problem domain to decide it.
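A minimal sketch of the "give up after N retries, back off exponentially" idea might look like this. BoundedRetry, maxAttempts and initialDelay are hypothetical names, and the sketch assumes the failing step signals a timeout by throwing TimeoutException; a SQL timeout, for instance, surfaces as a different exception type and would need its own filter.

```csharp
using System;
using System.Threading;

public static class BoundedRetry
{
    // Hypothetical helper: runs `action` up to `maxAttempts` times and
    // doubles the wait after every failed attempt.
    public static T Execute<T>(Func<T> action, int maxAttempts, TimeSpan initialDelay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        TimeSpan delay = initialDelay;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (TimeoutException) when (attempt < maxAttempts)
            {
                // Only timeouts are retried; any other exception propagates at once.
                // On the last attempt the filter is false, so the timeout reaches
                // the caller instead of looping forever.
                Thread.Sleep(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

A call could then look like BoundedRetry.Execute(() => { pc.CreatePerson(); return true; }, 5, TimeSpan.FromMilliseconds(200)); the attempt limit and the starting delay are placeholders to be chosen for the actual workload.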
You have it pretty close to correct in your pseudo-code, and there are a lot of ways to do this, but here is how I would do it:
PersonClass pc = new PersonClass();
while (true)
    if (pc.CreatePerson())
        break;
while (true)
    if (pc.AssignBasicTask())
        break;
This assumes that your methods return true to indicate success and false to indicate a timeout failure (and probably throw an exception for any other kind of failure). And while I didn't do it here, I would strongly recommend some sort of try counting to make sure it doesn't just loop forever.
Use a TransactionScope to make sure everything is executed as a unit. More info here: Implementing an Implicit Transaction using Transaction Scope
You should never retry a timed-out operation indefinitely; you may end up hanging the server, creating an infinite loop, or both. There should always be a limit on how many retries are attempted before giving up.
Sample:
using (TransactionScope scope = new TransactionScope())
{
    try
    {
        // Your code here
        // If no errors were thrown commit your transaction
        scope.Complete();
    }
    catch
    {
        // Some error handling
    }
}
