StackExchange.Redis Transaction chaining parameters - c#

I'm trying to execute a basic transactional operation that consists of two steps:
Get the length of a set: scard MySet
Pop the entire set with the given length: spop MySet len
I know it is possible to use smembers and del consecutively. But what I want to achieve is to get the output of the first operation, use it in the second operation, and do both in a transaction. Here is what I tried so far:
var transaction = this.cacheClient.Db1.Database.CreateTransaction();
var itemLength = transaction.SetLengthAsync(key).ContinueWith(async lengthTask =>
{
    var length = await lengthTask;
    try
    {
        // here I want to pass the length argument
        return await transaction.SetPopAsync(key, length); // probably the transaction is already committed here,
        // so it never passes this line and no exceptions are thrown
    }
    catch (Exception ex)
    {
        throw;
    }
});
await transaction.ExecuteAsync();
Also, I tried the same thing with CreateBatch and got the same result. I'm currently using the workaround I mentioned above. I know it is also possible to evaluate a Lua script, but I want to know whether this is possible with transactions, or whether I'm doing something terribly wrong.

The nature of redis is that you cannot read data during multi/exec - you only get results when the exec runs, which means it isn't possible to use those results inside the multi. What you are attempting is kinda doomed. There are two ways of doing what you want here:
1. Speculatively read what you need, then perform a multi/exec (transaction) block using that knowledge as a constraint, which SE.Redis will enforce inside a WATCH block; this is really complex and hard to get right, quite honestly
2. Use Lua, meaning: ScriptEvaluate[Async], where you can do everything you want in a series of operations that execute contiguously on the server without competing with other connections
Option 2 is almost always the right way to do this, ever since it became possible.
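A minimal sketch of option 2, assuming the same key as in the question and an IDatabase instance named db (note that SPOP with an explicit count requires Redis 3.2+):
const string script = @"
    local len = redis.call('SCARD', KEYS[1])
    return redis.call('SPOP', KEYS[1], len)";

// The script runs atomically on the server, so the length observed by
// SCARD is still valid when SPOP consumes it.
RedisResult result = await db.ScriptEvaluateAsync(script, new RedisKey[] { key });
RedisValue[] members = (RedisValue[])result; // the popped set members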

Related

C# BulkWriteAsync, Transactions and Results

I am relatively new to working with MongoDB. Currently I am getting a little more familiar with the API, and especially with the C# driver. I have a few questions around bulk updates. The C# driver offers a BulkWriteAsync method, and I could read a lot about it in the Mongo documentation. As I understand it, it is possible to configure the bulk write not to stop in case of an error at any step, by using the unordered setting. What I did not find is what happens to the data. Does the database do a rollback in case of an error, or do I have to use a surrounding transaction myself? In case of an error, can I get details of which step was not successful? Think of a bulk with updates on 100 documents: can I find out which updates were not successful? As the BulkWriteResult offers very little information, I am not sure if this operation is really a good one for me.
Thanks in advance.
You're right in that BulkWriteResult doesn't provide the full set of information to make a call on what to do.
In the case of a MongoBulkWriteException<T>, however, you can access the WriteErrors property to get the indexes of models that errored. Here's a pared down example of how to use the property.
var models = sourceOfModels.ToArray();
for (var i = 0; i < MaxTries; i++)
{
    try
    {
        return await someCollection.BulkWriteAsync(models, new BulkWriteOptions { IsOrdered = false });
    }
    catch (MongoBulkWriteException e)
    {
        // reconstitute the list of models to retry from the set of failed models
        models = e.WriteErrors.Select(x => models[x.Index]).ToArray();
    }
}
throw new InvalidOperationException("Bulk write still failing after " + MaxTries + " tries.");
Note: The above is very naive code. My actual code is more sophisticated. What the above does is try over and over to do the write, in each case, with only the outstanding writes. Say you started with 1000 ReplaceOne<T> models to write, and 900 went through; the second try will try against the remaining 100, and so on until retries are exhausted, or there are no errors.
If the code is not within a transaction, and an error occurs, of course nothing is rolled back; you have some writes that succeed and some that do not. In the case of a transaction, the exception is still raised (MongoDB 4.2+). Prior to that, you would not get an exception.
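If you do want all-or-nothing semantics, a rough sketch of wrapping the bulk write in a transaction could look like the following (assumptions: MongoDB 4.2+, a replica set, and that client, someCollection, and models are the same as above):
using (var session = await client.StartSessionAsync())
{
    session.StartTransaction();
    try
    {
        await someCollection.BulkWriteAsync(session, models, new BulkWriteOptions { IsOrdered = false });
        await session.CommitTransactionAsync();
    }
    catch (MongoBulkWriteException)
    {
        // Abort so that none of the writes in this batch are applied.
        await session.AbortTransactionAsync();
        throw;
    }
}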
Finally, while the default is ordered writes, unordered writes can be very useful when the writes are unrelated to one another (e.g. documents representing DDD aggregates where there are no dependencies). It's this same "unrelatedness" that also obviates the need for a transaction.

Use CloudBlob.ExistsAsync vs catch StorageException.BlobNotFound, in terms of performance?

I want to download a file from Azure Blob Storage which may not exist yet. And I'm looking for the most reliable and performant way of handling this. To this end, I've found two options that both work:
Option 1: Use ExistsAsync()
Duration: This takes roughly 1000~1100ms to complete.
if (await blockBlob.ExistsAsync())
{
    await blockBlob.DownloadToStreamAsync(ms);
    return ms;
}
else
{
    throw new FileNotFoundException();
}
Option 2: Catch the exception
Duration: This takes at least 1600ms, every time.
try
{
    await blockBlob.DownloadToStreamAsync(ms);
    return ms;
}
catch (StorageException e)
{
    // note: there is no 'StorageErrorCodeStrings.BlobNotFound'
    if (e.RequestInformation.ErrorCode == "BlobNotFound")
        throw new FileNotFoundException();
    throw;
}
The metrics were gathered through simple API calls on a web API, which consumes the above functions and returns an appropriate message. I've manually tested the end-to-end scenario here, through Postman. There is some overhead in this approach of course. But summarized, it seems the ExistsAsync() operation consistently saves at least 500ms, at least on my local machine whilst debugging. Which is a bit remarkable, because the DoesServiceRequest attribute on ExistsAsync() seems to indicate it is another expensive HTTP call that needs to be made.
Also, the ExistsAsync API docs don't say anything about its usage or any side effects.
A blunt conclusion, based on poor man's testing, would therefore lead me to option no. 1, because:
it's faster in debug/localhost (the catch: this says nothing about compiled code in production)
to me it's more eloquent, especially because the alternative requires manually checking the ErrorCode against a particular string
I would assume the ExistsAsync() operation is there for this exact reason
But here is my actual question: is this the correct interpretation of the usage of ExistsAsync()?
E.g. is the "WHY" it exists = to be more efficient than simply catching a not-found exception, particularly for performance reasons?
Thanks!
But here is my actual question: is this the correct interpretation of the usage of ExistsAsync()?
You can easily take a look at the implementation yourself.
ExistsAsync() is just a wrapper around an HTTP call: if the underlying request comes back as HTTP 404 Not Found, it catches that and returns false; true otherwise.
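In other words, conceptually it amounts to something like this (a simplified sketch, not the library's actual source):
static async Task<bool> ExistsSketchAsync(CloudBlockBlob blob)
{
    try
    {
        await blob.FetchAttributesAsync(); // a lightweight HEAD-style request
        return true;
    }
    catch (StorageException e) when (e.RequestInformation.HttpStatusCode == 404)
    {
        return false;
    }
}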
I'd say go for ExistsAsync, as it seems the optimal approach, especially if you count on the fact that sometimes the blob is not there. DownloadToStreamAsync has more work to do in terms of wrapping the exception in a StorageException and maybe doing some more cleanup.
I would assume the ExistsAsync() operation is there for this exact reason
Consider this: sometimes you just want to know whether a given blob exists, without being interested in the content. For example, to warn that something will be overwritten when uploading. In that case ExistsAsync is a nice fit, because DownloadToStreamAsync would be expensive for a mere existence check: it will download the whole content if the blob is there.
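For instance (a hypothetical sketch of that overwrite-warning scenario):
// Warn the caller before an upload replaces an existing blob.
if (await blockBlob.ExistsAsync())
{
    Console.WriteLine($"Warning: blob '{blockBlob.Name}' already exists and will be overwritten.");
}
await blockBlob.UploadFromStreamAsync(sourceStream);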

best way to catch database constraint errors

I am calling a stored procedure that inserts data into a SQL Server database from C#. I have a number of constraints on the table, such as a unique column. At present I have the following code:
try
{
    // insert data
}
catch (SqlException ex)
{
    if (ex.Message.ToLower().Contains("duplicate key"))
    {
        if (ex.Message.ToLower().Contains("url"))
        {
            return 1;
        }
        if (ex.Message.ToLower().Contains("email"))
        {
            return 2;
        }
    }
    return 3;
}
Is it better practice to check that the column value is unique before inserting the data (in C# or in the stored procedure), or to let the exception occur and handle it as above? I am not a fan of the above, but I'm looking for best practice in this area.
I view database constraints as a last resort kind of thing. (I.e. by all means they should be present in your schema as a backup way of maintaining data integrity.) But I'd say the data should really be valid before you try to save it in the database. If for no other reason, then because providing feedback about invalid input is a UI concern, and a data validity error really shouldn't bubble up and down the entire tier stack every single time.
Furthermore, there are many sorts of assertions you want to make about the shape of your data that can't be expressed using constraints easily. (E.g. state transitions of an order: "an order can only go to SHIPPED from PAID", or more complex scenarios.) That is, you'd need to resort to procedural-language based checks that duplicate even more of your business logic, then have those report some sort of error code as well, and include yet more complexity in your app just for the sake of doing all your data validation in the schema definition.
Validation is inherently hard to place in an app, since it concerns both the UI and is coupled to the model schema, but I err on the side of doing it near the UI.
I see two questions here, and here's my take...
Are database constraints good? For large systems they're indispensable. Most large systems have more than one front end, and not always in compatible languages where middle-tier or UI data-checking logic can be shared. They may also have batch processes in Transact-SQL or PL/SQL only. It's fine to duplicate the checking on the front end, but in a multi-user app the only way to truly check uniqueness is to insert the record and see what the database says. Same with foreign key constraints - you don't truly know until you try to insert/update/delete.
Should exceptions be allowed to throw, or should return values be substituted? Here's the code from the question:
try
{
    // insert data
}
catch (SqlException ex)
{
    if (ex.Message.ToLower().Contains("duplicate key"))
    {
        if (ex.Message.ToLower().Contains("url"))
        {
            return 1; // Sure, that's one good way to do it
        }
        if (ex.Message.ToLower().Contains("email"))
        {
            return 2; // Sure, that's one good way to do it
        }
    }
    return 3; // EVIL! Or at least quasi-evil :)
}
If you can guarantee that the calling program will actually act based on the return value, I think the return 1 and return 2 are best left to your judgement. I prefer to rethrow a custom exception for cases like this (for example DuplicateEmailException) but that's just me - the return values will do the trick too. After all, consumer classes can ignore exceptions just as easily as they can ignore return values.
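A minimal sketch of that custom-exception approach (DuplicateEmailException and the index name IX_Users_Email are hypothetical; 2601 and 2627 are SQL Server's error numbers for unique index/constraint violations):
public class DuplicateEmailException : Exception
{
    public DuplicateEmailException(Exception inner)
        : base("A record with that email already exists.", inner) { }
}

try
{
    // insert data
}
catch (SqlException ex) when (ex.Number == 2601 || ex.Number == 2627)
{
    // Match on the violated index name rather than parsing free-text prose.
    if (ex.Message.Contains("IX_Users_Email"))
        throw new DuplicateEmailException(ex);
    throw; // some other constraint; let the caller deal with it
}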
I'm against the return 3. This means there was an unexpected exception (database down, bad connection, whatever). Here you have an unspecified error, and the only diagnostic information you have is this: "3". Imagine posting a question on SO that says I tried to insert a row but the system said '3'. Please advise. It would be closed within seconds :)
If you don't know how to handle an exception in the data class, there's no way a consumer of the data class can handle it. At this point you're pretty much hosed so I say log the error, then exit as gracefully as possible with an "Unexpected error" message.
I know I ranted a bit about the unexpected exception, but I've handled too many support incidents where the programmer just squelched database exceptions, and when something unexpected came up the app either failed silently or failed downstream, leaving zero diagnostic information. Very naughty.
I would prefer a stored procedure that checks for potential violations before just throwing the data at SQL Server and letting the constraint bubble up an error. The reasons for this are performance-related:
Performance impact of different error handling techniques
Checking for potential constraint violations before entering SQL Server TRY and CATCH logic
Some people will advocate that constraints at the database layer are unnecessary since your program can do everything. The reason I wouldn't rely solely on your C# program to detect duplicates is that people will find ways to affect the data without going through your C# program. You may introduce other programs later. You may have people writing their own scripts or interacting with the database directly. Do you really want to leave the table unprotected because they don't honor your business rules? And I don't think the C# program should just throw data at the table and hope for the best, either.
If your business rules change, do you really want to have to re-compile your app (or all of multiple apps)? I guess that depends on how well-protected your database is and how likely/often your business rules are to change.
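To illustrate the pre-check idea (a hypothetical sketch; the table, columns, and return codes are invented here), the body of such a procedure can test for violations up front, so the common duplicate case is a cheap status code rather than a raised error:
// Hypothetical T-SQL body for the stored procedure, shown as the C#
// string you might deploy it with.
const string createProc = @"
CREATE PROCEDURE dbo.InsertUser @Email nvarchar(256), @Url nvarchar(256)
AS
BEGIN
    IF EXISTS (SELECT 1 FROM dbo.Users WHERE Email = @Email) RETURN 2; -- duplicate email
    IF EXISTS (SELECT 1 FROM dbo.Users WHERE Url = @Url) RETURN 1; -- duplicate url
    INSERT INTO dbo.Users (Email, Url) VALUES (@Email, @Url);
    RETURN 0;
END";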
I did something like this:
public class SqlExceptionHelper
{
    public SqlExceptionHelper(SqlException sqlException)
    {
        // Do nothing.
    }

    public static string GetSqlDescription(SqlException sqlException)
    {
        switch (sqlException.Number)
        {
            case 21:
                return "Fatal Error Occurred: Error Code 21.";
            case 53:
                return "Error in Establishing a Database Connection: 53.";
            default:
                return "Unexpected Error: " + sqlException.Message;
        }
    }
}
This keeps it reusable, and it lets you map SQL Server's error codes to messages in one place.
Then just implement:
public class SiteHandler : ISiteHandler
{
    public string InsertDataToDatabase(Handler siteInfo)
    {
        try
        {
            // Open database connection, run commands, some additional checks.
            return "Success.";
        }
        catch (SqlException exception)
        {
            return SqlExceptionHelper.GetSqlDescription(exception);
        }
    }
}
This provides specific messages for common occurrences. But as mentioned above, you really should ensure that you've validated your data before you insert it into your database, so that no constraint violations surface in the first place.
Hope it points you in a good direction.
Depends on what you're trying to do. Some things to think about:
Where do you want to handle your error? I would recommend as close to the data as possible.
Who do you want to know about the error? Does your user need to know that 'you've already used that ID'...?
etc.
Also -- constraints can be good -- I don't 100% agree with millimoose's answer on that point. I mean, I do agree in the "should be this way / better performance" ideal -- but practically speaking, if you don't have control over your developers / QC, and especially when it comes to enforcing rules whose violation could blow your database up or break dependent objects like reports, you need some barrier against (for example) a duplicate key entry turning up somewhere.

How to avoid geometric slowdown with large Linq transactions?

I've written some really nice, funky libraries for use in LinqToSql. (Some day when I have time to think about it I might make it open source... :) )
Anyway, I'm not sure if this is related to my libraries or not, but I've discovered that when I have a large number of changed objects in one transaction, and then call DataContext.GetChangeSet(), things start getting reaalllly slooowwwww. When I break into the code, I find that my program is spinning its wheels doing an awful lot of Equals() comparisons between the objects in the change set. I can't guarantee this is true, but I suspect that if there are n objects in the change set, then the call to GetChangeSet() is causing every object to be compared to every other object for equivalence, i.e. at best (n^2-n)/2 calls to Equals()...
Yes, of course I could commit each object separately, but that kinda defeats the purpose of transactions. And in the program I'm writing, I could have a batch job containing 100,000 separate items, that all need to be committed together. Around 5 billion comparisons there.
So the question is: (1) is my assessment of the situation correct? Do you get this behavior in pure, textbook LinqToSql, or is this something my libraries are doing? And (2) is there a standard/reasonable workaround so that I can create my batch without making the program geometrically slower with every extra object in the change set?
In the end I decided to rewrite the batches so that each individual item is saved independently, all within one big transaction. In other words, instead of:
var b = new Batch { ... };
while (addNewItems) {
    ...
    var i = new BatchItem { ... };
    b.BatchItems.Add(i);
}
b.Insert(); // that's a function in my library that calls SubmitChanges()
...you have to do something like this:
context.BeginTransaction(); // another one of my library functions
try {
    var b = new Batch { ... };
    b.Insert(); // save the batch record immediately
    while (addNewItems) {
        ...
        var i = new BatchItem { ... };
        b.BatchItems.Add(i);
        i.Insert(); // send the SQL on each iteration
    }
    context.CommitTransaction(); // and only commit the transaction when everything is done
} catch {
    context.RollbackTransaction();
    throw;
}
You can see why the first code block is just cleaner and more natural to use, and it's a pity I got forced into using the second structure...

how to use exceptions in this scenario?

I have a method which handles a set of records. This method returns true/false after processing; so if all the records are processed (doing some DB updates), it will return true. Now suppose that after processing 1 record some exception is thrown - should I write result = false (at the end of the method, result is returned) in the catch block, and allow processing of the other records to continue?
Continuing to add data to the dbase when adding one record failed is almost always wrong. Records are very frequently related. They represent a set of transactions on a bank account. Or a batch of orders from a customer. Adding these with one of them missing is always a problem.
Not only do you give your client a huge problem coming up with a new batch that contains the single corrected record, you make it far too easy to allow somebody to just ignore the error. The kind of error that doesn't get discovered or causes problems until much later. Invariably with a huge cost associated with correcting the error.
When an error occurs, reject the entire batch. Keep the dbase in a proper state by using transactions. Use, say, SqlTransaction and call BeginTransaction() when you start. Call Commit() when everything worked, call Rollback() in your catch clause.
Your client can now go back to the sub-system that generates the records, make the correction and re-run your program. Your dbase will always contain a proper copy of that sub-system's data. And errors cannot be ignored.
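A minimal sketch of that all-or-nothing shape (assuming SQL Server; connectionString, records, and the per-record InsertRecord helper are hypothetical):
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var transaction = connection.BeginTransaction())
    {
        try
        {
            foreach (var record in records)
                InsertRecord(connection, transaction, record); // each INSERT enlists in the transaction

            transaction.Commit(); // every record made it in
        }
        catch
        {
            transaction.Rollback(); // reject the entire batch
            throw;
        }
    }
}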
How you handle this with exceptions will be determined very much by what you want to happen in the event of something going wrong. You could, as you say, just write result = false in your catch blocks, but this means you are simply saying to the calling function "Hey - some records were not processed - live with it...". That might be enough for you - it depends what you're trying to do.
At the very least though, I would want to also write the details of the exceptions away to a log. And if you don't have a method somewhere that takes an exception and writes away to a log, it's time to write one (or use a third party solution...)
Otherwise you are losing information that could be useful in determining why things failed...
Whether you process those records you can or throw everything out in the event of a problem is a design question that only you can answer - we don't have the context...
I think it could be something like this:
int count = 0;
foreach (var item in list)
{
    try
    {
        // update DB
        ++count;
    }
    catch (Exception ex)
    {
        // log exception
    }
}
// true only if every item was updated successfully
return count == list.Count;
Another way:
bool result = true;
foreach (var item in list)
{
    try
    {
        // update DB
    }
    catch (Exception ex)
    {
        // log exception
        result = false; // remember the failure but keep processing
    }
}
return result;
