We have noticed that when a DevForce request times out, it is automatically retried. This behavior has also been mentioned on the forums here. In that forum post, the suggested workaround is to increase the timeout and avoid the problem altogether. That is not a viable option for us: there are some operations that we know will time out, and raising the timeout is not acceptable.
Worse, if the call is a stored procedure query or an InvokeServerMethod call, it's very possible that the call is not idempotent, so retrying it is not safe and could easily do more harm than good. We've started running into cases like that in our app, and it is causing major pain. A simple example: we call a stored procedure that creates a copy of an item. If the copy takes too long, the call keeps getting retried, which just means we have three copy operations all running in parallel. The end result is that the user gets an error (because the third retry still times out), yet there will eventually be three copies of the item (the stored procedure eventually finishes; the retry logic doesn't seem to cancel the previous requests, and I'm not even sure such cancellation is possible). And that is one of the more benign examples; in other cases, the retried operations can cause even worse problems.
I see from the 6.1.6 release notes that DevForce no longer performs automatic retry for saves. I'd really like to see that behavior extended to StoredProcedureQueries and InvokeServerMethods. For normal EntityQuery operations (and probably even Connect/Disconnect calls), I'm fine with the retry. If this isn't something that can be changed in the core of DevForce, is there a way to make it configurable, or to provide some custom way for us to inject code that controls this?
The auto retry behavior for communication failures is configurable in the 7.2.4 release now available. See the release notes for usage information.
By looking at this official documentation it seems that there are basically three types of errors thrown by the MongoDB C# driver:
errors thrown when the driver is not able to properly select or connect to a Server to issue the query against. These errors lead to a TimeoutException
errors thrown when the driver has successfully selected a Server to run the query against, but the server goes down while the query is being executed. These errors manifest themselves as MongoConnectionException
errors thrown during write operations. These errors lead to MongoWriteException or MongoBulkWriteException, depending on the type of write operation being performed.
I'm trying to make my software that uses MongoDB a bit more resilient to transient errors, so I want to figure out which exceptions are worth retrying.
The problem is not implementing a solid retry policy (I usually employ Polly .NET for that), but instead understanding when the retry makes sense.
I think that retrying on exceptions of type TimeoutException doesn't make sense, because the driver itself already waits several seconds before timing out an operation (the default is 30 seconds, but you can change that via the connection string options). Retrying an operation that has already waited 30 seconds before failing is probably a waste of time. For instance, if you implement 3 retries with 1 second of waiting time between them, it can take up to 93 seconds to fail an operation (30 + 30 + 30 + 1 + 1 + 1). That is a very long time.
As documented here, retrying on MongoConnectionException is only safe for idempotent operations. From my point of view, it makes sense to always retry on this kind of error, provided that the operation being performed is idempotent.
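To make the idea concrete, this is roughly the shape of the Polly policy I have in mind for idempotent reads. It's only a sketch under my own assumptions: MyDocument and FindByIdAsync are placeholders standing in for my own code, and TimeoutException is deliberately left out of the Handle clause for the reasons above.

using System;
using System.Threading.Tasks;
using MongoDB.Driver;
using Polly;

public static class MongoRetry
{
    // Retry only on connection-level failures; a TimeoutException means the
    // driver has already waited the full timeout, so it is not handled here.
    private static readonly IAsyncPolicy RetryOnConnectionLoss =
        Policy
            .Handle<MongoConnectionException>()
            .WaitAndRetryAsync(
                retryCount: 3,
                sleepDurationProvider: attempt => TimeSpan.FromSeconds(attempt));

    // The query is idempotent, so re-running it on a connection failure is safe.
    public static Task<MyDocument> FindByIdAsync(
        IMongoCollection<MyDocument> collection, string id)
    {
        return RetryOnConnectionLoss.ExecuteAsync(
            () => collection.Find(d => d.Id == id).FirstOrDefaultAsync());
    }
}

// Placeholder document type used only for this example.
public class MyDocument
{
    public string Id { get; set; }
}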
The hard bit in deciding a good retry strategy for writes is when you get an exception of type MongoWriteException or MongoBulkWriteException.
Regarding exceptions of type MongoWriteException, it is probably worth retrying all of them except those with a ServerErrorCategory of DuplicateKey. As documented here, you can detect duplicate key errors by using this property of the MongoWriteException.WriteError object.
Retrying duplicate key errors probably doesn't make sense, because you will just get them again (that's not a transient error).
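As a sketch, the check I have in mind looks roughly like this (the helper name is mine, not part of the driver):

using MongoDB.Driver;

// Only retry write errors whose category is not DuplicateKey; a duplicate
// key violation is not transient and will simply fail again.
static bool IsWorthRetrying(MongoWriteException ex)
{
    return ex.WriteError != null
        && ex.WriteError.Category != ServerErrorCategory.DuplicateKey;
}

That predicate could then be plugged into something like Policy.Handle<MongoWriteException>(IsWorthRetrying) so that only the presumably transient categories are retried.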
I have no idea how to handle errors of type MongoBulkWriteException safely. In that case you are inserting multiple documents into MongoDB, and it is entirely possible that only some of them have failed while the others have been successfully written. So retrying the exact same bulk insert operation could lead to writing the same documents twice (bulk writes are not idempotent by nature). How can I handle this scenario?
Do you have any suggestions?
Do you know of any working example or reference on retrying MongoDB queries with the C# driver?
Retry
Let's start with the basics of Retry.
There are situations where your requested operation relies on a resource that might not be reachable at a certain point in time. In other words, there can be a temporary issue that will vanish sooner or later. This sort of issue can cause transient failures. With retries you can overcome these problems by attempting the same operation again at some point in the future. To be able to use this mechanism, the following criteria should be met:
The potentially introduced observable impact is acceptable
The operation can be redone without any irreversible side effect
The introduced complexity is negligible compared to the promised reliability
Let’s review them one by one:
The word failure indicates that the effect is observable by the requester as well, for example as higher latency or reduced throughput. If that penalty (delay or reduced performance) is unacceptable, then retry is not an option for you.
This requirement is also known as an idempotent operation: if I call the action with the same input several times, it produces the exact same result. In other words, the operation acts as if it depends only on its parameters and nothing else influences the result (such as other objects' state).
Even though this condition is one of the most crucial, it is the one that is almost always forgotten. As always there are trade-offs (if I introduce Z then it will increase X but it might decrease Y), and we should be fully aware of them. Otherwise they will give us unwanted surprises at the least expected time.
Mongo Exception
Let's continue with the exceptions that MongoDB's C# client can throw.
I haven't used MongoDB in the last couple of years, so this knowledge may be outdated, but I hope the essence has not changed since.
I would also encourage you to introduce detection logic first (catch and log) before you try to mitigate the problem (for example, with retry). This will give you information about the frequency and number of occurrences, and insight into the nature of the problems.
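As a rough sketch of that detection step, before any mitigation is added (the log delegate is just a placeholder for whatever logging you already have):

using System;
using System.Threading.Tasks;
using MongoDB.Driver;

// Run a MongoDB call unchanged, but record every failure so you can see
// how often, and which kind of, transient errors actually occur.
static async Task<T> ExecuteAndLogAsync<T>(Func<Task<T>> operation, Action<string> log)
{
    try
    {
        return await operation();
    }
    catch (Exception ex) when (ex is MongoException || ex is TimeoutException)
    {
        log($"{ex.GetType().Name}: {ex.Message}");
        throw; // behaviour stays the same, we only added visibility
    }
}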
MongoConnectionException with a SocketException as Inner
When:
There is a server selection problem
The connection has timed out
The chosen server is unavailable
Retry:
If the problem is due to a network issue then it might be useful to retry
If the root cause is misconfiguration then retry won't help
Log:
ConnectionId and Message
ToJson might be useful as well
MongoWriteException or MongoWriteConcernException
When:
There was a persistence problem
Retry:
It depends: if you perform a create operation and the server can detect duplicates (DuplicateKeyError), then it is safe to try the write multiple times rather than stopping at one failed attempt
Most of the time updates are not idempotent, but if you use some sort of record versioning, you can retry and let the optimistic locking check fail when appropriate
Deletion can be implemented in an idempotent way; this is true for both soft and hard deletes
Log:
WriteError, WriteConcernError and Message
In case of MongoWriteConcernException: Code, Command and Result
I have developed a Hangfire application using MVC running in IIS, and it was working absolutely fine until I saw the size of my SQL Server log file, which grew to a whopping 40 GB overnight!
According to our DBA, there was a long-running transaction with the following SQL statement (I have 2 Hangfire queues in place):
(@queues1 nvarchar(4000), @queues2 nvarchar(4000), @timeout float)
delete top (1) from [HangFire].JobQueue with (readpast, updlock, rowlock)
output DELETED.Id, DELETED.JobId, DELETED.Queue
where (FetchedAt is null or FetchedAt < DATEADD(second, @timeout, GETUTCDATE()))
and Queue in (@queues1, @queues2)
On exploring the Hangfire library, I found that this statement is used for dequeuing jobs and performs a very simple task that should not take any significant time.
I couldn't find anything that would have caused this. Transactions are used correctly with using statements, and objects are disposed of in the event of an exception.
As suggested in some posts, I have checked the recovery mode of my database and verified that it is simple.
I have manually killed the hung transaction to reclaim the log file space, but it came up again after a few hours. I am observing it continuously.
What could be the reason for such behavior, and how can it be prevented?
The issue seems to be intermittent, and it could be extremely risky to deploy to production :(
Starting from Hangfire 1.5.0, Hangfire.SqlServer implementation wraps the whole processing of a background job with a transaction. Previous implementation used invisibility timeout to provide at least once processing guarantee without requiring a transaction, in case of an unexpected process shutdown.
I implemented a new model for queue processing because there was a lot of confusion for new users, especially ones who had just installed Hangfire and played with it under a debugging session. There were a lot of questions like "Why is my job still in the Processing state?". I considered that there might be problems with transaction log growth, but I didn't know this could happen even with the Simple recovery model (please see this answer to learn why).
It looks like there should be a switch to choose which queue model to use: based on transactions (by default) or based on an invisibility timeout. But this feature will only be available in 1.6, and I don't have any ETA yet.
Currently, you can use Hangfire.SqlServer.MSMQ or any other non-RDBMS queue implementation (please see the Extensions page). A separate database for Hangfire may also help, especially if your application changes a lot of data.
With my WCF service, I am dealing with an issue that has both performance and design implications.
The service is a stateless RESTful PerCall service that does a lot of simple and common things, which all work like a charm.
But there is one operation that has started to scare me a lot recently, so here is the problem:
Clients make parameterized calls to the operation, and computing the result takes a long time. But the result of a call with identical parameters will always be the same until the data on the server change. And clients make an awful LOT of calls with exactly the same parameters. The server, however, cannot predict which parameters users will ask for, so sadly the results cannot be precomputed.
So I came up with a caching layer and store each result object as a key-value pair, where the key represents the parameters that led to that result. If the relevant data change, I just flush the cache. Still simple, and no problems with this.
A client calls the service, the server receives the call, checks whether the result is already cached, and returns it if so. But if the result is not cached yet, that call starts the computation. The computation may take up to 2 minutes (average 10-15 seconds) to finish, and in that time other clients may come in; because the result is still not in the cache, each of them would start its own computation. Which is NOT what we want, so there is a flag indicating that someone has already started the computation with these parameters. This is the place in the code where the other callers stop and wait for the computation to finish and be inserted into the cache, from which each of the invoked instances grabs the result, returns it to the client and disposes.
And this is the part, which I am really struggling with.
By now, my solution looks something like this (before you read further, I want to warn you that my experience is not at a decent level and I am still a big noob in C#, WCF and related stuff... no need to tell me I'm a noob, because I am fully aware of that):
// Poll the cache until the result appears or we give up after `threshold`.
Stopwatch sw = new Stopwatch();
sw.Start();
while (true)
{
    if (Cache.Contains(parameters) || sw.Elapsed > threshold)
        break;

    Thread.Sleep(100); // busy-wait in 100 ms steps
}
...do relevant stuff here
As you see, there are more problems with this solution:
Having the loop, the check and all this stuff not only feels ugly; with many clients waiting this way, resource usage tends to jump up.
If the operation fails (the initial caller's computation fails to deliver within the threshold), I do not really know which client should be the next one to try the computation, or how, or even whether I should run the operation again or return a fault to the client...
EDIT: This is not related to synchronization; I am aware of the need for locking in some parts of my application, so my concerns are not synchronization-related.
What should I do when the relevant server-side data change while the invoked code is still performing the computation (which would make that result wrong)? ... Moreover, this has some other horrible effects on the application, but yeah, I am getting to the question here:
So, as usual, I did my homework and googled around before asking, but did not succeed in finding guidance that I could either understand or that would suit my issues and domain.
I have a strong feeling that I have to introduce some kind of (static?) event-based and/or asynchronous class (call it a layer if you will) that does some tricks and organizes and manages all these things in some kind of register-to-me-and-I-will-give-you-a-poke / poke-all-registered-threads manner. But despite being able (to a certain extent) to use the newly introduced tasks, TPL and async-await, I not only have very limited experience in this field; more sadly, I really need help understanding how it could come together with events (or do I even need them?). When I try and run little things in a test console application I might succeed, but bringing it into the bigger environment of my WCF application, I struggle to get a clue.
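To make it a bit more concrete, this is roughly the kind of layer I imagine, built on ConcurrentDictionary and Lazy<Task<T>> (the class and method names are made up, and I am not at all sure this is the right approach):

using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public class ComputationCache<TResult>
{
    // One pending-or-completed task per parameter key. Lazy guarantees the
    // compute delegate runs only once even if many callers race for the key.
    private readonly ConcurrentDictionary<string, Lazy<Task<TResult>>> _entries =
        new ConcurrentDictionary<string, Lazy<Task<TResult>>>();

    public Task<TResult> GetOrComputeAsync(string key, Func<Task<TResult>> compute)
    {
        var lazy = _entries.GetOrAdd(
            key,
            _ => new Lazy<Task<TResult>>(compute, LazyThreadSafetyMode.ExecutionAndPublication));

        var task = lazy.Value;

        // If the first caller's computation already failed, forget it so the
        // next caller can try again instead of inheriting the fault forever.
        if (task.IsFaulted)
        {
            _entries.TryRemove(key, out _);
        }

        return task;
    }

    // Call this when the relevant server-side data change.
    public void Flush() => _entries.Clear();
}

Each service call would then await GetOrComputeAsync(keyBuiltFromParameters, () => ComputeResultAsync(parameters)) instead of polling, where ComputeResultAsync is whatever my expensive computation becomes.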
So guys, I will gladly welcome every kind of relevant thought, advice, guidance, link, code and criticism touching my topic.
I am aware that it might be confusing, and I will do my best to clear up any misunderstandings and tricky parts; just ask me to.
Thanks for help!
I want a certain action request to trigger a set of e-mail notifications. The user does something, and it sends the emails. However, I do not want the user to wait for the page response until the system generates and sends the e-mails. Should I use multithreading for this? Will this even work in ASP.NET MVC? I want the user to get a page response back and the system to finish sending the e-mails at its own pace. I'm not even sure if this is possible or what the code would look like. (PS: Please don't offer me an alternative solution for sending e-mails; I don't have time for that kind of reconfiguration.)
SmtpClient.SendAsync is probably a better bet than manual threading, though multi-threading will work fine with the usual caveats.
http://msdn.microsoft.com/en-us/library/x5x13z6h.aspx
As other people have pointed out, success/failure cannot be indicated deterministically when the page returns before the send is actually complete.
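For illustration, a bare-bones use of SendAsync might look like this (the host and addresses are obviously placeholders, and the failure handling is reduced to a comment):

using System.Net.Mail;

static void QueueNotificationEmail()
{
    // Build the message and hand it to SendAsync; the request thread is not
    // blocked while the SMTP server is contacted.
    var client = new SmtpClient("smtp.example.com");
    var message = new MailMessage(
        "noreply@example.com", "user@example.com",
        "Notification", "Your action was processed.");

    client.SendCompleted += (sender, e) =>
    {
        // This fires later, after the page has already been returned, so a
        // failure can only be logged, not shown to the user.
        if (e.Error != null)
        {
            // log e.Error here
        }
        message.Dispose();
        client.Dispose();
    };

    client.SendAsync(message, null);
}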
A couple of observations when using asynchronous operations:
1) They will come back to bite you in some way or another. It's a risk versus benefit discussion. I like the SendAsync() method I proposed because it means forms can return instantly even if the email server takes a few seconds to respond. However, because it doesn't throw an exception, you can have a broken form and not even know it.
Of course unit testing should address this initially, but what if the production configuration file gets changed to point to a broken mail server? You won't know it, you won't see it in your logs; you only discover it when someone asks why you never responded to the form they filled out. I speak from experience on this one. There are ways around this, but in practice, async is always more work to test, debug, and maintain.
2) Threading in ASP.Net works in some situations if you understand the ThreadPool, app domain refreshes, locking, etc. I find that it is most useful for executing several operations at once to increase performance where the end result is deterministic, i.e. the application waits for all threads to complete. This way, you gain the performance benefits while still having a clear indication of results.
3) Threading/Async operations do not increase performance, only perceived performance. There may be some edge cases where that is not true (such as processor optimizations), but it's a good rule of thumb. Improperly used, threading can hurt performance or introduce instability.
The better scenario is out of process execution. For enterprise applications, I often move things out of the ASP.Net thread pool and into an execution service.
See this SO thread: Designing an asynchronous task library for ASP.NET
I know you are not looking for alternatives, but using a message queue (such as MSMQ) could be a good solution for this problem in the future. Using multithreading in ASP.NET is normally discouraged, but in your current situation I don't see why you shouldn't. It is definitely possible, but beware of the pitfalls related to multithreading (quoted from here):
• There is a runtime overhead associated with creating and destroying threads. When your application creates and destroys threads frequently, this overhead affects the overall application performance.
• Having too many threads running at the same time decreases the performance of your entire system. This is because your system is attempting to give each thread a time slot to operate inside.
• You should design your application well when you are going to use multithreading, or otherwise your application will be difficult to maintain and extend.
• You should be careful when you implement a multithreading application, because threading bugs are difficult to debug and resolve.
At the risk of violating your no-alternative-solution prime directive, I suggest that you write the email requests to a SQL Server table and use SQL Server's Database Mail feature. You could also write a Windows service that monitors the table and sends emails, logging successes and failures in another table that you view through a separate ASP.Net page.
You can probably use ThreadPool.QueueUserWorkItem.
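For example, something roughly like this, where the Action you pass in stands for your own e-mail sending code:

using System;
using System.Threading;

static void QueueEmailWork(Action sendEmails)
{
    // Queue the e-mail work on a pool thread and return to the caller at once.
    ThreadPool.QueueUserWorkItem(_ =>
    {
        try
        {
            // Your own code that builds and sends the messages.
            sendEmails();
        }
        catch (Exception ex)
        {
            // Log ex here; an unhandled exception on a pool thread
            // would take the whole process down.
        }
    });
}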
Yes this is an appropriate time to use multi-threading.
One thing to look out for, though, is how you will tell the user when the email sending ultimately fails. Not blocking the user is a good step toward improving your UI, but it still must not provide a false sense of success when the operation ultimately fails at a later time.
Don't know if any of the above links mentioned it, but don't forget to keep an eye on request timeout values; the queued items will still need to complete within that time period.
I'm writing an application in C# which accesses a SQL Server 2005 database. The application is quite database intensive, and even if I try to optimize all access, set up proper indexes and so on, I expect that I will get deadlocks sooner or later. I know why database deadlocks occur, but I doubt I'll be able to release the software without deadlocks occurring at some point. The application uses Entity Framework for database access.
Are there any good patterns for handling SqlExceptions (deadlock victim) in the C# client code, for example re-running the statement batch after x milliseconds?
To clarify; I'm not looking for a method on how to avoid deadlocks in the first place (isolation levels, indexes, order of statements etc) but rather how to handle them when they actually occur.
I posted a code sample to handle exactly this a while back, but SO seems to have lost my account in the interim, so I can't find it now, I'm afraid, and I don't have the code I used here.
Short answer: wrap the thing in a try..catch. If you catch an error that looks like a deadlock, sleep for a short random time and increment a retry counter. If you get any other error, or the retry counter exceeds your threshold, throw the error back up to the calling routine.
(And if you can, try to bung this in a general routine and run most/all of your DB access through it so you're handling deadlocks program-wide.)
EDIT: Ah, teach me not to use Google! The previous code sample I and others gave is at How to get efficient Sql Server deadlock handling in C# with ADO?
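If it helps, a stripped-down sketch of that pattern might look like this (1205 is SQL Server's deadlock-victim error number; the retry count and back-off are arbitrary):

using System;
using System.Data.SqlClient;
using System.Threading;

static void ExecuteWithDeadlockRetry(Action databaseWork, int maxRetries = 5)
{
    var random = new Random();

    for (int attempt = 1; ; attempt++)
    {
        try
        {
            databaseWork();
            return;
        }
        catch (SqlException ex) when (ex.Number == 1205 && attempt < maxRetries)
        {
            // We were chosen as the deadlock victim: back off for a short,
            // random interval and run the whole batch again.
            Thread.Sleep(random.Next(100, 500));
        }
        // Any other error, or too many deadlocks in a row, simply propagates.
    }
}

With Entity Framework you would probably want the delegate to re-run the whole unit of work, including creating a fresh context, rather than a single command.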
Here is the approach we took in the last application framework I worked on. When we detected a deadlock, we simply reran the transaction, up to 5 times. If it still failed after 5 attempts, we threw an exception. I don't recall a time that the second attempt ever failed. We would know, because we were logging all activity in the backend code, so we knew every time a deadlock occurred and whether it failed more than 5 times. This approach worked well for us.
Randy