I understand how editing rows can cause concurrency issues, but concurrency issues being caused by selecting rows is something I do not understand.
If a query only selects data from a database, how can a concurrency issue arise? Is it that if a change is made to the data I'm selecting, things will blow up?
In any case, if there is a concurrency issue caused by a select query, what is the best way to handle it? This is what I have in mind, but I wouldn't be surprised at all if it were wrong.
try
{
    var SelectQuery =
        from a in DB.Table
        where a.Value == 1
        select new { Result = a };
}
catch
{
    // retry query??
}
In this case your select operation essentially amounts to a read / query. Even read only operations can cause concurrency issues in an application.
The simplest example is when the object being read from has thread affinity and the read occurs from a different thread. This can cause a race since the data is being accessed in an improper way.
The best way to handle a concurrency issue is simply to avoid it. If you have 2 threads playing with the same piece of data, using a lock to serialize access to the data is probably the best approach. Although a definitive solution requires a bit more detail.
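For the two-threads case, a minimal sketch of serializing access with a lock (the SharedCounter type here is a made-up example, not something from the question):

```csharp
using System.Threading;

class SharedCounter
{
    private readonly object _sync = new object();
    private int _value;

    // Writers and readers take the same lock, so a read can never
    // observe the data in the middle of an update.
    public void Increment() { lock (_sync) { _value++; } }

    public int Read() { lock (_sync) { return _value; } }
}
```

The point is that the read is serialized too; locking only the writes would not prevent the race.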
Can you explain what is happening here and why the race is occurring? Do other threads modify this object while you are reading it?
When your query is run, a SQL query will be generated to correspond to your query. If other threads (or anything else) are attempting to modify the tables involved in your query, the database server will generally detect this and take care of the logic necessary to keep this from causing any real problems. It may take a little longer for your query to execute if it keeps bumping heads with update statements, but the only real problem would be if the system detects that some combination of running transactions is actually causing a deadlock. In this case, it will kill one of those connections. I believe this would only happen if your statements are attempting to update database values--not from selects.
A more important point to make, looking at your example, is that the code that you put in the try/catch block doesn't actually do any querying. It just builds an expression tree. The SQL query will not actually be run until you do something that causes this expression tree to be evaluated, like calling SelectQuery.ToList().
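To make that concrete, here is a sketch using the same hypothetical query as the question:

```csharp
// This line only builds an expression tree; no SQL is generated
// or run here, so no database exception can be thrown yet.
var SelectQuery =
    from a in DB.Table
    where a.Value == 1
    select new { Result = a };

// The SQL is generated and executed here, when the tree is
// evaluated -- this is the call that belongs inside a try/catch.
var results = SelectQuery.ToList();
```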
Keep in mind that there are a number of things that can "go wrong" when you're trying to query a database. Maybe somebody's doing massive updates of the data you're trying to select, and your connection times out before finishing the query. Maybe a cable gets bumped, or a random bit of cosmic radiation causes a bit somewhere to get lost. Then again, maybe your query has something wrong with it, or maybe the database context you're using is not synchronized to the database schema. Some of the things that could go wrong are only intermittent, and you could just try again like your question suggests. Other things might be longer-lasting, and will keep recurring. For these latter cases, if you try to repeat your action until you stop getting errors, your thread may hang there for a very long time.
So when you're deciding how to handle database connection problems, pay attention to how often you expect each type of problem to occur. I have seen code that attempts to run a transaction three times before giving up, like this. But when it comes to everyday queries, this sort of thing happens so rarely that I personally would just allow the exception to trickle up to where the user interface can say "There was an unexpected error. Please try again. If the problem persists, contact your administrator." Or something like that.
Related
I'm dealing with a fairly large-scale C# application which occasionally hits a SQL Server deadlock. I can't figure out what's causing it. My bandaid solution for now is:
(1) Catch the SqlException.
(2) See if the error code is 1205 (i.e. deadlock).
(3) If it is, Sleep() for a couple of seconds and retry the entire transaction. You can assume the previously failed transaction was successfully rolled back.
This works. The problem occurs only once or twice per week so the performance penalty is trivial. It still bugs me that it's occurring though. How do I figure out why it is occurring?
I know a deadlock occurs when two or more SQL Server threads are contending for the same resources. I obviously know which one of my transactions LOSES that battle; it's always the same query. But I'd like to know which transaction is WINNING the battle. Maybe it's the same code block, maybe not. I have no way to tell. Is there some special tool I should use to find the deadlock's source?
More info: The losing transaction isn't doing anything particularly exotic; just two large deletes via ExecuteNonQuery() followed by two large bulk inserts using the SqlBulkCopy class -- all in the same SqlTransaction. Both READER_COMMITTED_SNAPSHOT and ALLOW_SNAPSHOT_ISOLATION are turned on. There are no humans making ad-hoc queries against the database. My application is the only user.
Again, this works flawlessly 99.99%+ of the time... the deadlock is quite rare. It manifests only once or twice per week.
I'm programming a C# application with a SQLite database, and I sometimes get the message Database Locked (5) in my output log. I don't want that to happen; I know that multiple simultaneous actions performed against my database give me this exception.
Question:
Is there a way to see if the SQLiteDatabase is busy processing other queries?
If you're creating a multithreaded application, having the knowledge you ask for will not solve any problems. They might make them appear less often, but occasionally you will see the same problem as before.
Why? Because from the instant where you ask "are you locked?" to the instant where you say "in that case, let me lock you to do something", another thread might have jumped in and locked it.
Unless you get a timeout, I would handle the exceptions and determine in each case what to do.
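As a sketch of that exception-handling approach, assuming the System.Data.SQLite provider (where SQLiteException exposes a ResultCode; other providers differ, so adjust accordingly):

```csharp
// Retry a command a few times when SQLite reports it is busy/locked
// (result code 5); any other error is rethrown immediately.
const int maxAttempts = 3;
for (int attempt = 1; ; attempt++)
{
    try
    {
        command.ExecuteNonQuery();
        break;
    }
    catch (SQLiteException ex) when (ex.ResultCode == SQLiteErrorCode.Busy)
    {
        if (attempt == maxAttempts) throw;   // give up after a few tries
        Thread.Sleep(100 * attempt);         // back off, then retry
    }
}
```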
Additionally, since SQLite is inherently a single-user database (and not user as in person, but user as in "something that uses the database"), you might consider whether it is growing to be the wrong tool for the job.
Before I start, I couldn't find any other resources to answer my question, closest is:
Calling a stored procedure simultaniously from multiple threads in asp.net and sql server 2005
but it fails to answer my specific issue/concern.
Basically, I have a massive .net web app that handles millions of requests a day.
Assume:
All of the sprocs concerned are simple get sprocs (e.g., SELECT [SOMETHING] FROM [SOMEWHERE] INNER JOIN [SOMETHING ELSE] etc.).
The data never changes (it does change from time to time; for the sake of my scenario, assume it doesn't).
The cache is initially empty for whatever reason.
The method in question:
I check for the existence of the object in the application cache. If it exists, I simply return it. If the object is not in cache, a sproc call is made to the database to look up this data. Once the sproc returns, this data is added to cache and then returned.
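In code, the method described is roughly this (Cache and CallSproc are placeholder names standing in for the question's cache manager and sproc call):

```csharp
object GetCachedData(string key)
{
    var item = Cache.Get(key);
    if (item != null)
        return item;          // cache hit

    // Cache miss: note that nothing stops a second thread that also
    // missed from reaching this line and calling the sproc redundantly.
    item = CallSproc(key);
    Cache.Set(key, item);
    return item;
}
```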
Under heavy load I have a bit of a performance issue that I'd like to clear up.
Here's my scenario:
User A comes into this method.
Data is not in cache, sproc gets called.
User B comes into this method (while the sproc is still running).
Data is not in cache, sproc gets called.
Rinse and repeat over and over.
Under heavy load, these can generate quite a lot of concurrent and redundant active spids. I'm trying to figure out the best way around this. Obviously I could drop in an sp_getAppLock, but the requests would still end up 1) dropping into the sproc and 2) firing the exact same query. I could lock on an object that is specific to that exact query and wrap that around the cache check. But if I do that, I'm potentially opening the door to some massive thread contention and deadlocking.
I have to assume that someone has dealt with this very scenario before, and I'm hopeful there is an appropriate solution. Right now the best solution I can come up with is application locking, but I'd really like to know if anyone has any better options. Perhaps a combination of things, say SQL app locks and messaging (traditional or non-traditional), where after the lock succeeds, any requests that were just released try to pull down the result set (from where?) instead of re-executing the entire rest of the sproc.
EDIT:
So follow this... if I lock or "wait" on either the caching or the sproc call, then under heavy load, when an element is not cached, the method (or sproc) that generates the to-be-cached object could end up taking longer than expected. While that is spinning away, other threads are going to have to wait, and the only ways to wait (at least that I know of) are to lock or to spin.
Isn't it then possible to exhaust the thread pool, or lock up all available requests and force them to be queued? This is my fear, and the thing that drove me to look into moving the layer away from the application and into the database. The last time we attempted to lock around the caching, we suffered severe CPU spikes on our web box because the threads sat in a lock state for so long, though I believe at the time we did not use Monitor.Enter/Monitor.Exit (or just lock(){}). Either way, does anyone have any details or experience in this area? I know it's typically bad form to lock around long-running processes for this very reason. I would rather suffer loading duplicate content into cache than prevent user requests from dropping into the request queue because I'm all out of threads or all active requests are locked.
Or, maybe it's just late and I'm over thinking this. I had started my day with an almost brilliant, "ah-ha" moment. But now I just keep second guessing myself.
Your cache is most likely protected by a lock, so you are already serializing the threads.
Your suggested solution is the best: have a lock around the query. Once the cache is populated the performance difference will be negligible, and you'll avoid multiple (and expensive) database queries.
In the past I had this problem, when the cache was flushed and slow queries took my DB down.
One solution for this heavy problem is locking; ignore the Hebrew explanation and look at the code:
http://blogs.microsoft.co.il/blogs/moshel/archive/2009/07/11/cache.aspx
You may want to look into cache optimization if you haven't done so already.
If you are running through a cachemanager anyway, can it not be made smart enough to know that the proc has already been called and it should wait for it to complete?
GetData() {
    if (cached) return cache;      // fast path: already populated
    lock (sync) {
        // A second caller blocks here while the first runs the proc,
        // then re-checks and finds the cache already populated.
        if (!cached) {
            cache = CallProc();    // only one thread ever calls the proc
            cached = true;
        }
    }
    return cache;
}
I have a simple stored procedure in T-SQL that is instant when run from SQL Server Management Studio, and has a simple execution plan. It's used in a C# web front-end, where it is usually quick, but occasionally seems to get itself into a state where it sits there and times out. It then does this consistently from any web server. The only way to fix it that I've found is to drop and recreate it. It only happens with a single stored procedure, out of a couple of hundred similar procedures that are used in the application.
I’m looking for an answer that’s better than making a service to test it every n minutes and dropping and recreating on timeout.
As pointed out by other responses, the reasons could be many, varying from execution plan, to the actual SP code, to anything else. However, in my past experience, I faced a similar problem due to 'parameter sniffing'. Google for it and take a look, it might help. Basically, you should use local variables in your SP instead of the parameters passed in.
Not sure if my situation is too uncommon to be useful to others (It involved use of table variables inside the stored proc). But here is the story anyways.
I was working on an issue where a stored proc would take 10 seconds in most cases, but 3-4 minutes every now and then. After a little digging around, I found a pattern in the issue:
This being a stored proc that takes in a start date and an end date, if I ran it for a year's worth of data (which is what people normally do), it ran in 10 seconds. However, when the query plan cache was cleared out, and someone then ran it for a single day (an uncommon use case), all further calls for a 1-year range would take 3-4 minutes, until I did a DBCC FREEPROCCACHE.
The following 2 things were what fixed the problem
My first suspect was parameter sniffing. I fixed that immediately using the local-variable approach. This, however, improved performance only by a small percentage (<10%).
In a clutching-at-straws approach, I changed the table variables that the original developer had used in this stored proc, to temp tables. This was what fixed the issue finally. Now that I know that this was the problem, I am doing some reading online, and have come across a few links such as
http://www.sqlbadpractices.com/using-table-variable-for-large-table-vs-temporary-table/
which seem to correspond with the issue I am seeing.
Happy coding!!
It's hard to say for sure without seeing SP code.
Some suggestions.
SQL server by default reuses execution plan for stored procedure. The plan is generated upon the first execution. That may cause a problem. For example, for the first time you provide input with very high selectivity, and SQL Server generates the plan keeping that in mind. Next time you pass low selectivity input, but SP reuses the old plan, causing very slow execution.
Having different execution paths in SP causes the same problem.
Try creating this procedure WITH RECOMPILE option to prevent caching.
Hope that helps.
Run SQL Profiler and execute it from the web site until it happens again. When it pauses / times out check to see what is happening on the SQL server itself.
There are lots of possibilities here depending on what the sproc actually does. For example, if it is inserting records then you may have issues where the database server needs to expand the database and/or log file size to accept new data. If that's happening on the log file and you have slow drives, or are nearing the max of your drive space, then it could time out.
If it's a select, then those tables might be locked for a period of time due to other inserts happening... Or it might be reusing a bad execution plan.
The drop/recreate dance may only be delaying the execution to the point that the SQL server can catch up, or it might be causing a recompile.
My original thought was that it was an index, but on further reflection I don't think that dropping and recreating the stored proc would help.
It is most probably your cached execution plan that is causing this.
Try using DBCC FREEPROCCACHE to clear your cache the next time this happens. Read more here: http://msdn.microsoft.com/en-us/library/ms174283.aspx
Even this is a reactive step - it does not really solve the issue.
I suggest you execute the procedure in SSMS and check out the actual Execution Plan and figure out what is causing the delay. (in the Menu, go to [View] and then [Include Actual Execution Plan])
Let me just suggest that this might be unrelated to the procedure itself, but to the actual operation you are trying to do on the database.
I'm no MS SQL expert, but I wouldn't be surprised if it behaves similarly to Oracle when two concurrent transactions try to delete the same row: the transaction that first reaches the deletion locks the row, and the second transaction is then blocked until the first one either commits or rolls back. If that happens inside your procedure, it might appear "stuck" (until the "locking" transaction is finished).
Do you have any long-running transactions that might lock rows that your procedure is accessing?
I'm writing an application in C# which accesses a SQL Server 2005 database. The application is quite database intensive, and even if I try to optimize all access, set up proper indexes and so on, I expect that I will get deadlocks sooner or later. I know why database deadlocks occur, but I doubt I'll be able to release the software without deadlocks occurring at some point. The application is using Entity Framework for database access.
Are there any good pattern for handling SQLExceptions (deadlocked) in the C# client code - for example to re-run the statement batch after x milliseconds?
To clarify; I'm not looking for a method on how to avoid deadlocks in the first place (isolation levels, indexes, order of statements etc) but rather how to handle them when they actually occur.
I posted a code sample to handle exactly this a while back, but SO seemed to lose my account in the interim so I can't find it now I'm afraid and don't have the code I used here.
Short answer - wrap the thing in a try..catch. If you catch an error which looks like a deadlock, sleep for a short random time and increment a retry counter. If you get any other error, or the retry counter exceeds your threshold, throw the error back up to the calling routine.
(And if you can, try to bung this in a general routine and run most/all of your DB access through it so you're handling deadlocks program-wide.)
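A sketch of that general routine (1205 is SQL Server's deadlock-victim error number; the retry limit and sleep range here are arbitrary choices):

```csharp
using System;
using System.Data.SqlClient;
using System.Threading;

// General-purpose wrapper: run a DB action, retrying on deadlock.
static void ExecuteWithDeadlockRetry(Action dbAction, int maxRetries = 3)
{
    var rng = new Random();
    for (int attempt = 0; ; attempt++)
    {
        try
        {
            dbAction();
            return;
        }
        catch (SqlException ex) when (ex.Number == 1205)
        {
            if (attempt >= maxRetries)
                throw;                         // give up: bubble it up
            Thread.Sleep(rng.Next(100, 500));  // short random sleep, retry
        }
    }
}
```

Routing most of your data access through a wrapper like this gives you program-wide deadlock handling in one place.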
EDIT: Ah, teach me not to use Google! The previous code sample I and others gave is at How to get efficient Sql Server deadlock handling in C# with ADO?
Here is the approach we took in the last application framework I worked on. When we detected a deadlock, we simply reran the transaction. We did this up to 5 times. If after 5 times it failed, we would throw an exception. I don't recall a time that the second attempt ever failed. We would know because we were logging all activity in the backend code. So we knew any time a deadlock occurred, and we knew if it failed more than 5 times. This approach worked well for us.
Randy