Why does my website constantly freeze? - c#

This is a pretty vague question and getting it answered seems like a long shot, but I don't know what else to do.
Ever since I made my website live, every now and then it will just freeze. You click on a link and the browser will just sit there looking like it's trying to connect. It seems the freezing can last up to 2 minutes or so, then everything is just fine. Then a little while later, it will do the same thing.
I track all the exceptions that occur on my website in a log file.
I get these quite a bit ..
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding
And the stack trace shows it leading to some method that's connecting to the database.
I'm assuming the freezing has to do with this timeout problem. My website is hosted on a shared server, and my database is on some other server with about a billion other databases as well.
But even being on a shared server, this freezing problem happens all the time. It's extremely annoying. And I can see this being a pretty catastrophic problem considering my site is ecommerce based and people are doing transactions on it. The last thing I want is the site freezing when my users hit the 'Submit payment' button, which results in them hitting the button over and over again because the site froze, and then their credit card gets charged about 10 extra times.
Does anyone have any suggestions on the best way to handle this?

I am guessing that it has to do with the database connections. Check to see that they are getting released properly? If not then it will use them all up.
Also check to see if your database has connection pooling configured.

Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding
That's a Sql command timeout exception - it can be somewhat common if your database is under load. Make sure you're disposing of SqlConnections and SqlCommands - though that'd usually result in a pool timeout exception (can't retrieve a connection from the connection pool).
Odds are, someone is running queries that are badly tuned or otherwise sucking resources. It may be your site, but since you're on a shared db server, it could just as easily be someone else's. It could also be blocking, or open transactions - since those would be on your database, that'd be a coding issue. You'll probably need to get your hosting provider involved to track it down or move to a dedicated db server.
You can decrease the CommandTimeout of your SqlCommands - I know that sounds somewhat counter-intuitive, but I often find that it's better to fail early than try for 60 seconds throwing additional load on the server. If your .5 second query isn't done in 5 seconds, odds are it won't be done in 60 either.
Alternatively, if you're the patient type, you can increase the CommandTimeout - but there's also an ASP.NET request timeout (httpRuntime executionTimeout, 110 seconds by default since .NET 2.0; 90 seconds in 1.x) that you'll need to modify if you bump it up too much.
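To make the two suggestions above concrete (dispose connections and commands, and fail early with a short CommandTimeout), here is a minimal sketch. The class name, the Orders table and the 5-second value are assumptions for illustration only; it uses System.Data.SqlClient as in the .NET Framework.

```csharp
using System;
using System.Data.SqlClient;

public static class OrderQueries
{
    // Hypothetical helper: builds a command that gives up after a few seconds
    // instead of waiting out the default 30-second CommandTimeout.
    public static SqlCommand CreateCountCommand(SqlConnection connection, int timeoutSeconds)
    {
        // "Orders" is a made-up table name for this sketch.
        var command = new SqlCommand("SELECT COUNT(*) FROM Orders", connection);
        command.CommandTimeout = timeoutSeconds; // in seconds; 0 would mean wait indefinitely
        return command;
    }

    public static int CountOrders(string connectionString)
    {
        // The using blocks dispose the command and hand the connection back
        // to the pool even if ExecuteScalar throws a timeout exception.
        using (var connection = new SqlConnection(connectionString))
        using (var command = CreateCountCommand(connection, 5))
        {
            connection.Open();
            return (int)command.ExecuteScalar();
        }
    }
}
```

With this shape a slow database surfaces as a quick, catchable SqlException rather than a page that hangs for the full timeout, and no connection is left open on the error path.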

The timeout errors are definitely the source of the freezing pages. When it happens the page will wait something like a minute for the database connection before it returns the error message. As the web server only handles one page at a time from each user, the entire site will seem to be frozen for the user until the timeout error comes. Even if it only happens for a few users once in a while, it will seem quite severe to them as they can't access the site at all for a minute or so.
How severe the problem really is depends on how many errors you get. From your description it sounds like you get a bit too many to be normal.
Make sure that all your data readers, command objects and connection objects get disposed properly, so that you don't leave connections open.
Look for deadlock errors in the log also, as they can cause timeouts. If you have queries that lock each other, you may be able to improve them by changing the order that they use the tables.

Check the SQL Server logs, especially for deadlocks.
If you have multiple connections open, one might be waiting on a row that is locked by the other.

Related

SQL Server log file grew 40GB with Hangfire

I have developed a Hangfire application using MVC running in IIS, and it was working absolutely fine until I saw the size of my SQL Server log file, which grew by a whopping 40 GB overnight!!
As per information from our DBA, there was a long-running transaction with the following SQL statement (I have 2 Hangfire queues in place) -
(@queues1 nvarchar(4000),@queues2 nvarchar(4000),@timeout float)
delete top (1) from [HangFire].JobQueue with (readpast, updlock, rowlock)
output DELETED.Id, DELETED.JobId, DELETED.Queue
where (FetchedAt is null or FetchedAt < DATEADD(second, @timeout, GETUTCDATE()))
and Queue in (@queues1,@queues2)
On exploring the Hangfire library, I found that it is used for dequeuing the jobs, and doing a very simple task that should not take any significant time.
I couldn't find anything that would have caused this error. Transactions are used correctly with using statements, and objects are disposed in the event of an exception.
As suggested in some posts, I have checked the recovery model of my database and verified that it is simple.
I have manually killed the hung transaction to reclaim the log file space, but it comes back again after a few hours. I am observing it continuously.
What could be the reason for such behavior? And how can it be prevented?
The issue seems to be intermittent, and it could be of extremely high risk to be deployed on production :(
Starting from Hangfire 1.5.0, the Hangfire.SqlServer implementation wraps the whole processing of a background job in a transaction. The previous implementation used an invisibility timeout to provide an at-least-once processing guarantee without requiring a transaction, in case of an unexpected process shutdown.
I've implemented a new model for queue processing, because there was a lot of confusion for new users, especially ones who just installed Hangfire and played with it under a debugging session. There were a lot of questions like "Why my job is still under processing state?". I had considered that there might be problems with transaction log growth, but I didn't know this could happen even with the Simple Recovery Model (please see this answer to learn why).
It looks like there should be a switch for which queue model to use: based on transactions (by default) or based on invisibility timeout. But this feature will be available in 1.6 only and I don't know any ETAs yet.
Currently, you can use Hangfire.SqlServer.MSMQ or any other non-RDBMS queue implementation (please see the Extensions page). A separate database for Hangfire may also help, especially if your application changes a lot of data.

Checking for internet connection slows the load speed when disconnected

I want to create an easy autoupdate system in my program. It works fine, but I want it to proceed only when the user is connected to the internet.
I tried many approaches and every one worked, but when I'm disconnected from the internet, the application takes around 10 seconds to load, which is really slow. My program checks for the update on load and so does the connection test, which I think is the problem, because if I run the test inside a button click, it loads pretty fast, even when you are disconnected from the internet.
If you are curious, I tried to use every connection test I found, including System.Net.NetworkInformation.NetworkInterface.GetIsNetworkAvailable();.
Your problem is that checking for a connection has a timeout. When there's a connection it finds that out really fast (usually) and you don't notice the delay. When you don't have a connection it has to do more checks and wait for responses. I don't see any way to adjust the timeout, and even if you could, you'd risk it not detecting connections even if they were available.
You should run the check on a separate thread so that your GUI loading isn't disrupted.
Rather than checking at startup, check on a background thread while the application is running and update then. Any solution for checking connection can have a delay even if the internet is up, if there are DNS issues or just general slowness.
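As a rough sketch of the background-thread advice above, the check and the update can both be pushed onto a thread-pool thread so that loading the UI never waits on them. UpdateChecker, RunInBackground and the checkForUpdate callback are hypothetical names; Task.Run assumes .NET 4.5 or later.

```csharp
using System;
using System.Threading.Tasks;

public static class UpdateChecker
{
    // Runs the connectivity test and, if it passes, the update check on a
    // thread-pool thread. Returns true when the update check was started.
    // Both delegates are supplied by the caller; nothing here touches the UI.
    public static Task<bool> RunInBackground(Func<bool> isConnected, Action checkForUpdate)
    {
        return Task.Run(() =>
        {
            if (!isConnected())
            {
                return false; // offline: skip the update silently
            }

            checkForUpdate();
            return true;
        });
    }
}
```

From a form's Load handler this could be called as UpdateChecker.RunInBackground(System.Net.NetworkInformation.NetworkInterface.GetIsNetworkAvailable, CheckForUpdate), where CheckForUpdate is your own method. Because the callback runs off the UI thread, any UI changes it makes would need to be marshalled back (for example with Control.Invoke).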

Slow initial connection to API

I've got an API written in C# (webforms) and an SQL Server 2008 database that accepts JSON POST data on an AWS EC2 VM. My problem is that the "first" use of this API is rather slow to respond.
What I mean by "first" is that if I were to wait for an hour or so, then post some data, that would be the first. Subsequent posts would process rather quickly in comparison, and I would need to wait another hour or so before experiencing the slow "first" transaction again.
Since only the initial post is slow, it makes me wonder if something is "spinning down" after being idle for some time, and then spinning up again upon first use, adding the extra time.
Things I have tried -
Run program through a performance profiler - This didn't really help. As far as I can see, the program itself doesn't have any obvious parts that run very slowly or inefficiently.
Change configuration to persist at least 1 connection to the database at all times. Again, no real change. I did this by adding "Min Pool Size=1;Max Pool Size=100" to my connection string.
Change configuration to use named pipes instead of TCP. Once again, no real change. I did this by adding "np:" before the server specified in my connection string, eg. server=np:MyServer;database=MyDatabase;
Is there anything else I can do to diagnose the problem? What else should I be looking for in this scenario?
Chances are your app pool is shutting down after a designated period of non-use. The first call after the shutdown forces everything to get loaded back into memory which explains the lag.
You could play with these settings: http://technet.microsoft.com/en-us/library/cc771956%28v=ws.10%29.aspx to see if you get the desired effect, or set up a task scheduler job that makes at least one call every 10 minutes or so by doing a simulated post - a simple powershell script could handle that for you and will keep everything 'primed' for the next use.
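The answer suggests a PowerShell script; the same keep-alive idea can be sketched in C# as a small helper that a scheduled console program calls every few minutes. This is an illustration only: KeepAlive and PingAsync are made-up names, it issues a plain GET rather than a simulated post, and the URL you pass in would be whatever lightweight endpoint your API exposes.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class KeepAlive
{
    // Sends one request so the app pool stays loaded.
    // Returns true on a 2xx response, false on any failure or timeout.
    public static async Task<bool> PingAsync(HttpClient client, string url)
    {
        try
        {
            using (var response = await client.GetAsync(url))
            {
                return response.IsSuccessStatusCode;
            }
        }
        catch (HttpRequestException)
        {
            return false; // connection refused, DNS failure, etc.
        }
        catch (TaskCanceledException)
        {
            return false; // HttpClient.Timeout elapsed
        }
    }
}
```

A scheduled task could run a console program whose Main calls KeepAlive.PingAsync(new HttpClient(), url).Result and logs the outcome. Whether a GET is enough to warm up the code path used by real posts depends on the application.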

High number of Request Timeouts on IIS

I have a fairly busy site which does around 10m views a month.
One of my app pools seemed to jam up for a few hours and I'm looking for some ideas on how to troubleshoot it. I suspect that it somehow ran out of threads but I'm not sure how to determine this retroactively. Here's what I know:
The site never went 'down', but around 90% of requests started timing out.
I can see a high number of "HttpException - Request timed out." in the log during the outage
I can't find any SQL errors or code errors that would have caused the timeouts.
The timeouts seem to have been site wide on all pages.
There was one page with a bug on it which would have caused errors on that specific page.
The site had to be restarted.
The site is ASP.NET C# 3.5 WebForms.
Possibilities:
Thread depletion: My thought is that the page causing the error may have somehow started jamming up the available threads?
Global code error: Another possibility is that one of my static classes has an undiscovered bug in it somewhere. This is unlikely as this has never happened before, and I can't find any log errors for these classes, but it is a possibility.
UPDATE
I've managed to trace the issue now while it's occurring. The pages are being loaded normally but for some reason WebResource.axd and ScriptResource.axd are both taking a minute to load. In the performance counters I can see ASP.NET Requests Queued spikes at this point.
The first thing I'd try is Sam Saffron's CPU analyzer tool, which should give an indication if there is something common that is happening too much / too long. In part because it doesn't involve any changes; just run it at the server.
After that, there are various other debugging tools available; we've found that some very ghetto approaches can be insanely effective at seeing where time is spent (of course, it'll only work on the 10% of successful results).
You can of course just open the server profiling tools and drag in various .NET / IIS counters, which may help you spot some things.
Between these three options, you should be covered for:
code dropping into a black hole and never coming out (typically threading related)
code running, but too slowly (typically data access related)

.net remoting stops every 100 seconds

We have a very strange problem: one of our applications is continually querying a server using .NET remoting, and every 100 seconds the application stops querying for a short duration and then resumes the operation. The problem is on the client and not on the server, because the application actually queries several servers at the same time and stops receiving data from all of them at the same time.
100 seconds is a giveaway number, as it's the default timeout for a WebRequest in .NET.
I've seen in the past that the PSI (Project Server Interface within Microsoft Project) didn't override the timeout and so the default of 100 seconds was applied and would terminate anything talking to it for longer than that time.
Do you have access to all of the code and are you sure you have set timeouts where applicable so that any defaults are not being applied unbeknownst to you?
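For code paths that do go through HttpWebRequest, the 100-second default mentioned above can be overridden explicitly. The sketch below only shows where the setting lives; RequestFactory is a hypothetical name, and whether your remoting channel is affected by this setting is something you would need to verify.

```csharp
using System;
using System.Net;

public static class RequestFactory
{
    // Creates a request with explicit timeouts instead of relying on the
    // defaults (Timeout: 100,000 ms, ReadWriteTimeout: 300,000 ms).
    public static HttpWebRequest Create(string url, TimeSpan timeout)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        int milliseconds = (int)timeout.TotalMilliseconds;
        request.Timeout = milliseconds;          // waiting for the response
        request.ReadWriteTimeout = milliseconds; // reading or writing the stream
        return request;
    }
}
```

Creating the request does not send anything, so inspecting request.Timeout on an untouched request is also a quick way to confirm what default a given code path is running with.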
I've never seen that behavior before and unfortunately it's a vague enough scenario I think you're going to have a hard time finding someone on this board who's encountered the problem. It's likely specific to your application.
I think there are a few investigations you can do to help you narrow down the problem.
Determine whether it's the client or server that is actually stalling. If you have problems determining this, try installing a packet filter and monitor the traffic to see who sent the last data. You likely won't be able to read the binary data but at least you will get a sense of who is lagging behind.
Once you figure out whether it's the client or server causing the lag, attempt to debug into the application and get a breakpoint where the hang occurs. This should give you enough details to help track down the problem. Or at least ask a more defined question on SO.
How is the application coded to implement the continuous querying? Is it in a continuous loop? Or a loop with a Thread.Sleep? Or is it on a timer?
It would first be useful to determine if your system is executing this "trigger" in your code when you expect it to, or if it is, and the remoting server is not responding... so, ...
if you cannot reproduce this issue in a development environment where you can debug it, then, if you can, I suggest you add code to this Loop to write out to a log file (or some other persistence mechanism) each time it "should" be examining whatever conditions it uses to decide whether to query the remoting server or not, and then review those logs when the problem reoccurs...
If you can do the same in your remoting server, to record when the server receives a remoting request, this would help as well...
... and oh yes, just a thought (I don't know how you have coded this...), but if you are using a separate thread in the client to issue the remoting request, and the channel is being registered and unregistered on that separate thread, make sure you are deconflicting the requests, because you can't register the same port twice on the same machine at the same time...
(although this should probably have raised an exception in your client if this was the issue)
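The logging suggested above could look something like the following sketch, which timestamps the start and end of each query so that gaps show up in the log. PollLogger and LogCall are hypothetical names, and the delegate stands in for whatever remoting call the loop makes.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class PollLogger
{
    // Wraps one call with start/end log lines (UTC, round-trip format)
    // and the elapsed time, and returns the call's result unchanged.
    public static T LogCall<T>(TextWriter log, string name, Func<T> call)
    {
        var stopwatch = Stopwatch.StartNew();
        log.WriteLine("{0:O} {1} start", DateTime.UtcNow, name);
        try
        {
            return call();
        }
        finally
        {
            // Written even when the call throws, so hangs and failures are visible.
            log.WriteLine("{0:O} {1} end after {2} ms", DateTime.UtcNow, name, stopwatch.ElapsedMilliseconds);
        }
    }
}
```

If the "start" lines themselves stop appearing every 100 seconds, the loop or timer is not firing; if "start" appears but "end" is delayed, the time is being spent inside the remoting call.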
