Error when multiple users access my web app at the same time

I'm using .Net 2008, with Oracle 10g as my database. After deploying the application to IIS, I get an error whenever multiple users access the same page at the same time, and the page produces no output.
Note: both users are accessing the same page and the same menu at the same time.
How can I resolve this?

My guess would be a standard thread-safety / synchronization bug, most likely due to some static resource such as a static connection. Obviously this is pure speculation without some more code, but it (=web-sites being highly threaded) is a surprisingly common oversight.
If it is a static resource, then... well, it probably shouldn't be static. Either per-request, or (specifically in the case of connections) scoped to the local code (and let the connection-pooling worry about re-use).
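A minimal sketch of the "scoped to the local code" idea, assuming nothing about the asker's actual code. FakeConnection and PageHandler are hypothetical stand-ins (a real application would open an Oracle connection instead); the fake simply throws when two threads use the same instance at once, which is the failure mode a shared static connection invites.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a database connection. Like a real connection,
// a single instance must not be used by two threads at the same time.
public sealed class FakeConnection : IDisposable
{
    private int _inUse;

    public int Query(int value)
    {
        // Detect concurrent use of the same instance.
        if (Interlocked.Exchange(ref _inUse, 1) == 1)
            throw new InvalidOperationException("Connection is already in use by another thread.");
        try
        {
            Thread.Sleep(1); // simulate a database round trip
            return value * 2;
        }
        finally
        {
            Interlocked.Exchange(ref _inUse, 0);
        }
    }

    public void Dispose() { }
}

public static class PageHandler
{
    // Anti-pattern (shown only as a comment): one instance shared by every request thread.
    // private static readonly FakeConnection Shared = new FakeConnection();

    // Each request creates, uses and disposes its own connection.
    public static int HandleRequest(int input)
    {
        using (var conn = new FakeConnection())
        {
            return conn.Query(input);
        }
    }
}
```

With a real provider, opening and disposing a connection per request is normally cheap because the provider's connection pool reuses the underlying physical connections.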

Does it "Work on your machine"? ;)
If not, try deploying a version locally and attaching the debugger to IIS. Point two browsers at the site. Whenever the browsers are (or seem) stuck, open the debugger's Threads window and see where the threads are blocked or blocking. You can also ask the debugger to break when an exception is thrown.

You mean that nothing appears in the Browser?
Look in your program's logs. Any error messages?
Put some trace statements into your code so that you can figure out where it's going.
So the error is saying there's a failure to create a table. Would you expect to create a table for each user? Have a look at the code around the table creation. Consider what the correct behaviour should be when two copies of that code run at the same time.
Again, add trace statements to the code at these points so you can see what is happening. This is often easier than debugging, because when multiple threads are running the debugger gets in the way of reality.
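One way to make such trace statements useful when several requests overlap is to stamp each line with the time and the managed thread id. This is only a sketch; RequestTrace and the message format are hypothetical, and the output goes to whatever trace listeners the application has configured.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class RequestTrace
{
    // Builds a line such as "14:03:22.118 [thread 7] before table creation".
    public static string Format(string step)
    {
        return string.Format(
            "{0:HH:mm:ss.fff} [thread {1}] {2}",
            DateTime.Now,
            Thread.CurrentThread.ManagedThreadId,
            step);
    }

    // Writes the formatted line to the configured trace listeners.
    public static void Write(string step)
    {
        Trace.WriteLine(Format(step));
    }
}
```

Calling RequestTrace.Write("before table creation") and RequestTrace.Write("after table creation") around the suspect code shows whether two threads enter it at the same time.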

Related

Any way to run C# code in a remote process (e.g. IIS)?

Some of my websites are not working well in the production environment: sometimes they crash and sometimes they throw exceptions.
Since these websites are really old, some of them don't have enough logging to troubleshoot.
I know I can add logging code place by place, but that is a huge workload and may affect the existing logic.
So what I am trying to do is this:
without modifying any existing product code,
can I attach to or inject code into the website process at runtime, so that I can execute code to log exceptions, HTTP requests and HTTP responses?
My environment is ASP.NET 4.0+ and IIS 7+.
If I can execute my code in the production process, it will be much easier to log without modifying the product code. The code would look like this:
AppDomain.CurrentDomain.FirstChanceException += (sender, e) =>
{
///write log...
};
Unhandled exceptions are just one case. There are several other things I want to apply to all production sites, such as:
log all requests and responses (with the ability to enable and disable logging at any time from a remote management system)
log unhandled exceptions
log swallowed exceptions
get a dump
monitor the application and trigger some action once a condition is reached (e.g. exception count, timeout, memory usage)
print the full stack on an exception or unexpected return value (with the parameters and local values that the function references)
There are several technical points I can't find a solution for at the moment...

If it's a web application and you're dealing with unhandled exceptions, you can add logging into global.asax; it may or may not help but seems like it will give you more insight than what you have so far.
Now, technically speaking, one can argue that modifying global.asax is modifying production code, but I took your question to mean that you don't want to go and modify hundreds of places in production code, etc.

The easiest way is to add an error logging module to IIS.
See, for example, SharpBrake, which has an IIS Module called NotifierHttpModule that automatically catches all exceptions thrown by the site without any modification of the site's code. (The only thing you need to do is include the module initialization in your web.config file.) You could write something yourself based on SharpBrake, or ....
In fact, for what you are doing, you might consider using a service like AirBrake.io or its free open-source clone ErrBit to do the logging, and then you can use SharpBrake to do the error catching.

How can I terminate an ASP.NET application on IIS following an unrecoverable error?

Suppose I have an unhandled exception (or a known serious, unrecoverable error). The scariest situation is a security breach, but it could apply to anything that means my state is so badly hosed I can't expect to continue safely.
What do I do?
In a traditional application, the usual technique is to end my process as soon as possible. I'm calling Process.Exit, TerminateProcess, die, or whatever other tool the environment has that means "END. NOW". Eric Lippert's post expresses the reasoning for this attitude well.
In a production ASP.NET application running on IIS, it's not so simple. I can certainly end the current process and cough an error to the event log or wherever. That's essentially what happens with any unhandled exception. But the next time a request comes in, IIS is just going to spin up a new worker process. If my fatal error was a transient problem, that's great.
But if my problem persists past the lifetime of my process, the new one won't be any better. It could even be compounded by the initialization code or a reattempt. Plus, if IIS is running multiple worker processes within the same application pool, even killing my process doesn't kill the application. Logically speaking, all those other workers may be hosed too and just not know it yet.
So far I've only come up with two options.
End the process and hope for the best. Knowing that the app will just be restarted, this is pretty much the same as "catch(Exception) {}". Hardly satisfying.
"Reaching out" to tell IIS to disable the app, stop IIS, the machine, etc. This seems like a brutal hack. Moreover I'd guess it's likely to require elevated security credentials. During termination of a possibly-compromised process seems like a poor time to have those.

Here is what I can think of:
You can use the application pool's advanced setting in IIS named "Rapid-Fail Protection": set the Failure Interval as long as you like and Maximum Failures to 1, then throw the exception so that IIS concludes the application pool can't work correctly. IIS will then send Service Unavailable back to the client, or even reset the connection (depending on your settings). For more detail, see: Failure Settings for an Application Pool. However, you need to be very careful not to overkill. Your application must handle all other exceptions properly, so that only the one you intend to terminate the application is ever seen by IIS; otherwise a single user click might bring down your site.
Another solution is to do it in your own code: record the error in some way, for example by creating a file named SystemCrashed, and then terminate the application. In Application_Start, check whether the file exists and, if it does, do nothing but terminate the application again. It works like a lock. This needs more code but may be safer than the IIS settings: there is little risk of overkill as long as you get the removal of the lock right.
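The marker-file idea could be sketched roughly as follows. CrashMarker and its method names are hypothetical, and where the file lives (and who is allowed to delete it) is an assumption left to the application.

```csharp
using System;
using System.IO;

// Hypothetical helper for the "SystemCrashed" marker file described above.
public static class CrashMarker
{
    // Called just before terminating after an unrecoverable error.
    public static void Record(string markerPath, string reason)
    {
        File.WriteAllText(markerPath, DateTime.UtcNow.ToString("o") + " " + reason);
    }

    // Called at startup: true means a previous instance declared the app unsafe.
    public static bool IsSet(string markerPath)
    {
        return File.Exists(markerPath);
    }

    // Called by an operator (or an admin tool) once the problem has been fixed.
    public static void Clear(string markerPath)
    {
        File.Delete(markerPath);
    }
}
```

In Application_Start, the application would call CrashMarker.IsSet and refuse to initialize (for example by throwing) while the marker exists, so that restarting the worker process alone does not bring the site back.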

I've found a bug in the JIT/CLR - now how do I debug or reproduce it?

I have a computationally-expensive multi-threaded C# app that seems to crash consistently after 30-90 minutes of running. The error it gives is
The runtime has encountered a fatal error. The address of the error was at 0xec37ebae, on thread 0xbcc. The error code is 0xc0000005. This error may be a bug in the CLR or in the unsafe or non-verifiable portions of user code. Common sources of this bug include user marshaling errors for COM-interop or PInvoke, which may corrupt the stack.
(0xc0000005 is the error-code for Access Violation)
My app does not invoke any native code, or use any unsafe blocks, or even any non-CLS compliant types like uint. In fact, the line of code that the debugger says caused the crash is
overallLength += distanceTravelled;
Where both values are of type double
Given all this, I believe the crash must be due to a bug in the compiler or CLR or JIT. I'd like to figure out what causes it, or at the very least write a smaller reproduction to send into Microsoft, but I have no idea where to even begin. I've never had to view the CIL-binary, or the compiled JIT output, or the native stacktrace (there is no managed stacktrace at the time of the crash), so I'm not sure how. I can't even figure out how to view the state of all the variables at the time of the crash (VS unfortunately won't tell me like it does after managed-exceptions, and outputting them to console/a file would slow down the app 1000-fold, which is obviously not an option).
So, how do I go about debugging this?
[Edit] Compiled under VS 2010 SP1, running latest version of .Net 4.0 Client Profile. Apparently it's ".Net 4.0C/.Net 4.0E, .Net CLR 1.1.4322"

I'd like to figure out what causes it, or at the very least write a smaller reproduction to send into Microsoft, but I have no idea where to even begin.
"Smaller reproduction" definitely sounds like a great idea here... even if "smaller" won't mean "quicker to reproduce".
Before you even start, try to reproduce the error on another machine. If you can't reproduce it on another machine, that suggests a whole different set of tests to do - hardware, installation etc.
Also, check you're on the latest version of everything. It would be annoying to spend days debugging this (which is likely, I'm afraid) and then end up with a response of "Yes, we know about this - it was a bug in .NET 4 which was fixed in .NET 4.5" for example. If you can reproduce it on a variety of framework versions, that would be even better :)
Next, cut out everything you can in the program:
Does it have a user interface at all? If possible, remove that.
Does it use a database? See if you can remove all database access: definitely any output which isn't used later, and ideally input too. If you can hard code the input within the app, that would be ideal - but if not, files are simpler for reproductions than database access.
Is it data-sensitive? Again, without knowing much about the app it's hard to know whether this is useful, but assuming it's processing a lot of data, can you use a binary search to find a relatively small amount of data which causes the problem?
Does it have to be multi-threaded? If you can remove all the threading, obviously that may well then take much longer to reproduce the problem - but does it still happen at all?
Try removing bits of business logic: if your app is componentized appropriately, you can probably fake out whole significant components by first creating a stub implementation, and then simply removing the calls.
All of this will gradually reduce the size of the app until it's more manageable. At each step, you'll need to run the app again until it either crashes or you're convinced it won't crash. If you have a lot of machines available to you, that should help...
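The binary-search step for data-sensitive crashes might look like the sketch below. InputShrinker is a hypothetical helper, and the crashes delegate is an assumption: since an access violation kills the process, in practice it would probably launch the app as a child process on the given subset and report whether it died.

```csharp
using System;
using System.Collections.Generic;

public static class InputShrinker
{
    // Repeatedly halves the input, keeping whichever half still reproduces the crash.
    // Stops when one item is left or when neither half fails on its own.
    public static List<T> Shrink<T>(List<T> input, Func<List<T>, bool> crashes)
    {
        List<T> current = input;
        while (current.Count > 1)
        {
            int mid = current.Count / 2;
            List<T> first = current.GetRange(0, mid);
            List<T> second = current.GetRange(mid, current.Count - mid);

            if (crashes(first))
                current = first;
            else if (crashes(second))
                current = second;
            else
                break; // the failure needs items from both halves
        }
        return current;
    }
}
```

Because each probe may take 30-90 minutes to crash (or longer to be convincingly clean), the number of halvings matters more than the cost of the search itself.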

tl;dr Make sure you're compiling to .Net 4.5
This sounds suspiciously like the same error found here. From the MSDN page:
This bug can be encountered when the Garbage Collector is freeing and compacting memory. The error can happen when the Concurrent Garbage Collection is enabled and a certain combination of foreground Garbage Collection and background Garbage Collection occurs. When this situation happens you will see the same call stack over and over. On the heap you will see one free object and before it ends you will see another free object corrupting the heap.
The fix is to compile to .Net 4.5. If for some reason you can't do this, you can also disable concurrent garbage collection by disabling gcConcurrent in the app.config file:
<configuration>
<runtime>
<gcConcurrent enabled="false"/>
</runtime>
</configuration>
Or just compile to x86.
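To check which of these settings a running process actually picked up, something like the following could be logged at startup. RuntimeInfo is a hypothetical name; the properties are standard. My understanding is that on the desktop framework LatencyMode reports Batch when concurrent GC is disabled and Interactive when it is enabled, but treat that mapping as an assumption to verify on your own runtime.

```csharp
using System;
using System.Runtime;

public static class RuntimeInfo
{
    // Summarizes the runtime version, process bitness and GC configuration.
    public static string Describe()
    {
        return string.Format(
            "CLR={0}; Is64BitProcess={1}; IsServerGC={2}; LatencyMode={3}",
            Environment.Version,
            Environment.Is64BitProcess,
            GCSettings.IsServerGC,
            GCSettings.LatencyMode);
    }
}
```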

WinDbg is your friend:
http://blogs.msdn.com/b/tess/archive/2006/02/09/net-crash-managed-heap-corruption-calling-unmanaged-code.aspx
http://www.codeproject.com/Articles/23589/Get-Started-Debugging-Memory-Related-Issues-in-Net
http://www.codeproject.com/Articles/22245/Quick-start-to-using-WinDbg

Download Debug Diagnostic Tool v1.2
Run program
Add Rule "Crash"
Select "Specific Process"
On the Advanced Configuration page, set your exception if you know which exception it fails on, or just leave this page as is
Set userdump location
Now wait for the process to crash; DebugDiag creates a log file. Then activate the Advanced Analysis tab, select Crash/Hang Analyzers in the top list and the dump file in the lower list, and hit Start Analysis. This will generate an HTML report for you. Hopefully you will find useful info in that report. If you have trouble analyzing it, upload the HTML report somewhere and post the URL here so we can look at it.

My app does not invoke any native code, or use any unsafe blocks, or even any non-CLS compliant types like uint
You may think this, but threading and synchronization via semaphores, mutexes and any other handles are all native. .NET is a layer over the operating system; .NET itself does not implement multithreading in pure CLR code, because the OS already does it.
Most likely this is a thread synchronization error. Probably multiple threads are trying to access a shared resource, such as a file, that is outside the CLR boundary.
You may think you aren't accessing COM, but when you call certain APIs, such as getting the desktop folder path, they go through the shell COM API.
You have the following two options:
Publish your code so that we can review the bottleneck
Redesign your app using the .NET parallel threading framework, which includes a variety of algorithms for CPU-intensive operations.
Most likely the program fails after a certain period of time because collections grow and operations fail to finish before another thread interferes. For example, with the producer-consumer problem, you will not notice any problem until the producer becomes slower or fails to finish its operation before the consumer kicks in.
A bug in the CLR is rare, because the CLR is very stable. But poorly written code may cause an error that looks like a bug in the CLR. The CLR cannot and will never detect whether the bug is in your code or in the CLR itself.
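As an illustration of the second option, an accumulation like overallLength += distanceTravelled can be restructured so that no two threads ever write the same variable without synchronization. This is a sketch only: LengthCalculator and the distances array are hypothetical, since the asker's real code was not posted. It uses the Parallel.For overload with per-thread local state from the Task Parallel Library.

```csharp
using System;
using System.Threading.Tasks;

public static class LengthCalculator
{
    // Sums the distances in parallel. Each worker accumulates into its own
    // local subtotal; the shared total is only touched under a lock.
    public static double TotalLength(double[] distances)
    {
        double total = 0.0;
        object gate = new object();

        Parallel.For(
            0, distances.Length,
            () => 0.0,                                    // per-thread subtotal
            (i, loopState, local) => local + distances[i],
            local => { lock (gate) { total += local; } }  // merge once per worker
        );

        return total;
    }
}
```

An unsynchronized += on a shared double would normally lose updates rather than cause an access violation, so this addresses the synchronization concern raised above without claiming to explain the crash.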

Did you run a memory test on your machine? The one time I had comparable symptoms, one of my DIMMs turned out to be faulty (a very good memory tester is included in Win7; http://www.tomstricks.com/how-to-test-your-ram-or-memory-with-windows-memory-diagnostic-tool-in-windows-7/)
It might also be a heating/throttling issue if your CPU gets too hot after this period of time, although that would probably happen sooner, IMHO.
There should be a dump file that you can analyze. If you have never done this, find someone who has, or send it to Microsoft.

I suggest you open a support case via http://support.microsoft.com immediately, as the support guys can show you how to collect the necessary information.
Generally speaking, like #paulsm4 and #psulek said, you can use WinDbg or Debug Diag to capture crash dumps of the process, which contain all the necessary information. However, if this is the very first time you have used those tools, you might be puzzled. The Microsoft support team can give you step-by-step guidance on them, or they can even set up a Live Meeting session with you to capture the data, as the program crashes so often.
Once you are familiar with the tools, you will be able to perform similar troubleshooting more easily in the future:
http://blogs.msdn.com/b/lexli/archive/2009/08/23/when-the-application-program-crashes-on-windows.aspx
BTW, it is too early to say "I've found a bug". Though you cannot obviously find in your program a dependency on native code, it might still have a dependency on native code. We should not draw a conclusion before debugging further into the issue.

Windows service / A new guard page for the stack cannot be created

I have a Windows service that does some intensive work every minute (actually it starts a new thread each time, in which it syncs to different systems over HTTP). The problem is that after a few days it suddenly stops without any error message.
I have NLog in place and I have registered for AppDomain.CurrentDomain.UnhandledException. The last entry in the textfile-log is just a normal entry without any problems. Looking in the EventLog, I also can't find any message in the application log, however, there are two entries in the system log.
One basically says that the service has been terminated unexpectedly. Nothing more. The second event (at the same time as the first one) says: "...A new guard page for the stack cannot be created..."
From what I've read, this is probably a stack overflow exception. I'm not parsing any XML and I don't do recursive work. I host a webserver using Gate, Nancy and SignalR and have RavenDB running in embedded mode. Every minute a new task is started using the TaskFactory from .NET 4.0, and I also have a ContinueWith where I restart a System.Timers.Timer to fire again in one minute.
How can I start investigating this issue? What could be possible reasons for such an error?

Based on the information you provided, I would, at a minimum, do the following:
Pay extra attention to any third party calls, and add additional info logging around those points.
There are some circumstances in which AppDomain.CurrentDomain.UnhandledException won't help you - a StackOverflowException being one of them. I believe the CLR will simply just give you a string in this case instead of a stack trace.
Pay extra attention around areas where more than one thread is introduced.
An example of an often overlooked StackOverflowException is:
private string myString;
public string MyString { get { return MyString; } } //should be myString

I got this on a particular computer and traced it to a C# object referencing itself from within an initializer.

Just as a 'for what it is worth' - in my case this error was reported when the code was attempting to write to the Windows Event Log and the interactive user did not have sufficient permission. This was a small console app that logged exceptions to a text file and the event log (if desired). On exception, the text file was being updated but then this error was thrown and not caught by the error handling. Disabling the Event Logging stopped the error occurring.

Just in case anyone else is having the same problem: in my case I found that my Windows service was accidentally trapped in an endless recursive loop. So if you have this problem, take into consideration any method calls that may be causing runaway recursion.
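Since a StackOverflowException cannot be caught or logged, one way to hunt for accidental recursion is to wrap suspect methods in a depth guard that fails early with an ordinary exception and a usable stack trace. The sketch below is an illustration only; RecursionGuard and the depth limit are hypothetical.

```csharp
using System;

public static class RecursionGuard
{
    // Depth is tracked per thread, so parallel workers do not affect each other.
    [ThreadStatic]
    private static int _depth;

    // Runs body, throwing a catchable exception if calls nest deeper than maxDepth.
    public static T Run<T>(int maxDepth, Func<T> body)
    {
        _depth++;
        try
        {
            if (_depth > maxDepth)
                throw new InvalidOperationException(
                    "Recursion depth exceeded " + maxDepth + "; probable runaway recursion.");
            return body();
        }
        finally
        {
            _depth--; // always unwind, including on the guard's own exception
        }
    }
}
```

The resulting InvalidOperationException passes through normal handlers, so it reaches NLog or an UnhandledException handler with the stack trace of the looping call chain.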

I think the reason you might all be stumped is that this MAY BE an SSD hardware fault. I get this error consistently while playing games, about every 3-5 hours, and it's my computer's page file failing somehow. I know it isn't RAM because I replaced my CPU/RAM/motherboard combo trying to battle this. And it's not programming, because different games and different apps all fail at the same time, unless it's Windows corruption?
I could be wrong, but it's just an idea.
I have two Samsung EVOs in RAID.

What could cause a Windows Service to hang when a Console App doing the exact same thing using the exact same base libraries doesn't?

I hate asking questions like this - they're so undefined... and undefinable, but here goes.
Background:
I've got a DLL that is the guts of an application that is a timed process. My timer receives a configuration for the interval at which it runs and a delegate that should be run when the interval elapses. I've got another DLL that contains the process that I inject.
I created two applications, one Windows Service and one Console Application. Each of the applications read their own configuration file and load the same libraries pushing the configured timer interval and delegate into my timed process class.
Problem:
Yesterday and for the last n weeks, everything was working fine in our production environment using the Windows Service. Today, the Windows Service runs for around 20-30 minutes and then hangs (with a timer interval of 30 seconds), but the console application runs without issue and has for the past 4 hours. Detailed logging doesn't indicate any failure. It's as if the Windows Service just...dies quietly - without stopping.
Given that my Windows Service and Console Applications are doing the exact same thing, I can only think that there is something that is causing the Windows Service process to hang - but I have no idea what could be causing that. I've checked the configuration files, and they're both identical - I even copied and pasted the contents of one into the other just to be sure. No dice.
Can anyone make suggestions as to what might cause a Windows Service to hang, when a counterpart Console Application using the same base libraries doesn't; or can anyone point me in the direction of tools that would allow me to diagnose what could be causing this issue?
Thanks for everyone's help - still digging.

You need to figure out what changed on the production server. At first, the IT guys responsible will swear that nothing changed, but you have to be persistent. I've seen this happen so often I've lost count. Software doesn't spoil. Period. The change must have been to the environment.
Difference in execution: You have two apps running the same code. The most likely difference (and culprit) is that the service is running with a different set of security credentials than your console app and might fall victim to security vagaries. Check on that first. Which Windows account is running the service? What is its role and scope? Is there any 3rd party security software running on the server and perhaps killing errant apps? Do you have to register your service with a 3rd party security service? Is your .Net assembly properly signed? Are your .Net assemblies properly registered and configured on the server? Last but not least, don't forget that a debugger user, which you most likely are, gets away with a lot more stuff than many other account types.
Another thought: Since timing seems to be part of the issues, check the scheduled tasks on the machine. Perhaps there's a process that is set to go off every 30 minutes that is interfering with your own.

You can debug a Windows service by running it interactively within Visual Studio. This may help you to isolate the problem by setting (perhaps conditional) breakpoints.
Alternatively, you can use the Visual Studio "Attach to process" dialog window to find the service process and attach to it with the "Debug CLR" option enabled. Again this allows you to set breakpoints as needed.
Are you using any assertions? If an assertion fires without being re-directed to write to a log file, your service will hang. If the code throws an unhandled exception, perhaps because of a memory leak, then your service process will crash. If you set the Service Control Manager (SCM) to restart your process in the event of a crash, you should be able to see that the service has been restarted. As you have identical code running in both environments, these two situations don't seem likely. But remember that your service is being hosted by the SCM, which means a very different environment to the one in which your console app is running.
I often use a "heartbeat", where each active thread in the service sends a regular (say every 30 seconds) message to a local MSMQ. This enables manual or automated monitoring, and should give you some clues when these heartbeat messages stop arriving.
Another possibility is some sort of permissions problem, because the service is probably running under a different local/domain user from the console app.
After the hang, can you use the SCM to stop the service? If you can't, then there is probably some sort of thread deadlock problem. After the service appears to hang, you can go to a command-line and type sc queryex servicename. This should give you the current STATE of the service.
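The heartbeat idea mentioned above can be sketched without MSMQ. The version below swaps the queue for a simple in-memory table, which is a different mechanism from the one the answer describes and only works for monitoring from inside the same process; HeartbeatMonitor and its members are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public sealed class HeartbeatMonitor
{
    // Last time each named worker reported in.
    private readonly ConcurrentDictionary<string, DateTime> _lastBeat =
        new ConcurrentDictionary<string, DateTime>();

    // Called by each worker thread at a regular interval (say every 30 seconds).
    public void Beat(string workerName, DateTime utcNow)
    {
        _lastBeat[workerName] = utcNow;
    }

    // Returns the workers that have been silent for longer than maxSilence.
    public List<string> FindStale(DateTime utcNow, TimeSpan maxSilence)
    {
        var stale = new List<string>();
        foreach (KeyValuePair<string, DateTime> entry in _lastBeat)
        {
            if (utcNow - entry.Value > maxSilence)
                stale.Add(entry.Key);
        }
        return stale;
    }
}
```

A watchdog thread could call FindStale periodically and log the names it returns, which narrows a hang down to the worker that stopped reporting and roughly when.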

I would probably put in some file logging just to see how far the program is getting. It may give you a better idea of what is looping/hanging/deadlocked/crashing.

You can try these techniques
Logging: start logging the flow of the code in the service. Make this parameter-based so you don't have a deluge after you are done. You should log all function names, parameters and timestamps.
Attach Debugger: locally or remotely attach a debugger with the code to the running service, and set appropriate breakpoints (these can be based on the data gathered from logging)
PerfMon: run this utility and gather information about the machine that the service is running on for any additional clues (high CPU spikes, IO spikes, excessive paging, etc.)

Microsoft provides a good resource on debugging a Windows Service. That essentially sounds like what you'd have to do, given that your question is so generic. With that said, have any changes been made to the system over the last few days that could adversely affect the service? Have you made any updates to the code that change the way the service might possibly work?
Again, I think you're going to have to do some serious debugging to find your problem.

What type of timer are you using in the Windows service? I've seen numerous people on SO have problems with timers and Windows services. Here is a good tutorial just to make sure you are setting it up correctly and using the right type of timer. Hope that helps.
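One timer pitfall worth ruling out: on the .NET Framework, System.Timers.Timer suppresses exceptions thrown by Elapsed handlers, so a failing job can go quiet without crashing the service. The sketch below is only an illustration of a defensive setup; SafeTimedJob is a hypothetical wrapper, and it uses a one-shot timer that is re-armed after each run so that slow runs never overlap.

```csharp
using System;
using System.Timers;

public sealed class SafeTimedJob : IDisposable
{
    private readonly System.Timers.Timer _timer;
    private readonly Action _work;
    private readonly Action<Exception> _onError;
    private volatile bool _disposed;

    public SafeTimedJob(double intervalMs, Action work, Action<Exception> onError)
    {
        _work = work;
        _onError = onError;
        // AutoReset = false: the timer fires once and must be restarted explicitly.
        _timer = new System.Timers.Timer(intervalMs) { AutoReset = false };
        _timer.Elapsed += OnElapsed;
    }

    public void Start()
    {
        _timer.Start();
    }

    private void OnElapsed(object sender, ElapsedEventArgs e)
    {
        try
        {
            _work();
        }
        catch (Exception ex)
        {
            _onError(ex); // report instead of letting the timer swallow it
        }
        finally
        {
            if (!_disposed)
            {
                try { _timer.Start(); }               // re-arm for the next run
                catch (ObjectDisposedException) { }   // disposed while the job was running
            }
        }
    }

    public void Dispose()
    {
        _disposed = true;
        _timer.Dispose();
    }
}
```

The onError callback is where the service would write to its log, which gives the "last thing that happened" even when the job itself never logs the failure.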

Another potential problem, in reference to psasik's answer, is if your application relies on something that is only available when it is run in user mode.
A service runs in a separate, non-interactive session (Session 0 on Windows Vista and later), which in my experience can cause issues if you are trying to determine the state of something that can only be seen from the interactive user's session.

Smells like a threading issue to me. Is there any threading or async work being done at all? One crucial question is "does the service hang on the same line of code or same method every time?" Use your logging to find out the last thing that happens before a hang, and if so, post the problem code.
One other tool you may consider is a good profiler. If it is .NET code, I believe RedGate ANTS can monitor it and give you a good picture of any threadlock scenarios.
