Application crashes or hangs due to OpenTimeout setting - C#

In my application, I've set the OpenTimeout to 1 minute. Now if the service is stopped or the server is not running, it causes a problem.
I need to load the forms based on the output from the service, so I call the service while loading the content of the form. If the server is stopped, the UI hangs until the service's OpenTimeout elapses. The main issue is that the application uses multiple services; one of the other services has a timeout of 35 seconds, and when it times out it restarts the application while my service is still waiting on its OpenTimeout, which crashes the application.
My question is: what is the best way to handle this condition - reduce the OpenTimeout, or call the service on a different thread?

Definitely call the service on a different thread (or, if you use an auto-generated service proxy, you can switch to the Async versions of the methods, which amounts to the same thing). The UI thread should not depend on long-running operations or operations that may block.
The value of OpenTimeout will no longer be significant as far as hanging the app is concerned (because that will stop happening), but you may want to lower it a bit, because 1 minute is perhaps too long to wait just to discover that there is no connectivity to the service.
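As a rough illustration, here is a minimal sketch of moving the call off the UI thread inside a WinForms Form (using System.Threading and System.ServiceModel); the proxy type MyServiceClient and the GetFormData, BindForm, and ShowConnectivityError members are all placeholders for your own service and handlers:

// Minimal sketch: run the blocking service call on a worker thread and
// marshal the result back to the UI thread. All names below are placeholders.
private void LoadFormContent()
{
    ThreadPool.QueueUserWorkItem(_ =>
    {
        var client = new MyServiceClient();
        try
        {
            var data = client.GetFormData();   // blocks here, not on the UI thread
            client.Close();
            BeginInvoke((Action)(() => BindForm(data)));
        }
        catch (TimeoutException)               // covers the OpenTimeout expiring
        {
            client.Abort();
            BeginInvoke((Action)(() => ShowConnectivityError()));
        }
        catch (CommunicationException)
        {
            client.Abort();
            BeginInvoke((Action)(() => ShowConnectivityError()));
        }
    });
}

The form can show a "loading" state immediately and fill itself in when the callback arrives, so a dead server costs you a spinner rather than a frozen UI.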

Related

Background thread in WCF services

I have developed a WCF service for serving our customers and hosted it on IIS. We have a requirement to log all the requests received and responses sent from WCF into a database.
But, because of this logging, we don't want to interrupt the main flow of requests and responses. So, we are using threads (Threading.Thread with Thread.IsBackground = true) to call procedures that insert/log the requests and responses to the database.
I just want to know if there will be problems in implementing/invoking threads on a WCF service. If so, what would be a good solution for this?
Yes, there can be a problem. The application pool in IIS can get recycled, which means the background thread will be killed, even if it's in the middle of some processing.
In practice that will mostly be a problem when you update your application (when the app pool is stopped due to the idle timeout, the logger should already have finished its work).
So if you can live with lost log entries during updates, you do not have a problem.
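For reference, a minimal sketch of the fire-and-forget pattern the question describes (LogEntry and LogToDatabase are placeholders); note that this is exactly the kind of thread an app pool recycle can kill mid-write:

// Fire-and-forget logging on a background thread. A background thread will
// not keep the process alive, so a recycle can terminate it mid-write.
public void LogRequest(LogEntry entry)
{
    var thread = new Thread(() => LogToDatabase(entry))
    {
        IsBackground = true
    };
    thread.Start();
}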

Separate threads in a web service after it's completed

If this has been asked before, my apologies - and this is .NET 2.0 ASMX Web services, again my apologies =D
A .NET application that only exposes web services. Roughly 10 million messages per day, load balanced between multiple IIS servers. Each incoming message is XML, and each outgoing message is XML (XmlElement). (We have beefy servers that run on steroids.)
I have an SLA that all messages are processed in under X seconds.
One function in the process, Linking Methods, is now taking 10-20 seconds. It is required for every transaction, but it is not critical that it happens before the web service returns its results. Because of this I suggested throwing it on another thread, but now I realize that my words, and the eager developers behind them, might not have fully thought this through.
Currently the linking step runs inline before the web service returns its response; what is being attempted is to move it out of that path. Effectively, I'm looking to have the web service spawn a long-running (10-20 second) thread that keeps executing even after the web service call has completed.
This is, effectively, what is going on:
// Spawn the long-running linking step on its own thread and return immediately.
Thread linkThread = new Thread(delegate()
{
    Linkmembers(GetContext(), ID1, ID2, SomeOtherThing, XMLOrSomething);
});
linkThread.Start();
Using this we've reduced the time from 19 seconds to 2.1 seconds on our dev boxes, which is quite substantial.
I am worried that, with the amount of traffic we get, and if a vendor/outside party decides to throttle us, IIS might decide to recycle/kill those threads before they're done processing. I agree our solution might not be the "best"; however, we don't have the time to build a queue system or another Windows Service to handle this.
Is there a better way to do this? Any caveats that should be considered?
Thanks.
Apart from the issues you've described, I cannot think of any. That being said, there are ways to fix the problem that do not involve building your own solution from scratch.
Use MSMQ with WCF: create a WCF service with an MSMQ endpoint that is IIS-hosted (no need to use a Windows Service as long as WAS is enabled) and make calls to the service from within your ASMX service. You reap all the benefits of reliable queueing without having to build your own.
Plus, if your MSMQ service fails or throws an exception, the message will be reprocessed automatically. If you use DTC and are hitting a database, you can even have the MSMQ transaction flow to the database.
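A minimal sketch of what such a queued contract might look like (the contract name, operation, and parameters are all placeholders; the binding would be netMsmqBinding configured in the usual way):

// Queued WCF contract: MSMQ endpoints require one-way operations, since
// there is no caller waiting for a reply when the message is dequeued.
[ServiceContract]
public interface ILinkService
{
    [OperationContract(IsOneWay = true)]
    void LinkMembers(string id1, string id2, string payload);
}

The ASMX service would then just create a client channel for ILinkService and send the message, returning to its own caller immediately.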

Logging Via WCF Without Slowing Things Down

We have a large process in our application that runs once a month. This process typically runs in about 30 minutes and generates 342,000 or so log events. Recently we updated our logging to a centralized model using WCF and are now having difficulty with performance. Whereas the previous solution would complete in about 30 minutes, with the new logging it now takes 3 or 4 hours.
The problem, it seems, is that the application is actually waiting for the WCF request to complete before execution continues. The WCF method is already configured as IsOneWay, and I wrapped the client-side call to that WCF method in a different thread to try to prevent this type of problem, but it doesn't seem to have worked. I have thought about using the async WCF calls, but before I tried something else I thought I would ask here to see if there is a better way to handle this.
342,000 log events in 30 minutes, if I did my math correctly, comes out to 190 log events per second. I think your problem may have to do with the default throttling settings in WCF.
Even if your method is set to one-way, depending on whether you're creating a new proxy for each logged event, calling the method will still block while the proxy is created and the channel is opened, and if you're using an HTTP-based binding, it will block until the message has been received by the service (an HTTP-based binding sends back a null response for a one-way method call when the message is received).
The default WCF throttling limits concurrent instances to 10 on the service side, which means only 10 requests will be handled at a time and any further requests will get queued. Pair that with an HTTP binding, and anything after the first 10 requests is going to block at the client until it becomes one of the 10 requests being handled.
Without knowing how your services are configured (instance mode, etc.) it's hard to say more than that, but if you're using per-call instancing, I'd recommend setting MaxConcurrentCalls and MaxConcurrentInstances on your ServiceBehavior to something much higher (the defaults are 16 and 10, respectively).
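For illustration, a minimal sketch of raising those throttles programmatically on a self-hosted service (the numbers and the LoggingService type are placeholders, not recommendations; the same values can also be set via the serviceThrottling behavior in config):

// Raise the default service throttle before opening the host.
var host = new ServiceHost(typeof(LoggingService));
var throttle = host.Description.Behaviors.Find<ServiceThrottlingBehavior>();
if (throttle == null)
{
    throttle = new ServiceThrottlingBehavior();
    host.Description.Behaviors.Add(throttle);
}
throttle.MaxConcurrentCalls = 100;      // default is 16
throttle.MaxConcurrentInstances = 100;  // default is 10, per the figures above
host.Open();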
Also, to build on what others have mentioned about aggregating multiple events and submitting them all at once, I've found it helpful to set up a static Logger.LogEvent(eventData) method. That way it's simple to use throughout your code, and you can control in your LogEvent method how you want logging to behave throughout your application, such as configuring how many events should get submitted at a time.
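Something along these lines, as a rough sketch (EventData and the LoggingClient proxy are placeholders, and the batch size is arbitrary):

// Static batching logger: callers never talk to WCF directly, and the
// service is hit once per batch instead of once per event.
public static class Logger
{
    private static readonly List<EventData> _buffer = new List<EventData>();
    private const int BatchSize = 100;   // tune to taste

    public static void LogEvent(EventData eventData)
    {
        List<EventData> batch = null;
        lock (_buffer)
        {
            _buffer.Add(eventData);
            if (_buffer.Count >= BatchSize)
            {
                batch = new List<EventData>(_buffer);
                _buffer.Clear();
            }
        }
        if (batch != null)
        {
            var client = new LoggingClient();
            client.LogEvents(batch.ToArray());   // one call per batch
            client.Close();
        }
    }
}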
Making a call to another process or remote service (i.e. calling a WCF service) is about the most expensive thing you can do in an application. Doing it 342,000 times is just sheer insanity!
If you must log to a centralized service, you need to accumulate batches of log entries and then, only when you have, say, 1000 or so in memory, send them all to the service in one hit. This will give you a reasonable performance improvement.
log4net has a buffering system that exists outside the context of the calling thread, so it won't hold up your call while it logs. Its usage should be clear from the many appender config examples - search for the term bufferSize. It's used on many of the slower appenders (e.g. remoting, email) to keep the source thread moving without waiting on the slower logging medium, and there is also a generic buffering meta-appender that may be used "in front of" any other appender.
We use it with an AdoNetAppender in a system of similar volume and it works wonderfully.
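As a sketch, that buffering meta-appender can also be wired up in code rather than XML (the AdoNetAppender's connection and command configuration is elided here, and the exact settings are assumptions):

// Wrap a slow appender in log4net's buffering meta-appender so the
// calling thread is not held up by each database write.
var dbAppender = new AdoNetAppender();
// ... connection string and command configuration elided ...
dbAppender.ActivateOptions();

var buffered = new BufferingForwardingAppender
{
    BufferSize = 512,   // entries held before flushing to the inner appender
    Lossy = false       // false = never drop entries when the buffer fills
};
buffered.AddAppender(dbAppender);
buffered.ActivateOptions();

BasicConfigurator.Configure(buffered);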
There's always traditional syslog; there are plenty of syslog daemons that run on Windows. It's designed to be a more efficient way of doing centralised logging than WCF, which is intended for less intensive operations, especially if you're not using the TCP/IP WCF configuration.
In other words, have a go with it - the right tool for the job.

Long-running Asynchronous Thread in WCF

There is a WCF service with a long-running asynchronous thread.
This long-running operation can run for more than 1 day.
We are hosting the WCF service on IIS 6.
The thread runs OK, but after 20 minutes we receive the error message:
"Thread has been aborted"
The thread is dead as a result.
Our WCF Service configuration:
// Note: ServiceBehavior allows only one instance per class, so both
// settings must be combined into a single attribute.
[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single,
                 ConcurrencyMode = ConcurrencyMode.Single)]
Can you suggest the source of this problem?
Thank you for your answers.
If there's no activity (no requests) to this web service, IIS might decide to unload the application domain, which of course will result in killing all threads. The default value is 20 minutes and can be configured in the properties of the application pool in IIS. There are also other factors that might cause the app pool to be recycled, such as the system running low on memory. So hosting such a thing in IIS might not be reliable; you might consider hosting long-running tasks in a Windows Service.
IIS 6 has a setting that will shut down the app pool after a predefined time with no requests; the default is 20 minutes. It seems like that is what you are running into. You can find this setting under App Pool properties => Performance tab => remove the checkmark from "Shutdown worker processes after being idle for".
In general, it is considered a bad idea to host long-running tasks under IIS, since there are many things that may abort the thread or shutdown the process altogether. Application Pool recycles being the most prominent one.
You could have a Windows Service host a WCF endpoint that kicks off your long-running task. Windows Services are meant to run for a long, long time and are ideal for this situation.
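A minimal sketch of that arrangement (the LongTaskService implementation and its endpoint configuration are placeholders; endpoints would normally come from app.config):

// A Windows Service that hosts the WCF endpoint; work started by
// LongTaskService operations is not subject to IIS idle shutdown or recycling.
public class LongTaskHost : System.ServiceProcess.ServiceBase
{
    private ServiceHost _host;

    protected override void OnStart(string[] args)
    {
        _host = new ServiceHost(typeof(LongTaskService));
        _host.Open();
    }

    protected override void OnStop()
    {
        if (_host != null)
        {
            _host.Close();
            _host = null;
        }
    }
}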

What happens to other users if the .NET worker process crashes?

My knowledge of how processes are handled by the ASP.NET worker process is woefully inadequate. I'm hoping some of the experts out there can fill me in.
If I crash the worker process with a System.OutOfMemoryException, what would the experience be for other users who were being served by the same process? Would they get a blank screen? A 503 error?
I'm going to attempt to test this scenario with some other folks in our lab, but I thought I would float this out there. I will update with our results.
UPDATE: Our results varied. If we artificially induced an OOM exception (for example, by loading larger and larger PDFs into memory), other threads being served by that worker process would "hang" temporarily and then complete, while others seemingly would never return. Thank you for your responses.
W3WP.exe is the process
IIS runs all web apps in a generic worker process - w3wp.exe. Whether you write in ASP.NET, or ISAPI, or some other framework, the process that serves the web request is w3wp.exe. In the ASP.NET case, w3wp.exe loads the ASP.NET JIT-compiled DLLs and services the requests through them. In other cases, it works differently. But the key point is, w3wp.exe is the process. This model started in IIS6.0 and continues in IIS7.0.
Unexpected Failures
If the W3WP.exe fails unexpectedly, for any reason, all transactions it was handling will likely get 500 errors (Server error). IIS will start a new worker process in its place (MS calls this "Health Monitoring"), which means the web app will continue to run. Users that did not have a request being served by the failing process at the time of failure, will be unaware of any of this.
The HTTP 500 error that a client receives in this case will be indistinguishable from a 500 error that the client receives in the case of an application error, let's say an uncaught exception in your ASPNET application code.
For those requests that were in the failing process, there's no way to recover them. They will result in 500 errors at the browser. A 503 Server Busy results from IIS actively refusing the connection due to a threshold on the number of connections. A 503 does not result from an application failure, so you shouldn't expect to see 503 for in-flight transactions in the out-of-memory-and-crash scenario. On a heavily loaded system, you may see 503's as the process-crash-and-restart happens, as a secondary effect. If this is really what you're seeing, you need a larger margin of safety to handle the load in the single-error condition.
The Request Queue
IIS has a hand-off approach for requests. As they arrive on the network layer (Http.sys), they are placed in a queue, to be picked up by a worker process. Any requests waiting in the IIS queue to be handled by a WP will continue unaffected, though they might see a slight temporary increase in latency (service time) due to resource contention, since one fewer process is running on the server. Wait time in this queue is generally very very short, on a system that is configured properly.
It is when this queue is full that you will see 503 errors.
Auto restart of W3WP.exe
IIS has an auto-restart (or "nanny") facility, through which it restarts worker processes after they have exceeded configured thresholds, such as memory size, number of requests, or time-of-running. In all those cases, IIS will quiesce and restart worker processes when the configured threshold is reached. These pro-active restarts normally do not result in any disruption of requests. When IIS decides that a restart of a worker process is necessary, it prevents any new requests from arriving at that to-be-quiesced WP. Existing requests are drained: any in-flight transactions in that WP are allowed to complete normally. When all requests in the WP complete, then the WP dies and IIS starts a new one in its place. This new process then immediately begins picking up new requests from the dispatch queue. This is all transparent to users or browsers.
I say normally because it's possible that the worker process has become truly sick at the same time as the threshold has been reached. In that case the w3wp.exe may not respond to IIS within the configured "quiesce" timeout, and thus IIS has to eventually kill the process even though it hasn't reported that all of its in-flight requests have completed. This should be exceedingly rare, because it's two distinct exceptional conditions, but it happens. In this case, the in-flight requests will once again, get 500 errors.
Web gardens
Also - IIS allows multiple worker processes on a single server. MS calls this a "web garden", a play on words from "web farm". If you have a web garden set up, then transactions being served by w3wp.exe instances other than the failing one, will continue unaffected. "Unaffected" presumes though, that the out-of-memory error is localized, and not a system-wide problem.
Bottom Line
The bottom line is that there is no substitute for your own testing. The configuration options are pretty broad - from restart thresholds to web gardens and so on. Also the failure modes tend to be pretty complex and varied, whether it's memory, timeout, too busy, and so on. You'll want to understand what to expect.
ps: this Q&A really belongs on serverfault.com !!
references:
http://blogs.iis.net/thomad/archive/2008/05/07/the-iis-process-model-features.aspx
A new worker process will be started and the user would not know anything happened, unless the app pool shuts down completely via Rapid-Fail Protection (http://technet.microsoft.com/en-us/library/cc779127(WS.10).aspx).
If it's an out-of-memory situation, IIS usually just recycles the app pool.
As the other answers say, in most cases everything just restarts, and most users who did not have a pending request at the time will not notice much more than a delay.
However, if your application uses session variables with In-Proc session state, all session variables for all users will be lost when the app pool restarts. This may or may not have a negative effect on the users, depending on what you're doing with the session variables. You can avoid this by switching to StateServer or SQL Server session storage.
