SignalR (Version 1) hub connection to server during application pool recycle - c#

We have several servers running an ASP.NET Web API application. The Web API is accessed from a Windows desktop client application written in C#. The client uses a SignalR HubConnection object to start a reverse channel with the server through which it receives live feedback (log statements, status, etc.). Last year, one of our users reported that the client was periodically encountering the following error:
An existing connection was forcibly closed by the remote host
This error is received when the HubConnection.Error event is raised. I dug into the Windows Event Viewer logs on the server side and discovered that these errors coincided exactly with the occurrence of the following events:
A process serving application pool 'ASP.NET v4.0' exceeded time limits during shut down. The process id was 'xxxx'.
This event occurred 90 seconds after the application pool recycling event:
A worker process with process id of 'xxxx' serving application pool 'ASP.NET v4.0' has requested a recycle because the worker process reached its allowed processing time limit.
So clearly, the old worker process serving the ASP.NET v4.0 application pool was failing to shut down within the ShutdownTimeLimit of 90 seconds. I did some further experiments and discovered that the 'signalr/connect' request, which is retained in the request queue for the life of the hub connection, was what prevented the worker process from draining and forced its termination.
The version of the SignalR libraries we were using at the time was 1.0.20228.0.
Last year, we upgraded to SignalR version 2.2.31215.272 on both the client and server, and this change appears to have resolved the problem described above. The 'signalr/connect' request is still retained for the life of the hub connection, but when the application pool recycles, the client and server now gracefully reconnect without any issues. Apparently some fix was made between SignalR V1 and V2 that lets it handle application pool recycle events much more gracefully.
Just for my own understanding: why did this issue occur with V1 of the SignalR libraries, and what changed between V1 and V2 to resolve it?
Thanks.
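For context, the client-side wiring looks roughly like the sketch below, using the SignalR 2 .NET client (the URL, hub name, and callback names are hypothetical, not our actual code):

using System;
using Microsoft.AspNet.SignalR.Client;

var connection = new HubConnection("http://example.com/signalr");
var proxy = connection.CreateHubProxy("FeedbackHub");

// Live feedback from the server (log statements, status, etc.).
proxy.On<string>("log", message => Console.WriteLine(message));

// This is where "An existing connection was forcibly closed by the remote host"
// surfaced when the old worker process was torn down.
connection.Error += ex => Console.WriteLine("Hub error: " + ex.Message);

// With V2, the client raises Reconnected after the recycled pool comes back.
connection.Reconnected += () => Console.WriteLine("Reconnected");

connection.Start().Wait();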

Related

SignalR to prevent gateway timeout

We have an ASP.Net MVC app running behind a gateway policy that terminates any web request that runs longer than 5 minutes. One of the features is exporting some data, and it has been taking just over 5 minutes. Would SignalR help? Would having a persistent connection between the client and server be enough for the gateway to consider the request active and not terminate it?
We faced the same issue in our project: the API has to process some data, and the UI can't wait that long for the response.
We use SignalR to notify the requesting UI/client when the data has been successfully processed.
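A rough sketch of that pattern with Web API 2 and SignalR 2 (ExportHub, RunExportAsync, and the exportCompleted client callback are hypothetical names): kick off the work, return immediately, and push a notification when it finishes.

using System.Net;
using System.Threading.Tasks;
using System.Web.Http;
using Microsoft.AspNet.SignalR;

public class ExportHub : Hub { }

public class ExportController : ApiController
{
    [HttpPost]
    public IHttpActionResult Start(string connectionId)
    {
        // Return well before the gateway's 5-minute limit; the export runs on.
        Task.Run(async () =>
        {
            var result = await RunExportAsync();
            var hub = GlobalHost.ConnectionManager.GetHubContext<ExportHub>();
            hub.Clients.Client(connectionId).exportCompleted(result);
        });
        return StatusCode(HttpStatusCode.Accepted);
    }

    // Stand-in for the real export, assumed to take several minutes.
    private static Task<string> RunExportAsync()
    {
        return Task.Run(() => "export-result");
    }
}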

ASP.Net WebApi Rest Service - using Task.Factory.StartNew

We have a WebApi JSON REST service written in C# for .NET 4.0 running in AWS. The service has a /log endpoint which receives logs and forwards them to logstash via TCP for storage.
The /log endpoint uses Task.Factory.StartNew to send the logs to logstash asynchronously and returns StatusCode.OK immediately. This is because we don't want the client to wait for the log to be sent to logstash.
All exceptions are observed and handled. We also don't care if logs are lost when the service is shut down or recycled from time to time, as they are not critical.
At first the flow of logs was very low, probably 20 or 30 per hour during peak time. However, we have recently started sending much larger volumes through, which can be well over a thousand per hour. So the question is: by using Task.Factory.StartNew, are we generating a large number of threads, i.e. one per request to the /log endpoint, or is this managed somehow by a thread pool?
We use NLog for internal logging, but we are wondering if we can pass the logs from the /log endpoint to NLog to take advantage of its async batching features and have it send the logs to logstash. We have a custom target that will send logs to a TCP port.
Thanks in advance.
A Task in .NET does not equal one thread. It's safe to create as many as you need (within reason). Tasks started with Task.Factory.StartNew are queued to the thread pool, and .NET manages how many threads are actually created, so you will not get one thread per request.
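As a sketch of the pattern in question (LogEntry, SendToLogstash, and HandleError are hypothetical names), the /log endpoint queues the send to the thread pool and observes any fault:

using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using System.Web.Http;

public class LogController : ApiController
{
    [HttpPost]
    public HttpResponseMessage Log(LogEntry entry)
    {
        // Queued to the ThreadPool; this does not create a dedicated thread per request.
        Task.Factory.StartNew(() => SendToLogstash(entry))
            .ContinueWith(t => HandleError(t.Exception),
                          TaskContinuationOptions.OnlyOnFaulted);

        // Respond immediately; the client never waits on logstash.
        return Request.CreateResponse(HttpStatusCode.OK);
    }

    private static void SendToLogstash(LogEntry entry) { /* TCP send elided */ }
    private static void HandleError(AggregateException ex) { /* log via NLog, elided */ }
}

public class LogEntry { public string Message { get; set; } }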

ASP.NET Websocket behavior on Pool recycle

I'm currently evaluating ASP.NET websockets for connecting a few thousand clients that will stay connected to the app pretty much 24x7, except when the server goes offline for patching, etc. Generally, the expectation is that the websockets should not disconnect unnecessarily; the clients will basically stay connected and ping the server every few minutes.
While I was researching ASP.NET websocket viability for our new architecture, I came across another Stack Overflow post, 'IIS App Pool cannot recycle when there is an open ASP.NET 4.5 Websocket', which seems to suggest that IIS doesn't recycle the pool if there is an active websocket connection. Is this by design, or did the other poster experience an anomaly? If IIS does recycle the pool while websocket connections are active, what is the expected behavior? Does http.sys keep the connections, recycle the pool, and resume as if nothing happened (from the client's perspective)? Should I just create a separate app pool for websockets and disable recycling on it?
From my experience, no: the WebSockets on the old worker process are not transitioned to the new worker process. I have observed that the old WebSockets are put into a non-Open state, and it is up to your code to check for this and stop maintaining those connections. My application sent a heartbeat to the client over WebSockets; when the heartbeat failed, I needed to close that (already closed) WebSocket context as soon as possible so the old worker process could fully unload and die.
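A hedged sketch of that heartbeat check, assuming 'socket' is the System.Net.WebSockets.WebSocket obtained from an ASP.NET 4.5 websocket handler:

using System;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

static async Task HeartbeatAsync(WebSocket socket)
{
    var ping = new ArraySegment<byte>(Encoding.UTF8.GetBytes("ping"));
    while (socket.State == WebSocketState.Open)
    {
        await socket.SendAsync(ping, WebSocketMessageType.Text, true, CancellationToken.None);
        await Task.Delay(TimeSpan.FromSeconds(30));
    }
    // The socket has left the Open state (e.g. after a pool recycle): release it
    // promptly so the old worker process can finish unloading.
    socket.Abort();
}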

Signalr .net client in a windows service

I have a problem with the .NET client in a Windows service. After some time it freezes.
The use case is this:
A user uploads one or more files to our website. The service detects this and starts to process the files, sending a signal to the website when processing starts and when it ends.
The service checks for new files every 10 seconds. In each iteration we open a new connection to the server and then stop it again; most of the time no messages are sent.
I suspect the repeated connect/disconnect is causing this. Is it better to open the connection when the service starts and then reuse it in every iteration? The service must be able to run without restarts for months.
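A minimal sketch of the single long-lived connection approach (the URL, hub name, and method are hypothetical), reconnecting if the connection drops:

using System;
using System.Threading.Tasks;
using Microsoft.AspNet.SignalR.Client;

// Open one connection when the service starts and reuse it for months.
var connection = new HubConnection("http://example.com/signalr");
var proxy = connection.CreateHubProxy("ProcessingHub");

connection.Closed += async () =>
{
    // Back off briefly, then reconnect so the service keeps running unattended.
    await Task.Delay(TimeSpan.FromSeconds(10));
    await connection.Start();
};

connection.Start().Wait();

// In the 10-second polling loop, reuse the same connection:
proxy.Invoke("ProcessingStarted", "file.txt").Wait();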

Azure web role trouble accessing external web service

I have an Azure web role that accesses an external WCF based SOAP web service (port 80) for various bits of data. The response from this service is highly erratic. I routinely get the following error.
There was no endpoint listening at http://www.myexternalservice.com/service.svc that could accept the message. This is often caused by an incorrect address or SOAP action.
To isolate the problem, I created a simple console app that repeatedly calls this service at 1-second intervals and logs all responses.
// Call the external service once per second and log each response time.
var stopwatch = new Stopwatch();
while (true)
{
    using (var svc = new MyExternalService())
    {
        stopwatch.Restart();
        var response = svc.CallService();
        stopwatch.Stop();
        Log(response, stopwatch.ElapsedMilliseconds);
    }
    Thread.Sleep(1000);
}
If I RDP to one of my Azure web instances and run this app it takes 10 to 20 attempts before it gets a valid response from the external service. These first attempts are always accompanied by the above error. After this "warm up period" it runs fine. If I stop the app and then immediately restart, it has to go back through the same "warm up" period.
However, if I run this same app from any other machine, I receive valid responses immediately. I have run this logger app on servers in multiple (non-Azure) data centers, on desktops on different networks, etc. These test runs are always very stable.
I am not sure why this service would react this way in the Azure environment. Unfortunately, for the short term I am forced to call this service but my users cannot tolerate this inconsistency.
A capture of network traffic on the Azure server shows a large number of SYN retransmits at 10-second intervals during the same periods I experience the connection errors. Once the "warm up" is complete, the SYN retransmits no longer occur.
The Windows Azure data center region where the application is deployed might not be near the external web service, while the local machines you tested from (which work fine) might be close to it. That could mean enough extra latency from Azure to cause the failures.
Successfully accessing the WSDL from a browser in the Azure VM might just be due to browser caching. Making an actual function call from the browser would tell you whether it is really making a connection.
We found a solution for this problem, although I am not completely happy with it. After exhausting all other courses of action, we changed the load balancer from Layer-4 to Layer-7 load balancing. While this fixed the problem of lost requests, I am not sure why it made a difference.
