Our server application is listening on a port, and after a period of time it no longer accepts incoming connections. (And while I'd love to solve this issue, it's not what I'm asking about here;)
The strange thing is that when our app stops accepting connections on port 44044, so does IIS (on port 8080). Killing our app fixes everything: IIS starts responding again.
So the question is, can an application mess up the entire TCP/IP stack? Or perhaps, how can an application do that?
Senseless detail: Our app is written in C#, under .Net 2.0, on XP/SP2.
Clarification: IIS is not "refusing" the attempted connections. It never sees them. Clients get a "server did not respond in a timely manner" message (using the .NET TcpClient).
You may well be starving the stack. It is pretty easy to drain in an environment with a high rate of connection opens and closes, e.g. a web server serving lots of unpooled requests.
This is exacerbated by the default TIME-WAIT delay: the time a closed socket has to wait before its port can be recycled defaults to 240 seconds (4 minutes) on XP.
There are a number of registry values that can be tweaked. I suggest creating or editing at least the following values under this key:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
TcpTimedWaitDelay = 30
MaxUserPort = 65534
MaxHashTableSize = 65536
MaxFreeTcbs = 16000
Plenty of docs on MSDN & Technet about the function of these keys.
You haven't maxed out the available port handles, have you?
netstat -a
I saw something similar when an app was opening and closing ports (but not actually closing them correctly).
Use netstat -a to see the active connections when this happens. Perhaps, your server app is not closing/disposing of 'closed' connections.
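The same check can be made from inside a .NET process. The following is a minimal sketch, assuming .NET 3.5 or later (for LINQ); the class and method names are made up for illustration. It counts the machine's TCP connections per state, so a pile-up of TimeWait or CloseWait entries stands out. Note that it reports connections only, not listening sockets.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.NetworkInformation;

public static class TcpStateReport
{
    // Groups a sequence of TCP states into a state -> count table.
    public static Dictionary<TcpState, int> CountByState(IEnumerable<TcpState> states)
    {
        return states.GroupBy(s => s).ToDictionary(g => g.Key, g => g.Count());
    }

    // Takes a snapshot of the machine's current TCP connections.
    public static Dictionary<TcpState, int> Snapshot()
    {
        TcpConnectionInformation[] connections =
            IPGlobalProperties.GetIPGlobalProperties().GetActiveTcpConnections();
        return CountByState(connections.Select(c => c.State));
    }

    // Prints the states, most frequent first.
    public static void Print()
    {
        foreach (var pair in Snapshot().OrderByDescending(p => p.Value))
            Console.WriteLine("{0,-12} {1}", pair.Key, pair.Value);
    }
}
```

Calling TcpStateReport.Print() periodically while the problem develops shows which state is growing.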
Good suggestions from everyone, thanks for your help.
So here's what was going on:
It turns out that we had several services competing for the same port, and most of the time the "proper" service would get the port. Occasionally a second service would grab the port away, and the first service would try to open a different port. From that time on, the services would keep grabbing new ports every time they serviced a request (since they weren't using their preferred ports) and eventually we would exhaust all available ports.
Of course, the actual question was: "Can an application mess up the entire TCP/IP stack?", and the answer to that question is: Yes. One way to do it is to listen on a whole bunch of ports.
I guess the port number comment from RichS is correct.
Other than that, the TCP/IP stack is just a module in your operating system and, as such, can have bugs that might allow an application to kill it. It wouldn't be the first driver to be killed by a program.
(A tip to the hat towards Andrew Tanenbaum for insisting that operating systems should be modular instead of monolithic.)
I've been in a couple of similar situations myself. A good troubleshooting step is to attempt a connection from the affected machine to a known-good destination that isn't experiencing any connectivity issues at that moment. If the connection attempt fails, you are very likely to get more interesting details in the error message/code. For example, it could say that there aren't enough handles, or not enough memory.
From a support and sys admin standpoint, I have only seen this on the rarest of occasions (more than once), but it certainly can happen.
When you are diagnosing the problem, you should carefully eliminate the possible causes, rather than blindly rebooting the system at the first sign of trouble. I only say this because many customers I work with are tempted to do that.
I'm trying to read data from a lot of Azure blobs in parallel in an Azure Function, and it fails because my service plan does not allow more than ~4000 TCP connections (I get an error about this in the portal). However, when I run it locally, all of the following:
netstat with all possible flags
Wireshark
TCPView
network inspector in Windows task manager
show just a couple dozen entries. Is there a tool or a code snippet that would let me reproduce locally the situation I see once my app is deployed?
Even better would be knowing if it is possible to somehow limit the number of TCP connections that my Azure Function is trying to open (using .NET Azure SDK, or Azure portal, or some settings.json file or whatever)
Edit1: I've rewritten the whole thing to be sequential and split the blob reads into chunks of 100 items. This somewhat reduced the number of TCP connections: it is about 500 at peak now, which is still a lot but at least fits the app service plan (the app, of course, became slow as hell as a result). It still tries to allocate ~4000 "socket handles" and fails, though, and I still can't find a way to see the same number of socket handles allocated locally: the Handles column in the Details tab of Windows Task Manager shows roughly the same number of handles during the whole process execution.
To answer the question itself: I wasn't able to find a way to see locally the TCP-related metrics that I get when actually running my functions in Azure. For now it feels like some important development tools and/or docs are missing. The "serverless" experience turned out to be the deepest dive into Windows system programming I ever had as a .NET developer.
The solution for the problem itself was the following:
I rewrote the whole thing to be sequential and got it down to establishing about a hundred simultaneous connections. Then I binary-searched on MaxDegreeOfParallelism until I found a value suitable for my plan.
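For asynchronous work such as blob downloads, the same cap on parallelism can be sketched with a SemaphoreSlim instead of MaxDegreeOfParallelism. This is not the code from the answer, only an illustration of the idea; the Throttle class is a made-up name and the work delegate stands in for whatever reads one blob.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Runs work(item) for every item with at most maxConcurrency operations in flight.
    public static async Task ForEachAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = new List<Task>();
            foreach (T item in items)
            {
                await gate.WaitAsync();              // wait until a slot is free
                tasks.Add(RunOne(item, work, gate)); // start the operation in that slot
            }
            await Task.WhenAll(tasks);               // rethrows if any operation failed
        }
    }

    private static async Task RunOne<T>(T item, Func<T, Task> work, SemaphoreSlim gate)
    {
        try { await work(item); }
        finally { gate.Release(); }                  // free the slot even on failure
    }
}
```

A call would look like `await Throttle.ForEachAsync(blobNames, 32, name => DownloadAsync(name));`, where blobNames and DownloadAsync are hypothetical. The limit (32 here) is the value to tune against the plan's connection quota.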
You may be bumping up against the default connection limit behind HttpClient, which on the .NET Framework restricts the number of open connections per server to 2. This follows the original HTTP/1.1 specification (RFC 2616), which recommended that a client keep no more than two connections to any one server. You can override that default using the DefaultConnectionLimit property of ServicePointManager. Microsoft has an article on it here.
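A minimal sketch of both knobs follows; the ConnectionLimits class is a made-up name. ServicePointManager.DefaultConnectionLimit is the .NET Framework setting mentioned above and only affects hosts contacted after it is set. On .NET Core and later, HttpClient ignores it, and the per-server cap is set on the handler through MaxConnectionsPerServer instead, which can also be used to lower the number of connections a client opens.

```csharp
using System.Net;
using System.Net.Http;

public static class ConnectionLimits
{
    // .NET Framework: per-host cap for HttpWebRequest-based clients.
    // Set it at startup, before the first request to a host is made.
    public static void SetFrameworkDefault(int limit)
    {
        ServicePointManager.DefaultConnectionLimit = limit;
    }

    // .NET Core and later (also .NET Framework 4.7.1+): cap set per handler.
    public static HttpClient CreateClient(int maxConnectionsPerServer)
    {
        var handler = new HttpClientHandler
        {
            MaxConnectionsPerServer = maxConnectionsPerServer
        };
        return new HttpClient(handler);
    }
}
```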
I have a TCP server that gets data from one (and only one) client. When this client sends the data, it makes a connection to my server, sends one (logical) message and then does not send any more on that connection.
It will then make another connection to send the next message.
I have a co-worker who says that this is very bad from a resources point of view. He says that making a connection is resource intensive and takes a while. He says that I need to get this client to make a connection and then just keep using it for as long as we need to communicate (or until there is an error).
One benefit of using separate connections is that I can probably multi-thread them and get more throughput on the line. I mentioned this to my co-worker and he told me that having lots of sockets open will kill the server.
Is this true? Or can I just allow it to make a separate connection for each logical message that needs to be sent. (Note that by logical message I mean an xml file that is of variable length.)
It depends entirely on the number of connections that you are intending to open and close and the rate at which you intend to open them.
Unless you go out of your way to avoid the TIME_WAIT state by aborting the connections rather than closing them gracefully you will accumulate sockets in TIME_WAIT state on either the client or the server. With a single client it doesn't actually matter where these accumulate as the issue will be the same. If the rate at which you use your connections is faster than the rate at which your TIME_WAIT connections close then you will eventually get to a point where you cannot open any new connections because you have no ephemeral ports left as all of them are in use with sockets that are in TIME_WAIT.
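The break-even point can be estimated with simple arithmetic: with a single client talking to a single server port, the sustainable connection rate is roughly the number of ephemeral ports divided by the TIME_WAIT duration. The sketch below assumes Windows XP's documented defaults (ephemeral ports 1025 to 5000 and a 240-second TIME_WAIT); the class name is made up for illustration.

```csharp
public static class TimeWaitMath
{
    // Number of ports in an inclusive ephemeral range.
    public static int PortCount(int firstPort, int lastPort)
    {
        return lastPort - firstPort + 1;
    }

    // Highest steady rate (connections per second) at which ports are
    // released from TIME_WAIT as fast as new connections consume them.
    public static double MaxSustainedRate(int portCount, int timeWaitSeconds)
    {
        return (double)portCount / timeWaitSeconds;
    }
}
```

With the assumed XP defaults, PortCount(1025, 5000) is 3976 and MaxSustainedRate(3976, 240) is about 16.6 connections per second. Widening the range to 1025 to 65534 and cutting TIME_WAIT to 30 seconds gives 64510 / 30, or about 2150 per second.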
I write about this in much more detail here: http://www.serverframework.com/asynchronousevents/2011/01/time-wait-and-its-design-implications-for-protocols-and-scalable-servers.html
In general I would suggest that you keep a single connection and simply reopen it if it gets reset. The logic may appear to be a little more complex but the system will scale far better; you may only have one client now and the rate of connections may be such that you do not expect to suffer from TIME_WAIT issues but these facts may not stay the same for the life of your system...
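A minimal sketch of that approach, assuming the client side is .NET: one TcpClient is kept open and replaced only when a write fails. The PersistentSender class and the 4-byte length prefix are assumptions added for illustration (several variable-length XML messages on one connection need some framing so the receiver can tell them apart). A write to a connection the peer has already closed can still appear to succeed, so this alone does not guarantee delivery.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;

public sealed class PersistentSender : IDisposable
{
    private readonly string _host;
    private readonly int _port;
    private TcpClient _client;

    public PersistentSender(string host, int port)
    {
        _host = host;
        _port = port;
    }

    // Sends one length-prefixed message, reconnecting once if the old connection has died.
    public void Send(byte[] payload)
    {
        try
        {
            Write(payload);
        }
        catch (Exception ex) when (ex is IOException || ex is SocketException)
        {
            Reset();
            Write(payload); // a second failure propagates to the caller
        }
    }

    private void Write(byte[] payload)
    {
        if (_client == null) _client = new TcpClient(_host, _port); // connect lazily
        NetworkStream stream = _client.GetStream();
        // 4-byte big-endian length, then the message body.
        byte[] header = BitConverter.GetBytes(IPAddress.HostToNetworkOrder(payload.Length));
        stream.Write(header, 0, header.Length);
        stream.Write(payload, 0, payload.Length);
    }

    private void Reset()
    {
        if (_client != null)
        {
            _client.Close();
            _client = null;
        }
    }

    public void Dispose()
    {
        Reset();
    }
}
```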
The initiation sequence of a TCP connection is a very simple 3 way handshake which has very low overhead. No need to maintain a constant connection.
Also, having many TCP connections won't kill your server so fast. Modern hardware and operating systems can handle hundreds of concurrent TCP connections, unless you are afraid of denial-of-service attacks, which are obviously out of the scope of this question.
If your server has only a single client, I can't imagine in practice there'd be any issues with opening a new TCP socket per message. Sounds like your co-worker likes to prematurely optimize.
However, if you're flooding the server with messages, it may become an issue. But still, with a single client, I wouldn't worry about it.
Just make sure you close the socket when you're done with it. No need to be rude to the server :)
In addition to what everyone said, consider UDP. It's perfect for small messages where no response is expected, and on a local network (as opposed to Internet) it's practically reliable.
From the server's perspective, it is not a problem to have a very large number of connections open.
How many socket connections can a web server handle?
From the client's perspective, if measuring shows you need to avoid the time it takes to initiate connections and you want parallelism, you could create a connection pool. Multiple threads can reuse each of the connections and release them back into the pool when they're done. That does raise the complexity level, so once again, make sure you need it. You could also add logic to shrink and grow the pool based on activity: it would be a shame to hold connections open to the server overnight while the app is just sitting there idle.
I've written an IP multicasting application in C#. It compiles fine, but at runtime this line:
sock.SetSocketOption(SocketOptionLevel.IP,
SocketOptionName.AddMembership,
new MulticastOption(IPAddress.Parse("224.100.0.1")));
throws an unhandled socket exception:
An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full
I searched for the error on Google, and people have suggested removing the /3GB switch (my OS is Windows 7), which may have been enabled. I did that, but I still get the same error. What could be the issue?
It could be port exhaustion.
When applications make too many outgoing connections in a short time frame, or do not dispose of outgoing connections properly, you run out of ports.
Here is the link to a rather lengthy explanation and a way to diagnose the issue
This seems to happen when you run out of resources (sockets?) or memory.
At the command prompt run:
netstat -ab
I'm not sure off hand what the socket limit is. I'm currently fighting an issue like this myself.
Notes on socket limits:
http://support.microsoft.com/kb/196271
You can encounter this error message if resource limits are being exceeded. A System.Net.Sockets.Socket implements IDisposable. Are you disposing of your socket(s) once you're finished with them?
Leaving them around for the garbage collector is an excellent way to leak resources.
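A minimal sketch of deterministic cleanup with a using block; the SocketLifetime class and BindAndRelease method are made-up names. The socket's handle and port are released when the block exits, rather than whenever the garbage collector gets around to finalizing the object.

```csharp
using System.Net;
using System.Net.Sockets;

public static class SocketLifetime
{
    // Binds a UDP socket to an OS-chosen loopback port, then releases it deterministically.
    public static int BindAndRelease()
    {
        using (var sock = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp))
        {
            sock.Bind(new IPEndPoint(IPAddress.Loopback, 0));
            return ((IPEndPoint)sock.LocalEndPoint).Port;
        } // Dispose runs here, even if an exception was thrown inside the block
    }
}
```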
I have this same error, but only on Windows 7. If I run my same multicast app on Vista it works. It pops up a system dialog to unblock the behavior from the app, but it runs. This is probably a changed networking permissions thing in win7. I'm still looking for a solution. If someone else finds one, please post.
I've had the same issue (a.k.a error 10055) when trying to connect to a local MySQL database. I believe you need to raise the number of dynamic ports that the operating system allows.
The solution that worked for me was mentioned here
I believe it may help you as well, since you are using Windows.
Best of luck!
I had the same error on Windows Server 2008. In my case, restarting the server (which had 2 years of uptime) solved the problem.
See http://support.microsoft.com/kb/2553549 and https://support.microsoft.com/en-us/kb/929851 (you determine how many dynamic outbound ports you want). Along with this second article, set TcpTimedWaitDelay to a DWORD decimal value of 30, so that when sockets are released to the system, they clear faster.
See - technet.microsoft.com/en-us/library/cc938217.aspx
Got this error when contacting an external email provider in Sweden called Loopia on a Windows Server 2012 R2 Datacenter x64.
The server had not been rebooted for almost a year. After rebooting everything worked.
An operation on a socket could not be performed because the system
lacked sufficient buffer space or because a queue was full
194.9.94.72:993
Description: An unhandled exception occurred during the execution of
the current web request. Please review the stack
trace for more information about the error and where it originated in
the code.
Exception Details: System.Net.Sockets.SocketException: An
operation on a socket could not be performed because the system lacked
sufficient buffer space or because a queue was full 194.9.94.72:993
I had the same issue and noticed that there were a lot of old/forgotten custom background processes running on the server and consuming sockets. Some were even the same processes, run remotely by different users.
A brief cleanup via Task Manager did the job. A reboot could also be an option, if you can afford one.
I need to be able to block any and all connections to my PC from a specific IP address. I know this is possible with a firewall, but I need to do it in C#. Any idea how? (I need code.)
Update:
It's a generic C# app, not ASP.NET; the target platforms are WinXP through Win7.
Need more information... if you're talking socket communication, you can simply close the connection to a client as soon as it connects if the IP address is blocked, or process the Connection Request and evaluate there.
Edit: Simplest way for you would probably just be to interact with Windows Firewall API... here's how:
http://www.shafqatahmed.com/2008/01/controlling-win.html
Your question is unclear but I'll try to answer the best I can, within my understanding.
1. Do you want to prevent machines from connecting to any port on your machine? If so, you need to control the built-in Windows Firewall or find yourself a filter driver you can control. To write your own filter driver you must leave the land of managed code, so I am guessing that's not an option.
To learn how to control the firewall, here's a link:
http://www.shafqatahmed.com/2008/01/controlling-win.html
More on Google.
2. Do you want to prevent remote machines from connecting to a port on your machine that your application owns? You cannot do that either (see #1 above). However, you can take action after the connection is made and close it if you don't like the remote IP (check the remote endpoint's IP).
Two caveats with this approach:
It doesn't save you from a DoS attack.
You will need to be careful if you need ipv6 support (you can't just check the IPV4 address in that case)
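The accept-then-close approach can be sketched as below, assuming .NET 4.5 or later for the IPv4-mapped address helpers; the BlockingListener class and its members are made-up names. As noted above, the TCP connection is still established before it is dropped, so this is not a substitute for a firewall.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;

public static class BlockingListener
{
    // Treats an IPv4-mapped IPv6 address (::ffff:a.b.c.d) as its IPv4 form,
    // so one blocklist entry matches the peer on dual-mode sockets too.
    public static IPAddress Normalize(IPAddress address)
    {
        return address.IsIPv4MappedToIPv6 ? address.MapToIPv4() : address;
    }

    public static bool IsBlocked(IPAddress remote, ICollection<IPAddress> blocked)
    {
        return blocked.Contains(Normalize(remote));
    }

    // Accepts connections forever and drops blocked peers before reading anything from them.
    public static void Run(int port, ICollection<IPAddress> blocked, Action<TcpClient> handle)
    {
        var listener = new TcpListener(IPAddress.Any, port);
        listener.Start();
        while (true)
        {
            TcpClient client = listener.AcceptTcpClient();
            IPAddress remote = ((IPEndPoint)client.Client.RemoteEndPoint).Address;
            if (IsBlocked(remote, blocked))
            {
                client.Close(); // blocked peer: drop the connection immediately
                continue;
            }
            handle(client);     // the handler owns the client from here on
        }
    }
}
```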
HTH
A "firewall" in c#?
First you would have to access the network interface on a low level, eg.: http://msdn.microsoft.com/en-us/library/ms817945.aspx
Then you have to parse all incoming packets and maybe discard them.
It's not an easy task, and I don't recommend writing a driver and a firewall in C#, because the .NET Framework would be loaded every time you start your machine.
Traffic parsing can also be tricky. I implemented a router/traffic analyzer in C# some time ago, and it took me about a year to gain the network programming experience and knowledge needed to do it.
I have a web service slowdown.
My (web) service is in gsoap & managed C++. It's not IIS/apache hosted, but speaks xml.
My client is in .NET
The service computation time is light (<0.1s to prepare reply). I expect the service to be smooth, fast and have good availability.
I have about 100 clients, response time is 1s mandatory.
Clients have about 1 request per minute.
Clients are checking web service presence by tcp open port test.
So, to avoid possible congestion, I turned gSoap KeepAlive to false.
Up to this point everything runs fine: I barely see any connections in TCPView (Sysinternals).
A new, special synchronisation program now calls the service in a loop.
The load is higher, but everything is processed in less than 30 seconds.
With Sysinternals TCPView, I see that about a thousand connections are in TIME_WAIT.
They slow down the service, and it now takes seconds for the service to reply.
Could it be that I need to reset the SoapHttpClientProtocol connection?
Has anyone else seen TIME_WAIT ghosts when calling a web service in a loop?
Sounds like you aren't closing the connection after the call and opening new connections on each request. Either close the connection or reuse the open connections.
Be very careful with the implementations mentioned above. There are serious problems with them.
The implementation described in yakkowarner.blogspot.com/2008/11/calling-web-service-in-loop.html (COMMENT ABOVE):
PROBLEM: All your work will be wiped out the next time you regenerate the web service proxy using wsdl.exe, and you are going to forget what you did. Not to mention that this fix is rather hacky, relying on a message string to take action.
The implementation described in forums.asp.net/t/1003135.aspx (COMMENT ABOVE):
PROBLEM: You are selecting an endpoint between 5000 and 65535 so on the surface this looks like a good idea. If you think about it there is no way (at least none I can think of) that you could reserve ports to be used later. How can you guarantee that the next port on your list is not currently used? You are sequentially picking up ports to use and if some other application picks a port that is next on your list then you are hosed. Or what if some other application running on your client machine starts using random ports for its connections - you would be hosed at UNPREDICTABLE points in time. You would RANDOMLY get an error message like "remote host can't be reached or is unavailable" - even harder to troubleshoot.
Although I can't give you the right solution to this problem, some things you can do are:
Try to minimize the number of web service requests or spread them out more over a longer period of time
For your type of app, maybe web services weren't the right architecture. For something with a mandatory 1 s response time you should be using a messaging system, not a web service
Set your OS's number of allowed connections to 65K using the registry (on Windows)
Set your OS's time that sockets remain in TIME_WAIT to some lower number (this presents its own list of problems)