Handling Timeouts in a Socket Server

I have an asynchronous socket server that contains a thread-safe collection of all connected clients. If there's no activity coming from a client for a set amount of time (i.e. timeout), the server application should disconnect the client. Can someone suggest the best way to efficiently track this timeout for each connected client and disconnect when the client times out? This socket server must be very high performance, and at any given time hundreds of clients could be connected.
One solution is to have each client associated with a last activity timestamp and have a timer periodically poll the collection to see which connection has timed out based on that timestamp and disconnect it. However, this solution to me isn't very good because that would mean the timer thread has to lock the collection whenever it polls (preventing any other connections/disconnections) for the duration of this process of checking every connected client and disconnecting when timed out.
Any suggestions/ideas would be greatly appreciated. Thanks.

If this is a new project or if you're open to a major refactor of your project, have a look at Reactive Extensions. Rx has an elegant solution for timeouts in asynchronous calls:
var getBytes = Observable.FromAsyncPattern<byte[], int, int, int>(_readStream.BeginRead, _readStream.EndRead);
getBytes(buffer, 0, buffer.Length)
.Timeout(timeout);
Note, the code above is just intended to demonstrate how to timeout in Rx and it would be more complex in a real project.
Performance-wise, I couldn't say unless you profile your specific use case. But I have seen a talk where they used Rx in a complex data-driven platform (probably like your requirements) and they said that their software was able to make decisions within less than 30ms.
Code-wise, I find that using Rx makes my code look more elegant and less verbose.

Your solution is quite OK for many cases, as long as enumerating the collection of client contexts is cheap relative to the number of connected clients. If you allow some wiggle room in the timeout (i.e. it's OK to disconnect the client somewhere between 55 and 65 seconds of inactivity), I'd just run the purging routine every so often.
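A minimal sketch of such a periodic purge, assuming the clients live in a ConcurrentDictionary (enumerating it does not lock out concurrent connects and disconnects) and that each client's last-activity time is stored as ticks and updated atomically. The names ClientState, IdleSweeper, Touch and Sweep are hypothetical, not from the question:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical per-client state; only the timestamp matters for this sketch.
public sealed class ClientState
{
    private long _lastActivityTicks;

    public ClientState(DateTime nowUtc) { _lastActivityTicks = nowUtc.Ticks; }

    // Called from the receive path whenever data arrives from the client.
    public void Touch(DateTime nowUtc) =>
        Interlocked.Exchange(ref _lastActivityTicks, nowUtc.Ticks);

    public DateTime LastActivityUtc =>
        new DateTime(Interlocked.Read(ref _lastActivityTicks), DateTimeKind.Utc);
}

public sealed class IdleSweeper
{
    private readonly ConcurrentDictionary<int, ClientState> _clients =
        new ConcurrentDictionary<int, ClientState>();
    private readonly TimeSpan _timeout;

    public IdleSweeper(TimeSpan timeout) { _timeout = timeout; }

    public void Add(int id, ClientState state) => _clients[id] = state;

    // Removes every client idle for at least the timeout and returns their ids;
    // the caller is expected to close the corresponding sockets.
    public List<int> Sweep(DateTime nowUtc)
    {
        var removed = new List<int>();
        foreach (var pair in _clients)
        {
            if (nowUtc - pair.Value.LastActivityUtc >= _timeout &&
                _clients.TryRemove(pair.Key, out _))
            {
                removed.Add(pair.Key);
            }
        }
        return removed;
    }
}
```

A System.Threading.Timer could call Sweep(DateTime.UtcNow) every few seconds; the sweep interval then determines the wiggle room in the effective timeout.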
The other approach, which worked great for me, was using a queue of activity tokens. Let's use C#:
class Token {
    public ClientContext theClient; // this is the client we've observed activity on
    public DateTime theTime;        // this is the time of observed activity
}
class ClientContext {
    // ... what you need to know about the client
    public DateTime lastActivity;   // the time the last activity happened on this client
}
Every time an activity happens on a particular client, a token is generated and pushed into a FIFO queue, and lastActivity is updated in the ClientContext.
There is another thread which runs the following loop:
1. extracts the oldest token from the FIFO queue;
2. checks if theTime in this token matches theClient.lastActivity;
3. if it does, shuts down the client (no newer activity was seen since the token was created);
4. looks at the next oldest token in the queue and calculates how much time is left until that client needs to be shut down;
5. Thread.Sleep(<this time>);
6. repeat.
The price of this approach is a small but constant, i.e. O(1), overhead per activity. One can come up with solutions that are faster in the best case, but it seems to me that it's hard to come up with one that has better worst-case performance.
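The loop above can be sketched roughly as follows. This is an illustration under stated assumptions, not the answerer's actual code: ActivityTracker, Touch, CollectExpired and NextDelay are hypothetical names, the timeout is the same for all clients (so FIFO order equals expiry order), and a sequence number is compared instead of the timestamp so that two activities with identical timestamps cannot be confused.

```csharp
using System;
using System.Collections.Generic;

public sealed class ClientContext
{
    public string Name;
    public DateTime LastActivity;
    public long LastSeq;           // sequence number of the latest activity (sketch-only field)
}

public sealed class ActivityTracker
{
    private readonly Queue<(ClientContext Client, DateTime Time, long Seq)> _tokens =
        new Queue<(ClientContext Client, DateTime Time, long Seq)>();
    private readonly object _gate = new object();
    private readonly TimeSpan _timeout;
    private long _seq;

    public ActivityTracker(TimeSpan timeout) { _timeout = timeout; }

    // Called on every observed activity: O(1) work under a short lock.
    public void Touch(ClientContext client, DateTime now)
    {
        lock (_gate)
        {
            client.LastActivity = now;
            client.LastSeq = ++_seq;
            _tokens.Enqueue((client, now, client.LastSeq));
        }
    }

    // Pops every token that is due at 'now' and returns the clients whose
    // latest activity is still the one recorded in the token, i.e. idle clients.
    public List<ClientContext> CollectExpired(DateTime now)
    {
        var expired = new List<ClientContext>();
        lock (_gate)
        {
            while (_tokens.Count > 0 && _tokens.Peek().Time + _timeout <= now)
            {
                var token = _tokens.Dequeue();
                if (token.Client.LastSeq == token.Seq)
                    expired.Add(token.Client);
            }
        }
        return expired;
    }

    // How long the worker may sleep before the oldest token becomes due;
    // null means the queue is empty.
    public TimeSpan? NextDelay(DateTime now)
    {
        lock (_gate)
        {
            if (_tokens.Count == 0) return null;
            DateTime due = _tokens.Peek().Time + _timeout;
            return due > now ? due - now : TimeSpan.Zero;
        }
    }
}
```

A worker thread would alternate between CollectExpired (closing the returned clients) and sleeping for NextDelay; when the queue is empty it can sleep for the full timeout, since no token can become due sooner than that.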

I think it is easier to control the timeout from within the connection thread. So you may have something like this:
// accept socket and open stream
stream.ReadTimeout = 10000; // 10 seconds
while (true)
{
    int bytes = stream.Read(buffer, 0, length);
    if (bytes == 0) break; // client closed the socket
    // process the data
}
The stream.Read() call will either return data (which means the client is alive), throw an IOException (abnormal disconnect, or no data arrived within ReadTimeout), or return 0 (the client closed the socket).

Related

Establish remote SSL connection after or before local user connection for SSL wrapper?

I'm trying to make a stunnel clone in C# just for fun. The main loop goes something like this (ignore the catch-everything-and-do-nothing try-catches just for now)
ServicePointManager.ServerCertificateValidationCallback = Validator;
TcpListener a = new TcpListener (9999);
a.Start ();
while (true) {
Console.Error.WriteLine ("Spinning...");
try {
TcpClient remote = new TcpClient ("XXX.XX.XXX.XXX", 2376);
SslStream ssl = new SslStream(remote.GetStream(), false, new RemoteCertificateValidationCallback(Validator));
ssl.AuthenticateAsClient("mirai.ca");
TcpClient user = a.AcceptTcpClient ();
new Thread (new ThreadStart(() => {
Thread.CurrentThread.IsBackground = true;
try{
forward(user.GetStream(), ssl); //forward is a blocking function I wrote
}catch{}
})).Start ();
} catch {
Thread.Sleep (1000);
}
}
I found that if I do the remote SSL connection, as I did, before waiting for the user, then when the user connects the SSL is already set up (this is for tunneling HTTP so latency is pretty important). On the other hand, my server closes long-inactive connections, so if no new connection happens in, say, 5 minutes, everything locks up.
What is the best way?
Also, I observe my program generating as many as 200 threads, which of course means that context-switching overhead is pretty big and sometimes results in the whole thing just blocking for seconds, even with just one user tunneling through the program. My forward function goes, in a gist, like
new Thread(new ThreadStart(()=>in.CopyTo(out))).Start();
out.CopyTo(in);
of course with lots of error handling to prevent broken connections from holding up forever. This seems to stall a lot though. I can't figure how to use asynchronous methods like BeginRead which should help according to google.

For any kind of proxy server (including an stunnel clone), opening the backend connection after you accept the frontend connection is clearly much simpler to implement.
If you pre-open backend connections in anticipation of receiving frontend connections, you can certainly save an RTT (which is good for latency), but you have to deal with the issue you hinted at: the backend will close idle connections. Any time you receive a frontend connection, you run the risk that the backend connection you are about to associate with it was opened some time ago, is too old to use, and may be closed by the backend. You will have to manage a pool of currently open backend connections and periodically close and refresh them when they have been idle for too long. There is even a race condition: if the backend decides a connection has been idle too long and closes it at the same moment the proxy server receives a new frontend connection, the proxy may forward a request through the backend connection while the backend is closing it. That means you must know a priori how long backend connections can be idle before the backend closes them (that is, the timeout values configured on the backend), so you can give them up just before the backend decides they are too old.
So in summary: pre-opening backend connections will save an RTT versus opening them only on demand, but it is a lot of work, including subtle connection pool management that is quite tough to implement bug-free. It's up to you to judge whether the extra complexity is worth it.
By the way, concerning your comment about handling several hundred simultaneous connections, I recommend implementing an I/O-bound program such as a proxy server around an event loop instead of around threads. Basically, you use non-blocking sockets and process events in a single thread (e.g. "this socket has new data waiting to be forwarded to the other side") instead of spawning a thread for each connection (which can get expensive both in thread creation and context switches). In order to scale such an event-based model to multiple CPU cores, you can start a small number of parallel threads or processes (more or less one per CPU core), each of which handles many hundreds (or thousands) of simultaneous connections.
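In .NET, one way to get a similar effect without writing an event loop by hand is the task-based stream API, which uses asynchronous I/O instead of a blocked thread per direction. The following is only a sketch of that swapped-in technique (Relay and PumpAsync are hypothetical names); it replaces the two-thread CopyTo pattern from the question:

```csharp
using System.IO;
using System.Threading.Tasks;

public static class Relay
{
    // Copies both directions concurrently without dedicating a thread to either.
    // Completes as soon as one direction finishes (end of stream or error);
    // the caller should then dispose both streams to stop the other direction.
    public static async Task PumpAsync(Stream user, Stream backend)
    {
        Task upstream = user.CopyToAsync(backend);
        Task downstream = backend.CopyToAsync(user);

        Task first = await Task.WhenAny(upstream, downstream).ConfigureAwait(false);
        await first.ConfigureAwait(false); // rethrows if that direction failed
    }
}
```

With real connections the arguments would be the user's NetworkStream and the SslStream to the backend, and the accept loop would start PumpAsync for each connection instead of creating a thread.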

Handling user-timeouts in TCP server in C#

I'm writing a simple C# tcp message-server that needs to respond to the fact that a connected client has been silent the last TimeSpan timeout. In other words
Client A connects.
Client A sends stuff.
Server responds to client A.
Client B connects.
timeout time passes without client A sending anything.
Server sends "ping" (not as in network-ping, but as in a message, SendPing) to A.
Client B sends stuff.
Server responds.
pingTimeout time after ping was sent to A, connection to A is dropped, and the client is removed.
Same happens if B is silent too long.
Simple story short. If no word has been heard from client[n] within timeout, send ping. If ping is replied to, simply update client[n].LastReceivedTime, however, if client[n] fails to respond within pingTimeout, drop the connection.
As far as I understand, this must be done with some kind of scheduler, because simply making a loop which says something like this
while(true) {
foreach(var c in clients) {
if(DateTime.Now.Subtract(c.LastReceivedTime) >= timeout && !c.WaitingPing)
c.SendPing();
else if(DateTime.Now.Subtract(c.LastReceivedTime) >= timeout + pingTimeout && c.WaitingPing)
c.Drop();
}
}
would simply fry the CPU and would be no good at all. Is there a good simple algorithm/class for handling cases like this that can easily be implemented in C#? And it needs to support 100-500 clients at once (as a minimum, it is only positive if it can handle more).

I think your solution would be OK if you use a dedicated thread and put a Thread.Sleep(1000) in the loop so that you don't, as you say, fry the CPU. Avoid blocking calls on this thread, e.g. make sure your calls to SendPing and Drop are asynchronous, so this thread only does one thing.
The other solution is to use one System.Timers.Timer per client connection, with an interval equal to your ping timer. I'm using this method and have tested it with 500 clients (20-second interval) with no issues. If your interval is much shorter, I would not recommend this; look instead at solutions that use a single thread to check (like yours).
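For the single-thread variant, the per-client decision can be isolated in a small pure function that a once-per-second tick applies to each client. This is a sketch mirroring the conditions in the question's loop; IdleAction, IdlePolicy and Decide are hypothetical names:

```csharp
using System;

public enum IdleAction { None, SendPing, Drop }

public static class IdlePolicy
{
    // Decides what to do with one client at time 'now'.
    // waitingPing is true when a ping has been sent and not yet answered.
    public static IdleAction Decide(DateTime now, DateTime lastReceived, bool waitingPing,
                                    TimeSpan timeout, TimeSpan pingTimeout)
    {
        TimeSpan idle = now - lastReceived;

        if (waitingPing)
        {
            // The ping went out after 'timeout'; allow 'pingTimeout' more for the reply.
            return idle >= timeout + pingTimeout ? IdleAction.Drop : IdleAction.None;
        }

        return idle >= timeout ? IdleAction.SendPing : IdleAction.None;
    }
}
```

The ticking thread (or a System.Threading.Timer callback) would call Decide for every client, then start SendPing or Drop asynchronously so that the scan itself never blocks.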

ObjectDisposedException when using Multiple Asynchronous Clients to Multiple Servers

I've been looking into the Asynchronous Client and Asynchronous Server Socket examples on MSDN and have happily punched up the example, which works flawlessly when one Client connects to one Server. My problem is that I need to synchronise a chunk of work with a number of machines so they execute at about the same time (like millisecond difference). The action is reasonably simple: talk to the child servers (all running on the same machine but on different ports for initial testing), simulate its processing and send a 'Ready' signal back to the caller. Once all the Servers have returned this flag (or a time-out occurs), a second message is passed from the client to the acknowledged servers telling them to execute.
My approach so far has been to create two client instances, stored within a list, and start the routine by looping through the list. This works well but is not particularly fast, as each client's routine is run synchronously. To speed up the process, I created a new thread for each client and executed the routine on that. Now this does work, allowing two or more servers to return and synchronise appropriately. Unfortunately, this is very error prone and the code fails with an 'ObjectDisposedException' on the following line of the 'ReceiveCallback' method...
// Read data from the remote device.
int bytesRead = client.EndReceive(ar);
With some investigation and debugging I tracked the sockets being passed to the routine (using its handle) and found while it isn't connected, it is always the second socket to return that fails and not the first that does successfully read its response. In addition, these socket instances (based upon the handle value) appear to be separate instances, but somehow the second (and subsequent responses) continue to error out on this line.
What is causing these sockets to inappropriately dispose of themselves before being legitimately processed? As they are running in separate threads and there are no shared routines, is the first socket being inappropriately used on the other instances? Tbh, I feel a bit lost at sea and while I could band-aid up these errors, the reliability of the code and potentially losing returning acknowledgements is not a favourable goal. Any pointers?
Kind regards

Turns out the shared / static ManualResetEvent was being set across the different instances so thread 1 would set the ManualResetEvent disposing the socket on the second thread. By ensuring that no methods / properties were shared / static - each thread and socket would execute under its own scope.

WebService and Polling

I'd like to implement a WebService containing a method whose reply will be delayed by anything from less than 1 second to about an hour (it depends on whether the data is already cached or needs to be fetched).
Basically my question is what would be the best way to implement this if you are only able to connect from the client to the WebService (no notification possible)?
AFAIK this will only be possible by using some kind of polling. But polling is bad, so I'd rather avoid it. The other extreme would be to just let the connection stay open until the method is done, but I guess this could end up slowing down the webserver and the network. I considered combining these two techniques: the client would call the method, and the server would return after at most 10 seconds, either with the actual result or with a message that the client needs to poll again.
What are your thoughts?

You probably want to have a look at comet

I would suggest a sort of intelligent polling, if possible:
On first request, return a token to represent the request. This is what gets presented in future requests, so it's easy to check whether or not that request has really completed.
On future requests, hold the connection open for a certain amount of time (e.g. a minute, possibly specified by the client) and return either the result or a result of "still no results; please try again at X", where X is the best guess you have about when the response will be completed.
Advantages:
You allow the client to use the "hold a connection open" model, which is relatively expensive (in terms of connections) but allows the response to be served as soon as it's ready. Make sure you don't hold onto a thread for each connection though! (And have some kind of time limit...)
By saying when the client should come back, you can implement a backoff policy - even if you don't know when it will be ready, you could have a "backoff for 1, 2, 4, 8, 16, 30, 30, 30, 30..." minutes policy. (You should potentially check that the client isn't ignoring this.) You don't end up with masses of wasted polls for long misses, but you still get quick results quickly.
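The "1, 2, 4, 8, 16, 30, 30, ..." policy amounts to doubling with a cap. A small sketch of how the server might compute the value of X to hand back (Backoff and NextDelay are hypothetical names, and the 30-minute cap is taken from the example sequence):

```csharp
using System;

public static class Backoff
{
    // attempt is zero-based: 0 -> 1 min, 1 -> 2 min, 2 -> 4 min, ... capped at maxMinutes.
    public static TimeSpan NextDelay(int attempt, int maxMinutes = 30)
    {
        if (attempt < 0) throw new ArgumentOutOfRangeException(nameof(attempt));

        // Guard the shift against overflow for very large attempt numbers.
        int minutes = attempt >= 30
            ? maxMinutes
            : (int)Math.Min(1L << attempt, maxMinutes);

        return TimeSpan.FromMinutes(minutes);
    }
}
```

The server would track how many times a given token has been polled and return DateTime.UtcNow + Backoff.NextDelay(attempt) as the suggested retry time.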

I think that for something which could take an hour to respond a web service is not the best mechanism to use.
Why is polling bad? Surely if you adjust the frequency of the polling it won't be so bad. Perhaps double the time between polls, with a maximum of about five minutes.

Some web services I've worked with return a "please try again in " xml message when they can't respond immediately. I realise that this is just a refinement of the polling technique, but if your server can determine at the time of the request what the likely delay is going to be, it could tell the client that and then forget about it, leaving the client to ask again once the polling interval has expired.

There are timeouts on IIS and client - side, which will prevent you from leaving the connection open.
This is also not practical, because resources/connections are blocked on the server.
Why do you want the user to wait for such a long running task? Let them look up the status of the operation somewhere.

NetworkStream.Write returns immediately - how can I tell when it has finished sending data?

Despite the documentation, NetworkStream.Write does not appear to wait until the data has been sent. Instead, it waits until the data has been copied to a buffer and then returns. That buffer is transmitted in the background.
This is the code I have at the moment. Whether I use ns.Write or ns.BeginWrite doesn't matter - both return immediately. The EndWrite also returns immediately (which makes sense since it is writing to the send buffer, not writing to the network).
bool done;
void SendData(TcpClient tcp, byte[] data)
{
NetworkStream ns = tcp.GetStream();
done = false;
ns.BeginWrite(bytWriteBuffer, 0, data.Length, myWriteCallBack, ns);
while (done == false) Thread.Sleep(10);
}
 
public void myWriteCallBack(IAsyncResult ar)
{
NetworkStream ns = (NetworkStream)ar.AsyncState;
ns.EndWrite(ar);
done = true;
}
How can I tell when the data has actually been sent to the client?
I want to wait for 10 seconds (for example) for a response from the server after sending my data; otherwise I'll assume something went wrong. If it takes 15 seconds to send my data, then it will always time out, since I can only start counting from when NetworkStream.Write returns, which is before the data has been sent. I want to start counting 10 seconds from when the data has left my network card.
The amount of data and the time to send it could vary - it could take 1 second to send it, it could take 10 seconds to send it, it could take a minute to send it. The server does send a response when it has received the data (it's an SMTP server), but I don't want to wait forever if my data was malformed and the response will never come, which is why I need to know if I'm waiting for the data to be sent, or if I'm waiting for the server to respond.
I might want to show the status to the user - I'd like to show "sending data to server", and "waiting for response from server" - how could I do that?

I'm not a C# programmer, but the way you've asked this question is slightly misleading. The only way to know when your data has been "received", for any useful definition of "received", is to have a specific acknowledgment message in your protocol which indicates the data has been fully processed.
The data does not "leave" your network card, exactly. The best way to think of your program's relationship to the network is:
your program -> lots of confusing stuff -> the peer program
A list of things that might be in the "lots of confusing stuff":
the CLR
the operating system kernel
a virtualized network interface
a switch
a software firewall
a hardware firewall
a router performing network address translation
a router on the peer's end performing network address translation
So, if you are on a virtual machine, which is hosted under a different operating system, that has a software firewall which is controlling the virtual machine's network behavior - when has the data "really" left your network card? Even in the best case scenario, many of these components may drop a packet, which your network card will need to re-transmit. Has it "left" your network card when the first (unsuccessful) attempt has been made? Most networking APIs would say no, it hasn't been "sent" until the other end has sent a TCP acknowledgement.
That said, the documentation for NetworkStream.Write seems to indicate that it will not return until it has at least initiated the 'send' operation:
The Write method blocks until the requested number of bytes is sent or a SocketException is thrown.
Of course, "is sent" is somewhat vague for the reasons I gave above. There's also the possibility that the data will be "really" sent by your program and received by the peer program, but the peer will crash or otherwise not actually process the data. So you should do a Write followed by a Read of a message that will only be emitted by your peer when it has actually processed the message.

TCP is a "reliable" protocol, which means the data will be received at the other end if there are no socket errors. I have seen numerous efforts at second-guessing TCP with a higher level application confirmation, but IMHO this is usually a waste of time and bandwidth.
Typically the problem you describe is handled through normal client/server design, which in its simplest form goes like this...
The client sends a request to the server and does a blocking read on the socket waiting for some kind of response. If there is a problem with the TCP connection then that read will abort. The client should also use a timeout to detect any non-network related issue with the server. If the request fails or times out then the client can retry, report an error, etc.
Once the server has processed the request and sent the response it usually no longer cares what happens - even if the socket goes away during the transaction - because it is up to the client to initiate any further interaction. Personally, I find it very comforting to be the server. :-)
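The client-side "read with a timeout" step might look roughly like this with the task-based stream API. It is a sketch, not a definitive implementation: Ack and ReadWithTimeoutAsync are hypothetical names, and the -1 return value for a timeout is a convention chosen here.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class Ack
{
    // Waits for the peer's response. Returns the number of bytes read
    // (0 means the peer closed the connection), or -1 if nothing arrived
    // within 'timeout'. On timeout the read is still pending, so the caller
    // should close the stream before retrying or reporting an error.
    public static async Task<int> ReadWithTimeoutAsync(Stream stream, byte[] buffer, TimeSpan timeout)
    {
        Task<int> read = stream.ReadAsync(buffer, 0, buffer.Length);
        Task finished = await Task.WhenAny(read, Task.Delay(timeout)).ConfigureAwait(false);

        return finished == read
            ? await read.ConfigureAwait(false) // rethrows socket errors, if any
            : -1;
    }
}
```

Since the timer only starts after the write call has returned, the timeout should be chosen generously enough to cover data that is still sitting in the send buffer.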

In general, I would recommend sending an acknowledgment from the client anyway. That way you can be 100% sure the data was received, and received correctly.

If I had to guess, the NetworkStream considers the data to have been sent once it hands the buffer off to the Windows Socket. So, I'm not sure there's a way to accomplish what you want via TcpClient.

I can not think of a scenario where NetworkStream.Write wouldn't send the data to the server as soon as possible. Barring massive network congestion or disconnection, it should end up on the other end within a reasonable time. Is it possible that you have a protocol issue? For instance, with HTTP the request headers must end with a blank line, and the server will not send any response until one occurs -- does the protocol in use have a similar end-of-message characteristic?
Here's some cleaner code than your original version, removing the delegate, field, and Thread.Sleep. Functionally, it performs in exactly the same way.
void SendData(TcpClient tcp, byte[] data) {
NetworkStream ns = tcp.GetStream();
// BUG?: should bytWriteBuffer == data?
IAsyncResult r = ns.BeginWrite(bytWriteBuffer, 0, data.Length, null, null);
r.AsyncWaitHandle.WaitOne();
ns.EndWrite(r);
}
Looks like the question was modified while I wrote the above. The .WaitOne() may help your timeout issue. It can be passed a timeout parameter. This is a lazy wait -- the thread will not be scheduled again until the result is finished, or the timeout expires.

I try to understand the intent of the .NET NetworkStream designers, and I think they had to design it this way. After Write, the data to be sent is no longer handled by .NET. Therefore, it is reasonable that Write returns immediately (and the data will leave the NIC shortly afterwards).
So in your application design, you should follow this pattern rather than trying to make it work your way. For example, a longer timeout while waiting for data from the NetworkStream can compensate for the time that passes before your command leaves the NIC.
In all, it is bad practice to hard-code a timeout value in source files. If the timeout value is configurable at runtime, everything should work fine.

How about using the Flush() method.
ns.Flush()
That should ensure the data is written before continuing.

Below .NET are Windows sockets, which use TCP.
TCP uses ACK packets to notify the sender the data has been transferred successfully.
So the sender machine knows when data has been transferred but there is no way (that I am aware of) to get that information in .net.
edit:
Just an idea, never tried:
Write() blocks only if sockets buffer is full. So if we lower that buffers size (SendBufferSize) to a very low value (8? 1? 0?) we may get what we want :)

Perhaps try setting
tcp.NoDelay = true
