I wrote a server using TcpListener that is supposed to handle many thousands of concurrent connections.
Since I know that most of the time most connections will be idle (with the occasional ping-pong to make sure the other side is still there), async programming seemed to be the solution.
However after the first few hundred clients performance rapidly deteriorates. So rapidly in fact that I can barely reach 1000 concurrent connections.
The CPU is not maxed out (averaging at ~4%), RAM usage is <100MB, and there's not a lot of network traffic going on.
When I pause the server in Visual Studio and take a look at the 'Tasks' window, there are countless (hundreds of) tasks with status "scheduled" and only a few (fewer than 30) "running/active" tasks.
I tried to profile using Visual Studio as well as dotTrace Performance, but I couldn't find anything wrong. No lock contention, no "hot path" where a lot of CPU is used.
It seems like the application just slows down overall.
The setup
I have a simple while(true) and inside it there's this:
var client = await tcpListener.AcceptTcpClientAsync().ConfigureAwait(false);
Task.Run(() => OnClient(client));
In order to handle the connection I made a few methods to encapsulate the different stages of the connection.
For example inside the OnClient above there's await HandleLogin(...), and then it enters a while(client.IsConnected) loop that just does await stream.ReadBuffer(1). stream is just the normal NetworkStream that you get from TcpClient.GetStream, and ReadBuffer is a custom method implemented like this:
public static async Task<byte[]> ReadBuffer(this Stream stream, int length)
{
    byte[] buffer = new byte[length];
    int read = 0;
    while (read < length)
    {
        int remaining = length - read;
        int readNow = await stream.ReadAsync(buffer, read, remaining).ConfigureAwait(false);
        read += readNow;
        if (readNow <= 0)
            throw new SocketException((int)SocketError.ConnectionReset);
    }
    return buffer;
}
I use .ConfigureAwait(false) at every single place where I await anything because I have no need for any sort of synchronization context, and I don't want to pay the performance overhead of retrieving/creating a synchronization context everywhere.
One thing I noticed: when I spawn 50 connections from my test tool and then just close it at random (so all the connections it made should receive a ConnectionReset SocketException on the server), it takes a long time for the server to react at all, oftentimes hanging completely until a new connection arrives.
Could it be that some continuations want to synchronize and run on some specific thread somehow?
It's possible (when disconnecting at the right moment) to make the server application almost unusable with as few as 20 connections.
What am I doing wrong?
If it is some bug (which I assume it is), how would I go about finding it?
I narrowed the problem down to many Tasks just sitting at NetworkStream.ReadAsync(...) even though they should instantly receive a SocketException (ConnectionReset).
I tried starting my test tool (which is just using TcpClient) on a remote machine as well as locally and I get the same results.
Edit 1
My OnClient is defined as async Task OnClient(TcpClient client). Inside it, it awaits the different stages of the connection: authentication, some settings negotiation, then entering a loop where it waits for messages.
I use Task.Run because I do not want to wait until one client is done; I want to accept all clients as fast as possible, spawning a new Task for each one. I am however unsure whether I couldn't/shouldn't just write OnClient(client) without the Task.Run around it and also without awaiting OnClient (that would produce a compiler hint about the unawaited call that doesn't go away, but it is what I want, I think; I don't want to wait until the client is done).
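For reference, a minimal sketch of the variant I mean (the discard assignment is just one way to silence that hint):

async Task AcceptLoopAsync(TcpListener tcpListener)
{
    while (true)
    {
        var client = await tcpListener.AcceptTcpClientAsync().ConfigureAwait(false);
        _ = OnClient(client); // fire-and-forget; exceptions must be observed inside OnClient
    }
}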
The last stage
The last stage the connection enters after authentication and settings negotiation is a loop where the server waits for messages from the client.
However before that the server also does another Task.Run() (with while(is connected) and await Task.Delay...) to send ping packets and a few other "management" things.
All writes into the NetworkStream are synchronized by using the lock mechanism from the Nito AsyncEx library to make sure no packets are somehow interleaved.
If any exception happens anywhere (when reading or writing), I always call .Close on the TcpClient to make sure all other pending incomplete reads and writes throw an exception.
I narrowed the problem down to many Tasks just sitting at NetworkStream.ReadAsync(...) even though they should instantly receive a SocketException (ConnectionReset).
This is an incorrect assumption. You have to write to the socket to detect dropped connections.
This is one of many pitfalls of TCP/IP programming, which is why I recommend people use SignalR if at all possible.
Other pitfalls that jump out from your code/description:
You're attempting to use asynchronous APIs, but your code also has Task.Run. So it's still doing a thread jump right away. This may be desirable or it may not. (Assuming OnClient is an async method; if it's using sync-over-async, then it's definitely not a good pattern).
while(client.IsConnected) is a common incorrect pattern. You should have both a read loop and write queue processor running simultaneously. In particular, IsConnected is absolutely meaningless - it literally only means that the socket was connected at some point in the past. It does not mean that it is still connected. If code has IsConnected, then there's a bug.
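As a rough sketch of the read-loop half (not your exact code; ProcessData is a hypothetical handler for received bytes):

static async Task ReadLoopAsync(NetworkStream stream)
{
    var buffer = new byte[4096];
    while (true)
    {
        // ReadAsync returning 0 means the remote side closed gracefully; a dropped
        // connection typically only surfaces as an exception once you write.
        int bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false);
        if (bytesRead == 0)
            break;
        ProcessData(buffer, bytesRead); // hypothetical: hand the bytes to framing/parsing code
    }
}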
Related
I need to reimplement a database connection driver for a legacy COBOL database for one of my customers. The way the application is built, I cannot use async/await (just leave it at that, I know it is stupid).
The whole application is an ASP.NET API.
The old driver uses a C++ DLL that is included via interop methods. The idea behind the old system is: use one connection to the DB for everything, have multiple threads send a packet, and have one thread that receives the answers and delegates them to the right thread.
To keep the connection alive, one needs to send some sort of ping message to the database and handle its pong message.
I reimplemented that as a POC in C#: have one connection, open a background thread, and use AutoResetEvents to notify the right threads that the answer is ready to be processed. I set the ReceiveTimeout to 5 seconds, and while nobody was sending data to the server, the receive timeout helped me send the ping message to the server.
A reason for the rewrite is that the one-connection solution does not scale.
So, my idea is to use a socket pool and ReceiveAsync with SocketAsyncEventArgs on the sockets.
The solution works so far, but not very well. Here are some questions:
As ReceiveTimeout is not compatible with ReceiveAsync, is there another way than a timer to send my ping messages?
When using ReceiveAsync, can I still use normal Send to send data, or do I have to use SendAsync?
When ReceiveAsync does not receive all required data, may I use Receive to read the rest of it, or is it better to use ReceiveAsync again for the missing data?
Maybe not relevant: I use Artillery to fire some performance tests at the new driver; from time to time they time out after 30 seconds (that's the DB transaction timeout I set). When I try to debug that, Artillery gets ESOCKETTIMEDOUT even though no breakpoint is hit - is this a known behaviour when debugging an IIS process under load?
use AutoResetEvents to notify the right threads that the answer is ready to be processed.
May I suggest a thread-safe queue? BlockingCollection<T> or BufferBlock<T>?
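A minimal sketch of what I mean, assuming a single receive thread enqueues answers (Response, Dispatch, and ResponsePump are placeholders for your driver's own types):

using System.Collections.Concurrent;

class Response { public int RequestId; public byte[] Payload; }

class ResponsePump
{
    private readonly BlockingCollection<Response> _queue = new BlockingCollection<Response>();

    // Called by the single receive thread whenever an answer arrives.
    public void Enqueue(Response response) => _queue.Add(response);

    // A dedicated consumer thread drains the queue and routes each answer.
    public void RunConsumer()
    {
        foreach (var response in _queue.GetConsumingEnumerable()) // blocks until items arrive
            Dispatch(response);
    }

    private void Dispatch(Response response)
    {
        // placeholder: match RequestId to the waiting request and complete it
    }
}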
I set the ReceiveTimeout to 5 seconds, and while nobody was sending data to the server, the receive timeout helped me send the ping message to the server.
This is weird. I assume the entire protocol is ping-pong based, or else using a receive timeout to send messages would not work.
my idea is to use a socket pool and ReceiveAsync with SocketAsyncEventArgs on the sockets
If you can't use async/await, I would advise switching to the Begin*/End* style of asynchronous API. Going straight from synchronous to SocketAsyncEventArgs is quite a leap; SocketAsyncEventArgs is the most difficult form of socket async programming.
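For reference, the Begin*/End* (APM) receive pattern looks roughly like this (a sketch, not a complete driver):

using System;
using System.Net.Sockets;

class ApmReceiver
{
    // State is threaded through IAsyncResult.AsyncState; no async/await required.
    public void StartReceive(Socket socket, byte[] buffer)
    {
        socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None,
            ReceiveCallback, new object[] { socket, buffer });
    }

    private void ReceiveCallback(IAsyncResult ar)
    {
        var state = (object[])ar.AsyncState;
        var socket = (Socket)state[0];
        var buffer = (byte[])state[1];

        int bytesRead = socket.EndReceive(ar); // completes the operation; throws on socket errors
        if (bytesRead == 0)
            return; // remote side closed the connection

        // ... process buffer[0..bytesRead) here ...
        StartReceive(socket, buffer); // post the next receive
    }
}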
is there another way than a timer to send my ping messages
I would recommend a timer; that's the normal solution for heartbeat messages. The desired semantics should be "we want to send data at least this often". So use a timer that you can reset when sending regular messages (not receiving messages).
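A sketch of that shape (the interval handling and the SendPing delegate are assumptions):

using System;
using System.Threading;

class Heartbeat
{
    private readonly Timer _timer;
    private readonly TimeSpan _interval;

    public Heartbeat(TimeSpan interval, Action sendPing)
    {
        _interval = interval;
        // Fires only after "interval" of send-side silence.
        _timer = new Timer(_ => sendPing(), null, interval, interval);
    }

    // Call this after every regular message you send, so pings only
    // go out when nothing else has been sent recently.
    public void Reset() => _timer.Change(_interval, _interval);
}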
when using ReceiveAsync, can I still use normal Send to send data, or do I have to use SendAsync?
You should be able to use synchronous for one stream and asynchronous for the other. I've never tried this, though; all systems I've worked on are fully asynchronous.
when ReceiveAsync does not receive all required data, may I use Receive to read the rest of it, or is it better to use ReceiveAsync again for the missing data?
This question doesn't make as much sense to me. If you're asynchronously reading, you shouldn't block the calling thread.
Also, I think this question is framed from the wrong perspective. It seems like the code wants to "receive the next message", but this is a problematic way to approach reading from a socket. Instead, I recommend that your code have a loop that endlessly reads from the socket and passes that data to another type that buffers it as necessary and pushes out messages as they finish.
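In other words, something shaped like this (a sketch; the 4-byte length prefix and the MessageFramer type are assumptions, your protocol's framing will differ):

using System;
using System.IO;
using System.Net.Sockets;

class MessageFramer
{
    private readonly MemoryStream _pending = new MemoryStream();

    // The read loop dumps raw bytes here, however they happened to arrive.
    public void Append(byte[] data, int count) => _pending.Write(data, 0, count);

    // Emits a message only once all of its bytes have arrived.
    public bool TryGetMessage(out byte[] message)
    {
        message = null;
        byte[] buf = _pending.GetBuffer();
        int available = (int)_pending.Length;
        if (available < 4)
            return false;
        int bodyLength = BitConverter.ToInt32(buf, 0);
        if (available < 4 + bodyLength)
            return false;

        message = new byte[bodyLength];
        Array.Copy(buf, 4, message, 0, bodyLength);

        // Keep any bytes belonging to the next message.
        int remaining = available - 4 - bodyLength;
        byte[] rest = new byte[remaining];
        Array.Copy(buf, 4 + bodyLength, rest, 0, remaining);
        _pending.SetLength(0);
        _pending.Write(rest, 0, remaining);
        return true;
    }
}

class ReadLoop
{
    public static void Run(Socket socket, MessageFramer framer, Action<byte[]> handleMessage)
    {
        var buffer = new byte[4096];
        while (true)
        {
            int bytesRead = socket.Receive(buffer); // or EndReceive in the APM style
            if (bytesRead == 0)
                break; // connection closed gracefully
            framer.Append(buffer, bytesRead);        // stash partial data
            while (framer.TryGetMessage(out var message))
                handleMessage(message);              // complete messages only
        }
    }
}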
is this a known behaviour when debugging an IIS process under load?
I would not expect so, but I don't have much IIS load testing experience.
I have always pondered this.
Let's say we have a simple asynchronous web request using the HttpWebRequest class
class webtest1
{
    // Note: the URI needs a scheme; "www.google.com" alone throws a UriFormatException.
    HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("http://www.google.com");

    public webtest1()
    {
        this.StartWebRequest();
    }

    void StartWebRequest()
    {
        webRequest.BeginGetResponse(new AsyncCallback(FinishWebRequest), null);
    }

    void FinishWebRequest(IAsyncResult result)
    {
        webRequest.EndGetResponse(result);
    }
}
The same can be achieved easily with a synchronous operation:
class webtest1
{
    HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("http://www.google.com");

    public webtest1()
    {
        webRequest.GetResponse();
    }
}
So why would I want to use the more convoluted async operation when a much simpler sync operation would suffice? To save system resources?
If you make an asynchronous request, you can do other things while you wait for the response. If you make a synchronous request, you have to wait until you receive your response before you can do anything else.
For simple programs and scripts it may not matter so much; in fact, in many of those situations the easier-to-code-and-understand synchronous method would be a better design choice.
However, for non-trivial programs, such as a desktop application, a synchronous request that locks up the entire application until the request is finished causes an unacceptable user experience.
A synchronous operation will prevent you from doing anything else while waiting for the request to complete or time out. Using an asynchronous operation would let you animate something for the user to show the program is busy, or even let them carry on working with other areas of functionality.
The synchronous version is simpler to code, but it masks a very serious problem: network communication, or really any I/O operation, can block, and for extended periods of time. Many network connections, for example, have a timeout of 2 minutes.
Doing a network operation synchronously means your application and UI will block for the entire duration of that operation. A not uncommon network hiccup could cause your app to block for several minutes with no ability to cancel. This leads to very unhappy customers.
Asynchrony becomes especially useful when you have more things going on than you have cores - for example, you might have a number of active web requests, a number of file access operations, a few DB calls, and maybe some other network operations (WCF or Redis, perhaps). If all of those are synchronous, you are creating a lot of threads, a lot of stacks, and suffering a lot of context switches. If you can use an asynchronous API, you can usually exploit pool threads for the brief moments when each operation is doing something. This is great for high-throughput server environments. Having multiple cores is great, but being efficient is better.
In C# 5 this becomes, via await, no more work than your second example.
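For example, a sketch of the await version (GetResponseAsync is the Task-based wrapper added in .NET 4.5):

using System.Net;
using System.Threading.Tasks;

class webtest1Async
{
    public async Task StartWebRequestAsync()
    {
        var webRequest = (HttpWebRequest)WebRequest.Create("http://www.google.com");
        // The calling thread is free while the request is in flight.
        using (WebResponse response = await webRequest.GetResponseAsync())
        {
            // use the response here
        }
    }
}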
I was reading this the other day and a similar question has been pondered before:
Performance difference between Synchronous HTTP Handler and Asynchronous HTTP Handler
1) You are stuck in a single-threaded environment such as Silverlight. Here you have no choice but to use async calls, or the entire UI thread will lock up.
2) You have many calls that take a long time to process. Why block your entire thread when it can go on and do other things while waiting for the return? For example, if I have five function calls that each take 5 seconds, I would like to start all of them right away and have them return as necessary.
3) Too much data to process in the output synchronously. If I have a program that writes 10 gigabytes of data to the console and I want to read the output, asynchronously I have a chance to process it line by line. If I do this synchronously, I will run out of buffer space and lock up the program.
I have a .NET 4 C# service that uses the TPL for threading. We recently switched it to also use connection pooling, since one connection was becoming a bottleneck for processing.
Previously, we were using a lock statement to control thread safety on the connection object. As work backed up, it would queue as tasks, and many threads (tasks) would be waiting on the lock. Now, in most scenarios, threads wait on database IO and work processes MUCH faster.
However, now that I'm using connection pooling, we have a new issue. Once the max number of connections is reached (100 default), if further connections are requested, there is a timeout (see Pooling info). When this happens, an exception is thrown saying "Connection request timed out".
All of my IDisposables are within using statements, and I am properly managing my connections. This scenario happens due to more work being requested than the pool can process (which is expected). I understand why this exception is thrown, and am aware of ways of handling it. A simple retry feels like a hack. I also realize that I can increase the timeout period via the connection string, however that doesn't feel like a solid solution. In the previous design (without pooling), work items would process because of the lock within the application.
What is a good way of handling this scenario to ensure that all work gets processed?
Another approach is to use a semaphore around the code that retrieves connections from the pool (and, hopefully, returns them). A semaphore is like a lock statement, except that it allows a configurable number of requestors at a time, not just one.
Something like this should do:
// Assuming mySemaphore is a semaphore instance, e.g.
//   public static Semaphore mySemaphore = new Semaphore(100, 100);
mySemaphore.WaitOne(); // This will block until a slot is available.
try
{
    DoSomeDatabaseLogic();
}
finally
{
    mySemaphore.Release(); // The acquire succeeded, so Release always balances it.
}
You could look to control the degree of parallelism by using the Parallel.ForEach() method as follows:
var items = GetWorkItems(); // your collection of work items (GetWorkItems is a placeholder)
var parallelOptions = new ParallelOptions { MaxDegreeOfParallelism = 100 };
Parallel.ForEach(items, parallelOptions, ProcessItem);
In this case I chose to set the degree to 100, but you can choose a value that makes sense for your current connection pool implementation.
This solution of course assumes that you have a collection of work items up front. If, however, you're creating new Tasks through some external mechanism such as incoming web requests, the exception is actually a good thing. At that point I would suggest that you make use of a concurrent queue data structure where you can place the work items and pop them off as worker threads become available.
The simplest solution is to increase the connection timeout to the length of time you are willing to block a request before returning failure. There must be some length of time that is "too long".
This effectively uses the connection pool as a work queue with a timeout. It's a lot easier than trying to implement one yourself. You would have to check that the connection pool is fair (FIFO).
It's not a question really, I'm just looking for some guidelines :)
I'm currently writing an abstract TCP server which should use as few threads as it can.
Currently it works this way: I have a thread doing the listening and some worker threads. The listener thread just sits and waits for clients to connect; I expect to have a single listener thread per server instance. The worker threads do all the read/write/processing work on the client sockets.
So my problem is in building an efficient worker process, and I came to a problem I can't really solve yet. The worker code is something like this (the code is really simplified, just to show where I have my problem):
List<Socket> readSockets = new List<Socket>();
List<Socket> writeSockets = new List<Socket>();
List<Socket> errorSockets = new List<Socket>();

while (true)
{
    Socket.Select(readSockets, writeSockets, errorSockets, 10);
    foreach (Socket readSocket in readSockets)
    {
        // do reading here
    }
    foreach (Socket writeSocket in writeSockets)
    {
        // do writing here
    }
    // POINT2 and here's the problem I will describe below
}
It all works smoothly except for 100% CPU utilization, because the while loop cycles over and over again. If my clients do a send->receive->disconnect routine it's not that painful, but if I try to keep connections alive, doing send->receive->send->receive over and over, it really eats up all the CPU. So my first idea was to put a sleep there: I check if all sockets have their data sent and then put a Thread.Sleep at POINT2, just for 10ms. But this 10ms later produces a huge delay when I want to receive the next command from the client socket. For example, without keep-alive, commands are executed within 10-15ms, and with keep-alive it becomes worse by at least 10ms :(
Maybe it's just poor architecture? What can be done so my processor won't hit 100% utilization and my server reacts to data appearing on a client socket as soon as possible? Maybe somebody can point to a good example of a nonblocking server and the architecture it should maintain?
Take a look at the TcpListener class first. It has a BeginAccept method that will not block, and will call one of your functions when someone connects.
Also take a look at the Socket class and its Begin methods. These work the same way. One of your functions (a callback function) is called whenever a certain event fires, then you get to handle that event. All the Begin methods are asynchronous, so they will not block and they shouldn't use 100% CPU either. Basically you want BeginReceive for reading and BeginSend for writing I believe.
You can find more on Google by searching for these methods and async sockets tutorials. Here's how to implement a TCP client this way for example. It works basically the same way even for your server.
This way you don't need any infinite looping, it's all event-driven.
Are you creating a peer-to-peer application or a client-server application? You've also got to consider how much data you are putting through the sockets.
Asynchronous BeginSend and BeginReceive is the way to go; you will need to implement the events, but it's fast once you get it right.
You probably don't want to set your Send and Receive timeouts too high either, but there should be a timeout, so that if nothing is received after a certain time, execution comes out of the block and you can handle it there.
Microsoft has a nice async TCP server example. It takes a bit to wrap your head around it. It was a few hours of my own time before I was able to create the basic TCP framework for my own program based on this example.
http://msdn.microsoft.com/en-us/library/fx6588te.aspx
The program logic goes kind of like this: there is one thread that calls listener.BeginAccept and then blocks on allDone.WaitOne. BeginAccept is an async call which gets offloaded to the threadpool and handled by the OS. When a new connection comes in, the OS calls the callback method passed to BeginAccept. That method flips allDone to let the main listening thread know it can listen once again. The callback method is just a transitionary method and continues on to call yet another async call to receive data.
The callback method supplied, ReadCallback, is the primary work "loop" (effectively recursive async calls) for the async calls. I use the term "loop" loosely because each method call actually finishes, but not before calling the next async method. Effectively, you have a bunch of async calls all calling each other, and you pass around your "state" object. This object is your own object, and you can do whatever you want with it.
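Condensed, the listening side of the linked example is shaped like this (a sketch; error handling and the receive logic are elided):

using System;
using System.Net.Sockets;
using System.Threading;

class AsyncTcpListenerSkeleton
{
    static readonly ManualResetEvent allDone = new ManualResetEvent(false);

    static void StartListening(Socket listener)
    {
        while (true)
        {
            allDone.Reset();
            listener.BeginAccept(AcceptCallback, listener); // non-blocking; the OS handles the wait
            allDone.WaitOne();                              // block until a connection arrives
        }
    }

    static void AcceptCallback(IAsyncResult ar)
    {
        allDone.Set(); // let the listening thread post the next accept
        var listener = (Socket)ar.AsyncState;
        Socket handler = listener.EndAccept(ar);
        // ... create your state object and kick off BeginReceive here ...
    }
}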
Every callback method will only get two things returned when the OS calls your method:
1) Socket Object representing the connection
2) State object with which you use for your logic
With your state object and socket object, you can effectively handle your "connections" asynchronously. The OS is VERY good at this.
Also, because your main loop blocks waiting for a connection to come in and offloads those connections to the thread pool via async calls, it remains idle most of the time. The thread pool for your sockets is handled by the OS via completion ports, so no real work happens until data comes in. Very little CPU is used, and it's effectively threaded via the thread pool.
P.S. From what I understand, you don't want to do any hard work with these methods, just handle the movement of the data. Since the thread pool is the pool for your network IO and is shared with the rest of your application, you should offload any hard work via threads/tasks/async so as not to cause the socket thread pool to get bogged down.
P.P.S. I haven't found a way of closing the listening connection other than just disposing "listener". Because the async BeginAccept call has been made, that method will never return until a connection comes in, which means I can't tell it to stop until it returns. I think I'll post a question on MSDN about it, and link it if I get a good response.
Everything is fine in your code except the timeout value. You set it to 10 microseconds (10*10^-6 seconds), so your while routine iterates very often. You should set an adequate value (10 seconds, for example) and your code will not eat 100% CPU.
List<Socket> readSockets = new List<Socket>();
List<Socket> writeSockets = new List<Socket>();
List<Socket> errorSockets = new List<Socket>();

while (true)
{
    Socket.Select(readSockets, writeSockets, errorSockets, 10 * 1000 * 1000); // 10 seconds, in microseconds
    foreach (Socket readSocket in readSockets)
    {
        // do reading here
    }
    foreach (Socket writeSocket in writeSockets)
    {
        // do writing here
    }
    // POINT2
}
I have a web application which runs multiple threads on a button click, each thread making an IO call to a different IP address (i.e. logging in with a Windows account and then performing file operations). There is a threshold value of 30 seconds. I assume that if the threshold is exceeded during the login attempt, the device at that IP address does not match my conditions, so I don't care about it. Thread.Abort() does not fit my situation, since it waits for the IO call to finish, which might take a long time.
I tried doing the DB operations according to the states of the threads right after the threshold timeout. It worked fine, but when I checked the log file, I noticed that the Thread.IsAlive property of the nonresponding threads was still true. After several debugging sessions on my local PC, I encountered a possible deadlock situation (which I suspect) that crashed my PC badly.
In short, do you have any idea how to kill (forcefully) nonresponding threads (waiting on the IO operation) right after the execution of the button click?
(PS: I am not using the threadpool)
Oguzhan
EDIT
For further clarification,
I need to validate the given local administrator credentials on each IP address, and insert a DB record for the successful ones. The rest I don't care about.
In my validation method, I first make a call to the LogonUser method of Win32 (by importing advapi32.dll) to impersonate the administrator user. After that I attempt to create a temp dir on the remote system drive via the Directory.CreateDirectory method, just to check authorization. If any exception is thrown (UnauthorizedAccessException or IOException), then the remote machine is of no interest; otherwise we've got it, so insert into the DB.
I called the validation method in a synchronous way for a given IP range, and it worked fine for a bunch of successful endpoints. But when I tested the method on some irrelevant IP address range, each validation attempt took 20 seconds to 5 minutes to complete.
I then turned my design into a multithreaded one, in which I decided to run each validation in a separate thread and abort the nonresponding threads at the end of the threshold period. The problem was that Thread.Abort did not suit the situation well; it in fact waits for the IO instruction to return (which I don't want) and raises a ThreadAbortException after that.
In order to complete the execution of the successful threads, I ignored the nonresponding ones, proceeded with the DB operations, and returned from the button click method (the nonresponding threads were still alive at that point in time). Everything seemed OK until I got a bad system crash after executing the button click (in debug mode) several times. The problem was probably the increasing number of living threads under the IIS service.
SOLUTION
The cause of the threads not responding in a timely fashion is a "network path not found" situation. My solution is to check connectivity via TCP on port 135 (port 135 is mandatory for RPC on Windows) before making the IO call. The default timeout period is 20 seconds; if you need to set the timeout, use BeginConnect. Another option would be pinging (if ICMP is enabled in the network).
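A minimal sketch of the check (the timeout value and the CanReach name are my own):

using System;
using System.Net.Sockets;

class ConnectivityProbe
{
    // Probe TCP port 135 with BeginConnect so the timeout is under our control
    // instead of the ~20 second "network path not found" wait.
    public static bool CanReach(string host, TimeSpan timeout)
    {
        using (var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
        {
            IAsyncResult result = socket.BeginConnect(host, 135, null, null);
            bool connected = result.AsyncWaitHandle.WaitOne(timeout) && socket.Connected;
            if (connected)
                socket.EndConnect(result); // finish the connect cleanly
            return connected; // disposing the socket aborts a still-pending attempt
        }
    }
}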
Why do you really need to abort the threads anyway? Why not just let them complete normally but ignore the results? (Keep a token to indicate which "batch" of requests each one is in, and then remember which batch you're actually interested in at the moment.)
Another option is to keep hold of whatever you're using to make an IO call (e.g. a socket) and close it; that should cause an exception on the thread making the request.
Yet another option is to avoid putting the requests on different threads, but instead use asynchronous IO - you'll still get parallelism, but without tying up threads (just IO completion ports). Also, can't you put a timeout on the IO operation itself? The request should just time out naturally that way.
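For the second option, the shape is roughly this (a sketch, assuming you keep hold of the Socket; the timer/receive race is glossed over):

using System;
using System.Net.Sockets;
using System.Threading;

class TimedReceive
{
    // Closing the socket from another thread makes the blocked Receive throw
    // (SocketException/ObjectDisposedException) on the thread making the request.
    public static int ReceiveWithTimeout(Socket socket, byte[] buffer, TimeSpan timeout)
    {
        using (var timer = new Timer(_ => socket.Close(), null, timeout, Timeout.InfiniteTimeSpan))
        {
            int read = socket.Receive(buffer); // throws if the timer closed the socket
            timer.Change(Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan); // disarm on success
            return read;
        }
    }
}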