I have a program that starts by listening for connections. I wanted to implement a pattern in which the server accepts a connection and passes that individual connection to a user class for processing: future packet reception and handling of the data.
I ran into trouble with the synchronous pattern before I found out that asynchronous use of the Socket class isn't scary. But then I ran into more trouble. It seemed that, in a while (true) loop, since BeginAccept() is asynchronous, the program would constantly move through this loop and eventually run into an OutOfMemoryException. I needed something to listen for a connection, and immediately hand off responsibility of that connection to some other class.
So I read Microsoft's example and found out about ManualResetEvent. I could actually specify when I was ready for the loop to begin listening again! But after reading some questions here on Stack Overflow, I have become confused.
My worry is that even though I have asynchronously accepted a connection, the entire program will block while it's trying to listen for a new connection upon re-entering the loop. This isn't ideal if I'm handling multiple users.
I'm very new to the world of asynchronous I/O, so I would appreciate even the angriest of comments about my vocabulary or a misuse of a phrase.
Code:
static void Main(string[] args)
{
    MainSocket = new Socket(SocketType.Stream, ProtocolType.Tcp);
    MainSocket.Bind(new IPEndPoint(IPAddress.Parse("192.168.1.74"), 1626));
    MainSocket.Listen(10);
    while (true)
    {
        Ready.Reset();
        AcceptCallback = new AsyncCallback(ConnectionAccepted);
        MainSocket.BeginAccept(AcceptCallback, MainSocket);
        Ready.WaitOne();
    }
}
static void ConnectionAccepted(IAsyncResult IAr)
{
    Ready.Set();
    Connection UserConnection = new Connection(MainSocket.EndAccept(IAr));
}
The Microsoft example, in which they use the old-style WaitHandle based events, will work but frankly it is a very odd and awkward way to implement asynchronous code. I get the feeling that the events are there in the example mainly as a way of artificially synchronizing the main thread so it has something to do. But it's not really the right approach.
One option is to just not even accept sockets asynchronously. Instead, use the asynchronous I/O for when the socket is connected and use a synchronous loop in the main thread to accept sockets. This winds up being pretty much exactly what the Microsoft sample does anyway, but keeps all of the accept logic in the main thread instead of switching back and forth between the main thread (which starts the accept operation) and some IOCP thread that handles the completion.
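A minimal sketch of that first option, assuming the listener has already been bound and put into the listening state. The names SyncAcceptLoop, Run and onAccepted are hypothetical; onAccepted stands in for whatever hands the socket to a per-connection class (for example, s => new Connection(s)) and is expected to return quickly after starting its own asynchronous I/O.

```csharp
using System;
using System.Net.Sockets;

static class SyncAcceptLoop
{
    // Accepts connections synchronously on the calling thread until the
    // listener is closed. Each accepted socket is passed to onAccepted.
    public static void Run(Socket listener, Action<Socket> onAccepted)
    {
        while (true)
        {
            Socket client;
            try
            {
                // Blocks without consuming CPU until a client connects.
                client = listener.Accept();
            }
            catch (SocketException) { return; }          // listener closed or failed
            catch (ObjectDisposedException) { return; }  // listener already disposed

            // Hand off immediately so the loop can go back to Accept().
            onAccepted(client);
        }
    }
}
```

The main thread would call Bind() and Listen() as in the question and then SyncAcceptLoop.Run(MainSocket, ...). Closing the listener from another thread is what ends the loop in this sketch.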
Another option is to just give the main thread something else to do. For a simple example, this could be simply waiting for some user input to signal that the program should shut down. Of course, in a real program the main thread could be something useful (e.g. handling the message loop in a GUI program).
If the main thread is given something else to do, then you can use the asynchronous BeginAccept() in the way it was intended: you call the method to start the accept operation, and then don't call it again until that operation completes. The initial call happens when you initialize your server, but all subsequent calls happen in the completion callback.
In that case, your completion callback method looks more like this:
static void ConnectionAccepted(IAsyncResult IAr)
{
    Connection UserConnection = new Connection(MainSocket.EndAccept(IAr));
    MainSocket.BeginAccept(ConnectionAccepted, MainSocket);
}
That is, you simply call the BeginAccept() method in the completion callback itself. (Note that there's no need to create the AsyncCallback object explicitly; the compiler will implicitly convert the method name to the correct delegate type instance on your behalf).
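Put together, the whole pattern might look roughly like the sketch below. CallbackAcceptor and its onAccepted delegate are hypothetical names, not part of the question's code; onAccepted stands in for constructing the Connection object. Error handling is deliberately minimal.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

sealed class CallbackAcceptor
{
    readonly Socket listener;
    readonly Action<Socket> onAccepted; // hypothetical hand-off, e.g. s => new Connection(s)

    public CallbackAcceptor(IPEndPoint endpoint, Action<Socket> onAccepted)
    {
        this.onAccepted = onAccepted;
        listener = new Socket(endpoint.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
        listener.Bind(endpoint);
        listener.Listen(10);
    }

    public IPEndPoint LocalEndPoint
    {
        get { return (IPEndPoint)listener.LocalEndPoint; }
    }

    // The one initial call; every later BeginAccept happens in the callback.
    public void Start() { listener.BeginAccept(ConnectionAccepted, null); }

    public void Stop() { listener.Close(); }

    void ConnectionAccepted(IAsyncResult ar)
    {
        Socket client = null;
        try { client = listener.EndAccept(ar); }
        catch (ObjectDisposedException) { return; } // Stop() was called
        catch (SocketException) { }                 // this accept failed; keep listening

        // Re-arm the accept before doing anything else with the new client.
        try { listener.BeginAccept(ConnectionAccepted, null); }
        catch (ObjectDisposedException) { }         // Stop() raced with the callback

        if (client != null) onAccepted(client);
    }
}
```

The main thread can then call Start() and do something else, such as waiting on Console.ReadLine() before calling Stop().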
Related
I come from a Java background. I am just beginning my adventure with C# and .NET.
I am creating a Server socket application using C# - based on Microsoft example (https://learn.microsoft.com/en-us/dotnet/framework/network-programming/asynchronous-server-socket-example).
This is the sample code (from the page above):
while (true)
{
    // Set the event to nonsignaled state.
    allDone.Reset();
    // Start an asynchronous socket to listen for connections.
    Console.WriteLine("Waiting for a connection...");
    listener.BeginAccept(
        new AsyncCallback(AcceptCallback),
        listener);
    // Wait until a connection is made before continuing.
    allDone.WaitOne();
}
... where allDone is a ManualResetEvent defined in the class scope.
I can't understand what listener.BeginAccept actually does. For instance, if I didn't call allDone.WaitOne() to wait until a connection is made, what would happen? How many times would the call to listener.BeginAccept execute successfully inside the loop before it starts waiting (or would it eventually crash)? I am probably missing something here. Can someone explain this to me please?
Regards,
Janusz
The intent here is for the AcceptCallback method to set the allDone event when a connection is accepted. The loop can then go on to accept another incoming connection while the just accepted one continues with whatever it needs to do.
You could have done other useful work on the listening thread after the call to BeginAccept, if you had any that made sense.
Oddly the documentation does not explicitly state (that I could find) what happens if you just repeatedly call BeginAccept without waiting, but my recommendation would be to not do that.
There is a similar question: "When should I use UdpClient.BeginReceive? When should I use UdpClient.Receive on a background thread?"
In that post, Marc Gravell wrote:
"Another advantage of the async approach is that you can always get the data you need, queue another async fetch, and then process the new data on the existing async thread, giving some more options for parallelism (one reading, one processing)."
Would you be able to give me an example of what you mean with this?
My problem is that I am listening for UDP packets, but I don't have time to process them in the receiving thread. I want to return to my Receive call as soon as possible so that I don't lose any packets in the meantime (since this isn't TCP, the socket will drop any packets I don't receive). What would be the best way to do this?
With asynchronous I/O, you can return to your caller as soon as you start the I/O operation. I/O-bound work is carried out asynchronously by the operating system, and we can take advantage of that.
When you use a blocking API such as UdpClient.Receive on a separate thread to keep your application responsive, that thread will mostly sit blocked, waiting for the Receive call to complete. With async I/O, as Marc said, you can free the thread until the I/O operation completes and do other work in the meantime.
For example, we can use UdpClient.ReceiveAsync, which returns a Task<UdpReceiveResult>. Because a task is awaitable (see this for more on awaitables), we can take advantage of async I/O:
public async Task ReceiveAndDoWorkAsync()
{
    // Initialize client, bound to a local port (11000 is an arbitrary example)
    var udpClient = new UdpClient(11000);
    var receiveTask = udpClient.ReceiveAsync();
    // Do some more work
    // Wait for the operation to complete, meanwhile returning control to the calling method (without creating any new threads)
    await receiveTask;
}
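For the "one reading, one processing" arrangement from the quote, a rough sketch could look like this. The names UdpPump, ReceiveLoopAsync and StartProcessing are made up for illustration, and the choice of BlockingCollection as the hand-off queue is an assumption, not something the quoted post prescribes. The receive loop only enqueues each datagram and goes straight back to ReceiveAsync, while a separate consumer does the slow processing.

```csharp
using System;
using System.Collections.Concurrent;
using System.Net.Sockets;
using System.Threading.Tasks;

static class UdpPump
{
    // Receives datagrams and queues them; no processing happens here, so the
    // loop returns to ReceiveAsync as quickly as possible.
    public static async Task ReceiveLoopAsync(UdpClient client, BlockingCollection<byte[]> queue)
    {
        try
        {
            while (true)
            {
                UdpReceiveResult result = await client.ReceiveAsync();
                queue.Add(result.Buffer);
            }
        }
        catch (ObjectDisposedException) { } // client was closed
        catch (SocketException) { }         // socket failed or was closed
        finally
        {
            queue.CompleteAdding();         // lets the consumer loop finish
        }
    }

    // Processes queued datagrams on a dedicated long-running task.
    public static Task StartProcessing(BlockingCollection<byte[]> queue, Action<byte[]> process)
    {
        return Task.Factory.StartNew(() =>
        {
            foreach (byte[] datagram in queue.GetConsumingEnumerable())
            {
                process(datagram);
            }
        }, TaskCreationOptions.LongRunning);
    }
}
```

Closing the UdpClient is what stops the receive loop in this sketch. If processing falls behind for long, the queue grows, so a bounded BlockingCollection may be worth considering.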
Ok, I think I have understood the whole async/await thing. Whenever you await something, the function you're running returns, allowing the current thread to do something else while the async function completes. The advantage is that you don't start a new thread.
This is not that hard to understand as it's somewhat how Node.JS works, except Node uses a lot of callbacks to make this happen. This is where I fail to understand the advantage however.
The socket class doesn't currently have any Async methods (that work with async/await). I can of course pass a socket to the stream class, and use the async methods there, however this leaves a problem with the accepting of new sockets.
There are two ways of doing this, as far as I know. In both cases I accept new sockets in an infinite loop on the main thread. In the first case I can start a new task for every socket that I accept, and run the stream.ReceiveAsync within that task. However, won't an await actually block that task, since the task will have nothing else to do? Which again will result in more threads spawned on the threadpool, which again is no better than using synchronous methods inside a task?
My second option is to put all accepted sockets in one of several lists (one list per thread), and inside those threads run a loop, running await stream.ReceiveAsync for every socket. This way, whenever I run into an await on one socket's stream.ReceiveAsync, the thread can start receiving from all the other sockets.
I guess my real question is if this is in any way more effective than a threadpool, and in the first case, if it really will be worse than just using the APM methods.
I also know you can wrap APM methods into functions using await/async, but the way I see it, you still get the "disadvantage" of APM methods, with the extra overhead of state machines in async/await.
The async socket API is not based around Task[<T>], so it isn't directly usable from async/await - but you can bridge that fairly easily - for example (completely untested):
using System;
using System.Net.Sockets;
using System.Threading.Tasks;

public class AsyncSocketWrapper : IDisposable
{
    private Socket socket;
    private readonly SocketAsyncEventArgs args;

    public AsyncSocketWrapper(Socket socket)
    {
        this.socket = socket;
        args = new SocketAsyncEventArgs();
        args.Completed += args_Completed;
    }

    public void Dispose()
    {
        var tmp = socket;
        socket = null;
        if (tmp != null) tmp.Dispose();
    }

    // Raised only when the operation completed asynchronously.
    void args_Completed(object sender, SocketAsyncEventArgs e)
    {
        // might want to switch on e.LastOperation
        var source = (TaskCompletionSource<int>)e.UserToken;
        if (ShouldSetResult(source, e)) source.TrySetResult(e.BytesTransferred);
    }

    public Task<int> ReceiveAsync(byte[] buffer, int offset, int count)
    {
        TaskCompletionSource<int> source = new TaskCompletionSource<int>();
        try
        {
            args.SetBuffer(buffer, offset, count);
            args.UserToken = source;
            // ReceiveAsync returns false when it completed synchronously;
            // in that case the Completed event is not raised.
            if (!socket.ReceiveAsync(args))
            {
                if (ShouldSetResult(source, args))
                {
                    return Task.FromResult(args.BytesTransferred);
                }
            }
        }
        catch (Exception ex)
        {
            source.TrySetException(ex);
        }
        return source.Task;
    }

    // Returns true on success; otherwise faults the task with the socket error.
    static bool ShouldSetResult<T>(TaskCompletionSource<T> source, SocketAsyncEventArgs args)
    {
        if (args.SocketError == SocketError.Success) return true;
        var ex = new InvalidOperationException(args.SocketError.ToString());
        source.TrySetException(ex);
        return false;
    }
}
Note: you should probably avoid running the receives in a loop - I would advise making each socket responsible for pumping itself as it receives data. The only thing you need a loop for is to periodically sweep for zombies, since not all socket deaths are detectable.
Note also that the raw async socket API is perfectly usable without Task[<T>] - I use that extensively. While await may have uses here, it is not essential.
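As an illustration of a socket "pumping itself" with the raw SocketAsyncEventArgs API and no Task involved, here is a rough sketch. ReceivePump, onData and onClosed are hypothetical names, and the 4096-byte buffer is an arbitrary choice. It assumes nothing else closes the socket while a receive is in flight.

```csharp
using System;
using System.Net.Sockets;

sealed class ReceivePump
{
    readonly Socket socket;
    readonly SocketAsyncEventArgs args = new SocketAsyncEventArgs();
    readonly Action<byte[], int> onData; // receives the buffer and the byte count
    readonly Action onClosed;

    public ReceivePump(Socket socket, Action<byte[], int> onData, Action onClosed)
    {
        this.socket = socket;
        this.onData = onData;
        this.onClosed = onClosed;
        args.SetBuffer(new byte[4096], 0, 4096);
        // Raised only for receives that completed asynchronously.
        args.Completed += (sender, e) => { if (Process(e)) Pump(); };
    }

    public void Start() { Pump(); }

    void Pump()
    {
        // ReceiveAsync returns false when it completed synchronously, in which
        // case Completed is not raised and the result is handled right here.
        while (!socket.ReceiveAsync(args))
        {
            if (!Process(args)) return;
        }
    }

    // Returns true if the pump should keep going.
    bool Process(SocketAsyncEventArgs e)
    {
        if (e.SocketError != SocketError.Success || e.BytesTransferred == 0)
        {
            socket.Close(); // error, or the peer closed the connection
            onClosed();
            return false;
        }
        onData(e.Buffer, e.BytesTransferred);
        return true;
    }
}
```

Each connection gets its own ReceivePump, so no central loop has to iterate over the sockets.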
This is not that hard to understand as it's somewhat how Node.JS works, except Node uses a lot of callbacks to make this happen. This is where I fail to understand the advantage however.
Node.js does use callbacks, but it has one other significant facet that really simplifies those callbacks: they are all serialized to the same thread. So when you're looking at asynchronous callbacks in .NET, you're usually dealing with multithreading as well as asynchronous programming (except for EAP-style callbacks).
Asynchronous programming using callbacks is called "continuation-passing style" (CPS). It's the only real option for Node.js but is one of many options on .NET. In particular, CPS code can get extremely complex and difficult to maintain, so the async/await compiler transform was introduced so you could write "normal-looking" code and the compiler would translate it to CPS for you.
In both cases I accept new sockets in an infinite loop on the main thread.
If you're writing a server, then yes, somewhere you will be repeatedly accepting new client connections. Also, you should be continuously reading from each connected socket, so each socket also has a loop.
In the first case I can start a new task for every socket that I accept, and run the stream.ReceiveAsync within that task.
You wouldn't need a new task. That's the whole point of asynchronous programming.
My second option is to put all accepted sockets in one of several lists (one list per thread), and inside those threads run a loop, running await stream.ReceiveAsync for every socket.
I'm not sure why you'd need multiple threads, or any dedicated threads at all.
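To make that concrete, here is a small sketch of an echo server with one asynchronous accept loop and one asynchronous read loop per connection, and no dedicated threads or extra tasks started with Task.Run. AsyncEchoServer and its method names are hypothetical, and it uses TcpListener and NetworkStream.ReadAsync rather than a Socket-level API.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading.Tasks;

static class AsyncEchoServer
{
    // Accepts clients until the listener is stopped.
    public static async Task AcceptLoopAsync(TcpListener listener)
    {
        try
        {
            while (true)
            {
                TcpClient client = await listener.AcceptTcpClientAsync();
                // Not awaited on purpose: the accept loop continues while this
                // connection's read loop runs whenever its I/O completes.
                Task ignored = HandleClientAsync(client);
            }
        }
        catch (ObjectDisposedException) { } // listener stopped
        catch (SocketException) { }         // listener stopped or failed
    }

    // One read loop per connection; echoes whatever it receives.
    static async Task HandleClientAsync(TcpClient client)
    {
        using (client)
        {
            try
            {
                NetworkStream stream = client.GetStream();
                var buffer = new byte[4096];
                int read;
                // ReadAsync yields 0 when the peer closes the connection.
                while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
                {
                    await stream.WriteAsync(buffer, 0, read);
                }
            }
            catch (IOException) { }             // connection was reset
            catch (ObjectDisposedException) { } // connection was closed locally
        }
    }
}
```

While a read is pending, no thread is blocked on that connection; a thread-pool thread is only used briefly when data actually arrives.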
You seem a bit confused on how async and await work. I recommend reading my own introduction, the MSDN overview, the Task-Based Asynchronous Pattern guidance, and the async FAQ, in that order.
I also know you can wrap APM methods into functions using await/async, but the way I see it, you still get the "disadvantage" of APM methods, with the extra overhead of state machines in async/await.
I'm not sure what disadvantage you're referring to. The overhead of state machines, while non-zero, is negligible in the face of socket I/O.
If you're looking to do socket I/O, you have several options. For reads, you can either do them in an "infinite" loop using APM or Task wrappers around the APM or Async methods. Alternatively, you could convert them into a stream-like abstraction using Rx or TPL Dataflow.
Another option is a library I wrote a few years ago called Nito.Async. It provides EAP-style (event-based) sockets that handle all the thread marshaling for you, so you end up with something simpler like Node.js. Of course, like Node.js, this simplicity means it won't scale as well as a more complex solution.
When I attempt to send a message from a queue, the BeginSend call seems to behave as a blocking call.
Specifically I have:
public void Send(MyMessage message)
{
    lock (SEND_LOCK) {
        var state = ...
        try {
            log.Info("Begin Sending...");
            socket.BeginSend(message.AsBytes(), 0, message.ByteLength, SocketFlags.None,
                (r) => EndSend(r), state);
            log.Info("Begin Send Complete.");
        }
        catch (SocketException e) {
            ...
        }
    }
}
The callback would be something like this:
private void EndSend(IAsyncResult result) {
    log.Info("EndSend: Ending send.");
    var state = (MySendState) result.AsyncState;
    ...
    state.Socket.EndSend(result, out code);
    log.Info("EndSend: Send ended.");
    WaitUntilNewMessageInQueue();
    SendNextMessage();
}
Most of the time this works fine, but sometimes it hangs. Logging indicates this happens when BeginSend and EndSend are executed on the same thread. WaitUntilNewMessageInQueue blocks until there is a new message in the queue, so when there is no new message it can wait quite a while.
As far as I can tell this should not really be a problem, but in some cases BeginSend blocks, causing a deadlock where EndSend is blocking on WaitUntilNewMessageInQueue (expected), but Send is in turn blocking on BeginSend, as it seems to be waiting for the EndSend callback to return (not expected).
This behaviour was not what I was expecting. Why does BeginSend sometimes block if the callback does not return in timely fashion?
First of all, why are you locking in your Send method? The lock will be released before the send is complete since you are using BeginSend. The result is that multiple sends can be executing at the same time.
Secondly, do not write (r) => EndSend(r), just write EndSend (without any parameters).
Third: You do not need to include the socket in your state. Your EndSend method works like any other instance method, so you can access the socket field directly.
As for your deadlocks, it's hard to tell. Your delegate may have something to do with it (optimizations by the compiler / runtime), but I have no knowledge in that area.
Need more help? Post more code. But I suggest that you fix the issues above (all of them) and try again first.
Which operating system are you running on?
Are you sure you're seeing what you think you're seeing?
The notes on the MSDN page say that Send() CAN block if there's no OS buffer space to initiate your async send unless you have put the socket in non blocking mode. Could that be the case? Are you potentially sending data very quickly and filling the TCP window to the peer? If you break into the debugger what does the call stack show?
The rest is speculation based on my understanding of the underlying native technologies involved...
The notes for Send() are likely wrong about I/O being cancelled if the thread exits. This almost certainly depends on the underlying OS, as it's a low-level I/O completion port/overlapped I/O issue that changed with Windows Vista (see here: http://www.lenholgate.com/blog/2008/02/major-vista-overlapped-io-change.html). Given that they're wrong about that, they could also be wrong about how the completions (calls to EndSend()) are dispatched on later operating systems. From Vista onwards, it's possible that completions could be dispatched on the issuing thread if the .NET sockets wrapper enables the correct options on the socket (see here where I talk about FILE_SKIP_COMPLETION_PORT_ON_SUCCESS). However, if this were the case, you'd likely see this behaviour a lot, since initially most sends are likely to complete 'in line', and so you'd see most completions happening on the same thread. I'm pretty sure that this is NOT the case and that .NET does NOT enable this option without asking.
This is how you check if it completed synchronously so you avoid the callback on another thread.
For a single send:
var result = socket.BeginSend(...);
if (result.CompletedSynchronously)
{
    socket.EndSend(result);
}
For a queue of multiple sends, you can just loop and finalize all synchronous sends:
while (true)
{
    var result = socket.BeginSend(...);
    if (!result.CompletedSynchronously)
    {
        break;
    }
    socket.EndSend(result);
}
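One way to combine these points is a send queue whose completion callback never blocks: if the queue is empty the callback simply returns, and the next Enqueue call restarts sending. This is a different design from the question's WaitUntilNewMessageInQueue approach, offered only as a sketch. SendQueue and its members are hypothetical names, and error handling is reduced to recording the first failure and stopping.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Sockets;

sealed class SendQueue
{
    readonly Socket socket;
    readonly Queue<byte[]> pending = new Queue<byte[]>();
    readonly object gate = new object();
    bool sending;   // true while some thread or callback owns the send chain
    byte[] current; // message currently being sent, or null
    int offset;     // bytes of 'current' already sent

    public SendQueue(Socket socket) { this.socket = socket; }

    // Set if the socket fails; the queue stops sending after that.
    public Exception Error { get; private set; }

    public void Enqueue(byte[] message)
    {
        lock (gate)
        {
            pending.Enqueue(message);
            if (sending) return; // the active chain will pick it up
            sending = true;
        }
        SendNext();
    }

    void SendNext()
    {
        try
        {
            while (true)
            {
                if (current == null)
                {
                    lock (gate)
                    {
                        if (pending.Count == 0) { sending = false; return; }
                        current = pending.Dequeue();
                        offset = 0;
                    }
                }
                IAsyncResult r = socket.BeginSend(current, offset, current.Length - offset,
                                                  SocketFlags.None, OnSent, null);
                if (!r.CompletedSynchronously) return; // OnSent continues the chain
                Complete(r);                           // finished inline; keep looping
            }
        }
        catch (SocketException ex) { Error = ex; }
        catch (ObjectDisposedException ex) { Error = ex; }
    }

    void OnSent(IAsyncResult r)
    {
        if (r.CompletedSynchronously) return; // handled by the loop in SendNext
        try { Complete(r); }
        catch (SocketException ex) { Error = ex; return; }
        catch (ObjectDisposedException ex) { Error = ex; return; }
        SendNext();
    }

    void Complete(IAsyncResult r)
    {
        offset += socket.EndSend(r);
        if (offset >= current.Length) current = null; // whole message sent
    }
}
```

Only one send is in flight at a time, so messages cannot interleave, and no lock is held across BeginSend.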
I am using the TcpClient class in C#.
Each time there is a new tcp connection request, the usual practice is to create a new thread to handle it. And it should be possible for the main thread to terminate these handler threads anytime.
My solution for each of these handler threads is as follows:
1. Check NetworkStream's DataAvailable property
   1.1 If new data is available, read and process it
   1.2 If end of stream, self-terminate
2. Check for a terminate signal from the main thread
   2.1 If the terminate signal is activated, self-terminate
3. Go to 1.
The problem with this polling approach is that all of these handler threads will be taking up significant processor resources and especially so if there is a huge number of these threads. This makes it highly inefficient.
Is there a better way of doing this?
See Asynchronous Server Socket Example to learn how to do this the ".NET way", without creating new threads for each request.
Believe it or not, that 1000-tick sleep (100 microseconds) will really keep things running smoothly.
private readonly Queue<Socket> sockets = new Queue<Socket>();
private readonly object locker = new object();
private readonly TimeSpan sleepTimeSpan = new TimeSpan(1000);
private volatile Boolean terminate;

private void HandleRequests()
{
    Socket socket = null;
    while (!terminate)
    {
        lock (locker)
        {
            socket = null;
            if (sockets.Count > 0)
            {
                socket = sockets.Dequeue();
            }
        }
        if (socket != null)
        {
            // process
        }
        Thread.Sleep(sleepTimeSpan);
    }
}
I remember working on a similar kind of Windows Service. It was an NTRIP Server that could take around 1000 TCP connections and route the data to an NTRIP Caster.
If you have a dedicated server for this application then it will not be a problem unless you add more code to each thread (File IO, Database etc - although in my case I also had Database processing to log the in/out for each connection).
The things to watch out for:
Bandwidth, once the thread count goes up to 600 or so. You will start seeing disconnections when the TCP buffer window is choked for some reason or the available bandwidth falls short.
The operating system on which you are running this application might have some restrictions, which can cause disconnections.
The above might not be applicable in your case, but I just wanted to put it here because I faced these issues during development.
You're right that you do not want all of your threads "busy waiting" (i.e. running a small loop over and over). You either want them blocking, or you want to use asynchronous I/O.
As John Saunders mentioned, asynchronous I/O is the "right way" to do this, since it can scale up to hundreds of connections. Basically, you call BeginRead() and pass it a callback function. BeginRead() returns immediately, and when data arrives, the callback function is invoked on a thread from the thread pool. The callback function processes the data, calls BeginRead() again, and then returns, which releases the thread back into the pool.
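The BeginRead() cycle described above might be sketched as follows. CallbackReader, onData and onClosed are hypothetical names and the buffer size is arbitrary. The callback processes the data, starts the next read, and returns, which releases the thread-pool thread.

```csharp
using System;
using System.IO;
using System.Net.Sockets;

sealed class CallbackReader
{
    readonly NetworkStream stream;
    readonly byte[] buffer = new byte[4096];
    readonly Action<byte[], int> onData; // receives the buffer and the byte count
    readonly Action onClosed;

    public CallbackReader(NetworkStream stream, Action<byte[], int> onData, Action onClosed)
    {
        this.stream = stream;
        this.onData = onData;
        this.onClosed = onClosed;
    }

    // Returns immediately; OnRead runs later when data arrives.
    public void Start()
    {
        stream.BeginRead(buffer, 0, buffer.Length, OnRead, null);
    }

    void OnRead(IAsyncResult ar)
    {
        int read;
        try { read = stream.EndRead(ar); }
        catch (IOException) { read = 0; }             // connection lost
        catch (ObjectDisposedException) { read = 0; } // closed locally

        if (read == 0) { onClosed(); return; }        // end of stream

        onData(buffer, read);

        // Queue the next read, then return to release the thread.
        try { stream.BeginRead(buffer, 0, buffer.Length, OnRead, null); }
        catch (IOException) { onClosed(); }
        catch (ObjectDisposedException) { onClosed(); }
    }
}
```

Terminating a connection from the main thread then amounts to closing its TcpClient, which ends the pending read.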
However, if you'll only be holding a handful of connections open at a time, it's perfectly fine to create a thread for each connection. Instead of checking the DataAvailable property in a loop, go ahead and call Read(). The thread will block, consuming no CPU, until data is available to read. If the connection is lost, or you close it from another thread, the Read() call will throw an exception, which you can handle by terminating your reader thread.
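For the thread-per-connection variant, a minimal sketch might look like this. BlockingReader and onData are hypothetical names. The thread simply blocks in Read() instead of polling DataAvailable, and it ends when Read() returns 0 or throws.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading;

static class BlockingReader
{
    // Starts a background thread that reads from the client until the
    // connection ends, passing each chunk to onData.
    public static Thread Start(TcpClient client, Action<byte[], int> onData)
    {
        var thread = new Thread(() =>
        {
            var buffer = new byte[4096];
            try
            {
                NetworkStream stream = client.GetStream();
                int read;
                // Read() blocks without using CPU until data arrives;
                // it returns 0 when the peer closes the connection.
                while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    onData(buffer, read);
                }
            }
            catch (IOException) { }               // connection lost or closed
            catch (ObjectDisposedException) { }   // closed from another thread
            catch (InvalidOperationException) { } // client was not connected
            finally
            {
                client.Close();
            }
        });
        thread.IsBackground = true;
        thread.Start();
        return thread;
    }
}
```

The main thread keeps the TcpClient reference and closes it to terminate a handler; the exception handlers above are there so the reader thread then exits quietly.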