Can anyone please enlighten me about current .NET socket techniques?
Non-blocking sockets
If I set Socket.Blocking = false and use async operations - what will happen?
Is there any method of polling multiple non-blocking sockets, instead of checking them for availability one by one (something like the classic select(), or perhaps some IOCP-based mechanism), aside from Socket.Select()?
BeginXXX and SocketAsyncEventArgs
Do they operate on blocking sockets under the hood and just hide the thread creation?
Would manually creating threads be equivalent to using the BeginXXX methods?
Are there any other pros to using SocketAsyncEventArgs, besides the fact that it lets you create a pool of sockets and everything related to them?
And one final question: if an app works as a kind of heavily loaded binary proxy with most logic done in a single thread, what provides better scalability: the non-blocking approach or async operations?
1: Socket.Select should do that, although I don't tend to use that approach personally; in particular, those IList parameters get annoying at high volumes
2: no, it's the other way around; the blocking operations essentially use the non-blocking ones in the background, but with gates. And no, they don't create threads under the hood - unless you count the callback when something is inbound. I have an example here that is serving 12k connections using SocketAsyncEventArgs - the thread count is something like 20. Among the intentions of SocketAsyncEventArgs are that:
it is far easier to pool effectively, without having lots of objects created/collected per operation
you can handle the "data is available now" scenario very efficiently without needing a callback at all (if the method returns false, you are meant to process the data immediately - no callback will be forthcoming)
For scalability: async
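As an illustration of the "no callback will be forthcoming" point above, a minimal receive loop over SocketAsyncEventArgs might look like the sketch below; the buffer size and class shape are my own choices, not taken from the linked example:

```csharp
using System;
using System.Net.Sockets;

class Receiver
{
    private readonly Socket _socket;
    private readonly SocketAsyncEventArgs _args;

    public Receiver(Socket socket)
    {
        _socket = socket;
        _args = new SocketAsyncEventArgs();
        _args.SetBuffer(new byte[4096], 0, 4096);
        _args.Completed += (s, e) => ProcessAndContinue(e);
    }

    public void Start()
    {
        // ReceiveAsync returns false when the operation completed
        // synchronously; no Completed event will fire in that case,
        // so we must process the data inline.
        if (!_socket.ReceiveAsync(_args))
            ProcessAndContinue(_args);
    }

    private void ProcessAndContinue(SocketAsyncEventArgs e)
    {
        // Loop instead of recursing, so a burst of synchronous
        // completions cannot grow the stack.
        while (true)
        {
            if (e.SocketError != SocketError.Success || e.BytesTransferred == 0)
                return; // connection failed or was closed

            HandleData(e.Buffer, e.Offset, e.BytesTransferred);

            if (_socket.ReceiveAsync(e))
                return; // pending: will resume via the Completed event
        }
    }

    private void HandleData(byte[] buffer, int offset, int count)
    {
        // application-specific processing goes here
    }
}
```

Note how a single SocketAsyncEventArgs instance is reused across operations, which is exactly the pooling-friendly property mentioned above.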
In the typical .NET world, we use the event-based asynchronous pattern (event handlers) for most I/O operations. More specifically, as I understand it, the I/O completion port was introduced to improve the efficiency of scheduling threads, like the ThreadPool, so that we don't need to manually maintain (create and destroy) threads to handle massive numbers of I/O responses.
Meanwhile, I naturally assumed that waiting for an I/O response doesn't need to block any thread on a modern Windows system, thanks to hardware interrupts - until I saw some pieces of C++ code in my recent project, and even some sample code on the web.
(I don't have any C++ experience.)
The first code piece is about listening on a serial port; the pseudo-C++ code (written here in C# style) is like:
    // poll the buffer status in a loop
    while (serialPort.Buffer.Count == 0)
    {
        Thread.Sleep(100);
    }
    byte[] data = serialPort.Buffer;
    // process the actual data...
The second code piece is about the usage of an I/O completion port in C++:
    while (::GetQueuedCompletionStatus(port,
                                       &bytesCopied,
                                       &completionKey,
                                       &overlapped,
                                       INFINITE))
    {
        if (0 == bytesCopied && 0 == completionKey && 0 == overlapped)
        {
            break;
        }
        else
        {
            // Process completion packet
        }
    }
Obviously, they both block the thread.
So my questions are:
Why doesn't this code use the event-based, non-thread-blocking way?
If .NET uses something like the second sample's code under the hood, are there actually threads blocked when doing I/O operations?
(Maybe a little off topic) Does a .NET I/O operation callback allow concurrent re-entry while the previous callback is still executing? (From my limited tests, the answer is no.) And why?
Well, first, blocking is not, in itself, bad. The 'main' GUI thread in a Windows app fires its 'OnClick' etc. events in response to messages received from a Windows message queue - a blocking producer-consumer queue. When no messages are received, the thread blocks on the queue. Same with most 'non-blocking' select() based servers - select is a blocking call (though it can be made to poll by setting a low/zero timeout - a poor design).
1) Asynchronous designs are intrinsically more complex. Per-socket context data, (eg. buffers), cannot be maintained in stack-based auto vars and must be maintained across events by either maintaining a global container of context objects, (that have to be looked up by socket handle in the events when they are fired), or by issuing context objects with the I/O requests and retrieving them from callback parameters in the events. Asynchronous designs should be totally asynchronous - calling anything that might block for any extended period has to be avoided if possible. Calls to opaque external libraries, DB queries and the like can be troublesome in this respect, blocking the supposedly asynchronous thread and preventing it from responding to events.
The first code snippet is just horrible and I struggle to find any justification for it at all. The sleep() loop polling has a built-in average 50ms latency in responding to input. Just mega-lame when better sync and async solutions exist. A dedicated read thread, queued APCs (completion routines), and IOCP are all available for serial ports.
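For contrast, an event-driven version of the serial-port read in C# needs no polling loop at all. This is only a sketch; the port name and communication settings are placeholders for whatever your device actually uses:

```csharp
using System;
using System.IO.Ports;

class SerialReader
{
    static void Main()
    {
        // "COM3" and the settings below are placeholder values.
        using (var port = new SerialPort("COM3", 9600, Parity.None, 8, StopBits.One))
        {
            // DataReceived fires on a threadpool thread when bytes arrive,
            // so no thread has to sit in a sleep/poll loop.
            port.DataReceived += (sender, e) =>
            {
                var p = (SerialPort)sender;
                var data = new byte[p.BytesToRead];
                p.Read(data, 0, data.Length);
                // process the actual data...
            };
            port.Open();
            Console.ReadLine(); // keep the main thread alive
        }
    }
}
```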
The second code-snippet IS, effectively, event-based async. You could make it look even more 'event-based' by having the handler threads call an event-handler with the parameters returned by the completion message.
IOCP is the preferred high-performance I/O system for Windows. It can handle many types of I/O operations, and its threadpool-based handlers can withstand occasional blocking or lengthy operations without holding up the processing of further I/O completions. Passing user buffers in with the call allows the driver(s) to load them directly in kernel space and removes a layer of copying. What it does not do is avoid the need to maintain context across the asynchronous calls.
Synchronous thread-per-client is commonly used where the requirements for scalability are swamped by the simple in-line code and immunity from blocking calls that are inherent in such designs. Handling serial comms is not something where scalability to thousands of ports is ever an issue.
2) IOCP handler threads block while waiting for completion messages, sure. If there is nothing to be done, threads should block:)
3) They should do. Adding on an extra layer of signaling to ensure that the callbacks are handled serially involves more overhead and adds back in the vulnerability to any kind of blocking in the callback holding up the handling of other callbacks from other IOCP handler threads that would not need to block. Since context is passed in as a parameter, there is no intrinsic requirement for IOCP-driven callbacks to be run in a serial manner. The code in the callback handler can just operate on the passed information in the manner of a state-machine.
That said, I would not be surprised if MS .NET did indeed provide signaling/queueing to enforce serial, non-reentrant callbacks. Insufficiently experienced devs often do things in multithreaded callbacks that they should not do, e.g. accessing global/persistent state without any locking, or accessing thread-bound GUI controls directly. Serializing the calls, either by wrapping them in Windows messages or otherwise, removes this risk at the expense of performance.
Probably because asynchronous programming is hard.
.NET mostly exposes both a synchronous and an asynchronous operation for each I/O method when it can. So, for example, you have TcpClient.Connect and TcpClient.ConnectAsync/TcpClient.BeginConnect. Whatever starts with "Begin" or ends with "Async" is at least supposed to be async, which means there are no blocked threads. TcpClient.Connect is blocking and hence less scalable.
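A minimal side-by-side sketch of the two styles (host and port are placeholders):

```csharp
using System.Net.Sockets;
using System.Threading.Tasks;

class ConnectDemo
{
    // Blocking: the calling thread is stuck until the TCP handshake
    // completes or times out.
    static void ConnectBlocking(string host, int port)
    {
        var client = new TcpClient();
        client.Connect(host, port); // thread blocked here
    }

    // Asynchronous: no thread is blocked while the handshake is in
    // flight; the continuation runs when the OS signals completion.
    static async Task ConnectAsync(string host, int port)
    {
        var client = new TcpClient();
        await client.ConnectAsync(host, port);
    }
}
```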
I'm not really sure, but I think they can. The question is: why would you want to do that? And how would you match a callback with its call?
I have a C# application which listens for incoming TCP connections and receives data from previously accepted connections. Please help me decide whether to use the thread pool or async methods to write the program. Note that once a connection is accepted, it isn't closed; the app continuously receives data from that connection while at the same time accepting more connections.
A threadpool thread works best when the code takes less than half a second and does not do a lot of I/O that would block the thread. Which is exactly the opposite of the scenario you describe.
Using Socket.BeginReceive() is strongly indicated here. Highly optimized at both the operating system level and in the framework, it lets your program use a single thread to wait for all pending reads to complete. Scaling to handle thousands of active connections is quite feasible.
Writing asynchronous code cleanly can be quite difficult: variables that you'd normally make local in a method running on the threadpool thread turn into fields of a class, and you need a state machine to keep track of the connection state. You'll greatly benefit from the async/await support available in C# version 5, which allows you to turn those state variables back into local variables. The little wrappers you find in this answer or this blog post will help a great deal.
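A hedged sketch of what that looks like: the Begin/End pair wrapped into a Task via TaskFactory.FromAsync, so the buffer and byte count stay local variables instead of class fields (buffer size and method names are my own choices):

```csharp
using System.Net.Sockets;
using System.Threading.Tasks;

class Client
{
    // With async/await the compiler builds the state machine for you:
    // buffer and received remain locals instead of becoming fields of
    // a hand-written callback class.
    static async Task ReceiveLoopAsync(Socket socket)
    {
        var buffer = new byte[4096];
        while (true)
        {
            int received = await Task.Factory.FromAsync(
                socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, null, null),
                socket.EndReceive);
            if (received == 0)
                break; // remote side closed the connection
            Process(buffer, received);
        }
    }

    static void Process(byte[] buffer, int count) { /* application logic */ }
}
```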
It mainly depends on what you want to do with your connections. If you have an unknown number of connections and you don't know how long they will stay open, I think it's better to do it with async calls.
But if you at least know the average number of connections, and the connections are short-lived like a web server's, then it's better to use the threadpool, since you won't waste time creating a thread for each socket.
First off, if you possibly can, don't use TCP/IP. I recommend you self-host WebAPI and/or SignalR instead. But if you do decide to use TCP/IP...
You should always use asynchronous APIs for sockets. Ideally, you want to be constantly reading from the socket and periodically writing (keepalive messages, if nothing else). What you don't want to do is to have time where you're only reading (e.g., waiting for the next message), or time where you're only writing (e.g., sending a message). When you're reading, you should be periodically writing; and when you're writing, you should be continuously reading.
This helps you detect half-open connections, and also avoids deadlocks.
You may find my TCP/IP .NET Sockets FAQ helpful.
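One possible shape for the "always reading, periodically writing" pattern described above, using async/await; the keepalive framing and interval here are placeholder choices, not part of any particular protocol:

```csharp
using System;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

class Connection
{
    // Run both loops for the lifetime of the connection: reading
    // continuously and writing a keepalive periodically. If either
    // loop faults, the connection should be torn down.
    static Task RunAsync(NetworkStream stream)
    {
        return Task.WhenAll(ReadLoopAsync(stream), KeepaliveLoopAsync(stream));
    }

    static async Task ReadLoopAsync(NetworkStream stream)
    {
        var buffer = new byte[4096];
        while (true)
        {
            int n = await stream.ReadAsync(buffer, 0, buffer.Length);
            if (n == 0) break; // remote side closed
            // dispatch the received message here...
        }
    }

    static async Task KeepaliveLoopAsync(NetworkStream stream)
    {
        byte[] ping = Encoding.ASCII.GetBytes("PING\n"); // placeholder framing
        while (true)
        {
            await Task.Delay(TimeSpan.FromSeconds(30));
            await stream.WriteAsync(ping, 0, ping.Length);
        }
    }
}
```

If the periodic write fails, you have detected a half-open connection that a read-only loop might never notice.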
Definitely use asynchronous sockets... it's never a good idea to block a thread waiting for I/O.
If you decide you have high performance needs, you should consider using the EAP design pattern for your sockets.
This will allow you to create an asynchronous solution with a lower memory profile. However, some find that using events with sockets is awkward and a bit clunky... if you fall into this category, you could take a look at this blog post on using .NET 4.5's async/await keywords with it: http://blogs.msdn.com/b/pfxteam/archive/2011/12/15/10248293.aspx#comments
I'm writing an application in C# with a server and some clients (no more than 60), and I would like to be able to deal with each client independently. The communication between server and client is simple, but I have to wait for some ACKs and I don't want to block any query.
So far, I've done two versions of the server side, one it's based on this:
http://aviadezra.blogspot.com.es/2008/07/code-sample-net-sockets-multiple.html
and in the other one I basically create a new thread for each client. Both versions work fine... but I would like to know the pros and cons of the two methods.
Is there any programming pattern to follow in this sort of situation?
To answer your question: it's both. You have threads, and classes running in those threads. Whether you use WCF, async, sockets, or whatever, you will be running some object in a thread (or shuffled around a threadpool, as with async). With WCF you can configure the concurrency model, and if you have to wait for ACKs or other acknowledgements, you'd be best to set it to multiple threads so you don't block other requests.
In the example you linked to, the author uses AsyncCallback as the mechanism for telling you that a socket has data. But from MSDN you can see:
Use an AsyncCallback delegate to process the results of an asynchronous operation in a separate thread
So it's really no different for small-scale apps. Using async like this can help you avoid allocating stack space for each thread; if you were writing a large application, this would matter. But for a small app I think it just adds complexity. C# 5 (.NET 4.5) and F# do a cleaner job with async, so if you can use something like that, maybe go for it.
Doing it the way you have, you have a single thread responsible for socket management. It sits and accepts new connections; when it gets one, it hands that socket to a new dedicated thread that then sits on that socket and reads from it. This thread is your client connection. I like to encapsulate the socket-client reading into a base class that does the low-level I/O required and then acts as a router for requests, i.e. when I get request XYZ, I'll do request ABC. You can even have it dispatch events and subscribe to those events elsewhere (as in the async example). Now you've decoupled your client logic from your socket-reading logic.
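The accept-then-dedicate-a-thread scheme described above might be sketched like this; the port number and buffer size are arbitrary placeholders:

```csharp
using System.Net;
using System.Net.Sockets;
using System.Threading;

class Server
{
    static void Main()
    {
        var listener = new TcpListener(IPAddress.Any, 9000); // port is arbitrary
        listener.Start();
        while (true)
        {
            // The accept thread does nothing but accept...
            TcpClient client = listener.AcceptTcpClient();
            // ...and hands each connection to a dedicated thread.
            var t = new Thread(() => HandleClient(client)) { IsBackground = true };
            t.Start();
        }
    }

    static void HandleClient(TcpClient client)
    {
        using (client)
        using (var stream = client.GetStream())
        {
            var buffer = new byte[4096];
            int n;
            while ((n = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                // route the request to application logic here
            }
        }
    }
}
```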
If you do things with WCF you don't need sockets and all that extra handling, but you should still be aware that calls are multi-threaded and properly synchronize your application when applicable.
For 60 clients I think you should choose whatever works best for you. WCF is easy to set up and easy to work with; I'd use that, but sockets are fine too. If you are concerned about the number of threads running, don't be: while it's bad to have too many runnable threads, most of your threads will actually be blocked while they are waiting on I/O. Threads in a wait state aren't scheduled by the OS and don't really matter. Not to mention the waiting is most likely using I/O completion ports under the hood, so the wait overhead is pretty much negligible for a small application like yours.
In the end, I'd go with whatever is easiest to write, maintain, and extend.
According to this post:
How to write a scalable Tcp/Ip based server
jerrylvl states:
----------*
Processing
When you get the callback from the Begin call you made, it is very important to realise that the code in the callback will execute on the low-level IOCP thread. It is absolutely essential that you avoid lengthy operations in this callback. Using these threads for complex processing will kill your scalability just as effectively as using 'thread-per-connection'.
The suggested solution is to use the callback only to queue up a work item to process the incoming data, that will be executed on some other thread. Avoid any potentially blocking operations inside the callback so that the IOCP thread can return to its pool as quickly as possible. In .NET 4.0 I'd suggest the easiest solution is to spawn a Task, giving it a reference to the client socket and a copy of the first byte that was already read by the BeginReceive call. This task is then responsible for reading all data from the socket that represent the request you are processing, executing it, and then making a new BeginReceive call to queue the socket for IOCP once more. Pre .NET 4.0, you can use the ThreadPool, or create your own threaded work-queue implementation.
----------*
My question is: how exactly would I do this in .NET 4.0? Could someone please provide a code example that would work well in a scalable environment?
Thanks!
My question was answered in more depth here: C# - When to use standard threads, ThreadPool, and TPL in a high-activity server
This goes into specifics on using the TPL, ThreadPool, and standard threads to perform work items, and when to use each method.
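For completeness, here is a sketch of the pattern from the quoted answer in .NET 4.0 terms: the BeginReceive callback copies the data out, queues a Task, and immediately re-arms the receive so the IOCP thread can return to its pool (buffer size and names are my own):

```csharp
using System;
using System.Net.Sockets;
using System.Threading.Tasks;

class IocpHandler
{
    private readonly Socket _socket;
    private readonly byte[] _buffer = new byte[4096];

    public IocpHandler(Socket socket) { _socket = socket; }

    public void Start()
    {
        _socket.BeginReceive(_buffer, 0, _buffer.Length, SocketFlags.None, OnReceive, null);
    }

    // Runs on a low-level IOCP thread: do as little as possible here.
    private void OnReceive(IAsyncResult ar)
    {
        int received = _socket.EndReceive(ar);
        if (received == 0) return; // connection closed

        // Copy the data out and hand the work to the TPL so the
        // IOCP thread is released immediately.
        var data = new byte[received];
        Buffer.BlockCopy(_buffer, 0, data, 0, received);
        Task.Factory.StartNew(() => ProcessRequest(data));

        // Queue the socket for IOCP once more.
        _socket.BeginReceive(_buffer, 0, _buffer.Length, SocketFlags.None, OnReceive, null);
    }

    private void ProcessRequest(byte[] data)
    {
        // potentially lengthy application processing happens here,
        // safely off the IOCP thread
    }
}
```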
I am trying to make an app that will pass data between two servers, Connection1 and Connection2, using sockets. What I would like to do is receive data from Connection1 and pass it to Connection2, and vice versa. Connection1 and Connection2 are on different threads. What is the best way to call methods on different threads in order to pass data back and forth between them? Both threads will use the same message object type to communicate in both directions.
Thanks
You should use immutable data transfer objects.
As long as a simple object is deeply immutable (meaning that neither it nor any of its properties can change), there is nothing wrong with using it on multiple threads.
To pass the instances between threads, you might want to use a pseudo-mutable thread-safe stack. (This depends on your design)
If .NET 4 is an option, I'd strongly recommend having a look at the ConcurrentQueue<T> and possibly even wrapping it with a BlockingCollection<T> if that suits your needs.
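A minimal producer/consumer sketch with BlockingCollection<T>, which wraps ConcurrentQueue<T> by default:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class Pipeline
{
    static void Main()
    {
        // BlockingCollection<T> gives a thread-safe queue with
        // blocking Take semantics over a ConcurrentQueue<T>.
        using (var queue = new BlockingCollection<string>())
        {
            var consumer = Task.Factory.StartNew(() =>
            {
                // GetConsumingEnumerable blocks until an item arrives
                // and exits cleanly once CompleteAdding is called.
                foreach (string message in queue.GetConsumingEnumerable())
                    Console.WriteLine("received: " + message);
            });

            queue.Add("hello");   // producer thread(s) call Add
            queue.Add("world");
            queue.CompleteAdding();
            consumer.Wait();
        }
    }
}
```

The immutable message objects from the accepted answer are what you would put on such a queue.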
That depends on what those threads are doing. While passing data between threads is relatively straightforward, waking the threads to process the data can be more tricky. When you design communication with a thread-per-connection paradigm, your thread is almost all the time stuck in a Read method, like Socket.Receive. While in this state, other threads cannot actually wake this thread to have it send the data they want sent. One solution is to have the Receive time out every second and check whether there is data to transmit, but that just plain sucks.
Another idea is to have two threads per socket: one to Send, one to Receive. But then all the advantages of having a thread per socket are gone: you no longer have simple state management of the 'session' in the thread code; you have state shared between two threads, and it's just a mess.
You can consider using async Receive instead: the socket thread posts a BeginReceive then waits on an event. The event is signaled by either the Receive completing or by the send queue having something 'dropped' in (or you can wait on multiple events, same thing basically). Now this would work, but at this point you have a half-breed, part async, part one-thread-per-socket. If you go down this path, I'd go the whole nine yards: make the server fully async.
Going fully async would be the best solution. Instead of exchanging data between threads, completion routines operate on locked data. The Connection1 BeginReceive completes when it receives data; you parse the received data and analyze the content, then decide to send it on Connection2. So you invoke BeginSend on Connection2's socket, meaning the thread that received the data also sends the data. This is much more efficient and scales better than the thread-per-socket model, but the big disadvantage is that it is just plain complicated if you're not familiar with async and multithreaded programming.
See Asynchronous Server Socket Example and Asynchronous Client Socket Example for a primer.
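A rough sketch of one direction of the fully async forwarding described above; a real proxy would create two of these, one per direction, and add error handling (the buffer size and class shape are mine):

```csharp
using System;
using System.Net.Sockets;

class Forwarder
{
    private readonly Socket _from;
    private readonly Socket _to;
    private readonly byte[] _buffer = new byte[8192];

    public Forwarder(Socket from, Socket to) { _from = from; _to = to; }

    public void Start()
    {
        _from.BeginReceive(_buffer, 0, _buffer.Length, SocketFlags.None, OnReceive, null);
    }

    private void OnReceive(IAsyncResult ar)
    {
        int received = _from.EndReceive(ar);
        if (received == 0) return; // connection closed

        // The thread that completed the receive also initiates the send:
        // no hand-off between threads, no per-connection thread at all.
        _to.BeginSend(_buffer, 0, received, SocketFlags.None, OnSent, null);
    }

    private void OnSent(IAsyncResult ar)
    {
        _to.EndSend(ar);
        // Only re-arm the receive after the send completes, so the
        // single buffer is never used by both operations at once.
        _from.BeginReceive(_buffer, 0, _buffer.Length, SocketFlags.None, OnReceive, null);
    }
}
```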
What you are describing is asynchronous messaging. Microsoft has already written an app for this, called MSMQ.
I would use WCF on .NET 3.5 for this task; it will be more scalable. I'm using WCF for a lot of my work and it's flawless. The good thing about it is you can share your data across any platform.
http://msdn.microsoft.com/en-us/netframework/aa663324.aspx