IOCP thread vs threadpool to process messages - c#

When you're dealing with socket IO using BeginReceive/EndReceive, the callback is invoked by an IOCP thread.
Once you're done receiving you need to process the data.
Should you do it on the callback's calling thread, or should you queue the work using ThreadPool.QueueUserWorkItem?
Most samples do the work on the callback thread, which is a little confusing.
If you're dealing with several hundred active connections, running the processing on the IOCP threads can leave you with a process running hundreds of threads. Would the ThreadPool help limit the number of concurrent threads?

I don't know that there's a generic 'right answer' for everyone - it will depend on your individual use case.
Here are some things to consider if you do go down the ThreadPool path.
Is Out of Order / Concurrent message processing something you can support?
If Socket A receives messages 1, 2, and 3 in rapid succession - they may end up being processed concurrently, or out of order.
.NET's thread pool uses per-CPU work queues; if one CPU runs out of work, it may 'steal' tasks from other CPUs. This means your messages might be executed in an unpredictable order.
Don't forget to consider what out-of-order processing might do to the client - eg if they send three messages which require responses, sending the responses out of order may be a problem.
If you must process each message in order, then you'll probably have to implement your own thread pooling.
Do you need to be concerned about potential Denial of Service?
If one socket suddenly sends a swarm of data, accepting all of that for processing may clog up internal queues and memory. You may have to limit the amount of un-processed data you accept.
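If per-connection ordering matters (the first point above), one option short of a full custom thread pool is a small per-connection work queue that still borrows threads from the ThreadPool: messages for one connection run strictly in order, while different connections proceed concurrently. A minimal sketch (the class name and design are mine, not from any framework):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Serializes work per connection: give each connection its own instance.
class PerConnectionQueue
{
    private readonly Queue<Action> _pending = new Queue<Action>();
    private bool _running;

    public void Enqueue(Action work)
    {
        lock (_pending)
        {
            _pending.Enqueue(work);
            if (_running) return;   // a drain pass is already scheduled
            _running = true;
        }
        ThreadPool.QueueUserWorkItem(_ => Drain());
    }

    private void Drain()
    {
        while (true)
        {
            Action next;
            lock (_pending)
            {
                if (_pending.Count == 0) { _running = false; return; }
                next = _pending.Dequeue();
            }
            next();   // runs this connection's messages strictly in order
        }
    }
}
```

Only one drain pass runs at a time per instance, so a connection's messages never execute concurrently or out of order, yet no thread is dedicated to an idle connection.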

IOCP threads are meant for servicing IO operations. If the work executed by your IOCP callback can be long-running, or acquires locks or uses other facilities that may block execution, you'd better hand it off to a worker thread.

Related

Why does a blocking thread consume more than async/await?

See this question and answer;
Why use async controllers, when IIS already handles the request concurrency?
OK, a thread consumes more resources than the async/await construct, but why? What is the core difference? You still need to remember all the state etc., don't you?
Why would a thread pool be limited, while you can have tons more idle async/await constructs?
Is it because async/await knows more about your application?
Well, let's imagine a web server. Most of the time, all it does is wait. It usually isn't CPU-bound, but I/O-bound: it waits for network I/O, disk I/O, etc. After each wait it has something to do (usually something very short), and then all it does is wait again. Now, the interesting part is what happens while it waits. In the most "trivial" case (which of course is nothing like production), you would create a thread to deal with every socket you have.
Now, each of those threads has its own cost: some handles, 1MB of stack space... And of course, not all those threads can run at the same time - so the OS scheduler needs to deal with that and choose the right thread to run each time (which means a LOT of context switching). It will work for 1 client. It'll work for 10 clients. But let's imagine 10,000 clients at the same time: 10,000 threads means about 10GB of memory. That's more than the average web server has.
All of these resources are spent because you dedicated a thread to each user. But most of these threads do nothing! They just wait for something to happen. And the OS has APIs for async IO that allow you to just queue an operation that will complete once the IO operation finishes, without a dedicated thread waiting for it.
If you use async/await, you can write an application that uses far fewer threads, with each thread utilized much more - much less "doing nothing" time.
async/await is not the only way of doing this. You could do it before async/await was introduced. BUT, async/await allows you to write code that does it while being very readable and very easy to write, and that looks almost as if it runs on a single thread (without a lot of callbacks and delegates being passed around like before).
By combining the easy syntax of async/await and some features of the OS like async I/O (by using IO completion port), you can write much more scalable code, without losing readability.
Another famous example is WPF/WinForms. You have the UI thread, whose whole job is to process events, and which usually has nothing special to do. But you can't block it, or the GUI will hang and the user won't like it. By using async/await and splitting each "hard" piece of work into short operations, you can achieve a responsive UI and readable code. If you have to run a query against the DB, you start the async operation from the UI thread, then "await" it until it ends and you have results you can process back on the UI thread (because you need to show them to the user, for example). You could do this before, but async/await makes it much more readable.
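The DB-query pattern described above can be sketched in a testable console form. Here `QueryUserCountAsync` is a made-up stand-in for a real DB call, with `Task.Delay` simulating the I/O wait:

```csharp
using System;
using System.Threading.Tasks;

static class UiWork
{
    // Stands in for a DB query (a hypothetical example); Task.Delay
    // simulates the I/O wait - no thread is blocked while it is pending.
    public static async Task<int> QueryUserCountAsync()
    {
        await Task.Delay(50);
        return 42;   // pretend query result
    }
}

// In a WPF/WinForms handler the same shape would look like:
//
//   async void OnQueryClick(object sender, EventArgs e)
//   {
//       int count = await UiWork.QueryUserCountAsync(); // UI thread stays free
//       countLabel.Text = count.ToString();             // back on the UI thread
//   }
```

The method after the `await` resumes on the UI thread (thanks to the captured synchronization context), which is why the results can be shown to the user directly.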
Hope it helps.
Creating a new thread allocates a separate memory area exclusive to that thread, holding its resources - mainly its call stack, which in Windows takes up 1MB of memory by default.
So if you have 1,000 idle threads you are using up at least 1GB of memory doing nothing.
The state for async operations takes memory as well, but it's just the actual size needed for that operation plus the state machine generated by the compiler, and it's kept on the heap.
Moreover, using many threads and blocking them has another cost (which IMO is bigger). When a thread is blocked it is taken out of the CPU and switched with another (i.e. context-switch). That means that your threads aren't using their time-slices optimally when they get blocked. Higher rate of context switching means your machine does more overhead of context-switching and less actual work by the individual threads.
Using async-await appropriately enables using all the given time-slice since the thread, instead of blocking, goes back to the thread pool and takes another task to execute while the asynchronous operation continues concurrently.
So, in conclusion, the resources async await frees up are CPU and memory, which allows your server to handle more requests concurrently with the same amount of resources or the same amount of requests with less resources.
The important thing to realize here is that a blocked thread is not usable to do any other work until it becomes unblocked. A thread that encounters an await is free to return to the threadpool and pick up other work until the value being awaited becomes available.
When you call a synchronous I/O method, the thread executing your code is blocked waiting for the I/O to complete. To handle 1000 concurrent requests, you will need 1000 threads.
When you call an asynchronous I/O method, the thread is not blocked. It initializes the I/O operation and can work on something else. It can be the rest of your method (if you don't await), or it can be some other request if you await the I/O method. The thread pool doesn't need to create new threads for new requests, as all the threads can be used optimally and keep the CPUs busy.
Async I/O operations are actually implemented asynchronously at the OS level.

More TCPClients per Thread or More Threads

I need some guidance on a project we are developing. When triggered, my program needs to contact 1,000 devices by TCP and exchange about 200 bytes of information. All the clients are wireless on a private network. The majority of the time the program will be sitting idle, but then needs to send these messages as quickly as possible. I have come up with two possible methods:
Method 1
Use thread pooling to establish a number of worker threads and have these threads process their way through the 1,000 conversations. One thread handles one conversation until completion. The number of threads in the thread pool would then be tuned for best use of resources.
Method 2
A number of threads would be used to handle multiple conversations per thread. For example a thread process would open 10 socket connections start the conversation and then use asynchronous methods to wait for responses. As a communication is completed, a new device would be contacted.
Method 2 looks like it would be more effective, in that operations wouldn't have to wait while the server device responded. It would also save the overhead of starting and stopping all those threads.
Am I headed in the right direction here? What am I missing or not considering?
There is a well-established way to deal with this problem. Simply use async IO. There is no need to maintain any threads at all. Async IO uses no threads while the IO is in progress.
Thanks to await doing this is quite easy.
The select/poll model is obsolete in .NET.
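With async IO, the 1,000 conversations become 1,000 async methods, with a semaphore capping how many are in flight at once. In this sketch, `ExchangeAsync` is a stand-in for the real `TcpClient.ConnectAsync` plus `NetworkStream` round-trip (`Task.Delay` simulates the network wait), and the names are mine:

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class FanOut
{
    // One conversation with one device. In real code this would be
    // TcpClient.ConnectAsync + NetworkStream.WriteAsync/ReadAsync over the
    // wireless link; Task.Delay stands in for the ~200-byte round-trip.
    static async Task<int> ExchangeAsync(int deviceId)
    {
        await Task.Delay(10);
        return deviceId;
    }

    // Contact every device, capping how many conversations run concurrently.
    public static async Task<int[]> ContactAllAsync(int deviceCount, int maxConcurrent)
    {
        var throttle = new SemaphoreSlim(maxConcurrent);
        var tasks = Enumerable.Range(0, deviceCount).Select(async id =>
        {
            await throttle.WaitAsync();
            try { return await ExchangeAsync(id); }
            finally { throttle.Release(); }
        });
        return await Task.WhenAll(tasks);
    }
}
```

For the scenario in the question you would call `ContactAllAsync(1000, someCap)`: no thread is dedicated to any conversation, and the cap plays the tuning role that the thread-pool size played in Method 1.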

When does the CLR create the IO thread when I call BeginXXX()

Suppose I call HttpWebRequest.BeginGetRequestStream() method.
I know that this method is asynchronous and that when it finishes, a callback method is called.
What I'm fairly sure of is that when the callback method is invoked, an I/O thread provided by the CLR runs the callback.
What I'm not sure about is: when I call HttpWebRequest.BeginGetRequestStream(), is any I/O thread created by the CLR? Or is a worker thread just created to send the request to the device?
Async IO is thread-less. There is no thread being created or blocked. That is the entire point of using async IO! What would it help you to unblock one thread and block another?
The internals of this have been discussed many times. Basically, the OS notifies the CLR when the IO is done. This causes the CLR to queue the completion callback you specified onto the thread-pool.
Short answer: You don't care.
Long answer:
There is no thread. More exactly, there is no thread for each of the asynchronous request you create. Instead, there's a bunch of I/O threads on the thread pool, which are mostly "waiting" on IOCP - waiting for the kernel to wake them up when data is available.
The important point is that each of these can respond to any notification, they're not (necessarily) selective. If you need to process 1000 responses, you can still do so with just the one thread - and it is only at this point that a thread from the thread pool is requested; it has to execute the callback method. For example, using synchronous methods, you'd have to keep a thousand threads to handle a thousand TCP connections.
Using asynchronous methods, you only need the one IOCP thread, and you only have to spawn (or rather, borrow) new threads to execute the callbacks - which are only being executed after you get new data / connection / whatever, and are returned back to the thread pool as soon as you're done. In practice, this means that a TCP server can handle thousands of simultaneous TCP connections with just a handful of threads. After all, there's little point in spawning more threads than your CPU can handle (which these days is usually around twice as much as you've got cores), and all the things that don't require CPU (like asynchronous I/O) don't require you to spawn new threads. If the handful of processing threads isn't enough, adding more will not help, unlike in the synchronous I/O scenario.
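As a concrete illustration of that "handful of threads, thousands of connections" shape, an async accept loop gives each connection its own async method rather than its own thread; a pool thread is only borrowed for the code between awaits. A minimal echo-server sketch (my own names, no timeouts or error handling):

```csharp
using System.Net.Sockets;
using System.Threading.Tasks;

static class EchoServer
{
    // Accept loop: one async method per connection, not one thread each.
    public static async Task RunAsync(TcpListener listener)
    {
        while (true)
        {
            TcpClient client = await listener.AcceptTcpClientAsync();
            _ = HandleAsync(client);   // hand the connection to its own async method
        }
    }

    static async Task HandleAsync(TcpClient client)
    {
        using (client)
        {
            NetworkStream stream = client.GetStream();
            var buffer = new byte[1024];
            int read;
            // A pool thread runs only the code between awaits; during each
            // await, the thread goes back to the pool for other work.
            while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
                await stream.WriteAsync(buffer, 0, read);
        }
    }
}
```

Ten thousand idle connections here cost memory for buffers and state machines, but zero blocked threads.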

Does smtp.SendMailAsync(message) runs in new thread

Do async functions run on a new thread, while normal execution continues?
smtp.SendMailAsync(message);
If there are 100 messages in the message list msgList, and we loop over it with a foreach, will it create 100 threads and run them in parallel?
foreach (var item in msgList)
{
    smtp.SendMailAsync(item);
}
Please explain, including any performance issues.
And please let me know if there is a better way to send mass emails at once.
Firstly, note that SmtpClient.SendMailAsync returns a Task, so it can be awaited (it's the older event-based SendAsync that is not TAP). Secondly, there is no need for a thread to exist while an email is being sent; most of the "wait" time is the latency of the server's response. Finally, about "a better way to send mass emails at once" - what problems have you actually found?
The best way to find out if there are performance problems is to try it.
SendMailAsync and other methods that use the Task Parallel Library execute on threadpool threads, although you can make them use a new dedicated thread if you need to. This means that instead of creating a new thread, an available thread is picked from the pool and returned there when the method finishes.
The number of threads in the pool varies with the version of .NET, the OS, the number of cores, etc. It can go from 25 threads per core in .NET 4 to hundreds per core in .NET 4.5 on a server OS.
An additional optimization for IO-bound (disk, network) tasks is that instead of using a thread, an IO completion port is used. Roughly, this is a callback from the IO stack when an IO operation (disk or network) finishes. This way the framework doesn't waste a thread waiting for an IO call to finish.
When you start an asynchronous network operation, .NET make the network call, registers for the callback and releases the threadpool thread. When the call finishes, the framework gets notified and schedules the rest of the asynchronous method (essentially what comes after the await or ContinueWith) on a threadpool thread.
Submitting 100 asynchronous operations doesn't mean that 100 threads will be used, nor that all 100 of them will execute in parallel. Rather, the framework will take into account the number of cores, the load and the number of available threads to execute as many of them as possible without hurting overall performance. Waiting on the network calls may not use a thread at all, while processing the messages themselves will execute on threadpool threads.
SendMailAsync is just a TPL wrapper around the SendAsync method, but neither method uses a thread. Instead it uses a model known as an IO completion port (IOCP).
When you call SendMailAsync, your thread writes the mail message to the socket connecting to the SMTP server and registers a callback with the operating system which will be executed when the client receives a response back from the server. This callback is triggered by the "completion" event handled by the IO completion port.
The callback itself is invoked on one of a number of IO completion threads which are managed by the thread-pool. This pool of threads just handles the callbacks from IO completion events. Part of completing the callback marks the Task returned by the call to SendMailAsync as "completed", allowing any awaiting code to start execution in their own context.
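Putting that together for the mass-email question: a single `SmtpClient` allows only one send operation at a time, so a common pattern is a small, bounded number of sequential senders draining a shared list. The sketch below makes the send function pluggable so it can run without a real SMTP server; in real code each worker would own its own `SmtpClient` and call `client.SendMailAsync(message)` (the class and method names here are mine):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class MailSender
{
    // Drains the message list with a bounded number of sequential "senders".
    public static async Task SendAllAsync(
        IReadOnlyList<string> messages,
        Func<string, Task> sendAsync,
        int workers = 4)
    {
        int next = -1;
        var loops = Enumerable.Range(0, workers).Select(async _ =>
        {
            int i;
            // Each worker claims the next unsent message and awaits its send;
            // no worker ever has two sends in flight at once.
            while ((i = Interlocked.Increment(ref next)) < messages.Count)
                await sendAsync(messages[i]);
        });
        await Task.WhenAll(loops);
    }
}
```

The `workers` count bounds how many SMTP conversations are in flight; raising it past what the mail server tolerates buys nothing and may get you throttled.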

Server multithreading overkill?

I'm creating a server-type application at the moment which will do the usual listening for connections from external clients and, when they connect, handle requests, etc.
At the moment, my implementation creates a pair of threads every time a client connects. One thread simply reads requests from the socket and adds them to a queue, and the second reads the requests from the queue and processes them.
I'm basically looking for opinions on whether or not you think having all of these threads is overkill, and importantly whether this approach is going to cause me problems.
It is important to note that most of the time these threads will be idle - I use wait handles (ManualResetEvent) in both threads. The Reader thread waits until a message is available and if so, reads it and dumps it in a queue for the Process thread. The Process thread waits until the reader signals that a message is in the queue (again, using a wait handle). Unless a particular client is really hammering the server, these threads will be sat waiting. Is this costly?
I've done a bit of testing - I had 1,000 clients connected and continually nagging the server (so, 2,000+ threads) and it seemed to cope quite well.
I think your implementation is flawed. This kind of design doesn't scale because creating threads is expensive and there is a limit on how many threads can be created.
That is the reason that most implementations of this type use a thread pool. That makes it easy to put a cap on the maximum amount of threads while easily managing new connections and reusing the threads when the work is finished.
If all you are doing with your thread is putting items in a queue, then use the
ThreadPool.QueueUserWorkItem method to use the default .NET thread pool.
You haven't given enough information in your question to say for definite, but perhaps you now only need one other thread, constantly running and clearing down the queue; you can use a wait handle to signal when something has been added.
Just make sure to synchronise access to your queue or things will go horribly wrong.
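With a `BlockingCollection`, both the synchronisation and the wait-handle signalling come for free; one long-running consumer can drain requests posted from any number of reader threads or callbacks. A minimal sketch (class and method names are mine):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class RequestPump
{
    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();

    // Reader threads (or receive callbacks) just add requests to the queue.
    public void Post(string request) => _queue.Add(request);

    // One long-running consumer drains the queue; BlockingCollection does the
    // waiting and signalling that the ManualResetEvent was doing by hand.
    public Task StartProcessing(Action<string> handle)
    {
        return Task.Factory.StartNew(() =>
        {
            foreach (var request in _queue.GetConsumingEnumerable())
                handle(request);
        }, TaskCreationOptions.LongRunning);
    }

    public void Complete() => _queue.CompleteAdding();
}
```

`GetConsumingEnumerable` blocks efficiently when the queue is empty and ends cleanly once `Complete` is called, so there is no busy-waiting and no hand-rolled signalling to get wrong.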
I'd advise the following pattern. First you need a thread pool - built-in or custom. Have one thread that checks whether there is something available to read; if there is, it hands off to a reader thread. The reader thread then puts the data into a queue, and a thread from the pool of processing threads picks it up. This minimizes the number of threads and the time spent in a waiting state.
