Do async functions run on a new thread while the caller continues with normal execution?
smtp.SendMailAsync(message);
If there are 100 messages in the message list msgList and we loop over it with foreach, will that create 100 threads that run in parallel?
foreach (var item in msgList)
{
    smtp.SendMailAsync(item);
}
Please explain this, and also any performance issues.
And please let me know if there is a better way to send mass emails at once.
Firstly, SendAsync is not TAP; it is the older event-based method and you cannot await it. SendMailAsync is the Task-returning version, which you can await. Secondly, there is no need for a thread to exist while an email is being sent; most of the "wait" time is latency while the server responds. Finally, "is a better way to send mass emails at once"? What problems have you found?
The best way to find out if there are performance problems is to try it.
SendMailAsync and all methods that use the Task Parallel Library execute in a threadpool thread, although you can make them use a new thread if you need to. This means that instead of creating a new thread, an available thread is picked from the pool and returned there when the method finishes.
The number of threads in the pool varies with the version of .NET, the OS, the number of cores and so on. It can go from 25 threads per core in .NET 4 to hundreds per core in .NET 4.5 on a server OS.
An additional optimization for IO-bound (disk, network) tasks is that instead of using a thread, an IO completion port is used. Roughly, this is a callback from the IO stack when an IO operation (disk or network) finishes. This way the framework doesn't waste a thread waiting for an IO call to finish.
When you start an asynchronous network operation, .NET makes the network call, registers for the callback and releases the threadpool thread. When the call finishes, the framework gets notified and schedules the rest of the asynchronous method (essentially what comes after the await or ContinueWith) on a threadpool thread.
Submitting 100 asynchronous operations doesn't mean that 100 threads will be used, nor that all 100 of them will execute in parallel. Rather, the framework will take into account the number of cores, the load and the number of available threads to execute as many of them as possible without hurting overall performance. Waiting on the network calls may not even use a thread at all, while processing the messages themselves will execute on threadpool threads.
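To make the point above concrete, here is a minimal sketch. It does not send mail; Task.Delay stands in for the network wait, and the names AsyncFanOutDemo, RunAsync and SimulatedSendAsync are made up for illustration. It starts a number of simulated operations at once and records which threads ran the code after each await.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncFanOutDemo
{
    // Starts `count` simulated I/O waits at once and reports how many finished
    // and how many distinct threads ran the continuations.
    public static async Task<(int Completed, int DistinctThreads)> RunAsync(int count, int delayMs)
    {
        var threadIds = new ConcurrentDictionary<int, bool>();
        int completed = 0;

        async Task SimulatedSendAsync()
        {
            // Assumption: Task.Delay stands in for the network latency of a send.
            // No thread is blocked while the delay is pending.
            await Task.Delay(delayMs).ConfigureAwait(false);

            // This part runs on whichever threadpool thread picks up the continuation.
            threadIds.TryAdd(Environment.CurrentManagedThreadId, true);
            Interlocked.Increment(ref completed);
        }

        Task[] tasks = Enumerable.Range(0, count).Select(_ => SimulatedSendAsync()).ToArray();
        await Task.WhenAll(tasks).ConfigureAwait(false);
        return (completed, threadIds.Count);
    }
}
```

Calling AsyncFanOutDemo.RunAsync(100, 200) and printing DistinctThreads should normally show far fewer than 100 threads, although the exact number depends on the machine and the current state of the pool.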
SendMailAsync is just a TPL wrapper around the SendAsync method, but neither method uses a thread. Instead they use a model known as an IO completion port (IOCP).
When you call SendMailAsync, your thread writes the mail message to the socket connecting to the SMTP server and registers a callback with the operating system which will be executed when the client receives a response back from the server. This callback is triggered by the "completion" event handled by the IO completion port.
The callback itself is invoked on one of a number of IO completion threads which are managed by the thread-pool. This pool of threads just handles the callbacks from IO completion events. Part of completing the callback marks the Task returned by the call to SendMailAsync as "completed", allowing any awaiting code to start execution in their own context.
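For the loop in the question, a sketch along these lines may be closer to what you want. It awaits each send before starting the next one, because as far as I know a single SmtpClient instance only supports one send in flight at a time. The names MailBatch and SendAllAsync are hypothetical, and the SmtpClient is assumed to be configured by the caller.

```csharp
using System.Collections.Generic;
using System.Net.Mail;
using System.Threading.Tasks;

public static class MailBatch
{
    // Sends the messages one after another on a single SmtpClient.
    // No thread is blocked while each send is waiting on the server.
    // Returns the number of messages that were sent.
    public static async Task<int> SendAllAsync(SmtpClient smtp, IEnumerable<MailMessage> msgList)
    {
        int sent = 0;
        foreach (MailMessage item in msgList)
        {
            // Awaiting here means the next send starts only after this one completes.
            await smtp.SendMailAsync(item).ConfigureAwait(false);
            sent++;
        }
        return sent;
    }
}
```

If you need several sends in flight at once, one option is to use several SmtpClient instances and give each its own share of the list, but whether that helps depends on what the SMTP server allows.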
It's very hard to find a detailed but simple description of worker and I/O threads in .NET.
What's clear to me regarding this topic (but may not be technically precise):
Worker threads are threads that should employ CPU for their work;
I/O threads (also called "completion port threads") should employ device drivers for their work and essentially "do nothing", only monitor the completion of non-CPU operations.
What is not clear:
Although the method ThreadPool.GetAvailableThreads returns the number of available threads of both types, it seems there is no public API to schedule work on an I/O thread. Can you only manually create worker threads in .NET?
It seems that a single I/O thread can monitor multiple I/O operations. Is that true? If so, why does the ThreadPool have so many available I/O threads by default?
In some texts I read that the callback triggered after an I/O operation completes is executed by an I/O thread. Is that true? Isn’t this a job for a worker thread, considering that the callback is a CPU operation?
To be more specific: do ASP.NET asynchronous pages use I/O threads? What exactly is the performance benefit of switching I/O work to a separate thread instead of increasing the maximum number of worker threads? Is it because a single I/O thread monitors multiple operations? Or does Windows do more efficient context switching when using I/O threads?
The term 'worker thread' in .net/CLR typically just refers to any thread other than the Main thread that does some 'work' on behalf of the application that spawned the thread. 'Work' could really mean anything, including waiting for some I/O to complete. The ThreadPool keeps a cache of worker threads because threads are expensive to create.
The term 'I/O thread' in .net/CLR refers to the threads the ThreadPool reserves in order to dispatch NativeOverlapped callbacks from "overlapped" win32 calls (also known as "completion port I/O"). The CLR maintains its own I/O completion port, and can bind any handle to it (via the ThreadPool.BindHandle API). Example here: http://blogs.msdn.com/junfeng/archive/2008/12/01/threadpool-bindhandle.aspx. Many .net APIs use this mechanism internally to receive NativeOverlapped callbacks, though the typical .net developer won't ever use it directly.
There is really no technical difference between 'worker thread' and 'I/O thread' -- they are both just normal threads. But the CLR ThreadPool keeps separate pools of each simply to avoid a situation where high demand on worker threads exhausts all the threads available to dispatch native I/O callbacks, potentially leading to deadlock. (Imagine an application using all 250 worker threads, where each one is waiting for some I/O to complete).
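The two pools described above can be observed through the public ThreadPool API. This is only a small sketch; the class name PoolInfo is made up, and the numbers it reports vary by runtime version, OS and configuration.

```csharp
using System.Threading;

public static class PoolInfo
{
    // Upper limits for the worker pool and the I/O completion pool.
    public static (int Worker, int Io) MaxThreads()
    {
        ThreadPool.GetMaxThreads(out int worker, out int io);
        return (worker, io);
    }

    // How many more threads of each kind could be started right now.
    public static (int Worker, int Io) AvailableThreads()
    {
        ThreadPool.GetAvailableThreads(out int worker, out int io);
        return (worker, io);
    }
}
```

Printing PoolInfo.MaxThreads() and PoolInfo.AvailableThreads() side by side shows that the worker and I/O limits are tracked separately.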
The developer does need to take some care when handling an I/O callback in order to ensure that the I/O thread is returned to the ThreadPool -- that is, I/O callback code should do the minimum work required to service the callback and then return control of the thread to the CLR threadpool. If more work is required, that work should be scheduled on a worker thread. Otherwise, the application risks 'hijacking' the CLR's pool of reserved I/O completion threads for use as normal worker threads, leading to the deadlock situation described above.
Some good references for further reading:
win32 I/O completion ports: http://msdn.microsoft.com/en-us/library/aa365198(VS.85).aspx
managed threadpool: http://msdn.microsoft.com/en-us/library/0ka9477y.aspx
example of BindHandle: http://blogs.msdn.com/junfeng/archive/2008/12/01/threadpool-bindhandle.aspx
I'll begin with a description of how asynchronous I/O is used by programs in NT.
You may be familiar with the Win32 API function ReadFile (as an example), which is a wrapper around the Native API function NtReadFile. This function allows you to do two things with asynchronous I/O:
You can create an event object and pass it to NtReadFile. This event will then be signaled when the read operation completes.
You can pass an asynchronous procedure call (APC) function to NtReadFile. Essentially what this means is that when the read operation completes, the function will be queued to the thread which initiated the operation and it will be executed when the thread performs an alertable wait.
There is, however, a third way of being notified when an I/O operation completes. You can create an I/O completion port object and associate file handles with it. Whenever an operation is completed on a file which is associated with the I/O completion port, the results of the operation (like the I/O status) are queued to the I/O completion port. You can then set up a dedicated thread to remove results from the queue and perform the appropriate tasks, like calling callback functions. This is essentially what an "I/O worker thread" is.
A normal "worker thread" is very similar; instead of removing I/O results from a queue, it removes work items from a queue. You can queue work items (QueueUserWorkItem) and have the worker threads execute them. This prevents you from having to spawn a thread every single time you want to perform a task asynchronously.
Simply put, a worker thread is meant to perform a short period of work and will delete itself when it has completed it. A callback may be used to notify the parent process that it has completed or to pass back data.
An I/O thread will perform the same operation or series of operations continuously until stopped by the parent process. It is so called because, like a device driver, it typically runs continuously and monitors a device port. An I/O thread will typically create Events whenever it wishes to communicate with other threads.
All processes run as threads.
Your application runs as a thread.
Any thread may spawn worker threads or I/O threads (as you call them).
There is always a fine balance between performance and the number or type of threads used. Too many callbacks or Events handled by a process will severely degrade its performance due to the number of interruptions to its main process loop as it handles them.
Examples of a worker thread would be to add data into a database after user interaction or to perform a long mathematical calculation or write data to a file. By using a worker thread you free up the main application, this is most useful for GUIs as it doesn't freeze whilst the task is being performed.
Someone with more skills than me is going to jump in here to help out.
Worker threads have a lot of state, they are scheduled by the processor etc. and you control everything they do.
IO Completion Ports are provided by the operating system for very specific tasks involving little shared state, and thus are faster to use. A good example in .Net is the WCF framework. Every "call" to a WCF service is actually executed by an IO Completion Port because they are the fastest to launch and the OS looks after them for you.
I'm of the belief that you should never have to use Task.Run for any operation in a .NET Core web context. If you have a long-running or CPU-intensive task, you can offload it to a message queue for async processing. If you have a sync operation that has no equivalent async method, then offloading it to a background thread does nothing for you; it actually makes things slightly worse.
What am I missing? Is there a genuine reason to use Task.Run in a high throughput server application?
Some quick examples:
A logging system where each worker thread can write to a queue and a dedicated worker thread is responsible for dequeuing items and writing them to a log file.
To access an apartment-model COM server with expensive initialization, where it may be better to keep a single instance on its own thread.
For logic that runs on a timer, e.g. a transaction that runs every 10 minutes to update an application variable with some sort of status.
CPU-bound operations where individual response time is more important than server throughput.
Logic that must continue to run after the HTTP response has been completed, e.g. if the total processing time would otherwise exceed an HTTP response timeout.
Worker threads for system operations, e.g. a long running thread that checks for expired cache entries.
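As a rough illustration of the first item in the list above, a logging queue with one dedicated writer thread could be sketched like this. QueuedLogger and its sink parameter are hypothetical names; in a real application the sink would append to a log file.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class QueuedLogger : IDisposable
{
    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();
    private readonly Action<string> _sink;
    private readonly Thread _writer;

    // `sink` is whatever actually writes the line (assumption: e.g. a file append).
    public QueuedLogger(Action<string> sink)
    {
        _sink = sink;
        _writer = new Thread(Drain) { IsBackground = true, Name = "log-writer" };
        _writer.Start();
    }

    // Called from any request thread; only enqueues, never touches the sink.
    public void Log(string line) => _queue.Add(line);

    // Runs on the dedicated thread and writes lines in the order they were queued.
    private void Drain()
    {
        foreach (string line in _queue.GetConsumingEnumerable())
        {
            _sink(line);
        }
    }

    // Stops accepting new lines, then waits for the writer to flush the queue.
    public void Dispose()
    {
        _queue.CompleteAdding();
        _writer.Join();
        _queue.Dispose();
    }
}
```

The design choice here is a dedicated thread rather than Task.Run, since the writer lives for the lifetime of the application and would otherwise occupy a pool thread permanently.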
Just to back up your belief:
Do not: Call Task.Run and immediately await it. ASP.NET Core already runs app code on normal Thread Pool threads, so calling Task.Run only results in extra unnecessary Thread Pool scheduling. Even if the scheduled code would block a thread, Task.Run does not prevent that.
This is the official recommendation/best practice from Microsoft. Although it doesn't point out something you might have missed, it does tell you that it is a bad idea and why.
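A small sketch of why the wrapper does not help, assuming Thread.Sleep stands in for a blocking call with no async equivalent (TaskRunDemo is a made-up name): the blocking work still occupies a threadpool thread, it has just been moved from one pool thread to another.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class TaskRunDemo
{
    // Wraps a blocking call in Task.Run and reports what kind of thread ran it.
    public static Task<bool> BlockingWorkRunsOnPoolThreadAsync()
    {
        return Task.Run(() =>
        {
            // Assumption: this sleep represents a synchronous, blocking operation.
            Thread.Sleep(50);

            // The thread blocked above is itself a threadpool thread.
            return Thread.CurrentThread.IsThreadPoolThread;
        });
    }
}
```

In an ASP.NET Core request handler, awaiting this frees the request's pool thread only to block another one for the same duration, plus the cost of the extra scheduling.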
Suppose I call HttpWebRequest.BeginGetRequestStream() method.
I know that this method is asynchronous and that when it finishes, a callback method should be called.
What I'm very sure of is that when the callback method is called, an I/O thread is created by the CLR and this I/O thread runs the callback.
What I'm not sure about is whether the CLR creates any I/O thread when I call HttpWebRequest.BeginGetRequestStream(). Or is just a worker thread created to send the request to the device?
Async IO is thread-less. There is no thread being created or blocked. That is the entire point of using async IO! What would it help you to unblock one thread and block another?
The internals of this have been discussed many times. Basically, the OS notifies the CLR when the IO is done. This causes the CLR to queue the completion callback you specified onto the thread-pool.
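If it helps to see a Begin/End pair without writing the callback by hand, here is a sketch that wraps one in a Task using Task.Factory.FromAsync. It uses FileStream.BeginRead/EndRead rather than HttpWebRequest so that it runs without a network; the class and method names are made up.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class ApmToTask
{
    // Reads up to buffer.Length bytes from the start of the file.
    // FileOptions.Asynchronous asks the OS for overlapped (asynchronous) I/O.
    public static async Task<int> ReadFirstBytesAsync(string path, byte[] buffer)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                       FileShare.Read, 4096, FileOptions.Asynchronous))
        {
            // FromAsync calls BeginRead now and EndRead when the completion arrives.
            // No thread is parked waiting for the read in between.
            return await Task<int>.Factory
                .FromAsync(fs.BeginRead, fs.EndRead, buffer, 0, buffer.Length, null)
                .ConfigureAwait(false);
        }
    }
}
```

The same wrapping approach should apply to other Begin/End pairs, with the code after the await playing the role of the completion callback.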
Short answer: You don't care.
Long answer:
There is no thread. More exactly, there is no thread for each of the asynchronous requests you create. Instead, there's a bunch of I/O threads on the thread pool, which are mostly "waiting" on IOCP - waiting for the kernel to wake them up when data is available.
The important point is that each of these can respond to any notification, they're not (necessarily) selective. If you need to process 1000 responses, you can still do so with just the one thread - and it is only at this point that a thread from the thread pool is requested; it has to execute the callback method. For example, using synchronous methods, you'd have to keep a thousand threads to handle a thousand TCP connections.
Using asynchronous methods, you only need the one IOCP thread, and you only have to spawn (or rather, borrow) new threads to execute the callbacks - which are only being executed after you get new data / connection / whatever, and are returned back to the thread pool as soon as you're done. In practice, this means that a TCP server can handle thousands of simultaneous TCP connections with just a handful of threads. After all, there's little point in spawning more threads than your CPU can handle (which these days is usually around twice as much as you've got cores), and all the things that don't require CPU (like asynchronous I/O) don't require you to spawn new threads. If the handful of processing threads isn't enough, adding more will not help, unlike in the synchronous I/O scenario.
When you're dealing with socket IO using BeginReceive/EndReceive, the callback is invoked by an IOCP thread.
Once you're done receiving you need to process the data.
Should you do it on the callback's calling thread?
Or should you run the task using ThreadPool.QueueUserWorkItem?
Normally samples do the work on the callback thread, which is a little bit confusing.
If you're dealing with several hundred active connections, running the processing on the IOCP thread ends up with a process with hundreds of threads. Would ThreadPool help limiting the number of concurrent threads?
I don't know that there's a generic 'right answer' for everyone - it will depend on your individual use case.
These are some things to consider if you do consider going down the ThreadPool path.
Is Out of Order / Concurrent message processing something you can support?
If Socket A receives messages 1, 2, and 3 in rapid succession - they may end up being processed concurrently, or out of order.
.NET has per-CPU thread pools, if one CPU runs out of work, it may 'steal' tasks from other CPUs. This may mean that your messages might be executed in some random order.
Don't forget to consider what out-of-order processing might do to the client - eg if they send three messages which require responses, sending the responses out of order may be a problem.
If you must process each message in order, then you'll probably have to implement your own thread pooling.
Do you need to be concerned about potential Denial of Service?
If one socket suddenly sends a swarm of data, accepting all of that for processing may clog up internal queues and memory. You may have to limit the amount of un-processed data you accept.
IOCP threads are meant for serving IO operations. If the work executed by your IOCP callback can be long running, or takes locks or uses other facilities that may block execution, you'd better hand it off to a worker thread.
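A hand-off like that might look roughly as follows. This is only a sketch: CallbackHandOff, ProcessOnWorker and the checksum are invented for illustration, and the method is meant to be called from your own EndReceive callback with the received buffer and byte count.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CallbackHandOff
{
    // Call this from the I/O callback: it copies the data, queues the heavy work
    // on a worker thread and returns immediately so the IOCP thread is released.
    public static Task<int> ProcessOnWorker(byte[] received, int count)
    {
        // Copy so the receive buffer can be reused for the next BeginReceive.
        var copy = new byte[count];
        Buffer.BlockCopy(received, 0, copy, 0, count);

        var tcs = new TaskCompletionSource<int>();
        ThreadPool.QueueUserWorkItem(_ =>
        {
            try
            {
                // Placeholder for real processing: a simple byte sum.
                int sum = 0;
                foreach (byte b in copy) sum += b;
                tcs.SetResult(sum);
            }
            catch (Exception ex)
            {
                tcs.SetException(ex);
            }
        });
        return tcs.Task;
    }
}
```

Note that work items queued this way may run concurrently and finish out of order, which is the ordering concern raised in the other answer.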
I'm trying to figure out whether I should use async methods such as:
TcpListener.BeginAcceptTcpClient
TcpListener.EndAcceptTcpClient
and
NetworkStream.BeginRead
NetworkStream.EndRead
as opposed to their synchronous TcpListener.AcceptTcpClient and NetworkStream.Read versions. I've been looking at related threads but I'm still a bit unsure about one thing:
Question: The main advantage of using an asynchronous method is that the GUI is not locked up. However, these methods will be called on separate Task threads as it is so there is no threat of that. Also, TcpListener.AcceptTcpClient blocks the thread until a connection is made so there is no wasted CPU cycles. Since this is the case, then why do so many always recommend using the async versions? It seems like in this case the synchronous versions would be superior?
Also, another disadvantage of using asynchronous methods is the increased complexity and constant casting of objects. For example, having to do this:
private void SomeMethod()
{
    // ...
    listener.BeginAcceptTcpClient(OnAcceptConnection, listener);
}
private void OnAcceptConnection(IAsyncResult asyn)
{
    TcpListener listener = (TcpListener)asyn.AsyncState;
    TcpClient client = listener.EndAcceptTcpClient(asyn);
}
As opposed to this:
TcpClient client = listener.AcceptTcpClient();
Also it seems like the async versions would have much more overhead due to having to create another thread. (Basically, every connection would have a thread and then when reading that thread would also have another thread. Threadception!)
Also, there is the boxing and unboxing of the TcpListener and the overhead associated with creating, managing, and closing these additional threads.
Basically, where normally there would just be individual threads for handling individual client connections, now there is that and then an additional thread for each type of operation performed (reading/writing stream data and listening for new connections on the server's end)
Please correct me if I am wrong. I am still new to threading and I'm trying to understand this all. However, in this case it seems like using the normal synchronous methods and just blocking the thread would be the optimal solution?
TcpListener.AcceptTcpClient blocks the thread until a connection is made so there is no wasted CPU cycles.
But there is also no work getting done. A Thread is a very expensive operating system object, about the most expensive there is. Your program is consuming a megabyte of memory without it being used while the thread blocks on connection request.
However, these methods will be called on separate Task threads as it is so there is no threat of that
A Task is not a good solution either, it uses a threadpool thread but the thread will block. The threadpool manager tries to keep the number of running TP threads equal to the number of cpu cores on the machine. That won't work well when a TP thread blocks for a long time. It prevents other useful work from being done by other TP threads that are waiting to get their turn.
BeginAcceptTcpClient() uses a so-called I/O completion callback. No system resources are consumed while the socket is listening. As soon as a connection request comes in, the operating system runs an APC (asynchronous procedure call) which grabs a threadpool thread to make the callback. The thread itself is in use for, typically, a few microseconds. Very efficient.
This kind of code will get a lot simpler in the next version of C# with the new async and await keywords. End of the year, maybe.
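For comparison, here is a sketch of what the accept looks like with async/await, using TcpListener.AcceptTcpClientAsync. To keep it self-contained it connects to itself over loopback on an OS-assigned port; AcceptDemo and AcceptOneAsync are made-up names.

```csharp
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class AcceptDemo
{
    // Accepts a single loopback connection without blocking a thread while waiting.
    public static async Task<bool> AcceptOneAsync()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0); // port 0: let the OS pick
        listener.Start();
        try
        {
            int port = ((IPEndPoint)listener.LocalEndpoint).Port;

            // Start the accept; no callback method and no AsyncState cast needed.
            Task<TcpClient> pendingAccept = listener.AcceptTcpClientAsync();

            using (var outgoing = new TcpClient())
            {
                await outgoing.ConnectAsync(IPAddress.Loopback, port).ConfigureAwait(false);
                using (TcpClient accepted = await pendingAccept.ConfigureAwait(false))
                {
                    return accepted.Connected;
                }
            }
        }
        finally
        {
            listener.Stop();
        }
    }
}
```

The code reads almost like the synchronous version, but the thread is free to do other work while the accept is pending.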
If you call AcceptTcpClient() on any thread, that thread is useless until you get a connection.
If you call BeginAcceptTcpClient(), the calling thread can stop immediately, without wasting the thread.
This is particularly important when using the ThreadPool (or the TPL), since they use a limited number of pool threads.
If you have too many threads waiting for operations, you can run out of threadpool threads, so that new work items will have to wait until one of the other threads finishes.