When using Task what happens if the ThreadPool is full/busy? - c#

When I am using the .Net 4 Task class which uses the ThreadPool, what happens if all Threads are busy?
Does the TaskScheduler create a new thread and extend the ThreadPool maximum number of threads or does it sit and wait until a thread is available?

The maximum number of threads in the ThreadPool is set to around 1000 threads on a .NET4.0 32-bit system. It's less for older versions of .NET. If you have 1000 threads going, say they're blocking for some reason, when you queue the 1001st task, it won't ever execute.
You will never hit the max thread count in a 32 bit process. Keep in mind that each thread takes at least 1MB of memory (that's the size of the user-mode stack), plus any other overhead. You already lose a lot of memory from the CLR and native DLLs loaded, so you'll hit an OutOfMemoryException before you use that many threads.
You can change the number of threads the ThreadPool can use, by calling the ThreadPool.SetMaxThreads method. However, if you're expecting to use that many threads, you have a far larger problem with your code. I do NOT recommend you mess around with ThreadPool configurations like that. You'll most likely just get worse performance.
Keep in mind with Task and ThreadPool.QueueUserWorkItem, the threads are re-used when they're done. If you create a task or queue a threadpool thread, it may or may not create a new thread to execute your code. If there are available threads already in the pool, it will re-use one of those instead of creating (an expensive) new thread. Only if the methods you're executing in the tasks never return, should you worry about running out of threads, but like I said, that's an entirely different problem with your code.

By default, the MaxThreads of the ThreadPool is very high. Usually you'll never get there, your app will crash first.
So when all threads are busy the new tasks are queued and slowly, at most 1 per 500 ms, the TP will allocate new threads.

It won't increase MaxThreads. When there are more tasks than available worker threads, some tasks will be queued and wait until the thread pool provides an available thread. It does some pretty advanced stuff to scale with a large number of cores (work-stealing, thread injection, etc).

Related

What is the advantage of creating a thread outside threadpool?

Okay, So I wanted to know what happens when I use TaskCreationOptions.LongRunning. By this answer, I came to know that for long running tasks, I should use this options because it creates a thread outside of threadpool.
Cool. But what advantage would I get when I create a thread outside threadpool? And when to do it and avoid it?
what advantage would I get when I create a thread outside threadpool?
The threadpool, as it name states, is a pool of threads which are allocated once and re-used throughout, in order to save the time and resources necessary to allocate a thread. The pool itself re-sizes on demand. If you queue more work than actual workers exist in the pool, it will allocate more threads in 500ms intervals, one at a time (this exists to avoid allocation of multiple threads simultaneously where existing threads may already finish executing and can serve requests). If many long running operations are performed on the thread-pool, it causes "thread starvation", meaning delegates will start getting queued and ran only once a thread frees up. That's why you'd want to avoid a large amount of threads doing lengthy work with thread-pool threads.
The Managed Thread-Pool docs also have a section on this question:
There are several scenarios in which it is appropriate to create and
manage your own threads instead of using thread pool threads:
You require a foreground thread.
You require a thread to have a particular priority.
You have tasks that cause the thread to block for long periods of time. The thread pool has a maximum number of threads, so a large
number of blocked thread pool threads might prevent tasks from
starting.
You need to place threads into a single-threaded apartment. All ThreadPool threads are in the multithreaded apartment.
You need to have a stable identity associated with the thread, or to dedicate a thread to a task.
For more, see:
Thread vs ThreadPool
When should I not use the ThreadPool in .Net?
Dedicated thread or thread-pool thread?
"Long running" can be quantified pretty well, a thread that takes more than half a second is running long. That's a mountain of processor instructions on a modern machine, you'd have to burn a fat five billion of them per second. Pretty hard to do in a constructive way unless you are calculating the value of Pi to thousands of decimals in the fraction.
Practical threads can only take that long when they are not burning core but are waiting a lot. Invariably on an I/O completion, like reading data from a disk, a network, a dbase server. And often the reason you'd start considering using a thread in the first place.
The threadpool has a "manager". It determines when a threadpool thread is allowed to start. It doesn't happen immediately when you start it in your program. The manager tries to limit the number of running threads to the number of CPU cores you have. It is much more efficient that way, context switching between too many active threads is expensive. And a good throttle, preventing your program from consuming too many resources in a burst.
But the threadpool manager has the very common problem with managers, it doesn't know enough about what is going on. Just like my manager doesn't know that I'm goofing off at Stackoverflow.com, the tp manager doesn't know that a thread is waiting for something and not actually performing useful work. Without that knowledge it cannot make good decisions. A thread that does a lot of waiting should be ignored and another one should be allowed to run in its place. Actually doing real work.
Just like you tell your manager that you go on vacation, so he can expect no work to get done, you tell the threadpool manager the same thing with LongRunning.
Do note that it isn't quite a bad as it, perhaps, sounds in this answer. Particularly .NET 4.0 hired a new manager that's a lot smarter at figuring out the optimum number of running threads. It does so with a feedback loop, collecting data to discover if active threads actually get work done. And adjusts the optimum accordingly. Only problem with this approach is the common one when you close a feedback loop, you have to make it slow so the loop cannot become unstable. In other words, it isn't particularly quick at driving up the number of active threads.
If you know ahead of time that the thread is pretty abysmal, running for many seconds with no real cpu load then always pick LongRunning. Otherwise it is a tuning job, observing the program when it is done and tinkering with it to make it more optimal.

Is my understanding of the C# threadpool correct?

I'm reading Essential C# 5.0 which says,
The thread pool also assumes that all the work will be relatively
short-running (that is, consuming milliseconds or seconds of processor
time, not hours or days). By making this assumption, it can ensure
that each processor is working full out on a task, and not
inefficiently time slicing multiple tasks. The thread pool attempts to
prevent excessive time slicing by ensuring that thread creation is
"throttled" and so that no one processor is "oversubscrived" with too
many threads.
I've always thought one of the benefits of multithreading was time slicing.
If you >1 processor, than you can concurrently run threads and achieve true multithreading. But unless that's the case, you'd have to resort to time slicing for applications with multiple threads right?
So, if the ThreadPool in C# doesn't time slice, then,
a. Does it mean the ThreadPool is only used to get around the overhead in creating new threads?
b. Does it mean the ThreadPool can't run multiple threads simultaneously unless the processor has multiple cores, where each core can run a single process?
The .NET thread pool will create multiple threads per core, but has heuristics to keep the number of threads as low as possible while performing the maximum amount of work.
This means if your code is CPU-bound, you may end up with a single thread per core. If your code is I/O-bound and blocks, or queues up a huge amount of work items, you may end up with many threads per core.
It's not just thread creation that is expensive: context switching between hundreds of threads takes up a lot of time that you'd rather be spent running your own code. More threads is almost never better.
The quote you mention refers to the rate at which the Threadpool creates new threads when all of the threads that it has already created are already allocated to a task. The new thread creation rate is throttled (and new tasks are queued) to avoid creating many threads (and swamping the CPU) when a large burst of tasks are put on the Threadpool.
The current algorithm does indeed create many threads per CPU core, but it creates them relatively slowly, in the hope that the current backlog of tasks will be quickly satisfied by the threads that it has already created, and adding threads will not be needed.
Q. Does it mean the ThreadPool is only used to get around the overhead in creating new threads?
A. No. Thread creation is usually not the problem. The goal is make your CPU work at 100%, while using as as few concurrently executing threads as it is possible. The low number of threads will avoid excessive context switching, and improve overall execution time.
Q. Does it mean the ThreadPool can't run multiple threads simultaneously unless the processor has multiple cores, where each core can run a single process?
A. It runs multiple threads simultaneously. Ideally it runs exactly one CPU-intensive task per core, in which case you CPU works at 100%

Multithreading - New Thread vs ThreadPool

I have read in several blogs that we should create our own threads for long running, or blocking tasks and not consume from the thread pool.
My question: if I set setmaxthreads to 250 and I have 25 long running tasks, should I still create my own threads? I still have the remainder threads for other small tasks.
If they are long-running tasks, you should not use the ThreadPool at all. You really should not usually tweak the thread pool settings; certainly not to avoid this. Note that the thread pool size is limited for a reason; too many threads running at one time is a bad thing, too.
So, let the ThreadPool do what it's supposed to do, and just create your own thread for your long-running tasks. (assuming you aren't creating dozens or hundreds of these; in which case you have a different problem)

Can you have too many Delegate.BeginInvoke calls at once?

I am cleaning up some old code converting it to work asynchronously.
psDelegate.GetStops decStops = psLoadRetrieve.GetLoadStopsByLoadID;
var arStops = decStops.BeginInvoke(loadID, null, null);
WaitHandle.WaitAll(new WaitHandle[] { arStops.AsyncWaitHandle });
var stops = decStops.EndInvoke(arStops);
Above is a single example of what I am doing for asynchronous work. My plan is to have close to 20 different delegates running. All will call BeginInvoke and wait until they are all complete before calling EndInvoke.
My question is will having so many delegates running cause problems? I understand that BeginInvoke uses the ThreadPool to do work and that has a limit of 25 threads. 20 is under that limit but it is very likely that other parts of the system could be using any number of threads from the ThreadPool as well.
Thanks!
No, the ThreadPool manager was designed to deal with this situation. It won't let all thread pool threads run at the same time. It starts off allowing as many threads to run as you have CPU cores. As soon as one completes, it allows another one to run.
Every half second, it steps in if the active threads are not completing. It assumes they are stuck and allows another one to run. On a 2 core CPU, you'd now have 3 threads running.
Getting to the maximum, 500 threads on a 2 core CPU, would take quite a while. You would have to have threads that don't complete for over 4 minutes. If your threads behave that way then you don't want to use threadpool threads.
The current default MaxThreads is 250 per processor, so effectively there should be no limit unless your application is just spewing out calls to BeginInvoke. See http://msdn.microsoft.com/en-us/library/system.threading.threadpool.aspx. Also, the thread pool will try to reuse existing threads before creating new ones in order to reduce the overhead of creating new threads. If you invokes are all fast you will probably not see the thread pool create a lot of threads.
For long running tasks or blocking tasks it is usually better to avoid the thread pool and managed the threads yourself.
However, trying to schedule that many threads will probably not yield the best results on most current machines.

How to set WCF threads to schedual differently

I'm running a winservice that has 2 main objectives.
Execute/Handle exposed webmethods.
Run Inner processes that consume allot of CPU.
The problem is that when I execute many inner processes |(as tasks) that are queued to the threadpool or taskpool, the execution of the webmethods takes much more time as WCF also queues its executions to the same threadpool. This even happens when setting the inner processes task priority to lowest and setting the webmethods thread priority to heights.
I hoped that Framework 4.0 would improve this, and they have, but still it takes quite allot of time for the system to handle the WCF queued tasks if the CPU is handling other inner tasks.
Is it possible to change the Threadpool that WCF uses to a different one?
Is it possible to manually change the task queue (global task queue, local task queue).
Is it possible to manually handle 2 task queues that behave differently ?
Any help in the subject would be appropriated.
Gilad.
Keep in mind that the ThreadPool harbors two distinct types of threads: worker threads and I/O completion threads. WCF requests will be serviced by I/O threads. Tasks that you run via ThreadPool.QueueUserWorkItem will run on worker threads. So in that respect the WCF requests and the other CPU tasks are working from different queues already.
Some of your performance issues may be caused by your ThreadPool settings. From MSDN:
The thread pool maintains a minimum number of idle threads. For worker threads, the default value of this minimum is the number of processors. The GetMinThreads method obtains the minimum numbers of idle worker and I/O completion threads. When all thread pool threads have been assigned to tasks, the thread pool does not immediately begin creating new idle threads. To avoid unnecessarily allocating stack space for threads, it creates new idle threads at intervals. The interval is currently half a second, although it could change in future versions of the .NET Framework. If an application is subject to bursts of activity in which large numbers of thread pool tasks are queued, use the SetMinThreads method to increase the minimum number of idle threads. Otherwise, the built-in delay in creating new idle threads could cause a bottleneck.
I have certainly experienced the above bottleneck in the past. There is a method called SetMinThreads that will allow you to change these settings. By the way, you mention setting thread priorites; however, I am not familiar with the mechanism for changing thread priorities of the ThreadPool. Could you please elaborate? Also, I've read that setting thread priorities can be fraught with danger.
Coding Horror : Thread Priorities are Evil
By the way, how many processors/cores is your machine running?
Since you are using .NET 4.0, you could run your long running processes through the TPL. By default, the TPL uses the .NET thread pool to execute tasks but you can also provide your own TaskScheduler implementation. Take a look at the example scheduler implementations in the samples for the TPL. I have not used it personally, but the QueuedTaskScheduler seems to assign tasks to a queue and uses its own thread pool to process the tasks. You can use this to define the maximum amount of threads you want to use for your long running tasks.

Categories

Resources