Limit number of processors used in ThreadPool - c#

Is there any way to limit the number of processors that the ThreadPool object will use? According to the docs, "You cannot set the number of worker threads or the number of I/O completion threads to a number smaller than the number of processors in the computer."
So how can I limit my program to not consume all the processors?

After some experiments, I think I have just the thing. I've noticed that the ThreadPool treats the number of processors available to the current process as the number of processors in the system. This can work to your advantage.
I have 4 cores in my CPU. Trying to call SetMaxThreads with 2:
ThreadPool.SetMaxThreads(2, 2);
fails (SetMaxThreads returns false) because I have 4 cores, so the limits remain at their initial values (1023 and 1000 on my system).
However, like I said initially, the ThreadPool only considers the number of processors available to the process, which I can manage using Process.ProcessorAffinity. Doing this:
Process.GetCurrentProcess().ProcessorAffinity = new IntPtr(3);
limits the available processors to the first two cores (since 3 = 11 in binary). Calling SetMaxThreads again:
ThreadPool.SetMaxThreads(2, 2);
should work like a charm (at least it did for me). Just make sure to set the affinity right at program start-up!
Of course, I would not encourage this hack, since your process will then be stuck with a limited number of cores for the rest of its execution.
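
A minimal sketch of the sequence described above, for anyone who wants to try it. TryLimitPool is a hypothetical helper name, the mask assumes a machine with at least two cores, and setting processor affinity is only supported on Windows and Linux. Whether SetMaxThreads accepts the lower limit may differ between runtime versions, so the sketch prints the result instead of assuming it.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class AffinityDemo
{
    // Restricts the process to the cores set in `mask`, then asks the pool to
    // cap worker and I/O threads at `maxThreads`.
    // Returns whatever SetMaxThreads reported (true = limit accepted).
    public static bool TryLimitPool(long mask, int maxThreads)
    {
        Process.GetCurrentProcess().ProcessorAffinity = new IntPtr(mask);
        return ThreadPool.SetMaxThreads(maxThreads, maxThreads);
    }

    public static void Main()
    {
        // 3 = binary 11 = the first two cores, as in the answer above.
        bool accepted = TryLimitPool(mask: 3, maxThreads: 2);

        ThreadPool.GetMaxThreads(out int worker, out int io);
        Console.WriteLine($"Limit accepted: {accepted}; max worker = {worker}, max I/O = {io}");
    }
}
```

If the call reports false, the pool limits are unchanged and the process is still pinned to the two cores, which is exactly the downside mentioned above.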

The thread pool was specifically designed to remove the headache of managing threads manually. The OS task scheduler has been worked on for many years and does a very good job of scheduling tasks for execution. The scheduler considers many factors, such as thread priority, memory location and proximity to the processor. Unless you have a deep understanding of these processes, you are better off not setting thread affinity.
To quote the docs:
Thread affinity forces a thread to run on a specific subset of
processors. Setting thread affinity should generally be avoided,
because it can interfere with the scheduler's ability to schedule
threads effectively across processors. This can decrease the
performance gains produced by parallel processing. An appropriate use
of thread affinity is testing each processor.
It is possible to set a thread's affinity, though (relevant post), for which you will need to create your own thread object (not one from the thread pool). To my limited knowledge, you cannot set the affinity of a thread from the pool.

Since SetMaxThreads doesn't allow you to set the number of threads lower than the number of processors, your next best bet is to write your own TaskScheduler for the TPL that limits concurrency by queuing the tasks. Then, instead of adding items to the thread pool, you create tasks with that scheduler.
See
How to: Create a Task Scheduler That Limits the Degree of Concurrency
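
As a rough illustration of the idea (not the scheduler from the linked article), here is a sketch that uses the built-in ConcurrentExclusiveSchedulerPair, whose ConcurrentScheduler runs at most a given number of tasks at once on top of the default pool. RunAndMeasurePeak is a hypothetical helper name, and the Sleep call stands in for real work.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LimitedSchedulerDemo
{
    // Runs `taskCount` short tasks on a scheduler capped at `maxConcurrency`
    // and returns the highest number of tasks observed running at once.
    public static int RunAndMeasurePeak(int taskCount, int maxConcurrency)
    {
        var pair = new ConcurrentExclusiveSchedulerPair(TaskScheduler.Default, maxConcurrency);
        var factory = new TaskFactory(pair.ConcurrentScheduler);

        int running = 0, peak = 0;
        object gate = new object();

        Task[] tasks = Enumerable.Range(0, taskCount).Select(_ => factory.StartNew(() =>
        {
            lock (gate) { running++; peak = Math.Max(peak, running); }
            Thread.Sleep(50); // stand-in for real work
            lock (gate) { running--; }
        })).ToArray();

        Task.WaitAll(tasks);
        return peak;
    }

    public static void Main()
    {
        // The peak should never exceed the cap passed in.
        Console.WriteLine(RunAndMeasurePeak(taskCount: 8, maxConcurrency: 2));
    }
}
```

The remaining tasks simply wait in the scheduler's queue, so the thread pool itself is left untouched.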

You don't typically limit the thread pool; you throttle the number of threads your program uses at a time. The Task Parallel Library lets you set MaxDegreeOfParallelism, or you can use a Semaphore.
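
A short sketch of both options, assuming a trivial "square each number" workload. The method names are made up for illustration, and SemaphoreSlim is used here as the lightweight, awaitable counterpart of Semaphore.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    // Option 1: cap a parallel loop with ParallelOptions.MaxDegreeOfParallelism.
    public static int[] SquareAll(int[] input, int maxParallel)
    {
        var results = new int[input.Length];
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        Parallel.For(0, input.Length, options, i => { results[i] = input[i] * input[i]; });
        return results;
    }

    // Option 2: let at most `maxParallel` asynchronous operations past the gate.
    public static async Task<int[]> SquareAllAsync(int[] input, int maxParallel)
    {
        using (var gate = new SemaphoreSlim(maxParallel))
        {
            var tasks = input.Select(async x =>
            {
                await gate.WaitAsync();
                try
                {
                    await Task.Delay(10); // stand-in for real asynchronous work
                    return x * x;
                }
                finally
                {
                    gate.Release();
                }
            });
            return await Task.WhenAll(tasks);
        }
    }

    public static void Main()
    {
        int[] input = { 1, 2, 3, 4 };
        Console.WriteLine(string.Join(",", SquareAll(input, 2)));
        Console.WriteLine(string.Join(",", SquareAllAsync(input, 2).GetAwaiter().GetResult()));
    }
}
```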

Related

Is my understanding of the C# threadpool correct?

I'm reading Essential C# 5.0 which says,
The thread pool also assumes that all the work will be relatively
short-running (that is, consuming milliseconds or seconds of processor
time, not hours or days). By making this assumption, it can ensure
that each processor is working full out on a task, and not
inefficiently time slicing multiple tasks. The thread pool attempts to
prevent excessive time slicing by ensuring that thread creation is
"throttled" and so that no one processor is "oversubscribed" with too
many threads.
I've always thought one of the benefits of multithreading was time slicing.
If you have more than one processor, then you can run threads concurrently and achieve true multithreading. But unless that's the case, you'd have to resort to time slicing for applications with multiple threads, right?
So, if the ThreadPool in C# doesn't time slice, then,
a. Does it mean the ThreadPool is only used to get around the overhead in creating new threads?
b. Does it mean the ThreadPool can't run multiple threads simultaneously unless the processor has multiple cores, where each core can run a single process?
The .NET thread pool will create multiple threads per core, but has heuristics to keep the number of threads as low as possible while performing the maximum amount of work.
This means if your code is CPU-bound, you may end up with a single thread per core. If your code is I/O-bound and blocks, or queues up a huge amount of work items, you may end up with many threads per core.
It's not just thread creation that is expensive: context switching between hundreds of threads takes up a lot of time that you'd rather spend running your own code. More threads is almost never better.
The quote you mention refers to the rate at which the thread pool creates new threads when all of the threads it has already created are allocated to a task. The thread creation rate is throttled (and new tasks are queued) to avoid creating many threads (and swamping the CPU) when a large burst of tasks is put on the thread pool.
The current algorithm does indeed create many threads per CPU core, but it creates them relatively slowly, in the hope that the current backlog of tasks will be quickly satisfied by the threads that it has already created, and adding threads will not be needed.
Q. Does it mean the ThreadPool is only used to get around the overhead in creating new threads?
A. No. Thread creation is usually not the problem. The goal is to make your CPU work at 100% while using as few concurrently executing threads as possible. A low number of threads avoids excessive context switching and improves overall execution time.
Q. Does it mean the ThreadPool can't run multiple threads simultaneously unless the processor has multiple cores, where each core can run a single process?
A. It runs multiple threads simultaneously. Ideally it runs exactly one CPU-intensive task per core, in which case your CPU works at 100%.
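
To see the limits these answers refer to on your own machine, a small sketch along these lines can help. ReadLimits is a hypothetical helper name, and the printed values depend on the runtime and hardware.

```csharp
using System;
using System.Threading;

public static class PoolInfo
{
    // Returns the pool's current minimum and maximum worker-thread counts.
    public static (int MinWorker, int MaxWorker) ReadLimits()
    {
        ThreadPool.GetMinThreads(out int minWorker, out int _);
        ThreadPool.GetMaxThreads(out int maxWorker, out int _);
        return (minWorker, maxWorker);
    }

    public static void Main()
    {
        var limits = ReadLimits();
        Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");
        // Up to the minimum, threads are created on demand without delay;
        // beyond it, the pool adds threads slowly, up to the maximum.
        Console.WriteLine($"Worker threads: min = {limits.MinWorker}, max = {limits.MaxWorker}");
    }
}
```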

.Net Parallel, Task APIs Vs regular Threads

As I understand it, the Parallel API uses the thread pool internally and queues up the items for parallel processing. However, when I inspected the execution of one such parallel loop with the SOS debugger, I found that if I have 10 tasks lined up, they might not all run in parallel: the CLR decides how many threads to dispatch for them, so it may be 4, 5 or 6 (the number varied between executions).
However, if my total number of tasks is not very high (say 10) and I want all of them to run in parallel because they are all long-running, then it seems preferable to put them on traditional threads, which ensures one thread per task and that they all run in parallel.
If the number of tasks is large (say 100), then using Parallel or the ThreadPool is the practical solution, since we do not want to create 100 individual threads per process.
Please share your views. I understand that the Parallel API makes parallel programming very easy to implement, but my aim here is different.
By default, the .NET thread pool initializes a number of worker threads that corresponds to the number of logical cores on your machine. It subsequently employs a hill-climbing heuristic that adjusts this number based on the current task workload, firing up new worker threads when a task takes too long to complete.
You are correct in wishing for your long-running tasks to be executed simultaneously through thread oversubscription (i.e. running multiple threads per logical core). In fact, the Task Parallel Library infrastructure (TPL) provides specifically for this scenario through the LongRunning option, which (under the current implementation) spawns a new dedicated thread for each task marked thusly.
Task.Factory.StartNew(myLongRunningAction, TaskCreationOptions.LongRunning);
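
A sketch of how one might check that LongRunning tasks really do run side by side: each task signals a CountdownEvent and then waits for all the others to signal, which can only succeed if every task is running at the same time. AllRunConcurrently is a hypothetical helper name, and the one-thread-per-task behaviour is, as noted above, a detail of the current implementation.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDemo
{
    // Starts `count` LongRunning tasks that each wait until all of them have
    // started. Returns true if they were all running together within `timeout`.
    public static bool AllRunConcurrently(int count, TimeSpan timeout)
    {
        using (var allStarted = new CountdownEvent(count))
        {
            Task<bool>[] tasks = Enumerable.Range(0, count).Select(_ =>
                Task.Factory.StartNew(() =>
                {
                    allStarted.Signal();             // "I am running"
                    return allStarted.Wait(timeout); // block until everyone else is too
                }, TaskCreationOptions.LongRunning)).ToArray();

            // Reading Result blocks until each task has finished.
            return tasks.All(t => t.Result);
        }
    }

    public static void Main()
    {
        Console.WriteLine(AllRunConcurrently(10, TimeSpan.FromSeconds(10)));
    }
}
```

Without the LongRunning option the same test may take much longer, or time out, on a machine with few cores, because the pool adds threads only gradually.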

multithreading using threadPool or new thread() in server app

I have done a bunch of reading on the dynamics and implications of multi-threading in a server app (starving the clr thread pool, etc.) but let's say for sake of argument I have EXACTLY 4 async processes I need to accomplish per request of my (asp.net) page... Now let's say time is the more critical element and my site should not experience heavy traffic. In this scenario, is it preferable to spawn 4 threads using the new Thread() approach or the ThreadPool.QueueUserWorkItem method?
My concern (and my point) here is that the ThreadPool method may create a thread pool that is larger than what I really want. When I only need 4 threads, can't I just spawn them myself to keep the number of allocated app domain and CLR threads minimal?
Spawning a thread is a very costly and therefore high-latency operation. If you want to manage threads yourself, which would be reasonable but not required, you have to build a custom pool.
Using thread pool work items is not without danger because it does not guarantee you a concurrency level of 4. If you happen to get 2 or 3 you will have much more latency in your HTTP request.
I'd use the thread-pool and use SetMinThreads to ensure that threads are started without delay and that there are always enough.
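
A console-style sketch of that suggestion, assuming four independent jobs per request. EnsureMinWorkers and RunFour are hypothetical names, the Sleep call stands in for the real work, and in an actual ASP.NET page you would await the combined task instead of blocking on it.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class FourJobs
{
    // Raises the pool's minimum worker count to at least `needed`, so that
    // many work items can start without the pool's thread-creation delay.
    // Returns false if the pool rejected the new minimum.
    public static bool EnsureMinWorkers(int needed)
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);
        if (minWorker >= needed) return true;
        return ThreadPool.SetMinThreads(needed, minIo);
    }

    // Queues four jobs to the pool and collects their results in order.
    public static int[] RunFour()
    {
        EnsureMinWorkers(4);

        Task<int>[] jobs = Enumerable.Range(1, 4)
            .Select(i => Task.Run(() =>
            {
                Thread.Sleep(100); // stand-in for real work
                return i * 10;
            }))
            .ToArray();

        return Task.WhenAll(jobs).GetAwaiter().GetResult();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", RunFour()));
    }
}
```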
I would definitely go for the ThreadPool approach. It's designed for exactly this kind of scenario. The thread pool will internally manage the number of threads required, making sure not to overburden the system. Quoting from MSDN:
The thread pool provides new worker threads or I/O completion threads
on demand until it reaches the minimum for each category. When a
minimum is reached, the thread pool can create additional threads in
that category or wait until some tasks complete. Beginning with the
.NET Framework 4, the thread pool creates and destroys worker threads
in order to optimize throughput, which is defined as the number of
tasks that complete per unit of time. Too few threads might not make
optimal use of available resources, whereas too many threads could
increase resource contention.
If you're really paranoid you can limit it manually with SetMaxThreads. Going for the manual threading management will only introduce potential bugs.
If you have access to .net 4.0 you can use the TPL Task class (it also uses the ThreadPool under the hood), as it has even more appealing features.

When using Task what happens if the ThreadPool is full/busy?

When I am using the .Net 4 Task class which uses the ThreadPool, what happens if all Threads are busy?
Does the TaskScheduler create a new thread and extend the ThreadPool maximum number of threads or does it sit and wait until a thread is available?
The maximum number of threads in the ThreadPool is set to around 1000 threads on a .NET 4.0 32-bit system. It's less for older versions of .NET. If you have 1000 threads going and they are all blocked for some reason, the 1001st task you queue won't execute until one of them becomes free.
You will never hit the max thread count in a 32 bit process. Keep in mind that each thread takes at least 1MB of memory (that's the size of the user-mode stack), plus any other overhead. You already lose a lot of memory from the CLR and native DLLs loaded, so you'll hit an OutOfMemoryException before you use that many threads.
You can change the number of threads the ThreadPool can use, by calling the ThreadPool.SetMaxThreads method. However, if you're expecting to use that many threads, you have a far larger problem with your code. I do NOT recommend you mess around with ThreadPool configurations like that. You'll most likely just get worse performance.
Keep in mind that with Task and ThreadPool.QueueUserWorkItem, the threads are re-used when they're done. If you create a task or queue a thread-pool work item, it may or may not create a new thread to execute your code. If there are available threads already in the pool, it will re-use one of those instead of creating an (expensive) new thread. Only if the methods you're executing in the tasks never return should you worry about running out of threads, but like I said, that's an entirely different problem with your code.
By default, the MaxThreads of the ThreadPool is very high. Usually you'll never get there; your app will crash first.
So when all threads are busy the new tasks are queued and slowly, at most 1 per 500 ms, the TP will allocate new threads.
It won't increase MaxThreads. When there are more tasks than available worker threads, some tasks will be queued and wait until the thread pool provides an available thread. It does some pretty advanced stuff to scale with a large number of cores (work-stealing, thread injection, etc).

How to set WCF threads to schedule differently

I'm running a winservice that has 2 main objectives.
Execute/Handle exposed webmethods.
Run inner processes that consume a lot of CPU.
The problem is that when I execute many inner processes (as tasks) that are queued to the thread pool or task pool, the execution of the webmethods takes much more time, as WCF also queues its executions to the same thread pool. This even happens when setting the inner processes' task priority to lowest and the webmethods' thread priority to highest.
I hoped that Framework 4.0 would improve this, and it has, but it still takes quite a lot of time for the system to handle the WCF queued tasks if the CPU is handling other inner tasks.
Is it possible to change the Threadpool that WCF uses to a different one?
Is it possible to manually change the task queue (global task queue, local task queue).
Is it possible to manually handle 2 task queues that behave differently ?
Any help on the subject would be appreciated.
Gilad.
Keep in mind that the ThreadPool harbors two distinct types of threads: worker threads and I/O completion threads. WCF requests will be serviced by I/O threads. Tasks that you run via ThreadPool.QueueUserWorkItem will run on worker threads. So in that respect the WCF requests and the other CPU tasks are working from different queues already.
Some of your performance issues may be caused by your ThreadPool settings. From MSDN:
The thread pool maintains a minimum number of idle threads. For worker threads, the default value of this minimum is the number of processors. The GetMinThreads method obtains the minimum numbers of idle worker and I/O completion threads. When all thread pool threads have been assigned to tasks, the thread pool does not immediately begin creating new idle threads. To avoid unnecessarily allocating stack space for threads, it creates new idle threads at intervals. The interval is currently half a second, although it could change in future versions of the .NET Framework. If an application is subject to bursts of activity in which large numbers of thread pool tasks are queued, use the SetMinThreads method to increase the minimum number of idle threads. Otherwise, the built-in delay in creating new idle threads could cause a bottleneck.
I have certainly experienced the above bottleneck in the past. There is a method called SetMinThreads that will allow you to change these settings. By the way, you mention setting thread priorities; however, I am not familiar with the mechanism for changing thread priorities of the ThreadPool. Could you please elaborate? Also, I've read that setting thread priorities can be fraught with danger.
Coding Horror : Thread Priorities are Evil
By the way, how many processors/cores is your machine running?
Since you are using .NET 4.0, you could run your long running processes through the TPL. By default, the TPL uses the .NET thread pool to execute tasks but you can also provide your own TaskScheduler implementation. Take a look at the example scheduler implementations in the samples for the TPL. I have not used it personally, but the QueuedTaskScheduler seems to assign tasks to a queue and uses its own thread pool to process the tasks. You can use this to define the maximum amount of threads you want to use for your long running tasks.
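
To give an idea of what such a scheduler involves, here is a minimal sketch of a TaskScheduler that runs its tasks on its own fixed set of threads, away from the thread pool that WCF uses. DedicatedThreadScheduler is a hypothetical class written for illustration; it is not the QueuedTaskScheduler from the samples and leaves out features such as priorities and cancellation handling.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical scheduler for illustration only.
public sealed class DedicatedThreadScheduler : TaskScheduler, IDisposable
{
    private readonly BlockingCollection<Task> _queue = new BlockingCollection<Task>();
    private readonly Thread[] _threads;

    public DedicatedThreadScheduler(int threadCount)
    {
        _threads = Enumerable.Range(0, threadCount).Select(i =>
        {
            var thread = new Thread(() =>
            {
                // Each dedicated thread drains the shared queue until Dispose
                // marks the queue as complete.
                foreach (Task task in _queue.GetConsumingEnumerable())
                    TryExecuteTask(task);
            })
            { IsBackground = true, Name = "dedicated-" + i };
            thread.Start();
            return thread;
        }).ToArray();
    }

    public override int MaximumConcurrencyLevel => _threads.Length;

    protected override void QueueTask(Task task) => _queue.Add(task);

    // Never run inline, so the work always stays on the dedicated threads.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued) => false;

    protected override IEnumerable<Task> GetScheduledTasks() => _queue.ToArray();

    public void Dispose()
    {
        _queue.CompleteAdding();
        foreach (Thread thread in _threads) thread.Join();
        _queue.Dispose();
    }
}

public static class DedicatedSchedulerDemo
{
    // Reports whether a task started through the scheduler ran on a pool thread.
    public static bool RanOnPoolThread()
    {
        using (var scheduler = new DedicatedThreadScheduler(2))
        {
            var factory = new TaskFactory(scheduler);
            return factory.StartNew(() => Thread.CurrentThread.IsThreadPoolThread).Result;
        }
    }

    public static void Main()
    {
        Console.WriteLine(RanOnPoolThread());
    }
}
```

Tasks created through a TaskFactory bound to this scheduler never occupy pool worker threads, so the CPU-heavy inner processes and the WCF requests no longer compete for the same pool, although they still compete for the same cores.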
