Does System.Threading.Tasks.Parallel create new threads in the thread pool? - c#

Maybe I did not understand it right ... this whole Parallel class issue :(
But from what I am reading now, I understand that when I use Parallel, I actually mobilize all the threads that exist in the ThreadPool for some task/mission.
For example:
var arrayStrings = new string[1000];
Parallel.ForEach<string>(arrayStrings, someString =>
{
    DoSomething(someString);
});
So the Parallel.ForEach in this case is mobilizing all the threads that exist in the ThreadPool for the 'DoSomething' task/mission.
But will the call to Parallel.ForEach create any new threads at all?
It's clear that there will not be 1000 new threads. But suppose the ThreadPool has released all the threads it holds; in that case, will Parallel.ForEach create any new threads?

Short answer: Parallel.ForEach() does not "mobilize all the threads". And any operation that schedules some work on the ThreadPool (which Parallel.ForEach() does) can cause the creation of new threads in the pool.
Long answer: To understand this properly, you need to know how three levels of abstraction work: Parallel.ForEach(), TaskScheduler and ThreadPool:
Parallel.ForEach() (and Parallel.For()) schedule their work on a TaskScheduler. If you don't specify a scheduler explicitly, the current one will be used.
Parallel.ForEach() splits the work between several Tasks. Each Task will process a part of the input sequence, and when it's done, it will request another part if one is available, and so on.
How many Tasks will Parallel.ForEach() create? As many as the TaskScheduler will let it run. The way this is done is that each Task first enqueues a copy of itself when it starts executing (unless doing so would violate MaxDegreeOfParallelism, if you set it). This way, the actual concurrency level is up to the TaskScheduler.
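For illustration, a minimal sketch of setting that cap (and an explicit scheduler) through ParallelOptions; the value 4 is an arbitrary choice, not a recommendation:
using System;
using System.Threading.Tasks;

class ParallelOptionsSketch
{
    static void DoSomething(string s) { /* stand-in for real work */ }

    static void Main()
    {
        var arrayStrings = new string[1000];

        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = 4,           // at most 4 tasks run concurrently
            TaskScheduler = TaskScheduler.Default // if omitted, the current scheduler is used
        };

        Parallel.ForEach(arrayStrings, options, someString =>
        {
            DoSomething(someString);
        });
    }
}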
Also, the first Task will actually execute on the current thread, if the TaskScheduler supports it (this is done using RunSynchronously()).
The default TaskScheduler simply enqueues each Task to the ThreadPool queue. (Actually, it's more complicated if you start a Task from another Task, but that's not relevant here.) Other TaskSchedulers can do completely different things and some of them (like TaskScheduler.FromCurrentSynchronizationContext()) are completely unsuitable for use with Parallel.ForEach().
The ThreadPool uses quite a complex algorithm to decide exactly how many threads should be running at any given time. But the most important thing here is that scheduling a new work item can cause the creation of a new thread (although not necessarily immediately). And because with Parallel.ForEach(), there is always some item queued to be executed, it's completely up to the internal algorithm of the ThreadPool to decide the number of threads.
Put together, it's pretty much impossible to predict how many threads will be used by a Parallel.ForEach(), because it depends on many variables. Both extremes are possible: that the loop will run completely synchronously on the current thread, and that each item will be run on its own, newly created thread.
But generally, it should be close to optimal efficiency and you probably don't have to worry about all those details.
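If you want to observe this variability, here is a small sketch that records which managed threads a loop actually ran on; the count printed typically changes from run to run (the SpinWait amount is an arbitrary stand-in for work):
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class ThreadObservationSketch
{
    static void Main()
    {
        var threadIds = new ConcurrentDictionary<int, bool>();

        Parallel.ForEach(Enumerable.Range(0, 1000), i =>
        {
            // Record the id of whichever thread ran this iteration.
            threadIds.TryAdd(Thread.CurrentThread.ManagedThreadId, true);
            Thread.SpinWait(100000); // simulate a little CPU-bound work
        });

        Console.WriteLine("Distinct threads used: " + threadIds.Count);
    }
}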

Parallel.ForEach does not create new threads, nor does it "mobilize all the threads". It uses a limited number of threads from the ThreadPool and submits tasks to them for parallel execution. In the current implementation the default is to use one thread per core.

I think you have this the wrong way round.
From PATTERNS OF PARALLEL PROGRAMMING you'll see that Parallel.ForEach is really just syntactic sugar.
Parallel.ForEach largely boils down to something like this:
for (int p = 0; p < arrayStrings.Count(); p++)
{
    string item = arrayStrings[p]; // capture a copy so the closure doesn't see a changing p
    ThreadPool.QueueUserWorkItem(_ => DoSomething(item));
}
The ThreadPool takes care of the scheduling. There are some excellent articles around describing how the ThreadPool's scheduler behaves, if you're interested, but that's nothing to do with the TPL.

Parallel does not deal with threads at all - it schedules TASKS to the task framework. That framework then has a scheduler, and the default scheduler goes to the ThreadPool. The pool will try to find a good number of threads (better in 4.5 than 4.0), and it may slowly spin up new threads.
But that is not a function of Parallel.ForEach ;)
"the Parallel.ForEach will create any new thread ???"
It never will. As I said - if your foreach has 1000 items, it queues tasks for them, period. The task factory scheduler will do what it is programmed to do (you can replace it). Generally, with the default scheduler - yes, slowly new threads will spring up, WITHIN REASON.


How does the Parallel class dynamically adjust the level of parallelism?

What feedback does TPL use to dynamically adjust the number of worker threads?
My previous understanding was that it measures the rate of task completion to see if adding or removing threads is worth it. But then, why does this code keep increasing the number of threads, even though there is a bottleneck introduced by a semaphore?
Surely, there can be no more than 20 task completions per second, and more than 2 threads will not improve that.
var activeThreads = 0;
var semaphore = new SemaphoreSlim(2);
var res = Parallel.For(0, 1_000_000, i =>
{
    Interlocked.Increment(ref activeThreads);
    semaphore.Wait();
    try
    {
        Thread.Sleep(100);
        Console.WriteLine("Threads: " + activeThreads);
    }
    finally
    {
        Interlocked.Decrement(ref activeThreads);
        semaphore.Release();
    }
});
I believe ParallelOptions is what you are looking for to specify the degree of parallelism.
Parallel.For(0, 1000, new ParallelOptions
{
    MaxDegreeOfParallelism = 2
}, i => { Console.WriteLine(i); });
Personally, I think the TPL will work in a lot of cases, but it isn't really smart about distributing execution. Whenever you have bottlenecks in the execution of your application, have a look at the pipeline pattern, for example. Here is a link that describes different approaches to parallel execution very well, imo: https://www.dotnetcurry.com/patterns-practices/1407/producer-consumer-pattern-dotnet-csharp
TL;DR: The thing that you are doing in your code that the TPL uses to justify creating a new thread is blocking. (Synchronizing or sleeping, or performing I/O would all count as blocking.)
A longer explanation...
When your task runs, it takes its thread hostage for 100 ms (because you Sleep(100)). While you are sleeping, that thread cannot be used to run other tasks because it would risk not being in a runnable state when the sleep time period expires. Typically we sleep rather than perform an asynchronous action because we need to keep our call stack intact. We are therefore relying on the stack to maintain our state. And the stack is a one-of-a-kind resource for the thread. (There's not actually a lot more to a thread than its stack.)
So the TPL (Thread pool, specifically) tries to keep occupancy high but the thread count low. One way it achieves this is by making sure that there are approximately just as many runnable threads in the system as there are virtual processors. Each time it needs to increase the thread count, it must create a relatively expensive stack for the thread, so it's best not to have so many. And a thread that is not runnable cannot be scheduled, so when the CPU becomes free, you need something to schedule to make use of the processing resources available. If the thread is sleeping, it cannot be scheduled to run. So instead, a thread will be added to the thread pool and the next task will be scheduled on it.
When you are writing parallel code like this (as in your parallel for loop) that can be partitioned and managed by the TPL you should be careful about putting your thread into a non-runnable state. Performing synchronous I/O, waiting for a synchronization object (e.g. semaphore, event or mutex etc.), or sleeping will put the thread into a state where nothing else can be done on the thread until the I/O completes, the sleep interval expires, or the synchronization object becomes signalled. The thread is no good to the TPL during this period.
In your case, you do several of these things: you wait on a semaphore, you sleep, and you perform I/O by writing to the console. The first thing is waiting on that semaphore. If it's not signalled, then you immediately have the situation where the thread is not runnable and the next task of your million-or-so tasks that need to be run must be scheduled on a different thread. If there isn't one, then the TPL can justify creating a new thread to get more tasks started. After all, what if it's thread #987,321 that will actually wind up releasing the semaphore to unblock task #1? The TPL doesn't know what your code does, so it can delay creating threads for a while in the spirit of efficiency, but for correctness, ultimately it will have to create more threads to start chipping away at the task list. There is a complex, implementation-specific heuristic that it applies to monitor, predict and otherwise get this efficiency guess right.
Now your specific question actually asked what feedback does it use to adjust the number of threads. Like I said, the actual implementation is complex and you should probably think of it as a black-box. But in a nutshell, if there are no runnable threads, it may create another thread to keep chipping away at the task list (or may wait a while before doing so, hoping that things will free up), and if there are too many idle threads, it will terminate the idle threads to reclaim their resources.
And to reiterate, as I said at the top, and to hopefully answer your question this time, the one thing you do that allows the TPL to justify creating a new thread is to block. ...even on that first semaphore.
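As a footnote: if the goal is to keep the TPL from injecting threads, stop blocking. Here is a sketch of a non-blocking version of the original test, assuming .NET 6+ for Parallel.ForEachAsync; a thread that awaits is returned to the pool instead of being held hostage, so the starvation signal never fires:
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class NonBlockingSketch
{
    static async Task Main()
    {
        var semaphore = new SemaphoreSlim(2);

        await Parallel.ForEachAsync(Enumerable.Range(0, 1000), async (i, ct) =>
        {
            await semaphore.WaitAsync(ct); // yields the thread instead of blocking it
            try
            {
                await Task.Delay(100, ct); // non-blocking stand-in for Thread.Sleep(100)
                Console.WriteLine($"Item {i} on thread {Environment.CurrentManagedThreadId}");
            }
            finally
            {
                semaphore.Release();
            }
        });
    }
}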
Ran into an article analysing the thread injection algorithm in 2017. As of 2019-08-01, the hillclimbing.cpp file on GitHub hasn't really changed so the article should still be up to date.
Relevant details:
The .NET thread pool has two main mechanisms for injecting threads: a starvation-avoidance mechanism that adds worker threads if it sees no progress being made on queued items and a hill-climbing heuristic that tries to maximize throughput while using as few threads as possible.
...
It calculates the desired number of threads based on the ‘current throughput’, which is the ‘# of tasks completed’ (numCompletions) during the current time-period (sampleDuration in seconds).
...
It also takes the current thread count (currentThreadCount) into consideration.
...
The real .NET Thread Pool only increases the thread-count by one thread every 500 milliseconds. It keeps doing this until the ‘# of threads’ has reached the amount that the hill-climbing algorithm suggests.
...
The [hill-climbing] algorithm only returns values that respect the limits specified by ThreadPool.SetMinThreads(..) and ThreadPool.SetMaxThreads(..)
...
In addition, [the hill-climbing algorithm] will only recommend increasing the thread count if the CPU Utilization is below 95%
So it turns out the thread pool does have a feedback mechanism based on task completion rate. It also doesn't explicitly check whether its threads are blocked or running, but it does keep an eye on overall CPU utilization to detect blockages. All this also means it should be roughly aware of what the other threads and processes are doing.
On the other hand, it will always eagerly spawn at least as many threads as told by ThreadPool.SetMinThreads(), which defaults to the number of logical processors on the machine.
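You can inspect those bounds on your own machine with a quick sketch:
using System;
using System.Threading;

class PoolLimitsSketch
{
    static void Main()
    {
        // The limits that bound the hill-climbing algorithm's recommendations.
        ThreadPool.GetMinThreads(out int minWorkers, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);

        Console.WriteLine($"Worker threads: min {minWorkers}, max {maxWorkers}");
        Console.WriteLine($"I/O completion threads: min {minIo}, max {maxIo}");
        Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");
    }
}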
In conclusion, the test code in question was doing two things which make it keep piling up more threads:
there are lots of tasks queued up and sitting in the queue for ages, which indicates starvation
CPU utilization is negligible, which means that a new thread should be able to use it

Multithreading - New Thread vs ThreadPool

I have read in several blogs that we should create our own threads for long-running or blocking tasks, and not consume them from the thread pool.
My question: if I call SetMaxThreads with 250 and I have 25 long-running tasks, should I still create my own threads? I still have the remaining threads for other small tasks.
If they are long-running tasks, you should not use the ThreadPool at all. You really should not usually tweak the thread pool settings; certainly not to avoid this. Note that the thread pool size is limited for a reason; too many threads running at one time is a bad thing, too.
So, let the ThreadPool do what it's supposed to do, and just create your own thread for your long-running tasks. (assuming you aren't creating dozens or hundreds of these; in which case you have a different problem)
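A sketch of the two usual ways to keep long-running work off the pool (the work body and names are placeholders):
using System;
using System.Threading;
using System.Threading.Tasks;

class LongRunningSketch
{
    static void DoLongRunningWork() { Thread.Sleep(5000); /* placeholder */ }

    static void Main()
    {
        // Option 1: a dedicated thread, fully under your control.
        var worker = new Thread(DoLongRunningWork)
        {
            IsBackground = true,         // don't keep the process alive on its own
            Name = "long-running-worker" // helps when debugging
        };
        worker.Start();

        // Option 2: hint the TPL to use a dedicated thread instead of a pool thread.
        Task task = Task.Factory.StartNew(DoLongRunningWork,
                                          TaskCreationOptions.LongRunning);

        worker.Join();
        task.Wait();
    }
}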

Change thread priority from lowest to highest in .net

I am trying to speed things up by having one thread write to a linked list and another thread process the linked list.
For some reason, if I make the method that writes to the linked list a Task, and the method that reads from the linked list a low-priority thread, the program as a whole finishes much faster. In other words, I experience the fastest results when doing:
Task.Factory.StartNew(AddItems);
new Thread(startProcessingItems) { Priority = ThreadPriority.Lowest }.Start();
while (completed == false)
    Thread.Sleep(0);
Maybe because the first task is doing so much extra work compared to the other thread, everything as a whole finishes faster if I give the second method a low priority.
Anyway, now my question: startProcessingItems runs with ThreadPriority.Lowest. How could I change its priority to highest? If I create a new Task in that method, will it be running with low priority? Basically, startProcessingItems ends up with a list, and once it has that list I would like to start executing with highest priority.
This is not a good approach. First off, LinkedList<T> is not thread safe, so writing to it and reading from it in two threads will cause race conditions.
A better approach would be to use BlockingCollection<T>. This allows you to add items (producer thread) and read items (consumer thread) without worrying about thread safety, as it's completely thread safe.
The reading thread can just call blockingCollection.GetConsumingEnumerable() in a foreach to get the elements, and the write thread just adds them. The reading thread will automatically block, so there's no need to mess with priorities.
When the writing thread is "done", you just call CompleteAdding, which will, in turn, allow the reading thread to finish automatically.
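A minimal sketch of that shape (the item count and names are placeholders):
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class ProducerConsumerSketch
{
    static void Main()
    {
        var items = new BlockingCollection<string>();

        var producer = Task.Factory.StartNew(() =>
        {
            for (int i = 0; i < 100; i++)
                items.Add("item " + i); // the writing thread just adds
            items.CompleteAdding();     // lets the reading thread finish
        });

        var consumer = Task.Factory.StartNew(() =>
        {
            // Blocks while the collection is empty; the loop ends after CompleteAdding.
            foreach (var item in items.GetConsumingEnumerable())
                Console.WriteLine("processing " + item);
        });

        Task.WaitAll(producer, consumer);
    }
}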
You can improve the performance of your program by changing the inherent design, rather than by changing thread/process priorities.
A big part of your problem is that you're doing a busywait:
while (completed == false)
    Thread.Sleep(0);
This results in it consuming lots of CPU cycles for no productive work, and it's why lowering its priority makes the program as a whole execute quicker. If you aren't busywaiting, that won't be an issue any more.
As Reed has suggested, BlockingCollection is tailor made for this situation. You can have a producer thread adding items using Add, and a consumer thread using Take knowing that the method will simply block if there are no more items to be removed.
You can also store the Task you create and use Task.Result or Task.Wait to have the main thread wait for the other task to finish (without wasting CPU cycles). (If you are using threads directly you can use Join.)
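Applied to the code in the question, a sketch (with stand-ins for the question's AddItems and startProcessingItems) of waiting on the stored Task instead of busywaiting:
using System;
using System.Threading;
using System.Threading.Tasks;

class WaitInsteadOfBusywait
{
    static void AddItems() { Thread.Sleep(500); }              // stand-in for the real method
    static void startProcessingItems() { Thread.Sleep(500); } // stand-in for the real method

    static void Main()
    {
        var addItemsTask = Task.Factory.StartNew(AddItems);
        var processingThread = new Thread(startProcessingItems);
        processingThread.Start();

        addItemsTask.Wait();     // waits without burning CPU cycles
        processingThread.Join(); // same idea for the raw thread
    }
}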
In addition to what Reed and Servy have said:
Thread priority is relative to the processes priority.
The Windows scheduler takes into account all other threads when it schedules time for threads. Threads with higher priority take time away from other threads, which can artificially slow down the rest of the system. The system doesn't withhold CPU time from your thread arbitrarily: priority only has an effect when something else takes the CPU away from your thread, and that happens for a reason. If nothing took the CPU away from the thread, it won't magically run faster at a priority of Highest.
Setting thread priority to Highest is almost always the wrong thing to do.
Likely the overhead of synchronization between the two threads will kill any performance gain you think you might get.
Also, Thread.Sleep(0) only relinquishes time to threads of equal priority that are ready to run, which can lead to thread starvation. http://msdn.microsoft.com/en-us/library/d00bd51t(v=vs.80).aspx

Conditional Concurrency (Mixing Concurrency with Sequentiality)

Greetings Overflowers,
I'm working with C# .Net 4.
I want to give priority to specific parts of my code to utilize concurrency while other parts can also do the same only if there is room (free cores), otherwise they should switch to sequential execution (in the invoker thread) rather than just being temporarily blocked.
How can I do that ? Any good recent reading on that specific issue ?
Does Parallel.Invoke(...) execute stuff sequentially in the invoker task/thread if no cores are available?
Regards
There's no such thing as a 'free thread'. There's only an unused core. You can simply count them with Environment.ProcessorCount and base your threading strategy on that. Not leveraging the ThreadPool is usually a mistake; it does a good job distributing pool threads across cores. But it doesn't easily give you what you ask for: ThreadPool.GetAvailableThreads is a constantly changing number. Should be, anyway.
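A quick sketch of reading both numbers:
using System;
using System.Threading;

class CoreCountSketch
{
    static void Main()
    {
        Console.WriteLine("Cores: " + Environment.ProcessorCount);

        // A snapshot only: this value changes constantly as work is scheduled.
        ThreadPool.GetAvailableThreads(out int workers, out int io);
        Console.WriteLine("Available pool threads right now: " + workers);
    }
}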
I got the impression that the parallel framework handed out threads based on physical availability. http://msdn.microsoft.com/en-gb/concurrency/default.aspx
I might be thinking of the parallel extensions ( http://blogs.msdn.com/b/pfxteam/ ), or they might be the same thing or I might be talking rubbish :-)
In my experience with F#, it's possible to control/configure use of concurrency from outside of your core application code. I would expect C# Async CTP to offer similar functionality.
Free threads is a concept that exists only in the .NET thread pool. What you probably meant is free CPU resources!?
If you are declaring your own threads with new Thread(), you are not bound by it.
However, spawning a lot of them can slow down the process and even the OS.
That is why you should make your own thread manager to handle this.
You could check the CPU usage like this:
// Requires System.Diagnostics. Note that the first call to NextValue() always
// returns 0; call it once, wait a moment, then call it again for a real sample.
PerformanceCounter cpuCounter = new PerformanceCounter();
cpuCounter.CategoryName = "Processor";
cpuCounter.CounterName = "% Processor Time";
cpuCounter.InstanceName = "_Total";
double cpuUsage = Convert.ToDouble(cpuCounter.NextValue());
And then in your code use the variable to decide:
int threadId;
if (cpuUsage > threshold)
{
    // CPU is busy: run sequentially on the invoker thread
    DoWork();
}
else
{
    // a core appears free: hand the work to your own thread manager
    threadId = YourThreadManager.Queue(DoWork);
}
Your problem is prioritization. You are saying that you have two types of tasks: important tasks (which must run), and leisure tasks (that can run only when there is a free CPU and no important tasks are queued).
The simple solution is to assign a very high priority to important tasks, and a very low priority to leisure tasks. This way, you almost guarantee that most of your system's CPU resources go towards running important tasks. However, it does not guarantee that your leisure tasks always run after all the important tasks have run. The system may still schedule some leisure tasks once in a while.
However, the Task Parallel Library does not allow scheduling tasks with different priorities. You can either use something like this or override the TPL's scheduler with a custom implementation of BlockingCollection that always dequeues the important tasks first.
One solution is to schedule your important tasks with the TPL, but manually schedule your leisure tasks on your own background thread (set to the lowest thread priority) -- and this thread runs the leisure tasks sequentially.
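A sketch of that last approach; LeisureQueue is a made-up helper type, not part of the framework:
using System;
using System.Collections.Concurrent;
using System.Threading;

static class LeisureQueue
{
    static readonly BlockingCollection<Action> queue = new BlockingCollection<Action>();

    static LeisureQueue()
    {
        var worker = new Thread(() =>
        {
            // Drains leisure work sequentially, one action at a time.
            foreach (var action in queue.GetConsumingEnumerable())
                action();
        })
        {
            IsBackground = true,
            Priority = ThreadPriority.Lowest // only gets CPU when cores are otherwise idle
        };
        worker.Start();
    }

    public static void Enqueue(Action action)
    {
        queue.Add(action);
    }
}
Important tasks still go through the TPL (e.g. Task.Factory.StartNew(ImportantWork)), while leisure work is handed to LeisureQueue.Enqueue(LeisureWork).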

How would I wait for multiple threads to stop?

I have a Main thread that spawns around 20 worker threads.
I need to stop the Main thread until all the other threads are finished.
I know about (thread).Join. But that only works for one thread.
and multiple Joins hurt performance, like this:
t1.Join();
t2.Join();
...
t20.Join();
as the program waits one by one for each to stop.
How would I make it such that
the main thread waits for all of a set of threads to end?
You should really look into Task Parallelism (Task Parallel Library). It uses a thread pool, but also manages work stealing, etc.
Quote: "The TPL scales the degree of concurrency dynamically to most efficiently use all the processors that are available. In addition, the TPL handles the partitioning of the work, the scheduling of threads on the ThreadPool, cancellation support, state management, and other low-level details." on Task Parallel Library
You can use it like this:
Task[] tasks = new Task[3]
{
    Task.Factory.StartNew(() => MethodA()),
    Task.Factory.StartNew(() => MethodB()),
    Task.Factory.StartNew(() => MethodC())
};
// Block until all tasks complete.
Task.WaitAll(tasks);
Or if you use some kind of loop to spawn your threads, see Data Parallelism (Task Parallel Library).
The joins are fine if that's what you want it to do. The main thread still has to wait for all the worker threads to terminate. Check out this website which is a sample chapter from C# in a Nutshell. It just so happens to be the threading chapter: http://www.albahari.com/threading/part4.aspx.
I can't see an obvious performance penalty for waiting for the threads to finish one-by-one. So, a simple foreach does what you want without any unnecessary bells and whistles:
foreach (Thread t in threads) t.Join();
Note: Of course, there's a Win32 API function that allows waiting for several objects (threads, in this case) at once — WaitForMultipleObjectsEx. There are many helper classes or threading frameworks out there on the Internet that utilize it for what you want. But do you really need them for a simple case?
"and multiple Joins hurt performance like this."
There's no "performance hurting"; if you want to wait for all of your threads to exit, you call Join() on the threads.
Stuff your threads in a list and do:
foreach (var t in myThreads)
    t.Join();
If you are sure you will always have < 64 threads then you could have each new thread reliably set an Event before it exits, and WaitAll on the events in your main thread, once all threads are started up. The Event object would be created in the main thread and passed to the relevant child thread in a thread-safe way at thread creation time.
In native code you could do the same thing on the thread handles themselves, but not sure how to do this in .Net.
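A sketch of the event-per-thread approach described above (the worker body is a stand-in for real work):
using System;
using System.Threading;

class WaitAllSketch
{
    static void Main()
    {
        const int workerCount = 20; // must stay below the 64-handle WaitAll limit
        var doneEvents = new ManualResetEvent[workerCount];

        for (int i = 0; i < workerCount; i++)
        {
            var doneEvent = new ManualResetEvent(false); // created on the main thread
            doneEvents[i] = doneEvent;
            int workerId = i; // copy for the closure

            new Thread(() =>
            {
                try
                {
                    Thread.Sleep(100); // stand-in for real work
                    Console.WriteLine("worker " + workerId + " done");
                }
                finally
                {
                    doneEvent.Set(); // signal reliably, even if the work throws
                }
            }).Start();
        }

        WaitHandle.WaitAll(doneEvents); // blocks until every worker has signalled
        Console.WriteLine("all workers finished");
    }
}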
See also this prior question: C#: Waiting for all threads to complete
