why the primary thread is blocked when using Parallel.ForEach - c#

Below is my code:
class Program
{
    static void Main(string[] args)
    {
        Test();
    }

    static void Test()
    {
        string[] strList = { "first", "second", "third" };
        Parallel.ForEach(strList, currStr =>
        {
            Console.WriteLine($"Thread {Thread.CurrentThread.ManagedThreadId} is handling {currStr}");
            if (Thread.CurrentThread.ManagedThreadId != 1) // if not the primary thread, sleep for 5 secs
            {
                Thread.Sleep(5000);
            }
        });
        Console.WriteLine($"Here is thread {Thread.CurrentThread.ManagedThreadId}");
        ...
        doMoreWork();
        ...
    }
}
So Parallel.ForEach fetches two threads from the ThreadPool, plus the existing primary thread. And the output is:
Thread 1 is handling first
Thread 3 is handling second
Thread 4 is handling third
and after 5 seconds:
Here is Thread 1
Obviously, thread 1 (the primary thread) was blocked. But why was the primary thread blocked? I can kind of get the idea that the primary thread is blocked to wait for the other threads to finish their jobs. But isn't that very inefficient? Because the primary thread is blocked, it cannot continue to execute doMoreWork() until all the other threads finish.

It isn't inefficient, it is simply the way you have coded it. While parallel thread execution is useful, so is sequential execution. The main purpose of Parallel.ForEach is to iterate over an enumeration by partitioning the enumeration across multiple threads. Let's say, for example, the ForEach loop calculates a value by applying operations to each item in the enumeration. You then want to use this single value in a call to doMoreWork. If the ForEach loop and doMoreWork executed in parallel, you would have to introduce some form of wait to ensure the ForEach completed before calling doMoreWork.
You might want to take a look at the Task Class documentation and examples. If you really want to have a Parallel.ForEach and doMoreWork running in separate threads at the same time, you can use Task.Run to start a function (or lambda), then independently wait on these to finish.
I will note that parallel execution doesn't guarantee efficiency or speed. There are many factors to consider, such as Amdahl's law, the effect of locking memory to ensure coherence, total system resources, etc. It's a very big topic.
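To sketch that Task.Run approach: in the following, the hypothetical ProcessAll (which just sums string lengths) stands in for the question's loop body, and the comment marks where doMoreWork() would go. Wrapping the blocking Parallel.ForEach in a Task.Run keeps the primary thread free until the result is actually needed.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Demo
{
    // Stand-in for the question's loop body: sums the lengths of the strings.
    public static int ProcessAll(string[] items)
    {
        int total = 0;
        // Parallel.ForEach still blocks *inside* this method, but the caller is free.
        Parallel.ForEach(items, item => Interlocked.Add(ref total, item.Length));
        return total;
    }

    public static void Main()
    {
        string[] strList = { "first", "second", "third" };

        // Run the blocking parallel loop on a pool thread so the primary thread can continue.
        Task<int> loopTask = Task.Run(() => ProcessAll(strList));

        // ... the primary thread is free to do other work (doMoreWork()) here ...

        // Block (or await) only when the result is actually needed.
        Console.WriteLine(loopTask.Result); // 5 + 6 + 5 = 16
    }
}
```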

How else could this possibly work? The purpose of a parallel for loop is to speed up a calculation by performing parts of it in parallel. The program cannot continue until all parts of the loop have completed (and the final result of the calculation can be computed). Its purpose is not to hand off work to execute asynchronously while the initiating thread continues on its way. You're using the wrong tool for the job. You should look into Task objects.


Using Task.Wait on a parallel task vs asynchronous task

In chapter 4.4 Dynamic Parallelism, in Stephen Cleary's book Concurrency in C# Cookbook, it says the following:
Parallel tasks may use blocking members, such as Task.Wait,
Task.Result, Task.WaitAll, and Task.WaitAny. In contrast, asynchronous
tasks should avoid blocking members, and prefer await, Task.WhenAll,
and Task.WhenAny.
I was always told that Task.Wait etc are bad because they block the current thread, and that it's much better to use await instead, so that the calling thread is not blocked.
Why is it ok to use Task.Wait etc for a parallel (which I think means CPU bound) Task?
Example:
In the example below, isn't Test1() better because the thread that calls Test1() is able to continue doing something else while it waits for the for loop to complete?
Whereas the thread that calls Test() is stuck waiting for the for loop to complete.
private static void Test()
{
    Task.Run(() =>
    {
        for (int i = 0; i < 100; i++)
        {
            // do something.
        }
    }).Wait();
}

private static async Task Test1()
{
    await Task.Run(() =>
    {
        for (int i = 0; i < 100; i++)
        {
            // do something.
        }
    });
}
EDIT:
This is the rest of the paragraph which I'm adding based on Peter Csala's comment:
Parallel tasks also commonly use AttachedToParent to create parent/child relationships between tasks. Parallel tasks should be created with Task.Run or Task.Factory.StartNew.
You've already got some great answers here, but just to chime in (sorry if this is repetitive at all):
Task was introduced in the TPL before async/await existed. When async came along, the Task type was reused instead of creating a separate "Promise" type.
In the TPL, pretty much all tasks were Delegate Tasks - i.e., they wrap a delegate (code) which is executed on a TaskScheduler. It was also possible - though rare - to have Promise Tasks in the TPL, which were created by TaskCompletionSource<T>.
The higher-level TPL APIs (Parallel and PLINQ) hide the Delegate Tasks from you; they are higher-level abstractions that create multiple Delegate Tasks and execute them on multiple threads, complete with all the complexity of partitioning and work queue stealing and all that stuff.
However, the one drawback to the higher-level APIs is that you need to know how much work you are going to do before you start. It's not possible for, e.g., the processing of one data item to add another data item(s) back at the beginning of the parallel work. That's where Dynamic Parallelism comes in.
Dynamic Parallelism uses the Task type directly. There are many APIs on the Task type that were designed for Dynamic Parallelism and should be avoided in async code unless you really know what you're doing (i.e., either your name is Stephen Toub or you're writing a high-performance .NET runtime). These APIs include StartNew, ContinueWith, Wait, Result, WaitAll, WaitAny, Id, CurrentId, RunSynchronously, and parent/child tasks. And then there's the Task constructor itself and Start which should never be used in any code at all.
In the particular case of Wait, yes, it does block the thread. And that is not ideal (even in parallel programming), because it blocks a literal thread. However, the alternative may be worse.
Consider the case where task A reaches a point where it has to be sure task B completes before it continues. This is the general Dynamic Parallelism case, so assume no parent/child relationship.
The old-school way to avoid this kind of blocking is to split method A up into a continuation and use ContinueWith. That works fine, but it does complicate the code - rather considerably in the case of loops. You end up writing a state machine, essentially what async does for you. In modern code, you may be able to use await, but then that has its own dangers: parallel code does not work out of the box with async, and combining the two can be tricky.
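To make that tradeoff concrete, here is a small made-up example (DoubleThenAddOne is a hypothetical operation): the same two-step computation written once as an explicit ContinueWith continuation and once with await, which generates the equivalent state machine for you.

```csharp
using System;
using System.Threading.Tasks;

class ContinuationDemo
{
    // Old-school continuation style: the "rest of the method" becomes a callback.
    public static Task<int> DoubleThenAddOne_ContinueWith(int x)
    {
        return Task.Run(() => x * 2)
                   .ContinueWith(t => t.Result + 1); // runs after the first task completes
    }

    // The same logic with await; the compiler builds the state machine for us.
    public static async Task<int> DoubleThenAddOne_Await(int x)
    {
        int doubled = await Task.Run(() => x * 2);
        return doubled + 1;
    }

    public static void Main()
    {
        Console.WriteLine(DoubleThenAddOne_ContinueWith(5).Result); // 11
        Console.WriteLine(DoubleThenAddOne_Await(5).Result);        // 11
    }
}
```

With a loop in the middle, the ContinueWith version quickly degenerates into hand-written state tracking, which is the maintenance burden the answer describes.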
So it really comes down to a tradeoff between code complexity vs runtime efficiency. And when you consider the following points, you'll see why blocking was common:
Parallelism is normally done on Desktop applications; it's not common (or recommended) for web servers.
Desktop machines tend to have plenty of threads to spare. I remember Mark Russinovich (long before he joined Microsoft) demoing how showing a File Open dialog on Windows spawned some crazy number of threads (over 20, IIRC). And yet the user wouldn't even notice 20 threads being spawned (and presumably blocked).
Parallel code is difficult to maintain in the first place; Dynamic Parallelism using continuations is exceptionally difficult to maintain.
Given these points, it's pretty easy to see why a lot of parallel code blocks thread pool threads: the user experience is degraded by an unnoticeable amount, but the developer experience is enhanced significantly.
The thing is, if you are using tasks to parallelize CPU-bound work, your method is likely not asynchronous, because the main benefit of async is asynchronous I/O, and you have no I/O in this case. Since your method is synchronous, you can't await anything, including the tasks you use to parallelize the computation, nor do you need to.
The valid concern you mentioned is that you would waste the current thread if you just block it waiting for the parallel tasks to complete. However, you should not waste it like this; it can be used as one participant in the parallel computation. Say you want to perform a parallel computation on 4 threads. Use the current thread + 3 other threads, instead of using 4 other threads and wasting the current one blocked waiting for them.
That's what Parallel LINQ does, for example: it uses the current thread together with thread pool threads. Note also that its methods are not async (and should not be), but they do use Tasks internally and do block waiting on them.
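For instance, a minimal PLINQ call like the one below blocks the calling thread, but that thread is itself one of the workers executing the partitions (the sum-of-squares workload is just a placeholder for real CPU-bound work):

```csharp
using System;
using System.Linq;

class PlinqDemo
{
    public static long SumOfSquares(int n)
    {
        // PLINQ partitions the range across pool threads AND the calling thread;
        // the call blocks until the whole parallel query completes.
        return ParallelEnumerable.Range(1, n)
                                 .Select(i => (long)i * i)
                                 .Sum();
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(1000)); // 333833500
    }
}
```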
Update: about your examples.
This one:
private static void Test()
{
    Task.Run(() =>
    {
        for (int i = 0; i < 100; i++)
        {
            // do something.
        }
    }).Wait();
}
Is always useless - you offset some computation to separate thread while current thread is blocked waiting, so in result one thread is just wasted for nothing useful. Instead you should just do:
private static void Test()
{
    for (int i = 0; i < 100; i++)
    {
        // do something.
    }
}
This one:
private static async Task Test1()
{
    await Task.Run(() =>
    {
        for (int i = 0; i < 100; i++)
        {
            // do something.
        }
    });
}
Is useful sometimes - when for some reason you need to perform a computation but don't want to block the current thread. For example, if the current thread is the UI thread and you don't want the user interface to freeze while the computation is performed. However, if you are not in such an environment - for example, you are writing a general-purpose library - then it's useless too and you should stick to the synchronous version above. If a user of your library happens to be on the UI thread, they can wrap the call in Task.Run themselves. I would say that even if you are writing a UI application rather than a library, you should move all such logic (the for loop in this case) into a separate synchronous method and then wrap the call to that method in Task.Run if necessary. So like this:
private static async Task Test2()
{
    // we are on the UI thread here, don't want to block it
    await Task.Run(() =>
    {
        OurSynchronousVersionAbove();
    });
    // back on the UI thread
    // do something else
}
Now say you have that synchronous method and want to parallelize the computation. You may try something like this:
static void Test1()
{
    var task1 = Task.Run(() =>
    {
        for (int i = 0; i < 50; i++)
        {
            // do something
        }
    });
    var task2 = Task.Run(() =>
    {
        for (int i = 50; i < 100; i++)
        {
            // do something
        }
    });
    Task.WaitAll(task1, task2);
}
That will work, but it wastes the current thread, which sits blocked for no reason waiting for the two tasks to complete. Instead, you should do it like this:
static void Test1()
{
    var task = Task.Run(() =>
    {
        for (int i = 0; i < 50; i++)
        {
            // do something
        }
    });
    for (int i = 50; i < 100; i++)
    {
        // do something
    }
    task.Wait();
}
Now you perform the computation in parallel using 2 threads: one thread pool thread (from Task.Run) and the current thread. And here is your legitimate use of task.Wait(). Of course, usually you should stick to existing solutions like Parallel LINQ, which does the same for you, but better.
One of the risks of Task.Wait is deadlocks. If you call .Wait on the UI thread, you will deadlock if the task needs the main thread to complete. If you call an async method on the UI thread such deadlocks are very likely.
If you are 100% sure the task is running on a background thread, is guaranteed to complete no matter what, and that this will never change, it is fine to wait on it.
Since this is fairly difficult to guarantee, it is usually a good idea to try to avoid waiting on tasks at all.
I believe the point of this passage is to not use blocking operations like Task.Wait in asynchronous code.
The main point isn't that Task.Wait is preferred in parallel code; it just says that you can get away with it, while in asynchronous code it can have a really serious effect.
This is because the success of async code depends on the tasks 'letting go' (with await) so that the thread(s) can do other work. In explicitly parallel code a blocking Wait may be OK because the other streams of work will continue going because they have a dedicated thread(s).
As I mentioned in the comments section, if you look at the recipe as a whole it might make more sense. Let me quote the relevant part here as well.
The Task type serves two purposes in concurrent programming: it can be a parallel task or an asynchronous task. Parallel tasks may use blocking members, such as Task.Wait, Task.Result, Task.WaitAll, and Task.WaitAny. In contrast, asynchronous tasks should avoid blocking members, and prefer await, Task.WhenAll, and Task.WhenAny. Parallel tasks also commonly use AttachedToParent to create parent/child relationships between tasks. Parallel tasks should be created with Task.Run or Task.Factory.StartNew.
In contrast, asynchronous tasks should avoid blocking members and prefer await, Task.WhenAll, and Task.WhenAny. Asynchronous tasks do not use AttachedToParent, but they can form an implicit kind of parent/child relationship by awaiting another task.
IMHO, it clearly articulates that a Task (or future) can represent a job, which can take advantage of the async I/O. OR it can represent a CPU bound job which could run in parallel with other CPU bound jobs.
Awaiting the former is the suggested way, because otherwise you can't really take advantage of the underlying I/O driver's async capability. The latter does not require awaiting, since it is not an async I/O job.
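A small sketch of the two styles side by side (the trivial sums are placeholders for real work): blocking Task.WaitAll for the parallel case, non-blocking Task.WhenAll for the asynchronous case.

```csharp
using System;
using System.Threading.Tasks;

class WhenVsWait
{
    // Parallel (CPU-bound) style: block the calling thread until both tasks finish.
    public static int ParallelSum()
    {
        var t1 = Task.Run(() => 1 + 2);
        var t2 = Task.Run(() => 3 + 4);
        Task.WaitAll(t1, t2);          // blocking is acceptable here
        return t1.Result + t2.Result;
    }

    // Asynchronous style: await frees the calling thread while waiting.
    public static async Task<int> AsyncSum()
    {
        var t1 = Task.Run(() => 1 + 2);
        var t2 = Task.Run(() => 3 + 4);
        int[] results = await Task.WhenAll(t1, t2); // non-blocking
        return results[0] + results[1];
    }

    public static void Main()
    {
        Console.WriteLine(ParallelSum());                       // 10
        Console.WriteLine(AsyncSum().GetAwaiter().GetResult()); // 10
    }
}
```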
UPDATE Provide an example
As Theodor Zoulias asked in the comments, here is a made-up example of parallel tasks where Task.WaitAll is used.
Let's suppose we have this naive is-prime implementation. It is not efficient, but it demonstrates performing something that can be considered computationally heavy. (Please also bear in mind that, for the sake of simplicity, I did not add any error handling logic.)
static (int, bool) NaiveIsPrime(int number)
{
    int numberOfDividers = 0;
    for (int divider = 1; divider <= number; divider++)
    {
        if (number % divider == 0)
        {
            numberOfDividers++;
        }
    }
    return (number, numberOfDividers == 2);
}
And here is a sample use case which runs a couple of is-prime calculations in parallel and waits for the results in a blocking way.
List<Task<(int, bool)>> jobs = new();
for (int number = 1_010; number < 1_020; number++)
{
    var x = number;
    jobs.Add(Task.Run(() => NaiveIsPrime(x)));
}
Task.WaitAll(jobs.ToArray());
foreach (var job in jobs)
{
    (int number, bool isPrime) = job.Result;
    var isPrimeInText = isPrime ? "a prime" : "not a prime";
    Console.WriteLine($"{number} is {isPrimeInText}");
}
As you can see I haven't used any await keyword anywhere.
I recommend using await instead of Task.Wait() for asynchronous methods/tasks, because this way the thread can be used for something else while the task is running.
However, for parallel tasks that are CPU-bound, most of the available CPU should be used. It makes sense to use Task.Wait() to block the current thread until the task is complete. This way, the CPU-bound task can make full use of the CPU resources.
Update with supplementary statement.
Parallel tasks can use blocking members such as Task.Wait(), Task.Result, Task.WaitAll, and Task.WaitAny, since they are meant to consume all available CPU resources. When working with parallel tasks, it can be beneficial to block the current thread until the task is complete, since the thread is not being used for anything else. This way, the software can fully utilize all available CPU resources instead of wasting resources keeping an idle thread scheduled.

How can I convert this to async-await code and use only 1 thread?

I have a piece of code like
public static void Main()
{
    Console.WriteLine("Press any key to print 'Hello, world!' after 2 seconds");
    while (true)
    {
        Console.ReadKey();
        Task.Run(() =>
        {
            Thread.Sleep(2000);
            Console.WriteLine(string.Format("Hello world! (from thread {0})", Thread.CurrentThread.ManagedThreadId));
        });
    }
}
which of course creates a lot of threads if I press keys fast, and those threads are doing nothing besides waiting for the first 2 seconds that they are used.
I want to change the action to have
await Task.Delay(2000).ConfigureAwait(false);
but I can't figure out how to chain the async-awaits "all the way up" to make this work.
Your requirements aren't entirely clear (do you literally need to use no additional threads? is using Task.Delay() an integral part of your goal?), nor what specifically you're having trouble figuring out.
If you can accept some small number of additional threads, then Task.Delay() would work for you. Of course, you need for it to be in an async method so you can await it. IMHO, this is done more correctly in a named method rather than anonymous:
public static void Main()
{
    Console.WriteLine("Press any key to print 'Hello, world!' after 2 seconds");
    while (true)
    {
        Console.ReadKey();
        KeyPressed();
    }
}

// WARNING: "async void" is discouraged!
private static async void KeyPressed()
{
    await Task.Delay(2000);
    Console.WriteLine(string.Format("Hello world! (from thread {0})", Thread.CurrentThread.ManagedThreadId));
}
Note:
It is generally not a good idea to use void for async methods. But in your example, an exception is unlikely and you don't appear to want to wait for the method's completion anyway, so I've written the above using void anyway. In your real-world scenario, consider carefully whether this is really appropriate. At the very least, if an exception could occur, you should then put a try/catch handler in the async method to ensure the exception is observed.
In a console program, there's not any need to use ConfigureAwait(), because you don't have a synchronization context for a continuation to resume on in the first place. Continuations will occur in the thread pool.
The above does not literally "use only 1 thread". Since (as I mention just above) the continuation happens in the thread pool (there's no mechanism for it to get back to the main thread), there will still be additional threads used. However, it should be fewer than the number you were using before, because a thread is required only for the continuation. The delay happens using a timer thread (okay, so there's one more thread there…but that's shared with all the timers). It's likely this approach would involve only a handful of extra threads, no matter how frequently you press a key.
I don't see any way to literally get the solution down to just one thread (i.e. one user-created thread; .NET creates threads behind your back anyway, so you're never going to literally have a .NET process with only one thread), unless you use polling instead of the ReadKey() method for input. Which you could do, but it'd be a very inefficient use of CPU time. IMHO, the async/await approach strikes the best balance between simplicity and efficiency.

Resolving locking deadlock with Thread.Sleep

If I comment out the Thread.Sleep call, or pass it 0, then there is no deadlock. In other cases there is a deadlock. uptask is executed by a thread from the thread pool, and that takes some time. In the meantime, the main thread acquires lockB, then lockA, prints the string, and releases the locks. After that, uptask starts running and sees that lockA and lockB are free. So in this case there is no deadlock. But if I sleep the main thread, in the meantime uptask advances and sees that lockB is locked, and a deadlock happens. Can anybody explain better, or verify whether this is the reason?
class MyAppClass
{
    public static void Main()
    {
        object lockA = new object();
        object lockB = new object();
        var uptask = Task.Run(() =>
        {
            lock (lockA)
            {
                lock (lockB)
                {
                    Console.WriteLine("A outer and B inner");
                }
            }
        });
        lock (lockB)
        {
            // Uncomment the following statement or sleep with 0 ms and see that there is no deadlock.
            // But a sleep of 1 or more ms leads to deadlock. Reason?
            Thread.Sleep(1);
            lock (lockA)
            {
                Console.WriteLine("B outer and A inner");
            }
        }
        uptask.Wait();
        Console.ReadKey();
    }
}
You really cannot depend on Thread.Sleep to prevent a deadlock. It worked in your environment for some time. It might not work all the time and might not work in other environments.
Since you are obtaining the locks in reverse order, then the chance of a deadlock is there.
To prevent the deadlock, make sure you obtain the locks in order (e.g. lockA then lockB in both threads).
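A minimal sketch of that fix: both threads acquire lockA before lockB, so no cycle of lock dependencies can form. (The shared counter is only there to observe that both critical sections ran.)

```csharp
using System;
using System.Threading.Tasks;

class LockOrderDemo
{
    static readonly object lockA = new object();
    static readonly object lockB = new object();
    static int counter;

    // Both threads take lockA first, then lockB: consistent order, no deadlock.
    public static int Run()
    {
        counter = 0;
        var task = Task.Run(() =>
        {
            lock (lockA)
                lock (lockB)
                    counter++;
        });
        lock (lockA)          // same order as the task: A before B
            lock (lockB)
                counter++;
        task.Wait();
        return counter;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // 2
    }
}
```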
My best guess for why this is happening is that if you don't sleep, the main thread will obtain and release both locks before the other thread (from the thread pool) obtains lockA. Please note that scheduling and running a Task on the thread pool requires some time. In most cases it is negligible, but in your case it made a difference.
To verify this, add the following line right before uptask.Wait():
Console.WriteLine("main thread is done");
And this line after obtaining lockA from the thread-pool thread:
Console.WriteLine("Thread pool thread: obtained lockA");
In the case where there is no deadlock, you will see the first message printed to the console before the thread-pool thread prints its message.
You have two threads. The main thread and the thread that executes the task. If the main thread is able to take lockB and the task thread is able to take lockA, then you have a deadlock. You should never lock two different resources in different orders, because that can lead to deadlock (but I expect that you already know this based on the abstract nature of your question).
In most cases the task thread will start slightly delayed, and then the main thread will get both locks, but if you insert a Thread.Sleep(1) between taking lockB and lockA, then the task thread is able to get lockA before the main thread and BAM! you have a deadlock.
However, the Thread.Sleep(1) is not a necessary condition for getting a deadlock. If the operating system decides to schedule the task thread in a way that makes it able to get lockA before the main thread, you have your deadlock; just because there is no deadlock on your lightning-fast machine, you may still experience deadlocks on other computers that have fewer processing resources.
From https://msdn.microsoft.com/en-us/library/d00bd51t(v=vs.110).aspx, on "millisecondsTimeout":
Type: System.Int32
The number of milliseconds for which the thread is suspended. If the value of the millisecondsTimeout argument is zero, the thread relinquishes the remainder of its time slice to any thread of equal priority that is ready to run. If there are no other threads of equal priority that are ready to run, execution of the current thread is not suspended.
Martin Liversage answered this question concisely.
To paraphrase: the code in your experiment is deadlock-prone even without the Thread.Sleep() statement. Without it, the probability window for the deadlock to occur is extremely small and it may have taken eons to occur. This is the reason you did not experience it when omitting the Thread.Sleep() statement. By adding any time-consuming logic between the two lock statements (i.e., the Thread.Sleep), you expand this window and increase the probability of the deadlock.
Also, this window could expand or shrink when running your code on different hardware or a different OS, where task scheduling may differ.

How do you determine if child threads have completed

I am using multiple threads in my application with a while(true) loop, and now I want to exit the loop when all the active threads complete their work.
Assuming that you have a list of the threads themselves, here are two approaches.
Solution the first:
Use Thread.Join() with a TimeSpan parameter to sync up with each thread in turn. The return value tells you whether the thread has finished or not.
Solution the second:
Check Thread.IsAlive to see if the thread is still running.
In either situation, make sure that your main thread yields processor time to the running threads, else your wait loop will consume most/all the CPU and starve your worker threads.
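A rough sketch of the first approach (the sleep durations are arbitrary stand-ins for real work): Join with a timeout doubles as the yield, so the wait loop does not spin and starve the workers.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class JoinPollDemo
{
    public static int WaitForAll(List<Thread> threads)
    {
        int finished = 0;
        // Join with a short timeout on each thread in turn; the timeout
        // also yields the processor, so we don't busy-spin.
        while (finished < threads.Count)
        {
            finished = 0;
            foreach (Thread t in threads)
                if (t.Join(TimeSpan.FromMilliseconds(50))) // true => thread has finished
                    finished++;
        }
        return finished;
    }

    public static void Main()
    {
        var threads = new List<Thread>
        {
            new Thread(() => Thread.Sleep(100)),
            new Thread(() => Thread.Sleep(150)),
        };
        threads.ForEach(t => t.Start());
        Console.WriteLine(WaitForAll(threads)); // 2
    }
}
```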
You can use Process.GetCurrentProcess().Threads.Count.
There are various approaches here, but ultimately most of them come down to changing the executed threads to do something whenever they finish (whether successfully or via an exception). A simple approach might be to use Interlocked.Decrement to reduce a counter - and if it reaches zero (or goes negative, which probably means an error) release a ManualResetEvent or Monitor.Pulse an object; in either case, the original thread would be waiting on that object. A number of such approaches are discussed here.
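A minimal sketch of that Interlocked.Decrement + ManualResetEvent approach (the Interlocked.Increment of total stands in for each worker's real job):

```csharp
using System;
using System.Threading;

class CountdownDemo
{
    public static int RunWorkers(int workerCount)
    {
        int remaining = workerCount;
        int total = 0;
        using (var done = new ManualResetEvent(false))
        {
            for (int i = 0; i < workerCount; i++)
            {
                new Thread(() =>
                {
                    Interlocked.Increment(ref total);      // the worker's "work"
                    // The last worker to finish signals the waiting thread.
                    if (Interlocked.Decrement(ref remaining) == 0)
                        done.Set();
                }).Start();
            }
            done.WaitOne();   // main thread blocks until the counter hits zero
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(RunWorkers(5)); // 5
    }
}
```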
Of course, it might be easier to look at the TPL bits in .NET 4.0, which provide a lot of new options here (not least things like Parallel.For and PLINQ).
If you are using a synchronized work queue, it might also be possible to set that queue to close (drain) itself, and simply wait for the queue to be empty? The assumption here being that your worker threads are doing something like:
T workItem;
while (queue.TryDequeue(out workItem)) // this may block until there is either something
{                                      // to do, or the queue is terminated
    ProcessWorkItem(workItem);
}
// queue has closed - exit the thread
in which case, once the queue is empty, all your worker threads should already be in the process of shutting themselves down.
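With a BlockingCollection<T>, for example, that drain pattern might look like this (item and worker counts are arbitrary): CompleteAdding closes the queue, and GetConsumingEnumerable ends once the queue is drained, so the workers exit naturally.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class QueueDrainDemo
{
    public static int ProcessQueue(int itemCount, int workerCount)
    {
        var queue = new BlockingCollection<int>();
        int processed = 0;

        var workers = new Task[workerCount];
        for (int w = 0; w < workerCount; w++)
        {
            workers[w] = Task.Run(() =>
            {
                // Blocks while the queue is empty; the loop ends once the
                // queue is marked complete AND fully drained.
                foreach (int item in queue.GetConsumingEnumerable())
                    Interlocked.Increment(ref processed); // stand-in for real work
            });
        }

        for (int i = 0; i < itemCount; i++)
            queue.Add(i);
        queue.CompleteAdding();   // "close" the queue: no more items will arrive

        Task.WaitAll(workers);    // workers exit when the queue is drained
        return processed;
    }

    public static void Main()
    {
        Console.WriteLine(ProcessQueue(100, 4)); // 100
    }
}
```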
You can use Thread.Join(). The Join method will block the calling thread until the thread (the one on which the Join method is called) terminates.
So if you have a list of threads, you can loop through it and call Join on each thread. Your loop will only exit when all the threads are dead. Something like this:
for (int i = 0; i < childThreadList.Count; i++)
{
    childThreadList[i].Join();
}
// ...The following code will execute when all threads in the list have terminated...
I find that using the Join() method is the cleanest way. I use multiple threads frequently, and each thread is typically loading data from different data sources (Informix, Oracle and SQL at the same time.) A simple loop, as mentioned above, calling Join() on each thread object (which I store in a simple List object) works!!!
Carlos Merighe.
I prefer using a HashSet of Threads:
// create a HashSet of heavy tasks (threads) to run
HashSet<Thread> Threadlist = new HashSet<Thread>();
Threadlist.Add(new Thread(() => SomeHeavyTask1()));
Threadlist.Add(new Thread(() => SomeHeavyTask2()));
Threadlist.Add(new Thread(() => SomeHeavyTask3()));

// start the threads
foreach (Thread T in Threadlist)
    T.Start();

// these will execute sequentially
NotSoHeavyTask1();
NotSoHeavyTask2();
NotSoHeavyTask3();

// loop through the threads to see if they are still active, and join them to the main thread
foreach (Thread T in Threadlist)
    if (T.ThreadState == ThreadState.Running)
        T.Join();

// finally this code will execute
MoreTasksToDo();

Maximum queued elements in ThreadPool.QueueUserWorkItem

I set the max threads to 10. Then I added 22,000 tasks using ThreadPool.QueueUserWorkItem.
It is very likely that not all of the 22,000 tasks were completed after running the program. Is there a limit on how many tasks can be queued for the available threads?
If you need to wait for all of the tasks to process, you need to handle that yourself. The ThreadPool threads are all background threads, and will not keep the application alive.
This is a relatively clean way to handle this type of situation:
using (var mre = new ManualResetEvent(false))
{
    int remainingToProcess = workItems.Count(); // Assuming workItems is a collection of "tasks"
    foreach (var item in workItems)
    {
        // The delegate closure (in C# 4 and earlier) below will capture a
        // reference to 'item', resulting in the incorrect item being sent to
        // ProcessTask each iteration. Use a local copy of the 'item' variable
        // instead. C# 5/VS2012 does not require the local here.
        var localItem = item;
        ThreadPool.QueueUserWorkItem(delegate
        {
            // Replace this with your "work"
            ProcessTask(localItem);

            // This will (safely) decrement the remaining count, and allow the
            // main thread to continue when we're done
            if (Interlocked.Decrement(ref remainingToProcess) == 0)
                mre.Set();
        });
    }
    mre.WaitOne();
}
That being said, it's usually better to "group" your work items together if you have thousands of them, rather than treating them as separate work items for the thread pool. There is some overhead involved in managing the list of items, and since you won't be able to process 22,000 at a time, you're better off grouping them into blocks. Having each work item process 50 or so will probably help your overall throughput quite a bit.
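A rough sketch of that grouping (the chunk size and the per-item "work" are placeholders): one queued work item per block of 50 elements instead of one per element, with the same countdown-event completion signal as above.

```csharp
using System;
using System.Threading;

class ChunkDemo
{
    public static int ProcessInChunks(int[] items, int chunkSize)
    {
        if (items.Length == 0) return 0;
        int processed = 0;
        int pending = (items.Length + chunkSize - 1) / chunkSize; // number of chunks
        using (var done = new ManualResetEvent(false))
        {
            for (int start = 0; start < items.Length; start += chunkSize)
            {
                int s = start; // local copy for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    int end = Math.Min(s + chunkSize, items.Length);
                    for (int i = s; i < end; i++)
                        Interlocked.Increment(ref processed); // the per-item "work"
                    // Last chunk out signals completion.
                    if (Interlocked.Decrement(ref pending) == 0)
                        done.Set();
                });
            }
            done.WaitOne();
        }
        return processed;
    }

    public static void Main()
    {
        Console.WriteLine(ProcessInChunks(new int[220], 50)); // 220
    }
}
```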
The queue has no practical limit; however, the pool itself will not exceed 64 wait handles, i.e., total threads active.
This is an implementation-dependent question, and the implementation of this function has changed a bit over time. But in .NET 4.0, you're essentially limited by the amount of memory in the system, as the tasks are stored in an in-memory queue. You can see this by digging through the implementation in Reflector.
From the documentation of ThreadPool:
Note: The threads in the managed thread pool are background threads. That is, their IsBackground properties are true. This means that a ThreadPool thread will not keep an application running after all foreground threads have exited.
Is it possible that you're exiting before all tasks have been processed?
