Parallel.ForEach with async lambda waiting for all iterations to complete

Recently I have seen several SO threads related to Parallel.ForEach mixed with async lambdas, but all the proposed answers were some kind of workaround.
Is there any way I could write:
List<int> list = new List<int>();
Parallel.ForEach(arrayValues, async (item) =>
{
var x = await LongRunningIoOperationAsync(item);
list.Add(x);
});
How can I ensure that list will contain all items from all iterations executed within the lambdas in each iteration?
How will Parallel.ForEach generally work with async lambdas? If it hits an await, will it hand over its thread to the next iteration?
I assume the ParallelLoopResult.IsCompleted property is not the proper one, as it will return true when all iterations have been executed, no matter whether their actual lambda jobs are finished or not?

Recently I have seen several SO threads related to Parallel.ForEach mixed with async lambdas, but all the proposed answers were some kind of workaround.
Well, that's because Parallel doesn't work with async. And from a different perspective, why would you want to mix them in the first place? They do opposite things. Parallel is all about adding threads and async is all about giving up threads. If you want to do asynchronous work concurrently, then use Task.WhenAll. That's the correct tool for the job; Parallel is not.
That said, it sounds like you want to use the wrong tool, so here's how you do it...
How can I ensure that list will contain all items from all iterations executed within the lambdas in each iteration?
You'll need to have some kind of a signal that some code can block on until the processing is done, e.g., CountdownEvent or Monitor. On a side note, you'll need to protect access to the non-thread-safe List<T> as well.
How will Parallel.ForEach generally work with async lambdas? If it hits an await, will it hand over its thread to the next iteration?
Since Parallel doesn't understand async lambdas, when the first await yields (returns) to its caller, Parallel will assume that iteration of the loop is complete.
I assume the ParallelLoopResult.IsCompleted property is not the proper one, as it will return true when all iterations have been executed, no matter whether their actual lambda jobs are finished or not?
Correct. As far as Parallel knows, it can only "see" the method to the first await that returns to its caller. So it doesn't know when the async lambda is complete. It also will assume iterations are complete too early, which throws partitioning off.
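As a rough sketch of the signalling approach described in this answer (not a recommendation; Task.WhenAll remains the better fit), the loop could count completions with a CountdownEvent and guard the list with a lock. LongRunningIoOperationAsync here is a hypothetical stand-in that just delays and doubles its input, and the CountdownSketch class name is made up for illustration.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class CountdownSketch
{
    // Hypothetical stand-in for the question's LongRunningIoOperationAsync.
    static async Task<int> LongRunningIoOperationAsync(int item)
    {
        await Task.Delay(50);
        return item * 2;
    }

    public static List<int> Run(IReadOnlyCollection<int> arrayValues)
    {
        var list = new List<int>();
        var gate = new object();
        // One signal is expected per item.
        var done = new CountdownEvent(arrayValues.Count);

        // The async lambda becomes an async void Action<int>, so
        // Parallel.ForEach returns as soon as each lambda hits its first await.
        Parallel.ForEach(arrayValues, async item =>
        {
            try
            {
                var x = await LongRunningIoOperationAsync(item);
                lock (gate) { list.Add(x); } // List<T> is not thread-safe
            }
            finally
            {
                done.Signal(); // signal even if the operation failed
            }
        });

        // Block until every lambda has actually finished.
        done.Wait();
        return list;
    }
}
```

Because the lambda is async void, an exception thrown inside it cannot be caught by a try/catch around Parallel.ForEach, which is one more reason to prefer Task.WhenAll.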

You don't need Parallel.For/ForEach here; you just need to await a list of tasks.
Background
In short, you need to be very careful with async lambdas, and with whether you are passing them to an Action or a Func<Task>.
Your problem is that Parallel.For / ForEach is not suited to the async and await pattern or to IO-bound tasks. They are suited to CPU-bound workloads, which means they essentially take Action parameters and let the task scheduler create the tasks for you.
If you want to run multiple async tasks at the same time, use Task.WhenAll, or a TPL Dataflow block (or something similar), which can deal effectively with both CPU-bound and IO-bound workloads. Or, said more directly, they can deal with tasks, which is what an async method is.
Unless you need to do more inside your lambda (which you haven't shown), just use a Select and WhenAll:
var tasks = items.Select(LongRunningIoOperationAsync);
var results = await Task.WhenAll(tasks); // here is your list of int
If you do, you can still use the await,
var tasks = items.Select(async (item) =>
{
var x = await LongRunningIoOperationAsync(item);
// do other stuff
return x;
});
var results = await Task.WhenAll(tasks);
Note: If you need the extended functionality of Parallel.ForEach (namely the options to control max concurrency), there are several approaches; however, Rx or Dataflow might be the most succinct.
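For illustration only, one of the other approaches to capping concurrency is a SemaphoreSlim wrapped around each operation. This is a different technique from the Rx or Dataflow options named above. LongRunningIoOperationAsync is again a hypothetical stand-in, and ThrottleSketch and maxConcurrency are names invented for this sketch.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleSketch
{
    // Hypothetical stand-in for the question's LongRunningIoOperationAsync.
    static async Task<int> LongRunningIoOperationAsync(int item)
    {
        await Task.Delay(20);
        return item * 2;
    }

    public static async Task<int[]> RunAsync(IEnumerable<int> items, int maxConcurrency)
    {
        using var throttle = new SemaphoreSlim(maxConcurrency);

        var tasks = items.Select(async item =>
        {
            // At most maxConcurrency operations get past this point at once.
            await throttle.WaitAsync();
            try
            {
                return await LongRunningIoOperationAsync(item);
            }
            finally
            {
                throttle.Release();
            }
        }).ToList(); // materialize so every task is started exactly once

        // Task.WhenAll returns the results in the order of the input tasks.
        return await Task.WhenAll(tasks);
    }
}
```

All tasks are still created up front; the semaphore only limits how many of them are inside the I/O call at the same time.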

Related

Using Task.Wait on a parallel task vs asynchronous task

In chapter 4.4 Dynamic Parallelism, in Stephen Cleary's book Concurrency in C# Cookbook, it says the following:
Parallel tasks may use blocking members, such as Task.Wait,
Task.Result, Task.WaitAll, and Task.WaitAny. In contrast, asynchronous
tasks should avoid blocking members, and prefer await, Task.WhenAll,
and Task.WhenAny.
I was always told that Task.Wait etc are bad because they block the current thread, and that it's much better to use await instead, so that the calling thread is not blocked.
Why is it ok to use Task.Wait etc for a parallel (which I think means CPU bound) Task?
Example:
In the example below, isn't Test1() better because the thread that calls Test1() is able to continue doing something else while it waits for the for loop to complete?
Whereas the thread that calls Test() is stuck waiting for the for loop to complete.
private static void Test()
{
Task.Run(() =>
{
for (int i = 0; i < 100; i++)
{
//do something.
}
}).Wait();
}
private static async Task Test1()
{
await Task.Run(() =>
{
for (int i = 0; i < 100; i++)
{
//do something.
}
});
}
EDIT:
This is the rest of the paragraph which I'm adding based on Peter Csala's comment:
Parallel tasks also commonly use AttachedToParent to create parent/child relationships between tasks. Parallel tasks should be created with Task.Run or Task.Factory.StartNew.

You've already got some great answers here, but just to chime in (sorry if this is repetitive at all):
Task was introduced in the TPL before async/await existed. When async came along, the Task type was reused instead of creating a separate "Promise" type.
In the TPL, pretty much all tasks were Delegate Tasks - i.e., they wrap a delegate (code) which is executed on a TaskScheduler. It was also possible - though rare - to have Promise Tasks in the TPL, which were created by TaskCompletionSource<T>.
The higher-level TPL APIs (Parallel and PLINQ) hide the Delegate Tasks from you; they are higher-level abstractions that create multiple Delegate Tasks and execute them on multiple threads, complete with all the complexity of partitioning and work queue stealing and all that stuff.
However, the one drawback to the higher-level APIs is that you need to know how much work you are going to do before you start. It's not possible for, e.g., the processing of one data item to add another data item(s) back at the beginning of the parallel work. That's where Dynamic Parallelism comes in.
Dynamic Parallelism uses the Task type directly. There are many APIs on the Task type that were designed for Dynamic Parallelism and should be avoided in async code unless you really know what you're doing (i.e., either your name is Stephen Toub or you're writing a high-performance .NET runtime). These APIs include StartNew, ContinueWith, Wait, Result, WaitAll, WaitAny, Id, CurrentId, RunSynchronously, and parent/child tasks. And then there's the Task constructor itself and Start which should never be used in any code at all.
In the particular case of Wait, yes, it does block the thread. And that is not ideal (even in parallel programming), because it blocks a literal thread. However, the alternative may be worse.
Consider the case where task A reaches a point where it has to be sure task B completes before it continues. This is the general Dynamic Parallelism case, so assume no parent/child relationship.
The old-school way to avoid this kind of blocking is to split method A up into a continuation and use ContinueWith. That works fine, but it does complicate the code - rather considerably in the case of loops. You end up writing a state machine, essentially what async does for you. In modern code, you may be able to use await, but then that has its own dangers: parallel code does not work out of the box with async, and combining the two can be tricky.
So it really comes down to a tradeoff between code complexity vs runtime efficiency. And when you consider the following points, you'll see why blocking was common:
Parallelism is normally done on Desktop applications; it's not common (or recommended) for web servers.
Desktop machines tend to have plenty of threads to spare. I remember Mark Russinovich (long before he joined Microsoft) demoing how showing a File Open dialog on Windows spawned some crazy number of threads (over 20, IIRC). And yet the user wouldn't even notice 20 threads being spawned (and presumably blocked).
Parallel code is difficult to maintain in the first place; Dynamic Parallelism using continuations is exceptionally difficult to maintain.
Given these points, it's pretty easy to see why a lot of parallel code blocks thread pool threads: the user experience is degraded by an unnoticeable amount, but the developer experience is enhanced significantly.
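A minimal sketch of what Dynamic Parallelism with a blocking Wait can look like, assuming a simple binary tree where the amount of work is only discovered while traversing. The Node type, DynamicParallelismSketch and SumTree are hypothetical names for illustration; the child tasks use AttachedToParent, so the root task does not complete until all descendants have.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical tree node used only for this illustration.
public sealed class Node
{
    public int Value;
    public Node? Left;
    public Node? Right;
}

public static class DynamicParallelismSketch
{
    // Visits a node, then schedules one attached child task per subtree.
    static void Traverse(Node node, Action<int> visit)
    {
        visit(node.Value);
        if (node.Left != null) StartChild(node.Left, visit);
        if (node.Right != null) StartChild(node.Right, visit);
    }

    static void StartChild(Node child, Action<int> visit)
    {
        Task.Factory.StartNew(
            () => Traverse(child, visit),
            CancellationToken.None,
            TaskCreationOptions.AttachedToParent,
            TaskScheduler.Default);
    }

    public static int SumTree(Node root)
    {
        int sum = 0;
        // StartNew with TaskCreationOptions.None is used instead of Task.Run,
        // because Task.Run denies child attachment.
        Task rootTask = Task.Factory.StartNew(
            () => Traverse(root, value => Interlocked.Add(ref sum, value)),
            CancellationToken.None,
            TaskCreationOptions.None,
            TaskScheduler.Default);

        // Blocks the calling thread until the root and all attached children finish.
        rootTask.Wait();
        return sum;
    }
}
```

The blocking Wait keeps the code short; the non-blocking alternative would need continuations or careful mixing with async, which is the tradeoff discussed above.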

The thing is if you are using tasks to parallelize CPU-bound work - your method is likely not asynchronous, because the main benefit of async is asynchronous IO, and you have no IO in this case. Since your method is synchronous - you can't await anything, including tasks you use to parallelize computation, nor do you need to.
The valid concern you mentioned is you would waste current thread if you just block it waiting for parallel tasks to complete. However you should not waste it like this - it can be used as one participant in parallel computation. Say you want to perform parallel computation on 4 threads. Use current thread + 3 other threads, instead of using just 4 other threads and waste current one blocked waiting for them.
That's what for example Parallel LINQ does - it uses current thread together with thread pool threads. Note also its methods are not async (and should not be), but they do use Tasks internally and do block waiting on them.
Update: about your examples.
This one:
private static void Test()
{
Task.Run(() =>
{
for (int i = 0; i < 100; i++)
{
//do something.
}
}).Wait();
}
Is always useless - you offload some computation to a separate thread while the current thread is blocked waiting, so as a result one thread is just wasted for nothing useful. Instead you should just do:
private static void Test()
{
for (int i = 0; i < 100; i++)
{
//do something.
}
}
This one:
private static async Task Test1()
{
await Task.Run(() =>
{
for (int i = 0; i < 100; i++)
{
//do something.
}
});
}
Is useful sometimes - when for some reason you need to perform a computation but don't want to block the current thread. For example, if the current thread is the UI thread and you don't want the user interface to be frozen while the computation is performed. However, if you are not in such an environment, for example you are writing a general-purpose library, then it's useless too and you should stick to the synchronous version above. If a user of your library happens to be on the UI thread, they can wrap the call in Task.Run themselves. I would say that even if you are not writing a library but a UI application, you should move all such logic (the for loop in this case) into a separate synchronous method and then wrap the call to that method in Task.Run if necessary. So like this:
private static async Task Test2()
{
// we are on UI thread here, don't want to block it
await Task.Run(() => {
OurSynchronousVersionAbove();
});
// back on UI thread
// do something else
}
Now say you have that synchronous method and want to parallelize the computation. You may try something like this:
static void Test1() {
var task1 = Task.Run(() => {
for (int i = 0; i < 50;i++) {
// do something
}
});
var task2 = Task.Run(() => {
for (int i = 50; i < 100;i++) {
// do something
}
});
Task.WaitAll(task1, task2);
}
That will work but it wastes current thread blocked for no reason, waiting for two tasks to complete. Instead, you should do it like this:
static void Test1() {
var task = Task.Run(() => {
for (int i = 0; i < 50; i++) {
// do something
}
});
for (int i = 50; i < 100; i++) {
// do something
}
task.Wait();
}
Now you perform computation in parallel using 2 threads - one thread pool thread (from Task.Run) and current thread. And here is your legitimate use of task.Wait(). Of course usually you should stick to existing solutions like parallel LINQ, which does the same for you but better.
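For comparison, here is a small sketch of the same 0..99 loop handed to Parallel LINQ, which takes care of the partitioning and the waiting itself. DoSomething is a hypothetical CPU-bound stand-in (it just squares its input), and PlinqSketch is a made-up class name.

```csharp
using System.Linq;

public static class PlinqSketch
{
    // Hypothetical stand-in for the "do something" body of the loop.
    static long DoSomething(int i) => (long)i * i;

    public static long SumOfWork(int count)
    {
        // AsParallel() switches the query to PLINQ; Sum() blocks until
        // every partition has been processed.
        return Enumerable.Range(0, count)
            .AsParallel()
            .Select(i => DoSomething(i))
            .Sum();
    }
}
```

SumOfWork(100) is a plain synchronous call, so a caller on a UI thread could still wrap it in Task.Run as described above.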

One of the risks of Task.Wait is deadlocks. If you call .Wait on the UI thread, you will deadlock if the task needs the main thread to complete. If you call an async method on the UI thread such deadlocks are very likely.
If you are 100% sure the task is running on a background thread, is guaranteed to complete no matter what, and that this will never change, it is fine to wait on it.
Since this is fairly difficult to guarantee, it is usually a good idea to try to avoid waiting on tasks at all.

I believe the point of this passage is to not use blocking operations like Task.Wait in asynchronous code.
The main point isn't that Task.Wait is preferred in parallel code; it just says that you can get away with it, while in asynchronous code it can have a really serious effect.
This is because the success of async code depends on the tasks 'letting go' (with await) so that the thread(s) can do other work. In explicitly parallel code a blocking Wait may be OK because the other streams of work will continue going because they have a dedicated thread(s).

As I mentioned in the comments section, if you look at the Recipe as a whole it might make more sense. Let me quote the relevant part here as well.
The Task type serves two purposes in concurrent programming: it can be a parallel task or an asynchronous task. Parallel tasks may use blocking members, such as Task.Wait, Task.Result, Task.WaitAll, and Task.WaitAny. In contrast, asynchronous tasks should avoid blocking members, and prefer await, Task.WhenAll, and Task.WhenAny. Parallel tasks also commonly use AttachedToParent to create parent/child relationships between tasks. Parallel tasks should be created with Task.Run or Task.Factory.StartNew.
Asynchronous tasks do not use AttachedToParent, but they can form an implicit kind of parent/child relationship by awaiting another task.
IMHO, it clearly articulates that a Task (or future) can represent a job, which can take advantage of the async I/O. OR it can represent a CPU bound job which could run in parallel with other CPU bound jobs.
Awaiting the former is the suggested way because otherwise you can't really take advantage of the underlying I/O driver's async capability. The latter does not require awaiting since it is not an async I/O job.
UPDATE Provide an example
As Theodor Zoulias asked in a comment section here is a made up example for parallel tasks where Task.WaitAll is being used.
Let's suppose we have this naive Is Prime Number implementation. It is not efficient, but it demonstrates performing something that can be considered computationally heavy. (Please also bear in mind that, for the sake of simplicity, I did not add any error handling logic.)
static (int, bool) NaiveIsPrime(int number)
{
int numberOfDividers = 0;
for (int divider = 1; divider <= number; divider++)
{
if (number % divider == 0)
{
numberOfDividers++;
}
}
return (number, numberOfDividers == 2);
}
And here is a sample use case which runs a couple of is-prime calculations in parallel and waits for the results in a blocking way.
List<Task<(int, bool)>> jobs = new();
for (int number = 1_010; number < 1_020; number++)
{
var x = number;
jobs.Add(Task.Run(() => NaiveIsPrime(x)));
}
Task.WaitAll(jobs.ToArray());
foreach (var job in jobs)
{
(int number, bool isPrime) = job.Result;
var isPrimeInText = isPrime ? "a prime" : "not a prime";
Console.WriteLine($"{number} is {isPrimeInText}");
}
As you can see I haven't used any await keyword anywhere.
Here is a dotnet fiddle link
and here is a link for the prime numbers under 10 000.

I recommend using await instead of Task.Wait() for asynchronous methods/tasks, because this way the thread can be used for something else while the task is running.
However, for parallel tasks that are CPU-bound, most of the available CPU should be used. It makes sense to use Task.Wait() to block the current thread until the task is complete. This way, the CPU-bound task can make full use of the CPU resources.
Update with supplementary statement.
Parallel tasks can use blocking members such as Task.Wait(), Task.Result, Task.WaitAll, and Task.WaitAny as they should consume all available CPU resources. When working with parallel tasks, it can be beneficial to block the current thread until the task is complete, since the thread is not being used for anything else. This way, the software can fully utilize all available CPU resources instead of wasting resources by keeping the thread running while it is blocked.

Parallelizing execution with Task.Run

I am trying to improve the performance of some code which performs a shopping function by calling a number of different vendors. The 3rd party vendor call is async and the results are processed to generate a result. The structure of the code is as follows.
public async Task<List<ShopResult>> DoShopping(IEnumerable<Vendor> vendors)
{
var res = vendors.Select(async s => await DoShopAndProcessResultAsync(s));
await Task.WhenAll(res); ....
}
Since DoShopAndProcessResultAsync is both IO-bound and CPU-bound, and each vendor iteration is independent, I think Task.Run can be used to do something like below.
public async Task<List<ShopResult>> DoShopping(IEnumerable<Vendor> vendors)
{
var res = vendors.Select(s => Task.Run(() => DoShopAndProcessResultAsync(s)));
await Task.WhenAll(res); ...
}
Using Task.Run as shown gives a performance gain, and I can see from the order of execution of the calls that multiple threads are involved. It also runs without any issue locally on my machine.
However, it is a tasks-of-tasks scenario, and I am wondering whether there are any pitfalls or whether this is deadlock-prone in a high-traffic prod environment.
What are your opinions on the approach of using Task.Run to parallelize async calls?

Tasks are .NET's low-level building blocks. .NET almost always has a better high-level abstraction for specific concurrency paradigms.
To paraphrase Rob Pike (slides): concurrency is not parallelism is not asynchronous execution. What you ask for is concurrent execution with a specific degree of parallelism. .NET already offers high-level classes that can do that, without resorting to low-level task handling.
At the end, I explain why these distinctions matter and how they're implemented using different .NET classes or libraries
Dataflow blocks
At the highest level, the Dataflow classes allow creating a pipeline of processing blocks similar to a Powershell or Bash pipeline, where each block can use one or more tasks to process input. Dataflow blocks preserve message order, ensuring results are emitted in the order the input messages were received.
You'll often see combinations of blocks called meshes, not pipelines. Dataflow grew out of the Microsoft Robotics Framework and can be used to create a network of independent processing blocks. Most programmers just use it to build a pipeline of steps though.
In your case, you could use a TransformBlock to execute DoShopAndProcessResultAsync and feed the output either to another processing block, or a BufferBlock you can read after processing all results. You could even split Shop and Process into separate blocks, each with its own logic and degree of parallelism
Eg.
var buffer=new BufferBlock<ShopResult>();
var blockOptions=new ExecutionDataflowBlockOptions {
MaxDegreeOfParallelism=3,
BoundedCapacity=1
};
var shop=new TransformBlock<Vendor,ShopResult>(DoShopAndProcessResultAsync,
blockOptions);
var linkOptions=new DataflowLinkOptions{ PropagateCompletion=true };
shop.LinkTo(buffer,linkOptions);
foreach(var v in vendors)
{
await shop.SendAsync(v);
}
shop.Complete();
await shop.Completion;
buffer.TryReceiveAll(out IList<ShopResult> results);
You can use two separate blocks to shop and process :
var shop=new TransformBlock<Vendor,ShopResponse>(DoShopAsync,shopOptions);
var process=new TransformBlock<ShopResponse,ShopResult>(DoProcessAsync,processOptions);
shop.LinkTo(process,linkOptions);
process.LinkTo(buffer,linkOptions);
foreach(var v in vendors)
{
await shop.SendAsync(v);
}
shop.Complete();
await process.Completion;
In this case we await the completion of the last block in the chain before reading the results.
Instead of reading from a buffer block, we could use an ActionBlock at the end to do whatever we want to do with the results, eg store them to a database. The results can be batched using a BatchBlock to reduce the number of storage operations
...
var batch=new BatchBlock<ShopResult>(100);
var store=new ActionBlock<ShopResult[]>(DoStoreAsync);
shop.LinkTo(process,linkOptions);
process.LinkTo(batch,linkOptions);
batch.LinkTo(store,linkOptions);
...
shop.Complete();
await store.Completion;
Why do names matter
Tasks are the lowest level building blocks used to implement multiple paradigms. In other languages you'd see them described as Futures or Promises (eg Javascript)
Parallelism in .NET means executing CPU-bound computations over a lot of data using all available cores. Parallel.ForEach will partition the input data into roughly as many partitions as there are cores and use one worker task per partition. PLINQ goes one step further, allowing the use of LINQ operators to specify the computation and letting PLINQ use algorithms optimized for parallel execution to map, filter, sort, group and collect results. That's why Parallel.ForEach can't be used for async work at all.
Concurrency means executing multiple independent and often IO-bound jobs. At the lowest level you can use Tasks but Dataflow, Rx.NET, Channels, IAsyncEnumerable etc allow the use of high-level patterns like CSP/Pipelines, event stream processing etc
Asynchronous execution means you don't have to block while waiting for I/O-bound work to complete.

What is alarming with the Task.Run approach in your question, is that it depletes the ThreadPool from available worker threads in a non-controlled manner. It doesn't offer any configuration option that would allow you to reduce the parallelism of each individual request, in favor of preserving the scalability of the whole service. That's something that might bite you in the long run.
Ideally you would like to control both the parallelism and the concurrency, and control them independently. For example you might want to limit the maximum concurrency of the I/O-bound work to 10, and the maximum parallelism of the CPU-bound work to 2. Regarding the former you could take a look at this question: How to limit the amount of concurrent async I/O operations?
Regarding the latter, you could use a TaskScheduler with limited concurrency. The ConcurrentExclusiveSchedulerPair is a handy class for this purpose. Here is an example of how you could rewrite your DoShopping method in a way that limits the ThreadPool usage to two threads at maximum (per request), without limiting at all the concurrency of the I/O-bound work:
public async Task<ShopResult[]> DoShopping(IEnumerable<Vendor> vendors)
{
var scheduler = new ConcurrentExclusiveSchedulerPair(
TaskScheduler.Default, maxConcurrencyLevel: 2).ConcurrentScheduler;
var tasks = vendors.Select(vendor =>
{
return Task.Factory.StartNew(() => DoShopAndProcessResultAsync(vendor),
default, TaskCreationOptions.DenyChildAttach, scheduler).Unwrap();
});
return await Task.WhenAll(tasks);
}
Important: In order for this to work, the DoShopAndProcessResultAsync method should be implemented internally without .ConfigureAwait(false) at the await points. Otherwise the continuations after the await will not run on our preferred scheduler, and the goal of limiting the ThreadPool utilization will be defeated.
My personal preference though would be to use instead the new (.NET 6) Parallel.ForEachAsync API. Apart from making it easy to control the concurrency through the MaxDegreeOfParallelism option, it also comes with a better behavior in case of exceptions. Instead of launching invariably all the async operations, it stops launching new operations as soon as a previously launched operation has failed. This can make a big difference in the responsiveness of your service, in case for example that all individual async operations are failing with a timeout exception. You can find here a synopsis of the main differences between the Parallel.ForEachAsync and the Task.WhenAll APIs.
Unfortunately the Parallel.ForEachAsync has the disadvantage that it doesn't return the results of the async operations. Which means that you have to collect the results manually as a side-effect of each async operation. I've posted here a ForEachAsync variant that returns results, that combines the best aspects of the Parallel.ForEachAsync and the Task.WhenAll APIs. You could use it like this:
public async Task<ShopResult[]> DoShopping(IEnumerable<Vendor> vendors)
{
var scheduler = new ConcurrentExclusiveSchedulerPair(
TaskScheduler.Default, maxConcurrencyLevel: 2).ConcurrentScheduler;
ParallelOptions options = new() { MaxDegreeOfParallelism = 10 };
return await ForEachAsync(vendors, options, async (vendor, ct) =>
{
return await Task.Factory.StartNew(() => DoShopAndProcessResultAsync(vendor),
ct, TaskCreationOptions.DenyChildAttach, scheduler).Unwrap();
});
}
Note: In my initial answer (revision 1) I had suggested erroneously to pass the scheduler through the ParallelOptions.TaskScheduler property. I just found out that this doesn't work as I expected. The ParallelOptions class has an internal property EffectiveMaxConcurrencyLevel that represents the minimum of the MaxDegreeOfParallelism and the TaskScheduler.MaximumConcurrencyLevel. The implementation of the Parallel.ForEachAsync method uses this property, instead of reading directly the MaxDegreeOfParallelism. So the MaxDegreeOfParallelism, by being larger than the MaximumConcurrencyLevel, was effectively ignored.
You've probably also noticed by now that the names of these two settings are confusing. We use the MaximumConcurrencyLevel in order to control the number of threads (aka the parallelization), and we use the MaxDegreeOfParallelism in order to control the amount of concurrent async operations (aka the concurrency). The reason for this confusing terminology can be traced to the historic origins of these APIs. The ParallelOptions class was introduced before the async-await era, and the designers of the new Parallel.ForEachAsync API aimed at making it compatible with the older non-asynchronous members of the Parallel class.
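To make the "collect the results manually as a side-effect" remark concrete, here is a minimal sketch using the built-in Parallel.ForEachAsync (.NET 6 or later) with a ConcurrentQueue. It is not the custom ForEachAsync variant mentioned above. DoWorkAsync is a hypothetical stand-in for DoShopAndProcessResultAsync, and the class and parameter names are invented for illustration.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncSketch
{
    // Hypothetical stand-in for the per-vendor async operation.
    static async Task<int> DoWorkAsync(int item, CancellationToken ct)
    {
        await Task.Delay(20, ct);
        return item * 10;
    }

    public static async Task<int[]> RunAsync(IEnumerable<int> items, int maxConcurrency)
    {
        // Thread-safe collection, because several bodies may finish at once.
        var results = new ConcurrentQueue<int>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxConcurrency };

        await Parallel.ForEachAsync(items, options, async (item, ct) =>
        {
            results.Enqueue(await DoWorkAsync(item, ct));
        });

        // Results arrive in completion order, not input order.
        return results.ToArray();
    }
}
```

If the input order matters, the body could write into a pre-sized array by index instead of enqueuing.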

Effects of async within a parallel for loop

I am going to start by saying that I am learning about multithreading at the moment, so it may be the case that not all I say is correct - please feel free to correct me as required. I do have a reasonable understanding of async and await.
My basic aim is as follows:
I have a body of code that currently takes about 3 seconds. I am trying to load some data at the start of the method that will be used right at the end. My plan is to load the data on a different thread right at the start - allowing the rest of the code to execute independently. Then, at the point that I need the data, the code will wait if the data is not loaded. So far this all seems to be working fine and as I describe.
My question relates to what happens when I call a method that is async, within a parallel for loop, without awaiting it.
My code follows this structure:
public void MainCaller()
{
List<int> listFromThread = null;
var secondThread = Task.Factory.StartNew(() =>
{
listFromThread = GetAllLists().Result;
});
//Do some other stuff
secondThread.Wait();
//Do not pass this point until this thread has completed
}
public Task<List<int>> GetAllLists()
{
var intList = new List<int>(){ /*Whatever... */};
var returnList = new List<int>();
Parallel.ForEach(intList, intEntry =>
{
var res = MyMethod().Result;
returnList.AddRange(res);
});
return Task.FromResult(returnList);
}
private async Task<List<int>> MyMethod()
{
var myList = await obtainList.ToListAsync();
return myList;
}
Note the Parallel for Loop calls the async method, but does not await it as it is not async itself.
This is a method that is used elsewhere, so it is valid that it is async. I know one option is to make a copy of this method that is not async, but I am trying to understand what will happen here.
My question is: can I be sure that when I reach secondThread.Wait(); the async part of the execution will be complete? E.g., will Wait know to wait for the async part to complete, or will async mess up the wait, or will they work seamlessly together?
It seems to me it could be possible that as the call to MyMethod is not awaited, but there is an await within MyMethod, the parallel for loop could continue execution before the awaited call has completed?
Then I think, as it is assigning it by reference, then once the assigning takes place, the value will be the correct result.
This leads me to think that as long as the wait will know to wait for the async to complete, then there is no problem - hence my question.
I guess this relates to my lack of understanding of Tasks?
I hope this is clear?

In your code there is no part that is executed asynchronously.
In MainCaller, you start a Task and immediately Wait for it to finish. This is a blocking operation which only introduces the extra overhead of calling GetAllLists in another Task.
In this Task you start a new Task (by calling GetAllLists) but immediately wait for this Task to finish by waiting for its Result (which is also blocking).
In the Task started by GetAllLists you have the Parallel.ForEach loop, which starts several new Tasks. Each of these 'for' Tasks will start another Task by calling MyMethod and immediately wait for its result.
The net result is that your code executes completely synchronously. The only parallelism is introduced in the Parallel.ForEach loop.
Hint: a useful thread concerning this topic: Using async/await for multiple tasks
Additionally your code contains a serious bug:
Each Task created by the Parallel.ForEach loop will eventually add its partial list to returnList by calling AddRange. AddRange is not thread-safe, so you need some synchronisation mechanism (e.g. lock), otherwise returnList may get corrupted or may not contain all the results. See also: Is the List<T>.AddRange() thread safe?
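One possible way to sidestep both the blocking and the AddRange race is to drop Parallel.ForEach and combine the partial lists after Task.WhenAll. This is only a sketch: MyMethodAsync is a hypothetical variant of MyMethod that takes the loop entry as a parameter and fakes the database call with a delay, and GetAllListsSketch is a made-up class name.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class GetAllListsSketch
{
    // Hypothetical stand-in for MyMethod; returns a small list per entry.
    static async Task<List<int>> MyMethodAsync(int entry)
    {
        await Task.Delay(10);
        return new List<int> { entry, entry + 1 };
    }

    public static async Task<List<int>> GetAllListsAsync(IEnumerable<int> intList)
    {
        // Start every call, then wait for all of them without blocking a thread.
        List<int>[] partials = await Task.WhenAll(
            intList.Select(entry => MyMethodAsync(entry)));

        // The merge runs on a single thread after all tasks have completed,
        // so no lock is needed.
        return partials.SelectMany(partial => partial).ToList();
    }
}
```

The caller can then start GetAllListsAsync early, do its other work, and await the returned task at the point where the data is needed.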

How to call an async method from within a loop without awaiting?

Consider this piece of code, where there is some work being done within a for loop, and then a recursive call to process sub items. I wanted to convert DoSomething(item) and GetItems(id) to async methods, but if I await on them here, the for loop is going to wait for each iteration to finish before moving on, essentially losing the benefit of parallel processing. How could I improve the performance of this method? Is it possible to do it using async/await?
public void DoWork(string id)
{
var items = GetItems(id); //takes time
if (items == null)
return;
Parallel.ForEach(items, item =>
{
DoSomething(item); //takes time
DoWork(item.subItemId);
});
}

Instead of using Parallel.ForEach to loop over the items you can create a sequence of tasks and then use Task.WhenAll to wait for them all to complete. As your code also involves recursion it gets slightly more complicated and you need to combine DoSomething and DoWork into a single method which I have aptly named DoIt:
async Task DoWork(String id) {
var items = GetItems(id);
if (items == null)
return;
var tasks = items.Select(DoIt);
await Task.WhenAll(tasks);
}
async Task DoIt(Item item) {
await DoSomething(item);
await DoWork(item.subItemId);
}
Mixing Parallel.ForEach and async/await is a bad idea. Parallel.ForEach will allow your code to execute in parallel and for compute intensive but parallelizable algorithms you get the best performance. However async/await allows your code to execute concurrently and for instance reuse threads that are blocked on IO operations.
Simplified, Parallel.ForEach will set up as many threads as you have CPU cores on your computer and then partition the items you are iterating over to be executed across these threads. So Parallel.ForEach should be used once at the bottom of your call stack, where it will fan out the work to multiple threads and wait for them to complete. Calling Parallel.ForEach in a recursive manner inside each of these threads is just crazy and will not improve performance at all.

How can I wait till the Parallel.ForEach completes

I'm using TPL in my current project and using Parallel.ForEach to spin up many threads. The Task class contains Wait() to wait till the task gets completed. Similarly, how can I wait for the Parallel.ForEach to complete and then go on to execute the next statements?

You don't have to do anything special; Parallel.ForEach() will wait until all its branched tasks are complete. From the calling thread you can treat it as a single synchronous statement and, for instance, wrap it inside a try/catch.
Update:
The old Parallel class methods are not a good fit for async (Task-based) programming. But starting with .NET 6 we can use Parallel.ForEachAsync():
await Parallel.ForEachAsync(items, async (item, cancellationToken) =>
{
await ...
});
There are a few overloads available and the 'body' method should return a ValueTask.

You don't need that with Parallel.ForEach: it only executes the foreach on as many threads as there are processors available, but it returns synchronously.
More information can be found here

As everyone here said - you don't need to wait for it. What I can add from my experience:
If you have an async body to execute and you await some async calls inside, it just runs through the code and does not wait for anything. So I just replaced the await with .Result - then it worked as intended. I couldn't find out why that is, though :/

If you are storing results from the tasks in a List, make sure to use a thread-safe data structure such as ConcurrentBag; otherwise, some results may be missing because of concurrent write issues.
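A small sketch of that advice, assuming a synchronous loop body: results go into a ConcurrentBag, and the bag is read only after Parallel.ForEach has returned. The squaring is a placeholder for real work, and ConcurrentBagSketch is a made-up class name.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ConcurrentBagSketch
{
    public static int[] SquareAll(IEnumerable<int> items)
    {
        var bag = new ConcurrentBag<int>();

        // Parallel.ForEach blocks until every iteration has run, so no
        // extra waiting is needed before reading the bag.
        Parallel.ForEach(items, item =>
        {
            bag.Add(item * item); // safe to call from several threads
        });

        // ConcurrentBag is unordered; sort afterwards if order matters.
        return bag.ToArray();
    }
}
```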

I believe that you can use IsCompleted as follows:
if(Parallel.ForEach(files, f => ProcessFiles(f)).IsCompleted)
{
// DO STUFF
}
