Chaining tasks with continuations and running parallel tasks afterward - C#

The workflow of the parallel tasks
I am hoping to get help with a problem I am facing. I am running parallel tasks to search through folders for files. Each task identifies files and adds them to an array of files. Next, I wait until every task completes so that all files are gathered up, then sort the results. Next, each sorted file is processed independently: one task per file reads through it looking for a matching pattern. The final stage is to aggregate all the results into a human-readable format and display them in a user-friendly way.
The question is how to chain these tasks properly so that they do not block the UI thread. I would also like to be able to cancel everything at whatever stage the program is at.
To sum it up:
Stage 1: Find files by searching through folders. Each task searches recursively through a folder tree.
Stage 2: Sort all the files found and remove duplicates.
Stage 3: Start new tasks to process the files independently. Each task opens a file and searches for a matching pattern.
Stage 4: Aggregate the results from every file search into one giant result set and make it readable for humans.
List<Task> myTasks = new List<Task>();
// ==== stage 1 ======
for (int i = 0; i < 10; i++)
{
    string directoryName = directories[i];
    Task t = new Task(() =>
    {
        FindFiles(directoryName);
    });
    myTasks.Add(t);
    t.Start();
}
// ==== stage 2 ====
Task sortTask = Task.Factory.ContinueWhenAll(myTasks.ToArray(), (t) =>
{
    if (_fileResults.Count > 1)
    {
        // sort the files and remove any duplicates
    }
});
sortTask.Wait();
// ==== stage 3 ====
Task tt = new Task(() =>
{
    Parallel.For(0, _fileResults.Count,
        new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount,
            CancellationToken = token,
            TaskScheduler = _taskScheduler
        },
        (i, loopstate) =>
        {
            // 1. open file
            // 2. read file
            // 3. read file line by line
        });
});
// == stage 4 ===
tt.ContinueWith((t) =>
{
    // 1. aggregate the file results into one giant result set
    // 2. display the giant result set in human readable format
}, token, TaskContinuationOptions.OnlyOnRanToCompletion, TaskScheduler.FromCurrentSynchronizationContext());
tt.Start();

Don't synchronously wait for any of the tasks to finish. If any of those operations need to take place after a previously created task, add that work as a continuation of that task instead.
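For illustration, here is a rough sketch of what that fully chained, non-blocking version could look like, reusing the names from the question (FindFiles, directories, _fileResults and token are assumed to exist):
List<Task> findTasks = directories.Take(10)
    .Select(dir => Task.Run(() => FindFiles(dir), token))
    .ToList();

Task.Factory.ContinueWhenAll(findTasks.ToArray(), _ =>
    {
        // stage 2: sort _fileResults and remove duplicates
    }, token)
    .ContinueWith(_ =>
    {
        // stage 3: scan each file in parallel
        Parallel.For(0, _fileResults.Count,
            new ParallelOptions { CancellationToken = token },
            i => { /* open _fileResults[i] and search for the pattern */ });
    }, token, TaskContinuationOptions.OnlyOnRanToCompletion, TaskScheduler.Default)
    .ContinueWith(_ =>
    {
        // stage 4: aggregate and display on the UI thread
    }, token, TaskContinuationOptions.OnlyOnRanToCompletion,
       TaskScheduler.FromCurrentSynchronizationContext());
Nothing in this chain blocks, and passing the same CancellationToken to every stage lets you cancel the whole pipeline at any point.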

Have you considered using the async/await feature? By the sound of your question, it's a perfect match for your needs. Here's a quick attempt at your problem using it:
try
{
    List<Task<File[]>> stage1Tasks = new List<Task<File[]>>();
    // ==== stage 1 ======
    for (int i = 0; i < 10; i++)
    {
        string directoryName = directories[i];
        Task<File[]> t = Task.Run(() =>
        {
            return FindFiles(directoryName);
        },
        token);
        stage1Tasks.Add(t);
    }
    File[][] files = await Task.WhenAll(stage1Tasks).ConfigureAwait(false);
    // Flatten files.
    File[] _fileResults = files.SelectMany(x => x).ToArray();
    // ==== stage 2 ====
    Task<File[]> sortFilesTask = Task.Run(() =>
    {
        if (_fileResults.Length > 1)
        {
            // sort the files and remove any duplicates
            return _fileResults.Reverse().ToArray();
        }
        return _fileResults;
    },
    token);
    File[] _sortedFileResults = await sortFilesTask.ConfigureAwait(false);
    // ==== stage 3 ====
    Task<SomeResult[]> tt = Task.Run(() =>
    {
        SomeResult[] results = new SomeResult[_sortedFileResults.Length];
        Parallel.For(0, _sortedFileResults.Length,
            new ParallelOptions
            {
                MaxDegreeOfParallelism = Environment.ProcessorCount,
                CancellationToken = token,
                TaskScheduler = _taskScheduler
            },
            i =>
            {
                // 1. open file
                // 2. read file
                // 3. read file line by line
                results[i] = new SomeResult( /* here goes your results for each file */ );
            });
        return results;
    },
    token);
    SomeResult[] theResults = await tt.ConfigureAwait(false);
    // == stage 4 ===
    // 1. aggregate the file results into one giant result set
    // 2. display the giant result set in human readable format
    // ....
}
catch (TaskCanceledException)
{
// some task has been cancelled...
}
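To tie the cancellation story together, here is one hedged example of how this could be kicked off from a UI event handler; RunSearchPipelineAsync is a hypothetical method that wraps the try/catch block above:
private CancellationTokenSource _cts;

private async void SearchButton_Click(object sender, EventArgs e)
{
    _cts = new CancellationTokenSource();
    try
    {
        // _cts.Token is the token passed down to every stage above
        await RunSearchPipelineAsync(_cts.Token);
    }
    catch (OperationCanceledException)
    {
        // the user cancelled somewhere along the pipeline
    }
}

private void CancelButton_Click(object sender, EventArgs e) => _cts?.Cancel();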

Related

Reading millions of small files with C#

I have millions of log files which are generated every day, and I need to read all of them and put them together into a single file in order to do some processing on it in another app.
I'm looking for the fastest way to do this. Currently I'm using threads, tasks and Parallel like this:
Parallel.For(0, files.Length, new ParallelOptions { MaxDegreeOfParallelism = 100 }, i =>
{
ReadFiles(files[i]);
});
void ReadFiles(string file)
{
try
{
var txt = File.ReadAllText(file);
filesTxt.Add(txt);
}
catch { }
GlobalCls.ThreadNo--;
}
or
foreach (var file in files)
{
//Int64 index = i;
//var file = files[index];
while (Process.GetCurrentProcess().Threads.Count > 100)
{
Thread.Sleep(100);
Application.DoEvents();
}
new Thread(() => ReadFiles(file)).Start();
GlobalCls.ThreadNo++;
// Task.Run(() => ReadFiles(file));
}
The problem is that after reading a few thousand files, the reading gets slower and slower!
Any idea why? And what's the fastest approach to reading millions of small files? Thank you.
It seems that you are loading the contents of all files in memory, before writing them back to the single file. This could explain why the process becomes slower over time.
A way to optimize the process is to separate the reading part from the writing part, and do them in parallel. This is called the producer-consumer pattern. It can be implemented with the Parallel class, or with threads, or with tasks, but I will demonstrate instead an implementation based on the powerful TPL Dataflow library, that is particularly suited for jobs like this.
private static async Task MergeFiles(IEnumerable<string> sourceFilePaths,
string targetFilePath, CancellationToken cancellationToken = default,
IProgress<int> progress = null)
{
var readerBlock = new TransformBlock<string, string>(async filePath =>
{
return File.ReadAllText(filePath); // Read the small file
}, new ExecutionDataflowBlockOptions()
{
MaxDegreeOfParallelism = 2, // Reading is parallelizable
BoundedCapacity = 100, // No more than 100 file-paths buffered
CancellationToken = cancellationToken, // Cancel at any time
});
StreamWriter streamWriter = null;
int filesProcessed = 0;
var writerBlock = new ActionBlock<string>(text =>
{
streamWriter.Write(text); // Append to the target file
filesProcessed++;
if (filesProcessed % 10 == 0) progress?.Report(filesProcessed);
}, new ExecutionDataflowBlockOptions()
{
MaxDegreeOfParallelism = 1, // We can't parallelize the writer
BoundedCapacity = 100, // No more than 100 file-contents buffered
CancellationToken = cancellationToken, // Cancel at any time
});
readerBlock.LinkTo(writerBlock,
new DataflowLinkOptions() { PropagateCompletion = true });
// This is a tricky part. We use BoundedCapacity, so we must propagate manually
// a possible failure of the writer to the reader, otherwise a deadlock may occur.
PropagateFailure(writerBlock, readerBlock);
// Open the output stream
using (streamWriter = new StreamWriter(targetFilePath))
{
// Feed the reader with the file paths
foreach (var filePath in sourceFilePaths)
{
var accepted = await readerBlock.SendAsync(filePath,
cancellationToken); // Cancel at any time
if (!accepted) break; // This will happen if the reader fails
}
readerBlock.Complete();
await writerBlock.Completion;
}
async void PropagateFailure(IDataflowBlock block1, IDataflowBlock block2)
{
try { await block1.Completion.ConfigureAwait(false); }
catch (Exception ex)
{
if (block1.Completion.IsCanceled) return; // On cancellation do nothing
block2.Fault(ex);
}
}
}
Usage example:
var cts = new CancellationTokenSource();
var progress = new Progress<int>(value =>
{
// Safe to update the UI
Console.WriteLine($"Files processed: {value:#,0}");
});
var sourceFilePaths = Directory.EnumerateFiles(@"C:\SourceFolder", "*.log",
SearchOption.AllDirectories); // Include subdirectories
await MergeFiles(sourceFilePaths, @"C:\AllLogs.log", cts.Token, progress);
The BoundedCapacity is used to keep the memory usage under control.
If the disk drive is SSD, you can try reading with a MaxDegreeOfParallelism larger than 2.
For best performance you could consider writing to a different disc drive than the drive containing the source files.
The TPL Dataflow library is available as a NuGet package (System.Threading.Tasks.Dataflow) for .NET Framework, and is built into .NET Core.
When it comes to I/O operations, CPU parallelism is useless. Your I/O device (disk, network, whatever) is your bottleneck. By reading from the device concurrently you even risk lowering your performance.
Perhaps you can just use PowerShell to concatenate the files, such as in this answer.
Another alternative is to write a program that uses the FileSystemWatcher class to watch for new files and append them as they are created.
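If you go that route, a minimal sketch of the FileSystemWatcher idea might look like this (the paths are placeholders, and real code would need retry logic for files that are still being written):
var watcher = new FileSystemWatcher(@"C:\SourceFolder", "*.log")
{
    IncludeSubdirectories = true,
    EnableRaisingEvents = true
};
watcher.Created += (sender, e) =>
{
    // Append the newly created file to the merged log as soon as it appears.
    File.AppendAllText(@"C:\AllLogs.log", File.ReadAllText(e.FullPath));
};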

Await async call in for-each-loop [duplicate]

This question already has answers here:
Using async/await for multiple tasks
(8 answers)
Closed 4 years ago.
I have a method in which I'm retrieving a list of deployments. For each deployment I want to retrieve an associated release. Because all calls are made to an external API, I now have a foreach-loop in which those calls are made.
public static async Task<List<Deployment>> GetDeployments()
{
try
{
var depjson = await GetJson($"{BASEURL}release/deployments?deploymentStatus=succeeded&definitionId=2&definitionEnvironmentId=5&minStartedTime={MinDateTime}");
var deployments = (JsonConvert.DeserializeObject<DeploymentWrapper>(depjson))?.Value?.OrderByDescending(x => x.DeployedOn)?.ToList();
foreach (var deployment in deployments)
{
var reljson = await GetJson($"{BASEURL}release/releases/{deployment.ReleaseId}");
deployment.Release = JsonConvert.DeserializeObject<Release>(reljson);
}
return deployments;
}
catch (Exception)
{
throw;
}
}
This all works perfectly fine. However, I do not like the await in the foreach-loop at all. I also believe this is not considered good practice. I just don't see how to refactor this so the calls are made in parallel, because the result of each call is used to set a property of the deployment.
I would appreciate any suggestions on how to make this method faster and, whenever possible, avoid the await-ing in the foreach-loop.
There is nothing wrong with what you are doing now. But there is a way to start all tasks at once instead of waiting for a single task, processing it, and then waiting for another one.
This is how you can turn this:
wait for one -> process -> wait for one -> process ...
into
wait for all -> process -> done
Convert this:
foreach (var deployment in deployments)
{
var reljson = await GetJson($"{BASEURL}release/releases/{deployment.ReleaseId}");
deployment.Release = JsonConvert.DeserializeObject<Release>(reljson);
}
To:
var deplTasks = deployments.Select(d => GetJson($"{BASEURL}release/releases/{d.ReleaseId}"));
var reljsons = await Task.WhenAll(deplTasks);
for(var index = 0; index < deployments.Count; index++)
{
deployments[index].Release = JsonConvert.DeserializeObject<Release>(reljsons[index]);
}
First you take a list of unfinished tasks. Then you await it and you get a collection of results (reljson's). Then you have to deserialize them and assign to Release.
By using await Task.WhenAll() you wait for all the tasks at the same time, so you should see a performance boost from that.
Let me know if there are typos, I didn't compile this code.
Fcin suggested starting all tasks, awaiting until they have all finished, and only then deserializing the fetched data.
However, if the first task has already finished while the second task has not (because it is internally still awaiting), the first task could already start deserializing. This would shorten the time that your process is idly waiting.
So instead of:
var deplTasks = deployments.Select(d => GetJson($"{BASEURL}release/releases/{d.ReleaseId}"));
var reljsons = await Task.WhenAll(deplTasks);
for(var index = 0; index < deployments.Count; index++)
{
deployments[index].Release = JsonConvert.DeserializeObject<Release>(reljsons[index]);
}
I'd suggest the following slight change:
// async fetch the Release data of Deployment:
private async Task<Release> FetchReleaseDataAsync(Deployment deployment)
{
var reljson = await GetJson($"{BASEURL}release/releases/{deployment.ReleaseId}");
return JsonConvert.DeserializeObject<Release>(reljson);
}
// async fill the Release data of Deployment:
private async Task FillReleaseDataAsync(Deployment deployment)
{
deployment.Release = await FetchReleaseDataAsync(deployment);
}
Then your procedure is similar to the solution that Fcin suggested:
IEnumerable<Task> tasksFillDeploymentWithReleaseData = deployments
    .Select(deployment => FillReleaseDataAsync(deployment))
    .ToList();
await Task.WhenAll(tasksFillDeploymentWithReleaseData);
Now if the first task has to wait while fetching the release data, the 2nd task begins, then the third, etc. If the first task has already finished fetching its release data while the other tasks are still awaiting theirs, the first task starts deserializing and assigns the result to deployment.Release, after which the first task is complete.
If, for instance, the 7th task has got its data but the 2nd task is still waiting, the 7th task can deserialize and assign the data to deployment.Release. Task 7 is then completed.
This continues until all tasks are completed. Using this method there is less waiting time, because as soon as one task has its data it is scheduled to start deserializing.
If I understand you right and you want to make the var reljson = await GetJson call parallel:
Try this:
Parallel.ForEach(deployments, async (deployment) =>
{
    // note: Parallel.ForEach does not await async lambdas, so the body runs fire-and-forget
    var reljson = await GetJson($"{BASEURL}release/releases/{deployment.ReleaseId}");
    deployment.Release = JsonConvert.DeserializeObject<Release>(reljson);
});
you might limit the number of parallel executions such as:
Parallel.ForEach(
    deployments,
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    async (deployment) =>
    {
        var reljson = await GetJson($"{BASEURL}release/releases/{deployment.ReleaseId}");
        deployment.Release = JsonConvert.DeserializeObject<Release>(reljson);
    });
you might also want to be able to break the loop:
Parallel.ForEach(deployments, async (deployment, state) =>
{
    var reljson = await GetJson($"{BASEURL}release/releases/{deployment.ReleaseId}");
    deployment.Release = JsonConvert.DeserializeObject<Release>(reljson);
    if (noFurtherProcessingRequired) state.Break();
});
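As a side note, on .NET 6 or newer there is Parallel.ForEachAsync, which (unlike Parallel.ForEach) actually awaits the async body, so a sketch like the following avoids the fire-and-forget problem:
await Parallel.ForEachAsync(
    deployments,
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    async (deployment, ct) =>
    {
        var reljson = await GetJson($"{BASEURL}release/releases/{deployment.ReleaseId}");
        deployment.Release = JsonConvert.DeserializeObject<Release>(reljson);
    });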

Starting Multiple Async Tasks and Process Them As They Complete (C#)

So I am trying to learn how to write asynchronous methods and have been banging my head trying to get asynchronous calls to work. What always seems to happen is that the code hangs on the "await" instruction until it eventually seems to time out and crashes the loading form in the same method along with it.
There are two main reason this is strange:
The code works flawlessly when not asynchronous and just a simple loop
I copied the MSDN code almost verbatim to convert the code to asynchronous calls here: https://msdn.microsoft.com/en-us/library/mt674889.aspx
I know there are a lot of questions already about this on the forums, but I have gone through most of them and tried a lot of other approaches (with the same result), and now I think something is fundamentally wrong since even the MSDN code isn't working.
Here is the main method that is called by a background worker:
// this method loads data from each individual webPage
async Task LoadSymbolData(DoWorkEventArgs _e)
{
int MAX_THREADS = 10;
int tskCntrTtl = dataGridView1.Rows.Count;
Dictionary<string, string> newData_d = new Dictionary<string, string>(tskCntrTtl);
// we need to make copies of things that can change in a different thread
List<string> links = new List<string>(dataGridView1.Rows.Cast<DataGridViewRow>()
.Select(r => r.Cells[dbIndexs_s.url].Value.ToString()).ToList());
List<string> symbols = new List<string>(dataGridView1.Rows.Cast<DataGridViewRow>()
.Select(r => r.Cells[dbIndexs_s.symbol].Value.ToString()).ToList());
// we need to create a cancelation token once this is working
// TODO
using (LoadScreen loadScreen = new LoadScreen("Querying stock servers..."))
{
// we can't use the delegate because of async keywords
this.loaderScreens.Add(loadScreen);
// wait until the form is loaded so we dont get exceptions when writing to controls on that form
while ( !loadScreen.IsLoaded() );
// load the total number of operations so we can simplify incrementing the progress bar
// on separate form instances
loadScreen.LoadProgressCntr(0, tskCntrTtl);
// try to run all async tasks since they are non-blocking threaded operations
for (int i = 0; i < tskCntrTtl; i += MAX_THREADS)
{
List<Task<string[]>> ProcessURL = new List<Task<string[]>>();
List<int> taskList = new List<int>();
// Make a list of task indexs
for (int task = i; task < i + MAX_THREADS && task < tskCntrTtl; task++)
taskList.Add(task);
// ***Create a query that, when executed, returns a collection of tasks.
IEnumerable<Task<string[]>> downloadTasksQuery =
from task in taskList select QueryHtml(loadScreen, links[task], symbols[task]);
// ***Use ToList to execute the query and start the tasks.
List<Task<string[]>> downloadTasks = downloadTasksQuery.ToList();
// ***Add a loop to process the tasks one at a time until none remain.
while (downloadTasks.Count > 0)
{
// Identify the first task that completes.
Task<string[]> firstFinishedTask = await Task.WhenAny(downloadTasks); // <---- CODE HANGS HERE
// ***Remove the selected task from the list so that you don't
// process it more than once.
downloadTasks.Remove(firstFinishedTask);
// Await the completed task.
string[] data = await firstFinishedTask;
if (!newData_d.ContainsKey(data.First()))
newData_d.Add(data.First(), data.Last());
}
}
// now we have the dictionary with all the information gathered from the websites
// now we can add the columns if they dont already exist and load the information
// TODO
loadScreen.UpdateProgress(100);
this.loaderScreens.Remove(loadScreen);
}
}
And here is the async method for querying web pages:
async Task<string[]> QueryHtml(LoadScreen _loadScreen, string _link, string _symbol)
{
string data = String.Empty;
try
{
HttpClient client = new HttpClient();
var doc = new HtmlAgilityPack.HtmlDocument();
var html = await client.GetStringAsync(_link); // <---- CODE HANGS HERE
doc.LoadHtml(html);
string percGrn = doc.FindInnerHtml(
"//span[contains(@class,'time_rtq_content') and contains(@class,'up_g')]//span[2]");
string percRed = doc.FindInnerHtml(
"//span[contains(@class,'time_rtq_content') and contains(@class,'down_r')]//span[2]");
// create something we'll understand later
if ((String.IsNullOrEmpty(percGrn) && String.IsNullOrEmpty(percRed)) ||
(!String.IsNullOrEmpty(percGrn) && !String.IsNullOrEmpty(percRed)))
throw new Exception();
// adding string to empty gives string
string perc = percGrn + percRed;
bool isNegative = String.IsNullOrEmpty(percGrn);
double percDouble;
if (double.TryParse(Regex.Match(perc, @"\d+([.])?(\d+)?").Value, out percDouble))
data = (isNegative ? 0 - percDouble : percDouble).ToString();
}
catch (Exception ex) { }
finally
{
// update the progress bar...
_loadScreen.IncProgressCntr();
}
return new string[] { _symbol, data };
}
I could really use some help. Thanks!
In short, when you combine async with 'regular' blocking task functions you get a deadlock:
http://olitee.com/2015/01/c-async-await-common-deadlock-scenario/
The solution is to use ConfigureAwait:
var html = await client.GetStringAsync(_link).ConfigureAwait(false);
The reason you need this is because you didn't await your original thread.
// ***Create a query that, when executed, returns a collection of tasks.
IEnumerable<Task<string[]>> downloadTasksQuery = from task in taskList select QueryHtml(loadScreen,links[task], symbols[task]);
What's happening here is that you mix the await paradigm with the regular task-handling paradigm, and those don't mix (or rather, you have to use ConfigureAwait(false) for this to work).
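For reference, the "common deadlock scenario" from the linked article boils down to something like this (a sketch, not the OP's exact code):
private void Button_Click(object sender, EventArgs e)
{
    // Blocking on an async method from the UI thread...
    var html = DownloadAsync().Result;
}

private async Task<string> DownloadAsync()
{
    using (var client = new HttpClient())
    {
        // ...deadlocks, because this await wants to resume on the blocked UI thread.
        // Adding .ConfigureAwait(false) here lets it resume on a thread-pool thread instead.
        return await client.GetStringAsync("http://example.com");
    }
}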

Executing N number of threads in parallel and in a sequential manner

I have an application where I have 1000+ small parts of 1 large file.
I have to upload a maximum of 16 parts at a time.
I used the Task Parallel Library of .NET.
I used Parallel.For to divide the work into multiple parts, assigned one method to be executed for each part, and set the degree of parallelism to 16.
I need to execute one method with the checksum values that are generated by the different part uploads, so I have to set up a mechanism where I wait for all part uploads (say 1000) to complete.
One issue I am facing with the TPL is that it randomly executes any 16 of the 1000 threads.
I want some mechanism by which I can run the first 16 threads initially, and when the 1st, 2nd, or any of those 16 threads completes its task, the 17th part should be started.
How can I achieve this?
One possible candidate for this is TPL Dataflow. This is a demonstration which takes in a stream of integers and prints them out to the console. You set MaxDegreeOfParallelism to however many threads you wish to run in parallel:
void Main()
{
var actionBlock = new ActionBlock<int>(
i => Console.WriteLine(i),
new ExecutionDataflowBlockOptions {MaxDegreeOfParallelism = 16});
foreach (var i in Enumerable.Range(0, 200))
{
actionBlock.Post(i);
}
}
This can also scale well if you want to have multiple producer/consumers.
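One detail worth adding to the demo above: if you need to know when all posted items have been processed, complete the block and wait on its Completion task, roughly like this:
actionBlock.Complete();          // signal that no more items will be posted
actionBlock.Completion.Wait();   // or: await actionBlock.Completion;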
Here is the manual way of doing this.
You need a queue. The queue is a sequence of pending items. You dequeue items and put them inside a list of working tasks. Whenever a task is done, remove it from the list of working tasks and take another item from the queue. The main thread controls this process. Here is a sample of how to do this.
For the test I used a list of integers, but it should work for other types because it uses generics.
private static void Main()
{
Random r = new Random();
var items = Enumerable.Range(0, 100).Select(x => r.Next(100, 200)).ToList();
ParallelQueue(items, DoWork);
}
private static void ParallelQueue<T>(List<T> items, Action<T> action)
{
Queue<T> pending = new Queue<T>(items);
List<Task> working = new List<Task>();
while (pending.Count + working.Count != 0)
{
if (pending.Count != 0 && working.Count < 16) // Maximum tasks
{
var item = pending.Dequeue(); // get item from queue
working.Add(Task.Run(() => action(item))); // run task
}
else
{
Task.WaitAny(working.ToArray());
working.RemoveAll(x => x.IsCompleted); // remove finished tasks
}
}
}
private static void DoWork(int i) // do your work here.
{
// this is just an example
Task.Delay(i).Wait();
Console.WriteLine(i);
}
Please let me know if you run into problems implementing DoWork for yourself, because if you change the method signature you may need to make some changes.
Update
You can also do this with async await without blocking the main thread.
private static void Main()
{
Random r = new Random();
var items = Enumerable.Range(0, 100).Select(x => r.Next(100, 200)).ToList();
Task t = ParallelQueue(items, DoWork);
// able to do other things.
t.Wait();
}
private static async Task ParallelQueue<T>(List<T> items, Func<T, Task> func)
{
Queue<T> pending = new Queue<T>(items);
List<Task> working = new List<Task>();
while (pending.Count + working.Count != 0)
{
if (working.Count < 16 && pending.Count != 0)
{
var item = pending.Dequeue();
working.Add(Task.Run(async () => await func(item)));
}
else
{
await Task.WhenAny(working);
working.RemoveAll(x => x.IsCompleted);
}
}
}
private static async Task DoWork(int i)
{
await Task.Delay(i);
}
var workitems = ... /*e.g. Enumerable.Range(0, 1000000)*/;
SingleItemPartitioner.Create(workitems)
.AsParallel()
.AsOrdered()
.WithDegreeOfParallelism(16)
.WithMergeOptions(ParallelMergeOptions.NotBuffered)
.ForAll(i => { Thread.Sleep(1000); Console.WriteLine(i); });
This should be all you need. I forgot how the methods are named exactly... Look at the documentation.
Test this by printing to the console after sleeping for 1sec (which this sample code does).
Another option would be to use a BlockingCollection<T> as a queue between your file reader thread and your 16 uploader threads. Each uploader thread would just loop around consuming the blocking collection until it is complete.
And, if you want to limit memory consumption in the queue you can set an upper limit on the blocking collection such that the file-reader thread will pause when the buffer has reached capacity. This is particularly useful in a server environment where you may need to limit memory used per user/API call.
// Create a buffer of 4 chunks between the file reader and the senders
BlockingCollection<Chunk> queue = new BlockingCollection<Chunk>(4);
// Create a cancellation token source so you can stop this gracefully
CancellationTokenSource cts = ...
File reader thread
...
queue.Add(chunk, cts.Token);
...
queue.CompleteAdding();
Sending threads
for(int i = 0; i < 16; i++)
{
Task.Run(() => {
foreach (var chunk in queue.GetConsumingEnumerable(cts.Token))
{
.. do the upload
}
});
}
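Put together, a rough end-to-end sketch could look like the following (Chunk, ReadChunks, largeFilePath and Upload are hypothetical placeholders for your own types and upload logic):
var queue = new BlockingCollection<Chunk>(4);
var cts = new CancellationTokenSource();

// Producer: read parts of the large file and feed the queue.
var producer = Task.Run(() =>
{
    foreach (var chunk in ReadChunks(largeFilePath)) // hypothetical reader
        queue.Add(chunk, cts.Token);
    queue.CompleteAdding();
});

// Consumers: 16 uploader tasks drain the queue until CompleteAdding is called.
var consumers = Enumerable.Range(0, 16)
    .Select(_ => Task.Run(() =>
    {
        foreach (var chunk in queue.GetConsumingEnumerable(cts.Token))
            Upload(chunk); // hypothetical upload of one part
    }))
    .ToArray();

Task.WaitAll(consumers); // or await Task.WhenAll(consumers) in async code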

Async/await tasks and WaitHandle

Say I have 10N items (I need to fetch them via HTTP). In the code, N tasks are started to get the data, and each task fetches 10 items in sequence. I put the items in a ConcurrentQueue<Item>. After that, the items are processed in a thread-unsafe method one by one.
async Task<Item> GetItemAsync()
{
//fetch one item from the internet
}
async Task DoWork()
{
var tasks = new List<Task>();
var items = new ConcurrentQueue<Item>();
var handles = new List<ManualResetEvent>();
for i 1 -> N
{
var handle = new ManualResetEvent(false);
handles.Add(handle);
tasks.Add(Task.Factory.StartNew(async delegate
{
for j 1 -> 10
{
var item = await GetItemAsync();
items.Enqueue(item);
}
handle.Set();
});
}
//begin to process the items when any handle is set
WaitHandle.WaitAny(handles);
while(true)
{
if (all handles are set && items collection is empty) //***
break;
//in another word: all tasks are really completed
while(items.TryDequeue(out item))
{
AThreadUnsafeMethod(item); //process items one by one
}
}
}
I don't know what condition can be placed in the if statement marked ***. I can't use the Task.IsCompleted property here, because I use await in the task, so the task appears completed very soon. And a bool[] that indicates whether each task has executed to the end looks really ugly, because I think ManualResetEvent can do the same work. Can anyone give me a suggestion?
Well, you could build this yourself, but I think it's tons easier with TPL Dataflow.
Something like:
static async Task DoWork()
{
// By default, ActionBlock uses MaxDegreeOfParallelism == 1,
// so AThreadUnsafeMethod is not called in parallel.
var block = new ActionBlock<Item>(AThreadUnsafeMethod);
// Start off N tasks, each asynchronously acquiring 10 items.
// Each item is sent to the block as it is received.
var tasks = Enumerable.Range(0, N).Select(_ => Task.Run(
    async () =>
    {
        for (int i = 0; i != 10; ++i)
            block.Post(await GetItemAsync());
    })).ToArray();
// Complete the block when all tasks have completed.
Task.WhenAll(tasks).ContinueWith(_ => { block.Complete(); });
// Wait for the block to complete.
await block.Completion;
}
You can do a WaitOne with a timeout of zero to check the state. Something like this should work:
if (handles.All(handle => handle.WaitOne(TimeSpan.Zero)) && !items.Any())
break;
http://msdn.microsoft.com/en-us/library/cc190477.aspx
Thanks all. In the end I found that CountdownEvent is very suitable for this scenario. The general implementation looks like this (for others' information):
for i 1 -> N
{
//start N tasks
//invoke CountdownEvent.Signal() at the end of each task
}
//see if CountdownEvent.IsSet here
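For reference, a concrete sketch of that CountdownEvent idea, reusing GetItemAsync, N, items and AThreadUnsafeMethod from the question:
var countdown = new CountdownEvent(N);
for (int i = 0; i < N; i++)
{
    Task.Run(async () =>
    {
        for (int j = 0; j < 10; j++)
            items.Enqueue(await GetItemAsync());
        countdown.Signal(); // this producer task has really finished
    });
}

// Process items until every producer has signalled and the queue is drained.
while (!(countdown.IsSet && items.IsEmpty))
{
    while (items.TryDequeue(out var item))
        AThreadUnsafeMethod(item);
}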
