Asynchronous foreach - c#

Is there a way to do an asynchronous foreach in C#,
where the ids are processed asynchronously by the method, instead of using Parallel.ForEach?
//This Gets all the ID(s)-IEnumerable <int>
var clientIds = new Clients().GetAllClientIds();
Parallel.ForEach(clientIds, ProcessId); //runs the method in parallel
static void ProcessId(int id)
{
// just process the id
}
It should be something like a foreach, but one that runs asynchronously:
foreach(var id in clientIds)
{
ProcessId(id); //runs the method with each Id asynchronously??
}
I'm trying to run the program as a console application; it should wait for all ids to finish processing before closing the console.

No, it is not really possible.
Instead, inside the foreach loop, add each piece of work as a Task to a collection of tasks, and afterwards wait for all of them with Task.WhenAll.
var tasks = new List<Task>();
foreach(var something in somethings)
tasks.Add(DoJobAsync(something));
await Task.WhenAll(tasks);
Note that the DoJobAsync method should return a Task.
Update:
If your method does not return a Task but something else (e.g. void), you have two options, which are essentially the same:
1. Add Task.Run(action) to the tasks collection instead:
tasks.Add(Task.Run(() => DoJob(something)));
2. Wrap your synchronous method in a method that returns a Task:
private Task DoJobAsync(Something something)
{
return Task.Run(() => DoJob(something));
}
You can also use the generic Task<TResult> if you want to receive results from the task execution.
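The pattern above can be put together into a small console sketch. It assumes a hypothetical synchronous ProcessId that returns a value (here it just doubles the id) and an async Main, which needs C# 7.1 or later; neither comes from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical synchronous worker standing in for the question's ProcessId.
    static int ProcessId(int id) => id * 2;

    public static async Task<int[]> ProcessAllAsync(IEnumerable<int> ids)
    {
        // Task.Run queues each call on the thread pool and returns an already-started Task<int>.
        IEnumerable<Task<int>> tasks = ids.Select(id => Task.Run(() => ProcessId(id)));

        // WhenAll completes once every task has finished; results are in the same order as the tasks.
        return await Task.WhenAll(tasks);
    }

    public static async Task Main()
    {
        // The console stays open until all ids are processed because Main awaits the combined task.
        int[] results = await ProcessAllAsync(new[] { 1, 2, 3 });
        Console.WriteLine(string.Join(", ", results)); // prints 2, 4, 6
    }
}
```

Returning Task<int[]> rather than Task is only needed when the results matter; for fire-and-wait work, a plain Task is enough.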

Your target method would have to return a Task:
static async Task ProcessId(int id)
{
// just process the id, awaiting the asynchronous work here
}
Processing the ids would then be done like this:
// This Gets all the ID(s)-IEnumerable <int>
var clientIds = new Clients().GetAllClientIds();
// This gets all the tasks to be executed
var tasks = clientIds.Select(id => ProcessId(id));
// this will create a task that will complete when all of the `Task`
// objects in an enumerable collection have completed.
await Task.WhenAll(tasks);

As of .NET 6 there is a built-in Parallel.ForEachAsync method.
See:
https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.parallel.foreachasync?view=net-6.0
https://www.hanselman.com/blog/parallelforeachasync-in-net-6
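A minimal sketch of how Parallel.ForEachAsync might be used for the id-processing case, assuming .NET 6 or later. ProcessIdAsync is a hypothetical worker (Task.Delay stands in for real I/O), and the limit of 4 is an arbitrary illustrative value.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncDemo
{
    // Hypothetical asynchronous worker; Task.Delay stands in for real I/O.
    static async Task<int> ProcessIdAsync(int id, CancellationToken ct)
    {
        await Task.Delay(10, ct);
        return id * 2;
    }

    public static async Task<int[]> RunAsync(int[] clientIds)
    {
        // Bodies run concurrently, so results go into a thread-safe collection.
        var results = new ConcurrentBag<int>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };

        // The returned task completes when every id has been processed.
        await Parallel.ForEachAsync(clientIds, options, async (id, ct) =>
        {
            results.Add(await ProcessIdAsync(id, ct));
        });

        // ConcurrentBag has no defined order, so sort before returning.
        return results.OrderBy(x => x).ToArray();
    }
}
```

Unlike the Task.WhenAll approach, this caps how many bodies run at once through MaxDegreeOfParallelism.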

Related

Task List with parameters

I'm needing to create a list of tasks to execute a routine that takes one parameter and then wait for those tasks to complete before continuing with the rest of the program code. Here is an example:
List<Task> tasks = new List<Task>();
foreach (string URL in LIST_URL_COLLECTION)
{
tasks[i] = Task.Factory.StartNew(
GoToURL(URL)
);
}
//wait for them to finish
Console.WriteLine("Done");
I have googled and searched this site but I just keep hitting a dead end. I did this once but can't remember how.
The Task Parallel Library exposes a convenient way to asynchronously wait for the completion of all tasks via the Task.WhenAll method. The method itself returns a Task, which is awaitable and should be awaited:
public async Task QueryUrlsAsync()
{
var urlFetchingTasks = ListUrlCollection.Select(url => Task.Run(() => GoToURL(url)));
await Task.WhenAll(urlFetchingTasks);
Console.WriteLine("Done");
}
Note that in order to use await, your method must be marked with the async modifier in its signature and return either a Task (if it has no return value) or a Task<T> (if it does have a return value, whose type is T).
As a side note, your method looks like it's fetching URLs, which I assume means issuing a web request to some endpoint. For that, there's no need to use extra threads via Task.Factory.StartNew or Task.Run, as these operations are naturally asynchronous. You should look into HttpClient as a starting point. For example, your method could look like this:
public async Task QueryUrlsAsync()
{
var urlFetchingTasks = ListUrlCollection.Select(url =>
{
var httpClient = new HttpClient();
return httpClient.GetAsync(url);
});
await Task.WhenAll(urlFetchingTasks);
Console.WriteLine("Done");
}

Do I create a deadlock for Task.WhenAll()

I seem to be experiencing a deadlock with the following code, but I do not understand why.
From a certain point in code I call this method.
public async Task<SearchResult> Search(SearchData searchData)
{
var tasks = new List<Task<FolderResult>>();
using (var serviceClient = new Service.ServiceClient())
{
foreach (var result in MethodThatCallsWebservice(serviceClient, config, searchData))
tasks.Add(result);
return await GetResult(tasks);
}
Where GetResult is as following:
private static async Task<SearchResult> GetResult(IEnumerable<Task<FolderResult>> tasks)
{
var result = new SearchResult();
await Task.WhenAll(tasks).ConfigureAwait(false);
foreach (var taskResult in tasks.Select(p => p.MyResult))
{
foreach (var folder in taskResult.Result)
{
// Do stuff to fill result
}
}
return result;
}
The line var result = new SearchResult(); never completes, though the GUI is responsive because of the following code:
public async void DisplaySearchResult(Task<SearchResult> searchResult)
{
var result = await searchResult;
FillResultView(result);
}
This method is called via an event handler that called the Search method.
_view.Search += (sender, args) => _view.DisplaySearchResult(_model.Search(args.Value));
The first line of DisplaySearchResult gets called, which follows the path down to the GetResult method with the Task.WhenAll(...) part.
Why isn't the Task.WhenAll(...) ever completed? Did I not understand the use of await correctly?
If I run the tasks synchronously, I do get the result but then the GUI freezes:
foreach (var task in tasks)
task.RunSynchronously();
I read various solutions, but most were in combination with Task.WaitAll() and therefore did not help much. I also tried to use the help from this blogpost as you can see in DisplaySearchResult but I failed to get it to work.
Update 1:
The method MethodThatCallsWebservice:
private IEnumerable<Task<FolderResult>> MethodThatCallsWebservice(ServiceClient serviceClient, SearchData searchData)
{
// Doing stuff here to determine keys
foreach(var key in keys)
yield return new Task<FolderResult>(() => new FolderResult(key, serviceClient.GetStuff(input))); // NOTE: This is not the async variant
}
Since you have an asynchronous version of GetStuff (GetStuffAsync), it's much better to use it than to offload the synchronous GetStuff to a ThreadPool thread with Task.Run. Offloading blocking calls that way wastes threads and limits scalability.
async methods return a "hot" task so you don't need to call Start:
IEnumerable<Task<FolderResult>> MethodThatCallsWebservice(ServiceClient serviceClient, SearchData searchData)
{
return keys.Select(async key =>
new FolderResult(key, await serviceClient.GetStuffAsync(input)));
}
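You need to start your tasks before you return them. A task created with the Task constructor is never scheduled until Start is called on it, so Task.WhenAll keeps waiting for it forever. Or, even better, use Task.Run.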
This:
yield return new Task<FolderResult>(() =>
new FolderResult(key, serviceClient.GetStuff(input)))
// NOTE: This is not the async variant
Is better written as:
yield return Task.Run<FolderResult>(() =>
new FolderResult(key, serviceClient.GetStuff(input)));
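The difference between the two forms can be observed through Task.Status. The sketch below is only an illustration with a trivial delegate (returning 42) in place of the web-service call.

```csharp
using System;
using System.Threading.Tasks;

public static class ColdTaskDemo
{
    public static (TaskStatus Cold, TaskStatus Hot) Statuses()
    {
        // Created with the constructor: the task is "cold" and is not scheduled.
        var cold = new Task<int>(() => 42);

        // Created with Task.Run: the task is queued to the thread pool immediately.
        var hot = Task.Run(() => 42);
        hot.Wait();

        // cold is still TaskStatus.Created; hot is TaskStatus.RanToCompletion.
        return (cold.Status, hot.Status);
    }

    public static int StartColdTask()
    {
        var cold = new Task<int>(() => 42);
        cold.Start();        // without this call, Result would block forever
        return cold.Result;  // 42
    }
}
```

Awaiting Task.WhenAll over tasks that are still in the Created state never completes, which matches the hang described in the question.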

Check if at least one thread is completed

First of all I am totally new to threading in C#. I have created multiple threads as shown below.
if (flag)
{
foreach (string empNo in empList)
{
Thread thrd = new Thread(()=>ComputeSalary(empNo));
threadList.Add(thrd);
thrd.Start();
}
}
Before proceeding further I need to check whether at least one thread has completed its execution, so that I can perform additional operations.
I also tried adding each thread to a list of type Thread so that I could check whether at least one of them has completed. I tried thrd.IsAlive, but it always gives me the status of the current thread.
Is there any other way to check whether at least one thread has completed its execution?
You can use AutoResetEvent.
var reset = new AutoResetEvent(false); // ComputeSalary should have access to reset
.....
....
if (flag)
{
foreach (string empNo in empList)
{
Thread thrd = new Thread(()=>ComputeSalary(empNo));
threadList.Add(thrd);
thrd.Start();
}
reset.WaitOne();
}
.....
.....
void ComputeSalary(string empNo)
{
.....
reset.Set(); // signal that this thread has finished
}
Other options are a callback function, an event, or a flag/counter (the last is not advised).
Here is a solution based on the Task Parallel Library:
// Create a list of tasks for each string in empList
List<Task> empTaskList = empList.Select(emp => Task.Run(() => ComputeSalary(emp)))
.ToList();
// Give me the task that finished first.
var firstFinishedTask = await Task.WhenAny(empTaskList);
A couple of things to note:
In order to use await inside your method, you will have to declare it as async Task or async Task<T>, where T is the desired return type.
Task.Run is your equivalent of new Thread().Start(). The difference is that Task.Run will use the ThreadPool (unless you explicitly tell it not to), whereas the Thread class will construct an entirely new thread.
Notice the use of await. This tells the compiler to yield control back to the caller until Task.WhenAny returns the first task that finished.
You should read more about async-await here
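As a self-contained illustration of the Task.WhenAny approach, the sketch below uses a hypothetical ComputeSalary that sleeps for a given time and returns the employee number; the delays and employee numbers are made up so that one task clearly finishes first.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class WhenAnyDemo
{
    // Hypothetical stand-in for ComputeSalary: blocks for delayMs, then returns the employee number.
    static string ComputeSalary(string empNo, int delayMs)
    {
        Thread.Sleep(delayMs);
        return empNo;
    }

    public static async Task<string> FirstFinishedAsync()
    {
        var tasks = new List<Task<string>>
        {
            Task.Run(() => ComputeSalary("E1", 1000)), // slow
            Task.Run(() => ComputeSalary("E2", 10)),   // fast
        };

        // WhenAny returns the task that completed first, not its result.
        Task<string> first = await Task.WhenAny(tasks);

        // Awaiting the returned task yields its result and rethrows if it faulted.
        return await first;
    }
}
```

The remaining tasks keep running after WhenAny returns; they can still be awaited later with Task.WhenAll if their results are needed.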

async i/o and process results as they become available

I have a simple console app where I want to call many URLs in a loop and put the results in a database table. I am using .NET 4.5 and async I/O to fetch the URL data. Here is a simplified version of what I am doing. All methods are async except for the database operation. Do you see any issues with this? Are there better ways to optimize it?
private async Task Run(){
var items = repo.GetItems(); // sync method to get list from database
var tasks = new List<Task>();
// add each call to task list and process result as it becomes available
// rather than waiting for all downloads
foreach(Item item in items){
tasks.Add(GetFromWeb(item.url).ContinueWith(response => { AddToDatabase(response.Result);}));
}
await Task.WhenAll(tasks); // wait for all tasks to complete.
}
private async Task<string> GetFromWeb(url) {
HttpResponseMessage response = await GetAsync(url);
return await response.Content.ReadAsStringAsync();
}
private void AddToDatabase(string item){
// add data to database.
}
Your solution is acceptable. But you should check out TPL Dataflow, which allows you to set up a dataflow "mesh" (or "pipeline") and then shove the data through it.
For a problem this simple, Dataflow won't really add much other than getting rid of the ContinueWith (I always find manual continuations awkward). But if you plan to add more steps or change your data flow in the future, Dataflow should be something you consider.
Your solution is pretty much correct, with just two minor mistakes (both of which cause compiler errors). First, you can't call ContinueWith on the result of List.Add; you need to call ContinueWith on the task and then add the continuation to your list, which is solved by just moving a parenthesis. You also need to call Result on the response Task.
Here is the section with the two minor changes:
tasks.Add(GetFromWeb(item.url)
.ContinueWith(response => { AddToDatabase(response.Result);}));
Another option is to leverage a method that takes a sequence of tasks and orders them by the order that they are completed. Here is my implementation of such a method:
public static IEnumerable<Task<T>> Order<T>(this IEnumerable<Task<T>> tasks)
{
var taskList = tasks.ToList();
var taskSources = new BlockingCollection<TaskCompletionSource<T>>();
var taskSourceList = new List<TaskCompletionSource<T>>(taskList.Count);
foreach (var task in taskList)
{
var newSource = new TaskCompletionSource<T>();
taskSources.Add(newSource);
taskSourceList.Add(newSource);
task.ContinueWith(t =>
{
var source = taskSources.Take();
if (t.IsCanceled)
source.TrySetCanceled();
else if (t.IsFaulted)
source.TrySetException(t.Exception.InnerExceptions);
else if (t.IsCompleted)
source.TrySetResult(t.Result);
}, CancellationToken.None, TaskContinuationOptions.PreferFairness, TaskScheduler.Default);
}
return taskSourceList.Select(tcs => tcs.Task);
}
Using this your code can become:
private async Task Run()
{
IEnumerable<Item> items = repo.GetItems(); // sync method to get list from database
foreach (var task in items.Select(item => GetFromWeb(item.url))
.Order())
{
await task.ConfigureAwait(false);
AddToDatabase(task.Result);
}
}
Just thought I'd throw my hat in as well with the Rx solution:
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Reactive.Threading.Tasks;
private Task Run()
{
return repo.GetItems().ToObservable(Scheduler.Default)
// SelectMany awaits each download and emits its result as soon as it completes
.SelectMany(item => GetFromWeb(item.url))
.Do(AddToDatabase)
.ToTask();
}

List of objects with async Task methods, execute all concurrently

Given the following:
BlockingCollection<MyObject> collection;
public class MyObject
{
public async Task<ReturnObject> DoWork()
{
(...)
return await SomeIOWorkAsync();
}
}
What would be the correct/most performant way to execute all DoWork() tasks asynchronously on all MyObjects in collection concurrently (while capturing the return object), ideally with a sensible thread limit though (I believe the Task Factory/ThreadPool does some management here)?
You can use the static Task.WhenAll method.
var results = await Task.WhenAll(collection.Select(x => x.DoWork()));
It starts all the tasks concurrently and waits for all of them to finish.
ThreadPool manages the number of threads running, but that won't help you much with asynchronous Tasks.
Because of that, you need something else. One way to do this is to utilize ActionBlock from TPL Dataflow:
int limit = …;
IEnumerable<MyObject> collection = …;
var block = new ActionBlock<MyObject>(
o => o.DoWork(),
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = limit });
foreach (var obj in collection)
block.Post(obj);
block.Complete();
await block.Completion;
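If pulling in TPL Dataflow is not an option, a different technique that achieves a similar limit is to gate each call with a SemaphoreSlim. This is a swapped-in alternative, not what the answer above uses; RunThrottledAsync and its parameters are hypothetical names for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    // Runs work(item) for every item, with at most `limit` calls in flight at a time.
    public static async Task<TResult[]> RunThrottledAsync<T, TResult>(
        IEnumerable<T> items, Func<T, Task<TResult>> work, int limit)
    {
        using (var gate = new SemaphoreSlim(limit))
        {
            // ToList forces the lazy Select to run, so every wrapper task is created up front.
            List<Task<TResult>> tasks = items.Select(async item =>
            {
                await gate.WaitAsync();      // wait for a free slot without blocking a thread
                try { return await work(item); }
                finally { gate.Release(); }  // free the slot even if work throws
            }).ToList();

            // Results come back in the same order as the input items.
            return await Task.WhenAll(tasks);
        }
    }
}
```

With the types from the question, a call might look like `await ThrottleDemo.RunThrottledAsync(collection, o => o.DoWork(), limit)`, which also captures the return objects.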
What would be the correct/most performant way to execute all DoWork() tasks asynchronously on all MyObjects in collection concurrently (while capturing the return object), ideally with a sensible thread limit
The easiest way to do that is with Task.WhenAll:
ReturnObject[] results = await Task.WhenAll(collection.Select(x => x.DoWork()));
This will invoke DoWork on all MyObjects in the collection and then wait for them all to complete. The thread pool handles all throttling sensibly.
Is there a different way if I want to capture every individual DoWork() return immediately instead of waiting for all items to complete?
Yes, you can use the method described by Jon Skeet and Stephen Toub. I have a similar solution in my AsyncEx library (available via NuGet), which you can use like this:
// "tasks" is of type "Task<ReturnObject>[]"
var tasks = collection.Select(x => x.DoWork()).OrderByCompletion();
foreach (var task in tasks)
{
var result = await task;
...
}
My comment was a bit cryptic, so I thought I'd add this answer:
List<Task<ReturnObject>> workTasks =
collection.Select( o => o.DoWork() ).ToList();
List<Task> resultTasks =
workTasks.Select( o => o.ContinueWith( t =>
{
ReturnObject r = t.Result;
// do something with the result
},
// if you want to run this on the UI thread
TaskScheduler.FromCurrentSynchronizationContext()
)
)
.ToList();
await Task.WhenAll( resultTasks );
