Task.WaitSubset / Task.WaitN? - c#

There's a Task.WaitAll method, which waits for all tasks, and a Task.WaitAny method, which waits for one task. How do I wait for any N tasks?
Use case: search result pages are downloaded, and each result needs a separate task to download and process it. If I use WaitAll to wait for the subtasks before getting the next search result page, I won't use all available resources (one long task will delay the rest). Not waiting at all can queue up thousands of tasks, which isn't a good idea either.
So, how do I wait for a subset of tasks to complete? Or, alternatively, how do I wait for the task scheduler queue to hold only N tasks?

This looks like an excellent problem for TPL Dataflow, which will allow you to control parallelism and buffering to process at maximum speed.
Here's some (untested) code to show you what I mean:
static void Process()
{
    var searchReader =
        new TransformManyBlock<SearchResult, SearchResult>(async uri =>
        {
            // return a list of search results at uri.
            return new[]
            {
                new SearchResult
                {
                    IsResult = true,
                    Uri = "http://foo.com"
                },
                new SearchResult
                {
                    // return the next search result page here.
                    IsResult = false,
                    Uri = "http://google.com/next"
                }
            };
        }, new ExecutionDataflowBlockOptions
        {
            BoundedCapacity = 8, // restrict buffer size.
            MaxDegreeOfParallelism = 4 // control parallelism.
        });

    // link "next" pages back to the searchReader.
    searchReader.LinkTo(searchReader, x => !x.IsResult);

    var resultActor = new ActionBlock<SearchResult>(async uri =>
    {
        // do something with the search result.
    }, new ExecutionDataflowBlockOptions
    {
        BoundedCapacity = 64,
        MaxDegreeOfParallelism = 16
    });

    // link search results into resultActor.
    searchReader.LinkTo(resultActor, x => x.IsResult);

    // put in the first piece of input.
    searchReader.Post(new SearchResult { Uri = "http://google/first" });
}

struct SearchResult
{
    public bool IsResult { get; set; }
    public string Uri { get; set; }
}

I think you should independently limit the number of parallel download tasks and the number of concurrent result-processing tasks. I would do it with two SemaphoreSlim objects, as below. This version doesn't use the synchronous SemaphoreSlim.Wait (thanks @svick for making the point). It has only been lightly tested and the exception handling could be improved; substitute your own DownloadNextPageAsync and ProcessResults:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

namespace Console_21666797
{
    partial class Program
    {
        // the actual download method
        // async Task<string> DownloadNextPageAsync(string url) { ... }

        // the actual process method
        // void ProcessResults(string data) { ... }

        // download and process all pages
        async Task DownloadAndProcessAllAsync(
            string startUrl, int maxDownloads, int maxProcesses)
        {
            // max parallel downloads
            var downloadSemaphore = new SemaphoreSlim(maxDownloads);
            // max parallel processing tasks
            var processSemaphore = new SemaphoreSlim(maxProcesses);

            var tasks = new HashSet<Task>();
            var complete = false;
            var protect = new Object(); // protects tasks
            var page = 0;

            // do the page
            Func<string, Task> doPageAsync = async (url) =>
            {
                bool downloadSemaphoreAcquired = true;
                try
                {
                    // download the page
                    var data = await DownloadNextPageAsync(
                        url).ConfigureAwait(false);
                    if (String.IsNullOrEmpty(data))
                    {
                        Volatile.Write(ref complete, true);
                    }
                    else
                    {
                        // enable the next download to happen
                        downloadSemaphore.Release();
                        downloadSemaphoreAcquired = false;

                        // process this download
                        await processSemaphore.WaitAsync();
                        try
                        {
                            await Task.Run(() => ProcessResults(data));
                        }
                        finally
                        {
                            processSemaphore.Release();
                        }
                    }
                }
                catch (Exception)
                {
                    Volatile.Write(ref complete, true);
                    throw;
                }
                finally
                {
                    if (downloadSemaphoreAcquired)
                        downloadSemaphore.Release();
                }
            };

            // do the page and save the task
            Func<string, Task> queuePageAsync = async (url) =>
            {
                var task = doPageAsync(url);
                lock (protect)
                    tasks.Add(task);
                await task;
                lock (protect)
                    tasks.Remove(task);
            };

            // process pages in a loop until complete is true
            while (!Volatile.Read(ref complete))
            {
                page++;
                // acquire the download semaphore asynchronously
                await downloadSemaphore.WaitAsync().ConfigureAwait(false);
                // do the page
                var task = queuePageAsync(startUrl + "?page=" + page);
            }

            // await completion of the pending tasks
            Task[] pendingTasks;
            lock (protect)
                pendingTasks = tasks.ToArray();
            await Task.WhenAll(pendingTasks);
        }

        static void Main(string[] args)
        {
            new Program().DownloadAndProcessAllAsync("http://google.com", 10, 5).Wait();
            Console.ReadLine();
        }
    }
}

Something like this should work. There might be some edge cases, but all in all it ensures that at least N tasks have completed.
public static async Task WhenN(IEnumerable<Task> tasks, int n, CancellationTokenSource cts = null)
{
    var pending = new HashSet<Task>(tasks);
    if (n > pending.Count)
    {
        n = pending.Count;
        // or throw
    }

    var completed = 0;
    while (completed != n)
    {
        var completedTask = await Task.WhenAny(pending);
        pending.Remove(completedTask);
        completed++;
    }

    if (cts != null)
    {
        cts.Cancel();
    }
}
Usage:
static void Main(string[] args)
{
    var tasks = new List<Task>();
    var completed = 0;
    var cts = new CancellationTokenSource();

    for (int i = 0; i < 100; i++)
    {
        var temp = i; // copy the loop variable so each closure gets its own value
        tasks.Add(Task.Run(async () =>
        {
            await Task.Delay(temp * 100, cts.Token);
            Console.WriteLine("Completed task {0}", temp);
            completed++; // demo only; not thread-safe
        }, cts.Token));
    }

    Extensions.WhenN(tasks, 30, cts).Wait();
    Console.WriteLine(completed);
    Console.ReadLine();
}

var runningTasks = MyTasksFactory.StartTasks().ToList();
while (runningTasks.Any())
{
    int finished = Task.WaitAny(runningTasks.ToArray());
    var finishedTask = runningTasks[finished];
    runningTasks.RemoveAt(finished);
    Task.Factory.StartNew(() => { Consume(finishedTask.Result); });
}
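For the "keep at most N tasks in flight" variant from the original question, a similar sketch with Task.WhenAny works; HasMorePages, GetNextPageUrl and ProcessPageAsync are hypothetical placeholders:
var pending = new List<Task>();
const int maxPending = 8; // at most 8 tasks queued/running at once

while (HasMorePages())
{
    pending.Add(ProcessPageAsync(GetNextPageUrl()));
    if (pending.Count >= maxPending)
    {
        // wait for one slot to free up before queuing more work
        Task finished = await Task.WhenAny(pending);
        pending.Remove(finished);
    }
}

// drain whatever is still running
await Task.WhenAll(pending);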

Related

How to concurrently complete HTTP calls on an observable collection?

In a WPF .NET Core app there is the following:
An ObservableCollection of items (itemObservCollection).
A static readonly HttpClient _httpclient.
XML responses.
I am making a URL call to the API for each item in the observable collection (0 to 1000 items in the collection). The return is XML, which is parsed using XElement, and the property values in the observable collection are updated from the XML.
Task.Run is used to run the operation off the UI thread, and Parallel.ForEach is used to make the calls in parallel.
I feel I have made the solution overly complicated. Is there a way to simplify this? UpdateItems() is called from a button click.
private async Task UpdateItems()
{
    try
    {
        await Task.Run(() => Parallel.ForEach(itemObservCollection, new ParallelOptions { MaxDegreeOfParallelism = 12 }, async item =>
        {
            try
            {
                var apiRequestString = $"http://localhost:6060/" + item.Name;
                HttpResponseMessage httpResponseMessage = await _httpclient.GetAsync(apiRequestString);
                var httpResponseStream = await httpResponseMessage.Content.ReadAsStreamAsync();
                StringBuilder sb = new StringBuilder(1024);
                XElement doc = XElement.Load(httpResponseStream);

                foreach (var elem in doc.Descendants())
                {
                    if (elem.Name == "ItemDetails")
                    {
                        var itemUpdate = itemObservCollection.FirstOrDefault(updateItem => updateItem.Name == item.Name);
                        if (itemUpdate != null)
                        {
                            itemUpdate.Price = decimal.Parse(elem.Attribute("Price").Value);
                            itemUpdate.Quantity = int.Parse(elem.Attribute("Quantity").Value);
                        }
                    }
                }
            }
            catch (Exception ex)
            {
                LoggerTextBlock.Text = ('\n' + ex.ToString());
            }
        }));
    }
    catch (Exception ex)
    {
        LoggerTextBlock.Text = ('\n' + ex.ToString());
    }
}
You could create an array of tasks and await them all using Task.WhenAll.
The following sample code kicks off a task per item in the ObservableCollection<int> and then waits asynchronously for all tasks to finish:
ObservableCollection<int> itemObservCollection =
    new ObservableCollection<int>(Enumerable.Range(1, 10));

async Task SendAsync()
{
    //query the HTTP API here...
    await Task.Delay(1000);
}

await Task.WhenAll(itemObservCollection.Select(x => SendAsync()).ToArray());
If you want to limit the number of concurrent requests, you could either iterate through a subset of the source collection to send requests in batches (see the sketch after the semaphore example below) or use a SemaphoreSlim to limit the number of actual concurrent requests:
Task[] tasks = new Task[itemObservCollection.Count];
using (SemaphoreSlim semaphoreSlim = new SemaphoreSlim(12))
{
    for (int i = 0; i < itemObservCollection.Count; ++i)
    {
        async Task SendAsync()
        {
            //query the HTTP API here...
            try
            {
                await Task.Delay(5000);
            }
            finally
            {
                semaphoreSlim.Release();
            }
        }

        await semaphoreSlim.WaitAsync();
        tasks[i] = SendAsync();
    }

    await Task.WhenAll(tasks);
}
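The batching alternative mentioned above could look roughly like this (a sketch; SendAsync is the same placeholder request method, and the batch size of 12 is arbitrary):
const int batchSize = 12;
var items = itemObservCollection.ToList();

for (int offset = 0; offset < items.Count; offset += batchSize)
{
    // send one batch and wait for it to complete before starting the next
    var batch = items.Skip(offset).Take(batchSize).Select(item => SendAsync());
    await Task.WhenAll(batch);
}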

How to run looping task until other task is finished

I found many questions addressing how to sequence tasks and wait until all tasks finish, but on this particular topic I found only one question, from 2016, with no answers.
I'm processing a large text file in my project, and I want to indicate that this process is running by displaying a "Processing" text with a changing number of dots after it. I've gotten to the point where the intended looping task runs until the long-running task finishes and the proper field in the VM is updated, but I can't get the looping task to be delayed so the dots change visibly.
In other words, it's the same functionality as a loader displayed while data is being retrieved from an HTTP request.
public void SetRawTextFromAbsPath(string path)
{
    if (!File.Exists(path))
    {
        return;
    }

    var rawText = "Processing";
    bool IsReadingFileFinished = false;

    Task<string> getRawTextFromAbsPath = Task.Run(() => {
        var result = FileProcessingServices.GetRawFileText(path);
        IsReadingFileFinished = true;
        return result;
    });

    Task updateProgressText = Task.Run(async () =>
    {
        while (!IsReadingFileFinished)
        {
            rawText = await Task.Run(() => ProcessingTextChange(rawText));
            SelectedFileRaw = rawText;
        }
    });

    Task.WaitAll(getRawTextFromAbsPath, updateProgressText);
    SelectedFileRaw = getRawTextFromAbsPath.Result;
}

public string ProcessingTextChange(string text)
{
    Task.Delay(100); // note: not awaited, so this does not actually pause
    var dotsCount = text.Count<char>(ch => ch == '.');
    return dotsCount < 6 ? text + "." : text.Replace(".", "");
}
After learning from all the answers, I came up with this solution:
private const string PROGRESS = "Progress";
private const int PROGRESS_DELAY = 200;

public async void RunProgressTextUpdate()
{
    var cts = new CancellationTokenSource();
    if (!IsRunning)
    {
        UpdateProgressTextTask(cts.Token);
        string longTaskText = await Task.Run(() => LongTask(cts));
        await Task.Delay(PROGRESS_DELAY);
        ProgressText = longTaskText;
    }
}

private void UpdateProgressTextTask(CancellationToken token)
{
    Task.Run(async () =>
    {
        ProgressText = PROGRESS;
        while (!token.IsCancellationRequested)
        {
            await Task.Delay(PROGRESS_DELAY);
            var dotsCount = ProgressText.Count<char>(ch => ch == '.');
            ProgressText = dotsCount < 6 ? ProgressText + "." : ProgressText.Replace(".", "");
        }
    });
}

private string LongTask(CancellationTokenSource cts)
{
    var result = Task.Run(async () =>
    {
        await Task.Delay(5000);
        cts.Cancel();
        return "Long task finished.";
    });
    return result.Result;
}
Every way of creating and running a Task has an overload that accepts a CancellationToken. A CancellationToken is, unsurprisingly, a struct that allows us to cancel Tasks.
Given these two methods:
public void DelayedWork()
{
    Task.Run(async () =>
    {
        // Simulate some async work
        await Task.Delay(1000);
    });
}

public void LoopingUntilDelayedWorkFinishes()
{
    Task.Run(() =>
    {
        int i = 0;
        // We keep looping... but when should we stop?
        while (true) // Maybe?
        {
            Console.WriteLine($"{++i} iteration ...");
        }
    });
}
We want LoopingUntilDelayedWorkFinishes to stop looping when DelayedWork finishes (well, naming was quite obvious).
We can provide a CancellationToken to our LoopingUntilDelayedWorkFinishes method so that it keeps looping until the token is cancelled.
public void LoopingUntilDelayedWorkFinishes(CancellationToken token)
{
    Task.Run(() =>
    {
        int i = 0;
        // We keep looping until the token is cancelled
        while (!token.IsCancellationRequested)
        {
            Console.WriteLine($"{++i} iteration ...");
        }
    }, token); // This is the overload expecting the token
}
Okay, working. We obtain this CancellationToken from a CancellationTokenSource, which is the object that actually controls when the token is cancelled.
var cts = new CancellationTokenSource();
p.LoopingUntilDelayedWorkFinishes(cts.Token);
And now, we need our DelayedWork to cancel the token when it finishes.
public void DelayedWork(CancellationTokenSource cts)
{
    Task.Run(async () =>
    {
        // Simulate some async work
        await Task.Delay(1000);
        // Once it is done, we cancel.
        cts.Cancel();
    });
}
That is how we could call the methods.
var cts = new CancellationTokenSource();
p.DelayedWork(cts);
p.LoopingUntilDelayedWorkFinishes(cts.Token);
The call order between DelayedWork and LoopingUntilDelayedWorkFinishes is not that important (in this case).
Maybe LoopingUntilDelayedWorkFinishes could return a Task that you then await later on; it just depends on your needs (a sketch of that variant follows below).
There are tons of ways to achieve this. The ecosystem around Task is vast and the API is sometimes confusing.
This is one way you could do it; some smarter use of async/await syntax would probably improve the solution, but that's the main idea.
Hope it helps.
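As mentioned, here is a sketch of the variant where LoopingUntilDelayedWorkFinishes returns a Task so the caller can await it (same method names as above):
public Task LoopingUntilDelayedWorkFinishes(CancellationToken token)
{
    return Task.Run(() =>
    {
        int i = 0;
        while (!token.IsCancellationRequested)
        {
            Console.WriteLine($"{++i} iteration ...");
        }
    }, token);
}

// caller
// note: if the token is already cancelled before the task starts, the await throws TaskCanceledException
var cts = new CancellationTokenSource();
p.DelayedWork(cts);
await p.LoopingUntilDelayedWorkFinishes(cts.Token);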

How to get the individual API call status success response in C#

How do I get the status of each individual API call in C#?
I am creating a mobile application using Xamarin Forms.
In my application, I need to prefetch certain information when the app launches, before the app can be used.
Right now, I am fetching the details like this:
public async Task<Response> GetAllVasInformationAsync()
{
    var userDetails = GetUserDetailsAsync();
    var getWageInfo = GetUserWageInfoAsync();
    var getSalaryInfo = GetSalaryInfoAsync();

    await Task.WhenAll(userDetails,
                       getWageInfo,
                       getSalaryInfo);

    var resultToReturn = new Response
    {
        IsuserDetailsSucceeded = userDetails.Result,
        IsgetWageInfoSucceeded = getWageInfo.Result,
        IsgetSalaryInfoSucceeded = getSalaryInfo.Result,
    };

    return resultToReturn;
}
In my app I need to update the display based on the success responses, something like "(2/5) completed", and the text should be updated whenever we get a new response.
What is the best way to implement this feature? Is it possible to do this along with Task.WhenAll? I am trying to wrap everything in one method call.
In my app I need to update details based on the success response.
The proper way to do this is IProgress<string>. The calling code should supply a Progress<string> that updates the UI accordingly.
public async Task<Response> GetAllVasInformationAsync(IProgress<string> progress)
{
    var userDetails = UpdateWhenComplete(GetUserDetailsAsync(), "user details");
    var getWageInfo = UpdateWhenComplete(GetUserWageInfoAsync(), "wage information");
    var getSalaryInfo = UpdateWhenComplete(GetSalaryInfoAsync(), "salary information");

    await Task.WhenAll(userDetails, getWageInfo, getSalaryInfo);

    return new Response
    {
        IsuserDetailsSucceeded = await userDetails,
        IsgetWageInfoSucceeded = await getWageInfo,
        IsgetSalaryInfoSucceeded = await getSalaryInfo,
    };

    async Task<T> UpdateWhenComplete<T>(Task<T> task, string taskName)
    {
        try { return await task; }
        finally { progress?.Report($"Completed {taskName}"); }
    }
}
If you also need a count, you can either use IProgress<(int, string)> or change how the progress string is built to include the count; a sketch of the counted variant follows.
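This is one way the counted variant could look (same structure as above; the tuple progress type and the completedCount counter are just one option, not the only API):
public async Task<Response> GetAllVasInformationAsync(IProgress<(int Completed, string Name)> progress)
{
    int completedCount = 0;

    var userDetails = UpdateWhenComplete(GetUserDetailsAsync(), "user details");
    var getWageInfo = UpdateWhenComplete(GetUserWageInfoAsync(), "wage information");
    var getSalaryInfo = UpdateWhenComplete(GetSalaryInfoAsync(), "salary information");

    await Task.WhenAll(userDetails, getWageInfo, getSalaryInfo);

    return new Response
    {
        IsuserDetailsSucceeded = await userDetails,
        IsgetWageInfoSucceeded = await getWageInfo,
        IsgetSalaryInfoSucceeded = await getSalaryInfo,
    };

    async Task<T> UpdateWhenComplete<T>(Task<T> task, string taskName)
    {
        try { return await task; }
        finally
        {
            // report "(n/3) completed" style progress; completions may race, hence Interlocked
            var done = Interlocked.Increment(ref completedCount);
            progress?.Report((done, taskName));
        }
    }
}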
So here's what I would do in C# 8 and .NET Standard 2.1:
First, I create the method which will produce the async enumerable:
static async IAsyncEnumerable<bool> TasksToPerform() {
    // userDetails, getWageInfo and getSalaryInfo are the tasks from the question
    Task[] tasks = new Task[3] { userDetails, getWageInfo, getSalaryInfo };
    for (int i = 0; i < tasks.Length; i++) {
        await tasks[i];
        yield return true;
    }
}
So now you await foreach over this task enumerable. Every time you get a result, you know that a task has finished.
int numberOfFinishedTasks = 0;
await foreach (var b in TasksToPerform()) {
    numberOfFinishedTasks++;
    //Update UI here to reflect the finished task number
}
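One thing to note: the enumerable above yields in declaration order, not completion order. If you want an update as each task actually finishes, a sketch using Task.WhenAny could look like this (same C# 8 setup; the method name is mine):
static async IAsyncEnumerable<Task> TasksInCompletionOrder(IEnumerable<Task> source) {
    var pending = new HashSet<Task>(source);
    while (pending.Count > 0) {
        // WhenAny returns whichever task finishes first
        var finished = await Task.WhenAny(pending);
        pending.Remove(finished);
        yield return finished;
    }
}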
No need to over-complicate this. This code shows how many of your tasks threw exceptions. Your await Task.WhenAll just triggers them and waits for them to finish, so after that you can do whatever you want with the tasks :)
var task = Task.Delay(300);
var tasks = new List<Task> { task };
var faultedTasks = 0;

tasks.ForEach(t =>
{
    t.ContinueWith(t2 =>
    {
        //do something with a field / property holding ViewModel state
        //that your view is listening to
    });
});

await Task.WhenAll(tasks);

//use this to respond with a finished count
tasks.ForEach(_ => { if (_.IsFaulted) faultedTasks++; });
Console.WriteLine($"{tasks.Count() - faultedTasks} / {tasks.Count()} completed.");
.WhenAll() will let you determine whether any of the tasks failed; then you just count the tasks that have faulted.
public async Task<Response> GetAllVasInformationAsync()
{
    var userDetails = GetUserDetailsAsync();
    var getWageInfo = GetUserWageInfoAsync();
    var getSalaryInfo = GetSalaryInfoAsync();

    await Task.WhenAll(userDetails, getWageInfo, getSalaryInfo)
        .ContinueWith((task) =>
        {
            if (task.IsFaulted)
            {
                int failedCount = 0;
                if (userDetails.IsFaulted) failedCount++;
                if (getWageInfo.IsFaulted) failedCount++;
                if (getSalaryInfo.IsFaulted) failedCount++;
                // surface this message however your app reports errors
                Console.WriteLine($"{failedCount} tasks failed");
            }
        });

    var resultToReturn = new Response
    {
        IsuserDetailsSucceeded = userDetails.Result,
        IsgetWageInfoSucceeded = getWageInfo.Result,
        IsgetSalaryInfoSucceeded = getSalaryInfo.Result,
    };

    return resultToReturn;
}

Parallel Multi-threaded Downloads using async-await

I have hundreds of big files to download from the web in my Windows service (C#). The requirement is to maintain at most 4 parallel web file downloads at any one time.
Can I achieve concurrent/parallel downloads using async/await, or do I have to use a BackgroundWorker process or threads? Is async/await multithreaded?
See my sample program using async-await below:
static int i = 0;

async void Timer_tick() {
    while (i < 4) {
        i++;
        var model = GetNextModel();
        await Download(model);
    }
}

private async Task Download(XYZ model) {
    Task<FilesetResult> t = Work(model);
    var result = await t;
    //Use Result
}

private async Task<FilesetResult> Work(XYZ model) {
    var filesetresult = await api.Download(model.path);
    i--;
    return filesetresult;
}
You can limit the number of async tasks running in parallel using the SemaphoreSlim class. Something like:
List<DownloadRequest> requests = Enumerable.Range(0, 100).Select(x => new DownloadRequest()).ToList();

using (var throttler = new SemaphoreSlim(4))
{
    Task<DownloadResult>[] downloadTasks = requests.Select(request => Task.Run(async () =>
    {
        await throttler.WaitAsync();
        try
        {
            return await DownloadTaskAsync(request);
        }
        finally
        {
            throttler.Release();
        }
    })).ToArray();

    await Task.WhenAll(downloadTasks);
}
Update: thank you for the comments; the issues are fixed.
Update 2: sample solution for a dynamic list of requests:
public class DownloadManager : IDisposable
{
    private readonly SemaphoreSlim _throttler = new SemaphoreSlim(4);

    public async Task<DownloadResult> DownloadAsync(DownloadRequest request)
    {
        await _throttler.WaitAsync();
        try
        {
            return await api.Download(request);
        }
        finally
        {
            _throttler.Release();
        }
    }

    public void Dispose()
    {
        _throttler?.Dispose();
    }
}
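Usage could look something like this (a sketch; DownloadRequest and DownloadResult are the same placeholder types as above):
var manager = new DownloadManager();
var requests = Enumerable.Range(0, 100).Select(x => new DownloadRequest()).ToList();

// start everything at once; the manager itself caps actual concurrency at 4
Task<DownloadResult>[] downloads = requests.Select(r => manager.DownloadAsync(r)).ToArray();
await Task.WhenAll(downloads);

manager.Dispose();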
Doing it by hand seems awfully complicated.
var files = new List<Uri>();
Parallel.ForEach(files,
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    this.Download);
Now all you need is a single, normal, synchronous method private void Download(Uri file) and you are good to go.
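That Download method could be as simple as this sketch (the destination folder is an assumption; WebClient is used because Parallel.ForEach expects synchronous work):
private void Download(Uri file)
{
    using (var client = new System.Net.WebClient())
    {
        // download synchronously to a local file named after the remote file
        var destination = Path.Combine(@"C:\Downloads", Path.GetFileName(file.LocalPath));
        client.DownloadFile(file, destination);
    }
}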
If you need a producer/consumer pattern, the easiest version might be a BlockingCollection:
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

namespace ConsoleApp11
{
    internal class Program
    {
        internal static void Main()
        {
            using (var queue = new BlockingCollection<Uri>())
            {
                // starting the producer task:
                Task.Factory.StartNew(() =>
                {
                    for (int i = 0; i < 100; i++)
                    {
                        // faking read from message queue... we get a new Uri every 100 ms
                        queue.Add(new Uri("http://www.example.com/" + i));
                        Thread.Sleep(100);
                    }
                    // just to end this program... you don't need to end this, just listen to your message queue
                    queue.CompleteAdding();
                });

                // run the consumers:
                Parallel.ForEach(queue.GetConsumingEnumerable(), new ParallelOptions { MaxDegreeOfParallelism = 4 }, Download);
            }
        }

        internal static void Download(Uri uri)
        {
            // download your file here
            Console.WriteLine($"Downloading {uri} [..        ]");
            Thread.Sleep(1000);
            Console.WriteLine($"Downloading {uri} [.....     ]");
            Thread.Sleep(1000);
            Console.WriteLine($"Downloading {uri} [.......   ]");
            Thread.Sleep(1000);
            Console.WriteLine($"Downloading {uri} [......... ]");
            Thread.Sleep(1000);
            Console.WriteLine($"Downloading {uri} [..........]");
            Thread.Sleep(1000);
            Console.WriteLine($"Downloading {uri} OK");
        }
    }
}

TPL Dataflow block which delays the forward of the message to the next block

I require a Dataflow block which delays forwarding a message to the next block based on the timestamp in the message (LogEntry).
This is what I came up with, but it doesn't feel right. Any suggestions for improvements?
private IPropagatorBlock<LogEntry, LogEntry> DelayedForwardBlock()
{
    var buffer = new ConcurrentQueue<LogEntry>();
    var source = new BufferBlock<LogEntry>();

    var target = new ActionBlock<LogEntry>(item =>
    {
        buffer.Enqueue(item);
    });

    Task.Run(() =>
    {
        LogEntry entry;
        while (true)
        {
            entry = null;
            if (buffer.TryPeek(out entry))
            {
                if (entry.UtcTimestamp < (DateTime.UtcNow - TimeSpan.FromMinutes(5)))
                {
                    buffer.TryDequeue(out entry);
                    source.Post(entry);
                }
            }
        }
    });

    target.Completion.ContinueWith(delegate
    {
        LogEntry entry;
        while (buffer.TryDequeue(out entry))
        {
            source.Post(entry);
        }
        source.Complete();
    });

    return DataflowBlock.Encapsulate(target, source);
}
You could simply use a single TransformBlock that asynchronously waits out the delay using Task.Delay:
IPropagatorBlock<TItem, TItem> DelayedForwardBlock<TItem>(TimeSpan delay)
{
    return new TransformBlock<TItem, TItem>(async item =>
    {
        await Task.Delay(delay);
        return item;
    });
}
Usage:
var block = DelayedForwardBlock<LogEntry>(TimeSpan.FromMinutes(5));
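To actually forward the delayed messages to the next block, you would link it like any other propagator. A sketch (the surrounding source and sink blocks are just assumptions for illustration):
var source = new BufferBlock<LogEntry>();
var delay = DelayedForwardBlock<LogEntry>(TimeSpan.FromMinutes(5));
var sink = new ActionBlock<LogEntry>(entry => Console.WriteLine(entry.UtcTimestamp));

// propagate completion so completing the source eventually completes the sink
var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
source.LinkTo(delay, linkOptions);
delay.LinkTo(sink, linkOptions);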
