I have two loops that use SemaphoreSlim and a array of strings "Contents"
a foreachloop:
var allTasks = new List<Task>();
var throttle = new SemaphoreSlim(10,10);
foreach (string s in Contents)
{
await throttle.WaitAsync();
allTasks.Add(
Task.Run(async () =>
{
try
{
rootResponse.Add(await POSTAsync(s, siteurl, src, target));
}
finally
{
throttle.Release();
}
}));
}
await Task.WhenAll(allTasks);
a for loop:
var allTasks = new List<Task>();
var throttle = new SemaphoreSlim(10,10);
for(int s=0;s<Contents.Count;s++)
{
await throttle.WaitAsync();
allTasks.Add(
Task.Run(async () =>
{
try
{
rootResponse[s] = await POSTAsync(Contents[s], siteurl, src, target);
}
finally
{
throttle.Release();
}
}));
}
await Task.WhenAll(allTasks);
the first foreach loop runs well, but the for loops Task.WhenAll(allTasks) returns a OutOfRangeException and I want the Contents[] index and List index to match.
Can I fix the for loop? or is there a better approach?
This would fix your current problems
for (int s = 0; s < Contents.Count; s++)
{
var content = Contents[s];
allTasks.Add(
Task.Run(async () =>
{
await throttle.WaitAsync();
try
{
rootResponse[s] = await POSTAsync(content, siteurl, src, target);
}
finally
{
throttle.Release();
}
}));
}
await Task.WhenAll(allTasks);
However this is a fairly messy and nasty piece of code. This looks a bit neater
public static async Task DoStuffAsync(Content[] contents, string siteurl, string src, string target)
{
var throttle = new SemaphoreSlim(10, 10);
// local method
async Task<(Content, SomeResponse)> PostAsyncWrapper(Content content)
{
await throttle.WaitAsync();
try
{
// return a content and result pair
return (content, await PostAsync(content, siteurl, src, target));
}
finally
{
throttle.Release();
}
}
var results = await Task.WhenAll(contents.Select(PostAsyncWrapper));
// do stuff with your results pairs here
}
There are many other ways you could do this, PLinq, Parallel.For,Parallel.ForEach, Or just tidying up your captures in your loops like above.
However since you have an IO bound work load, and you have async methods that run it. The most appropriate solution is the async await pattern which neither Parallel.For,Parallel.ForEach cater for optimally.
Another way is TPL DataFlow library which can be found in the System.Threading.Tasks.Dataflow nuget package.
Code
public static async Task DoStuffAsync(Content[] contents, string siteurl, string src, string target)
{
async Task<(Content, SomeResponse)> PostAsyncWrapper(Content content)
{
return (content, await PostAsync(content, siteurl, src, target));
}
var bufferblock = new BufferBlock<(Content, SomeResponse)>();
var actionBlock = new TransformBlock<Content, (Content, SomeResponse)>(
content => PostAsyncWrapper(content),
new ExecutionDataflowBlockOptions
{
EnsureOrdered = false,
MaxDegreeOfParallelism = 100,
SingleProducerConstrained = true
});
actionBlock.LinkTo(bufferblock);
foreach (var content in contents)
actionBlock.Post(content);
actionBlock.Complete();
await actionBlock.Completion;
if (bufferblock.TryReceiveAll(out var result))
{
// do stuff with your results pairs here
}
}
Basically this creates a BufferBlock And TransformBlock, You pump your work load into the TransformBlock, it has degrees of parallel in its options, and it pushes them into the BufferBlock, you await completion and get your results.
Why Dataflow? because it deals with the async await, it has MaxDegreeOfParallelism, it designed for IO bound or CPU Bound workloads, and its extremely simple to use. Additionally, as most data is generally processed in many ways (in a pipeline), you can then use it to pipe and manipulate streams of data in sequence and parallel or in any way way you choose down the line.
Anyway good luck
Related
I'm a bit new to using Semaphores, and I'm a bit confused by this. All I'm doing is grabbing the html code from a page, and writing it to a file, over and over. When I use it in a foreach, it works perfectly. When I change it over to a for loop, I get a the file is used by another process exception, and it references the file that is after the max count. What is the difference between these two? They should be processing the same files.
Works perfectly:
private async Task GetFiles()
{
string strDir = ConfigurationManager.AppSettings["location"];
var maxThreads = 4;
var allTasks = new List<Task>();
SemaphoreSlim throttler = new SemaphoreSlim(initialCount: maxThreads);
foreach(FileToProcess ftpFile in lstFiles)
{
await throttler.WaitAsync();
allTasks.Add(
Task.Run(async () =>
{
try
{
await GetHtmlFromUrlAsync(ftpFile, strDir);
}
finally
{
throttler.Release();
}
}));
}
await Task.WhenAll(allTasks);
}
Gets an error after on the 5th item in the list (max thread + 1). If I adjusted the max thread count, the error would change as well:
private async Task GetFiles()
{
string strDir = ConfigurationManager.AppSettings["location"];
var maxThreads = 4;
var allTasks = new List<Task>();
SemaphoreSlim throttler = new SemaphoreSlim(initialCount: maxThreads);
for (int x = 0; x < lstFiles.Count; x++)
{
await throttler.WaitAsync();
allTasks.Add(
Task.Run(async () =>
{
try
{
await GetHtmlFromUrlAsync(lstFiles[x], strDir);
}
finally
{
throttler.Release();
}
}));
}
await Task.WhenAll(allTasks);
}
I found many questions addressing how to sequence tasks and waiting until all tasks finish, but with this topic, I found only 1 question from 2016 with no answers.
I'm processing a large text file in my project and I want to indicate that this process is running with the text being displayed with changing number of dots after the "Processing" text. I got to the point, where the intended looping task is working until a long working task finishes and the proper field in the VM is updated, but I can't make looping task to be delayed so dots are changing in the way it's seen.
In other words - the same functionality as when a loader is displayed while data are being retrieved from the HTTP request.
public void SetRawTextFromAbsPath(string path)
{
if (!File.Exists(path))
{
return;
}
var rawText = "Processing";
bool IsReadingFileFinished = false;
Task<string> getRawTextFromAbsPath = Task.Run(() => {
var result = FileProcessingServices.GetRawFileText(path);
IsReadingFileFinished = true;
return result;
});
Task updateProgressText = Task.Run(async () =>
{
while (!IsReadingFileFinished)
{
rawText = await Task.Run(() => ProcessingTextChange(rawText));
SelectedFileRaw = rawText;
}
});
Task.WaitAll(getRawTextFromAbsPath, updateProgressText);
SelectedFileRaw = completeRawText.Result;
}
public string ProcessingTextChange(string text)
{
Task.Delay(100);
var dotsCount = text.Count<char>(ch => ch == '.');
return dotsCount < 6 ? text + "." : text.Replace(".", "");
}
After learning from all the answers, I come up with this solution:
private const string PROGRESS = "Progress";
private const int PROGRESS_DELAY = 200;
public async void RunProgressTextUpdate()
{
var cts = new CancellationTokenSource();
if (!IsRunning)
{
UpdateProgressTextTask(cts.Token);
string longTaskText = await Task.Run(() => LongTask(cts));
await Task.Delay(PROGRESS_DELAY);
ProgressText = longTaskText;
}
}
private void UpdateProgressTextTask(CancellationToken token)
{
Task.Run(async () =>
{
ProgressText = PROGRESS;
while (!token.IsCancellationRequested)
{
await Task.Delay(PROGRESS_DELAY);
var dotsCount = ProgressText.Count<char>(ch => ch == '.');
ProgressText = dotsCount < 6 ? ProgressText + "." : ProgressText.Replace(".", "");
}
});
}
private string LongTask(CancellationTokenSource cts)
{
var result = Task.Run(async () =>
{
await Task.Delay(5000);
cts.Cancel();
return "Long task finished.";
});
return result.Result;
}
Every way of creating Task and running them is overloaded to expect a CancellationToken. CancellationTokens are, unsurprinsignly, structs that allows us to cancel Tasks.
Having this two methods
public void DelayedWork()
{
Task.Run(async () =>
{
// Simulate some async work
await Task.Delay(1000);
});
}
public void LoopingUntilDelayedWorkFinishes()
{
Task.Run(() =>
{
int i = 0;
// We keep looping until the Token is not cancelled
while (true) // May be?
{
Console.WriteLine($"{++i} iteration ...");
}
});
}
We want LoopingUntilDelayedWorkFinishes to stop looping when DelayedWork finishes (well, naming was quite obvious).
We can provide a CancellationToken to our LoopingUntilDelayedWorkFinishes method. So it will keep looping until it is cancelled.
public void LoopingUntilDelayedWorkFinishes(CancellationToken token)
{
Task.Run(() =>
{
int i = 0;
// We keep looping until the Token is not cancelled
while (!token.IsCancellationRequested)
{
Console.WriteLine($"{++i} iteration ...");
}
}, token); // This is the overload expecting the Token
}
Okay, working. We can control this CancellationToken by extracting from a CancellationTokenSource, which controls its CancellationToken.
var cts = new CancellationTokenSource();
p.LoopingUntilDelayedWorkFinishes(cts.Token);
And now, we need our DelayedWork to cancel the token when it finishes.
public void DelayedWork(CancellationTokenSource cts)
{
Task.Run(async () =>
{
// Simulate some async work
await Task.Delay(1000);
// Once it is done, we cancel.
cts.Cancel();
});
}
That is how we could call the methods.
var cts = new CancellationTokenSource();
p.DelayedWork(cts);
p.LoopingUntilDelayedWorkFinishes(cts.Token);
The call order between DelayedWork and LoopingUntilDelayedWorkFinishes is not that important (in that case).
Maybe LoopingUntilDelayedWorkFinishes can return a Task and the await for it later on, I don't know. I just depends on our needs.
There are tons of ways to achieve this. The environment arround Task is so bast and the API is quite confusing sometimes.
Here's how you could do it. Maybe some smart use of async/await syntax would improve the solution I gave. But, here's the main idea.
Hope it helps.
How to get the individual API call status success response in C#.
I am creating a mobile application using Xamarin Forms,
In my application, I need to prefetch certain information when app launches to use the mobile application.
Right now, I am calling the details like this,
public async Task<Response> GetAllVasInformationAsync()
{
var userDetails = GetUserDetailsAsync();
var getWageInfo = GetUserWageInfoAsync();
var getSalaryInfo = GetSalaryInfoAsync();
await Task.WhenAll(userDetails,
getWageInfo,
getSalaryInfo,
);
var resultToReturn = new Response
{
IsuserDetailsSucceeded = userDetails.Result,
IsgetWageInfoSucceeded = getWageInfo.Result,
IsgetSalaryInfoSucceeded = getSalaryInfo.Result,
};
return resultToReturn;
}
In my app I need to update details based on the success response. Something like this (2/5) completed. And the text should be updated whenever we get a new response.
What is the best way to implement this feature? Is it possible to use along with Task.WhenAll. Because I am trying to wrap everything in one method call.
In my app I need to update details based on the success response.
The proper way to do this is IProgress<string>. The calling code should supply a Progress<string> that updates the UI accordingly.
public async Task<Response> GetAllVasInformationAsync(IProgress<string> progress)
{
var userDetails = UpdateWhenComplete(GetUserDetailsAsync(), "user details");
var getWageInfo = UpdateWhenComplete(GetUserWageInfoAsync(), "wage information");
var getSalaryInfo = UpdateWhenComplete(GetSalaryInfoAsync(), "salary information");
await Task.WhenAll(userDetails, getWageInfo, getSalaryInfo);
return new Response
{
IsuserDetailsSucceeded = await userDetails,
IsgetWageInfoSucceeded = await getWageInfo,
IsgetSalaryInfoSucceeded = await getSalaryInfo,
};
async Task<T> UpdateWhenComplete<T>(Task<T> task, string taskName)
{
try { return await task; }
finally { progress?.Report($"Completed {taskName}"); }
}
}
If you also need a count, you can either use IProgress<(int, string)> or change how the report progress string is built to include the count.
So here's what I would do in C# 8 and .NET Standard 2.1:
First, I create the method which will produce the async enumerable:
static async IAsyncEnumerable<bool> TasksToPerform() {
Task[] tasks = new Task[3] { userDetails, getWageInfo, getSalaryInfo };
for (i = 0; i < tasks.Length; i++) {
await tasks[i];
yield return true;
}
}
So now you need to await foreach on this task enumerable. Every time you get a return, you know that a task has been finished.
int numberOfFinishedTasks = 0;
await foreach (var b in TasksToPerform()) {
numberOfFinishedTasks++;
//Update UI here to reflect the finished task number
}
No need to over-complicate this. This code will show how many of your tasks had exceptions. Your await task.whenall just triggers them and waits for them to finish. So after that you can do whatever you want with the tasks :)
var task = Task.Delay(300);
var tasks = new List<Task> { task };
var faultedTasks = 0;
tasks.ForEach(t =>
{
t.ContinueWith(t2 =>
{
//do something with a field / property holding ViewModel state
//that your view is listening to
});
});
await Task.WhenAll(tasks);
//use this to respond with a finished count
tasks.ForEach(_ => { if (_.IsFaulted) faultedTasks++; });
Console.WriteLine($"{tasks.Count() - faultedTasks} / {tasks.Count()} completed.");
.WhenAll() will allow you to determine if /any/ of the tasks failed, they you just count the tasks that have failed.
public async Task<Response> GetAllVasInformationAsync()
{
var userDetails = GetUserDetailsAsync();
var getWageInfo = GetUserWageInfoAsync();
var getSalaryInfo = GetSalaryInfoAsync();
await Task.WhenAll(userDetails, getWaitInfo, getSalaryInfo)
.ContinueWith((task) =>
{
if(task.IsFaulted)
{
int failedCount = 0;
if(userDetails.IsFaulted) failedCount++;
if(getWaitInfo.IsFaulted) failedCount++;
if(getSalaryInfo.IsFaulted) failedCount++;
return $"{failedCount} tasks failed";
}
});
var resultToReturn = new Response
{
IsuserDetailsSucceeded = userDetails.Result,
IsgetWageInfoSucceeded = getWageInfo.Result,
IsgetSalaryInfoSucceeded = getSalaryInfo.Result,
};
return resultToReturn;
}
There're Task.WaitAll method which waits for all tasks and Task.WaitAny method which waits for one task. How to wait for any N tasks?
Use case: search result pages are downloaded, each result needs a separate task to download and process it. If I use WaitAll to wait for the results of the subtasks before getting next search result page, I will not use all available resources (one long task will delay the rest). Not waiting at all can cause thousands of tasks to be queued which isn't the best idea either.
So, how to wait for a subset of tasks to be completed? Or, alternatively, how to wait for the task scheduler queue to have only N tasks?
This looks like an excellent problem for TPL Dataflow, which will allow you to control parallelism and buffering to process at maximum speed.
Here's some (untested) code to show you what I mean:
static void Process()
{
var searchReader =
new TransformManyBlock<SearchResult, SearchResult>(async uri =>
{
// return a list of search results at uri.
return new[]
{
new SearchResult
{
IsResult = true,
Uri = "http://foo.com"
},
new SearchResult
{
// return the next search result page here.
IsResult = false,
Uri = "http://google.com/next"
}
};
}, new ExecutionDataflowBlockOptions
{
BoundedCapacity = 8, // restrict buffer size.
MaxDegreeOfParallelism = 4 // control parallelism.
});
// link "next" pages back to the searchReader.
searchReader.LinkTo(searchReader, x => !x.IsResult);
var resultActor = new ActionBlock<SearchResult>(async uri =>
{
// do something with the search result.
}, new ExecutionDataflowBlockOptions
{
BoundedCapacity = 64,
MaxDegreeOfParallelism = 16
});
// link search results into resultActor.
searchReader.LinkTo(resultActor, x => x.IsResult);
// put in the first piece of input.
searchReader.Post(new SearchResult { Uri = "http://google/first" });
}
struct SearchResult
{
public bool IsResult { get; set; }
public string Uri { get; set; }
}
I think you should independently limit the number of parallel download tasks and the number of concurrent result processing tasks. I would do it using two SemaphoreSlim objects, like below. This version doesn't use the synchronous SemaphoreSlim.Wait (thanks #svick for making the point). It was only slightly tested, the exception handling can be improved; substitute your own DownloadNextPageAsync and ProcessResults:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
namespace Console_21666797
{
partial class Program
{
// the actual download method
// async Task<string> DownloadNextPageAsync(string url) { ... }
// the actual process methods
// void ProcessResults(string data) { ... }
// download and process all pages
async Task DownloadAndProcessAllAsync(
string startUrl, int maxDownloads, int maxProcesses)
{
// max parallel downloads
var downloadSemaphore = new SemaphoreSlim(maxDownloads);
// max parallel processing tasks
var processSemaphore = new SemaphoreSlim(maxProcesses);
var tasks = new HashSet<Task>();
var complete = false;
var protect = new Object(); // protect tasks
var page = 0;
// do the page
Func<string, Task> doPageAsync = async (url) =>
{
bool downloadSemaphoreAcquired = true;
try
{
// download the page
var data = await DownloadNextPageAsync(
url).ConfigureAwait(false);
if (String.IsNullOrEmpty(data))
{
Volatile.Write(ref complete, true);
}
else
{
// enable the next download to happen
downloadSemaphore.Release();
downloadSemaphoreAcquired = false;
// process this download
await processSemaphore.WaitAsync();
try
{
await Task.Run(() => ProcessResults(data));
}
finally
{
processSemaphore.Release();
}
}
}
catch (Exception)
{
Volatile.Write(ref complete, true);
throw;
}
finally
{
if (downloadSemaphoreAcquired)
downloadSemaphore.Release();
}
};
// do the page and save the task
Func<string, Task> queuePageAsync = async (url) =>
{
var task = doPageAsync(url);
lock (protect)
tasks.Add(task);
await task;
lock (protect)
tasks.Remove(task);
};
// process pages in a loop until complete is true
while (!Volatile.Read(ref complete))
{
page++;
// acquire download semaphore synchrnously
await downloadSemaphore.WaitAsync().ConfigureAwait(false);
// do the page
var task = queuePageAsync(startUrl + "?page=" + page);
}
// await completion of the pending tasks
Task[] pendingTasks;
lock (protect)
pendingTasks = tasks.ToArray();
await Task.WhenAll(pendingTasks);
}
static void Main(string[] args)
{
new Program().DownloadAndProcessAllAsync("http://google.com", 10, 5).Wait();
Console.ReadLine();
}
}
}
Something like this should work. There might be some edge cases, but all in all it should ensure a minimum of completions.
public static async Task WhenN(IEnumerable<Task> tasks, int n, CancellationTokenSource cts = null)
{
var pending = new HashSet<Task>(tasks);
if (n > pending.Count)
{
n = pending.Count;
// or throw
}
var completed = 0;
while (completed != n)
{
var completedTask = await Task.WhenAny(pending);
pending.Remove(completedTask);
completed++;
}
if (cts != null)
{
cts.Cancel();
}
}
Usage:
static void Main(string[] args)
{
var tasks = new List<Task>();
var completed = 0;
var cts = new CancellationTokenSource();
for (int i = 0; i < 100; i++)
{
tasks.Add(Task.Run(async () =>
{
await Task.Delay(temp * 100, cts.Token);
Console.WriteLine("Completed task {0}", i);
completed++;
}, cts.Token));
}
Extensions.WhenN(tasks, 30, cts).Wait();
Console.WriteLine(completed);
Console.ReadLine();
}
Task[] runningTasks = MyTasksFactory.StartTasks();
while(runningTasks.Any())
{
int finished = Task.WaitAny(runningTasks);
Task.Factory.StareNew(()=> {Consume(runningTasks[Finished].Result);})
runningTasks.RemoveAt(finished);
}
I'm trying to create an asynchronous console app that does a some work on a collection. I have one version which uses parallel for loop another version that uses async/await. I expected the async/await version to work similar to parallel version but it executes synchronously. What am I doing wrong?
public class Program
{
public static void Main(string[] args)
{
var worker = new Worker();
worker.ParallelInit();
var t = worker.Init();
t.Wait();
Console.ReadKey();
}
}
public class Worker
{
public async Task<bool> Init()
{
var series = Enumerable.Range(1, 5).ToList();
foreach(var i in series)
{
Console.WriteLine("Starting Process {0}", i);
var result = await DoWorkAsync(i);
if (result)
{
Console.WriteLine("Ending Process {0}", i);
}
}
return true;
}
public async Task<bool> DoWorkAsync(int i)
{
Console.WriteLine("working..{0}", i);
await Task.Delay(1000);
return true;
}
public bool ParallelInit()
{
var series = Enumerable.Range(1, 5).ToList();
Parallel.ForEach(series, i =>
{
Console.WriteLine("Starting Process {0}", i);
DoWorkAsync(i);
Console.WriteLine("Ending Process {0}", i);
});
return true;
}
}
The way you're using the await keyword tells C# that you want to wait each time you pass through the loop, which isn't parallel. You can rewrite your method like this to do what you want, by storing a list of Tasks and then awaiting them all with Task.WhenAll.
public async Task<bool> Init()
{
var series = Enumerable.Range(1, 5).ToList();
var tasks = new List<Task<Tuple<int, bool>>>();
foreach (var i in series)
{
Console.WriteLine("Starting Process {0}", i);
tasks.Add(DoWorkAsync(i));
}
foreach (var task in await Task.WhenAll(tasks))
{
if (task.Item2)
{
Console.WriteLine("Ending Process {0}", task.Item1);
}
}
return true;
}
public async Task<Tuple<int, bool>> DoWorkAsync(int i)
{
Console.WriteLine("working..{0}", i);
await Task.Delay(1000);
return Tuple.Create(i, true);
}
Your code waits for each operation (using await) to finish before starting the next iteration.
Therefore, you don't get any parallelism.
If you want to run an existing asynchronous operation in parallel, you don't need await; you just need to get a collection of Tasks and call Task.WhenAll() to return a task that waits for all of them:
return Task.WhenAll(list.Select(DoWorkAsync));
public async Task<bool> Init()
{
var series = Enumerable.Range(1, 5);
Task.WhenAll(series.Select(i => DoWorkAsync(i)));
return true;
}
In C# 7.0 you can use semantic names to each of the members of the tuple, here is Tim S.'s answer using the new syntax:
public async Task<bool> Init()
{
var series = Enumerable.Range(1, 5).ToList();
var tasks = new List<Task<(int Index, bool IsDone)>>();
foreach (var i in series)
{
Console.WriteLine("Starting Process {0}", i);
tasks.Add(DoWorkAsync(i));
}
foreach (var task in await Task.WhenAll(tasks))
{
if (task.IsDone)
{
Console.WriteLine("Ending Process {0}", task.Index);
}
}
return true;
}
public async Task<(int Index, bool IsDone)> DoWorkAsync(int i)
{
Console.WriteLine("working..{0}", i);
await Task.Delay(1000);
return (i, true);
}
You could also get rid of task. inside foreach:
// ...
foreach (var (IsDone, Index) in await Task.WhenAll(tasks))
{
if (IsDone)
{
Console.WriteLine("Ending Process {0}", Index);
}
}
// ...
We can use async method in foreach loop to run async API calls.
public static void Main(string[] args)
{
List<ZoneDetails> lst = GetRecords();
foreach (var item in lst)
{
//For loop run asyn
var result = GetAPIData(item.ZoneId, item.fitnessclassid).Result;
if (result != null && result.EventHistoryId != null)
{
UpdateDB(result);
}
}
}
private static async Task<FODBrandChannelLicense> GetAPIData(int zoneId, int fitnessclassid)
{
HttpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token);
var response = HttpClient.GetAsync(new Uri(url)).Result;
var content = response.Content.ReadAsStringAsync().Result;
var result = JsonConvert.DeserializeObject<Model>(content);
if (response.EnsureSuccessStatusCode().IsSuccessStatusCode)
{
Console.WriteLine($"API Call completed successfully");
}
return result;
}
To add to the already good answers here, it's always helpful to me to remember that the async method returns a Task.
So in the example in this question, each iteration of the loop has await. This causes the Init() method to return control to its caller with a Task<bool> - not a bool.
Thinking of await as just a magic word that causes execution state to be saved, then skipped to the next available line until ready, encourages confusion: "why doesn't the for loop just skip the line with await and go to the next statement?"
If instead you think of await as something more like a yield statement, that brings a Task with it when it returns control to the caller, in my opinion flow starts to make more sense: "the for loop stops at await, and returns control and the Task to the caller. The for loop won't continue until that is done."