This question already has answers here:
Task constructor vs Task.Run with async Action - different behavior
(3 answers)
Queue of async tasks with throttling which supports muti-threading
(5 answers)
Closed 1 year ago.
I have a simple taken form MS documentation implementation of generic throttle function.
public async Task RunTasks(List<Task> actions, int maxThreads)
{
var queue = new ConcurrentQueue<Task>(actions);
var tasks = new List<Task>();
for (int n = 0; n < maxThreads; n++)
{
tasks.Add(Task.Run(async () =>
{
while (queue.TryDequeue(out Task action))
{
action.Start();
await action;
int i = 9; //this should not be reached.
}
}));
}
await Task.WhenAll(tasks);
}
To test it I have a unit test:
[Fact]
public async Task TaskRunningLogicThrottles1()
{
var tasks = new List<Task>();
const int limit = 2;
for (int i = 0; i < 2000; i++)
{
var task = new Task(async () => {
await Task.Delay(-1);
});
tasks.Add(task);
}
await _logic.RunTasks(tasks, limit);
}
Since there is a Delay(-1) in the tasks this test should never complete. The line "int y = 9;" in the RunTasks function should never be reached. However it does and my whole function fails to do what it is supposed to do - throttle execution. If instead or await Task.Delay() I used synchronous Thread.Sleep ot works as exptected.
The Task class has no constructor that accepts async delegates, so the async delegate you passed to it is async void. This is a common trap. Just because the compiler allows us to add the async keyword to any lambda, doesn't mean that we should. We should only pass async lambdas to methods that expect and understand them, meaning that the type of the parameter should be a Func with a Task return value. For example Func<Task>, or Func<Task<T>>, or Func<TSource, Task<TResult>> etc.
What you could do is to pass the async lambda to a Task<TResult> constructor, in which case the TResult would be resolved as Task. In other words you could create nested Task<Task> instances:
var taskTask = new Task<Task>(async () =>
{
await Task.Delay(-1);
});
This way you would have a cold outer task, that when started would create the inner task. The work required to create a task is negligible, so the inner task will be created instantly. The inner task would be a promise-style task, like all tasks generated by async methods. A promise-style task is always hot on creation. It cannot be created in a cold state like a delegate-based task. Calling its Start method results to an InvalidOperationException.
Creating cold tasks and nested tasks is an advanced technique that is used rarely in practice. The common technique for starting promise-style tasks on demand is to pass them around as async delegates (Func<Task> instances in various flavors), and invoke each delegate at the right moment.
Related
This question already has answers here:
Parallel.ForEach and async-await [duplicate]
(4 answers)
Parallel foreach with asynchronous lambda
(10 answers)
Closed 12 days ago.
I was reading this post about Parallel.ForEach where it was stated that "Parallel.ForEach is not compatible with passing in a async method."
So, to check I write this code:
static async Task Main(string[] args)
{
var results = new ConcurrentDictionary<string, int>();
Parallel.ForEach(Enumerable.Range(0, 100), async index =>
{
var res = await DoAsyncJob(index);
results.TryAdd(index.ToString(), res);
});
Console.ReadLine();
}
static async Task<int> DoAsyncJob(int i)
{
Thread.Sleep(100);
return await Task.FromResult(i * 10);
}
This code fills in the results dictionary concurrently.
By the way, I created a dictionary of type ConcurrentDictionary<string, int> because in case I have ConcurrentDictionary<int, int> when I explore its elements in debug mode I see that elements are sorted by the key and I thought that elenents was added consequently.
So, I want to know is my code is valid? If it "is not compatible with passing in a async method" why it works well?
This code works only because DoAsyncJob isn't really an asynchronous method. async doesn't make a method work asynchronously. Awaiting a completed task like that returned by Task.FromResult is synchronous too. async Task Main doesn't contain any asynchronous code, which results in a compiler warning.
An example that demonstrates how Parallel.ForEach doesn't work with asynchronous methods should call a real asynchronous method:
static async Task Main(string[] args)
{
var results = new ConcurrentDictionary<string, int>();
Parallel.ForEach(Enumerable.Range(0, 100), async index =>
{
var res = await DoAsyncJob(index);
results.TryAdd(index.ToString(), res);
});
Console.WriteLine($"Items in dictionary {results.Count}");
}
static async Task<int> DoAsyncJob(int i)
{
await Task.Delay(100);
return i * 10;
}
The result will be
Items in dictionary 0
Parallel.ForEach has no overload accepting a Func<Task>, it accepts only Action delegates. This means it can't await any asynchronous operations.
async index is accepted because it's implicitly an async void delegate. As far as Parallel.ForEach is concerned, it's just an Action<int>.
The result is that Parallel.ForEach fires off 100 tasks and never waits for them to complete. That's why the dictionary is still empty when the application terminates.
An async method is one that starts and returns a Task.
Your code here
Parallel.ForEach(Enumerable.Range(0, 100), async index =>
{
var res = await DoAsyncJob(index);
results.TryAdd(index.ToString(), res);
});
runs async methods 100 times in parallel. That's to say it parallelises the task creation, not the whole task. By the time ForEach has returned, your tasks are running but they are not necessarily complete.
You code works because DoAsyncJob() not actually asynchronous - your Task is completed upon return. Thread.Sleep() is a synchronous method. Task.Delay() is its asynchronous equivalent.
Understand the difference between CPU-bound and I/O-bound operations. As others have already pointed out, parallelism (and Parallel.ForEach) is for CPU-bound operations and asynchronous programming is not appropriate.
If you already have asynchronous work, you don't need Parallel.ForEach:
static async Task Main(string[] args)
{
var results = await new Task.WhenAll(
Enumerable.Range(0, 100)
Select(i => DoAsyncJob(I)));
Console.ReadLine();
}
Regarding your async job, you either go async all the way:
static async Task<int> DoAsyncJob(int i)
{
await Task.Delay(100);
return await Task.FromResult(i * 10);
}
Better yet:
static async Task<int> DoAsyncJob(int i)
{
await Task.Delay(100);
return i * 10;
}
or not at all:
static Task<int> DoAsyncJob(int i)
{
Thread.Sleep(100);
return Task.FromResult(i * 10);
}
I have bunch of async methods, which I invoke from Dispatcher. The methods does not perform any work in the background, they just waits for some I/O operations, or wait for response from the webserver.
async Task FetchAsync()
{
// Prepare request in UI thread
var response = await new WebClient().DownloadDataTaskAsync();
// Process response in UI thread
}
now, I want to perform load tests, by calling multiple FetchAsync() in parallel with some max degree of parallelism.
My first attempt was using Paralell.Foreach(), but id does not work well with async/await.
var option = new ParallelOptions {MaxDegreeOfParallelism = 10};
Parallel.ForEach(UnitsOfWork, uow => uow.FetchAsync().Wait());
I've been looking at reactive extensions, but I'm still not able to take advantage of Dispatcher and async/await.
My goal is to not create separate thread for each FetchAsync(). Can you give me some hints how to do it?
Just call Fetchasync without awaiting each call and then use Task.WhenAll to await all of them together.
var tasks = new List<Task>();
var max = 10;
for(int i = 0; i < max; i++)
{
tasks.Add(FetchAsync());
}
await Task.WhenAll(tasks);
Here is a generic reusable solution to your question that you can reuse not only with your FetchAsync method but for any async method that has the same signature. The api includes real time concurrent throttling support as well:
Parameters are self explanatory:
totalRequestCount: is how many async requests (FatchAsync calls) you want to do in total, async processor is the FetchAsync method itself, maxDegreeOfParallelism is the optional nullable parameter. If you want real time concurrent throttling with max number of concurrent async requests, set it, otherwise not.
public static Task ForEachAsync(
int totalRequestCount,
Func<Task> asyncProcessor,
int? maxDegreeOfParallelism = null)
{
IEnumerable<Task> tasks;
if (maxDegreeOfParallelism != null)
{
SemaphoreSlim throttler = new SemaphoreSlim(maxDegreeOfParallelism.Value, maxDegreeOfParallelism.Value);
tasks = Enumerable.Range(0, totalRequestCount).Select(async requestNumber =>
{
await throttler.WaitAsync();
try
{
await asyncProcessor().ConfigureAwait(false);
}
finally
{
throttler.Release();
}
});
}
else
{
tasks = Enumerable.Range(0, totalRequestCount).Select(requestNumber => asyncProcessor());
}
return Task.WhenAll(tasks);
}
I'm needing to create a list of tasks to execute a routine that takes one parameter and then wait for those tasks to complete before continuing with the rest of the program code. Here is an example:
List<Task> tasks = new List<Task>();
foreach (string URL in LIST_URL_COLLECTION)
{
tasks[i] = Task.Factory.StartNew(
GoToURL(URL)
);
}
//wait for them to finish
Console.WriteLine("Done");
I've have googled and searched this site but I just keep hitting a dead end, I did this once but can't remember how.
The Task Parallel Library exposes a convinent way to asynchronously wait for the completion of all tasks via the Task.WhenAll method. The method returns a Task by itself which is awaitable and should be awaited:
public async Task QueryUrlsAsync()
{
var urlFetchingTasks = ListUrlCollection.Select(url => Task.Run(url));
await Task.WhenAll(urlFetchingTasks);
Console.WriteLine("Done");
}
Note that in order to await, your method must be marked with the async modifier in the method signature and return either a Task (if it has no return value) or a Task<T> (if it does have a return value, which type is T).
As a side note, your method looks like it's fetching urls, which i am assuming is generating a web request to some endpoint. In order to do that, there's no need to use extra threads via Task.Factory.StartNew or Task.Run, as these operations are naturally asynchronous. You should look into HttpClient as a starting point. For example, your method could look like this:
public async Task QueryUrlsAsync()
{
var urlFetchingTasks = ListUrlCollection.Select(url =>
{
var httpClient = new HttpClient();
return httpClient.GetAsync(url);
});
await Task.WhenAll(urlFetchingTasks);
Console.WriteLine("Done");
}
I am attempting to run async methods from a synchronous method. But I can't await the async method since I am in a synchronous method. I must not be understanding TPL as this is the fist time I'm using it.
private void GetAllData()
{
GetData1()
GetData2()
GetData3()
}
Each method needs the previous method to finish as the data from the first is used for the second.
However, inside each method I want to start multiple Task operations in order to speed up the performance. Then I want to wait for all of them to finish.
GetData1 looks like this
internal static void GetData1 ()
{
const int CONCURRENCY_LEVEL = 15;
List<Task<Data>> dataTasks = new List<Task<Data>>();
for (int item = 0; item < TotalItems; item++)
{
dataTasks.Add(MyAyncMethod(State[item]));
}
int taskIndex = 0;
//Schedule tasks to concurency level (or all)
List<Task<Data>> runningTasks = new List<Task<Data>>();
while (taskIndex < CONCURRENCY_LEVEL && taskIndex < dataTasks.Count)
{
runningTasks.Add(dataTasks[taskIndex]);
taskIndex++;
}
//Start tasks and wait for them to finish
while (runningTasks.Count > 0)
{
Task<Data> dataTask = await Task.WhenAny(runningTasks);
runningTasks.Remove(dataTask);
myData = await dataTask;
//Schedule next concurrent task
if (taskIndex < dataTasks.Count)
{
runningTasks.Add(dataTasks[taskIndex]);
taskIndex++;
}
}
Task.WaitAll(dataTasks.ToArray()); //This probably isn't necessary
}
I am using await here but get an Error
The 'await' operator can only be used within an async method. Consider
marking this method with the 'async' modifier and changing its return
type to 'Task'
However, if I use the async modifier this will be an asynchronous operation. Therefore, if my call to GetData1 doesn't use the await operator won't control go to GetData2 on the first await, which is what I am trying to avoid? Is it possible to keep GetData1 as a synchronous method that calls an asynchronous method? Am I designing the Asynchronous method incorrectly? As you can see I'm quite confused.
This could be a duplicate of How to call asynchronous method from synchronous method in C#? However, I'm not sure how to apply the solutions provided there as I'm starting multiple tasks, want to WaitAny, do a little more processing for that task, then wait for all tasks to finish before handing control back to the caller.
UPDATE
Here is the solution I went with based on the answers below:
private static List<T> RetrievePageTaskScheduler<T>(
List<T> items,
List<WebPageState> state,
Func<WebPageState, Task<List<T>>> func)
{
int taskIndex = 0;
// Schedule tasks to concurency level (or all)
List<Task<List<T>>> runningTasks = new List<Task<List<T>>>();
while (taskIndex < CONCURRENCY_LEVEL_PER_PROCESSOR * Environment.ProcessorCount
&& taskIndex < state.Count)
{
runningTasks.Add(func(state[taskIndex]));
taskIndex++;
}
// Start tasks and wait for them to finish
while (runningTasks.Count > 0)
{
Task<List<T>> task = Task.WhenAny(runningTasks).Result;
runningTasks.Remove(task);
try
{
items.AddRange(task.Result);
}
catch (AggregateException ex)
{
/* Throwing this exception means that if one task fails
* don't process any more of them */
// https://stackoverflow.com/questions/8853693/pattern-for-implementing-sync-methods-in-terms-of-non-parallel-task-translating
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Capture(
ex.Flatten().InnerExceptions.First()).Throw();
}
// Schedule next concurrent task
if (taskIndex < state.Count)
{
runningTasks.Add(func(state[taskIndex]));
taskIndex++;
}
}
return items;
}
Task<TResult>.Result (or Task.Wait() when there's no result) is similar to await, but is a synchronous operation. You should change GetData1() to use this. Here's the portion to change:
Task<Data> dataTask = Task.WhenAny(runningTasks).Result;
runningTasks.Remove(dataTask);
myData = gameTask.Result;
First, I recommend that your "internal" tasks not use Task.Run in their implementation. You should use an async method that does the CPU-bound portion synchronously.
Once your MyAsyncMethod is an async method that does some CPU-bound processing, then you can wrap it in a Task and use parallel processing as such:
internal static void GetData1()
{
// Start the tasks
var dataTasks = Enumerable.Range(0, TotalItems)
.Select(item => Task.Run(() => MyAyncMethod(State[item]))).ToList();
// Wait for them all to complete
Task.WaitAll(dataTasks);
}
Your concurrency limiting in your original code won't work at all, so I removed it for simpilicity. If you want to apply a limit, you can either use SemaphoreSlim or TPL Dataflow.
You can call the following:
GetData1().Wait();
GetData2().Wait();
GetData3().Wait();
I'm having some trouble getting a task to asynchronously delay. I am writing an application that needs to run at a scale of tens/hundreds of thousands of asynchronously executing scripts. I am doing this using C# Actions and sometimes, during the execution of a particular sequence, in order for the script to execute properly, it needs to wait on an external resource to reach an expected state. At first I wrote this using Thread.Sleep() but that turned out to be a torpedo in the applications performance, so I'm looking into async/await for async sleep. But I can't get it to actually wait on the pause! Can someone explain this?
static void Main(string[] args)
{
var sync = new List<Action>();
var async = new List<Action>();
var syncStopWatch = new Stopwatch();
sync.Add(syncStopWatch.Start);
sync.Add(() => Thread.Sleep(1000));
sync.Add(syncStopWatch.Stop);
sync.Add(() => Console.Write("Sync:\t" + syncStopWatch.ElapsedMilliseconds + "\n"));
var asyncStopWatch = new Stopwatch();
sync.Add(asyncStopWatch.Start);
sync.Add(async () => await Task.Delay(1000));
sync.Add(asyncStopWatch.Stop);
sync.Add(() => Console.Write("Async:\t" + asyncStopWatch.ElapsedMilliseconds + "\n"));
foreach (Action a in sync)
{
a.Invoke();
}
foreach (Action a in async)
{
a.Invoke();
}
}
The results of the execution are:
Sync: 999
Async: 2
How do I get it to wait asynchronously?
You're running into a problem with async void. When you pass an async lambda to an Action, the compiler is creating an async void method for you.
As a best practice, you should avoid async void.
One way to do this is to have your list of actions actually be a List<Func<Task>> instead of List<Action>. This allows you to queue async Task methods instead of async void methods.
This means your "execution" code would have to wait for each Task as it completes. Also, your synchronous methods would have to return Task.FromResult(0) or something like that so they match the Func<Task> signature.
If you want a bigger scope solution, I recommend you strongly consider TPL Dataflow instead of creating your own queue.