Passing async method into Parallel.ForEach [duplicate]

This question already has answers here:
Parallel.ForEach and async-await [duplicate]
(4 answers)
Parallel foreach with asynchronous lambda
(10 answers)
Closed 12 days ago.
I was reading this post about Parallel.ForEach where it was stated that "Parallel.ForEach is not compatible with passing in a async method."
So, to check, I wrote this code:
static async Task Main(string[] args)
{
var results = new ConcurrentDictionary<string, int>();
Parallel.ForEach(Enumerable.Range(0, 100), async index =>
{
var res = await DoAsyncJob(index);
results.TryAdd(index.ToString(), res);
});
Console.ReadLine();
}
static async Task<int> DoAsyncJob(int i)
{
Thread.Sleep(100);
return await Task.FromResult(i * 10);
}
This code fills in the results dictionary concurrently.
By the way, I used a ConcurrentDictionary<string, int> because with a ConcurrentDictionary<int, int> the debugger showed the elements sorted by key, which made me think the elements had been added sequentially.
So, is my code valid? If Parallel.ForEach "is not compatible with passing in a async method", why does it work?

This code works only because DoAsyncJob isn't really an asynchronous method. async doesn't make a method work asynchronously. Awaiting a completed task like that returned by Task.FromResult is synchronous too. async Task Main doesn't contain any asynchronous code, which results in a compiler warning.
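To make the point above concrete, here is a minimal sketch (the class and method names are hypothetical, chosen only for illustration) that compares a method which blocks and then awaits an already-completed task with one that awaits Task.Delay. The first hands back a task that is already complete, so the caller never sees any asynchrony.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CompletionDemo
{
    // Blocks the calling thread for 100 ms, then awaits an already-completed task.
    public static async Task<int> BlockingJob(int i)
    {
        Thread.Sleep(100);
        return await Task.FromResult(i * 10);
    }

    // Returns to the caller at the first await on a task that is not yet complete.
    public static async Task<int> AsyncJob(int i)
    {
        await Task.Delay(100);
        return i * 10;
    }

    public static async Task Main()
    {
        Task<int> blocking = BlockingJob(1);
        // Always True: the whole method body ran synchronously.
        Console.WriteLine($"BlockingJob completed on return: {blocking.IsCompleted}");

        Task<int> pending = AsyncJob(1);
        // Normally False: the 100 ms delay is still pending at this point.
        Console.WriteLine($"AsyncJob completed on return: {pending.IsCompleted}");

        Console.WriteLine(await blocking + await pending);
    }
}
```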
An example that demonstrates how Parallel.ForEach doesn't work with asynchronous methods should call a real asynchronous method:
static async Task Main(string[] args)
{
var results = new ConcurrentDictionary<string, int>();
Parallel.ForEach(Enumerable.Range(0, 100), async index =>
{
var res = await DoAsyncJob(index);
results.TryAdd(index.ToString(), res);
});
Console.WriteLine($"Items in dictionary {results.Count}");
}
static async Task<int> DoAsyncJob(int i)
{
await Task.Delay(100);
return i * 10;
}
The result will be
Items in dictionary 0
Parallel.ForEach has no overload accepting a Func<Task>, it accepts only Action delegates. This means it can't await any asynchronous operations.
async index is accepted because it's implicitly an async void delegate. As far as Parallel.ForEach is concerned, it's just an Action<int>.
The result is that Parallel.ForEach fires off 100 tasks and never waits for them to complete. That's why the dictionary is still empty when the application terminates.
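As a hedged sketch of one alternative, assuming .NET 6 or later is available: Parallel.ForEachAsync does accept an asynchronous delegate and returns a Task that can be awaited. The class name ForEachAsyncDemo and the RunAsync helper are hypothetical; DoAsyncJob is the method from the example above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncDemo
{
    public static async Task<int> DoAsyncJob(int i)
    {
        await Task.Delay(100);
        return i * 10;
    }

    public static async Task<ConcurrentDictionary<string, int>> RunAsync()
    {
        var results = new ConcurrentDictionary<string, int>();

        // The body returns a ValueTask, so the loop can wait for each iteration to finish.
        await Parallel.ForEachAsync(Enumerable.Range(0, 100), async (index, cancellationToken) =>
        {
            var res = await DoAsyncJob(index);
            results.TryAdd(index.ToString(), res);
        });

        return results;
    }

    public static async Task Main()
    {
        var results = await RunAsync();
        Console.WriteLine($"Items in dictionary {results.Count}"); // Items in dictionary 100
    }
}
```

Because the returned Task is awaited, the dictionary is fully populated before the count is printed.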

An async method is one that starts and returns a Task.
Your code here
Parallel.ForEach(Enumerable.Range(0, 100), async index =>
{
var res = await DoAsyncJob(index);
results.TryAdd(index.ToString(), res);
});
runs async methods 100 times in parallel. That's to say it parallelises the task creation, not the whole task. By the time ForEach has returned, your tasks are running but they are not necessarily complete.
Your code works because DoAsyncJob() is not actually asynchronous: the Task is already completed when the method returns. Thread.Sleep() is a synchronous method; Task.Delay() is its asynchronous equivalent.
Understand the difference between CPU-bound and I/O-bound operations. As others have already pointed out, parallelism (and Parallel.ForEach) is for CPU-bound operations, where asynchronous programming is not appropriate.

If you already have asynchronous work, you don't need Parallel.ForEach:
static async Task Main(string[] args)
{
    // Start all 100 jobs, then asynchronously wait for every one of them.
    var results = await Task.WhenAll(
        Enumerable.Range(0, 100)
            .Select(i => DoAsyncJob(i)));
    Console.ReadLine();
}
Regarding your async job, you either go async all the way:
static async Task<int> DoAsyncJob(int i)
{
await Task.Delay(100);
return await Task.FromResult(i * 10);
}
Better yet:
static async Task<int> DoAsyncJob(int i)
{
await Task.Delay(100);
return i * 10;
}
or not at all:
static Task<int> DoAsyncJob(int i)
{
Thread.Sleep(100);
return Task.FromResult(i * 10);
}

Related

Await and .Result keeps awaiting forever

Why does the following program never end?
namespace Example
{
class Program
{
static async Task Main(string[] args)
{
var result = await new Program().test();
Console.WriteLine(result);
}
private async Task<(int, string)> test()
{
var result = new Task<(int, string)>(() => (10, "sssss"));
return (result.Result.Item1, result.Result.Item2);
}
}
}
What I have noticed is that result.Result or writing await new Task<(int, string)>(() => (10, "sssss")); will let the program keep awaiting forever!!

The Task constructor creates a "cold" task, a task that has not been started. If you await such a task, the await will never complete, because the task will remain forever in the Created status, and will never transition to the RanToCompletion status. To start a task that was created cold, you must call its Start or RunSynchronously methods, preferably passing the TaskScheduler.Default as argument:
private Task<(int, string)> TestAsync()
{
var task = new Task<(int, string)>(() => (10, "sssss"));
task.Start(TaskScheduler.Default);
return task;
}
Creating cold tasks by using the Task constructor is an advanced technique that is used rarely in practice. The common way to create delegate-based tasks is by using the static Task.Run method, that creates hot tasks. The method below is functionally identical¹ with the previous method:
private Task<(int, string)> TestAsync()
{
return Task.Run(() => (10, "sssss"));
}
Be aware that both methods above are considered bad practices, for reasons explained here.
¹ Actually not exactly identical. The second example starts a task having the TaskCreationOptions.DenyChildAttach configuration. More info about this can be found here.
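The cold/hot distinction described above can be observed through the task's Status property. This is a small sketch for illustration only; the class name ColdTaskDemo and the ObserveAsync helper are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public static class ColdTaskDemo
{
    // Returns the task's status before it is started and after it has been awaited.
    public static async Task<TaskStatus[]> ObserveAsync()
    {
        var task = new Task<int>(() => 42);   // cold: nothing has been scheduled yet
        TaskStatus before = task.Status;       // Created

        task.Start(TaskScheduler.Default);     // schedule the delegate on the thread pool
        await task;
        TaskStatus after = task.Status;        // RanToCompletion

        return new[] { before, after };
    }

    public static async Task Main()
    {
        TaskStatus[] statuses = await ObserveAsync();
        Console.WriteLine($"{statuses[0]} -> {statuses[1]}"); // Created -> RanToCompletion
    }
}
```

Without the call to Start, the await would never complete, which is the behavior the question describes.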

Mixing async/await with blocking calls like .Result can cause deadlocks.
There is no need to await if nothing in the method is async. Just return Task.FromResult with your desired value:
private Task<(int, string)> test() {
(int, string) value = (10, "sssss");
return Task.FromResult(value);
}
It will return a successfully completed task with the specified result.

await Task does not wait [duplicate]

This question already has answers here:
Task constructor vs Task.Run with async Action - different behavior
(3 answers)
Queue of async tasks with throttling which supports muti-threading
(5 answers)
Closed 1 year ago.
I have a simple implementation of a generic throttle function, taken from the MS documentation.
public async Task RunTasks(List<Task> actions, int maxThreads)
{
var queue = new ConcurrentQueue<Task>(actions);
var tasks = new List<Task>();
for (int n = 0; n < maxThreads; n++)
{
tasks.Add(Task.Run(async () =>
{
while (queue.TryDequeue(out Task action))
{
action.Start();
await action;
int i = 9; //this should not be reached.
}
}));
}
await Task.WhenAll(tasks);
}
To test it I have a unit test:
[Fact]
public async Task TaskRunningLogicThrottles1()
{
var tasks = new List<Task>();
const int limit = 2;
for (int i = 0; i < 2000; i++)
{
var task = new Task(async () => {
await Task.Delay(-1);
});
tasks.Add(task);
}
await _logic.RunTasks(tasks, limit);
}
Since there is a Delay(-1) in the tasks, this test should never complete. The line "int i = 9;" in the RunTasks function should never be reached. However, it is reached, and my whole function fails to do what it is supposed to do: throttle execution. If I use the synchronous Thread.Sleep instead of await Task.Delay(), it works as expected.

The Task class has no constructor that accepts async delegates, so the async delegate you passed to it is async void. This is a common trap. Just because the compiler allows us to add the async keyword to any lambda, doesn't mean that we should. We should only pass async lambdas to methods that expect and understand them, meaning that the type of the parameter should be a Func with a Task return value. For example Func<Task>, or Func<Task<T>>, or Func<TSource, Task<TResult>> etc.
What you could do is to pass the async lambda to a Task<TResult> constructor, in which case the TResult would be resolved as Task. In other words you could create nested Task<Task> instances:
var taskTask = new Task<Task>(async () =>
{
await Task.Delay(-1);
});
This way you would have a cold outer task that, when started, would create the inner task. The work required to create a task is negligible, so the inner task will be created instantly. The inner task would be a promise-style task, like all tasks generated by async methods. A promise-style task is always hot on creation. It cannot be created in a cold state like a delegate-based task. Calling its Start method results in an InvalidOperationException.
Creating cold tasks and nested tasks is an advanced technique that is used rarely in practice. The common technique for starting promise-style tasks on demand is to pass them around as async delegates (Func<Task> instances in various flavors), and invoke each delegate at the right moment.
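One way to apply this advice to the throttling function from the question is sketched below. It is an illustration under assumptions, not the original author's code: RunThrottledAsync, TrackedWorkAsync and MeasurePeakAsync are hypothetical names, and the work items are Func<Task> delegates that each worker invokes only when it is ready for the next one.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    private static readonly object s_gate = new object();
    private static int s_running;
    private static int s_peak;

    // Runs the delegates with at most maxConcurrency of them in flight at once.
    public static async Task RunThrottledAsync(IEnumerable<Func<Task>> actions, int maxConcurrency)
    {
        var queue = new ConcurrentQueue<Func<Task>>(actions);
        var workers = new List<Task>();
        for (int n = 0; n < maxConcurrency; n++)
        {
            workers.Add(Task.Run(async () =>
            {
                while (queue.TryDequeue(out Func<Task> action))
                {
                    // Invoking the delegate creates the (hot) task; awaiting it
                    // keeps this worker busy until that task has completed.
                    await action();
                }
            }));
        }
        await Task.WhenAll(workers);
    }

    // Sample work item that records how many items run at the same time.
    private static async Task TrackedWorkAsync()
    {
        lock (s_gate)
        {
            s_running++;
            if (s_running > s_peak) s_peak = s_running;
        }
        await Task.Delay(20);
        lock (s_gate) { s_running--; }
    }

    public static async Task<int> MeasurePeakAsync(int count, int limit)
    {
        lock (s_gate) { s_running = 0; s_peak = 0; }
        var actions = Enumerable.Range(0, count).Select(_ => (Func<Task>)TrackedWorkAsync).ToList();
        await RunThrottledAsync(actions, limit);
        lock (s_gate) { return s_peak; }
    }

    public static async Task Main()
    {
        int peak = await MeasurePeakAsync(10, 2);
        Console.WriteLine($"Peak concurrency: {peak} (limit 2)");
    }
}
```

Since each worker awaits a real promise-style task, the observed peak should never exceed the limit.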

Performance impact of using async await when its not necessary

Let's assume I have these methods:
public async Task<Something> GetSomethingAsync()
{
    var somethingService = new SomethingService();
    return await somethingService.GetAsync();
}
and
public Task<Something> GetSomethingAsync()
{
    var somethingService = new SomethingService();
    return somethingService.GetAsync();
}
Both options compile and work the same way. Is there a best practice as to which option is better, or is one faster than the other?
Or is it just syntactic sugar?

In the first method the compiler generates "state machine" code around the method body, and execution resumes at the return await line once the task has completed. Consider the example below:
public async Task<Something> GetSomethingAsync()
{
    var somethingService = new SomethingService();
    // Execution returns to the caller here and resumes only when the Task has completed.
    Something value = await somethingService.GetAsync();
    DoSomething();
    return value;
}
The line DoSomething(); will be executed only after the task returned by GetAsync has completed.
The second approach simply starts GetAsync and returns the corresponding Task to the caller without waiting for completion.
public Task<Something> GetSomethingAsync()
{
    var somethingService = new SomethingService();
    Task<Something> valueTask = somethingService.GetAsync();
    DoSomething();
    return valueTask;
}
So in the example above DoSomething() will be executed straight after the line that assigns valueTask, without waiting for the task to complete.
Whether an async method executes on another thread depends on the method itself.
If the method performs an I/O operation, another thread would only be wasted: it would do nothing but wait for the response. In my opinion, async/await is the perfect approach for I/O operations.
If GetAsync contains, for example, Task.Run, then execution moves to another thread taken from the thread pool.
Below is a short example, not a good one, but it shows the logic I tried to explain:
static async Task GetAsync()
{
for(int i = 0; i < 10; i++)
{
Console.WriteLine($"Iterate GetAsync: {i}");
await Task.Delay(500);
}
}
static Task GetSomethingAsync() => GetAsync();
static void Main(string[] args)
{
Task gettingSomethingTask = GetSomethingAsync();
Console.WriteLine("GetAsync Task returned");
Console.WriteLine("Start sleeping");
Thread.Sleep(3000);
Console.WriteLine("End sleeping");
Console.WriteLine("Before Task awaiting");
gettingSomethingTask.Wait();
Console.WriteLine("After Task awaited");
Console.ReadLine();
}
The output will be:
Iterate GetAsync: 0
GetAsync Task returned
Start sleeping
Iterate GetAsync: 1
Iterate GetAsync: 2
Iterate GetAsync: 3
Iterate GetAsync: 4
Iterate GetAsync: 5
End sleeping
Before Task awaiting
Iterate GetAsync: 6
Iterate GetAsync: 7
Iterate GetAsync: 8
Iterate GetAsync: 9
After Task awaited
As you can see, GetAsync starts executing straight after it is called.
If GetSomethingAsync() is changed to:
static Task GetSomethingAsync() => new Task(async () => await GetAsync());
where GetAsync is wrapped inside another Task, then GetAsync() will not be executed at all and the output will be:
GetAsync Task returned
Start sleeping
End sleeping
Before Task awaiting
After Task awaited
Of course, you will need to remove the line gettingSomethingTask.Wait();, because otherwise the application would just wait for a task that was never even started.

In C# 5, difference between async function and non-async function? [duplicate]

This question already has answers here:
Difference between returning and awaiting a Task in an async method [duplicate]
(3 answers)
Closed 8 years ago.
Suppose I have below code:
static void Main(string[] args)
{
println("begin s");
Task<int> s = CaculateSometingAsync();
println("begin s1");
Task<int> s1 = CaculateSometingAsync1();
println(s.Result.ToString());
println(s1.Result.ToString());
}
static async Task<int> CaculateSometingAsync()
{
return await Task.Factory.StartNew<int>(() =>
{
Thread.Sleep(1000);
return 100;
});
}
static Task<int> CaculateSometingAsync1()
{
return Task.Factory.StartNew<int>(() =>
{
Thread.Sleep(1000);
return 200;
});
}
The result is as follow:
16:55:38 begin s
16:55:38 begin s1
16:55:39 100
16:55:39 200
What I know about these two functions is that they have the same behavior.
Both they create one thread-pool thread to run the task.
Both
Task<int> s = CaculateSometingAsync();
and
Task<int> s1 = CaculateSometingAsync1();
don't block the main thread.
So is there any difference between this two functions?

The difference is the way you're using it.
In the first one (CaculateSometingAsync) you're declaring it as asynchronous, and then you await inside it until it's done. You then return whatever it returns.
In your second one (CaculateSometingAsync1) you just use it as a "fire and forget" kind of thing, so it goes away, waits, and returns straight to where you called it from.
(and why do you use a method println to print the string ? :) )

You await inside CaculateSometingAsync, and the caller could also await s. Note that the caller could await s1 too: a Task<int> can be awaited whether or not the method that returned it is declared async. The way you are using the keywords means there is no difference in behaviour.

How do I convert this to an async task?

Given the following code...
static void DoSomething(int id) {
    Thread.Sleep(50);
    Console.WriteLine(@"DidSomething({0})", id);
}
I know I can convert this to an async task as follows...
static async Task DoSomethingAsync(int id) {
    await Task.Delay(50);
    Console.WriteLine(@"DidSomethingAsync({0})", id);
}
And that by doing so, if I am calling it multiple times (Task.WhenAll), everything will be faster and more efficient than perhaps using Parallel.ForEach or even calling from within a loop.
But for a minute, let's pretend that Task.Delay() does not exist and I actually have to use Thread.Sleep(). I know in reality this is not the case, but this is concept code, and where the Delay/Sleep is there would normally be an I/O operation with no async option (such as early EF).
I have tried the following...
static async Task DoSomethingAsync2(int id) {
    await Task.Run(() => {
        Thread.Sleep(50);
        Console.WriteLine(@"DidSomethingAsync({0})", id);
    });
}
But, though it runs without error, according to Lucian Wischik this is in fact bad practice, as it is merely spinning up threads from the pool to complete each task. It is also slower in the following console application: if you swap between the DoSomethingAsync and DoSomethingAsync2 calls you can see a significant difference in the time that it takes to complete:
static void Main(string[] args) {
MainAsync(args).Wait();
}
static async Task MainAsync(String[] args) {
List<Task> tasks = new List<Task>();
for (int i = 1; i <= 1000; i++)
tasks.Add(DoSomethingAsync2(i)); // Can replace with any version
await Task.WhenAll(tasks);
}
I then tried the following...
static async Task DoSomethingAsync3(int id) {
    await new Task(() => {
        Thread.Sleep(50);
        Console.WriteLine(@"DidSomethingAsync({0})", id);
    });
}
Transplanting this in place of the original DoSomethingAsync, the test never completes and nothing is shown on screen!
I have also tried multiple other variations that either do not compile or do not complete!
So, given the constraint that you cannot call any existing asynchronous methods and must complete both the Thread.Sleep and the Console.WriteLine in an asynchronous task, how do you do it in a manner that is as efficient as the original code?
The objective here, for those of you who are interested, is to give me a better understanding of how to create my own async methods where I am not calling anybody else's. Despite many searches, this seems to be the one area where examples are really lacking. There are many thousands of examples of calling async methods that call other async methods in turn, but I cannot find any that convert an existing void method to an async task where there is no call to a further async task, other than those that use the Task.Run(() => {} ) method.

There are two kinds of tasks: those that execute code (e.g., Task.Run and friends), and those that respond to some external event (e.g., TaskCompletionSource<T> and friends).
What you're looking for is TaskCompletionSource<T>. There are various "shorthand" forms for common situations so you don't always have to use TaskCompletionSource<T> directly. For example, Task.FromResult or TaskFactory.FromAsync. FromAsync is most commonly used if you have an existing *Begin/*End implementation of your I/O; otherwise, you can use TaskCompletionSource<T> directly.
For more information, see the "I/O-bound Tasks" section of Implementing the Task-based Asynchronous Pattern.
The Task constructor is (unfortunately) a holdover from Task-based parallelism, and should not be used in asynchronous code. It can only be used to create a code-based task, not an external event task.
"So, given the constraint that you cannot call any existing asynchronous methods and must complete both the Thread.Sleep and the Console.WriteLine in an asynchronous task, how do you do it in a manner that is as efficient as the original code?"
I would use a timer of some kind and have it complete a TaskCompletionSource<T> when the timer fires. I'm almost positive that's what the actual Task.Delay implementation does anyway.
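The timer idea above might look roughly like the following. This is a sketch under assumptions rather than the actual Task.Delay implementation: DelayWithTimer is a hypothetical name, and the one-shot System.Threading.Timer is kept reachable through the callback that captures it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimerDelayDemo
{
    // Returns a task that completes when a one-shot timer fires.
    // No thread is blocked while the delay is pending.
    public static Task DelayWithTimer(int milliseconds)
    {
        var tcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
        Timer timer = null;

        // Create the timer disabled, so that 'timer' is assigned before it can fire.
        timer = new Timer(_ =>
        {
            timer.Dispose();
            tcs.TrySetResult(true);   // the external event completes the task
        }, null, Timeout.Infinite, Timeout.Infinite);

        timer.Change(milliseconds, Timeout.Infinite);
        return tcs.Task;
    }

    public static async Task Main()
    {
        await DelayWithTimer(50);
        Console.WriteLine("Timer-based delay completed");
    }
}
```

The returned task is an "external event" task in the sense described above: no code runs on its behalf until the timer callback completes the TaskCompletionSource.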

"So, given the constraint that you cannot call any existing asynchronous methods and must complete both the Thread.Sleep and the Console.WriteLine in an asynchronous task, how do you do it in a manner that is as efficient as the original code?"
IMO, the requirement that you really must stick with Thread.Sleep is a very synthetic constraint. Even under this constraint, you can still slightly improve your Thread.Sleep-based code. Instead of this:
static async Task DoSomethingAsync2(int id) {
    await Task.Run(() => {
        Thread.Sleep(50);
        Console.WriteLine(@"DidSomethingAsync({0})", id);
    });
}
You could do this:
static Task DoSomethingAsync2(int id) {
    return Task.Run(() => {
        Thread.Sleep(50);
        Console.WriteLine(@"DidSomethingAsync({0})", id);
    });
}
This way, you'd avoid the overhead of the compiler-generated state machine class. There is a subtle difference between these two code fragments in how exceptions are propagated.
Anyhow, this is not where the bottleneck of the slowdown is.
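The exception-propagation difference mentioned above can be illustrated with a small sketch. The methods LengthElided and LengthAsync are hypothetical and unrelated to the question's code: an exception thrown in a non-async method escapes at the call itself, while the same exception in an async method is stored in the returned task and only surfaces when that task is awaited.

```csharp
using System;
using System.Threading.Tasks;

public static class ExceptionDemo
{
    // No async keyword: the exception is thrown directly to the caller.
    public static Task<int> LengthElided(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        return Task.FromResult(s.Length);
    }

    // async keyword: the exception is captured and placed on the returned task.
    public static async Task<int> LengthAsync(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        return await Task.FromResult(s.Length);
    }

    public static void Main()
    {
        try
        {
            LengthElided(null);
            Console.WriteLine("not reached");
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("Elided: thrown at the call");
        }

        Task<int> task = LengthAsync(null);
        Console.WriteLine($"Async: returned a task, IsFaulted = {task.IsFaulted}"); // True
    }
}
```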
"(it is also slower using the following console application - if you swap between DoSomethingAsync and DoSomethingAsync2 call you can see a significant difference in the time that it takes to complete)"
Let's look one more time at your main loop code:
static async Task MainAsync(String[] args) {
List<Task> tasks = new List<Task>();
for (int i = 1; i <= 1000; i++)
tasks.Add(DoSomethingAsync2(i)); // Can replace with any version
await Task.WhenAll(tasks);
}
Technically, it requests 1000 tasks to be run in parallel, each supposedly to run on its own thread. In an ideal universe, you'd expect to execute Thread.Sleep(50) 1000 times in parallel and complete the whole thing in about 50ms.
However, this request is never satisfied by the TPL's default task scheduler, for a good reason: a thread is a precious and expensive resource. Moreover, the actual number of concurrent operations is limited to the number of CPUs/cores. So in reality, with the default size of ThreadPool, I'm getting 21 pool threads (at peak) serving this operation in parallel. That is why DoSomethingAsync2 / Thread.Sleep takes so much longer than DoSomethingAsync / Task.Delay. DoSomethingAsync doesn't block a pool thread; it only requests one upon the completion of the time-out. Thus, more DoSomethingAsync tasks than DoSomethingAsync2 tasks can actually run in parallel.
The test (a console app):
// https://stackoverflow.com/q/21800450/1768303
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
namespace Console_21800450
{
public class Program
{
static async Task DoSomethingAsync(int id)
{
await Task.Delay(50);
UpdateMaxThreads();
Console.WriteLine(@"DidSomethingAsync({0})", id);
}
static async Task DoSomethingAsync2(int id)
{
await Task.Run(() =>
{
Thread.Sleep(50);
UpdateMaxThreads();
Console.WriteLine(@"DidSomethingAsync2({0})", id);
});
}
static async Task MainAsync(Func<int, Task> tester)
{
List<Task> tasks = new List<Task>();
for (int i = 1; i <= 1000; i++)
tasks.Add(tester(i)); // Can replace with any version
await Task.WhenAll(tasks);
}
volatile static int s_maxThreads = 0;
static void UpdateMaxThreads()
{
var threads = Process.GetCurrentProcess().Threads.Count;
// not using locks for simplicity
if (s_maxThreads < threads)
s_maxThreads = threads;
}
static void TestAsync(Func<int, Task> tester)
{
s_maxThreads = 0;
var stopwatch = new Stopwatch();
stopwatch.Start();
MainAsync(tester).Wait();
Console.WriteLine(
"time, ms: " + stopwatch.ElapsedMilliseconds +
", threads at peak: " + s_maxThreads);
}
static void Main()
{
Console.WriteLine("Press enter to test with Task.Delay ...");
Console.ReadLine();
TestAsync(DoSomethingAsync);
Console.ReadLine();
Console.WriteLine("Press enter to test with Thread.Sleep ...");
Console.ReadLine();
TestAsync(DoSomethingAsync2);
Console.ReadLine();
}
}
}
Output:
Press enter to test with Task.Delay ...
...
time, ms: 1077, threads at peak: 13
Press enter to test with Thread.Sleep ...
...
time, ms: 8684, threads at peak: 21
Is it possible to improve the timing figure for the Thread.Sleep-based DoSomethingAsync2? The only way I can think of is to use TaskCreationOptions.LongRunning with Task.Factory.StartNew. You should think twice before doing this in any real-life application:
static async Task DoSomethingAsync2(int id)
{
await Task.Factory.StartNew(() =>
{
Thread.Sleep(50);
UpdateMaxThreads();
Console.WriteLine(@"DidSomethingAsync2({0})", id);
}, TaskCreationOptions.LongRunning | TaskCreationOptions.PreferFairness);
}
// ...
static void Main()
{
Console.WriteLine("Press enter to test with Task.Delay ...");
Console.ReadLine();
TestAsync(DoSomethingAsync);
Console.ReadLine();
Console.WriteLine("Press enter to test with Thread.Sleep ...");
Console.ReadLine();
TestAsync(DoSomethingAsync2);
Console.ReadLine();
}
Output:
Press enter to test with Thread.Sleep ...
...
time, ms: 3600, threads at peak: 163
The timing gets better, but the price for this is high. This code asks the task scheduler to create a new thread for each new task. Do not expect this thread to come from the pool:
Task.Factory.StartNew(() =>
{
Thread.Sleep(1000);
Console.WriteLine("Thread pool: " +
Thread.CurrentThread.IsThreadPoolThread); // false!
}, TaskCreationOptions.LongRunning).Wait();
