Here's a dumbed-down version of what I want to do:
private static int Inc(int input)
{
return input + 1;
}
private static async Task<int> IncAsync(int input)
{
await Task.Delay(200);
return input + 1;
}
private static async Task<IEnumerable<TResult>> GetResultsAsync<TInput, TResult>(Func<TInput, TResult> func, IEnumerable<TInput> values)
{
var tasks = values.Select(value => Task.Run(() => func(value)))
.ToList();
await Task.WhenAll(tasks);
return tasks.Select(t => t.Result);
}
public async void TestAsyncStuff()
{
var numbers = new[] { 1, 2, 3, 4 };
var resultSync = await GetResultsAsync(Inc, numbers); // returns IEnumerable<int>
Console.WriteLine(string.Join(",", resultSync.Select(n => $"{n}")));
// The next line is the important one:
var resultAsync = await GetResultsAsync(IncAsync, numbers); // returns IEnumerable<Task<int>>
}
So basically, GetResultsAsync() is intended to be a generic method that will get the results of a function for a set of input values. In TestAsyncStuff() you can see how it would work for calling a synchronous function (Inc()).
The trouble comes when I want to call an asynchronous function (IncAsync()). The result I get back is of type IEnumerable<Task<int>>. I could do a Task.WhenAll() on that result, and that works:
var tasksAsync = (await GetResultsAsync(IncAsync, numbers)).ToList();
await Task.WhenAll(tasksAsync);
var resultAsync = tasksAsync.Select(t => t.Result);
Console.WriteLine(string.Join(",", resultAsync.Select(n => $"{n}")));
But I'd like to tighten up the code and do the await inline. It should look something like this:
var resultAsync = await GetResultsAsync(async n => await IncAsync(n), numbers);
But that also returns an IEnumerable<Task<int>>! I could do this:
var resultAsync = await GetResultsAsync(n => IncAsync(n).GetAwaiter().GetResult(), numbers);
And that works... but from what I've seen, use of Task.GetAwaiter().GetResult() or Task.Result is not encouraged.
So what is the correct way to do this?
You should create two overloads of GetResultsAsync. One should accept a 'synchronous' delegate which returns TResult. This method will wrap each delegate into a task, and run them asynchronously:
private static async Task<IEnumerable<TResult>> GetResultsAsync<TInput, TResult>(
Func<TInput, TResult> func, IEnumerable<TInput> values)
{
var tasks = values.Select(value => Task.Run(() => func(value)));
return await Task.WhenAll(tasks);
}
The second overload will accept an 'asynchronous' delegate, which returns Task<TResult>. This method doesn't need to wrap each delegate into a task, because they are already tasks:
private static async Task<IEnumerable<TResult>> GetResultsAsync<TInput, TResult>(
Func<TInput, Task<TResult>> func, IEnumerable<TInput> values)
{
var tasks = values.Select(value => func(value));
return await Task.WhenAll(tasks);
}
You can even call the second method from the first one to avoid code duplication:
private static async Task<IEnumerable<TResult>> GetResultsAsync<TInput, TResult>(
Func<TInput, TResult> func, IEnumerable<TInput> values)
{
return await GetResultsAsync(x => Task.Run(() => func(x)), values);
}
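As a minimal, self-contained sketch of how the two overloads resolve, assuming the Inc and IncAsync methods from the question (the Program class and Main method are only a hypothetical console host): when a Task-returning delegate is passed, the compiler should prefer the overload taking Func<TInput, Task<TResult>> because it is the more specific match, so both calls yield plain int results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class Program
{
    private static int Inc(int input) => input + 1;

    private static async Task<int> IncAsync(int input)
    {
        await Task.Delay(200);
        return input + 1;
    }

    // Synchronous delegate: wrap each call in Task.Run, then reuse the async overload.
    public static Task<IEnumerable<TResult>> GetResultsAsync<TInput, TResult>(
        Func<TInput, TResult> func, IEnumerable<TInput> values)
    {
        return GetResultsAsync(x => Task.Run(() => func(x)), values);
    }

    // Asynchronous delegate: the calls already return tasks, so just await them all.
    public static async Task<IEnumerable<TResult>> GetResultsAsync<TInput, TResult>(
        Func<TInput, Task<TResult>> func, IEnumerable<TInput> values)
    {
        var tasks = values.Select(value => func(value));
        return await Task.WhenAll(tasks);
    }

    public static async Task Main()
    {
        var numbers = new[] { 1, 2, 3, 4 };

        // The explicit IEnumerable<int> types make the compiler verify
        // that neither call produces IEnumerable<Task<int>>.
        IEnumerable<int> resultSync = await GetResultsAsync(Inc, numbers);
        IEnumerable<int> resultAsync = await GetResultsAsync(IncAsync, numbers);

        Console.WriteLine(string.Join(",", resultSync));  // 2,3,4,5
        Console.WriteLine(string.Join(",", resultAsync)); // 2,3,4,5
    }
}
```

Task.WhenAll returns its results in the order of the input tasks, so the output order matches the input order regardless of which task finishes first.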
NOTE: These methods don't simplify your life a lot. The same results can be achieved with
var resultSync = await Task.WhenAll(numbers.Select(x => Task.Run(() => Inc(x))));
var resultAsync = await Task.WhenAll(numbers.Select(IncAsync));
I'd say that your concern is a stylistic one: you want something that reads better. For your first case consider:
var resultSync = numbers.AsParallel()/*.AsOrdered()*/.Select(Inc);
on the grounds that Plinq already does what you're trying to do: It parallelizes IEnumerables. For your second case, there's no point in creating Tasks around Tasks. The equivalent would be:
var resultAsync = numbers.AsParallel()./*AsOrdered().*/Select(n => IncAsync(n).Result);
but I like Sergey's await Task.WhenAll(numbers.Select(IncAsync)) better.
Perhaps what I really like is a Linq style pair of overloads:
var numbers = Enumerable.Range(1,6);
var resultSync = await Enumerable.Range(1,6).SelectAsync(Inc);
var resultAsync = await Enumerable.Range(1,100).SelectAsync(IncAsync);
Console.WriteLine("sync" + string.Join(",", resultSync));
Console.WriteLine("async" + string.Join(",", resultAsync));
static class IEnumerableTasks
{
public static Task<TResult[]> SelectAsync<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> func)
{
return Task.WhenAll( source.Select(async n => await Task.Run(()=> func(n))));
}
public static Task<TResult[]> SelectAsync<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, Task<TResult>> func)
{
return Task.WhenAll(source.Select(func));
}
}
static int Inc(int input)
{
Task.Delay(1000).Wait();
return input+1;
}
static async Task<int> IncAsync(int input)
{
await Task.Delay(1000);
return input + 1;
}
Incidentally, if you change Range(1,6) to Range(1,40), this shows the advantage of async. On my machine, the timing for the sync version rises steeply, whereas the async version stays at a second or so even for Range(1, 100000).
I'm trying to make a database query inside a LINQ statement asynchronous, but I'm running into an error. The code below runs fine without async/await:
var newEntities = _repositoryMapping.Mapper.Map<List<Entry>>(entries);
newEntities = newEntities.Where(async e => await !_context.Entries.AnyAsync(c => c.Id == e.Id)).ToList();
Severity Code Description Project File Line Suppression State
Error CS4010 Cannot convert async lambda expression to delegate type 'Func<Entry, bool>'. An async lambda expression may return void, Task or Task<T>, none of which are convertible to 'Func<Entry, bool>'.
Other than breaking this up into a foreach loop, how can I make this work with async/await?
If you care about performance, the code should be smarter: send a single query and check which entities are already present in the database.
Here is an extension method that does this in a generic way:
newEntities = (await newEntities.FilterExistentAsync(_context.Entries, e => e.Id)).ToList();
The implementation is not very complex:
public static class QueryableExtensions
{
public static async Task<IEnumerable<T>> FilterExistentAsync<T, TProp>(this ICollection<T> items,
IQueryable<T> dbQuery, Expression<Func<T, TProp>> prop, CancellationToken cancellationToken = default)
{
var propGetter = prop.Compile();
var ids = items.Select(propGetter).ToList();
var parameter = prop.Parameters[0];
var predicate = Expression.Call(typeof(Enumerable), "Contains", new[] { typeof(TProp) }, Expression.Constant(ids), prop.Body);
var predicateLambda = Expression.Lambda(predicate, parameter);
var filtered = Expression.Call(typeof(Queryable), "Where", new[] {typeof(T)}, dbQuery.Expression,
predicateLambda);
var selectExpr = Expression.Call(typeof(Queryable), "Select", new[] {typeof(T), typeof(TProp)}, filtered, prop);
var selectQuery = dbQuery.Provider.CreateQuery<TProp>(selectExpr);
var existingIds = await selectQuery.ToListAsync(cancellationToken);
return items.Where(i => !existingIds.Contains(propGetter(i)));
}
}
To fix the exception, you can add an extension method for IEnumerable that supports an async predicate:
public static class MyExtensions
{
public static async Task<IEnumerable<T>> Where<T>(this IEnumerable<T> source,
Func<T, Task<bool>> func)
{
var tasks = new List<Task<bool>>();
foreach (var element in source)
{
tasks.Add(func(element));
}
var results = await Task.WhenAll<bool>(tasks.ToArray());
var trueIndex = results.Select((x, index) => new { x, index })
.Where(x => x.x)
.Select(x => x.index).ToList();
var filterSource = source.Where((x, index) => trueIndex.Contains(index));
return filterSource;
}
}
Then you can use something like the following:
var result = await users.Where(async x => await TestAsync(x));
Full code here https://dotnetfiddle.net/lE2swz
I have the below method (it's an extension method but not relevant to this question) and I would like to use GroupBy on the results of the method.
class MyClass
{
public async Task<string> GetRank()
{
return "X";
}
public async static Task Test()
{
List<MyClass> items = new List<MyClass>() { new MyClass() };
var grouped = items.GroupBy(async _ => (await _.GetRank()));
}
}
The type of grouped is IGrouping<Task<string>, MyClass>, however I need to group by the actual awaited result of the async method (string). Despite using await and making the lambda async, I still get IGrouping<Task<string>, ..> instead of IGrouping<string, ...>
How can I use GroupBy to group by the result of an async Task<string> method and get a grouping keyed by string?
You probably are looking to await all your tasks first, then group
// projection to task
var tasks = items.Select(y => AsyncMethod(y));
// Await them all
var results = await Task.WhenAll(tasks);
// group stuff
var groups = results.GroupBy(x => ...);
Full Demo here
Note: You didn't really have any testable code, so I just plumbed up something similar.
Update
The reason why your example isn't working:
items.GroupBy(async _ => (await _.GetRank()))
is that an async lambda is really just a method that returns a task; this is why you are getting IGrouping<Task<string>, MyClass>.
You need to wait for all your tasks to finish before you can do anything with their results.
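The "await everything first, then group" idea can be sketched as a complete program. This is only an illustration under stated assumptions: the MyClass constructor taking a rank, the CountByRankAsync helper, and the Program host are hypothetical and not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class MyClass
{
    private readonly string _rank;

    // Hypothetical constructor so that different items can have different ranks.
    public MyClass(string rank) => _rank = rank;

    // Stand-in for the real asynchronous lookup.
    public async Task<string> GetRank()
    {
        await Task.Delay(10);
        return _rank;
    }
}

public static class Program
{
    public static async Task<Dictionary<string, int>> CountByRankAsync(IEnumerable<MyClass> items)
    {
        // 1. Start one task per item; each yields the item together with its awaited key.
        var tasks = items.Select(async item => (Key: await item.GetRank(), Item: item));

        // 2. Wait until every key is available.
        var pairs = await Task.WhenAll(tasks);

        // 3. Group synchronously on the plain string keys.
        return pairs.GroupBy(p => p.Key, p => p.Item)
                    .ToDictionary(g => g.Key, g => g.Count());
    }

    public static async Task Main()
    {
        var items = new List<MyClass> { new MyClass("X"), new MyClass("Y"), new MyClass("X") };
        var counts = await CountByRankAsync(items);
        foreach (var kv in counts)
            Console.WriteLine($"{kv.Key}: {kv.Value}");
    }
}
```

Because the grouping runs only after Task.WhenAll has completed, the key selector passed to GroupBy sees ordinary strings rather than tasks.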
To further explain what is happening take a look at this SharpLab example
Your async lambda basically resolves to this
new Func<int, Task<string>>(<>c__DisplayClass1_.<M>b__0)
Here is an asynchronous version of GroupBy. It expects a task as the result of keySelector, and returns a task that can be awaited:
public static async Task<IEnumerable<IGrouping<TKey, TSource>>>
GroupByAsync<TSource, TKey>(this IEnumerable<TSource> source,
Func<TSource, Task<TKey>> keySelector)
{
var tasks = source.Select(async item => (Key: await keySelector(item), Item: item));
var entries = await Task.WhenAll(tasks);
return entries.GroupBy(entry => entry.Key, entry => entry.Item);
}
It can be used like this:
class MyClass
{
public async Task<string> GetRank()
{
await Task.Delay(100);
return "X";
}
public async static Task Test()
{
var items = new List<MyClass>() { new MyClass(), new MyClass() };
var grouped = items.GroupByAsync(async _ => (await _.GetRank()));
foreach (var grouping in await grouped)
{
Console.WriteLine($"Key: {grouping.Key}, Count: {grouping.Count()}");
}
}
}
Output:
Key: X, Count: 2
I'm trying to change Stephen Toub's ForEachAsync<T> extension method into an extension which returns a result...
Stephen's extension:
public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
{
return Task.WhenAll(
from partition in Partitioner.Create(source).GetPartitions(dop)
select Task.Run(async delegate {
using (partition)
while (partition.MoveNext())
await body(partition.Current);
}));
}
My approach (not working; the tasks get executed but the result is wrong):
public static Task<TResult[]> ForEachAsync<T, TResult>(this IList<T> source,
int degreeOfParallelism, Func<T, Task<TResult>> body)
{
return Task.WhenAll<TResult>(
from partition in Partitioner.Create(source).GetPartitions(degreeOfParallelism)
select Task.Run<TResult>(async () =>
{
using (partition)
while (partition.MoveNext())
await body(partition.Current); // When I "return await",
// I get good results but only one per partition
return default(TResult);
}));
}
I know I somehow have to return (WhenAll?) the results from the last part, but I haven't yet figured out how to do it...
Update: The result I get is just degreeOfParallelism times null (I guess because of default(TResult)) even though all the tasks get executed. I also tried to return await body(...) and then the result was fine, but only degreeOfParallelism number of tasks got executed.
Now that the Parallel.ForEachAsync API has become part of the standard libraries (.NET 6), it makes sense to implement a variant that returns a Task<TResult[]>, based on this API. Here is an implementation:
/// <summary>
/// Executes a foreach loop on an enumerable sequence, in which iterations may run
/// in parallel, and returns the results of all iterations in the original order.
/// </summary>
public static Task<TResult[]> ForEachAsync<TSource, TResult>(
IEnumerable<TSource> source,
ParallelOptions parallelOptions,
Func<TSource, CancellationToken, ValueTask<TResult>> body)
{
ArgumentNullException.ThrowIfNull(source);
ArgumentNullException.ThrowIfNull(parallelOptions);
ArgumentNullException.ThrowIfNull(body);
List<TResult> results = new();
if (source.TryGetNonEnumeratedCount(out int count)) results.Capacity = count;
IEnumerable<(TSource, int)> withIndexes = source.Select((x, i) => (x, i));
return Parallel.ForEachAsync(withIndexes, parallelOptions, async (entry, ct) =>
{
(TSource item, int index) = entry;
TResult result = await body(item, ct).ConfigureAwait(false);
lock (results)
{
while (results.Count <= index) results.Add(default);
results[index] = result;
}
}).ContinueWith(t =>
{
if (t.IsFaulted)
{
TaskCompletionSource<TResult[]> tcs = new();
tcs.SetException(t.Exception.InnerExceptions);
return tcs.Task;
}
if (t.IsCanceled)
{
TaskCompletionSource<TResult[]> tcs = new();
tcs.SetCanceled(new TaskCanceledException(t).CancellationToken);
return tcs.Task;
}
Debug.Assert(t.IsCompletedSuccessfully);
lock (results) return Task.FromResult(results.ToArray());
}, default, TaskContinuationOptions.DenyChildAttach |
TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default).Unwrap();
}
This implementation supports all the options and the functionality of the Parallel.ForEachAsync overload that has an IEnumerable<T> as source. Its behavior in case of errors and cancellation is identical. The results are arranged in the same order as the associated elements in the source sequence.
Your LINQ query can only ever have the same number of results as the number of partitions - you're just projecting each partition into a single result.
If you don't care about the order, you just need to assemble the results of each partition into a list, then flatten them afterwards.
public static async Task<TResult[]> ExecuteInParallel<T, TResult>(this IList<T> source, int degreeOfParallelism, Func<T, Task<TResult>> body)
{
var lists = await Task.WhenAll<List<TResult>>(
Partitioner.Create(source).GetPartitions(degreeOfParallelism)
.Select(partition => Task.Run<List<TResult>>(async () =>
{
var list = new List<TResult>();
using (partition)
{
while (partition.MoveNext())
{
list.Add(await body(partition.Current));
}
}
return list;
})));
return lists.SelectMany(list => list).ToArray();
}
(I've renamed this from ForEachAsync, as ForEach sounds imperative (suitable for the Func<T, Task> in the original) whereas this is fetching results. A foreach loop doesn't have a result - this does.)
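If the order of the results does matter, one alternative is to drop the partitioner and throttle with a SemaphoreSlim instead. The following is a minimal sketch of that different technique, not the partition-based approach above; the name ExecuteInParallelOrdered and the Program host are hypothetical. It relies on Task.WhenAll returning results in the order of the tasks it was given.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelExtensions
{
    // Hypothetical helper: runs body for every item with at most
    // degreeOfParallelism calls in flight, returning results in source order.
    public static async Task<TResult[]> ExecuteInParallelOrdered<T, TResult>(
        this IEnumerable<T> source, int degreeOfParallelism, Func<T, Task<TResult>> body)
    {
        using var throttle = new SemaphoreSlim(degreeOfParallelism);

        var tasks = source.Select(async item =>
        {
            await throttle.WaitAsync();   // wait for a free slot
            try { return await body(item); }
            finally { throttle.Release(); }
        }).ToList();                      // ToList creates each task exactly once

        // WhenAll yields results in the order of the tasks, i.e. source order.
        return await Task.WhenAll(tasks);
    }
}

public static class Program
{
    public static async Task Main()
    {
        var squares = await Enumerable.Range(1, 10).ExecuteInParallelOrdered(3, async n =>
        {
            await Task.Delay(50 - n * 4); // later items take less time
            return n * n;
        });

        Console.WriteLine(string.Join(",", squares)); // 1,4,9,16,25,36,49,64,81,100
    }
}
```

Unlike the Task.Run version, this sketch does not move work onto thread-pool threads by itself, so any synchronous part of body runs on the calling thread; wrapping body in Task.Run would restore that behavior if needed.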
How can I create a new task with multiple parameters, a return type, and creation options by using new?
Task<int> task = Task<int>(DoWork(0,1));
private static Task<int> DoWork(int a, int b)
{
return null;
}
This works fine, but when I try to create the task with the new keyword so that I can set the creation options to LongRunning, like this:
Task<int> task = new Task<int>(DoWork(0,1), TaskCreationOptions.LongRunning);
I always get errors like:
Argument 1: cannot convert from 'System.Threading.Tasks.Task<int>' to 'System.Func<int>'
I tried many different variants but had no luck. I understand that I am probably passing the "Func function" parameter incorrectly. I would like to avoid an anonymous function. Thanks.
You can pass the method as a Lambda Expression:
Task<Task<int>> task = new Task<Task<int>>(() => DoWork(0,1), TaskCreationOptions.LongRunning);
However, it is recommended to use Task.Factory.StartNew if possible, so that you return a hot task instead of a cold task (which requires you to call Task.Start).
Task<Task<int>> task = Task.Factory.StartNew(() => DoWork(0,1), TaskCreationOptions.LongRunning);
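Since DoWork itself returns a Task<int>, StartNew produces a nested Task<Task<int>>. As a minimal sketch (the StartLongRunning helper and the Program host are hypothetical names, and DoWork is assumed to return the sum of its arguments), the nesting can be flattened with Unwrap or by awaiting twice:

```csharp
using System;
using System.Threading.Tasks;

public static class Program
{
    // Assumed implementation for illustration: returns an already completed task.
    private static Task<int> DoWork(int a, int b) => Task.FromResult(a + b);

    // Hypothetical helper: starts DoWork as a long-running task and
    // flattens the resulting Task<Task<int>> into a Task<int>.
    public static Task<int> StartLongRunning(int a, int b)
    {
        Task<Task<int>> nested = Task.Factory.StartNew(
            () => DoWork(a, b), TaskCreationOptions.LongRunning);

        // Unwrap() returns a proxy task that completes when the inner task does.
        return nested.Unwrap();
    }

    public static async Task Main()
    {
        Console.WriteLine(await StartLongRunning(0, 1)); // 1

        // Equivalent without Unwrap: await the outer task, then the inner one.
        Task<Task<int>> nested = Task.Factory.StartNew(
            () => DoWork(2, 3), TaskCreationOptions.LongRunning);
        Console.WriteLine(await await nested);           // 5
    }
}
```

Note that LongRunning only applies to the outer delegate; if DoWork were truly asynchronous, the dedicated thread would be released as soon as DoWork returned its task.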
public SomeClass()
{
var func = new Func<int, int, int>((i1, i2) => i1 + i2);
Task.Factory.StartNew(() =>
Debug.WriteLine(func(1, 2)), TaskCreationOptions.LongRunning);
Task.Factory.StartNew(() =>
Debug.WriteLine(DoWork(2, 3).Result), TaskCreationOptions.LongRunning);
}
private static Task<int> DoWork(int a, int b)
{
return Task.FromResult(a + b);
}
The constructor for Task<T> requires a Func<T> argument.
Task<int> task = Task<int>(DoWork(0,1));
is attempting to call the DoWork method and pass the returned Task<int> as the argument to the Task<int> constructor. You need to pass a Func<int> instead, by changing the return type of DoWork to:
private static int DoWork(int a, int b) { ... }
then you can do:
Task<int> task = new Task<int>(() => DoWork(0,1), TaskCreationOptions.LongRunning);
I have a function like this:
public async Task<SomeViewModel> SampleFunction()
{
var data = service.GetData();
var myList = new List<SomeViewModel>();
myList.AddRange(data.Select(x => new SomeViewModel
{
Id = x.Id,
DateCreated = x.DateCreated,
Data = await service.GetSomeDataById(x.Id)
}));
return myList;
}
My await isn't working as it can only be used in a method or lambda marked with the async modifier. Where do I place the async with this function?
You can only use await inside an async method/delegate. In this case you must mark that lambda expression as async.
But wait, there's more...
Select is from the pre-async era and so it doesn't handle async lambdas (in your case it would return IEnumerable<Task<SomeViewModel>> instead of IEnumerable<SomeViewModel> which is what you actually need).
You can, however, add that functionality yourself (preferably as an extension method), but you need to consider whether you wish to await each item before moving on to the next (sequentially) or await all items together at the end (concurrently).
Sequential async
static async Task<TResult[]> SelectAsync<TItem, TResult>(this IEnumerable<TItem> enumerable, Func<TItem, Task<TResult>> selector)
{
var results = new List<TResult>();
foreach (var item in enumerable)
{
results.Add(await selector(item));
}
return results.ToArray();
}
Concurrent async
static Task<TResult[]> SelectAsync<TItem, TResult>(this IEnumerable<TItem> enumerable, Func<TItem, Task<TResult>> selector)
{
return Task.WhenAll(enumerable.Select(selector));
}
Usage
public Task<SomeViewModel[]> SampleFunction()
{
return service.GetData().SelectAsync(async x => new SomeViewModel
{
Id = x.Id,
DateCreated = x.DateCreated,
Data = await service.GetSomeDataById(x.Id)
});
}
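To make the difference between the two variants visible, here is a small sketch that logs when each selector call starts and ends. The method names SelectSequentialAsync and SelectConcurrentAsync are hypothetical (chosen so that both variants can coexist in one class), and the Work function merely stands in for a real service call.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncSelectExtensions
{
    public static async Task<TResult[]> SelectSequentialAsync<TItem, TResult>(
        this IEnumerable<TItem> items, Func<TItem, Task<TResult>> selector)
    {
        var results = new List<TResult>();
        foreach (var item in items)
            results.Add(await selector(item)); // next call starts only after this one ends
        return results.ToArray();
    }

    public static Task<TResult[]> SelectConcurrentAsync<TItem, TResult>(
        this IEnumerable<TItem> items, Func<TItem, Task<TResult>> selector)
    {
        return Task.WhenAll(items.Select(selector)); // all calls start before any is awaited
    }
}

public static class Program
{
    public static async Task Main()
    {
        var log = new ConcurrentQueue<string>();

        // Stand-in for an asynchronous service call.
        async Task<int> Work(int n)
        {
            log.Enqueue($"start {n}");
            await Task.Delay(50);
            log.Enqueue($"end {n}");
            return n * 10;
        }

        await new[] { 1, 2, 3 }.SelectSequentialAsync(n => Work(n));
        Console.WriteLine(string.Join(", ", log));
        // start 1, end 1, start 2, end 2, start 3, end 3

        log = new ConcurrentQueue<string>();
        await new[] { 1, 2, 3 }.SelectConcurrentAsync(n => Work(n));
        Console.WriteLine(string.Join(", ", log.Take(3)));
        // start 1, start 2, start 3
    }
}
```

Both variants return the results in source order; they differ only in how many calls are in flight at once, which matters when the underlying service cannot handle concurrent requests.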
You're using await inside of a lambda, and that lambda is going to be transformed into its own separate named method by the compiler. To use await it must itself be async, and not just be defined in an async method. When you make the lambda async you now have a sequence of tasks that you want to translate into a sequence of their results, asynchronously. Task.WhenAll does exactly this, so we can pass our new query to WhenAll to get a task representing our results, which is exactly what this method wants to return:
public Task<SomeViewModel[]> SampleFunction()
{
return Task.WhenAll(service.GetData().Select(
async x => new SomeViewModel
{
Id = x.Id,
DateCreated = x.DateCreated,
Data = await service.GetSomeDataById(x.Id)
}));
}
Though maybe too heavyweight for your use case, using TPL Dataflow will give you finer control over your async processing.
public async Task<List<SomeViewModel>> SampleFunction()
{
var data = service.GetData();
var transformBlock = new TransformBlock<X, SomeViewModel>(
async x => new SomeViewModel
{
Id = x.Id,
DateCreated = x.DateCreated,
Data = await service.GetSomeDataById(x.Id)
},
new ExecutionDataflowBlockOptions
{
// Let 8 "service.GetSomeDataById" calls run at once.
MaxDegreeOfParallelism = 8
});
var result = new List<SomeViewModel>();
var actionBlock = new ActionBlock<SomeViewModel>(
vm => result.Add(vm));
transformBlock.LinkTo(actionBlock,
new DataflowLinkOptions { PropagateCompletion = true });
foreach (var x in data)
{
transformBlock.Post(x);
}
transformBlock.Complete();
await actionBlock.Completion;
return result;
}
This could be substantially less long-winded if service.GetData() returned an IObservable<X> and this method returned an IObservable<SomeViewModel>.