public async Task<List<string>> getAllQueries()
{
List<string> allQueries = new List<string>();
for (int i = 0; i < 10; i++)
{
List<string> queries = await getQueriesForId(i);
allQueries.AddRange(queries);
}
return allQueries;
}
Is there anything wrong with this code? I am not getting the correct results, and I don't know much about async/await. I observed that this function returns the list without combining the results from all of the concurrent calls. Could somebody please tell me how to combine the lists from all of the concurrent calls and only then return?
I would use the Task.WhenAll method and combine the results once they have all been materialized. Consider the following:
public async Task<List<string>> GetAllQueriesAsync()
{
var tasks =
Enumerable.Range(0, 10)
.Select(i => GetQueriesForIdAsync(i));
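// Use the array returned by WhenAll; enumerating the lazy "tasks" sequence again would start the queries a second time.
var results = await Task.WhenAll(tasks);
return results.SelectMany(r => r).ToList();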
}
I made several key changes in this snippet:
Suffixed the Task- and Task<T>-returning methods with "Async"
Used Enumerable.Range instead of a for loop
The Select call returns an IEnumerable<Task<List<string>>>
All of the queries run concurrently instead of one after another
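For reference, here is a minimal self-contained sketch of the same idea. It assumes a hypothetical GetQueriesForIdAsync that simulates I/O with Task.Delay; the class name QueryCombiner and the CallCount field are made up for illustration. The sketch materializes the task sequence with ToList so that each query starts exactly once, and reads the results from the array that Task.WhenAll returns.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class QueryCombiner
{
    // Counts invocations of the hypothetical query method (illustration only).
    public static int CallCount;

    // Hypothetical stand-in for the real query; simulates I/O with a short delay.
    public static async Task<List<string>> GetQueriesForIdAsync(int id)
    {
        Interlocked.Increment(ref CallCount);
        await Task.Delay(50).ConfigureAwait(false);
        return new List<string> { $"query-{id}-a", $"query-{id}-b" };
    }

    public static async Task<List<string>> GetAllQueriesAsync()
    {
        // ToList starts every task exactly once. A lazy Select would start
        // new tasks each time the sequence is enumerated.
        List<Task<List<string>>> tasks = Enumerable.Range(0, 10)
            .Select(i => GetQueriesForIdAsync(i))
            .ToList();

        // WhenAll returns the results in the same order as the input tasks.
        List<string>[] results = await Task.WhenAll(tasks).ConfigureAwait(false);
        return results.SelectMany(r => r).ToList();
    }
}
```

Calling await QueryCombiner.GetAllQueriesAsync() should take roughly as long as one simulated query rather than ten, because all ten delays overlap.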
I would recommend using Task.WhenAll(). I created this handy extension method, which you might find useful:
public static Task<TResult[]> SelectAsync<T, TResult>(
this IEnumerable<T> list,
Func<T, Task<TResult>> functionToPerform)
{
var tasks = list.Select(functionToPerform.Invoke);
return Task.WhenAll(tasks);
}
And here's an example of how to use it:
var results = await myItems.SelectAsync(item => DoStuff(item)).ConfigureAwait(false);
I have this example code:
private async Task<IEnumerable<long>> GetValidIds1(long[] ids)
{
var validIds = new List<long>();
var results = await Task.WhenAll(ids.Select(i => CheckValidIdAsync(i)))
.ConfigureAwait(false);
for (int i = 0; i < ids.Length; i++)
{
if (results[i])
{
validIds.Add(ids[i]);
}
}
return validIds;
}
private async Task<IEnumerable<long>> GetValidIds2(long[] ids)
{
var validIds = new ConcurrentBag<long>();
await Task.WhenAll(ids.Select(async i =>
{
var valid = await CheckValidIdAsync(i);
if (valid)
validIds.Add(i);
})).ConfigureAwait(false);
return validIds;
}
private async Task<bool> CheckValidIdAsync(long id);
I currently use GetValidIds1(), but it has the inconvenience of tying the input ids to the results by index at the end.
GetValidIds2() is what I want to write, but I have a few concerns:
I have an await inside the Select lambda expression. Because LINQ is lazily evaluated, I don't think it would block the other CheckValidIdAsync() calls from starting, but exactly whose context does it suspend? Per the MSDN documentation:
The await operator suspends evaluation of the enclosing async method until the asynchronous operation represented by its operand completes.
So in this case the enclosing async method is the lambda expression itself, so it doesn't affect the other calls?
Is there a better way to process the results of an async method and collect the output of that processing in a list?
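To make the first concern concrete, here is a small sketch (all names in it are hypothetical) that counts how many Select lambdas have started before any of them is allowed to finish. Each lambda awaits a shared gate task that is not completed until the whole sequence has been enumerated, so the count shows whether an await inside one lambda holds up the others.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AwaitInSelectDemo
{
    // Returns how many lambdas had started before any of them could complete.
    public static async Task<int> CountStartedBeforeAnyCompletesAsync(int count)
    {
        var gate = new TaskCompletionSource<bool>();
        int started = 0;

        // ToList forces the lazy Select to run, invoking each lambda once.
        // Each lambda runs synchronously up to its first incomplete await,
        // then hands its Task back to Select, so the next lambda can start.
        List<Task> tasks = Enumerable.Range(0, count)
            .Select(async i =>
            {
                Interlocked.Increment(ref started);
                await gate.Task.ConfigureAwait(false); // suspends only this lambda
            })
            .ToList();

        int startedBeforeRelease = started;

        gate.SetResult(true); // let every suspended lambda finish
        await Task.WhenAll(tasks).ConfigureAwait(false);
        return startedBeforeRelease;
    }
}
```

In this sketch the returned count equals the number of items, which is consistent with the quoted documentation: the await suspends only the lambda that contains it.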
Another way to do it is to project each long ID to a Task<ValueTuple<long, bool>>, instead of projecting it to a Task<bool>. This way you'll be able to filter the results using pure LINQ:
private async Task<long[]> GetValidIds3(long[] ids)
{
IEnumerable<Task<(long Id, bool IsValid)>> tasks = ids
.Select(async id =>
{
bool isValid = await CheckValidIdAsync(id).ConfigureAwait(false);
return (id, isValid);
});
var results = await Task.WhenAll(tasks).ConfigureAwait(false);
return results
.Where(e => e.IsValid)
.Select(e => e.Id)
.ToArray();
}
The above GetValidIds3 is equivalent to the GetValidIds1 in your question: it returns the filtered IDs in the same order as the original ids. In contrast, GetValidIds2 doesn't guarantee any order. If you have to use a concurrent collection, it's better to use a ConcurrentQueue<T> instead of a ConcurrentBag<T>, because the former preserves the insertion order. Even if the order is not important, preserving it makes debugging easier.
The first function is designed to let LINQ execute lambda functions safely in parallel (even the async void ones).
So you can do collection.AsParallel().ForAllAsync(async x => await x.Action).
The second function is designed to let you combine and execute multiple IAsyncEnumerables in parallel and return their results as quickly as possible.
I have the following code:
public static async Task ForAllAsync<TSource>(
this ParallelQuery<TSource> source,
Func<TSource, Task> selector,
int? maxDegreeOfParallelism = null)
{
int maxAsyncThreadCount = maxDegreeOfParallelism ?? Math.Min(System.Environment.ProcessorCount, 128);
using SemaphoreSlim throttler = new SemaphoreSlim(maxAsyncThreadCount, maxAsyncThreadCount);
IEnumerable<Task> tasks = source.Select(async input =>
{
await throttler.WaitAsync().ConfigureAwait(false);
try
{
await selector(input).ConfigureAwait(false);
}
finally
{
throttler.Release();
}
});
await Task.WhenAll(tasks).ConfigureAwait(true);
}
public static async IAsyncEnumerable<T> ForAllAsync<TSource, T>(
this ParallelQuery<TSource> source,
Func<TSource, IAsyncEnumerable<T>> selector,
int? maxDegreeOfParallelism = null,
[EnumeratorCancellation]CancellationToken cancellationToken = default)
where T : new()
{
IEnumerable<(IAsyncEnumerator<T>, bool)> enumerators =
source.Select(x => (selector.Invoke(x).GetAsyncEnumerator(cancellationToken), true)).ToList();
while (enumerators.Any())
{
await enumerators.AsParallel()
.ForAllAsync(async e => e.Item2 = (await e.Item1.MoveNextAsync()), maxDegreeOfParallelism)
.ConfigureAwait(false);
foreach (var enumerator in enumerators)
{
yield return enumerator.Item1.Current;
}
enumerators = enumerators.Where(e => e.Item2);
}
}
If I remove the "ToList()" from the second function, yield return starts to return null as enumerator.Item1.Current tends to be null, despite enumerator.Item2 (the result from MoveNextAsync()) being true.
Why?
This is a classic case of deferred execution. Every time you invoke an evaluating method on a non-materialized IEnumerable<>, it redoes the work to produce the elements. In this case, that means re-invoking your selector and calling GetAsyncEnumerator again, which creates new enumerator instances.
With the call to .ToList() you materialize the IEnumerable once. Without it, materialization occurs with every call to .Any(), the call to ForAllAsync(), and at your foreach loop.
The same behavior can be reproduced minimally like this:
var enumerable = new[] { 1 }.Select(_ => Task.Delay(10));
await Task.WhenAll(enumerable);
Console.WriteLine(enumerable.First().IsCompleted); // False
enumerable = enumerable.ToList();
await Task.WhenAll(enumerable);
Console.WriteLine(enumerable.First().IsCompleted); // True
In the first call to enumerable.First(), we end up with a different task instance than the one that we awaited in the line before it.
In the second call, we're using the same instance because the Task was already materialized into a List.
I am passing an async delegate to the LINQ Select method, and I would prefer to get a list of ValueTasks instead of a list of Tasks. How can I do it? Example:
var result = (new[] { 0 }).Select(async x => await Task.Yield()).ToArray();
Console.WriteLine($"Result type: {result.GetType()}");
Result type: System.Threading.Tasks.Task[]
This is not desirable. I figured out that I can create the list I want by replacing the async delegate with an async method, like this:
var result = (new[] { 0 }).Select(DoAsync).ToArray();
Console.WriteLine($"Result type: {result.GetType()}");
async ValueTask DoAsync(int arg)
{
await Task.Yield();
}
Result type: System.Threading.Tasks.ValueTask[]
This works but it's awkward. Is there any way to keep the neat delegate syntax, and still get the ValueTasks I want?
You can specify the ValueTask return type explicitly through the generic type arguments of Select, like this:
var result = (new[] { 0 }).Select<int, ValueTask>(async x => await Task.Yield()).ToArray();
Suppose you have a list of strings (or of any other type, just using string as an example), e.g.
IEnumerable<string> fullList = ...;
and an async predicate, e.g.
static Task<bool> IncludeString(string s) { ... }
What is the simplest way of filtering the list by that predicate, with the following constraints:
Predicates should not be run sequentially (assume the list is long and the async predicate is slow)
Resulting filtered list should preserve ordering
I did find a solution, but it involves creating a temporary list that has the result of the predicate for each entry, and then using that to do the filtering. It just doesn't feel elegant enough. Here it is:
var includedIndices = await Task.WhenAll(fullList.Select(IncludeString));
var filteredList = fullList.Where((_, i) => includedIndices[i]);
It feels like something that should be possible with a simple framework call, but I wasn't able to find one.
It's not particularly elegant, but you can pair each value with its predicate result by projecting to an anonymous type in a Task.ContinueWith call inside the Select, awaiting Task.WhenAll on those tasks, and then filtering on the values carried in the task results.
public async Task<T[]> FilterAsync<T>(IEnumerable<T> sourceEnumerable, Func<T, Task<bool>> predicateAsync)
{
return (await Task.WhenAll(
sourceEnumerable.Select(
v => predicateAsync(v)
.ContinueWith(task => new { Predicate = task.Result, Value = v })))
).Where(a => a.Predicate).Select(a => a.Value).ToArray();
}
Example usage (made-up function for demonstration):
// Returns { "ab", "abcd" } after 1000ms
string[] evenLengthStrings = await FilterAsync<string>(new string[] { "a", "ab", "abc", "abcd" }, (async s => { await Task.Delay(1000); return s.Length % 2 == 0; }));
Note that even without the ToArray call, the returned enumerable will not re-enumerate the source enumerable when enumerated - it will not be lazy because Task.WhenAll doesn't return a LINQy lazy enumerable.
You could create your own implementation of the LINQ function you need. Note that this version awaits each predicate in turn, so the predicates run sequentially:
public static async Task<IEnumerable<TIn>> FilterAsync<TIn>(this IEnumerable<TIn> source, Func<TIn, Task<bool>> action)
{
if (source == null) throw new ArgumentNullException(nameof(source));
if (action == null) throw new ArgumentNullException(nameof(action));
var result = new List<TIn>();
foreach (var item in source)
{
if (await action(item))
{
result.Add(item);
}
}
return result;
}
Then you can use it like so
IEnumerable<string> example = new List<string> { "a", "", null, " ", "e" };
var validStrings = await example.FilterAsync(IncludeString);
// returns { "a", "e" }
given this implementation of IncludeString
public static Task<bool> IncludeString(string s) {
return Task.FromResult(!string.IsNullOrWhiteSpace(s));
}
So it basically runs an async Func<TIn, Task<bool>> (here a Func<string, Task<bool>>) for each item in the list.
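If the predicates must not run one after another, a concurrent variant is possible. The following is only a sketch under that assumption: the name WhereConcurrentAsync is made up and is not a framework method. It starts every predicate before awaiting any of them, pairs each item with its verdict in a tuple, and relies on Task.WhenAll returning results in input order to preserve ordering.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncFilterExtensions
{
    // Hypothetical helper name; not part of the framework.
    public static async Task<List<T>> WhereConcurrentAsync<T>(
        this IEnumerable<T> source,
        Func<T, Task<bool>> predicate)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (predicate == null) throw new ArgumentNullException(nameof(predicate));

        // Pair each item with the outcome of its predicate.
        IEnumerable<Task<(T Item, bool Keep)>> tasks = source.Select(async item =>
            (Item: item, Keep: await predicate(item).ConfigureAwait(false)));

        // Task.WhenAll enumerates the sequence once, which starts all the
        // predicates, and returns the results in input order.
        (T Item, bool Keep)[] results = await Task.WhenAll(tasks).ConfigureAwait(false);

        return results.Where(r => r.Keep).Select(r => r.Item).ToList();
    }
}
```

Usage would look like var validStrings = await example.WhereConcurrentAsync(IncludeString); the trade-off is that every predicate is started up front, with no throttling.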
I want to combine the results of 2 tasks in one List collection.
Note that I want to run both methods in parallel.
Code:
List<Employee> totalEmployees = new List<Employee>();
Method1:
public async Task<IEnumerable<Employee>> SearchEmployeeFromDb();
Method2:
public async Task<IEnumerable<Employee>> GetEmployeeFromService();
Now I want to hold the results of these two methods in the totalEmployees field, and the two methods should run asynchronously.
While many answers are close, the cleanest and most efficient option is using Task.WhenAll combined with SelectMany:
async Task<IEnumerable<Employee>> Combine()
{
var results = await Task.WhenAll(SearchEmployeeFromDb(), GetEmployeeFromService());
return results.SelectMany(result => result);
}
This assumes that by parallel you mean concurrently. If you wish to run these operations with multiple threads from the beginning (including the synchronous parts of the async method) you need to also use Task.Run to offload work to a ThreadPool thread:
private async Task<IEnumerable<Employee>> Combine()
{
var results =
await Task.WhenAll(Task.Run(() => SearchEmployeeFromDb()), Task.Run(() => GetEmployeeFromService()));
return results.SelectMany(result => result);
}
Start both tasks
Use Task.WhenAll to wait for both tasks to finish
Use Enumerable.Concat to combine the results
var searchEmployeesTask = SearchEmployeeFromDb();
var getEmployeesTask = GetEmployeeFromService();
await Task.WhenAll(searchEmployeesTask, getEmployeesTask);
var totalEmployees = searchEmployeesTask.Result.Concat(getEmployeesTask.Result);
You can use Task.WhenAll to create a task that completes when all of the supplied tasks are complete:
var result = await Task.WhenAll(SearchEmployeeFromDb(), GetEmployeeFromService());
var combined = result[0].Concat(result[1]);
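As a runnable illustration of this pattern, here is a sketch with made-up pieces: the Employee class, the EmployeeLoader wrapper, and both method bodies are hypothetical stand-ins that simulate I/O with Task.Delay. Both tasks are started before either is awaited, and the two result sequences are concatenated into a single list.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical model type for illustration.
public sealed class Employee
{
    public string Name { get; set; } = "";
}

public static class EmployeeLoader
{
    // Hypothetical stand-in for the database search.
    public static async Task<IEnumerable<Employee>> SearchEmployeeFromDb()
    {
        await Task.Delay(50).ConfigureAwait(false);
        return new[] { new Employee { Name = "db-1" }, new Employee { Name = "db-2" } };
    }

    // Hypothetical stand-in for the service call.
    public static async Task<IEnumerable<Employee>> GetEmployeeFromService()
    {
        await Task.Delay(50).ConfigureAwait(false);
        return new[] { new Employee { Name = "svc-1" } };
    }

    public static async Task<List<Employee>> GetTotalEmployeesAsync()
    {
        // Both calls start here, before either is awaited, so they overlap.
        Task<IEnumerable<Employee>> dbTask = SearchEmployeeFromDb();
        Task<IEnumerable<Employee>> serviceTask = GetEmployeeFromService();

        // results[0] belongs to dbTask and results[1] to serviceTask.
        IEnumerable<Employee>[] results =
            await Task.WhenAll(dbTask, serviceTask).ConfigureAwait(false);

        return results[0].Concat(results[1]).ToList();
    }
}
```

Returning a List<Employee> rather than a lazy Concat sequence is a choice made here so the result can be assigned directly to totalEmployees.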
Something like this should work:
var t1 = SearchEmployeeFromDb();
var t2 = GetEmployeeFromService();
await Task.WhenAll(t1, t2);
// Now use t1.Result and t2.Result to get `totalEmployees`
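Start both tasks first and only then await them, so that they run concurrently. ConfigureAwait(false) avoids capturing the caller's synchronization context, which helps prevent deadlocks when a caller blocks on the result.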
var fromDbTask = SearchEmployeeFromDb().ConfigureAwait(false);
var fromServiceTask = GetEmployeeFromService().ConfigureAwait(false);
var fromDbResult = await fromDbTask;
var totalEmployees = new List<Employee>(fromDbResult);
var fromServiceResult = await fromServiceTask;
totalEmployees.AddRange(fromServiceResult);
... or use whichever way you want to merge the two lists.
I updated the solution: it was unnecessary to create the list first and then append the first result. Instead, we wait for the first method to finish and then create the list from its result.