I am using an API that returns an instance of an IEnumerator implementation, and I want to run a function on each object it returns.
I usually use:
while(Enumerator.MoveNext())
{
DoSomething(Enumerator.Current);
}
I don't have an IEnumerable object,
and I want to call DoSomething on each object asynchronously.
I don't want to build an IEnumerable from the IEnumerator, for performance reasons.
What else can I do to loop asynchronously over an IEnumerator?
C# 8.0 has the async streams feature (IAsyncEnumerable<T>), which is consumed with await foreach:
static async Task Main(string[] args)
{
await foreach (var dataPoint in FetchIOTData())
{
Console.WriteLine(dataPoint);
}
Console.ReadLine();
}
static async IAsyncEnumerable<int> FetchIOTData()
{
for (int i = 1; i <= 10; i++)
{
await Task.Delay(1000);//Simulate waiting for data to come through.
yield return i;
}
}
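If the API only hands you an IEnumerator<T>, one option is to keep the MoveNext loop on a single thread and start one asynchronous operation per item, awaiting them all at the end. The following is a minimal sketch, not a definitive implementation; ForEachAsync and its doSomethingAsync parameter are hypothetical names that do not appear in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class EnumeratorAsyncExtensions
{
    // Hypothetical helper: drains the enumerator on the calling thread,
    // starts one asynchronous operation per item, then awaits all of them.
    public static async Task ForEachAsync<T>(
        this IEnumerator<T> enumerator, Func<T, Task> doSomethingAsync)
    {
        var pending = new List<Task>();
        while (enumerator.MoveNext())
        {
            // MoveNext and Current are only ever touched by this one thread.
            pending.Add(doSomethingAsync(enumerator.Current));
        }
        await Task.WhenAll(pending);
    }
}
```

It could be called as await enumerator.ForEachAsync(item => DoSomethingAsync(item)), where DoSomethingAsync is an assumed asynchronous version of DoSomething. Note that this starts every operation without any limit, so for very long sequences some form of throttling would probably be needed.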
I'm trying to write a recursive method to retrieve the parent of an object (and that parent's parent, and so on). That in itself isn't a problem, but the calls are async, which results in the following error:
The body of '.....Recursive(string)' cannot be an iterator block because 'Task<IEnumerable>' is not an iterator interface type [...]csharp(CS1624)
The code:
private async Task<IEnumerable<string>> Recursive(string objectId)
{
var result = await GetParent(objectId);
if (result?.Length > 0)
{
yield return result;
await Recursive(objectId);
}
}
private async Task<string> GetParent(string objectId)
{
return await Task.Run(() => { return $"{objectId}/parent"; });
}
I have also tried IAsyncEnumerable, but that resulted in the following error:
'IAsyncEnumerable' does not contain a definition for 'GetAwaiter' and no accessible extension method 'GetAwaiter' accepting a first argument of type 'IAsyncEnumerable' could be found (are you missing a using directive or an assembly reference?) [...]csharp(CS1061)
private async IAsyncEnumerable<string> Recursive(string objectId)
{
var result = await GetParent(objectId);
if (result?.Length > 0)
{
yield return result;
await Recursive(objectId);
}
}
private async Task<string> GetParent(string objectId)
{
return await Task.Run(() => { return $"{objectId}/parent"; });
}
I'm going to write a while loop to get this to work. But I'm interested if this is possible at all.
Update 2:
Ok, I think I got it. Thanks guys.
private async IAsyncEnumerable<string> Recursive(string objectId)
{
var result = await GetParent(objectId);
if (result?.Length > 0)
{
yield return result;
await foreach (var r2 in Recursive(result))
{
yield return r2;
}
}
}
private async Task<string> GetParent(string objectId)
{
return await Task.Run(() => { return $"{objectId}/parent"; });
}
The current code wouldn't compile even if it were synchronous and returned IEnumerable<string>. The results of the recursive call are never yielded, and an iterator can't simply return another IEnumerable either; each item has to be yielded individually.
This code would work. Whether it does anything useful is another matter:
private IEnumerable<string> Recursive(string objectId)
{
var result = GetParent(objectId);
if (!string.IsNullOrEmpty(result))
{
yield return result;
foreach(var r in Recursive(result))
{
yield return r;
}
}
}
private string GetParent(string objectId)
{
return $"{objectId}/parent";
}
Getting it to work asynchronously only needs changing to IAsyncEnumerable and using await:
private async IAsyncEnumerable<string> Recursive(string objectId)
{
var result = await GetParent(objectId);
if (!string.IsNullOrEmpty(result))
{
yield return result;
await foreach(var r in Recursive(result))
{
yield return r;
}
}
}
private Task<string> GetParent(string objectId)
{
return Task.FromResult($"{objectId}/parent");
}
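Because GetParent in the snippets above always returns a non-empty string, the enumeration never ends on its own. The sketch below shows the same recursive pattern with a stopping point, assuming a hypothetical in-memory parent table (the Parents dictionary, the ParentWalker class and the Ancestors name are made up for illustration):

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParentWalker
{
    // Hypothetical parent table: child id -> parent id. The root has no entry.
    private static readonly Dictionary<string, string> Parents =
        new Dictionary<string, string>
        {
            ["leaf"] = "branch",
            ["branch"] = "root",
        };

    private static Task<string> GetParent(string objectId)
    {
        // parent stays null when objectId has no parent, which ends the recursion.
        Parents.TryGetValue(objectId, out var parent);
        return Task.FromResult(parent);
    }

    public static async IAsyncEnumerable<string> Ancestors(string objectId)
    {
        var result = await GetParent(objectId);
        if (!string.IsNullOrEmpty(result))
        {
            yield return result;
            // Re-yield every item produced by the recursive call.
            await foreach (var r in Ancestors(result))
            {
                yield return r;
            }
        }
    }
}
```

Enumerating Ancestors("leaf") with await foreach yields "branch" and then "root". Each recursion level adds one more layer of re-yielding, so for deep hierarchies the while loop mentioned in the question is likely the cheaper option.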
I have a custom "CachedEnumerable" class (inspired by Caching IEnumerable) that I need to make thread safe for my asp.net core web app.
Is the following implementation of the Enumerator thread safe? (All other reads/writes to IList _cache are locked appropriately) (Possibly related to Does the C# Yield free a lock?)
More specifically, if two threads are accessing the enumerator, how do I prevent one thread incrementing "index" from causing a second enumerating thread to get the wrong element from the _cache (i.e. the element at index + 1 instead of at index)? Is this race condition a real concern?
public IEnumerator<T> GetEnumerator()
{
var index = 0;
while (true)
{
T current;
lock (_enumeratorLock)
{
if (index >= _cache.Count && !MoveNext()) break;
current = _cache[index];
index++;
}
yield return current;
}
}
Full code of my version of CachedEnumerable:
public class CachedEnumerable<T> : IDisposable, IEnumerable<T>
{
IEnumerator<T> _enumerator;
private IList<T> _cache = new List<T>();
public bool CachingComplete { get; private set; } = false;
public CachedEnumerable(IEnumerable<T> enumerable)
{
switch (enumerable)
{
case CachedEnumerable<T> cachedEnumerable: //This case is actually dealt with by the extension method.
_cache = cachedEnumerable._cache;
CachingComplete = cachedEnumerable.CachingComplete;
_enumerator = cachedEnumerable.GetEnumerator();
break;
case IList<T> list:
//_cache = list; //without clone...
//Clone:
_cache = new T[list.Count];
list.CopyTo((T[]) _cache, 0);
CachingComplete = true;
break;
default:
_enumerator = enumerable.GetEnumerator();
break;
}
}
public CachedEnumerable(IEnumerator<T> enumerator)
{
_enumerator = enumerator;
}
private int CurCacheCount
{
get
{
lock (_enumeratorLock)
{
return _cache.Count;
}
}
}
public IEnumerator<T> GetEnumerator()
{
var index = 0;
while (true)
{
T current;
lock (_enumeratorLock)
{
if (index >= _cache.Count && !MoveNext()) break;
current = _cache[index];
index++;
}
yield return current;
}
}
//private readonly AsyncLock _enumeratorLock = new AsyncLock();
private readonly object _enumeratorLock = new object();
private bool MoveNext()
{
if (CachingComplete) return false;
if (_enumerator != null && _enumerator.MoveNext()) //The null check should have been unnecessary b/c of the lock...
{
_cache.Add(_enumerator.Current);
return true;
}
else
{
CachingComplete = true;
DisposeWrappedEnumerator(); //Release the enumerator, as it is no longer needed.
}
return false;
}
public T ElementAt(int index)
{
lock (_enumeratorLock)
{
if (index < _cache.Count)
{
return _cache[index];
}
}
EnumerateUntil(index);
lock (_enumeratorLock)
{
if (_cache.Count <= index) throw new ArgumentOutOfRangeException(nameof(index));
return _cache[index];
}
}
public bool TryGetElementAt(int index, out T value)
{
lock (_enumeratorLock)
{
value = default;
if (index < CurCacheCount)
{
value = _cache[index];
return true;
}
}
EnumerateUntil(index);
lock (_enumeratorLock)
{
if (_cache.Count <= index) return false;
value = _cache[index];
}
return true;
}
private void EnumerateUntil(int index)
{
while (true)
{
lock (_enumeratorLock)
{
if (_cache.Count > index || !MoveNext()) break;
}
}
}
public void Dispose()
{
DisposeWrappedEnumerator();
}
private void DisposeWrappedEnumerator()
{
if (_enumerator != null)
{
_enumerator.Dispose();
_enumerator = null;
if (_cache is List<T> list)
{
list.TrimExcess();
}
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
public int CachedCount
{
get
{
lock (_enumeratorLock)
{
return _cache.Count;
}
}
}
public int Count()
{
if (CachingComplete)
{
return _cache.Count;
}
EnsureCachingComplete();
return _cache.Count;
}
private void EnsureCachingComplete()
{
if (CachingComplete)
{
return;
}
//Enumerate the rest of the collection
while (!CachingComplete)
{
lock (_enumeratorLock)
{
if (!MoveNext()) break;
}
}
}
public T[] ToArray()
{
EnsureCachingComplete();
//Once Caching is complete, we don't need to lock
if (!(_cache is T[] array))
{
array = _cache.ToArray();
_cache = array;
}
return array;
}
public T this[int index] => ElementAt(index);
}
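// Extension methods must be declared in a static class (the class name here is assumed).
public static class CachedEnumerableExtensions
{
public static CachedEnumerable<T> Cached<T>(this IEnumerable<T> source)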
{
//no gain in caching a cache.
if (source is CachedEnumerable<T> cached)
{
return cached;
}
return new CachedEnumerable<T>(source);
}
}
Basic Usage: (Although not a meaningful use case)
var cached = expensiveEnumerable.Cached();
foreach (var element in cached) {
Console.WriteLine(element);
}
Update
I tested the current implementation based on @Theodor's answer https://stackoverflow.com/a/58547863/5683904 and confirmed (AFAICT) that it is thread-safe when enumerated with a foreach, without creating duplicate values (Thread-safe Cached Enumerator - lock with yield):
class Program
{
static async Task Main(string[] args)
{
var enumerable = Enumerable.Range(0, 1_000_000);
var cachedEnumerable = new CachedEnumerable<int>(enumerable);
var c = new ConcurrentDictionary<int, List<int>>();
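// Task.Run makes the enumerations actually run concurrently; Test contains no await and would otherwise complete synchronously, one call after another.
var tasks = Enumerable.Range(1, 100).Select(id => Task.Run(() => Test(id, cachedEnumerable, c)));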
Task.WaitAll(tasks.ToArray());
foreach (var keyValuePair in c)
{
var hasDuplicates = keyValuePair.Value.Distinct().Count() != keyValuePair.Value.Count;
Console.WriteLine($"Task #{keyValuePair.Key} count: {keyValuePair.Value.Count}. Has duplicates? {hasDuplicates}");
}
}
static async Task Test(int id, IEnumerable<int> cache, ConcurrentDictionary<int, List<int>> c)
{
foreach (var i in cache)
{
//await Task.Delay(10);
c.AddOrUpdate(id, v => new List<int>() {i}, (k, v) =>
{
v.Add(i);
return v;
});
}
}
}
Your class is not thread safe, because shared state is mutated in unprotected regions inside your class. The unprotected regions are:
The constructor
The Dispose method
The shared state is:
The _enumerator private field
The _cache private field
The CachingComplete public property
Some other issues regarding your class:
Implementing IDisposable makes the caller responsible for disposing your class. There is no need for IEnumerables to be disposable. IEnumerators, on the contrary, are disposable, but there is language support for their automatic disposal (a feature of the foreach statement).
Your class offers extended functionality not expected from an IEnumerable (ElementAt, Count etc). Maybe you intended to implement a CachedList instead? Without implementing the IList<T> interface, LINQ methods like Count() and ToArray() cannot take advantage of your extended functionality, and will use the slow path like they do with plain vanilla IEnumerables.
Update: I just noticed another thread-safety issue. This one is related to the public IEnumerator<T> GetEnumerator() method. The enumerator is compiler-generated, since the method is an iterator (utilizes yield return). Compiler-generated enumerators are not thread safe. Consider this code for example:
var enumerable = Enumerable.Range(0, 1_000_000);
var cachedEnumerable = new CachedEnumerable<int>(enumerable);
var enumerator = cachedEnumerable.GetEnumerator();
var tasks = Enumerable.Range(1, 4).Select(id => Task.Run(() =>
{
int count = 0;
while (enumerator.MoveNext())
{
count++;
}
Console.WriteLine($"Task #{id} count: {count}");
})).ToArray();
Task.WaitAll(tasks);
Four threads are concurrently using the same IEnumerator. The enumerable has 1,000,000 items. You might expect each thread to enumerate ~250,000 items, but that's not what happens.
Output:
Task #1 count: 0
Task #4 count: 0
Task #3 count: 0
Task #2 count: 1000000
The MoveNext in the line while (enumerator.MoveNext()) is not your safe MoveNext. It is the compiler-generated, unsafe MoveNext. Although unsafe, it includes a mechanism, probably intended for dealing with exceptions, that temporarily marks the enumerator as finished before calling the externally provided code. So when multiple threads call MoveNext concurrently, all but the first get a return value of false and terminate the enumeration instantly, having completed zero loops. To solve this you would probably have to code your own IEnumerator class.
Update: Actually my last point about thread-safe enumeration is a bit unfair, because enumerating with the IEnumerator interface is an inherently unsafe operation, which is impossible to fix without the cooperation of the calling code. This is because obtaining the next element is not an atomic operation, since it involves two steps (call MoveNext() + read Current). So your thread-safety concerns are limited to protecting the internal state of your class (the fields _enumerator and _cache, and the property CachingComplete). These are left unprotected only in the constructor and in the Dispose method, and I suppose that normal use of your class may not follow code paths that create the race conditions that would result in internal state corruption.
Personally I would prefer to take care of these code paths too, and I wouldn't leave it to chance.
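To illustrate what "cooperation of the calling code" could look like, here is a minimal sketch of a wrapper that merges the two steps (MoveNext + Current) into a single call made under one lock. SynchronizedEnumerator<T> and TryGetNext are hypothetical names, not part of the class in the question, and callers would have to use this API instead of the IEnumerator interface.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: lets several threads share one enumerator by making
// "advance and read" a single operation performed under one lock.
public sealed class SynchronizedEnumerator<T> : IDisposable
{
    private readonly object _gate = new object();
    private readonly IEnumerator<T> _inner;

    public SynchronizedEnumerator(IEnumerator<T> inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    // Returns true and the next element, or false once the source is exhausted.
    public bool TryGetNext(out T value)
    {
        lock (_gate)
        {
            if (_inner.MoveNext())
            {
                value = _inner.Current;
                return true;
            }
        }
        value = default;
        return false;
    }

    public void Dispose()
    {
        lock (_gate) _inner.Dispose();
    }
}
```

With this shape, each element is handed to exactly one of the competing threads, because no thread can observe Current between another thread's MoveNext and its read.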
Update: I wrote a cache for IAsyncEnumerables, to demonstrate an alternative technique. The enumeration of the source IAsyncEnumerable is not driven by the callers, using locks or semaphores to obtain exclusive access, but by a separate worker task. The first caller starts the worker task. Each caller first yields all items that are already cached, and then awaits more items, or a notification that there are no more items. As the notification mechanism I used a TaskCompletionSource<bool>. A lock is still used to ensure that all access to shared resources is synchronized.
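using System;
using System.Collections.Generic;
using System.Runtime.ExceptionServices; // ExceptionDispatchInfo
using System.Threading;
using System.Threading.Tasks;
public class CachedAsyncEnumerable<T> : IAsyncEnumerable<T>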
{
private readonly object _locker = new object();
private IAsyncEnumerable<T> _source;
private Task _sourceEnumerationTask;
private List<T> _buffer;
private TaskCompletionSource<bool> _moveNextTCS;
private Exception _sourceEnumerationException;
private int _sourceEnumerationVersion; // Incremented on exception
public CachedAsyncEnumerable(IAsyncEnumerable<T> source)
{
_source = source ?? throw new ArgumentNullException(nameof(source));
}
public async IAsyncEnumerator<T> GetAsyncEnumerator(
CancellationToken cancellationToken = default)
{
lock (_locker)
{
if (_sourceEnumerationTask == null)
{
_buffer = new List<T>();
_moveNextTCS = new TaskCompletionSource<bool>();
_sourceEnumerationTask = Task.Run(
() => EnumerateSourceAsync(cancellationToken));
}
}
int index = 0;
int localVersion = -1;
while (true)
{
T current = default;
Task<bool> moveNextTask = null;
lock (_locker)
{
if (localVersion == -1)
{
localVersion = _sourceEnumerationVersion;
}
else if (_sourceEnumerationVersion != localVersion)
{
ExceptionDispatchInfo
.Capture(_sourceEnumerationException).Throw();
}
if (index < _buffer.Count)
{
current = _buffer[index];
index++;
}
else
{
moveNextTask = _moveNextTCS.Task;
}
}
if (moveNextTask == null)
{
yield return current;
continue;
}
var moved = await moveNextTask;
if (!moved) yield break;
lock (_locker)
{
current = _buffer[index];
index++;
}
yield return current;
}
}
private async Task EnumerateSourceAsync(CancellationToken cancellationToken)
{
TaskCompletionSource<bool> localMoveNextTCS;
try
{
await foreach (var item in _source.WithCancellation(cancellationToken))
{
lock (_locker)
{
_buffer.Add(item);
localMoveNextTCS = _moveNextTCS;
_moveNextTCS = new TaskCompletionSource<bool>();
}
localMoveNextTCS.SetResult(true);
}
lock (_locker)
{
localMoveNextTCS = _moveNextTCS;
_buffer.TrimExcess();
_source = null;
}
localMoveNextTCS.SetResult(false);
}
catch (Exception ex)
{
lock (_locker)
{
localMoveNextTCS = _moveNextTCS;
_sourceEnumerationException = ex;
_sourceEnumerationVersion++;
_sourceEnumerationTask = null;
}
localMoveNextTCS.SetException(ex);
}
}
}
This implementation follows a specific strategy for dealing with exceptions. If an exception occurs while enumerating the source IAsyncEnumerable, the exception will be propagated to all current callers, the currently used IAsyncEnumerator will be discarded, and the incomplete cached data will be discarded too. A new worker-task may start again later, when the next enumeration request is received.
Yes, access to the cache is thread safe: only one thread at a time can read from the _cache object.
But this does not guarantee that threads get elements in the same order in which they called GetEnumerator.
Check these two examples; if this behavior is what you expect, you can use the lock in that way.
Example 1:
THREAD1 Calls GetEnumerator
THREAD1 Initialize T current;
THREAD2 Calls GetEnumerator
THREAD2 Initialize T current;
THREAD2 LOCK THREAD
THREAD1 WAIT
THREAD2 read from cache safely _cache[0]
THREAD2 index++
THREAD2 UNLOCK
THREAD1 LOCK
THREAD1 read from cache safely _cache[1]
THREAD1 index++
THREAD1 UNLOCK
THREAD2 yield return current;
THREAD1 yield return current;
Example 2:
THREAD2 Initialize T current;
THREAD2 LOCK THREAD
THREAD2 read from cache safely
THREAD2 UNLOCK
THREAD1 Initialize T current;
THREAD1 LOCK THREAD
THREAD1 read from cache safely
THREAD1 UNLOCK
THREAD1 yield return current;
THREAD2 yield return current;
How would you rewrite TaskOfTResult_MethodAsync to avoid the error: Since this is an async method, the return expression must be of type int rather than Task<int>.
private static async Task<int> TaskOfTResult_MethodAsync()
{
return Task.Run(() => ComplexCalculation());
}
private static int ComplexCalculation()
{
double x = 2;
for (int i = 1; i < 10000000; i++)
{
x += Math.Sqrt(x) / i;
}
return (int)x;
}
Simple; either don't make it async:
private static Task<int> TaskOfTResult_MethodAsync()
{
return Task.Run(() => ComplexCalculation());
}
or await the result:
private static async Task<int> TaskOfTResult_MethodAsync()
{
return await Task.Run(() => ComplexCalculation());
}
(adding the await here is more expensive in terms of the generated machinery, but has more obvious/reliable exception handling, etc)
Note: you could also probably use Task.Yield:
private static async Task<int> TaskOfTResult_MethodAsync()
{
await Task.Yield();
return ComplexCalculation();
}
(note that what this does depends a lot on the sync-context, if there is one)
Is it possible to invoke an "IEnumerable/yield return" method when using dynamic?
I'm asking this because I'm getting the error below when I call the "Test(States1.GetNames())" method.
Error: "Additional information: 'object' does not contain a definition for 'GetEnumerator'"
using System;
using System.Collections.Generic;
using System.Collections;
using System.Diagnostics;
namespace YieldDemo
{
public class States1
{
public static IEnumerable<string> GetNames()
{
yield return "Alabama";
yield return "Alaska";
yield return "Arizona";
yield return "Arkansas";
yield return "California";
yield return "Others ...";
}
}
public class States2
{
private static readonly IList<string> _names;
static States2()
{
_names = new List<string>() {"Alabama",
"Alaska",
"Arizona",
"Arkansas",
"California",
"Others ..." };
}
public static IList<string> GetNames()
{
return _names;
}
}
public class Program
{
static void Main()
{
Test(States2.GetNames());
Test(States1.GetNames());
Console.ReadLine();
}
public static void Test(dynamic state)
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
Iterate(state);
stopwatch.Stop();
Console.WriteLine("Time elapsed: {0}", stopwatch.Elapsed);
}
public static void Iterate(dynamic itemList)
{
var enumerator = itemList.GetEnumerator();
while (enumerator.MoveNext())
{
Console.WriteLine(enumerator.Current);
}
}
}
}
Thanks
The problem is that the iterator block implementation uses explicit interface implementation to implement IEnumerable<T>... and explicit interface implementation doesn't play nicely with dynamic typing in general. (You don't need to use iterator blocks to see that. See my article on Gotchas in Dynamic Typing for more details.)
You can iterate with foreach though:
public static void Iterate(dynamic itemList)
{
foreach (dynamic item in itemList)
{
Console.WriteLine(item);
}
}
This has the additional benefit that it will dispose of the iterator for you, which your previous code didn't do :)
Alternatively, you could add overloads for Iterate that take IEnumerable or IEnumerable<T>, and let execution-time overload resolution within Test do the right thing (due to state being dynamic too).
It fails because the IEnumerable<string> class generated by your yield code explicitly implements its interfaces (including the GetEnumerator you're trying to use). You can call the method like this:
public static void Iterate(dynamic itemList)
{
var enumerator = ((IEnumerable)itemList).GetEnumerator();
while (enumerator.MoveNext())
{
Console.WriteLine(enumerator.Current);
}
}
Or, since you don't need dynamic for any reason I can see here, maybe just:
public static void Iterate(IEnumerable itemList)
{
var enumerator = itemList.GetEnumerator();
while (enumerator.MoveNext())
{
Console.WriteLine(enumerator.Current);
}
}
Or
public static void Iterate<T>(IEnumerable<T> itemList)
How can I create the parallel equivalent of a do-while or similar in the Update() method below?
Another thread in the app writes to TestBuffer at random. TestBuffer.RemoveItemAndDoSomethingWithIt(); should run until the TestBuffer is empty. Currently Update() only runs with the items that were in the collection when it was enumerated, which makes sense.
internal class UnOrderedBuffer<T> where T : class
{
ConcurrentBag<T> GenericBag = new ConcurrentBag<T>();
}
internal class Tester
{
private UnOrderedBuffer<Data> TestBuffer;
public void Update()
{
Parallel.ForEach(TestBuffer, Item =>
{
TestBuffer.RemoveItemAndDoSomethingWithIt();
});
}
}
You could force a single execution by 'prepending' a null/default value:
static IEnumerable<T> YieldOneDefault<T>(this IEnumerable<T> values)
{
yield return default(T);
foreach(var item in values)
yield return item;
}
And then use it as follows:
Parallel.ForEach(TestBuffer.YieldOneDefault(), Item =>
{
if(Item != null)
TestBuffer.RemoveItemAndDoSomethingWithIt();
else
DoSomethingDuringTheFirstPass();
});
Although I suspect you might be looking for the following extension methods:
public static IEnumerable<IEnumerable<T>> GetParrallelConsumingEnumerable<T>(this IProducerConsumerCollection<T> collection)
{
T item;
while (collection.TryTake(out item))
{
yield return GetParrallelConsumingEnumerableInner(collection, item);
}
}
private static IEnumerable<T> GetParrallelConsumingEnumerableInner<T>(IProducerConsumerCollection<T> collection, T item)
{
yield return item;
while (collection.TryTake(out item))
{
yield return item;
}
}
Which will get you this result (which I think is what you are after):
Parallel.ForEach(TestBuffer.GetParrallelConsumingEnumerable(), Items =>
{
foreach(var item in Items)
{
DoSomethingWithItem(item);
}
});
for/foreach are usually used for performing a task on multiple items.
while-do/do-while are for:
a. Performing a task on multiple items
that have not yet been enumerated (e.g. a tree).
- In this case you can define a BFS or DFS enumerator and use it in a foreach.
b. Performing iterative work on a single item
- Iterative work is not suitable for parallelism
Do not attempt to refactor code from serial to parallel. Instead, consider what your assignment is and how it is best done in parallel. (Refactor algorithm, not code.)
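As a rough illustration of case (a), a breadth-first iterator can be handed to Parallel.ForEach, so that items which were not known up front are discovered during the enumeration. This is only a sketch under assumptions: the Node type, the TreeTraversal class and the summing workload are hypothetical and stand in for whatever the real per-item work is.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical tree node type, used only for this illustration.
public sealed class Node
{
    public Node(int value, params Node[] children)
    {
        Value = value;
        Children = children;
    }

    public int Value { get; }
    public IReadOnlyList<Node> Children { get; }
}

public static class TreeTraversal
{
    // Breadth-first iterator: nodes are discovered lazily while enumerating.
    public static IEnumerable<Node> BreadthFirst(Node root)
    {
        var queue = new Queue<Node>();
        queue.Enqueue(root);
        while (queue.Count > 0)
        {
            var node = queue.Dequeue();
            yield return node;
            foreach (var child in node.Children) queue.Enqueue(child);
        }
    }

    // Parallel.ForEach serializes access to the iterator itself;
    // only the per-node work runs on several threads.
    public static int SumInParallel(Node root)
    {
        int sum = 0;
        Parallel.ForEach(BreadthFirst(root), node =>
        {
            Interlocked.Add(ref sum, node.Value);
        });
        return sum;
    }
}
```

The traversal stays sequential here, which matches the point above: the part worth parallelizing is the work done per item, not the loop construct.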
public static void While(
ParallelOptions parallelOptions, Func<bool> condition,
Action<ParallelLoopState> body)
{
Parallel.ForEach(Infinite(), parallelOptions, (ignored, loopState) =>
{
if (condition()) body(loopState);
else loopState.Stop();
});
}
private static IEnumerable<bool> Infinite()
{
while (true) yield return true;
}