How can I create the parallel equivalent of a do-while or similar in the Update() method below?
Another thread in the app writes to TestBuffer at random. TestBuffer.RemoveItemAndDoSomethingWithIt(); should run until the TestBuffer is empty. Currently Update() only runs with the items that were in the collection when it was was enumerated, which makes sense.
internal class UnOrderedBuffer<T> where T : class
{
ConcurrentBag<T> GenericBag = new ConcurrentBag<T>();
}
internal class Tester
{
private UnOrderedBuffer<Data> TestBuffer;
public void Update()
{
Parallel.ForEach(TestBuffer, Item =>
{
TestBuffer.RemoveItemAndDoSomethingWithIt();
});
}
}
You could force a single execution by 'prepending' a null/default value:
static IEnumerable<T> YieldOneDefault<T>(this IEnumerable<T> values)
{
yield return default(T);
foreach(var item in values)
yield return item;
}
And then use it as follows:
Parallel.ForEach(TestBuffer.YieldOneDefault(), Item =>
{
if(Item != null)
TestBuffer.RemoveItemAndDoSomethingWithIt();
else
DoSomethingDuringTheFirstPass();
});
Although I suspect you might be looking for the following extension methods:
public static IEnumerable<IEnumerable<T>> GetParrallelConsumingEnumerable<T>(this IProducerConsumerCollection<T> collection)
{
T item;
while (collection.TryTake(out item))
{
yield return GetParrallelConsumingEnumerableInner(collection, item);
}
}
private static IEnumerable<T> GetParrallelConsumingEnumerableInner<T>(IProducerConsumerCollection<T> collection, T item)
{
yield return item;
while (collection.TryTake(out item))
{
yield return item;
}
}
Which will let get you this result (which I think is what you are after):
Parallel.ForEach(TestBuffer.GetParrallelConsumingEnumerable(), Items =>
{
foreach(var item in Items)
{
DoSomethingWithItem(item);
}
});
for/foreach are usually used for performing a task on multiple items.
while-do/do-while are for:
a. Performing a task on multiple items
that have not yet been enumerated (e.g. a tree).
- In this case you can define a BFS or DFS enumerator and use in a foreach.
b. Performing iterative work on a single item
- Iterative work is not suitable for parallelism
Do not attempt to refactor code from serial to parallel. Instead, consider what your assignment is and how it is best done in parallel. (Refactor algorithm, not code.)
public static void While(
ParallelOptions parallelOptions, Func<bool> condition,
Action<ParallelLoopState> body)
{
Parallel.ForEach(Infinite(), parallelOptions, (ignored, loopState) =>
{
if (condition()) body(loopState);
else loopState.Stop();
});
}
private static IEnumerable<bool> Infinite()
{
while (true) yield return true;
}
Related
new poster here so I hope this makes sense ...
I need to create a collection that I can remove items from in sequence (basically stock market time series data).
The data producer is multi-threaded and doesn't guarantee that the data will come in sequence.
I've looked all around for a solution but the only thing I can come up with is to create my own custom dictionary, using ConcurrentDictionary and implementing the IProducerConsumer interface so it can be used with with BlockingCollection.
The code I have below does work, but produces an error
System.InvalidOperationException: The underlying collection was
modified from outside of the BlockingCollection
when using the GetConsumingEnumerable() for loop, and the next key in the sequence is not present in the dictionary. In this instance I would like to wait for a specified amount of time
and then attempt to take the item from the queue again.
My questions is:
What's the best way to handle the error when there is no key present. At the moment it seems handling the error would require exiting the loop. Perhaps using GetConsumingEnumerable() is not the right way to consume and a while loop would work better?
Code is below - any help/ideas much appreciated.
IProducerConsumer implementation:
public abstract class BlockingDictionary<TKey, TValue> : IProducerConsumerCollection<KeyValuePair<TKey, TValue>> where TKey : notnull
{
protected ConcurrentDictionary<TKey, TValue> _dictionary = new ConcurrentDictionary<TKey, TValue>();
int ICollection.Count => _dictionary.Count;
bool ICollection.IsSynchronized => false;
object ICollection.SyncRoot => throw new NotSupportedException();
public void CopyTo(KeyValuePair<TKey, TValue>[] array, int index)
{
if (array == null)
{
throw new ArgumentNullException("array");
}
_dictionary.ToList().CopyTo(array, index);
}
void ICollection.CopyTo(Array array, int index)
{
if (array == null)
{
throw new ArgumentNullException("array");
}
((ICollection)_dictionary.ToList()).CopyTo(array, index);
}
public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator()
{
return ((IEnumerable<KeyValuePair<TKey, TValue>>)_dictionary).GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return ((IEnumerable<KeyValuePair<TKey, TValue>>)this).GetEnumerator();
}
public KeyValuePair<TKey, TValue>[] ToArray()
{
return _dictionary.ToList().ToArray();
}
bool IProducerConsumerCollection<KeyValuePair<TKey, TValue>>.TryAdd(KeyValuePair<TKey, TValue> item)
{
return _dictionary.TryAdd(item.Key, item.Value);
}
public virtual bool TryTake(out KeyValuePair<TKey, TValue> item)
{
item = this.FirstOrDefault();
TValue? value;
return _dictionary.TryRemove(item.Key, out value);
}
}
Time Sequence queue implementation (inherits above)
public class TimeSequenceQueue<T> : BlockingDictionary<DateTime, T>
{
private DateTime _previousTime;
private DateTime _nextTime;
private readonly int _intervalSeconds;
public TimeSequenceQueue(DateTime startTime, int intervalSeconds)
{
_intervalSeconds = intervalSeconds;
_previousTime = startTime;
_nextTime = startTime;
}
public override bool TryTake([MaybeNullWhen(false)] out KeyValuePair<DateTime, T> item)
{
item = _dictionary.SingleOrDefault(x => x.Key == _nextTime);
T? value = default(T);
if (item.Value == null)
return false;
bool result = _dictionary.TryRemove(item.Key, out value);
if (result)
{
_previousTime = _nextTime;
_nextTime = _nextTime.AddSeconds(_intervalSeconds);
}
return result;
}
}
Usage:
BlockingCollection<KeyValuePair<DateTime, object>> _queue = new BlockingCollection<KeyValuePair<DateTime, object>>(new TimeSequenceQueue<object>());
Consuming loop - started in new thread:
foreach (var item in _queue.GetConsumingEnumerable())
{
// feed downstream
}
When using the GetConsumingEnumerable() for loop, and the next key in the sequence is not present in the dictionary [...] I would like to wait for a specified amount of time and then attempt to take the item from the queue again.
I will try to answer this question generally, without paying too much attention to the specifics of your problem. So let's say that you are consuming
a BlockingCollection<T> like this:
foreach (var item in collection.GetConsumingEnumerable())
{
// Do something with the consumed item.
}
...and you want to avoid waiting indefinitely for an item to arrive. You want to wake up every 5 seconds and do something, before waiting/sleeping again.
Here is how you could do it:
while (!collection.IsCompleted)
{
bool consumed = collection.TryTake(out var item, TimeSpan.FromSeconds(5));
if (consumed)
{
// Do something with the consumed item.
}
else
{
// Do something before trying again to take an item.
}
}
The above pattern imitates the actual source code of the BlockingCollection<T>.GetConsumingEnumerable method.
If you want to get fancy you could incorporate this functionality in a custom extension method for the BlockingCollection<T> class, like this:
public static IEnumerable<(bool Consumed, T Item)> GetConsumingEnumerable<T>(
this BlockingCollection<T> source, TimeSpan timeout)
{
while (!source.IsCompleted)
{
bool consumed = source.TryTake(out var item, timeout);
yield return (consumed, item);
}
}
Usage example:
foreach (var (consumed, item) in collection.GetConsumingEnumerable(
TimeSpan.FromSeconds(5)))
{
// Do something depending on whether an item was consumed or not.
}
I have method Foo, which do some CPU intensive computations and returns IEnumerable<T> sequence. I need to check, if that sequence is empty. And if not, call method Bar with that sequence as argument.
I thought about three approaches...
Check, if sequence is empty with Any(). This is ok, if sequence is really empty, which will be case most of the times. But it will have horrible performance, if sequence will contains some elements and Foo will need them compute again...
Convert sequence to list, check if that list it empty... and pass it to Bar. This have also limitation. Bar will need only first x items, so Foo will be doing unnecessary work...
Check, if sequence is empty without actually reset the sequence. This sounds like win-win, but I can't find any easy build-in way, how to do it. So I create this obscure workaround and wondering, whether this is really a best approach.
Condition
var source = Foo();
if (!IsEmpty(ref source))
Bar(source);
with IsEmpty implemented as
bool IsEmpty<T>(ref IEnumerable<T> source)
{
var enumerator = source.GetEnumerator();
if (enumerator.MoveNext())
{
source = CreateIEnumerable(enumerator);
return false;
}
return true;
IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
{
yield return usedEnumerator.Current;
while (usedEnumerator.MoveNext())
{
yield return usedEnumerator.Current;
}
}
}
Also note, that calling Bar with empty sequence is not option...
EDIT:
After some consideration, best answer for my case is from Olivier Jacot-Descombes - avoid that scenario completely. Accepted solution answers this question - if it is really no other way.
I don't know whether your algorithm in Foo allows to determine if the enumeration will be empty without doing the calculations. But if this is the case, return null if the sequence would be empty:
public IEnumerable<T> Foo()
{
if (<check if sequence will be empty>) {
return null;
}
return GetSequence();
}
private IEnumerable<T> GetSequence()
{
...
yield return item;
...
}
Note that if a method uses yield return, it cannot use a simple return to return null. Therefore a second method is needed.
var sequence = Foo();
if (sequence != null) {
Bar(sequence);
}
After reading one of your comments
Foo need to initialize some resources, parse XML file and fill some HashSets, which will be used to filter (yield) returned data.
I suggest another approach. The time consuming part seems to be the initialization. To be able to separate it from the iteration, create a foo calculator class. Something like:
public class FooCalculator<T>
{
private bool _isInitialized;
private string _file;
public FooCalculator(string file)
{
_file = file;
}
private EnsureInitialized()
{
if (_isInitialized) return;
// Parse XML.
// Fill some HashSets.
_isInitialized = true;
}
public IEnumerable<T> Result
{
get {
EnsureInitialized();
...
yield return ...;
...
}
}
}
This ensures that the costly initialization stuff is executed only once. Now you can safely use Any().
Other optimizations are conceivable. The Result property could remember the position of the first returned element, so that if it is called again, it could skip to it immediately.
You would like to call some function Bar<T>(IEnumerable<T> source) if and only if the enumerable source contains at least one element, but you're running into two problems:
There is no method T Peek() in IEnumerable<T> so you would need to actually begin to evaluate the enumerable to see if it's nonempty, but...
You don't want to even partially double-evaluate the enumerable since setting up the enumerable might be expensive.
In that case your approach looks reasonable. You do, however, have some issues with your imlementation:
You need to dispose enumerator after using it.
As pointed out by Ivan Stoev in comments, if the Bar() method attempts to evaluate the IEnumerable<T> more than once (e.g. by calling Any() then foreach (...)) then the results will be undefined because usedEnumerator will have been exhausted by the first enumeration.
To resolve these issues, I'd suggest modifying your API a little and create an extension method IfNonEmpty<T>(this IEnumerable<T> source, Action<IEnumerable<T>> func) that calls a specified method only if the sequence is nonempty, as shown below:
public static partial class EnumerableExtensions
{
public static bool IfNonEmpty<T>(this IEnumerable<T> source, Action<IEnumerable<T>> func)
{
if (source == null|| func == null)
throw new ArgumentNullException();
using (var enumerator = source.GetEnumerator())
{
if (!enumerator.MoveNext())
return false;
func(new UsedEnumerator<T>(enumerator));
return true;
}
}
class UsedEnumerator<T> : IEnumerable<T>
{
IEnumerator<T> usedEnumerator;
public UsedEnumerator(IEnumerator<T> usedEnumerator)
{
if (usedEnumerator == null)
throw new ArgumentNullException();
this.usedEnumerator = usedEnumerator;
}
public IEnumerator<T> GetEnumerator()
{
var localEnumerator = System.Threading.Interlocked.Exchange(ref usedEnumerator, null);
if (localEnumerator == null)
// An attempt has been made to enumerate usedEnumerator more than once;
// throw an exception since this is not allowed.
throw new InvalidOperationException();
yield return localEnumerator.Current;
while (localEnumerator.MoveNext())
{
yield return localEnumerator.Current;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
}
Demo fiddle with unit tests here.
If you can change Bar then how about change it to TryBar that returns false when IEnumerable<T> was empty?
bool TryBar(IEnumerable<Foo> source)
{
var count = 0;
foreach (var x in source)
{
count++;
}
return count > 0;
}
If that doesn't work for you could always create your own IEnumerable<T> wrapper that caches values after they have been iterated once.
One improvement for your IsEmpty would be to check if source is ICollection<T>, and if it is, check .Count (also, dispose the enumerator):
bool IsEmpty<T>(ref IEnumerable<T> source)
{
if (source is ICollection<T> collection)
{
return collection.Count == 0;
}
var enumerator = source.GetEnumerator();
if (enumerator.MoveNext())
{
source = CreateIEnumerable(enumerator);
return false;
}
enumerator.Dispose();
return true;
IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
{
yield return usedEnumerator.Current;
while (usedEnumerator.MoveNext())
{
yield return usedEnumerator.Current;
}
usedEnumerator.Dispose();
}
}
This will work for arrays and lists.
I would, however, rework IsEmpty to return:
IEnumerable<T> NotEmpty<T>(IEnumerable<T> source)
{
if (source is ICollection<T> collection)
{
if (collection.Count == 0)
{
return null;
}
return source;
}
var enumerator = source.GetEnumerator();
if (enumerator.MoveNext())
{
return CreateIEnumerable(enumerator);
}
enumerator.Dispose();
return null;
IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
{
yield return usedEnumerator.Current;
while (usedEnumerator.MoveNext())
{
yield return usedEnumerator.Current;
}
usedEnumerator.Dispose();
}
}
Now, you would check if it returned null.
The accepted answer is probably the best approach but, based on, and I quote:
Convert sequence to list, check if that list it empty... and pass it to Bar. This have also limitation. Bar will need only first x items, so Foo will be doing unnecessary work...
Another take would be creating an IEnumerable<T> that partially caches the underlying enumeration. Something along the following lines:
interface IDisposableEnumerable<T>
:IEnumerable<T>, IDisposable
{
}
static class PartiallyCachedEnumerable
{
public static IDisposableEnumerable<T> Create<T>(
IEnumerable<T> source,
int cachedCount)
{
if (source == null)
throw new NullReferenceException(
nameof(source));
if (cachedCount < 1)
throw new ArgumentOutOfRangeException(
nameof(cachedCount));
return new partiallyCachedEnumerable<T>(
source, cachedCount);
}
private class partiallyCachedEnumerable<T>
: IDisposableEnumerable<T>
{
private readonly IEnumerator<T> enumerator;
private bool disposed;
private readonly List<T> cache;
private readonly bool hasMoreItems;
public partiallyCachedEnumerable(
IEnumerable<T> source,
int cachedCount)
{
Debug.Assert(source != null);
Debug.Assert(cachedCount > 0);
enumerator = source.GetEnumerator();
cache = new List<T>(cachedCount);
var count = 0;
while (enumerator.MoveNext() &&
count < cachedCount)
{
cache.Add(enumerator.Current);
count += 1;
}
hasMoreItems = !(count < cachedCount);
}
public void Dispose()
{
if (disposed)
return;
enumerator.Dispose();
disposed = true;
}
public IEnumerator<T> GetEnumerator()
{
foreach (var t in cache)
yield return t;
if (disposed)
yield break;
while (enumerator.MoveNext())
{
yield return enumerator.Current;
cache.Add(enumerator.Current)
}
Dispose();
}
IEnumerator IEnumerable.GetEnumerator()
=> GetEnumerator();
}
}
I got the following extension method:
static class ExtensionMethods
{
public static IEnumerable<IEnumerable<T>> Subsequencise<T>(
this IEnumerable<T> input,
int subsequenceLength)
{
var enumerator = input.GetEnumerator();
SubsequenciseParameter parameter = new SubsequenciseParameter
{
Next = enumerator.MoveNext()
};
while (parameter.Next)
yield return getSubSequence(
enumerator,
subsequenceLength,
parameter);
}
private static IEnumerable<T> getSubSequence<T>(
IEnumerator<T> enumerator,
int subsequenceLength,
SubsequenciseParameter parameter)
{
do
{
lock (enumerator) // this lock makes it "work"
{ // removing this causes exceptions.
if (parameter.Next)
yield return enumerator.Current;
}
} while ((parameter.Next = enumerator.MoveNext())
&& --subsequenceLength > 0);
}
// Needed since you cant use out or ref in yield-return methods...
class SubsequenciseParameter
{
public bool Next { get; set; }
}
}
Its purpose is to split a sequence into subsequences of a given size.
Calling it like this:
foreach (var sub in "abcdefghijklmnopqrstuvwxyz"
.Subsequencise(3)
.**AsParallel**()
.Select(sub =>new String(sub.ToArray()))
{
Console.WriteLine(sub);
}
Console.ReadKey();
works, however there are some empty lines in-between since some of the threads are "too late" and enter the first yield return.
I tried putting more locks everywhere, however I cannot achieve to make this work correct in combination with as parallel.
It's obvious that this example doesn't justify the use of as parallel at all. It is just to demonstrate how the method could be called.
The problem is that using iterators is lazy evaluated, so you return a lazily evaluated iterator which gets used from multiple threads.
You can fix this by rewriting your method as follows:
public static IEnumerable<IEnumerable<T>> Subsequencise<T>(this IEnumerable<T> input, int subsequenceLength)
{
var syncObj = new object();
var enumerator = input.GetEnumerator();
if (!enumerator.MoveNext())
{
yield break;
}
List<T> currentList = new List<T> { enumerator.Current };
int length = 1;
while (enumerator.MoveNext())
{
if (length == subsequenceLength)
{
length = 0;
yield return currentList;
currentList = new List<T>();
}
currentList.Add(enumerator.Current);
++length;
}
yield return currentList;
}
This performs the same function, but doesn't use an iterator to implement the "nested" IEnumerable<T>, avoiding the problem. Note that this also avoids the locking as well as the custom SubsequenciseParameter type.
Is there any prior work of adding tasks to the TPL runtime with a varying priority?
If not, generally speaking, how would I implement this?
Ideally I plan on using the producer-consumer pattern to add "todo" work to the TPL. There may be times where I discover that a low priority job needs to be upgraded to a high priority job (relative to the others).
If anyone has some search keywords I should use when searching for this, please mention them, since I haven't yet found code that will do what I need.
So here is a rather naive concurrent implementation around a rather naive priority queue. The idea here is that there is a sorted set that holds onto pairs of both the real item and a priority, but is given a comparer that just compares the priority. The constructor takes a function that computes the priority for a given object.
As for actual implementation, they're not efficiently implemented, I just lock around everything. Creating more efficient implementations would prevent the use of SortedSet as a priority queue, and re-implementing one of those that can be effectively accessed concurrently is not going to be that easy.
In order to change the priority of an item you'll need to remove the item from the set and then add it again, and to find it without iterating the whole set you'd need to know the old priority as well as the new priority.
public class ConcurrentPriorityQueue<T> : IProducerConsumerCollection<T>
{
private object key = new object();
private SortedSet<Tuple<T, int>> set;
private Func<T, int> prioritySelector;
public ConcurrentPriorityQueue(Func<T, int> prioritySelector, IComparer<T> comparer = null)
{
this.prioritySelector = prioritySelector;
set = new SortedSet<Tuple<T, int>>(
new MyComparer<T>(comparer ?? Comparer<T>.Default));
}
private class MyComparer<T> : IComparer<Tuple<T, int>>
{
private IComparer<T> comparer;
public MyComparer(IComparer<T> comparer)
{
this.comparer = comparer;
}
public int Compare(Tuple<T, int> first, Tuple<T, int> second)
{
var returnValue = first.Item2.CompareTo(second.Item2);
if (returnValue == 0)
returnValue = comparer.Compare(first.Item1, second.Item1);
return returnValue;
}
}
public bool TryAdd(T item)
{
lock (key)
{
return set.Add(Tuple.Create(item, prioritySelector(item)));
}
}
public bool TryTake(out T item)
{
lock (key)
{
if (set.Count > 0)
{
var first = set.First();
item = first.Item1;
return set.Remove(first);
}
else
{
item = default(T);
return false;
}
}
}
public bool ChangePriority(T item, int oldPriority, int newPriority)
{
lock (key)
{
if (set.Remove(Tuple.Create(item, oldPriority)))
{
return set.Add(Tuple.Create(item, newPriority));
}
else
return false;
}
}
public bool ChangePriority(T item)
{
lock (key)
{
var result = set.FirstOrDefault(pair => object.Equals(pair.Item1, item));
if (object.Equals(result.Item1, item))
{
return ChangePriority(item, result.Item2, prioritySelector(item));
}
else
{
return false;
}
}
}
public void CopyTo(T[] array, int index)
{
lock (key)
{
foreach (var item in set.Select(pair => pair.Item1))
{
array[index++] = item;
}
}
}
public T[] ToArray()
{
lock (key)
{
return set.Select(pair => pair.Item1).ToArray();
}
}
public IEnumerator<T> GetEnumerator()
{
return ToArray().AsEnumerable().GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
public void CopyTo(Array array, int index)
{
lock (key)
{
foreach (var item in set.Select(pair => pair.Item1))
{
array.SetValue(item, index++);
}
}
}
public int Count
{
get { lock (key) { return set.Count; } }
}
public bool IsSynchronized
{
get { return true; }
}
public object SyncRoot
{
get { return key; }
}
}
Once you have an IProducerConsumerCollection<T> instance, which the above object is, you can use it as the internal backing object of a BlockingCollection<T> in order to have an easier to use user interface.
ParallelExtensionsExtras contains several custom TaskSchedulers that could be helpful either directly or as a base for your own scheduler.
Specifically, there are two schedulers that may be interesting for you:
QueuedTaskScheduler, which allows you to schedule Tasks at different priorities, but doesn't allow changing the priority of enqueued Tasks.
ReprioritizableTaskScheduler, which doesn't have different priorities, but allows you to move a specific Task to the front or to the back of the queue. (Though changing priority is O(n) in the number of currently waiting Tasks, which could be a problem if you had many Tasks at the same time.)
Is there a built-in way to convert IEnumerator<T> to IEnumerable<T>?
The easiest way of converting I can think of is via the yield statement
public static IEnumerable<T> ToIEnumerable<T>(this IEnumerator<T> enumerator) {
while ( enumerator.MoveNext() ) {
yield return enumerator.Current;
}
}
compared to the list version this has the advantage of not enumerating the entire list before returning an IEnumerable. using the yield statement you'd only iterate over the items you need, whereas using the list version, you'd first iterate over all items in the list and then all the items you need.
for a little more fun you could change it to
public static IEnumerable<K> Select<K,T>(this IEnumerator<T> e,
Func<K,T> selector) {
while ( e.MoveNext() ) {
yield return selector(e.Current);
}
}
you'd then be able to use linq on your enumerator like:
IEnumerator<T> enumerator;
var someList = from item in enumerator
select new classThatTakesTInConstructor(item);
You could use the following which will kinda work.
public class FakeEnumerable<T> : IEnumerable<T> {
private IEnumerator<T> m_enumerator;
public FakeEnumerable(IEnumerator<T> e) {
m_enumerator = e;
}
public IEnumerator<T> GetEnumerator() {
return m_enumerator;
}
// Rest omitted
}
This will get you into trouble though when people expect successive calls to GetEnumerator to return different enumerators vs. the same one. But if it's a one time only use in a very constrained scenario, this could unblock you.
I do suggest though you try and not do this because I think eventually it will come back to haunt you.
A safer option is along the lines Jonathan suggested. You can expend the enumerator and create a List<T> of the remaining items.
public static List<T> SaveRest<T>(this IEnumerator<T> e) {
var list = new List<T>();
while ( e.MoveNext() ) {
list.Add(e.Current);
}
return list;
}
EnumeratorEnumerable<T>
A threadsafe, resettable adaptor from IEnumerator<T> to IEnumerable<T>
I use Enumerator parameters like in C++ forward_iterator concept.
I agree that this can lead to confusion as too many people will indeed assume Enumerators are /like/ Enumerables, but they are not.
However, the confusion is fed by the fact that IEnumerator contains the Reset method. Here is my idea of the most correct implementation. It leverages the implementation of IEnumerator.Reset()
A major difference between an Enumerable and and Enumerator is, that an Enumerable might be able to create several Enumerators simultaneously. This implementation puts a whole lot of work into making sure that this never happens for the EnumeratorEnumerable<T> type. There are two EnumeratorEnumerableModes:
Blocking (meaning that a second caller will simply wait till the first enumeration is completed)
NonBlocking (meaning that a second (concurrent) request for an enumerator simply throws an exception)
Note 1: 74 lines are implementation, 79 lines are testing code :)
Note 2: I didn't refer to any unit testing framework for SO convenience
using System;
using System.Diagnostics;
using System.Linq;
using System.Collections;
using System.Collections.Generic;
using System.Threading;
namespace EnumeratorTests
{
public enum EnumeratorEnumerableMode
{
NonBlocking,
Blocking,
}
public sealed class EnumeratorEnumerable<T> : IEnumerable<T>
{
#region LockingEnumWrapper
public sealed class LockingEnumWrapper : IEnumerator<T>
{
private static readonly HashSet<IEnumerator<T>> BusyTable = new HashSet<IEnumerator<T>>();
private readonly IEnumerator<T> _wrap;
internal LockingEnumWrapper(IEnumerator<T> wrap, EnumeratorEnumerableMode allowBlocking)
{
_wrap = wrap;
if (allowBlocking == EnumeratorEnumerableMode.Blocking)
Monitor.Enter(_wrap);
else if (!Monitor.TryEnter(_wrap))
throw new InvalidOperationException("Thread conflict accessing busy Enumerator") {Source = "LockingEnumWrapper"};
lock (BusyTable)
{
if (BusyTable.Contains(_wrap))
throw new LockRecursionException("Self lock (deadlock) conflict accessing busy Enumerator") { Source = "LockingEnumWrapper" };
BusyTable.Add(_wrap);
}
// always implicit Reset
_wrap.Reset();
}
#region Implementation of IDisposable and IEnumerator
public void Dispose()
{
lock (BusyTable)
BusyTable.Remove(_wrap);
Monitor.Exit(_wrap);
}
public bool MoveNext() { return _wrap.MoveNext(); }
public void Reset() { _wrap.Reset(); }
public T Current { get { return _wrap.Current; } }
object IEnumerator.Current { get { return Current; } }
#endregion
}
#endregion
private readonly IEnumerator<T> _enumerator;
private readonly EnumeratorEnumerableMode _allowBlocking;
public EnumeratorEnumerable(IEnumerator<T> e, EnumeratorEnumerableMode allowBlocking)
{
_enumerator = e;
_allowBlocking = allowBlocking;
}
private LockRecursionPolicy a;
public IEnumerator<T> GetEnumerator()
{
return new LockingEnumWrapper(_enumerator, _allowBlocking);
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
class TestClass
{
private static readonly string World = "hello world\n";
public static void Main(string[] args)
{
var master = World.GetEnumerator();
var nonblocking = new EnumeratorEnumerable<char>(master, EnumeratorEnumerableMode.NonBlocking);
var blocking = new EnumeratorEnumerable<char>(master, EnumeratorEnumerableMode.Blocking);
foreach (var c in nonblocking) Console.Write(c); // OK (implicit Reset())
foreach (var c in blocking) Console.Write(c); // OK (implicit Reset())
foreach (var c in nonblocking) Console.Write(c); // OK (implicit Reset())
foreach (var c in blocking) Console.Write(c); // OK (implicit Reset())
try
{
var willRaiseException = from c1 in nonblocking from c2 in nonblocking select new {c1, c2};
Console.WriteLine("Cartesian product: {0}", willRaiseException.Count()); // RAISE
}
catch (Exception e) { Console.WriteLine(e); }
foreach (var c in nonblocking) Console.Write(c); // OK (implicit Reset())
foreach (var c in blocking) Console.Write(c); // OK (implicit Reset())
try
{
var willSelfLock = from c1 in blocking from c2 in blocking select new { c1, c2 };
Console.WriteLine("Cartesian product: {0}", willSelfLock.Count()); // LOCK
}
catch (Exception e) { Console.WriteLine(e); }
// should not externally throw (exceptions on other threads reported to console)
if (ThreadConflictCombinations(blocking, nonblocking))
throw new InvalidOperationException("Should have thrown an exception on background thread");
if (ThreadConflictCombinations(nonblocking, nonblocking))
throw new InvalidOperationException("Should have thrown an exception on background thread");
if (ThreadConflictCombinations(nonblocking, blocking))
Console.WriteLine("Background thread timed out");
if (ThreadConflictCombinations(blocking, blocking))
Console.WriteLine("Background thread timed out");
Debug.Assert(true); // Must be reached
}
private static bool ThreadConflictCombinations(IEnumerable<char> main, IEnumerable<char> other)
{
try
{
using (main.GetEnumerator())
{
var bg = new Thread(o =>
{
try { other.GetEnumerator(); }
catch (Exception e) { Report(e); }
}) { Name = "background" };
bg.Start();
bool timedOut = !bg.Join(1000); // observe the thread waiting a full second for a lock (or throw the exception for nonblocking)
if (timedOut)
bg.Abort();
return timedOut;
}
} catch
{
throw new InvalidProgramException("Cannot be reached");
}
}
static private readonly object ConsoleSynch = new Object();
private static void Report(Exception e)
{
lock (ConsoleSynch)
Console.WriteLine("Thread:{0}\tException:{1}", Thread.CurrentThread.Name, e);
}
}
}
Note 3: I think the implementation of the thread locking (especially around BusyTable) is quite ugly; However, I didn't want to resort to ReaderWriterLock(LockRecursionPolicy.NoRecursion) and didn't want to assume .Net 4.0 for SpinLock
Solution with use of Factory along with fixing cached IEnumerator issue in JaredPar's answer allows to change the way of enumeration.
Consider a simple example: we want custom List<T> wrapper that allow to enumerate in reverse order along with default enumeration. List<T> already implements IEnumerator for default enumeration, we only need to create IEnumerator that enumerates in reverse order. (We won't use List<T>.AsEnumerable().Reverse() because it enumerates the list twice)
public enum EnumerationType {
Default = 0,
Reverse
}
public class CustomList<T> : IEnumerable<T> {
private readonly List<T> list;
public CustomList(IEnumerable<T> list) => this.list = new List<T>(list);
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
//Default IEnumerable method will return default enumerator factory
public IEnumerator<T> GetEnumerator()
=> GetEnumerable(EnumerationType.Default).GetEnumerator();
public IEnumerable<T> GetEnumerable(EnumerationType enumerationType)
=> enumerationType switch {
EnumerationType.Default => new DefaultEnumeratorFactory(list),
EnumerationType.Reverse => new ReverseEnumeratorFactory(list)
};
//Simple implementation of reverse list enumerator
private class ReverseEnumerator : IEnumerator<T> {
private readonly List<T> list;
private int index;
internal ReverseEnumerator(List<T> list) {
this.list = list;
index = list.Count-1;
Current = default;
}
public void Dispose() { }
public bool MoveNext() {
if(index >= 0) {
Current = list[index];
index--;
return true;
}
Current = default;
return false;
}
public T Current { get; private set; }
object IEnumerator.Current => Current;
void IEnumerator.Reset() {
index = list.Count - 1;
Current = default;
}
}
private abstract class EnumeratorFactory : IEnumerable<T> {
protected readonly List<T> List;
protected EnumeratorFactory(List<T> list) => List = list;
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
public abstract IEnumerator<T> GetEnumerator();
}
private class DefaultEnumeratorFactory : EnumeratorFactory {
public DefaultEnumeratorFactory(List<T> list) : base(list) { }
//Default enumerator is already implemented in List<T>
public override IEnumerator<T> GetEnumerator() => List.GetEnumerator();
}
private class ReverseEnumeratorFactory : EnumeratorFactory {
public ReverseEnumeratorFactory(List<T> list) : base(list) { }
public override IEnumerator<T> GetEnumerator() => new ReverseEnumerator(List);
}
}
As Jason Watts said -- no, not directly.
If you really want to, you could loop through the IEnumerator<T>, putting the items into a List<T>, and return that, but I'm guessing that's not what you're looking to do.
The basic reason you can't go that direction (IEnumerator<T> to a IEnumerable<T>) is that IEnumerable<T> represents a set that can be enumerated, but IEnumerator<T> is a specific enumeratation over a set of items -- you can't turn the specific instance back into the thing that created it.
static class Helper
{
public static List<T> SaveRest<T>(this IEnumerator<T> enumerator)
{
var list = new List<T>();
while (enumerator.MoveNext())
{
list.Add(enumerator.Current);
}
return list;
}
public static ArrayList SaveRest(this IEnumerator enumerator)
{
var list = new ArrayList();
while (enumerator.MoveNext())
{
list.Add(enumerator.Current);
}
return list;
}
}
Nope, IEnumerator<> and IEnumerable<> are different beasts entirely.
This is a variant I have written... The specific is a little different. I wanted to do a MoveNext() on an IEnumerable<T>, check the result, and then roll everything in a new IEnumerator<T> that was "complete" (so that included even the element of the IEnumerable<T> I had already extracted)
// Simple IEnumerable<T> that "uses" an IEnumerator<T> that has
// already received a MoveNext(). "eats" the first MoveNext()
// received, then continues normally. For shortness, both IEnumerable<T>
// and IEnumerator<T> are implemented by the same class. Note that if a
// second call to GetEnumerator() is done, the "real" IEnumerator<T> will
// be returned, not this proxy implementation.
public class EnumerableFromStartedEnumerator<T> : IEnumerable<T>, IEnumerator<T>
{
public readonly IEnumerator<T> Enumerator;
public readonly IEnumerable<T> Enumerable;
// Received by creator. Return value of MoveNext() done by caller
protected bool FirstMoveNextSuccessful { get; set; }
// The Enumerator can be "used" only once, then a new enumerator
// can be requested by Enumerable.GetEnumerator()
// (default = false)
protected bool Used { get; set; }
// The first MoveNext() has been already done (default = false)
protected bool DoneMoveNext { get; set; }
public EnumerableFromStartedEnumerator(IEnumerator<T> enumerator, bool firstMoveNextSuccessful, IEnumerable<T> enumerable)
{
Enumerator = enumerator;
FirstMoveNextSuccessful = firstMoveNextSuccessful;
Enumerable = enumerable;
}
public IEnumerator<T> GetEnumerator()
{
if (Used)
{
return Enumerable.GetEnumerator();
}
Used = true;
return this;
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
public T Current
{
get
{
// There are various school of though on what should
// happens if called before the first MoveNext() or
// after a MoveNext() returns false. We follow the
// "return default(TInner)" school of thought for the
// before first MoveNext() and the "whatever the
// Enumerator wants" for the after a MoveNext() returns
// false
if (!DoneMoveNext)
{
return default(T);
}
return Enumerator.Current;
}
}
public void Dispose()
{
Enumerator.Dispose();
}
object IEnumerator.Current
{
get
{
return Current;
}
}
public bool MoveNext()
{
if (!DoneMoveNext)
{
DoneMoveNext = true;
return FirstMoveNextSuccessful;
}
return Enumerator.MoveNext();
}
public void Reset()
{
// This will 99% throw :-) Not our problem.
Enumerator.Reset();
// So it is improbable we will arrive here
DoneMoveNext = true;
}
}
Use:
var enumerable = someCollection<T>;
var enumerator = enumerable.GetEnumerator();
bool res = enumerator.MoveNext();
// do whatever you want with res/enumerator.Current
var enumerable2 = new EnumerableFromStartedEnumerator<T>(enumerator, res, enumerable);
Now, the first GetEnumerator() that will be requested to enumerable2 will be given through the enumerator enumerator. From the second onward the enumerable.GetEnumerator() will be used.
The other answers here are ... strange. IEnumerable<T> has just one method, GetEnumerator(). And an IEnumerable<T> must implement IEnumerable, which also has just one method, GetEnumerator() (the difference being that one is generic on T and the other is not). So it should be clear how to turn an IEnumerator<T> into an IEnumerable<T>:
// using modern expression-body syntax
public class IEnumeratorToIEnumerable<T> : IEnumerable<T>
{
private readonly IEnumerator<T> Enumerator;
public IEnumeratorToIEnumerable(IEnumerator<T> enumerator) =>
Enumerator = enumerator;
public IEnumerator<T> GetEnumerator() => Enumerator;
IEnumerator IEnumerable.GetEnumerator() => Enumerator;
}
foreach (var foo in new IEnumeratorToIEnumerable<Foo>(fooEnumerator))
DoSomethingWith(foo);
// and you can also do:
var fooEnumerable = new IEnumeratorToIEnumerable<Foo>(fooEnumerator);
foreach (var foo in fooEnumerable)
DoSomethingWith(foo);
// Some IEnumerators automatically repeat after MoveNext() returns false,
// in which case this is a no-op, but generally it's required.
fooEnumerator.Reset();
foreach (var foo in fooEnumerable)
DoSomethingElseWith(foo);
However, none of this should be needed because it's unusual to have an IEnumerator<T> that doesn't come with an IEnumerable<T> that returns an instance of it from its GetEnumerator method. If you're writing your own IEnumerator<T>, you should certainly provide the IEnumerable<T>. And really it's the other way around ... an IEnumerator<T> is intended to be a private class that iterates over instances of a public class that implements IEnumerable<T>.