Writing custom IEnumerator<T> with iterators - c#

How can I write a custom IEnumerator<T> implementation which needs to maintain some state and still get to use iterator blocks to simplify it? The best I can come up with is something like this:
public class MyEnumerator<T> : IEnumerator<T> {
private IEnumerator<T> _enumerator;
public int Position {get; private set;} // or some other custom properties
public MyEnumerator() {
Position = 0;
_enumerator = MakeEnumerator();
}
private IEnumerator<T> MakeEnumerator() {
// yield return something depending on Position
}
public bool MoveNext() {
bool res = _enumerator.MoveNext();
if (res) Position++;
return res;
}
// delegate Reset and Current to _enumerator as well
}
public class MyCollection<T> : IEnumerable<T> {
IEnumerator<T> IEnumerable<T>.GetEnumerator() {
return GetEnumerator();
}
public MyEnumerator<T> GetEnumerator() {
return new MyEnumerator<T>();
}
...
}

Why do you want to write an iterator class? The whole point of an iterator block is so you don't have to...
i.e.
public IEnumerator<T> GetEnumerator() {
int position = 0; // state
while(whatever) {
position++;
yield return ...something...;
}
}
If you add more context (i,e, why the above can't work), we can probably help more.
But if possible, avoid writing an iterator class. They are lots of work, and easy to get wrong.
By the way, you don't really have to bother with Reset - it is largely deprecated, and shouldn't really ever be used (since it can't be relied to work for an arbitrary enumerator).
If you want to consume an inner iterator, that is fine too:
int position = 0;
foreach(var item in source) {
position++;
yield return position;
}
or if you only have an enumerator:
while(iter.MoveNext()) {
position++;
yield return iter.Current;
}
You might also consider adding the state (as a tuple) to the thing you yield:
class MyState<T> {
public int Position {get;private set;}
public T Current {get;private set;}
public MyState(int position, T current) {...} // assign
}
...
yield return new MyState<Foo>(position, item);
Finally, you could use a LINQ-style extension/delegate approach, with an Action<int,T> to supply the position and value to the caller:
static void Main() {
var values = new[] { "a", "b", "c" };
values.ForEach((pos, s) => Console.WriteLine("{0}: {1}", pos, s));
}
static void ForEach<T>(
this IEnumerable<T> source,
Action<int, T> action) {
if (source == null) throw new ArgumentNullException("source");
if (action == null) throw new ArgumentNullException("action");
int position = 0;
foreach (T item in source) {
action(position++, item);
}
}
Outputs:
0: a
1: b
2: c

I'd have to concur with Marc here. Either write an enumerator class completely yourself if you really want to (just because you can?) or simply use an interator block and yield statements and be done with it. Personally, I'm never touching enumerator classes again. ;-)

#Marc Gravell
But if possible, avoid writing an iterator class. They are lots of work, and easy to get wrong.
That's precisely why I want to use yield machinery inside my iterator, to do the heavy lifting.
You might also consider adding the state (as a tuple) to the thing you yield:
Yes, that works. However, it's an extra allocation at each and every step. If I am interested only in T at most steps, that's an overhead I don't need if I can avoid it.
However, your last suggestion gave me an idea:
public IEnumerator<T> GetEnumerator(Action<T, int> action) {
int position = 0; // state
while(whatever) {
position++;
var t = ...something...;
action(t, position);
yield return t;
}
}
public IEnumerator<T> GetEnumerator() {
return GetEnumerator(DoNothing<T, int>());
}

I made a pretty simple Iterator that borrows the default Enumerator to do most (really all) of the work. The constructor takes an IEnumerator<T> and my implementation simply hands it the work. I've added an Index field to my custom Iterator.
I made a simple example here: https://dotnetfiddle.net/0iGmVz
To use this Iterator setup you would have something like this in your custom Collection/List class:
public class MyList<T> : List<T>{
public new IEnumerator<T> GetEnumerator(){
return new IndexedEnumerator<T>(base.GetEnumerator());
}
}
Now foreach and other built-in functions will get your custom enumerator, and any behavior you don't want to override will use the normal implementation.
public static class Helpers{
//Extension method to get the IndexEnumerator
public static IndexedEnumerator<T> GetIndexedEnumerator<T>(this IEnumerable<T> list){
return new IndexedEnumerator<T>(list.GetEnumerator());
}
}
//base Enumerator methods/implementation
public class BaseEnumerator<T> : IEnumerator<T>{
public BaseEnumerator(IEnumerator<T> enumer){
enumerator = enumer;
}
protected virtual IEnumerator<T> enumerator{get;set;}
protected virtual T current {get;set;}
public virtual bool MoveNext(){
return enumerator.MoveNext();
}
public virtual IEnumerator<T> GetEnumerator(){
return enumerator;
}
public virtual T Current {get{return enumerator.Current;}}
object IEnumerator.Current {get{return enumerator.Current;}}
public virtual void Reset(){}
public virtual void Dispose(){}
}
public class IndexedEnumerator<T> : BaseEnumerator<T>
{
public IndexedEnumerator(IEnumerator<T> enumer):base(enumer){}
public int Index {get; private set;}
public override bool MoveNext(){
Index++;
return enumerator.MoveNext();
}
}

Related

Calling method with IEnumerable<T> sequence as argument, if that sequence is not empty

I have method Foo, which do some CPU intensive computations and returns IEnumerable<T> sequence. I need to check, if that sequence is empty. And if not, call method Bar with that sequence as argument.
I thought about three approaches...
Check, if sequence is empty with Any(). This is ok, if sequence is really empty, which will be case most of the times. But it will have horrible performance, if sequence will contains some elements and Foo will need them compute again...
Convert sequence to list, check if that list it empty... and pass it to Bar. This have also limitation. Bar will need only first x items, so Foo will be doing unnecessary work...
Check, if sequence is empty without actually reset the sequence. This sounds like win-win, but I can't find any easy build-in way, how to do it. So I create this obscure workaround and wondering, whether this is really a best approach.
Condition
var source = Foo();
if (!IsEmpty(ref source))
Bar(source);
with IsEmpty implemented as
bool IsEmpty<T>(ref IEnumerable<T> source)
{
var enumerator = source.GetEnumerator();
if (enumerator.MoveNext())
{
source = CreateIEnumerable(enumerator);
return false;
}
return true;
IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
{
yield return usedEnumerator.Current;
while (usedEnumerator.MoveNext())
{
yield return usedEnumerator.Current;
}
}
}
Also note, that calling Bar with empty sequence is not option...
EDIT:
After some consideration, best answer for my case is from Olivier Jacot-Descombes - avoid that scenario completely. Accepted solution answers this question - if it is really no other way.
I don't know whether your algorithm in Foo allows to determine if the enumeration will be empty without doing the calculations. But if this is the case, return null if the sequence would be empty:
public IEnumerable<T> Foo()
{
if (<check if sequence will be empty>) {
return null;
}
return GetSequence();
}
private IEnumerable<T> GetSequence()
{
...
yield return item;
...
}
Note that if a method uses yield return, it cannot use a simple return to return null. Therefore a second method is needed.
var sequence = Foo();
if (sequence != null) {
Bar(sequence);
}
After reading one of your comments
Foo need to initialize some resources, parse XML file and fill some HashSets, which will be used to filter (yield) returned data.
I suggest another approach. The time consuming part seems to be the initialization. To be able to separate it from the iteration, create a foo calculator class. Something like:
public class FooCalculator<T>
{
private bool _isInitialized;
private string _file;
public FooCalculator(string file)
{
_file = file;
}
private EnsureInitialized()
{
if (_isInitialized) return;
// Parse XML.
// Fill some HashSets.
_isInitialized = true;
}
public IEnumerable<T> Result
{
get {
EnsureInitialized();
...
yield return ...;
...
}
}
}
This ensures that the costly initialization stuff is executed only once. Now you can safely use Any().
Other optimizations are conceivable. The Result property could remember the position of the first returned element, so that if it is called again, it could skip to it immediately.
You would like to call some function Bar<T>(IEnumerable<T> source) if and only if the enumerable source contains at least one element, but you're running into two problems:
There is no method T Peek() in IEnumerable<T> so you would need to actually begin to evaluate the enumerable to see if it's nonempty, but...
You don't want to even partially double-evaluate the enumerable since setting up the enumerable might be expensive.
In that case your approach looks reasonable. You do, however, have some issues with your imlementation:
You need to dispose enumerator after using it.
As pointed out by Ivan Stoev in comments, if the Bar() method attempts to evaluate the IEnumerable<T> more than once (e.g. by calling Any() then foreach (...)) then the results will be undefined because usedEnumerator will have been exhausted by the first enumeration.
To resolve these issues, I'd suggest modifying your API a little and create an extension method IfNonEmpty<T>(this IEnumerable<T> source, Action<IEnumerable<T>> func) that calls a specified method only if the sequence is nonempty, as shown below:
public static partial class EnumerableExtensions
{
public static bool IfNonEmpty<T>(this IEnumerable<T> source, Action<IEnumerable<T>> func)
{
if (source == null|| func == null)
throw new ArgumentNullException();
using (var enumerator = source.GetEnumerator())
{
if (!enumerator.MoveNext())
return false;
func(new UsedEnumerator<T>(enumerator));
return true;
}
}
class UsedEnumerator<T> : IEnumerable<T>
{
IEnumerator<T> usedEnumerator;
public UsedEnumerator(IEnumerator<T> usedEnumerator)
{
if (usedEnumerator == null)
throw new ArgumentNullException();
this.usedEnumerator = usedEnumerator;
}
public IEnumerator<T> GetEnumerator()
{
var localEnumerator = System.Threading.Interlocked.Exchange(ref usedEnumerator, null);
if (localEnumerator == null)
// An attempt has been made to enumerate usedEnumerator more than once;
// throw an exception since this is not allowed.
throw new InvalidOperationException();
yield return localEnumerator.Current;
while (localEnumerator.MoveNext())
{
yield return localEnumerator.Current;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
}
Demo fiddle with unit tests here.
If you can change Bar then how about change it to TryBar that returns false when IEnumerable<T> was empty?
bool TryBar(IEnumerable<Foo> source)
{
var count = 0;
foreach (var x in source)
{
count++;
}
return count > 0;
}
If that doesn't work for you could always create your own IEnumerable<T> wrapper that caches values after they have been iterated once.
One improvement for your IsEmpty would be to check if source is ICollection<T>, and if it is, check .Count (also, dispose the enumerator):
bool IsEmpty<T>(ref IEnumerable<T> source)
{
if (source is ICollection<T> collection)
{
return collection.Count == 0;
}
var enumerator = source.GetEnumerator();
if (enumerator.MoveNext())
{
source = CreateIEnumerable(enumerator);
return false;
}
enumerator.Dispose();
return true;
IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
{
yield return usedEnumerator.Current;
while (usedEnumerator.MoveNext())
{
yield return usedEnumerator.Current;
}
usedEnumerator.Dispose();
}
}
This will work for arrays and lists.
I would, however, rework IsEmpty to return:
IEnumerable<T> NotEmpty<T>(IEnumerable<T> source)
{
if (source is ICollection<T> collection)
{
if (collection.Count == 0)
{
return null;
}
return source;
}
var enumerator = source.GetEnumerator();
if (enumerator.MoveNext())
{
return CreateIEnumerable(enumerator);
}
enumerator.Dispose();
return null;
IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
{
yield return usedEnumerator.Current;
while (usedEnumerator.MoveNext())
{
yield return usedEnumerator.Current;
}
usedEnumerator.Dispose();
}
}
Now, you would check if it returned null.
The accepted answer is probably the best approach but, based on, and I quote:
Convert sequence to list, check if that list it empty... and pass it to Bar. This have also limitation. Bar will need only first x items, so Foo will be doing unnecessary work...
Another take would be creating an IEnumerable<T> that partially caches the underlying enumeration. Something along the following lines:
interface IDisposableEnumerable<T>
:IEnumerable<T>, IDisposable
{
}
static class PartiallyCachedEnumerable
{
public static IDisposableEnumerable<T> Create<T>(
IEnumerable<T> source,
int cachedCount)
{
if (source == null)
throw new NullReferenceException(
nameof(source));
if (cachedCount < 1)
throw new ArgumentOutOfRangeException(
nameof(cachedCount));
return new partiallyCachedEnumerable<T>(
source, cachedCount);
}
private class partiallyCachedEnumerable<T>
: IDisposableEnumerable<T>
{
private readonly IEnumerator<T> enumerator;
private bool disposed;
private readonly List<T> cache;
private readonly bool hasMoreItems;
public partiallyCachedEnumerable(
IEnumerable<T> source,
int cachedCount)
{
Debug.Assert(source != null);
Debug.Assert(cachedCount > 0);
enumerator = source.GetEnumerator();
cache = new List<T>(cachedCount);
var count = 0;
while (enumerator.MoveNext() &&
count < cachedCount)
{
cache.Add(enumerator.Current);
count += 1;
}
hasMoreItems = !(count < cachedCount);
}
public void Dispose()
{
if (disposed)
return;
enumerator.Dispose();
disposed = true;
}
public IEnumerator<T> GetEnumerator()
{
foreach (var t in cache)
yield return t;
if (disposed)
yield break;
while (enumerator.MoveNext())
{
yield return enumerator.Current;
cache.Add(enumerator.Current)
}
Dispose();
}
IEnumerator IEnumerable.GetEnumerator()
=> GetEnumerator();
}
}

implement IEnumerator and IEnumerable in ICollection

I am very new to this ICollection stuff and need some guidance of how to implement the IEnumerable and IEnumerator. I have checked Microsoft documentation and I think I understand what was said there (I think). But when I tried to implement it in my case, I was a bit confused and may need some clarification.
Basically, I declared a class T, then another class Ts which implemented ICollection. In Ts, I have a dictionary.
From the main program, I would like to initialize the class Ts like this:
Ts ts= new Ts(){{a,b}, {c,d}};
so, my questions are:
1) is it legal to do that? It appears that it is as the compiler did not complaint although I have not run the test because I have not thoroughly implement IEnumerable and IEnumerator, which brought to my 2nd question
2) How do I implement IEnumerable and IEnumerator?
Below is my pseudo code to illustrate my points.
public class T
{
string itemName;
int quantity;
.....
public T(string s, int q)
{
.....
}
}
public class Ts: ICollection
{
private Dictionary<string, T> inventory= new Dictionary<string,T>();
public void Add(string s, int q)
{
inventory.Add(s, new T(s,q));
}
public IEnumerator<T> GetEnumerator()
{
// please help
}
IEnumerator IEnumerable.GetEnumerator()
{
// what is the proper GetEnumerator here
}
...
implement other method in ICollection
}
extract from the main program
public Ts CollectionOfT = new Ts(){{"bicycle",100},{"Lawn mower",50}};
.........
The proper implementation is to cast your collection to IEnumerable in the explicit implementation:
IEnumerator IEnumerable.GetEnumerator() {
return ((IEnumerable)your_collection_here).GetEnumerator();
}
For the generic version, call GetEnumerator on your collection:
public IEnumerator<T> GetEnumerator() {
return your_collection_here.GetEnumerator();
}
You must have something that is backing your custom collection.. such as a List, Array, etc. Use that in those implementations.
Honestly you don't need to build your own collection "wrapper" around a Dictionary, but if you must, you can delegate pretty much all the calls to the dictionary for the implementation of the ICollection interface.
Hope this helps
public class Ts: ICollection<T>
{
private Dictionary<string, T> inventory= new Dictionary<string,T>();
//public void Add(string s, int q)
//{
// inventory.Add(s, new T(s,q));
//}
public void Add(T item)
{
inventory.Add(item.ItemName,item);
}
public void Add(string s, int q)
{
inventory.Add(s, new T(s, q));
}
public void Clear()
{
inventory.Clear();
}
public bool Contains(T item)
{
return inventory.ContainsValue(item);
}
public void CopyTo(T[] array, int arrayIndex)
{
inventory.Values.CopyTo(array, arrayIndex);
}
public int Count
{
get { return inventory.Count; }
}
public bool IsReadOnly
{
get { return false; }
}
public bool Remove(T item)
{
return inventory.Remove(item.ItemName);
}
public System.Collections.Generic.IEnumerator<T> GetEnumerator()
{
return inventory.Values.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return inventory.Values.GetEnumerator();
}
}
class Program
{
Ts ts = new Ts { { "a", 1 }, { "b", 2 } };
foreach (T t in ts)
{
Console.WriteLine("{0}:{1}",t.ItemName,t.Quantity);
}
}

Does Select() on a List lose track of the size of the collection?

In the following code, is the Select() method smart enough to keep the size of the list somewhere internally for the ToArray() method to be cheap?
List<Thing> bigList = someBigList;
var bigArray = bigList.Select(t => t.SomeField).ToArray();
That's easy to check, without looking at the implementation. Just create a class that implements IList<T>, and put a trace in the Count property:
class MyList<T> : IList<T>
{
private readonly IList<T> _list = new List<T>();
public IEnumerator<T> GetEnumerator()
{
return _list.GetEnumerator();
}
public void Add(T item)
{
_list.Add(item);
}
public void Clear()
{
_list.Clear();
}
public bool Contains(T item)
{
return _list.Contains(item);
}
public void CopyTo(T[] array, int arrayIndex)
{
_list.CopyTo(array, arrayIndex);
}
public bool Remove(T item)
{
return _list.Remove(item);
}
public int Count
{
get
{
Console.WriteLine ("Count accessed");
return _list.Count;
}
}
public bool IsReadOnly
{
get { return _list.IsReadOnly; }
}
public int IndexOf(T item)
{
return _list.IndexOf(item);
}
public void Insert(int index, T item)
{
_list.Insert(index, item);
}
public void RemoveAt(int index)
{
_list.RemoveAt(index);
}
public T this[int index]
{
get { return _list[index]; }
set { _list[index] = value; }
}
#region Implementation of IEnumerable
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
#endregion
}
If the Count property is accessed, this code should print "Count accessed":
var list = new MyList<int> { 1, 2, 3 };
var array = list.Select(x => x).ToArray();
But it doesn't print anything, so no, it doesn't keep track of the count. Of course there could be an optimization specific to List<T>, but it seems unlikely...
No, right now it does not (at least the .NET implementation). From the MS reference sources, Enumerable.ToArray is implemented as
public static TSource[] ToArray<TSource>(this IEnumerable<TSource> source) {
if (source == null) throw Error.ArgumentNull("source");
return new Buffer<TSource>(source).ToArray();
}
Buffer<TSource> creates a copy of the source sequence (in array form) on construction by iterating and resizing as necessary; it has a special "fast path" if source is an ICollection<TSource>, but the result of Enumerable.Select unsurprisingly does not implement that interface.
Be that as it may, apart from pure curiosity I don't think that this result means anything. For one, the implementation may change at any point in the future (even though a quick cost-benefit analysis won't find this likely). And in any case, you will suffer at most O(logN) reallocations. For small N the reallocations are not going to be noticeable. For large N, the amount of time spent on iterating over the collection is going to be O(N) and will therefore easily dominate.
When you apply Select operator to enumerable sequence, it creates one of following iterators:
WhereSelectArrayIterator
WhereSelectListIterator
WhereSelectEnumerableIterator
In case of List<T>, WhereSelectListIterator iterator is created. It uses list's iterator to iterate over the list and apply predicate and selector. This is a MoveNext method implementation:
while (this.enumerator.MoveNext())
{
TSource current = this.enumerator.Current;
if ((this.predicate == null) || this.predicate(current))
{
base.current = this.selector(current);
return true;
}
}
As you can see, it does not preserve information about number of items, which matched predicate, thus it does not know count of items in filtered sequence.

Why in this example (got from msdn), in GetEnumerator method , new PeopleEnum returns IEnumerator?

Why in this example from MSDN, in GetEnumerator method, PeopleEnum returns IEnumerator?
public class Person
{
public Person(string fName, string lName)
{
this.firstName = fName;
this.lastName = lName;
}
public string firstName;
public string lastName;
}
public class People : IEnumerable
{
private Person[] _people;
public People(Person[] pArray)
{
_people = new Person[pArray.Length];
for (int i = 0; i < pArray.Length; i++)
{
_people[i] = pArray[i];
}
}
//why???
IEnumerator IEnumerable.GetEnumerator()
{
return (IEnumerator) GetEnumerator();
}
public PeopleEnum GetEnumerator()
{
return new PeopleEnum(_people);
}
}
public class PeopleEnum : IEnumerator
{
public Person[] _people;
// Enumerators are positioned before the first element
// until the first MoveNext() call.
int position = -1;
public PeopleEnum(Person[] list)
{
_people = list;
}
public bool MoveNext()
{
position++;
return (position < _people.Length);
}
public void Reset()
{
position = -1;
}
object IEnumerator.Current
{
get
{
return Current;
}
}
public Person Current
{
get
{
try
{
return _people[position];
}
catch (IndexOutOfRangeException)
{
throw new InvalidOperationException();
}
}
}
UPDATE:
BTW, if Array data type implements ICloneable interface, why msdn has copied pArray to _people by writing a for loop?
It needs to return exactly IEnumerator to properly implement the IEnumerable interface. It is doing this using an "explicit interface implementation", so on the public API you see PeopleEnum, but IEnumerable is still happy
But in reality you would very rarely write an enumerator this way in C# 2.0 or above; you'd use an iterator block (yield return). See C# in Depth chapter 6 (free chapter!).
For info, the reason that PeopleEnum exists at all here is that this looks like a .NET 1.1 sample, where that is the only way to create a typed enumerator. In .NET 2.0 and above there is IEnumerable<T> / IEnumerator<T>, which has a typed (via generics) .Current.
In .NET 2.0 / C# 2.0 (or above) I would have simply:
public class People : IEnumerable<Person> {
/* snip */
public IEnumerator<Person> GetEnumerator() {
return ((IEnumerable<Person>)_people).GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator() { return _people.GetEnumerator();}
}
Types implementing IEnumerable require a method called GetEnumerator that returns an IEnumerator. In that example (which is pretty obsolete as of C# 2.0) there is an enumerator class PeopleEnum that implements IEnumerator. It's what's used internally by the C# foreach statement.
A more up to date example would look more like the following. Note there's no longer a need for a PeopleEnum class now that C# supports iterators. Effectively the compiler does all the heavy lifting for you.
public class People : IEnumerable
{
private Person[] _people;
public People(Person[] pArray)
{
_people = new Person[pArray.Length];
for (int i = 0; i < pArray.Length; i++)
{
_people[i] = pArray[i];
}
}
IEnumerator IEnumerable.GetEnumerator()
{
for (int i=0; i < _people.Length; i++) {
yield return _people[i];
}
}
}

Is there a built-in way to convert IEnumerator to IEnumerable

Is there a built-in way to convert IEnumerator<T> to IEnumerable<T>?
The easiest way of converting I can think of is via the yield statement
public static IEnumerable<T> ToIEnumerable<T>(this IEnumerator<T> enumerator) {
while ( enumerator.MoveNext() ) {
yield return enumerator.Current;
}
}
compared to the list version this has the advantage of not enumerating the entire list before returning an IEnumerable. using the yield statement you'd only iterate over the items you need, whereas using the list version, you'd first iterate over all items in the list and then all the items you need.
for a little more fun you could change it to
public static IEnumerable<K> Select<K,T>(this IEnumerator<T> e,
Func<K,T> selector) {
while ( e.MoveNext() ) {
yield return selector(e.Current);
}
}
you'd then be able to use linq on your enumerator like:
IEnumerator<T> enumerator;
var someList = from item in enumerator
select new classThatTakesTInConstructor(item);
You could use the following which will kinda work.
public class FakeEnumerable<T> : IEnumerable<T> {
private IEnumerator<T> m_enumerator;
public FakeEnumerable(IEnumerator<T> e) {
m_enumerator = e;
}
public IEnumerator<T> GetEnumerator() {
return m_enumerator;
}
// Rest omitted
}
This will get you into trouble though when people expect successive calls to GetEnumerator to return different enumerators vs. the same one. But if it's a one time only use in a very constrained scenario, this could unblock you.
I do suggest though you try and not do this because I think eventually it will come back to haunt you.
A safer option is along the lines Jonathan suggested. You can expend the enumerator and create a List<T> of the remaining items.
public static List<T> SaveRest<T>(this IEnumerator<T> e) {
var list = new List<T>();
while ( e.MoveNext() ) {
list.Add(e.Current);
}
return list;
}
EnumeratorEnumerable<T>
A threadsafe, resettable adaptor from IEnumerator<T> to IEnumerable<T>
I use Enumerator parameters like in C++ forward_iterator concept.
I agree that this can lead to confusion as too many people will indeed assume Enumerators are /like/ Enumerables, but they are not.
However, the confusion is fed by the fact that IEnumerator contains the Reset method. Here is my idea of the most correct implementation. It leverages the implementation of IEnumerator.Reset()
A major difference between an Enumerable and and Enumerator is, that an Enumerable might be able to create several Enumerators simultaneously. This implementation puts a whole lot of work into making sure that this never happens for the EnumeratorEnumerable<T> type. There are two EnumeratorEnumerableModes:
Blocking (meaning that a second caller will simply wait till the first enumeration is completed)
NonBlocking (meaning that a second (concurrent) request for an enumerator simply throws an exception)
Note 1: 74 lines are implementation, 79 lines are testing code :)
Note 2: I didn't refer to any unit testing framework for SO convenience
using System;
using System.Diagnostics;
using System.Linq;
using System.Collections;
using System.Collections.Generic;
using System.Threading;
namespace EnumeratorTests
{
public enum EnumeratorEnumerableMode
{
NonBlocking,
Blocking,
}
public sealed class EnumeratorEnumerable<T> : IEnumerable<T>
{
#region LockingEnumWrapper
public sealed class LockingEnumWrapper : IEnumerator<T>
{
private static readonly HashSet<IEnumerator<T>> BusyTable = new HashSet<IEnumerator<T>>();
private readonly IEnumerator<T> _wrap;
internal LockingEnumWrapper(IEnumerator<T> wrap, EnumeratorEnumerableMode allowBlocking)
{
_wrap = wrap;
if (allowBlocking == EnumeratorEnumerableMode.Blocking)
Monitor.Enter(_wrap);
else if (!Monitor.TryEnter(_wrap))
throw new InvalidOperationException("Thread conflict accessing busy Enumerator") {Source = "LockingEnumWrapper"};
lock (BusyTable)
{
if (BusyTable.Contains(_wrap))
throw new LockRecursionException("Self lock (deadlock) conflict accessing busy Enumerator") { Source = "LockingEnumWrapper" };
BusyTable.Add(_wrap);
}
// always implicit Reset
_wrap.Reset();
}
#region Implementation of IDisposable and IEnumerator
public void Dispose()
{
lock (BusyTable)
BusyTable.Remove(_wrap);
Monitor.Exit(_wrap);
}
public bool MoveNext() { return _wrap.MoveNext(); }
public void Reset() { _wrap.Reset(); }
public T Current { get { return _wrap.Current; } }
object IEnumerator.Current { get { return Current; } }
#endregion
}
#endregion
private readonly IEnumerator<T> _enumerator;
private readonly EnumeratorEnumerableMode _allowBlocking;
public EnumeratorEnumerable(IEnumerator<T> e, EnumeratorEnumerableMode allowBlocking)
{
_enumerator = e;
_allowBlocking = allowBlocking;
}
private LockRecursionPolicy a;
public IEnumerator<T> GetEnumerator()
{
return new LockingEnumWrapper(_enumerator, _allowBlocking);
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
class TestClass
{
private static readonly string World = "hello world\n";
public static void Main(string[] args)
{
var master = World.GetEnumerator();
var nonblocking = new EnumeratorEnumerable<char>(master, EnumeratorEnumerableMode.NonBlocking);
var blocking = new EnumeratorEnumerable<char>(master, EnumeratorEnumerableMode.Blocking);
foreach (var c in nonblocking) Console.Write(c); // OK (implicit Reset())
foreach (var c in blocking) Console.Write(c); // OK (implicit Reset())
foreach (var c in nonblocking) Console.Write(c); // OK (implicit Reset())
foreach (var c in blocking) Console.Write(c); // OK (implicit Reset())
try
{
var willRaiseException = from c1 in nonblocking from c2 in nonblocking select new {c1, c2};
Console.WriteLine("Cartesian product: {0}", willRaiseException.Count()); // RAISE
}
catch (Exception e) { Console.WriteLine(e); }
foreach (var c in nonblocking) Console.Write(c); // OK (implicit Reset())
foreach (var c in blocking) Console.Write(c); // OK (implicit Reset())
try
{
var willSelfLock = from c1 in blocking from c2 in blocking select new { c1, c2 };
Console.WriteLine("Cartesian product: {0}", willSelfLock.Count()); // LOCK
}
catch (Exception e) { Console.WriteLine(e); }
// should not externally throw (exceptions on other threads reported to console)
if (ThreadConflictCombinations(blocking, nonblocking))
throw new InvalidOperationException("Should have thrown an exception on background thread");
if (ThreadConflictCombinations(nonblocking, nonblocking))
throw new InvalidOperationException("Should have thrown an exception on background thread");
if (ThreadConflictCombinations(nonblocking, blocking))
Console.WriteLine("Background thread timed out");
if (ThreadConflictCombinations(blocking, blocking))
Console.WriteLine("Background thread timed out");
Debug.Assert(true); // Must be reached
}
private static bool ThreadConflictCombinations(IEnumerable<char> main, IEnumerable<char> other)
{
try
{
using (main.GetEnumerator())
{
var bg = new Thread(o =>
{
try { other.GetEnumerator(); }
catch (Exception e) { Report(e); }
}) { Name = "background" };
bg.Start();
bool timedOut = !bg.Join(1000); // observe the thread waiting a full second for a lock (or throw the exception for nonblocking)
if (timedOut)
bg.Abort();
return timedOut;
}
} catch
{
throw new InvalidProgramException("Cannot be reached");
}
}
static private readonly object ConsoleSynch = new Object();
private static void Report(Exception e)
{
lock (ConsoleSynch)
Console.WriteLine("Thread:{0}\tException:{1}", Thread.CurrentThread.Name, e);
}
}
}
Note 3: I think the implementation of the thread locking (especially around BusyTable) is quite ugly; However, I didn't want to resort to ReaderWriterLock(LockRecursionPolicy.NoRecursion) and didn't want to assume .Net 4.0 for SpinLock
Solution with use of Factory along with fixing cached IEnumerator issue in JaredPar's answer allows to change the way of enumeration.
Consider a simple example: we want custom List<T> wrapper that allow to enumerate in reverse order along with default enumeration. List<T> already implements IEnumerator for default enumeration, we only need to create IEnumerator that enumerates in reverse order. (We won't use List<T>.AsEnumerable().Reverse() because it enumerates the list twice)
public enum EnumerationType {
Default = 0,
Reverse
}
public class CustomList<T> : IEnumerable<T> {
private readonly List<T> list;
public CustomList(IEnumerable<T> list) => this.list = new List<T>(list);
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
//Default IEnumerable method will return default enumerator factory
public IEnumerator<T> GetEnumerator()
=> GetEnumerable(EnumerationType.Default).GetEnumerator();
public IEnumerable<T> GetEnumerable(EnumerationType enumerationType)
=> enumerationType switch {
EnumerationType.Default => new DefaultEnumeratorFactory(list),
EnumerationType.Reverse => new ReverseEnumeratorFactory(list)
};
//Simple implementation of reverse list enumerator
private class ReverseEnumerator : IEnumerator<T> {
private readonly List<T> list;
private int index;
internal ReverseEnumerator(List<T> list) {
this.list = list;
index = list.Count-1;
Current = default;
}
public void Dispose() { }
public bool MoveNext() {
if(index >= 0) {
Current = list[index];
index--;
return true;
}
Current = default;
return false;
}
public T Current { get; private set; }
object IEnumerator.Current => Current;
void IEnumerator.Reset() {
index = list.Count - 1;
Current = default;
}
}
private abstract class EnumeratorFactory : IEnumerable<T> {
protected readonly List<T> List;
protected EnumeratorFactory(List<T> list) => List = list;
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
public abstract IEnumerator<T> GetEnumerator();
}
private class DefaultEnumeratorFactory : EnumeratorFactory {
public DefaultEnumeratorFactory(List<T> list) : base(list) { }
//Default enumerator is already implemented in List<T>
public override IEnumerator<T> GetEnumerator() => List.GetEnumerator();
}
private class ReverseEnumeratorFactory : EnumeratorFactory {
public ReverseEnumeratorFactory(List<T> list) : base(list) { }
public override IEnumerator<T> GetEnumerator() => new ReverseEnumerator(List);
}
}
As Jason Watts said -- no, not directly.
If you really want to, you could loop through the IEnumerator<T>, putting the items into a List<T>, and return that, but I'm guessing that's not what you're looking to do.
The basic reason you can't go that direction (IEnumerator<T> to a IEnumerable<T>) is that IEnumerable<T> represents a set that can be enumerated, but IEnumerator<T> is a specific enumeratation over a set of items -- you can't turn the specific instance back into the thing that created it.
static class Helper
{
public static List<T> SaveRest<T>(this IEnumerator<T> enumerator)
{
var list = new List<T>();
while (enumerator.MoveNext())
{
list.Add(enumerator.Current);
}
return list;
}
public static ArrayList SaveRest(this IEnumerator enumerator)
{
var list = new ArrayList();
while (enumerator.MoveNext())
{
list.Add(enumerator.Current);
}
return list;
}
}
Nope, IEnumerator<> and IEnumerable<> are different beasts entirely.
This is a variant I have written... The specific is a little different. I wanted to do a MoveNext() on an IEnumerable<T>, check the result, and then roll everything in a new IEnumerator<T> that was "complete" (so that included even the element of the IEnumerable<T> I had already extracted)
// Simple IEnumerable<T> that "uses" an IEnumerator<T> that has
// already received a MoveNext(). "eats" the first MoveNext()
// received, then continues normally. For shortness, both IEnumerable<T>
// and IEnumerator<T> are implemented by the same class. Note that if a
// second call to GetEnumerator() is done, the "real" IEnumerator<T> will
// be returned, not this proxy implementation.
public class EnumerableFromStartedEnumerator<T> : IEnumerable<T>, IEnumerator<T>
{
public readonly IEnumerator<T> Enumerator;
public readonly IEnumerable<T> Enumerable;
// Received by creator. Return value of MoveNext() done by caller
protected bool FirstMoveNextSuccessful { get; set; }
// The Enumerator can be "used" only once, then a new enumerator
// can be requested by Enumerable.GetEnumerator()
// (default = false)
protected bool Used { get; set; }
// The first MoveNext() has been already done (default = false)
protected bool DoneMoveNext { get; set; }
public EnumerableFromStartedEnumerator(IEnumerator<T> enumerator, bool firstMoveNextSuccessful, IEnumerable<T> enumerable)
{
Enumerator = enumerator;
FirstMoveNextSuccessful = firstMoveNextSuccessful;
Enumerable = enumerable;
}
public IEnumerator<T> GetEnumerator()
{
if (Used)
{
return Enumerable.GetEnumerator();
}
Used = true;
return this;
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
public T Current
{
get
{
// There are various school of though on what should
// happens if called before the first MoveNext() or
// after a MoveNext() returns false. We follow the
// "return default(TInner)" school of thought for the
// before first MoveNext() and the "whatever the
// Enumerator wants" for the after a MoveNext() returns
// false
if (!DoneMoveNext)
{
return default(T);
}
return Enumerator.Current;
}
}
public void Dispose()
{
Enumerator.Dispose();
}
object IEnumerator.Current
{
get
{
return Current;
}
}
public bool MoveNext()
{
if (!DoneMoveNext)
{
DoneMoveNext = true;
return FirstMoveNextSuccessful;
}
return Enumerator.MoveNext();
}
public void Reset()
{
// This will 99% throw :-) Not our problem.
Enumerator.Reset();
// So it is improbable we will arrive here
DoneMoveNext = true;
}
}
Use:
var enumerable = someCollection<T>;
var enumerator = enumerable.GetEnumerator();
bool res = enumerator.MoveNext();
// do whatever you want with res/enumerator.Current
var enumerable2 = new EnumerableFromStartedEnumerator<T>(enumerator, res, enumerable);
Now, the first GetEnumerator() that will be requested to enumerable2 will be given through the enumerator enumerator. From the second onward the enumerable.GetEnumerator() will be used.
The other answers here are ... strange. IEnumerable<T> has just one method, GetEnumerator(). And an IEnumerable<T> must implement IEnumerable, which also has just one method, GetEnumerator() (the difference being that one is generic on T and the other is not). So it should be clear how to turn an IEnumerator<T> into an IEnumerable<T>:
// using modern expression-body syntax
public class IEnumeratorToIEnumerable<T> : IEnumerable<T>
{
private readonly IEnumerator<T> Enumerator;
public IEnumeratorToIEnumerable(IEnumerator<T> enumerator) =>
Enumerator = enumerator;
public IEnumerator<T> GetEnumerator() => Enumerator;
IEnumerator IEnumerable.GetEnumerator() => Enumerator;
}
foreach (var foo in new IEnumeratorToIEnumerable<Foo>(fooEnumerator))
DoSomethingWith(foo);
// and you can also do:
var fooEnumerable = new IEnumeratorToIEnumerable<Foo>(fooEnumerator);
foreach (var foo in fooEnumerable)
DoSomethingWith(foo);
// Some IEnumerators automatically repeat after MoveNext() returns false,
// in which case this is a no-op, but generally it's required.
fooEnumerator.Reset();
foreach (var foo in fooEnumerable)
DoSomethingElseWith(foo);
However, none of this should be needed because it's unusual to have an IEnumerator<T> that doesn't come with an IEnumerable<T> that returns an instance of it from its GetEnumerator method. If you're writing your own IEnumerator<T>, you should certainly provide the IEnumerable<T>. And really it's the other way around ... an IEnumerator<T> is intended to be a private class that iterates over instances of a public class that implements IEnumerable<T>.

Categories

Resources