IEnumerable Method with AsParallel - c#

I got the following extension method:
static class ExtensionMethods
{
public static IEnumerable<IEnumerable<T>> Subsequencise<T>(
this IEnumerable<T> input,
int subsequenceLength)
{
var enumerator = input.GetEnumerator();
SubsequenciseParameter parameter = new SubsequenciseParameter
{
Next = enumerator.MoveNext()
};
while (parameter.Next)
yield return getSubSequence(
enumerator,
subsequenceLength,
parameter);
}
private static IEnumerable<T> getSubSequence<T>(
IEnumerator<T> enumerator,
int subsequenceLength,
SubsequenciseParameter parameter)
{
do
{
lock (enumerator) // this lock makes it "work"
{ // removing this causes exceptions.
if (parameter.Next)
yield return enumerator.Current;
}
} while ((parameter.Next = enumerator.MoveNext())
&& --subsequenceLength > 0);
}
// Needed since you cant use out or ref in yield-return methods...
class SubsequenciseParameter
{
public bool Next { get; set; }
}
}
Its purpose is to split a sequence into subsequences of a given size.
Calling it like this:
foreach (var sub in "abcdefghijklmnopqrstuvwxyz"
.Subsequencise(3)
.**AsParallel**()
.Select(sub =>new String(sub.ToArray()))
{
Console.WriteLine(sub);
}
Console.ReadKey();
works, however there are some empty lines in-between since some of the threads are "too late" and enter the first yield return.
I tried putting more locks everywhere, however I cannot achieve to make this work correct in combination with as parallel.
It's obvious that this example doesn't justify the use of as parallel at all. It is just to demonstrate how the method could be called.

The problem is that using iterators is lazy evaluated, so you return a lazily evaluated iterator which gets used from multiple threads.
You can fix this by rewriting your method as follows:
public static IEnumerable<IEnumerable<T>> Subsequencise<T>(this IEnumerable<T> input, int subsequenceLength)
{
var syncObj = new object();
var enumerator = input.GetEnumerator();
if (!enumerator.MoveNext())
{
yield break;
}
List<T> currentList = new List<T> { enumerator.Current };
int length = 1;
while (enumerator.MoveNext())
{
if (length == subsequenceLength)
{
length = 0;
yield return currentList;
currentList = new List<T>();
}
currentList.Add(enumerator.Current);
++length;
}
yield return currentList;
}
This performs the same function, but doesn't use an iterator to implement the "nested" IEnumerable<T>, avoiding the problem. Note that this also avoids the locking as well as the custom SubsequenciseParameter type.

Related

Calling method with IEnumerable<T> sequence as argument, if that sequence is not empty

I have method Foo, which do some CPU intensive computations and returns IEnumerable<T> sequence. I need to check, if that sequence is empty. And if not, call method Bar with that sequence as argument.
I thought about three approaches...
Check, if sequence is empty with Any(). This is ok, if sequence is really empty, which will be case most of the times. But it will have horrible performance, if sequence will contains some elements and Foo will need them compute again...
Convert sequence to list, check if that list it empty... and pass it to Bar. This have also limitation. Bar will need only first x items, so Foo will be doing unnecessary work...
Check, if sequence is empty without actually reset the sequence. This sounds like win-win, but I can't find any easy build-in way, how to do it. So I create this obscure workaround and wondering, whether this is really a best approach.
Condition
var source = Foo();
if (!IsEmpty(ref source))
Bar(source);
with IsEmpty implemented as
bool IsEmpty<T>(ref IEnumerable<T> source)
{
var enumerator = source.GetEnumerator();
if (enumerator.MoveNext())
{
source = CreateIEnumerable(enumerator);
return false;
}
return true;
IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
{
yield return usedEnumerator.Current;
while (usedEnumerator.MoveNext())
{
yield return usedEnumerator.Current;
}
}
}
Also note, that calling Bar with empty sequence is not option...
EDIT:
After some consideration, best answer for my case is from Olivier Jacot-Descombes - avoid that scenario completely. Accepted solution answers this question - if it is really no other way.
I don't know whether your algorithm in Foo allows to determine if the enumeration will be empty without doing the calculations. But if this is the case, return null if the sequence would be empty:
public IEnumerable<T> Foo()
{
if (<check if sequence will be empty>) {
return null;
}
return GetSequence();
}
private IEnumerable<T> GetSequence()
{
...
yield return item;
...
}
Note that if a method uses yield return, it cannot use a simple return to return null. Therefore a second method is needed.
var sequence = Foo();
if (sequence != null) {
Bar(sequence);
}
After reading one of your comments
Foo need to initialize some resources, parse XML file and fill some HashSets, which will be used to filter (yield) returned data.
I suggest another approach. The time consuming part seems to be the initialization. To be able to separate it from the iteration, create a foo calculator class. Something like:
public class FooCalculator<T>
{
private bool _isInitialized;
private string _file;
public FooCalculator(string file)
{
_file = file;
}
private EnsureInitialized()
{
if (_isInitialized) return;
// Parse XML.
// Fill some HashSets.
_isInitialized = true;
}
public IEnumerable<T> Result
{
get {
EnsureInitialized();
...
yield return ...;
...
}
}
}
This ensures that the costly initialization stuff is executed only once. Now you can safely use Any().
Other optimizations are conceivable. The Result property could remember the position of the first returned element, so that if it is called again, it could skip to it immediately.
You would like to call some function Bar<T>(IEnumerable<T> source) if and only if the enumerable source contains at least one element, but you're running into two problems:
There is no method T Peek() in IEnumerable<T> so you would need to actually begin to evaluate the enumerable to see if it's nonempty, but...
You don't want to even partially double-evaluate the enumerable since setting up the enumerable might be expensive.
In that case your approach looks reasonable. You do, however, have some issues with your imlementation:
You need to dispose enumerator after using it.
As pointed out by Ivan Stoev in comments, if the Bar() method attempts to evaluate the IEnumerable<T> more than once (e.g. by calling Any() then foreach (...)) then the results will be undefined because usedEnumerator will have been exhausted by the first enumeration.
To resolve these issues, I'd suggest modifying your API a little and create an extension method IfNonEmpty<T>(this IEnumerable<T> source, Action<IEnumerable<T>> func) that calls a specified method only if the sequence is nonempty, as shown below:
public static partial class EnumerableExtensions
{
public static bool IfNonEmpty<T>(this IEnumerable<T> source, Action<IEnumerable<T>> func)
{
if (source == null|| func == null)
throw new ArgumentNullException();
using (var enumerator = source.GetEnumerator())
{
if (!enumerator.MoveNext())
return false;
func(new UsedEnumerator<T>(enumerator));
return true;
}
}
class UsedEnumerator<T> : IEnumerable<T>
{
IEnumerator<T> usedEnumerator;
public UsedEnumerator(IEnumerator<T> usedEnumerator)
{
if (usedEnumerator == null)
throw new ArgumentNullException();
this.usedEnumerator = usedEnumerator;
}
public IEnumerator<T> GetEnumerator()
{
var localEnumerator = System.Threading.Interlocked.Exchange(ref usedEnumerator, null);
if (localEnumerator == null)
// An attempt has been made to enumerate usedEnumerator more than once;
// throw an exception since this is not allowed.
throw new InvalidOperationException();
yield return localEnumerator.Current;
while (localEnumerator.MoveNext())
{
yield return localEnumerator.Current;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
}
Demo fiddle with unit tests here.
If you can change Bar then how about change it to TryBar that returns false when IEnumerable<T> was empty?
bool TryBar(IEnumerable<Foo> source)
{
var count = 0;
foreach (var x in source)
{
count++;
}
return count > 0;
}
If that doesn't work for you could always create your own IEnumerable<T> wrapper that caches values after they have been iterated once.
One improvement for your IsEmpty would be to check if source is ICollection<T>, and if it is, check .Count (also, dispose the enumerator):
bool IsEmpty<T>(ref IEnumerable<T> source)
{
if (source is ICollection<T> collection)
{
return collection.Count == 0;
}
var enumerator = source.GetEnumerator();
if (enumerator.MoveNext())
{
source = CreateIEnumerable(enumerator);
return false;
}
enumerator.Dispose();
return true;
IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
{
yield return usedEnumerator.Current;
while (usedEnumerator.MoveNext())
{
yield return usedEnumerator.Current;
}
usedEnumerator.Dispose();
}
}
This will work for arrays and lists.
I would, however, rework IsEmpty to return:
IEnumerable<T> NotEmpty<T>(IEnumerable<T> source)
{
if (source is ICollection<T> collection)
{
if (collection.Count == 0)
{
return null;
}
return source;
}
var enumerator = source.GetEnumerator();
if (enumerator.MoveNext())
{
return CreateIEnumerable(enumerator);
}
enumerator.Dispose();
return null;
IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
{
yield return usedEnumerator.Current;
while (usedEnumerator.MoveNext())
{
yield return usedEnumerator.Current;
}
usedEnumerator.Dispose();
}
}
Now, you would check if it returned null.
The accepted answer is probably the best approach but, based on, and I quote:
Convert sequence to list, check if that list it empty... and pass it to Bar. This have also limitation. Bar will need only first x items, so Foo will be doing unnecessary work...
Another take would be creating an IEnumerable<T> that partially caches the underlying enumeration. Something along the following lines:
interface IDisposableEnumerable<T>
:IEnumerable<T>, IDisposable
{
}
static class PartiallyCachedEnumerable
{
public static IDisposableEnumerable<T> Create<T>(
IEnumerable<T> source,
int cachedCount)
{
if (source == null)
throw new NullReferenceException(
nameof(source));
if (cachedCount < 1)
throw new ArgumentOutOfRangeException(
nameof(cachedCount));
return new partiallyCachedEnumerable<T>(
source, cachedCount);
}
private class partiallyCachedEnumerable<T>
: IDisposableEnumerable<T>
{
private readonly IEnumerator<T> enumerator;
private bool disposed;
private readonly List<T> cache;
private readonly bool hasMoreItems;
public partiallyCachedEnumerable(
IEnumerable<T> source,
int cachedCount)
{
Debug.Assert(source != null);
Debug.Assert(cachedCount > 0);
enumerator = source.GetEnumerator();
cache = new List<T>(cachedCount);
var count = 0;
while (enumerator.MoveNext() &&
count < cachedCount)
{
cache.Add(enumerator.Current);
count += 1;
}
hasMoreItems = !(count < cachedCount);
}
public void Dispose()
{
if (disposed)
return;
enumerator.Dispose();
disposed = true;
}
public IEnumerator<T> GetEnumerator()
{
foreach (var t in cache)
yield return t;
if (disposed)
yield break;
while (enumerator.MoveNext())
{
yield return enumerator.Current;
cache.Add(enumerator.Current)
}
Dispose();
}
IEnumerator IEnumerable.GetEnumerator()
=> GetEnumerator();
}
}

Can I take the first n elements from an enumeration, and then still use the rest of the enumeration?

Suppose I have an IEnumerable<T>, and I want to take the first element and pass the remaining elements to some other code. I can get the first n elements using Take(n), but how can I then access the remaining elements without causing the enumeration to re-start?
For example, suppose I have a method ReadRecords that accepts the records in a CSV file as IEnumerable<string>. Now suppose that within that method, I want to read the first record (the headers), and then pass the remaining records to a ReadDataRecords method that also takes IEnumerable<string>. Like this:
void ReadCsv(IEnumerable<string> records)
{
var headerRecord = records.Take(1);
var dataRecords = ???
ReadDataRecords(dataRecords);
}
void ReadDataRecords(IEnumerable<string> records)
{
// ...
}
If I were prepared to re-start the enumeration, then I could use dataRecords = records.Skip(1). However, I don't want to re-start it - indeed, it might not be re-startable.
So, is there any way to take the first records, and then the remaining records (other than by reading all the values into a new collection and re-enumerating them)?
This is an interesting question, I think you can use a workaround like this, instead of using LINQ get the enumerator and use it:
private void ReadCsv(IEnumerable<string> records)
{
var enumerator = records.GetEnumerator();
enumerator.MoveNext();
var headerRecord = enumerator.Current;
var dataRecords = GetRemainingRecords(enumerator);
}
public IEnumerable<string> GetRemainingRecords(IEnumerator<string> enumerator)
{
while (enumerator.MoveNext())
{
if (enumerator.Current != null)
yield return enumerator.Current;
}
}
Update: According to your comment here is more extended way of doing this:
public static class CustomEnumerator
{
private static int _counter = 0;
private static IEnumerator enumerator;
public static IEnumerable<T> GetRecords<T>(this IEnumerable<T> source)
{
if (enumerator == null) enumerator = source.GetEnumerator();
if (_counter == 0)
{
enumerator.MoveNext();
_counter++;
yield return (T)enumerator.Current;
}
else
{
while (enumerator.MoveNext())
{
yield return (T)enumerator.Current;
}
_counter = 0;
enumerator = null;
}
}
}
Usage:
private static void ReadCsv(IEnumerable<string> records)
{
var headerRecord = records.GetRecords();
var dataRecords = records.GetRecords();
}
Yes, use the IEnumerator from the IEnumerable, and you can maintain the position across method calls;
A simple example;
public class Program
{
static void Main(string[] args)
{
int[] arr = new [] {1, 2, 3};
IEnumerator enumerator = arr.GetEnumerator();
enumerator.MoveNext();
Console.WriteLine(enumerator.Current);
DoRest(enumerator);
}
static void DoRest(IEnumerator enumerator)
{
while (enumerator.MoveNext())
Console.WriteLine(enumerator.Current);
}
}

Linq optimisation within a foreach

I have been looking for a way of splitting a foreach loop into multiple parts and came across the following code:
foreach(var item in items.Skip(currentPage * itemsPerPage).Take(itemsPerPage))
{
//Do stuff
}
Would items.Skip(currentPage * itemsPerPage).Take(itemsPerPage) be processed in every iteration, or would it be processed once, and have a temporary result used with the foreach loop automatically by the compiler?
No, it would be processed once.
It's the same like:
public IEnumerable<Something> GetData() {
return someData;
}
foreach(var d in GetData()) {
//do something with [d]
}
The foreach construction is equivalent to:
IEnumerator enumerator = myCollection.GetEnumerator();
try
{
while (enumerator.MoveNext())
{
object current = enumerator.Current;
Console.WriteLine(current);
}
}
finally
{
IDisposable e = enumerator as IDisposable;
if (e != null)
{
e.Dispose();
}
}
So, no, myCollection would be processed only once.
Update:
Please note that this depends on the implementation of the IEnumerator that the IEnumerable uses.
In this (evil) example:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Collections;
namespace TestStack
{
class EvilEnumerator<T> : IEnumerator<T> {
private IEnumerable<T> enumerable;
private int index = -1;
public EvilEnumerator(IEnumerable<T> e)
{
enumerable = e;
}
#region IEnumerator<T> Membres
public T Current
{
get { return enumerable.ElementAt(index); }
}
#endregion
#region IDisposable Membres
public void Dispose()
{
}
#endregion
#region IEnumerator Membres
object IEnumerator.Current
{
get { return enumerable.ElementAt(index); }
}
public bool MoveNext()
{
index++;
if (index >= enumerable.Count())
return false;
return true;
}
public void Reset()
{
}
#endregion
}
class DemoEnumerable<T> : IEnumerable<T>
{
private IEnumerable<T> enumerable;
public DemoEnumerable(IEnumerable<T> e)
{
enumerable = e;
}
#region IEnumerable<T> Membres
public IEnumerator<T> GetEnumerator()
{
return new EvilEnumerator<T>(enumerable);
}
#endregion
#region IEnumerable Membres
IEnumerator IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
#endregion
}
class Program
{
static void Main(string[] args)
{
IEnumerable<int> numbers = Enumerable.Range(0,100);
DemoEnumerable<int> enumerable = new DemoEnumerable<int>(numbers);
foreach (var item in enumerable)
{
Console.WriteLine(item);
}
}
}
}
Each iteration over enumerable would evaluate numbers two times.
Question:
Would items.Skip(currentPage * itemsPerPage).Take(itemsPerPage) be
processed every iteration, or would it be processed once, and have a
temporary result used with the foreach loop automatically by the
compiler?
Answer:
It would be processed once, not every iteration. You can put the collection into a variable to make the foreach more readable. Illustrated below.
foreach(var item in items.Skip(currentPage * itemsPerPage).Take(itemsPerPage))
{
//Do stuff
}
vs.
List<MyClass> query = items.Skip(currentPage * itemsPerPage).Take(itemsPerPage).ToList();
foreach(var item in query)
{
//Do stuff
}
vs.
IEnumerable<MyClass> query = items.Skip(currentPage * itemsPerPage).Take(itemsPerPage);
foreach(var item in query)
{
//Do stuff
}
The code that you present will only iterate the items in the list once, as others have pointed out.
However, that only gives you the items for one page. If you are handling multiple pages, you must be calling that code once for each page (because somewhere you must be incrementing currentPage, right?).
What I mean is that you must be doing something like this:
for (int currentPage = 0; currentPage < numPages; ++currentPage)
{
foreach (var item in items.Skip(currentPage*itemsPerPage).Take(itemsPerPage))
{
//Do stuff
}
}
Now if you do that, then you will be iterating the sequence multiple times - once for each page. The first iteration will only go as far as the end of the first page, but the next will iterate from the beginning to the end of the second page (via the Skip() and the Take()) - and the next will iterate from the beginning to the end of the third page. And so on.
To avoid that you can write an extension method for IEnumerable<T> which partitions the data into batches (which you could also describe as "paginating" the data into "pages").
Rather than just presenting an IEnumerable of IEnumerables, it can be more useful to wrap each batch in a class to supply the batch index along with the items in the batch, like so:
public sealed class Batch<T>
{
public readonly int Index;
public readonly IEnumerable<T> Items;
public Batch(int index, IEnumerable<T> items)
{
Index = index;
Items = items;
}
}
public static class EnumerableExt
{
// Note: Not threadsafe, so not suitable for use with Parallel.Foreach() or IEnumerable.AsParallel()
public static IEnumerable<Batch<T>> Partition<T>(this IEnumerable<T> input, int batchSize)
{
var enumerator = input.GetEnumerator();
int index = 0;
while (enumerator.MoveNext())
yield return new Batch<T>(index++, nextBatch(enumerator, batchSize));
}
private static IEnumerable<T> nextBatch<T>(IEnumerator<T> enumerator, int blockSize)
{
do { yield return enumerator.Current; }
while (--blockSize > 0 && enumerator.MoveNext());
}
}
This extension method does not buffer the data, and it only iterates through it once.
Given this extension method, it becomes more readable to batch up the items. Note that this example enumerates through ALL items for all pages, unlike the OP's example which only iterates through the items for one page:
var items = Enumerable.Range(10, 50); // Pretend we have 50 items.
int itemsPerPage = 20;
foreach (var page in items.Partition(itemsPerPage))
{
Console.Write("Page " + page.Index + " items: ");
foreach (var i in page.Items)
Console.Write(i + " ");
Console.WriteLine();
}

C#: SkipLast implementation

I needed a method to give me all but the last item in a sequence. This is my current implementation:
public static IEnumerable<T> SkipLast<T>(this IEnumerable<T> source)
{
using (IEnumerator<T> iterator = source.GetEnumerator())
{
if(iterator.MoveNext())
while(true)
{
var current = iterator.Current;
if(!iterator.MoveNext())
yield break;
yield return current;
}
}
}
What I need it for is to do something with all the items except the last one. In my case I have a sequence of objects with various properties. I then order them by date, and then I need to do an adjustment to all of them except the most recent item (which would be the last one after ordering).
Thing is, I am not too into these enumerators and stuff yet and don't really have anyone here to ask either :p What I am wondering is if this is a good implementation, or if I have done a small or big blunder somewhere. Or if maybe this take on the problem is a weird one, etc.
I guess a more general implementation could have been an AllExceptMaxBy method. Since that is kind of what it is. The MoreLinq has a MaxBy and MinBy method and my method kind of need to do the same, but return every item except the maximum or minimum one.
This is tricky, as "last element" isn't a Markov stopping point: you can't tell that you've got to the last element until you try to get the next one. It's doable, but only if you don't mind permanently being "one element behind". That's basically what your current implementation does, and it looks okay, although I'd probably write it slightly differently.
An alternative approach would be to use foreach, always yielding the previously returned value unless you were at the first iteration:
public static IEnumerable<T> SkipLast<T>(this IEnumerable<T> source)
{
T previous = default(T);
bool first = true;
foreach (T element in source)
{
if (!first)
{
yield return previous;
}
previous = element;
first = false;
}
}
Another option, closer to your code:
public static IEnumerable<T> SkipLast<T>(this IEnumerable<T> source)
{
using (IEnumerator<T> iterator = source.GetEnumerator())
{
if(!iterator.MoveNext())
{
yield break;
}
T previous = iterator.Current;
while (iterator.MoveNext())
{
yield return previous;
previous = iterator.Current;
}
}
}
That avoids nesting quite as deeply (by doing an early exit if the sequence is empty) and it uses a "real" while condition instead of while(true)
If you're using .NET 3.5, I guess you could use:
public static IEnumerable<T> SkipLast<T>(this IEnumerable<T> source)
{
return source.TakeWhile((item, index) => index < source.Count() - 1))
}
Your implementation looks perfectly fine to me - it's probably the way I would do it.
The only simplification I might suggest in relation to your situation is to order the list the other way round (i.e. ascending rather than descending). Although this may not be suitable in your code, it would allow you to simply use collection.Skip(1) to take all items except the most recent one.
If this isn't possible for reasons you haven't shown in your post, then your current implementation is no problem at all.
public static IEnumerable<T> SkipLast<T>(this IEnumerable<T> source)
{
if (!source.Any())
{
yield break;
}
Queue<T> items = new Queue<T>();
items.Enqueue(source.First());
foreach(T item in source.Skip(1))
{
yield return items.Dequeue();
items.Enqueue(item);
}
}
(Old answer scrapped; this code has been tested and works.) It prints
first
second
FIRST
SECOND
THIRD
public static class ExtNum{
public static IEnumerable skipLast(this IEnumerable source){
if ( ! source.Any())
yield break;
for (int i = 0 ; i <=source.Count()-2 ; i++ )
yield return source.ElementAt(i);
yield break;
}
}
class Program
{
static void Main( string[] args )
{
Queue qq = new Queue();
qq.Enqueue("first");qq.Enqueue("second");qq.Enqueue("third");
List lq = new List();
lq.Add("FIRST"); lq.Add("SECOND"); lq.Add("THIRD"); lq.Add("FOURTH");
foreach(string s1 in qq.skipLast())
Console.WriteLine(s1);
foreach ( string s2 in lq.skipLast())
Console.WriteLine(s2);
}
}
Combining all answers, and using <LangVersion>latest<LangVersion>:
.NET Fiddle
public static class EnumerableExtensions
{
// Source is T[]
public static IEnumerable<T> SkipLast<T>(this T[] source, int count) =>
source.TakeWhile((item, index) => index < source.Length - count);
public static IEnumerable<T> SkipLast<T>(this T[] source) => source.SkipLast(1);
// Source is ICollection<T>
public static IEnumerable<T> SkipLast<T>(this ICollection<T> source, int count) =>
source.TakeWhile((item, index) => index < source.Count - count);
public static IEnumerable<T> SkipLast<T>(this ICollection<T> source) => source.SkipLast(1);
// Source is unknown or IEnumerable<T>
public static IEnumerable<T> SkipLast<T>(this IEnumerable<T> source, int count)
{
switch (source)
{
case T[] array:
return SkipLast(array, count);
case ICollection<T> collection:
return SkipLast(collection, count);
default:
return skipLast();
}
IEnumerable<T> skipLast()
{
using IEnumerator<T> iterator = source.GetEnumerator();
if (!iterator.MoveNext())
yield break;
Queue<T> items = new Queue<T>(count);
items.Enqueue(iterator.Current);
for (int i = 1; i < count && iterator.MoveNext(); i++)
items.Enqueue(iterator.Current);
while (iterator.MoveNext())
{
yield return items.Dequeue();
items.Enqueue(iterator.Current);
}
}
}
public static IEnumerable<T> SkipLast<T>(this IEnumerable<T> source)
{
switch (source)
{
case T[] array:
return SkipLast(array);
case ICollection<T> collection:
return SkipLast(collection);
default:
return skipLast();
}
IEnumerable<T> skipLast()
{
using IEnumerator<T> iterator = source.GetEnumerator();
if (!iterator.MoveNext())
yield break;
T previous = iterator.Current;
while (iterator.MoveNext())
{
yield return previous;
previous = iterator.Current;
}
}
}
}

Is there a built-in way to convert IEnumerator to IEnumerable

Is there a built-in way to convert IEnumerator<T> to IEnumerable<T>?
The easiest way of converting I can think of is via the yield statement
public static IEnumerable<T> ToIEnumerable<T>(this IEnumerator<T> enumerator) {
while ( enumerator.MoveNext() ) {
yield return enumerator.Current;
}
}
compared to the list version this has the advantage of not enumerating the entire list before returning an IEnumerable. using the yield statement you'd only iterate over the items you need, whereas using the list version, you'd first iterate over all items in the list and then all the items you need.
for a little more fun you could change it to
public static IEnumerable<K> Select<K,T>(this IEnumerator<T> e,
Func<K,T> selector) {
while ( e.MoveNext() ) {
yield return selector(e.Current);
}
}
you'd then be able to use linq on your enumerator like:
IEnumerator<T> enumerator;
var someList = from item in enumerator
select new classThatTakesTInConstructor(item);
You could use the following which will kinda work.
public class FakeEnumerable<T> : IEnumerable<T> {
private IEnumerator<T> m_enumerator;
public FakeEnumerable(IEnumerator<T> e) {
m_enumerator = e;
}
public IEnumerator<T> GetEnumerator() {
return m_enumerator;
}
// Rest omitted
}
This will get you into trouble though when people expect successive calls to GetEnumerator to return different enumerators vs. the same one. But if it's a one time only use in a very constrained scenario, this could unblock you.
I do suggest though you try and not do this because I think eventually it will come back to haunt you.
A safer option is along the lines Jonathan suggested. You can expend the enumerator and create a List<T> of the remaining items.
public static List<T> SaveRest<T>(this IEnumerator<T> e) {
var list = new List<T>();
while ( e.MoveNext() ) {
list.Add(e.Current);
}
return list;
}
EnumeratorEnumerable<T>
A threadsafe, resettable adaptor from IEnumerator<T> to IEnumerable<T>
I use Enumerator parameters like in C++ forward_iterator concept.
I agree that this can lead to confusion as too many people will indeed assume Enumerators are /like/ Enumerables, but they are not.
However, the confusion is fed by the fact that IEnumerator contains the Reset method. Here is my idea of the most correct implementation. It leverages the implementation of IEnumerator.Reset()
A major difference between an Enumerable and and Enumerator is, that an Enumerable might be able to create several Enumerators simultaneously. This implementation puts a whole lot of work into making sure that this never happens for the EnumeratorEnumerable<T> type. There are two EnumeratorEnumerableModes:
Blocking (meaning that a second caller will simply wait till the first enumeration is completed)
NonBlocking (meaning that a second (concurrent) request for an enumerator simply throws an exception)
Note 1: 74 lines are implementation, 79 lines are testing code :)
Note 2: I didn't refer to any unit testing framework for SO convenience
using System;
using System.Diagnostics;
using System.Linq;
using System.Collections;
using System.Collections.Generic;
using System.Threading;
namespace EnumeratorTests
{
public enum EnumeratorEnumerableMode
{
NonBlocking,
Blocking,
}
public sealed class EnumeratorEnumerable<T> : IEnumerable<T>
{
#region LockingEnumWrapper
public sealed class LockingEnumWrapper : IEnumerator<T>
{
private static readonly HashSet<IEnumerator<T>> BusyTable = new HashSet<IEnumerator<T>>();
private readonly IEnumerator<T> _wrap;
internal LockingEnumWrapper(IEnumerator<T> wrap, EnumeratorEnumerableMode allowBlocking)
{
_wrap = wrap;
if (allowBlocking == EnumeratorEnumerableMode.Blocking)
Monitor.Enter(_wrap);
else if (!Monitor.TryEnter(_wrap))
throw new InvalidOperationException("Thread conflict accessing busy Enumerator") {Source = "LockingEnumWrapper"};
lock (BusyTable)
{
if (BusyTable.Contains(_wrap))
throw new LockRecursionException("Self lock (deadlock) conflict accessing busy Enumerator") { Source = "LockingEnumWrapper" };
BusyTable.Add(_wrap);
}
// always implicit Reset
_wrap.Reset();
}
#region Implementation of IDisposable and IEnumerator
public void Dispose()
{
lock (BusyTable)
BusyTable.Remove(_wrap);
Monitor.Exit(_wrap);
}
public bool MoveNext() { return _wrap.MoveNext(); }
public void Reset() { _wrap.Reset(); }
public T Current { get { return _wrap.Current; } }
object IEnumerator.Current { get { return Current; } }
#endregion
}
#endregion
private readonly IEnumerator<T> _enumerator;
private readonly EnumeratorEnumerableMode _allowBlocking;
public EnumeratorEnumerable(IEnumerator<T> e, EnumeratorEnumerableMode allowBlocking)
{
_enumerator = e;
_allowBlocking = allowBlocking;
}
private LockRecursionPolicy a;
public IEnumerator<T> GetEnumerator()
{
return new LockingEnumWrapper(_enumerator, _allowBlocking);
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
class TestClass
{
private static readonly string World = "hello world\n";
public static void Main(string[] args)
{
var master = World.GetEnumerator();
var nonblocking = new EnumeratorEnumerable<char>(master, EnumeratorEnumerableMode.NonBlocking);
var blocking = new EnumeratorEnumerable<char>(master, EnumeratorEnumerableMode.Blocking);
foreach (var c in nonblocking) Console.Write(c); // OK (implicit Reset())
foreach (var c in blocking) Console.Write(c); // OK (implicit Reset())
foreach (var c in nonblocking) Console.Write(c); // OK (implicit Reset())
foreach (var c in blocking) Console.Write(c); // OK (implicit Reset())
try
{
var willRaiseException = from c1 in nonblocking from c2 in nonblocking select new {c1, c2};
Console.WriteLine("Cartesian product: {0}", willRaiseException.Count()); // RAISE
}
catch (Exception e) { Console.WriteLine(e); }
foreach (var c in nonblocking) Console.Write(c); // OK (implicit Reset())
foreach (var c in blocking) Console.Write(c); // OK (implicit Reset())
try
{
var willSelfLock = from c1 in blocking from c2 in blocking select new { c1, c2 };
Console.WriteLine("Cartesian product: {0}", willSelfLock.Count()); // LOCK
}
catch (Exception e) { Console.WriteLine(e); }
// should not externally throw (exceptions on other threads reported to console)
if (ThreadConflictCombinations(blocking, nonblocking))
throw new InvalidOperationException("Should have thrown an exception on background thread");
if (ThreadConflictCombinations(nonblocking, nonblocking))
throw new InvalidOperationException("Should have thrown an exception on background thread");
if (ThreadConflictCombinations(nonblocking, blocking))
Console.WriteLine("Background thread timed out");
if (ThreadConflictCombinations(blocking, blocking))
Console.WriteLine("Background thread timed out");
Debug.Assert(true); // Must be reached
}
private static bool ThreadConflictCombinations(IEnumerable<char> main, IEnumerable<char> other)
{
try
{
using (main.GetEnumerator())
{
var bg = new Thread(o =>
{
try { other.GetEnumerator(); }
catch (Exception e) { Report(e); }
}) { Name = "background" };
bg.Start();
bool timedOut = !bg.Join(1000); // observe the thread waiting a full second for a lock (or throw the exception for nonblocking)
if (timedOut)
bg.Abort();
return timedOut;
}
} catch
{
throw new InvalidProgramException("Cannot be reached");
}
}
static private readonly object ConsoleSynch = new Object();
private static void Report(Exception e)
{
lock (ConsoleSynch)
Console.WriteLine("Thread:{0}\tException:{1}", Thread.CurrentThread.Name, e);
}
}
}
Note 3: I think the implementation of the thread locking (especially around BusyTable) is quite ugly; However, I didn't want to resort to ReaderWriterLock(LockRecursionPolicy.NoRecursion) and didn't want to assume .Net 4.0 for SpinLock
Solution with use of Factory along with fixing cached IEnumerator issue in JaredPar's answer allows to change the way of enumeration.
Consider a simple example: we want custom List<T> wrapper that allow to enumerate in reverse order along with default enumeration. List<T> already implements IEnumerator for default enumeration, we only need to create IEnumerator that enumerates in reverse order. (We won't use List<T>.AsEnumerable().Reverse() because it enumerates the list twice)
public enum EnumerationType {
Default = 0,
Reverse
}
public class CustomList<T> : IEnumerable<T> {
private readonly List<T> list;
public CustomList(IEnumerable<T> list) => this.list = new List<T>(list);
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
//Default IEnumerable method will return default enumerator factory
public IEnumerator<T> GetEnumerator()
=> GetEnumerable(EnumerationType.Default).GetEnumerator();
public IEnumerable<T> GetEnumerable(EnumerationType enumerationType)
=> enumerationType switch {
EnumerationType.Default => new DefaultEnumeratorFactory(list),
EnumerationType.Reverse => new ReverseEnumeratorFactory(list)
};
//Simple implementation of reverse list enumerator
private class ReverseEnumerator : IEnumerator<T> {
private readonly List<T> list;
private int index;
internal ReverseEnumerator(List<T> list) {
this.list = list;
index = list.Count-1;
Current = default;
}
public void Dispose() { }
public bool MoveNext() {
if(index >= 0) {
Current = list[index];
index--;
return true;
}
Current = default;
return false;
}
public T Current { get; private set; }
object IEnumerator.Current => Current;
void IEnumerator.Reset() {
index = list.Count - 1;
Current = default;
}
}
private abstract class EnumeratorFactory : IEnumerable<T> {
protected readonly List<T> List;
protected EnumeratorFactory(List<T> list) => List = list;
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
public abstract IEnumerator<T> GetEnumerator();
}
private class DefaultEnumeratorFactory : EnumeratorFactory {
public DefaultEnumeratorFactory(List<T> list) : base(list) { }
//Default enumerator is already implemented in List<T>
public override IEnumerator<T> GetEnumerator() => List.GetEnumerator();
}
private class ReverseEnumeratorFactory : EnumeratorFactory {
public ReverseEnumeratorFactory(List<T> list) : base(list) { }
public override IEnumerator<T> GetEnumerator() => new ReverseEnumerator(List);
}
}
As Jason Watts said -- no, not directly.
If you really want to, you could loop through the IEnumerator<T>, putting the items into a List<T>, and return that, but I'm guessing that's not what you're looking to do.
The basic reason you can't go that direction (IEnumerator<T> to a IEnumerable<T>) is that IEnumerable<T> represents a set that can be enumerated, but IEnumerator<T> is a specific enumeratation over a set of items -- you can't turn the specific instance back into the thing that created it.
static class Helper
{
public static List<T> SaveRest<T>(this IEnumerator<T> enumerator)
{
var list = new List<T>();
while (enumerator.MoveNext())
{
list.Add(enumerator.Current);
}
return list;
}
public static ArrayList SaveRest(this IEnumerator enumerator)
{
var list = new ArrayList();
while (enumerator.MoveNext())
{
list.Add(enumerator.Current);
}
return list;
}
}
Nope, IEnumerator<> and IEnumerable<> are different beasts entirely.
This is a variant I have written... The specific is a little different. I wanted to do a MoveNext() on an IEnumerable<T>, check the result, and then roll everything in a new IEnumerator<T> that was "complete" (so that included even the element of the IEnumerable<T> I had already extracted)
// Simple IEnumerable<T> that "uses" an IEnumerator<T> that has
// already received a MoveNext(). "eats" the first MoveNext()
// received, then continues normally. For shortness, both IEnumerable<T>
// and IEnumerator<T> are implemented by the same class. Note that if a
// second call to GetEnumerator() is done, the "real" IEnumerator<T> will
// be returned, not this proxy implementation.
public class EnumerableFromStartedEnumerator<T> : IEnumerable<T>, IEnumerator<T>
{
public readonly IEnumerator<T> Enumerator;
public readonly IEnumerable<T> Enumerable;
// Received by creator. Return value of MoveNext() done by caller
protected bool FirstMoveNextSuccessful { get; set; }
// The Enumerator can be "used" only once, then a new enumerator
// can be requested by Enumerable.GetEnumerator()
// (default = false)
protected bool Used { get; set; }
// The first MoveNext() has been already done (default = false)
protected bool DoneMoveNext { get; set; }
public EnumerableFromStartedEnumerator(IEnumerator<T> enumerator, bool firstMoveNextSuccessful, IEnumerable<T> enumerable)
{
Enumerator = enumerator;
FirstMoveNextSuccessful = firstMoveNextSuccessful;
Enumerable = enumerable;
}
public IEnumerator<T> GetEnumerator()
{
if (Used)
{
return Enumerable.GetEnumerator();
}
Used = true;
return this;
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
public T Current
{
get
{
// There are various school of though on what should
// happens if called before the first MoveNext() or
// after a MoveNext() returns false. We follow the
// "return default(TInner)" school of thought for the
// before first MoveNext() and the "whatever the
// Enumerator wants" for the after a MoveNext() returns
// false
if (!DoneMoveNext)
{
return default(T);
}
return Enumerator.Current;
}
}
public void Dispose()
{
Enumerator.Dispose();
}
object IEnumerator.Current
{
get
{
return Current;
}
}
public bool MoveNext()
{
if (!DoneMoveNext)
{
DoneMoveNext = true;
return FirstMoveNextSuccessful;
}
return Enumerator.MoveNext();
}
public void Reset()
{
// This will 99% throw :-) Not our problem.
Enumerator.Reset();
// So it is improbable we will arrive here
DoneMoveNext = true;
}
}
Use:
var enumerable = someCollection<T>;
var enumerator = enumerable.GetEnumerator();
bool res = enumerator.MoveNext();
// do whatever you want with res/enumerator.Current
var enumerable2 = new EnumerableFromStartedEnumerator<T>(enumerator, res, enumerable);
Now, the first GetEnumerator() that will be requested to enumerable2 will be given through the enumerator enumerator. From the second onward the enumerable.GetEnumerator() will be used.
The other answers here are ... strange. IEnumerable<T> has just one method, GetEnumerator(). And an IEnumerable<T> must implement IEnumerable, which also has just one method, GetEnumerator() (the difference being that one is generic on T and the other is not). So it should be clear how to turn an IEnumerator<T> into an IEnumerable<T>:
// using modern expression-body syntax
public class IEnumeratorToIEnumerable<T> : IEnumerable<T>
{
private readonly IEnumerator<T> Enumerator;
public IEnumeratorToIEnumerable(IEnumerator<T> enumerator) =>
Enumerator = enumerator;
public IEnumerator<T> GetEnumerator() => Enumerator;
IEnumerator IEnumerable.GetEnumerator() => Enumerator;
}
foreach (var foo in new IEnumeratorToIEnumerable<Foo>(fooEnumerator))
DoSomethingWith(foo);
// and you can also do:
var fooEnumerable = new IEnumeratorToIEnumerable<Foo>(fooEnumerator);
foreach (var foo in fooEnumerable)
DoSomethingWith(foo);
// Some IEnumerators automatically repeat after MoveNext() returns false,
// in which case this is a no-op, but generally it's required.
fooEnumerator.Reset();
foreach (var foo in fooEnumerable)
DoSomethingElseWith(foo);
However, none of this should be needed because it's unusual to have an IEnumerator<T> that doesn't come with an IEnumerable<T> that returns an instance of it from its GetEnumerator method. If you're writing your own IEnumerator<T>, you should certainly provide the IEnumerable<T>. And really it's the other way around ... an IEnumerator<T> is intended to be a private class that iterates over instances of a public class that implements IEnumerable<T>.

Categories

Resources