C# Itreator best approach - c#

Apart from (IEnumerable Returns GetEnumerator() ,for "foreach" IEnumerble is essential)
almost the following two approaches allow us to iterate over the collection.What is
the advantage of one over another ? (I am not asking the difference between IEnumerable
and IEnumerator).
static void Main()
{
IEnumerator<int> em = MyEnumerator<int>(new int[] { 1, 2, 3, 4 });
IEnumerator<int> e = Collection<int>
(new int[] { 1, 2, 3, 4 }).GetEnumerator();
while (em.MoveNext())
{
Console.WriteLine(em.Current);
}
while (e.MoveNext())
{
Console.WriteLine(e.Current);
}
Console.ReadKey(true);
}
approach 1
public static IEnumerator<T> MyEnumerator<T>(T[] vals )
{
T[] some = vals;
foreach (var v in some)
{
yield return v;
}
}
approach 2
public static IEnumerable<T> Collection<T>(T[] vals)
{
T[] some = vals;
foreach (var v in some)
{
yield return v;
}
}

The main difference is that most API support an imput of IEnumerable<T> but not of IEnumerator<T>.
You also have to remember to call Reset() when using it while the syntax is more evident in IEnumerable<T> (Just call GetEnumerator again). Also see the comment of Eric Lipper about reset being a bad idea; if Reset isn't implemented in your IEnumerator<T> or is buggy it become a one-time-only enumerator (pretty useless in a lot of cases).
Another difference may be that you could have an IEnumerable<T> that could be enumerated from multiple threads at the same time but an IEnumerator<T> store one position in the enumerated data (Imagine a RandomEnumerable or RangeEnumerable).
So the conclusion is that IEnumerable<T> is more versatile, but anyway if you have a function returning an IEnumerator<T> generating the IEnumerable<T> around it is simple.
class EnumeratorWrapper<T> : IEnumerable<T>
{
Func<IEnumerator<T>> m_generator;
public EnumeratorWrapper(Func<IEnumerator<T>> generator)
{
m_generator = generator;
}
public IEnumerator<T> GetEnumerator()
{
return m_generator();
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return m_generator();
}
}
Just for API consistency using IEnumerable<T> seem the best solution to me.
But the issue with Reset() in IEnumerator<T> make it an unusable solution anyway, so IEnumerable<T> is the way to go.

Correct me if I'm wrong but the only difference is the difference between IEnumerable and IEnumerator and since you specifically said you're not asking the difference, both are a good...

Related

Just when is a stackoverflow fair and sensible?

Code updated
For fixing the bug of a filtered Interminable, the following code is updated and merged into original:
public static bool IsInfinity(this IEnumerable x) {
var it=
x as Infinity??((Func<object>)(() => {
var info=x.GetType().GetField("source", bindingAttr);
return null!=info?info.GetValue(x):x;
}))();
return it is Infinity;
}
bindingAttr is declared a constant.
Summary
I'm trying to implement an infinite enumerable, but encountered something seem to be illogical, and temporarily run out of idea. I need some direction to complete the code, becoming a semantic, logical, and reasonable design.
The whole story
I've asked the question a few hours ago:
Is an infinite enumerable still "enumerable"?
This might not be a good pattern of implementation. What I'm trying to do, is implement an enumerable to present infinity, in a logical and semantic way(I thought ..). I would put the code at the last of this post.
The big problem is, it's just for presenting of infinite enumerable, but the enumeration on it in fact doesn't make any sense, since there are no real elements of it.
So, besides provide dummy elements for the enumeration, there are four options I can imagine, and three lead to the StackOverflowException.
Throw an InvalidOperationException once it's going to be enumerated.
public IEnumerator<T> GetEnumerator() {
for(var message="Attempted to enumerate an infinite enumerable"; ; )
throw new InvalidOperationException(message);
}
and 3. are technically equivalent, let the stack overflowing occurs when it's really overflowed.
public IEnumerator<T> GetEnumerator() {
foreach(var x in this)
yield return x;
}
public IEnumerator<T> GetEnumerator() {
return this.GetEnumerator();
}
(described in 2)
Don't wait for it happens, throw StackOverflowException directly.
public IEnumerator<T> GetEnumerator() {
throw new StackOverflowException("... ");
}
The tricky things are:
If option 1 is applied, that is, enumerate on this enumerable, becomes an invalid operation. Isn't it weird to say that this lamp isn't used to illuminate(though it's true in my case).
If option 2 or option 3 is applied, that is, we planned the stack overflowing. Is it really as the title, just when stackoverflow is fair and sensible? Perfectly logical and reasonable?
The last choice is option 4. However, the stack in fact does not really overflow, since we prevented it by throwing a fake StackOverflowException. This reminds me that when Tom Cruise plays John Anderton said that: "But it didn't fall. You caught it. The fact that you prevented it from happening doesnt change the fact that it was going to happen."
Some good ways to avoid the illogical problems?
The code is compile-able and testable, note that one of OPTION_1 to OPTION_4 shoule be defined before compile.
Simple test
var objects=new object[] { };
Debug.Print("{0}", objects.IsInfinity());
var infObjects=objects.AsInterminable();
Debug.Print("{0}", infObjects.IsInfinity());
Classes
using System.Collections.Generic;
using System.Collections;
using System;
public static partial class Interminable /* extensions */ {
public static Interminable<T> AsInterminable<T>(this IEnumerable<T> x) {
return Infinity.OfType<T>();
}
public static Infinity AsInterminable(this IEnumerable x) {
return Infinity.OfType<object>();
}
public static bool IsInfinity(this IEnumerable x) {
var it=
x as Infinity??((Func<object>)(() => {
var info=x.GetType().GetField("source", bindingAttr);
return null!=info?info.GetValue(x):x;
}))();
return it is Infinity;
}
const BindingFlags bindingAttr=
BindingFlags.Instance|BindingFlags.NonPublic;
}
public abstract partial class Interminable<T>: Infinity, IEnumerable<T> {
IEnumerator IEnumerable.GetEnumerator() {
return this.GetEnumerator();
}
#if OPTION_1
public IEnumerator<T> GetEnumerator() {
for(var message="Attempted to enumerate an infinite enumerable"; ; )
throw new InvalidOperationException(message);
}
#endif
#if OPTION_2
public IEnumerator<T> GetEnumerator() {
foreach(var x in this)
yield return x;
}
#endif
#if OPTION_3
public IEnumerator<T> GetEnumerator() {
return this.GetEnumerator();
}
#endif
#if OPTION_4
public IEnumerator<T> GetEnumerator() {
throw new StackOverflowException("... ");
}
#endif
public Infinity LongCount<U>(
Func<U, bool> predicate=default(Func<U, bool>)) {
return this;
}
public Infinity Count<U>(
Func<U, bool> predicate=default(Func<U, bool>)) {
return this;
}
public Infinity LongCount(
Func<T, bool> predicate=default(Func<T, bool>)) {
return this;
}
public Infinity Count(
Func<T, bool> predicate=default(Func<T, bool>)) {
return this;
}
}
public abstract partial class Infinity: IFormatProvider, ICustomFormatter {
partial class Instance<T>: Interminable<T> {
public static readonly Interminable<T> instance=new Instance<T>();
}
object IFormatProvider.GetFormat(Type formatType) {
return typeof(ICustomFormatter)!=formatType?null:this;
}
String ICustomFormatter.Format(
String format, object arg, IFormatProvider formatProvider) {
return "Infinity";
}
public override String ToString() {
return String.Format(this, "{0}", this);
}
public static Interminable<T> OfType<T>() {
return Instance<T>.instance;
}
}
public IEnumerator<T> GetEnumerator()
{
while (true)
yield return default(T);
}
This will create an infinite enumerator - a foreach on it will never end and will just continue to give out the default value.
Note that you will not be able to determine IsInfinity() the way you wrote in your code. That is because new Infinity().Where(o => o == /*do any kind of comparison*/) will still be infinite but will have a different type.
As mentioned in the other post you linked, an infinite enumeration makes perfectly sense for C# to enumerate and there are an huge amount of real-world examples where people write enumerators that just do never end(first thing that springs off my mind is a random number generator).
So you have a particular case in your mathematical problem, where you need to define a special value (infinite number of points of intersection). Usually, that is where I use simple static constants for. Just define some static constant IEnumerable and test against it to find out whether your algorithm had the "infinite number of intersection" as result.
To more specific answer your current question: DO NOT EVER EVER cause a real stack overflow. This is about the nastiest thing you can do to users of your code. It can not be caught and will immediately terminate your process(probably the only exception is when you are running inside an attached instrumenting debugger).
If at all, I would use NotSupportedException which is used in other places to signal that some class do not support a feature(E.g. ICollections may throw this in Remove() if they are read-only).
If I understand correctly -- infinite is a confusing word here. I think you need a monad which is either enumerable or not. But let's stick with infinite for now.
I cannot think of a nice way of implementing this in C#. All ways this could be implemented don't integrate with C# generators.
With C# generator, you can only emit valid values; so there's no way to indicate that this is an infinite enumerable. I don't like idea of throwing exceptions from generator to indicate that it is infinite; because to check that it is infinite, you will have to to try-catch every time.
If you don't need to support generators, then I see following options :
Implement sentinel enumerable:
public class InfiniteEnumerable<T>: IEnumerable<T> {
private static InfiniteEnumerable<T> val;
public static InfiniteEnumerable<T> Value {
get {
return val;
}
}
public IEnumerator<T> GetEnumerator() {
throw new InvalidOperationException(
"This enumerable cannot be enumerated");
}
IEnumerator IEnumerable.GetEnumerator() {
throw new InvalidOperationException(
"This enumerable cannot be enumerated");
}
}
Sample usage:
IEnumerable<int> enumerable=GetEnumerable();
if(enumerable==InfiniteEnumerable<int>.Value) {
// This is 'infinite' enumerable.
}
else {
// enumerate it here.
}
Implement Infinitable<T> wrapper:
public class Infinitable<T>: IEnumerable<T> {
private IEnumerable<T> enumerable;
private bool isInfinite;
public Infinitable(IEnumerable<T> enumerable) {
this.enumerable=enumerable;
this.isInfinite=false;
}
public Infinitable() {
this.isInfinite=true;
}
public bool IsInfinite {
get {
return isInfinite;
}
}
public IEnumerator<T> GetEnumerator() {
if(isInfinite) {
throw new InvalidOperationException(
"The enumerable cannot be enumerated");
}
return this.enumerable.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator() {
if(isInfinite) {
throw new InvalidOperationException(
"The enumerable cannot be enumerated");
}
return this.enumerable.GetEnumerator();
}
}
Sample usage:
Infinitable<int> enumerable=GetEnumerable();
if(enumerable.IsInfinite) {
// This is 'infinite' enumerable.
}
else {
// enumerate it here.
foreach(var i in enumerable) {
}
}
Infinite sequences may be perfectly iterable/enumerable. Natural numbers are enumerable and so are rational numbers or PI digits. Infinite is the opposite of finite, not enumerable.
The variants that you've provided don't represent the infinite sequences. There are infinitely many different infinite sequences and you can see that they're different by iterating through them. Your idea, on the other hand, is to have a singleton, which goes against that diversity.
If you have something that cannot be enumerated (like the set of real numbers), then you just shouldn't define it as IEnumerable as it's breaking the contract.
If you want to discern between finite and infinite enumerable sequences, just crate a new interface IInfiniteEnumerable : IEnumerable and mark infinite sequences with it.
Interface that marks infinite sequences
public interface IInfiniteEnumerable<T> : IEnumerable<T> {
}
A wrapper to convert an existing IEnumerable<T> to IInfiniteEnumerable<T> (IEnumerables are easily created with C#'s yield syntax, but we need to convert them to IInfiniteEnumerable )
public class InfiniteEnumerableWrapper<T> : IInfiniteEnumerable<T> {
IEnumerable<T> _enumerable;
public InfiniteEnumerableWrapper(IEnumerable<T> enumerable) {
_enumerable = enumerable;
}
public IEnumerator<T> GetEnumerator() {
return _enumerable.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator() {
return _enumerable.GetEnumerator();
}
}
Some infinity-aware routines (like calculating the sequence length)
//TryGetCount() returns null if the sequence is infinite
public static class EnumerableExtensions {
public static int? TryGetCount<T>(this IEnumerable<T> sequence) {
if (sequence is IInfiniteEnumerable<T>) {
return null;
} else {
return sequence.Count();
}
}
}
Two examples of sequences - a finite range sequence and the infinite Fibonacci sequence.
public class Sequences {
public static IEnumerable<int> GetIntegerRange(int start, int count) {
return Enumerable.Range(start, count);
}
public static IInfiniteEnumerable<int> GetFibonacciSequence() {
return new InfiniteEnumerableWrapper<int>(GetFibonacciSequenceInternal());
}
static IEnumerable<int> GetFibonacciSequenceInternal() {
var p = 0;
var q = 1;
while (true) {
yield return p;
var newQ = p + q;
p = q;
q = newQ;
}
}
}
A test app that generates random sequences and tries to calculate their lengths.
public class TestApp {
public static void Main() {
for (int i = 0; i < 20; i++) {
IEnumerable<int> sequence = GetRandomSequence();
Console.WriteLine(sequence.TryGetCount() ?? double.PositiveInfinity);
}
Console.ReadLine();
}
static Random _rng = new Random();
//Randomly generates an finite or infinite sequence
public static IEnumerable<int> GetRandomSequence() {
int random = _rng.Next(5) * 10;
if (random == 0) {
return Sequences.GetFibonacciSequence();
} else {
return Sequences.GetIntegerRange(0, random);
}
}
}
The program output something like this:
20
40
20
10
20
10
20
Infinity
40
30
40
Infinity
Infinity
40
40
30
20
30
40
30

Is there a way to Memorize or Materialize an IEnumerable?

When given an d you could be dealing with a fixed sequence like a list or array, an AST that will enumerate some external datasource, or even an AST on some existing collection. Is there a way to safely "materialize" the enumerable so that enumeration operations like foreach, count, etc. don't execute the AST each time?
I've often used .ToArray() to create this represenation but if the underlying storage is already a list or other fixed sequence, that seems like wasted copying. It would be nice if i could do
var enumerable = someEnumerable.Materialize();
if(enumberable.Any() {
foreach(var item in enumerable) {
...
}
} else {
...
}
Without having to worry that .Any() and foreach try to enumerate the sequence twice and without it unccessarily copying the enumerable.
Easy enough:
public static IList<TSource> Materialize<TSource>(this IEnumerable<TSource> source)
{
if (source is IList<TSource>)
{
// Already a list, use it as is
return (IList<TSource>)source;
}
else
{
// Not a list, materialize it to a list
return source.ToList();
}
}
Original answer:
Same as Thomas's answer, just a bit better according to me:
public static ICollection<T> Materialize<T>(this IEnumerable<T> source)
{
// Null check...
return source as ICollection<T> ?? source.ToList();
}
Please note that this tend to return the existing collection itself if its a valid collection type, or produces a new collection otherwise. While the two are subtly different, I don't think it could be an issue.
Edit:
Today this is a better solution:
public static IReadOnlyCollection<T> Materialize<T>(this IEnumerable<T> source)
{
// Null check...
switch (source)
{
case ICollection<T> collection:
return new ReadOnlyCollectionAdapter<T>(collection);
case IReadOnlyCollection<T> readOnlyCollection:
return readOnlyCollection;
default:
return source.ToList();
}
}
public class ReadOnlyCollectionAdapter<T> : IReadOnlyCollection<T>
{
readonly ICollection<T> m_source;
public ReadOnlyCollectionAdapter(ICollection<T> source) => m_source = source;
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
public int Count => m_source.Count;
public IEnumerator<T> GetEnumerator() => m_source.GetEnumerator();
}
Check out this blog post I wrote a couple of years ago: http://www.fallingcanbedeadly.com/posts/crazy-extention-methods-tolazylist
In it, I define a method called ToLazyList that effectively does what you're looking for.
As written, it will eventually make a full copy of the input sequence, although you could tweak it so that instances of IList don't get wrapped in a LazyList, which would prevent this from happening (this action, however, would carry with it the assumption that any IList you get is already effectively memoized).

Grouping consecutive identical items: IEnumerable<T> to IEnumerable<IEnumerable<T>>

I've got an interresting problem: Given an IEnumerable<string>, is it possible to yield a sequence of IEnumerable<IEnumerable<string>> that groups identical adjacent strings in one pass?
Let me explain.
1. Basic illustrative sample :
Considering the following IEnumerable<string> (pseudo representation):
{"a","b","b","b","c","c","d"}
How to get an IEnumerable<IEnumerable<string>> that would yield something of the form:
{ // IEnumerable<IEnumerable<string>>
{"a"}, // IEnumerable<string>
{"b","b","b"}, // IEnumerable<string>
{"c","c"}, // IEnumerable<string>
{"d"} // IEnumerable<string>
}
The method prototype would be:
public IEnumerable<IEnumerable<string>> Group(IEnumerable<string> items)
{
// todo
}
But it could also be :
public void Group(IEnumerable<string> items, Action<IEnumerable<string>> action)
{
// todo
}
...where action would be called for each subsequence.
2. More complicated sample
Ok, the first sample is very simple, and only aims to make the high level intent clear.
Now imagine we are dealing with IEnumerable<Anything>, where Anything is a type defined like this:
public class Anything
{
public string Key {get;set;}
public double Value {get;set;}
}
We now want to generate the subsequences based on the Key, (group every consecutive Anything that have the same key) to later use them in order to calculate the total value by group:
public void Compute(IEnumerable<Anything> items)
{
Console.WriteLine(items.Sum(i=>i.Value));
}
// then somewhere, assuming the Group method
// that returns an IEnumerable<IEnumerable<Anything>> actually exists:
foreach(var subsequence in Group(allItems))
{
Compute(subsequence);
}
3. Important notes
Only one iteration over the original sequence
No intermediary collections allocations (we can assume millions of items in the original sequence, and millions consecutives items in each group)
Keeping enumerators and defered execution behavior
We can assume that resulting subsequences will be iterated only once, and will be iterated in order.
Is it possible, and how would you write it?
Is this what you are looking for?
Iterate list only once.
Defer execution.
No intermediate collections (my other post failed on this criterion).
This solution relies on object state because it's difficult to share state between two IEnumerable methods that use yield (no ref or out params).
internal class Program
{
static void Main(string[] args)
{
var result = new[] { "a", "b", "b", "b", "c", "c", "d" }.Partition();
foreach (var r in result)
{
Console.WriteLine("Group".PadRight(16, '='));
foreach (var s in r)
Console.WriteLine(s);
}
}
}
internal static class PartitionExtension
{
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> src)
{
var grouper = new DuplicateGrouper<T>();
return grouper.GroupByDuplicate(src);
}
}
internal class DuplicateGrouper<T>
{
T CurrentKey;
IEnumerator<T> Itr;
bool More;
public IEnumerable<IEnumerable<T>> GroupByDuplicate(IEnumerable<T> src)
{
using(Itr = src.GetEnumerator())
{
More = Itr.MoveNext();
while (More)
yield return GetDuplicates();
}
}
IEnumerable<T> GetDuplicates()
{
CurrentKey = Itr.Current;
while (More && CurrentKey.Equals(Itr.Current))
{
yield return Itr.Current;
More = Itr.MoveNext();
}
}
}
Edit: Added extension method for cleaner usage. Fixed loop test logic so that "More" is evaluated first.
Edit: Dispose the enumerator when finished
Way Better Solution That Meets All Requirements
OK, scrap my previous solution (I'll leave it below, just for reference). Here's a much better approach that occurred to me after making my initial post.
Write a new class that implements IEnumerator<T> and provides a few additional properties: IsValid and Previous. This is all you really need to resolve the whole mess with having to maintain state inside an iterator block using yield.
Here's how I did it (pretty trivial, as you can see):
internal class ChipmunkEnumerator<T> : IEnumerator<T> {
private readonly IEnumerator<T> _internal;
private T _previous;
private bool _isValid;
public ChipmunkEnumerator(IEnumerator<T> e) {
_internal = e;
_isValid = false;
}
public bool IsValid {
get { return _isValid; }
}
public T Previous {
get { return _previous; }
}
public T Current {
get { return _internal.Current; }
}
public bool MoveNext() {
if (_isValid)
_previous = _internal.Current;
return (_isValid = _internal.MoveNext());
}
public void Dispose() {
_internal.Dispose();
}
#region Explicit Interface Members
object System.Collections.IEnumerator.Current {
get { return Current; }
}
void System.Collections.IEnumerator.Reset() {
_internal.Reset();
_previous = default(T);
_isValid = false;
}
#endregion
}
(I called this a ChipmunkEnumerator because maintaining the previous value reminded me of how chipmunks have pouches in their cheeks where they keep nuts. Does it really matter? Stop making fun of me.)
Now, utilizing this class in an extension method to provide exactly the behavior you want isn't so tough!
Notice that below I've defined GroupConsecutive to actually return an IEnumerable<IGrouping<TKey, T>> for the simple reason that, if these are grouped by key anyway, it makes sense to return an IGrouping<TKey, T> rather than just an IEnumerable<T>. As it turns out, this will help us out later anyway...
public static IEnumerable<IGrouping<TKey, T>> GroupConsecutive<T, TKey>(this IEnumerable<T> source, Func<T, TKey> keySelector)
where TKey : IEquatable<TKey> {
using (var e = new ChipmunkEnumerator<T>(source.GetEnumerator())) {
if (!e.MoveNext())
yield break;
while (e.IsValid) {
yield return e.GetNextDuplicateGroup(keySelector);
}
}
}
public static IEnumerable<IGrouping<T, T>> GroupConsecutive<T>(this IEnumerable<T> source)
where T : IEquatable<T> {
return source.GroupConsecutive(x => x);
}
private static IGrouping<TKey, T> GetNextDuplicateGroup<T, TKey>(this ChipmunkEnumerator<T> e, Func<T, TKey> keySelector)
where TKey : IEquatable<TKey> {
return new Grouping<TKey, T>(keySelector(e.Current), e.EnumerateNextDuplicateGroup(keySelector));
}
private static IEnumerable<T> EnumerateNextDuplicateGroup<T, TKey>(this ChipmunkEnumerator<T> e, Func<T, TKey> keySelector)
where TKey : IEquatable<TKey> {
do {
yield return e.Current;
} while (e.MoveNext() && keySelector(e.Previous).Equals(keySelector(e.Current)));
}
(To implement these methods, I wrote a simple Grouping<TKey, T> class that implements IGrouping<TKey, T> in the most straightforward way possible. I've omitted the code just so as to keep moving along...)
OK, check it out. I think the code example below pretty well captures something resembling the more realistic scenario you described in your updated question.
var entries = new List<KeyValuePair<string, int>> {
new KeyValuePair<string, int>( "Dan", 10 ),
new KeyValuePair<string, int>( "Bill", 12 ),
new KeyValuePair<string, int>( "Dan", 14 ),
new KeyValuePair<string, int>( "Dan", 20 ),
new KeyValuePair<string, int>( "John", 1 ),
new KeyValuePair<string, int>( "John", 2 ),
new KeyValuePair<string, int>( "Bill", 5 )
};
var dupeGroups = entries
.GroupConsecutive(entry => entry.Key);
foreach (var dupeGroup in dupeGroups) {
Console.WriteLine(
"Key: {0} Sum: {1}",
dupeGroup.Key.PadRight(5),
dupeGroup.Select(entry => entry.Value).Sum()
);
}
Output:
Key: Dan Sum: 10
Key: Bill Sum: 12
Key: Dan Sum: 34
Key: John Sum: 3
Key: Bill Sum: 5
Notice this also fixes the problem with my original answer of dealing with IEnumerator<T> objects that were value types. (With this approach, it doesn't matter.)
There's still going to be a problem if you try calling ToList here, as you will find out if you try it. But considering you included deferred execution as a requirement, I doubt you would be doing that anyway. For a foreach, it works.
Original, Messy, and Somewhat Stupid Solution
Something tells me I'm going to get totally refuted for saying this, but...
Yes, it is possible (I think). See below for a damn messy solution I threw together. (Catches an exception to know when it's finished, so you know it's a great design!)
Now, Jon's point about there being a very real problem in the event that you try to do, for instance, ToList, and then access the values in the resulting list by index, is totally valid. But if your only intention here is to be able to loop over an IEnumerable<T> using a foreach -- and you're only doing this in your own code -- then, well, I think this could work for you.
Anyway, here's a quick example of how it works:
var ints = new int[] { 1, 3, 3, 4, 4, 4, 5, 2, 3, 1, 6, 6, 6, 5, 7, 7, 8 };
var dupeGroups = ints.GroupConsecutiveDuplicates(EqualityComparer<int>.Default);
foreach (var dupeGroup in dupeGroups) {
Console.WriteLine(
"New dupe group: " +
string.Join(", ", dupeGroup.Select(i => i.ToString()).ToArray())
);
}
Output:
New dupe group: 1
New dupe group: 3, 3
New dupe group: 4, 4, 4
New dupe group: 5
New dupe group: 2
New dupe group: 3
New dupe group: 1
New dupe group: 6, 6, 6
New dupe group: 5
New dupe group: 7, 7
New dupe group: 8
And now for the (messy as crap) code:
Note that since this approach requires passing the actual enumerator around between a few different methods, it will not work if that enumerator is a value type, as calls to MoveNext in one method are only affecting a local copy.
public static IEnumerable<IEnumerable<T>> GroupConsecutiveDuplicates<T>(this IEnumerable<T> source, IEqualityComparer<T> comparer) {
using (var e = source.GetEnumerator()) {
if (e.GetType().IsValueType)
throw new ArgumentException(
"This method will not work on a value type enumerator."
);
// get the ball rolling
if (!e.MoveNext()) {
yield break;
}
IEnumerable<T> nextDuplicateGroup;
while (e.FindMoreDuplicates(comparer, out nextDuplicateGroup)) {
yield return nextDuplicateGroup;
}
}
}
private static bool FindMoreDuplicates<T>(this IEnumerator<T> enumerator, IEqualityComparer<T> comparer, out IEnumerable<T> duplicates) {
duplicates = enumerator.GetMoreDuplicates(comparer);
return duplicates != null;
}
private static IEnumerable<T> GetMoreDuplicates<T>(this IEnumerator<T> enumerator, IEqualityComparer<T> comparer) {
try {
if (enumerator.Current != null)
return enumerator.GetMoreDuplicatesInner(comparer);
else
return null;
} catch (InvalidOperationException) {
return null;
}
}
private static IEnumerable<T> GetMoreDuplicatesInner<T>(this IEnumerator<T> enumerator, IEqualityComparer<T> comparer) {
while (enumerator.Current != null) {
var current = enumerator.Current;
yield return current;
if (!enumerator.MoveNext())
break;
if (!comparer.Equals(current, enumerator.Current))
break;
}
}
Your second bullet is the problematic one. Here's why:
var groups = CallMagicGetGroupsMethod().ToList();
foreach (string x in groups[3])
{
...
}
foreach (string x in groups[0])
{
...
}
Here, it's trying to iterate over the fourth group and then the first group... that's clearly only going to work if all the groups are buffered or it can reread the sequence, neither of which is ideal.
I suspect you want a more "reactive" approach - I don't know offhand whether Reactive Extensions does what you want (the "consecutive" requirement is unusual) but you should basically provide some sort of action to be executed on each group... that way the method won't need to worry about having to return you something which could be used later on, after it's already finished reading.
Let me know if you'd like me to try to find a solution within Rx, or whether you would be happy with something like:
void GroupConsecutive(IEnumerable<string> items,
Action<IEnumerable<string>> action)
Here's a solution that I think satisfies your requirements, works with any type of data item, and is quite short and readable:
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> list)
{
var current = list.FirstOrDefault();
while (!Equals(current, default(T))) {
var cur = current;
Func<T, bool> equalsCurrent = item => item.Equals(cur);
yield return list.TakeWhile(equalsCurrent);
list = list.SkipWhile(equalsCurrent);
current = list.FirstOrDefault();
}
}
Notes:
Deferred execution is there (both TakeWhile and SkipWhile do it).
I think this iterates over the entire collection only once (with SkipWhile); it does iterate over the collection once more when you process the returned IEnumerables, but the partitioning itself iterates only once.
If you don't care about value types, you can add a constraint and change the while condition to a test for null.
If I am somehow mistaken, I 'd be especially interested in comments pointing out the mistakes!
Very Important Aside:
This solution will not allow you to enumerate the produced enumerables in any order other than the one it provides them in. However, I think the original poster has been pretty clear in comments that this is not a problem.

Passing a single item as IEnumerable<T>

Is there a common way to pass a single item of type T to a method which expects an IEnumerable<T> parameter? Language is C#, framework version 2.0.
Currently I am using a helper method (it's .Net 2.0, so I have a whole bunch of casting/projecting helper methods similar to LINQ), but this just seems silly:
public static class IEnumerableExt
{
// usage: IEnumerableExt.FromSingleItem(someObject);
public static IEnumerable<T> FromSingleItem<T>(T item)
{
yield return item;
}
}
Other way would of course be to create and populate a List<T> or an Array and pass it instead of IEnumerable<T>.
[Edit] As an extension method it might be named:
public static class IEnumerableExt
{
// usage: someObject.SingleItemAsEnumerable();
public static IEnumerable<T> SingleItemAsEnumerable<T>(this T item)
{
yield return item;
}
}
Am I missing something here?
[Edit2] We found someObject.Yield() (as #Peter suggested in the comments below) to be the best name for this extension method, mainly for brevity, so here it is along with the XML comment if anyone wants to grab it:
public static class IEnumerableExt
{
/// <summary>
/// Wraps this object instance into an IEnumerable<T>
/// consisting of a single item.
/// </summary>
/// <typeparam name="T"> Type of the object. </typeparam>
/// <param name="item"> The instance that will be wrapped. </param>
/// <returns> An IEnumerable<T> consisting of a single item. </returns>
public static IEnumerable<T> Yield<T>(this T item)
{
yield return item;
}
}
Well, if the method expects an IEnumerable you've got to pass something that is a list, even if it contains one element only.
passing
new[] { item }
as the argument should be enough I think
In C# 3.0 you can utilize the System.Linq.Enumerable class:
// using System.Linq
Enumerable.Repeat(item, 1);
This will create a new IEnumerable that only contains your item.
Your helper method is the cleanest way to do it, IMO. If you pass in a list or an array, then an unscrupulous piece of code could cast it and change the contents, leading to odd behaviour in some situations. You could use a read-only collection, but that's likely to involve even more wrapping. I think your solution is as neat as it gets.
In C# 3 (I know you said 2), you can write a generic extension method which might make the syntax a little more acceptable:
static class IEnumerableExtensions
{
public static IEnumerable<T> ToEnumerable<T>(this T item)
{
yield return item;
}
}
client code is then item.ToEnumerable().
This helper method works for item or many.
public static IEnumerable<T> ToEnumerable<T>(params T[] items)
{
return items;
}
I'm kind of surprised that no one suggested a new overload of the method with an argument of type T to simplify the client API.
public void DoSomething<T>(IEnumerable<T> list)
{
// Do Something
}
public void DoSomething<T>(T item)
{
DoSomething(new T[] { item });
}
Now your client code can just do this:
MyItem item = new MyItem();
Obj.DoSomething(item);
or with a list:
List<MyItem> itemList = new List<MyItem>();
Obj.DoSomething(itemList);
Either (as has previously been said)
MyMethodThatExpectsAnIEnumerable(new[] { myObject });
or
MyMethodThatExpectsAnIEnumerable(Enumerable.Repeat(myObject, 1));
As a side note, the last version can also be nice if you want an empty list of an anonymous object, e.g.
var x = MyMethodThatExpectsAnIEnumerable(Enumerable.Repeat(new { a = 0, b = "x" }, 0));
I agree with #EarthEngine's comments to the original post, which is that 'AsSingleton' is a better name. See this wikipedia entry. Then it follows from the definition of singleton that if a null value is passed as an argument that 'AsSingleton' should return an IEnumerable with a single null value instead of an empty IEnumerable which would settle the if (item == null) yield break; debate. I think the best solution is to have two methods: 'AsSingleton' and 'AsSingletonOrEmpty'; where, in the event that a null is passed as an argument, 'AsSingleton' will return a single null value and 'AsSingletonOrEmpty' will return an empty IEnumerable. Like this:
public static IEnumerable<T> AsSingletonOrEmpty<T>(this T source)
{
if (source == null)
{
yield break;
}
else
{
yield return source;
}
}
public static IEnumerable<T> AsSingleton<T>(this T source)
{
yield return source;
}
Then, these would, more or less, be analogous to the 'First' and 'FirstOrDefault' extension methods on IEnumerable which just feels right.
This is 30% faster than yield or Enumerable.Repeat when used in foreach due to this C# compiler optimization, and of the same performance in other cases.
public struct SingleSequence<T> : IEnumerable<T> {
public struct SingleEnumerator : IEnumerator<T> {
private readonly SingleSequence<T> _parent;
private bool _couldMove;
public SingleEnumerator(ref SingleSequence<T> parent) {
_parent = parent;
_couldMove = true;
}
public T Current => _parent._value;
object IEnumerator.Current => Current;
public void Dispose() { }
public bool MoveNext() {
if (!_couldMove) return false;
_couldMove = false;
return true;
}
public void Reset() {
_couldMove = true;
}
}
private readonly T _value;
public SingleSequence(T value) {
_value = value;
}
public IEnumerator<T> GetEnumerator() {
return new SingleEnumerator(ref this);
}
IEnumerator IEnumerable.GetEnumerator() {
return new SingleEnumerator(ref this);
}
}
in this test:
// Fastest among seqs, but still 30x times slower than direct sum
// 49 mops vs 37 mops for yield, or c.30% faster
[Test]
public void SingleSequenceStructForEach() {
var sw = new Stopwatch();
sw.Start();
long sum = 0;
for (var i = 0; i < 100000000; i++) {
foreach (var single in new SingleSequence<int>(i)) {
sum += single;
}
}
sw.Stop();
Console.WriteLine($"Elapsed {sw.ElapsedMilliseconds}");
Console.WriteLine($"Mops {100000.0 / sw.ElapsedMilliseconds * 1.0}");
}
As I have just found, and seen that user LukeH suggested too, a nice simple way of doing this is as follows:
public static void PerformAction(params YourType[] items)
{
// Forward call to IEnumerable overload
PerformAction(items.AsEnumerable());
}
public static void PerformAction(IEnumerable<YourType> items)
{
foreach (YourType item in items)
{
// Do stuff
}
}
This pattern will allow you to call the same functionality in a multitude of ways: a single item; multiple items (comma-separated); an array; a list; an enumeration, etc.
I'm not 100% sure on the efficiency of using the AsEnumerable method though, but it does work a treat.
Update: The AsEnumerable function looks pretty efficient! (reference)
Although it's overkill for one method, I believe some people may find the Interactive Extensions useful.
The Interactive Extensions (Ix) from Microsoft includes the following method.
public static IEnumerable<TResult> Return<TResult>(TResult value)
{
yield return value;
}
Which can be utilized like so:
var result = EnumerableEx.Return(0);
Ix adds new functionality not found in the original Linq extension methods, and is a direct result of creating the Reactive Extensions (Rx).
Think, Linq Extension Methods + Ix = Rx for IEnumerable.
You can find both Rx and Ix on CodePlex.
I recently asked the same thing on another post
Is there a way to call a C# method requiring an IEnumerable<T> with a single value? ...with benchmarking.
I wanted people stopping by here to see the brief benchmark comparison shown at that newer post for 4 of the approaches presented in these answers.
It seems that simply writing new[] { x } in the arguments to the method is the shortest and fastest solution.
This may not be any better but it's kind of cool:
Enumerable.Range(0, 1).Select(i => item);
Sometimes I do this, when I'm feeling impish:
"_".Select(_ => 3.14) // or whatever; any type is fine
This is the same thing with less shift key presses, heh:
from _ in "_" select 3.14
For a utility function I find this to be the least verbose, or at least more self-documenting than an array, although it'll let multiple values slide; as a plus it can be defined as a local function:
static IEnumerable<T> Enumerate (params T[] v) => v;
// usage:
IEnumerable<double> example = Enumerate(1.234);
Here are all of the other ways I was able to think of (runnable here):
using System;
using System.Collections.Generic;
using System.Linq;
public class Program {
public static IEnumerable<T> ToEnumerable1 <T> (T v) {
yield return v;
}
public static T[] ToEnumerable2 <T> (params T[] vs) => vs;
public static void Main () {
static IEnumerable<T> ToEnumerable3 <T> (params T[] v) => v;
p( new string[] { "three" } );
p( new List<string> { "three" } );
p( ToEnumerable1("three") ); // our utility function (yield return)
p( ToEnumerable2("three") ); // our utility function (params)
p( ToEnumerable3("three") ); // our local utility function (params)
p( Enumerable.Empty<string>().Append("three") );
p( Enumerable.Empty<string>().DefaultIfEmpty("three") );
p( Enumerable.Empty<string>().Prepend("three") );
p( Enumerable.Range(3, 1) ); // only for int
p( Enumerable.Range(0, 1).Select(_ => "three") );
p( Enumerable.Repeat("three", 1) );
p( "_".Select(_ => "three") ); // doesn't have to be "_"; just any one character
p( "_".Select(_ => 3.3333) );
p( from _ in "_" select 3.0f );
p( "a" ); // only for char
// these weren't available for me to test (might not even be valid):
// new Microsoft.Extensions.Primitives.StringValues("three")
}
static void p <T> (IEnumerable<T> e) =>
Console.WriteLine(string.Join(' ', e.Select((v, k) => $"[{k}]={v,-8}:{v.GetType()}").DefaultIfEmpty("<empty>")));
}
For those wondering about performance, while #mattica has provided some benchmarking information in a similar question referenced above, My benchmark tests, however, have provided a different result.
In .NET 7, yield return value is ~9% faster than new T[] { value } and allocates 75% the amount of memory. In most cases, this is already hyper-performant and is as good as you'll ever need.
I was curious if a custom single collection implementation would be faster or more lightweight. It turns out because yield return is implemented as IEnumerator<T> and IEnumerable<T>, the only way to beat it in terms of allocation is to do that in my implementation as well.
If you're passing IEnumerable<> to an outside library, I would strongly recommend not doing this unless you're very familiar with what you're building. That being said, I made a very simple (not-reuse-safe) implementation which was able to beat the yield method by 5ns and allocated only half as much as the array.
Because all tests were passed an IEnumerable<T>, value types generally performed worse than reference types. The best implementation I had was actually the simplest - you can look at the SingleCollection class in the gist I linked to. (This was 2ns faster than yield return, but allocated 88% of what the array would, compared to the 75% allocated for yield return.)
TL:DR; if you care about speed, use yield return item. If you really care about speed, use a SingleCollection.
The easiest way I'd say would be new T[]{item};; there's no syntax to do this. The closest equivalent that I can think of is the params keyword, but of course that requires you to have access to the method definition and is only usable with arrays.
Enumerable.Range(1,1).Select(_ => {
//Do some stuff... side effects...
return item;
});
The above code is useful when using like
var existingOrNewObject = MyData.Where(myCondition)
.Concat(Enumerable.Range(1,1).Select(_ => {
//Create my object...
return item;
})).Take(1).First();
In the above code snippet there is no empty/null check, and it is guaranteed to have only one object returned without afraid of exceptions. Furthermore, because it is lazy, the closure will not be executed until it is proved there is no existing data fits the criteria.
To be filed under "Not necessarily a good solution, but still...a solution" or "Stupid LINQ tricks", you could combine Enumerable.Empty<>() with Enumerable.Append<>()...
IEnumerable<string> singleElementEnumerable = Enumerable.Empty<string>().Append("Hello, World!");
...or Enumerable.Prepend<>()...
IEnumerable<string> singleElementEnumerable = Enumerable.Empty<string>().Prepend("Hello, World!");
The latter two methods are available since .NET Framework 4.7.1 and .NET Core 1.0.
This is a workable solution if one were really intent on using existing methods instead of writing their own, though I'm undecided if this is more or less clear than the Enumerable.Repeat<>() solution. This is definitely longer code (partly due to type parameter inference not being possible for Empty<>()) and creates twice as many enumerator objects, however.
Rounding out this "Did you know these methods exist?" answer, Array.Empty<>() could be substituted for Enumerable.Empty<>(), but it's hard to argue that makes the situation...better.
I'm a bit late to the party but I'll share my way anyway.
My problem was that I wanted to bind the ItemSource or a WPF TreeView to a single object. The hierarchy looks like this:
Project > Plot(s) > Room(s)
There was always going to be only one Project but I still wanted to Show the project in the Tree, without having to pass a Collection with only that one object in it like some suggested.
Since you can only pass IEnumerable objects as ItemSource I decided to make my class IEnumerable:
public class ProjectClass : IEnumerable<ProjectClass>
{
private readonly SingleItemEnumerator<AufmassProjekt> enumerator;
...
public IEnumerator<ProjectClass > GetEnumerator() => this.enumerator;
IEnumerator IEnumerable.GetEnumerator() => this.GetEnumerator();
}
And create my own Enumerator accordingly:
public class SingleItemEnumerator : IEnumerator
{
private bool hasMovedOnce;
public SingleItemEnumerator(object current)
{
this.Current = current;
}
public bool MoveNext()
{
if (this.hasMovedOnce) return false;
this.hasMovedOnce = true;
return true;
}
public void Reset()
{ }
public object Current { get; }
}
public class SingleItemEnumerator<T> : IEnumerator<T>
{
private bool hasMovedOnce;
public SingleItemEnumerator(T current)
{
this.Current = current;
}
public void Dispose() => (this.Current as IDisposable).Dispose();
public bool MoveNext()
{
if (this.hasMovedOnce) return false;
this.hasMovedOnce = true;
return true;
}
public void Reset()
{ }
public T Current { get; }
object IEnumerator.Current => this.Current;
}
This is probably not the "cleanest" solution but it worked for me.
EDIT
To uphold the single responsibility principle as #Groo pointed out I created a new wrapper class:
public class SingleItemWrapper : IEnumerable
{
private readonly SingleItemEnumerator enumerator;
public SingleItemWrapper(object item)
{
this.enumerator = new SingleItemEnumerator(item);
}
public object Item => this.enumerator.Current;
public IEnumerator GetEnumerator() => this.enumerator;
}
public class SingleItemWrapper<T> : IEnumerable<T>
{
private readonly SingleItemEnumerator<T> enumerator;
public SingleItemWrapper(T item)
{
this.enumerator = new SingleItemEnumerator<T>(item);
}
public T Item => this.enumerator.Current;
public IEnumerator<T> GetEnumerator() => this.enumerator;
IEnumerator IEnumerable.GetEnumerator() => this.GetEnumerator();
}
Which I used like this
TreeView.ItemSource = new SingleItemWrapper(itemToWrap);
EDIT 2
I corrected a mistake with the MoveNext() method.
I prefer
public static IEnumerable<T> Collect<T>(this T item, params T[] otherItems)
{
yield return item;
foreach (var otherItem in otherItems)
{
yield return otherItem;
}
}
This lets you call item.Collect() if you want the singleton, but it also lets you call item.Collect(item2, item3) if you want

What is the best way to convert an IEnumerator to a generic IEnumerator?

I am writing a custom ConfigurationElementCollection for a custom ConfigurationHandler in C#.NET 3.5 and I am wanting to expose the IEnumerator as a generic IEnumerator.
What would be the best way to achieve this?
I am currently using the code:
public new IEnumerator<GenericObject> GetEnumerator()
{
var list = new List();
var baseEnum = base.GetEnumerator();
while(baseEnum.MoveNext())
{
var obj = baseEnum.Current as GenericObject;
if (obj != null)
list.Add(obj);
}
return list.GetEnumerator();
}
Cheers
I don't believe there's anything in the framework, but you could easily write one:
IEnumerator<T> Cast<T>(IEnumerator iterator)
{
while (iterator.MoveNext())
{
yield return (T) iterator.Current;
}
}
It's tempting to just call Enumerable.Cast<T> from LINQ and then call GetEnumerator() on the result - but if your class already implements IEnumerable<T> and T is a value type, that acts as a no-op, so the GetEnumerator() call recurses and throws a StackOverflowException. It's safe to use return foo.Cast<T>.GetEnumerator(); when foo is definitely a different object (which doesn't delegate back to this one) but otherwise, you're probably best off using the code above.
IEnumerable<T> already derives from IEnumerable so there's no need to do any conversion. You can simply cast to it...well actually it's implicit no cast necessary.
IEnumerable<T> enumerable = GetGenericFromSomewhere();
IEnumerable sadOldEnumerable = enumerable;
return sadOldEnumerable.GetEnumerator();
Going the other way round isn't much more difficult with LINQ:
var fancyEnumerable = list.OfType<GenericObject>();
return fancyEnumerable.GetEnumerator();
You can use OfType<T> and Cast<T>.
public static IEnumerable Digits()
{
return new[]{1, 15, 68, 1235, 12390, 1239};
}
var enumerable = Digits().OfType<int>();
foreach (var item in enumerable)
// var is here an int. Without the OfType<int(), it would be an object
Console.WriteLine(i);
To get an IEnumerator<T> instead of an IEnumerable<T> you can just make a call to GetEnumerator()
var enumerator = Digits().OfType<int>().GetEnumerator();
I was running into the same Stack Overflow problem mentioned is some of the comments. In my case it was due to the fact that the GetEnumerator call needed to be to the base.GetEnumerator otherwise you loop within your own GetEnumerator redefinition.
This is the code that was Throwing the Stack Overflow. The use of the foreach statement call the same GetEnumerator function I'm trying to overload:
public new IEnumerator<T> GetEnumerator()
{
foreach (T type in this)
{
yield return type;
}
}
I've ended up with a simplified version of the original post as you don't need to use a List holder.
public class ElementCollection<T> : ConfigurationElementCollection, IList<T>
...
public new IEnumerator<T> GetEnumerator()
{
var baseEnum = base.GetEnumerator();
while (baseEnum.MoveNext())
{
yield return baseEnum.Current as T;
}
}
...
}
This works for me.
IEnumerator<DataColumn> columnEnumerator = dt.Columns.Cast<DataColumn>().GetEnumerator();

Categories

Resources