Foreach calls MoveNext in Enumerator first, Current second

Foreach calls MoveNext in Enumerator first, Current second - c#

My issue is basically in order of the functions/property called. I have a custom linked list, circular one. So I made a custom enumenator and all. The issue is, the foreach cycle actually calls the MoveNext() method of enumerator first, therefore moving from the actual first Node of the cycle to the second node, which is kind of bad if you want to have your items in actual order.
Question is, am I doing something wrong, and if not, how to compensate for this?
Code of Enumerator is simple as can be. This, basically:
class EnumeratorLinkedList : IEnumerator<Node>
{
private Node current;
private Node first;
private bool didWeMove;
public EnumeratorSpojovySeznam(Node current)
{
this.current = current;
this.first = current;
didWeMove = false;
}
public Node Current => current;
object System.Collections.IEnumerator.Current => Current;
public bool MoveNext()
{
if ((didWeMove == true && current == first)) return false;
current = current.Next;
didWeMove = true;
return true;
}
public void Dispose()
{
}
public void Reset()
{
throw new NotImplementedException();
}
}

Do you consider using C# iterators to implement enumerator for your circular LinkedList? If you do, it is very simple. Iterators provide convenient and easy way to implement enumerators.
Here is how enumerator can be implemented for your circular LinkedList:
class LinkedList : IEnumerable<Node>
{
private Node first;
public IEnumerator<Node> GetEnumerator()
{
// Check if LinkedList is empty.
// If it is empty we immediately break enumeration.
if (first == null)
yield break;
// Here goes logic for enumerating circular LinkedList.
Node current = first;
do
{
yield return current;
current = current.Next;
} while (current != first);
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
As you can see an implementation using iterator is more straightforward and intuitive.
Here is complete sample that demonstrates usage of iterators.

Related

Understanding Enumerator in List<T>

The source code of Enumerator is:
public struct Enumerator : IEnumerator<T>, System.Collections.IEnumerator {
private List<T> list;
private int index;
private int version;
private T current;
...
public bool MoveNext() {
List<T> localList = list; <--------------Q1
if (version == localList._version && ((uint)index < (uint)localList._size)) {
current = localList._items[index];
index++;
return true;
}
return MoveNextRare();
}
private bool MoveNextRare() {
if (version != list._version) {
ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumFailedVersion);
}
index = list._size + 1; <-----------------Q2
current = default(T);
return false;
}
void System.Collections.IEnumerator.Reset() {
if (version != list._version) {
ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumFailedVersion);
}
index = 0;
current = default(T);
}
...
}
I have some questions on this iterator pattern:
Q1-Why MoveNext method need to define a localList, can't it just use the private field list directly since List<T> is already a reference type, why need to create an alias for it?
Q2- MoveNextRare method will invoke when the index is out of range of the last element in the list, so what's the point to increment it, why not just leave it untouched, because when Reset() calls, index will be set to 0 anyway?

For the first question I don't have any answer, maybe it just an relic from some previous implementation, maybe it somehow improves performance (though I would wonder how and why, so my bet is on the first guess). Also in the Core implementation list field is marked as readonly.
As for the second one - it has nothing to do with Reset, but with System.Collections.IEnumerator.Current implementation:
Object System.Collections.IEnumerator.Current {
get {
if( index == 0 || index == list._size + 1) { // check second comparasion
ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumOpCantHappen);
}
return Current;
}

Calling method with IEnumerable<T> sequence as argument, if that sequence is not empty

I have method Foo, which do some CPU intensive computations and returns IEnumerable<T> sequence. I need to check, if that sequence is empty. And if not, call method Bar with that sequence as argument.
I thought about three approaches...
Check, if sequence is empty with Any(). This is ok, if sequence is really empty, which will be case most of the times. But it will have horrible performance, if sequence will contains some elements and Foo will need them compute again...
Convert sequence to list, check if that list it empty... and pass it to Bar. This have also limitation. Bar will need only first x items, so Foo will be doing unnecessary work...
Check, if sequence is empty without actually reset the sequence. This sounds like win-win, but I can't find any easy build-in way, how to do it. So I create this obscure workaround and wondering, whether this is really a best approach.
Condition
var source = Foo();
if (!IsEmpty(ref source))
Bar(source);
with IsEmpty implemented as
bool IsEmpty<T>(ref IEnumerable<T> source)
{
var enumerator = source.GetEnumerator();
if (enumerator.MoveNext())
{
source = CreateIEnumerable(enumerator);
return false;
}
return true;
IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
{
yield return usedEnumerator.Current;
while (usedEnumerator.MoveNext())
{
yield return usedEnumerator.Current;
}
}
}
Also note, that calling Bar with empty sequence is not option...
EDIT:
After some consideration, best answer for my case is from Olivier Jacot-Descombes - avoid that scenario completely. Accepted solution answers this question - if it is really no other way.

I don't know whether your algorithm in Foo allows to determine if the enumeration will be empty without doing the calculations. But if this is the case, return null if the sequence would be empty:
public IEnumerable<T> Foo()
{
if (<check if sequence will be empty>) {
return null;
}
return GetSequence();
}
private IEnumerable<T> GetSequence()
{
...
yield return item;
...
}
Note that if a method uses yield return, it cannot use a simple return to return null. Therefore a second method is needed.
var sequence = Foo();
if (sequence != null) {
Bar(sequence);
}
After reading one of your comments
Foo need to initialize some resources, parse XML file and fill some HashSets, which will be used to filter (yield) returned data.
I suggest another approach. The time consuming part seems to be the initialization. To be able to separate it from the iteration, create a foo calculator class. Something like:
public class FooCalculator<T>
{
private bool _isInitialized;
private string _file;
public FooCalculator(string file)
{
_file = file;
}
private EnsureInitialized()
{
if (_isInitialized) return;
// Parse XML.
// Fill some HashSets.
_isInitialized = true;
}
public IEnumerable<T> Result
{
get {
EnsureInitialized();
...
yield return ...;
...
}
}
}
This ensures that the costly initialization stuff is executed only once. Now you can safely use Any().
Other optimizations are conceivable. The Result property could remember the position of the first returned element, so that if it is called again, it could skip to it immediately.

You would like to call some function Bar<T>(IEnumerable<T> source) if and only if the enumerable source contains at least one element, but you're running into two problems:
There is no method T Peek() in IEnumerable<T> so you would need to actually begin to evaluate the enumerable to see if it's nonempty, but...
You don't want to even partially double-evaluate the enumerable since setting up the enumerable might be expensive.
In that case your approach looks reasonable. You do, however, have some issues with your imlementation:
You need to dispose enumerator after using it.
As pointed out by Ivan Stoev in comments, if the Bar() method attempts to evaluate the IEnumerable<T> more than once (e.g. by calling Any() then foreach (...)) then the results will be undefined because usedEnumerator will have been exhausted by the first enumeration.
To resolve these issues, I'd suggest modifying your API a little and create an extension method IfNonEmpty<T>(this IEnumerable<T> source, Action<IEnumerable<T>> func) that calls a specified method only if the sequence is nonempty, as shown below:
public static partial class EnumerableExtensions
{
public static bool IfNonEmpty<T>(this IEnumerable<T> source, Action<IEnumerable<T>> func)
{
if (source == null|| func == null)
throw new ArgumentNullException();
using (var enumerator = source.GetEnumerator())
{
if (!enumerator.MoveNext())
return false;
func(new UsedEnumerator<T>(enumerator));
return true;
}
}
class UsedEnumerator<T> : IEnumerable<T>
{
IEnumerator<T> usedEnumerator;
public UsedEnumerator(IEnumerator<T> usedEnumerator)
{
if (usedEnumerator == null)
throw new ArgumentNullException();
this.usedEnumerator = usedEnumerator;
}
public IEnumerator<T> GetEnumerator()
{
var localEnumerator = System.Threading.Interlocked.Exchange(ref usedEnumerator, null);
if (localEnumerator == null)
// An attempt has been made to enumerate usedEnumerator more than once;
// throw an exception since this is not allowed.
throw new InvalidOperationException();
yield return localEnumerator.Current;
while (localEnumerator.MoveNext())
{
yield return localEnumerator.Current;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
}
Demo fiddle with unit tests here.

If you can change Bar then how about change it to TryBar that returns false when IEnumerable<T> was empty?
bool TryBar(IEnumerable<Foo> source)
{
var count = 0;
foreach (var x in source)
{
count++;
}
return count > 0;
}
If that doesn't work for you could always create your own IEnumerable<T> wrapper that caches values after they have been iterated once.

One improvement for your IsEmpty would be to check if source is ICollection<T>, and if it is, check .Count (also, dispose the enumerator):
bool IsEmpty<T>(ref IEnumerable<T> source)
{
if (source is ICollection<T> collection)
{
return collection.Count == 0;
}
var enumerator = source.GetEnumerator();
if (enumerator.MoveNext())
{
source = CreateIEnumerable(enumerator);
return false;
}
enumerator.Dispose();
return true;
IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
{
yield return usedEnumerator.Current;
while (usedEnumerator.MoveNext())
{
yield return usedEnumerator.Current;
}
usedEnumerator.Dispose();
}
}
This will work for arrays and lists.
I would, however, rework IsEmpty to return:
IEnumerable<T> NotEmpty<T>(IEnumerable<T> source)
{
if (source is ICollection<T> collection)
{
if (collection.Count == 0)
{
return null;
}
return source;
}
var enumerator = source.GetEnumerator();
if (enumerator.MoveNext())
{
return CreateIEnumerable(enumerator);
}
enumerator.Dispose();
return null;
IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
{
yield return usedEnumerator.Current;
while (usedEnumerator.MoveNext())
{
yield return usedEnumerator.Current;
}
usedEnumerator.Dispose();
}
}
Now, you would check if it returned null.

The accepted answer is probably the best approach but, based on, and I quote:
Convert sequence to list, check if that list it empty... and pass it to Bar. This have also limitation. Bar will need only first x items, so Foo will be doing unnecessary work...
Another take would be creating an IEnumerable<T> that partially caches the underlying enumeration. Something along the following lines:
interface IDisposableEnumerable<T>
:IEnumerable<T>, IDisposable
{
}
static class PartiallyCachedEnumerable
{
public static IDisposableEnumerable<T> Create<T>(
IEnumerable<T> source,
int cachedCount)
{
if (source == null)
throw new NullReferenceException(
nameof(source));
if (cachedCount < 1)
throw new ArgumentOutOfRangeException(
nameof(cachedCount));
return new partiallyCachedEnumerable<T>(
source, cachedCount);
}
private class partiallyCachedEnumerable<T>
: IDisposableEnumerable<T>
{
private readonly IEnumerator<T> enumerator;
private bool disposed;
private readonly List<T> cache;
private readonly bool hasMoreItems;
public partiallyCachedEnumerable(
IEnumerable<T> source,
int cachedCount)
{
Debug.Assert(source != null);
Debug.Assert(cachedCount > 0);
enumerator = source.GetEnumerator();
cache = new List<T>(cachedCount);
var count = 0;
while (enumerator.MoveNext() &&
count < cachedCount)
{
cache.Add(enumerator.Current);
count += 1;
}
hasMoreItems = !(count < cachedCount);
}
public void Dispose()
{
if (disposed)
return;
enumerator.Dispose();
disposed = true;
}
public IEnumerator<T> GetEnumerator()
{
foreach (var t in cache)
yield return t;
if (disposed)
yield break;
while (enumerator.MoveNext())
{
yield return enumerator.Current;
cache.Add(enumerator.Current)
}
Dispose();
}
IEnumerator IEnumerable.GetEnumerator()
=> GetEnumerator();
}
}

Clear in LinkedList for O(N). Why?

The built-in implementation of the LinkedList is a doubly-linked circular list.
I see the following implementation of the Clear method:
public void Clear() {
LinkedListNode<T> current = head;
while (current != null ) {
LinkedListNode<T> temp = current;
current = current.Next;
temp.Invalidate();
}
head = null;
count = 0;
version++;
}
So, this clearing method apparently works for O(N).
public sealed class LinkedListNode<T> {
internal LinkedList<T> list;
internal LinkedListNode<T> next;
internal LinkedListNode<T> prev;
internal T item;
public LinkedListNode( T value) {
this.item = value;
}
internal LinkedListNode(LinkedList<T> list, T value) {
this.list = list;
this.item = value;
}
public LinkedList<T> List {
get { return list;}
}
public LinkedListNode<T> Next {
get { return next == null || next == list.head? null: next;}
}
public LinkedListNode<T> Previous {
get { return prev == null || this == list.head? null: prev;}
}
public T Value {
get { return item;}
set { item = value;}
}
internal void Invalidate() {
list = null;
next = null;
prev = null;
}
}
I wonder why can't we just assign null to head instead of nulling all the references? As far as I can see, assigning null to the head will result in breaking the circle and all the nodes will be collected by GC pretty soon. Or am I missing something?

This is done so that once the linked list is cleared, anything holding references to its old nodes will throw errors if it tries to navigate the list.
If the old links weren't set to null, the old linked list would remain accessible if anything had a reference to any of its nodes, which would (a) hide the problem because the code would continue to apparently work and (b) keep alive the node objects in memory.

IEnumerable<T> Skip on unlimited sequence

I have a simple implementation of Fibonacci sequence using BigInteger:
internal class FibonacciEnumerator : IEnumerator<BigInteger>
{
private BigInteger _previous = 1;
private BigInteger _current = 0;
public void Dispose(){}
public bool MoveNext() {return true;}
public void Reset()
{
_previous = 1;
_current = 0;
}
public BigInteger Current
{
get
{
var temp = _current;
_current += _previous;
_previous = temp;
return _current;
}
}
object IEnumerator.Current { get { return Current; }
}
}
internal class FibonacciSequence : IEnumerable<BigInteger>
{
private readonly FibonacciEnumerator _f = new FibonacciEnumerator();
public IEnumerator<BigInteger> GetEnumerator(){return _f;}
IEnumerator IEnumerable.GetEnumerator(){return GetEnumerator();}
}
It is an unlimited sequence as the MoveNext() always returns true.
When called using
var fs = new FibonacciSequence();
fs.Take(10).ToList().ForEach(_ => Console.WriteLine(_));
the output is as expected (1,1,2,3,5,8,...)
I want to select 10 items but starting at 100th position. I tried calling it via
fs.Skip(100).Take(10).ToList().ForEach(_ => Console.WriteLine(_));
but this does not work, as it outputs ten elements from the beginning (i.e. the output is again 1,1,2,3,5,8,...).
I can skip it by calling SkipWhile
fs.SkipWhile((b,index) => index < 100).Take(10).ToList().ForEach(_ => Console.WriteLine(_));
which correctly outputs 10 elements starting from the 100th element.
Is there something else that needs/can be implemented in the enumerator to make the Skip(...) work?

Skip(n) doesn't access Current, it just calls MoveNext() n times.
So you need to perform the increment in MoveNext(), which is the logical place for that operation anyway:
Current does not move the position of the enumerator, and consecutive calls to Current return the same object until either MoveNext or Reset is called.

CodeCaster's answer is spot on - I'd just like to point out that you don't really need to implement your own enumerable for something like this:
public IEnumerable<BigInteger> FibonacciSequence()
{
var previous = BigInteger.One;
var current = BigInteger.Zero;
while (true)
{
yield return current;
var temp = current;
current += previous;
previous = temp;
}
}
The compiler will create both the enumerator and the enumerable for you. For a simple enumerable like this the difference isn't really all that big (you just avoid tons of boilerplate), but if you actually need something more complicated than a simple recursive function, it makes a huge difference.

Move your logic into MoveNext:
public bool MoveNext()
{
var temp = _current;
_current += _previous;
_previous = temp;
return true;
}
public void Reset()
{
_previous = 1;
_current = 0;
}
public BigInteger Current
{
get
{
return _current;
}
}
Skip(10) is simply calling MoveNext 10 times, and then Current. It also makes more logical sense to have the operation done in MoveNext, rather than current.

How to add functionality of treatment to my own linked list

My question is to add treat with foreach my own linked list.
I created my linked list by seeing example here
And I want to add LINQ to my linked list. Where i can see it? Or how i can implement it?

You just need to implement IEnumerabe<object> inside your LinkedList class:
public class LinkedList : IEnumerable<object>
{
// your code
// ....
public IEnumerator<object> GetEnumerator()
{
var current = this.head;
while (current != null)
{
yield return current.data;
current = current.next;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
}
Then, Linq will work:
var result = myLinkedList.Where(data => /* some condition */)
.Select(data => data.ToString();
As far as IEnumerable implementation is lazy evaluted (yield return) you would want to throw exception when list was modified while itering (to prevent bugs or sth):
public class LinkedList : IEnumerable<object>
{
long version = 0;
public void Add(object data) //it is required for all
methods that modyfies collection
{
// yout code
this.vesion += 1;
}
public IEnumerator<object> GetEnumerator()
{
var current = this.head;
var v = this.version;
while (current != null)
{
if (this.version != v)
throw new InvalidOperationException("Collection was modified");
yield return current.data;
current = current.next;
}
}
}

Your linked list needs to implement IEnumerable<T>.

You can implement a AsEnumerable() to use linq
public IEnumerable<T> AsEnumerable<T>()
{
var current = root;
while (current != null)
{
yield return current;
current = current.NextNode;
}
}

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Foreach calls MoveNext in Enumerator first, Current second - c#

Related

Understanding Enumerator in List<T>

Calling method with IEnumerable<T> sequence as argument, if that sequence is not empty

Clear in LinkedList for O(N). Why?

IEnumerable<T> Skip on unlimited sequence

How to add functionality of treatment to my own linked list

Categories

Resources