IEnumerable<T> Skip on unlimited sequence

I have a simple implementation of Fibonacci sequence using BigInteger:
internal class FibonacciEnumerator : IEnumerator<BigInteger>
{
private BigInteger _previous = 1;
private BigInteger _current = 0;
public void Dispose(){}
public bool MoveNext() {return true;}
public void Reset()
{
_previous = 1;
_current = 0;
}
public BigInteger Current
{
get
{
var temp = _current;
_current += _previous;
_previous = temp;
return _current;
}
}
object IEnumerator.Current { get { return Current; }
}
}
internal class FibonacciSequence : IEnumerable<BigInteger>
{
private readonly FibonacciEnumerator _f = new FibonacciEnumerator();
public IEnumerator<BigInteger> GetEnumerator(){return _f;}
IEnumerator IEnumerable.GetEnumerator(){return GetEnumerator();}
}
It is an unlimited sequence as the MoveNext() always returns true.
When called using
var fs = new FibonacciSequence();
fs.Take(10).ToList().ForEach(_ => Console.WriteLine(_));
the output is as expected (1,1,2,3,5,8,...)
I want to select 10 items but starting at 100th position. I tried calling it via
fs.Skip(100).Take(10).ToList().ForEach(_ => Console.WriteLine(_));
but this does not work, as it outputs ten elements from the beginning (i.e. the output is again 1,1,2,3,5,8,...).
I can skip it by calling SkipWhile
fs.SkipWhile((b,index) => index < 100).Take(10).ToList().ForEach(_ => Console.WriteLine(_));
which correctly outputs 10 elements starting from the 100th element.
Is there something else that needs/can be implemented in the enumerator to make the Skip(...) work?

Skip(n) doesn't access Current; it just calls MoveNext() n times.
So you need to perform the increment in MoveNext(), which is the logical place for that operation anyway. As the documentation for IEnumerator.Current puts it:
Current does not move the position of the enumerator, and consecutive calls to Current return the same object until either MoveNext or Reset is called.

CodeCaster's answer is spot on - I'd just like to point out that you don't really need to implement your own enumerable for something like this:
public IEnumerable<BigInteger> FibonacciSequence()
{
var previous = BigInteger.One;
var current = BigInteger.Zero;
while (true)
{
yield return current;
var temp = current;
current += previous;
previous = temp;
}
}
The compiler will create both the enumerator and the enumerable for you. For a simple enumerable like this the difference isn't really all that big (you just avoid tons of boilerplate), but if you actually need something more complicated than a simple recursive function, it makes a huge difference.

Move your logic into MoveNext:
public bool MoveNext()
{
var temp = _current;
_current += _previous;
_previous = temp;
return true;
}
public void Reset()
{
_previous = 1;
_current = 0;
}
public BigInteger Current
{
get
{
return _current;
}
}
Skip(10) simply calls MoveNext 10 times without reading Current, so the state change has to happen in MoveNext. It also makes more logical sense to do the work in MoveNext rather than in Current.
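To see the effect end to end, here is a minimal sketch of the enumerator with the state change moved into MoveNext. One detail goes beyond the code in the question and is an assumption about what is wanted: GetEnumerator() hands out a fresh enumerator on every call, because the original shares a single instance, so a second enumeration would continue where the first one stopped. The FibonacciDemo class is a hypothetical name used only for this demonstration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;

public sealed class FibonacciEnumerator : IEnumerator<BigInteger>
{
    private BigInteger _previous = 1;
    private BigInteger _current = 0;

    // All advancing happens here, so Skip(n), which only calls MoveNext, works.
    public bool MoveNext()
    {
        var temp = _current;
        _current += _previous;
        _previous = temp;
        return true; // the sequence never ends
    }

    public void Reset()
    {
        _previous = 1;
        _current = 0;
    }

    // Reading Current repeatedly has no side effects.
    public BigInteger Current => _current;
    object IEnumerator.Current => Current;
    public void Dispose() { }
}

public sealed class FibonacciSequence : IEnumerable<BigInteger>
{
    // Assumption: a new enumerator per call, so enumerations do not share state.
    public IEnumerator<BigInteger> GetEnumerator() => new FibonacciEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class FibonacciDemo
{
    public static void Main()
    {
        var fs = new FibonacciSequence();

        // Skips 1, 1, 2, 3, 5 and prints 8, 13, 21.
        foreach (var n in fs.Skip(5).Take(3))
            Console.WriteLine(n);

        // The question's case: ten items after skipping the first hundred.
        foreach (var n in fs.Skip(100).Take(10))
            Console.WriteLine(n);
    }
}
```

Because every call to GetEnumerator() starts from scratch, the second loop is not affected by the first one.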

Related

Understanding Enumerator in List<T>

The source code of Enumerator is:
public struct Enumerator : IEnumerator<T>, System.Collections.IEnumerator {
private List<T> list;
private int index;
private int version;
private T current;
...
public bool MoveNext() {
List<T> localList = list; // <-- Q1
if (version == localList._version && ((uint)index < (uint)localList._size)) {
current = localList._items[index];
index++;
return true;
}
return MoveNextRare();
}
private bool MoveNextRare() {
if (version != list._version) {
ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumFailedVersion);
}
index = list._size + 1; // <-- Q2
current = default(T);
return false;
}
void System.Collections.IEnumerator.Reset() {
if (version != list._version) {
ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumFailedVersion);
}
index = 0;
current = default(T);
}
...
}
I have some questions on this iterator pattern:
Q1: Why does MoveNext need to define localList? Can't it use the private field list directly? List<T> is already a reference type, so why create an alias for it?
Q2: MoveNextRare is invoked when index has moved past the last element of the list, so what is the point of setting it to list._size + 1? Why not leave it untouched, given that Reset() sets index to 0 anyway?

For the first question I don't have an answer. Maybe it is just a relic from some previous implementation, or maybe it somehow improves performance (though I wonder how and why, so my bet is on the first guess). Also, in the .NET Core implementation the list field is marked as readonly.
As for the second one, it has nothing to do with Reset, but with the System.Collections.IEnumerator.Current implementation:
Object System.Collections.IEnumerator.Current {
get {
if (index == 0 || index == list._size + 1) { // note the second comparison
ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumOpCantHappen);
}
return Current;
}
}

How to apply contain on last record and delete if found in LINQ?

I have a list of strings like
AAPL,28/03/2012,88.34,88.778,87.187,88.231,163682382
AAPL,29/03/2012,87.54,88.08,86.747,87.123,151551216
FB,30/03/2012,86.967,87.223,85.42,85.65,182255227
Now I want to delete only the last record if it does not contain AAPL (the symbol name), using LINQ.
Below is my code, which spans multiple lines, but I want to turn it into a single line:
fileLines = System.IO.File.ReadAllLines(fileName).AsParallel().Skip(1).ToList();
var lastLine = fileLines.Last();
if (!lastLine.Contains(item.sym))
{
fileLines.RemoveAt(fileLines.Count - 1);
}
So how can I do all of this in a single-line LINQ query?

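You could use the ternary operator to decide which tail to concatenate, as follows.
fileLines
= fileLines.Take(fileLines.Count - 1)
.Concat(fileLines.Last().Contains(item.sym) ? new[] { fileLines.Last() }
: Enumerable.Empty<string>())
.ToList();
You could contract it further by folding the file read into the chain. This version still reads the previous contents of fileLines inside the expression, so fileLines must already hold the same lines. AsParallel() is left out here because the ParallelQuery overload of Concat that takes a plain IEnumerable<T> throws NotSupportedException.
fileLines
= System.IO.File.ReadAllLines(fileName)
.Skip(1)
.Take(fileLines.Count - 1)
.Concat(fileLines.Last().Contains(item.sym) ? new[] { fileLines.Last() }
: Enumerable.Empty<string>())
.ToList();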
That being said, such an endeavour is questionable. The accumulation of lazily evaluated Linq extension methods is difficult to debug.
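For comparison, a minimal sketch of the same idea that stays sequential and reads the last line only once. The class and method names (LastLineFilter, DropLastUnlessContains) are hypothetical, and the sample data is the three lines from the question; reading and header-skipping are assumed to have happened already.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LastLineFilter
{
    // Returns all lines when the last one contains the symbol,
    // otherwise all lines except the last one.
    public static List<string> DropLastUnlessContains(IReadOnlyList<string> lines, string symbol)
    {
        if (lines.Count == 0)
            return new List<string>();

        bool keepLast = lines[lines.Count - 1].Contains(symbol);
        return lines.Take(keepLast ? lines.Count : lines.Count - 1).ToList();
    }

    public static void Main()
    {
        var lines = new[]
        {
            "AAPL,28/03/2012,88.34,88.778,87.187,88.231,163682382",
            "AAPL,29/03/2012,87.54,88.08,86.747,87.123,151551216",
            "FB,30/03/2012,86.967,87.223,85.42,85.65,182255227"
        };

        // The trailing FB line is dropped; both AAPL lines remain.
        foreach (var line in DropLastUnlessContains(lines, "AAPL"))
            Console.WriteLine(line);
    }
}
```

The body of the helper is effectively the single expression asked for: Take receives either the full count or the count minus one.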

I understand you need to simplify the filtering operation, and, from what I see in your case, you're missing only one piece of information (i.e whether or not current item is the last one in an enumerated collection) that will help you define your predicate. What I'm about to write now might not seem "a simple single line"; however, it's gonna be a reusable extension that will provide this piece of information (and more) without performing extra and unnecessary loops or iterations.
The final product of that will be:
IEnumerable<string> fileLines = System.IO.File.ReadLines(fileName).RichWhere((item, originalIndex, countedIndex, hasMoreItems) => hasMoreItems || item.StartsWith("AAPL"));
The LINQ-like extension that I wrote inspired by Microsoft's Enumerable at ReferenceSource:
public delegate bool RichPredicate<T>(T item, int originalIndex, int countedIndex, bool hasMoreItems);
public static class EnumerableExtensions
{
/// <remarks>
/// This was contributed by Aly El-Haddad as an answer to this Stackoverflow.com question:
/// https://stackoverflow.com/q/54829095/3602352
/// </remarks>
public static IEnumerable<T> RichWhere<T>(this IEnumerable<T> source, RichPredicate<T> predicate)
{
return new RichWhereIterator<T>(source, predicate);
}
private class RichWhereIterator<T> : IEnumerable<T>, IEnumerator<T>
{
private readonly int threadId;
private readonly IEnumerable<T> source;
private readonly RichPredicate<T> predicate;
private IEnumerator<T> enumerator;
private int state;
private int countedIndex = -1;
private int originalIndex = -1;
private bool hasMoreItems;
public RichWhereIterator(IEnumerable<T> source, RichPredicate<T> predicate)
{
threadId = Thread.CurrentThread.ManagedThreadId;
this.source = source ?? throw new ArgumentNullException(nameof(source));
this.predicate = predicate ?? ((item, originalIndex, countedIndex, hasMoreItems) => true);
}
public T Current { get; private set; }
object IEnumerator.Current => Current;
public void Dispose()
{
if (enumerator is IDisposable disposable)
disposable.Dispose();
enumerator = null;
originalIndex = -1;
countedIndex = -1;
hasMoreItems = false;
Current = default(T);
state = -1;
}
public bool MoveNext()
{
switch (state)
{
case 1:
enumerator = source.GetEnumerator();
if (!(hasMoreItems = enumerator.MoveNext()))
{
Dispose();
break;
}
++originalIndex;
state = 2;
goto case 2;
case 2:
if (!hasMoreItems) //last predicate returned true and that was the last item
{
Dispose();
break;
}
T current = enumerator.Current;
hasMoreItems = enumerator.MoveNext();
++originalIndex;
if (predicate(current, originalIndex - 1, countedIndex + 1, hasMoreItems))
{
++countedIndex;
Current = current;
return true;
}
else if (hasMoreItems)
{ goto case 2; }
//predicate returned false and there're no more items
Dispose();
break;
}
return false;
}
public void Reset()
{
Current = default(T);
hasMoreItems = false;
originalIndex = -1;
countedIndex = -1;
state = 1;
}
public IEnumerator<T> GetEnumerator()
{
if (threadId == Thread.CurrentThread.ManagedThreadId && state == 0)
{
state = 1;
return this;
}
return new RichWhereIterator<T>(source, predicate) { state = 1 };
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
}
RichPredicate<T>, which can be thought of as Func<T, int, int, bool, bool>, provides this information about each item:
item: the item to evaluate.
originalIndex: the index of that item in its original IEnumerable<T> source (the one which was directly passed to RichWhere).
countedIndex: the index of that item IF the predicate would evaluate to true.
hasMoreItems: tells whether or not this would be the last item from the original IEnumerable<T> source.
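If only the hasMoreItems flag is needed, the same one-item lookahead can be sketched with a compiler-generated iterator. This is a simplified illustration and not a replacement for RichWhere: the names LookaheadExtensions and WithHasMore are hypothetical, and the indexes are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookaheadExtensions
{
    // Pairs each item with a flag telling whether more items follow it.
    // The source is enumerated exactly once, one item ahead of the consumer.
    public static IEnumerable<(T Item, bool HasMoreItems)> WithHasMore<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return Iterate(source);
    }

    private static IEnumerable<(T Item, bool HasMoreItems)> Iterate<T>(IEnumerable<T> source)
    {
        using (var enumerator = source.GetEnumerator())
        {
            if (!enumerator.MoveNext())
                yield break;

            var current = enumerator.Current;
            while (enumerator.MoveNext())
            {
                // Another item exists, so the held-back one is not the last.
                yield return (current, true);
                current = enumerator.Current;
            }
            yield return (current, false);
        }
    }

    public static void Main()
    {
        var lines = new[]
        {
            "AAPL,28/03/2012,88.34,88.778,87.187,88.231,163682382",
            "AAPL,29/03/2012,87.54,88.08,86.747,87.123,151551216",
            "FB,30/03/2012,86.967,87.223,85.42,85.65,182255227"
        };

        var filtered = lines.WithHasMore()
            .Where(x => x.HasMoreItems || x.Item.StartsWith("AAPL"))
            .Select(x => x.Item);

        // Prints the two AAPL lines; the trailing FB line is filtered out.
        foreach (var line in filtered)
            Console.WriteLine(line);
    }
}
```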

Can I take the first n elements from an enumeration, and then still use the rest of the enumeration?

Suppose I have an IEnumerable<T>, and I want to take the first element and pass the remaining elements to some other code. I can get the first n elements using Take(n), but how can I then access the remaining elements without causing the enumeration to restart?
For example, suppose I have a method ReadCsv that accepts the records in a CSV file as IEnumerable<string>. Now suppose that within that method, I want to read the first record (the headers), and then pass the remaining records to a ReadDataRecords method that also takes IEnumerable<string>. Like this:
void ReadCsv(IEnumerable<string> records)
{
var headerRecord = records.Take(1);
var dataRecords = ???
ReadDataRecords(dataRecords);
}
void ReadDataRecords(IEnumerable<string> records)
{
// ...
}
If I were prepared to re-start the enumeration, then I could use dataRecords = records.Skip(1). However, I don't want to re-start it - indeed, it might not be re-startable.
So, is there any way to take the first records, and then the remaining records (other than by reading all the values into a new collection and re-enumerating them)?

This is an interesting question. As a workaround, instead of using LINQ you can get the enumerator and use it directly:
private void ReadCsv(IEnumerable<string> records)
{
var enumerator = records.GetEnumerator();
enumerator.MoveNext();
var headerRecord = enumerator.Current;
var dataRecords = GetRemainingRecords(enumerator);
}
public IEnumerable<string> GetRemainingRecords(IEnumerator<string> enumerator)
{
while (enumerator.MoveNext())
{
if (enumerator.Current != null)
yield return enumerator.Current;
}
}
Update: According to your comment, here is a more extended way of doing this:
public static class CustomEnumerator
{
private static int _counter = 0;
private static IEnumerator enumerator;
public static IEnumerable<T> GetRecords<T>(this IEnumerable<T> source)
{
if (enumerator == null) enumerator = source.GetEnumerator();
if (_counter == 0)
{
enumerator.MoveNext();
_counter++;
yield return (T)enumerator.Current;
}
else
{
while (enumerator.MoveNext())
{
yield return (T)enumerator.Current;
}
_counter = 0;
enumerator = null;
}
}
}
Usage:
private static void ReadCsv(IEnumerable<string> records)
{
var headerRecord = records.GetRecords();
var dataRecords = records.GetRecords();
}

Yes, use the IEnumerator from the IEnumerable, and you can maintain the position across method calls.
A simple example:
public class Program
{
static void Main(string[] args)
{
int[] arr = new [] {1, 2, 3};
IEnumerator enumerator = arr.GetEnumerator();
enumerator.MoveNext();
Console.WriteLine(enumerator.Current);
DoRest(enumerator);
}
static void DoRest(IEnumerator enumerator)
{
while (enumerator.MoveNext())
Console.WriteLine(enumerator.Current);
}
}
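A generic variant of the same approach can be sketched as follows. It keeps the strongly typed IEnumerator<string>, disposes it when done, and counts how often the source is started to show that the enumeration is not restarted. All names here (HeadAndRestSketch, Records, Rest, EnumerationStarts) are hypothetical, and ReadDataRecords is assumed to consume its input completely before returning.

```csharp
using System;
using System.Collections.Generic;

public static class HeadAndRestSketch
{
    // Counts how often the source below is enumerated from the start.
    public static int EnumerationStarts;

    // Hypothetical one-shot source standing in for the CSV records.
    public static IEnumerable<string> Records()
    {
        EnumerationStarts++;
        yield return "id,name";
        yield return "1,Ann";
        yield return "2,Bob";
    }

    // Yields whatever the shared enumerator has not produced yet.
    private static IEnumerable<T> Rest<T>(IEnumerator<T> enumerator)
    {
        while (enumerator.MoveNext())
            yield return enumerator.Current;
    }

    // Stand-in for the real processing: copies the records into a list.
    public static List<string> ReadDataRecords(IEnumerable<string> records)
    {
        return new List<string>(records);
    }

    public static List<string> ReadCsv(IEnumerable<string> records, out string headerRecord)
    {
        using (var enumerator = records.GetEnumerator())
        {
            headerRecord = enumerator.MoveNext() ? enumerator.Current : null;

            // Rest(...) is lazy, so it must be consumed before the using block
            // disposes the enumerator. ReadDataRecords does that here.
            return ReadDataRecords(Rest(enumerator));
        }
    }

    public static void Main()
    {
        var data = ReadCsv(Records(), out var header);
        Console.WriteLine(header);                   // id,name
        Console.WriteLine(string.Join(" | ", data)); // 1,Ann | 2,Bob
        Console.WriteLine(EnumerationStarts);        // 1
    }
}
```

The important constraint is the lifetime: the sequence returned by Rest is only valid while the enumerator is alive and should be enumerated at most once.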

converting an unknown long\int\short to double in C#

In my app I have a scenario in which I get a list of an unknown type that could be int, long, or short.
I need to convert this list to double.
What is the quickest and most efficient way to achieve this? (It needs to be as fast as possible.)

I assume you have a List<object> and need to convert it to a List<double>.
Try this; it will work for all types that implement IConvertible (long, int, short, float, etc.):
var doubleList = objectList.Select(x=> Convert.ToDouble(x)).ToList();

try this
List<double> doubleList = intList.ConvertAll(x => (double)x);
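The two suggestions above behave differently once the values are boxed. A direct (double) cast works on a List<int>, but on an object holding a boxed int it is an unboxing cast to the wrong type and throws InvalidCastException, whereas Convert.ToDouble inspects the runtime type. A minimal sketch, with hypothetical names BoxedNumberDemo, ToDoubles, and DirectCastThrows:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BoxedNumberDemo
{
    // Works for any boxed value whose type implements IConvertible.
    public static List<double> ToDoubles(IEnumerable<object> values)
    {
        return values.Select(v => Convert.ToDouble(v)).ToList();
    }

    // Reports whether unboxing the value straight to double fails.
    public static bool DirectCastThrows(object boxed)
    {
        try
        {
            double value = (double)boxed; // unboxing requires the exact type
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var mixed = new List<object> { 1, 2L, (short)3 };

        Console.WriteLine(string.Join(", ", ToDoubles(mixed))); // 1, 2, 3
        Console.WriteLine(DirectCastThrows(1));   // True: a boxed int is not a double
        Console.WriteLine(DirectCastThrows(1.0)); // False: a boxed double unboxes fine
    }
}
```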

Nicely simple:
var doubleList = listOfObjects.Select(i => Convert.ToDouble(i)).ToList();
Micro-optimising because you say "most efficient" is important:
int count = listOfObjects.Count;
var doubleList = new List<double>(listOfObjects.Count);
for(int i = 0; i != count; ++i)
doubleList.Add(Convert.ToDouble(listOfObjects[i]));
However, "most efficient" depends on just what you need it to be most efficient at. You get different efficiencies with:
public class DoubleList : IList<double>
{
private readonly List<object> _source; // Change to IList<object> if that's a possibility
public DoubleList(List<object> source)
{
_source = source;
}
// Hide half-supported implementation from main interface
double IList<double>.this[int index]
{
get { return Convert.ToDouble(_source[index]); }
set { throw new NotSupportedException("Read-only collection"); }
}
public double this[int index]
{
get { return Convert.ToDouble(_source[index]); }
}
public int Count
{
get { return _source.Count; }
}
bool ICollection<double>.IsReadOnly
{
get { return true; }
}
/* Lots of boring and obvious implementations skipped */
public struct Enumerator : IEnumerator<double>
{
// note, normally we'd just use yield return in the
// GetEnumerator(), and we certainly wouldn't use
// a struct here (there are issues), but this
// optimisation is in the spirit of "most efficient"
private List<object>.Enumerator _en; //Mutable struct! Don't make field readonly!
internal Enumerator(List<object>.Enumerator en)
{
_en = en;
}
public double Current
{
get { return Convert.ToDouble(_en.Current); }
}
object IEnumerator.Current
{
get { return Current; }
}
public void Dispose()
{
_en.Dispose();
}
public bool MoveNext()
{
return _en.MoveNext();
}
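public void Reset()
{
// List<object>.Enumerator implements Reset explicitly, and calling it through
// the interface would act on a boxed copy, so resetting is not supported here.
throw new NotSupportedException();
}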
}
public Enumerator GetEnumerator()
{
return new Enumerator(_source.GetEnumerator());
}
IEnumerator<double> IEnumerable<double>.GetEnumerator()
{
return GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
var doubleList = new DoubleList(listOfObjects);
This moves around what happens when in such a way as to change what costs what. You'll return in constant time, but actually using the list will be more expensive. However, if you're only going to look at a few fields, or perhaps only going to obtain the count and then enumerate through it, then the fact that this doesn't do a full copy can make it much more efficient.

IEnumerable Method with AsParallel

I got the following extension method:
static class ExtensionMethods
{
public static IEnumerable<IEnumerable<T>> Subsequencise<T>(
this IEnumerable<T> input,
int subsequenceLength)
{
var enumerator = input.GetEnumerator();
SubsequenciseParameter parameter = new SubsequenciseParameter
{
Next = enumerator.MoveNext()
};
while (parameter.Next)
yield return getSubSequence(
enumerator,
subsequenceLength,
parameter);
}
private static IEnumerable<T> getSubSequence<T>(
IEnumerator<T> enumerator,
int subsequenceLength,
SubsequenciseParameter parameter)
{
do
{
lock (enumerator) // this lock makes it "work"
{ // removing this causes exceptions.
if (parameter.Next)
yield return enumerator.Current;
}
} while ((parameter.Next = enumerator.MoveNext())
&& --subsequenceLength > 0);
}
// Needed since you cant use out or ref in yield-return methods...
class SubsequenciseParameter
{
public bool Next { get; set; }
}
}
Its purpose is to split a sequence into subsequences of a given size.
Calling it like this:
foreach (var sub in "abcdefghijklmnopqrstuvwxyz"
.Subsequencise(3)
.AsParallel()
.Select(sub => new String(sub.ToArray())))
{
Console.WriteLine(sub);
}
Console.ReadKey();
works; however, there are some empty lines in between, since some of the threads are "too late" and enter the first yield return.
I tried putting more locks everywhere, but I cannot get this to work correctly in combination with AsParallel.
Obviously this example doesn't justify the use of AsParallel at all. It is just to demonstrate how the method could be called.

The problem is that iterators are lazily evaluated, so you return a lazily evaluated iterator that gets used from multiple threads.
You can fix this by rewriting your method as follows:
public static IEnumerable<IEnumerable<T>> Subsequencise<T>(this IEnumerable<T> input, int subsequenceLength)
{
var enumerator = input.GetEnumerator();
if (!enumerator.MoveNext())
{
yield break;
}
List<T> currentList = new List<T> { enumerator.Current };
int length = 1;
while (enumerator.MoveNext())
{
if (length == subsequenceLength)
{
length = 0;
yield return currentList;
currentList = new List<T>();
}
currentList.Add(enumerator.Current);
++length;
}
yield return currentList;
}
This performs the same function, but doesn't use an iterator to implement the "nested" IEnumerable<T>, avoiding the problem. Note that this also avoids the locking as well as the custom SubsequenciseParameter type.
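A compact sketch of the same materializing approach, written with foreach so the source enumerator is disposed automatically, and combined with AsOrdered() so the parallel query keeps the chunk order. The names ChunkingSketch and Chunked are hypothetical; on .NET 6 and later the built-in Enumerable.Chunk covers the same ground.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkingSketch
{
    // Splits the input into fully materialized lists of at most 'size' items.
    // Each chunk is an independent List<T>, so worker threads never touch
    // the shared source enumerator.
    public static IEnumerable<List<T>> Chunked<T>(this IEnumerable<T> input, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        var chunk = new List<T>(size);
        foreach (var item in input)
        {
            chunk.Add(item);
            if (chunk.Count == size)
            {
                yield return chunk;
                chunk = new List<T>(size);
            }
        }
        if (chunk.Count > 0)
            yield return chunk; // trailing partial chunk
    }

    public static void Main()
    {
        var parts = "abcdefghijklmnopqrstuvwxyz"
            .Chunked(3)
            .AsParallel()
            .AsOrdered() // without this, the output order is not guaranteed
            .Select(chunk => new string(chunk.ToArray()));

        // Prints abc, def, ... and finally yz, one per line.
        foreach (var part in parts)
            Console.WriteLine(part);
    }
}
```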
