I frequently find myself writing code like this:
List<int> list = new List<int> { 1, 3, 5 };
foreach (int i in list) {
    Console.Write("{0}\t", i.ToString());
}
Console.WriteLine();
Better would be something like this:
List<int> list = new List<int> { 1, 3, 5 };
Console.WriteLine("{0}\t", list);
I suspect there's some clever way of doing this, but I don't see it. Does anybody have a better solution than the first block?
Do this:
list.ForEach(i => Console.Write("{0}\t", i));
EDIT: To others that have responded - he wants them all on the same line, with tabs between them. :)
A different approach, just for kicks:
Console.WriteLine(string.Join("\t", list));
new List<int> { 1, 3, 5 }.ForEach(Console.WriteLine);
If there is a piece of code that you repeat all the time, then according to Don't Repeat Yourself you should put it in your own library and call that. With that in mind, there are two aspects to getting the right answer here. The first is clarity and brevity in the code that calls the library function. The second is the performance implications of foreach.
First let's think about the clarity and brevity in the calling code.
You can do foreach in a number of ways:
for loop
foreach loop
Collection.ForEach
Out of all the ways to do a foreach, List.ForEach with a lambda is the clearest and briefest.
list.ForEach(i => Console.Write("{0}\t", i));
So at this stage it may look like List.ForEach is the way to go. However, what's the performance of this? It's true that in this case the time to write to the console will govern the performance of the code. Still, when we know something about the performance of a particular language feature, we should at least consider it.
According to Duston Campbell's performance measurements of foreach, the fastest way of iterating the list under optimised code is a for loop that caches the count instead of calling List.Count on every iteration.
The for loop, however, is a verbose construct. It's also seen as a very imperative way of doing things, which doesn't match the current trend towards functional idioms.
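As a rough illustration of the loop shape being discussed, here is a minimal sketch of a for loop that reads Count once before iterating. The names CachedCountLoop and JoinWithTabs are made up for this example, and it builds a string rather than writing to the console so the result is easy to check. Whether this actually beats foreach or List.ForEach depends on the runtime and should be measured rather than assumed.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of the original answer.
public static class CachedCountLoop
{
    public static string JoinWithTabs(List<int> list)
    {
        var builder = new StringBuilder();
        int count = list.Count; // read Count once, outside the loop
        for (int i = 0; i < count; ++i)
        {
            // Each element is followed by a tab, like the loop in the question.
            builder.Append(list[i]).Append('\t');
        }
        return builder.ToString();
    }
}
```

Calling Console.WriteLine(CachedCountLoop.JoinWithTabs(list)) would then write the whole line in one go.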
So can we get brevity, clarity and performance? We can, by using an extension method. In an ideal world we would create an extension method on Console that takes a list and writes it with a delimiter. We can't do this because Console is a static class and extension methods need an instance to extend. Instead we need to put the extension method on the list itself (as per David B's suggestion):
public static void WriteLine(this List<int> theList)
{
    // Iterate the parameter (theList) and format the loop variable (i).
    foreach (int i in theList)
    {
        Console.Write("{0}\t", i.ToString());
    }
    Console.WriteLine();
}
This code is going to be used in many places, so we should carry out the following improvements:
Instead of using foreach we should use the fastest way of iterating the collection, which is a for loop with a cached count.
Currently only a List<int> can be passed as an argument. As a library function we can generalise it with a small amount of effort.
Using List limits us to just Lists; using IList allows this code to work with arrays too.
Since the extension method will be on an IList, we need to change the name to make it clearer what we are writing to.
Here's how the code for the function would look:
public static void WriteToConsole<T>(this IList<T> collection)
{
    int count = collection.Count; // cache the count once
    for (int i = 0; i < count; ++i)
    {
        // The delimiter is still hard-coded as a tab at this stage.
        Console.Write("{0}\t", collection[i].ToString());
    }
    Console.WriteLine();
}
We can improve this even further by allowing the client to pass in the delimiter. We could then provide a second function that writes to console with the standard delimiter like this:
public static void WriteToConsole<T>(this IList<T> collection)
{
WriteToConsole<T>(collection, "\t");
}
public static void WriteToConsole<T>(this IList<T> collection, string delimiter)
{
int count = collection.Count();
for(int i = 0; i < count; ++i)
{
Console.Write("{0}{1}", collection[i].ToString(), delimiter);
}
Console.WriteLine();
}
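One side effect of the loop above is that the delimiter is also written after the last element. If that matters, a variant can put the delimiter only between elements. The sketch below is an alternative under stated assumptions, not the answer's method: ToDelimitedString and DelimitedExtensions are hypothetical names, and the method returns the string so the caller decides where to write it.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical variant: delimiter between elements only, no trailing one.
public static class DelimitedExtensions
{
    public static string ToDelimitedString<T>(this IList<T> collection, string delimiter)
    {
        var builder = new StringBuilder();
        int count = collection.Count; // cached, as in the answer's loop
        for (int i = 0; i < count; ++i)
        {
            if (i > 0)
            {
                builder.Append(delimiter); // only between elements
            }
            builder.Append(collection[i]); // uses the element's ToString()
        }
        return builder.ToString();
    }
}
```

Usage would look like Console.WriteLine(myIntList.ToDelimitedString("\t")).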
So now, given that we want a brief, clear, performant way of writing lists to the console, we have one. Here is the entire source code, including a demonstration of using the library function:
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleWritelineTest
{
public static class Extensions
{
public static void WriteToConsole<T>(this IList<T> collection)
{
WriteToConsole<T>(collection, "\t");
}
public static void WriteToConsole<T>(this IList<T> collection, string delimiter)
{
int count = collection.Count();
for(int i = 0; i < count; ++i)
{
Console.Write("{0}{1}", collection[i].ToString(), delimiter);
}
Console.WriteLine();
}
}
internal class Foo
{
override public string ToString()
{
return "FooClass";
}
}
internal class Program
{
static void Main(string[] args)
{
var myIntList = new List<int> {1, 2, 3, 4, 5};
var myDoubleList = new List<double> {1.1, 2.2, 3.3, 4.4};
var myDoubleArray = new Double[] {12.3, 12.4, 12.5, 12.6};
var myFooList = new List<Foo> {new Foo(), new Foo(), new Foo()};
// Using the standard delimiter \t
myIntList.WriteToConsole();
myDoubleList.WriteToConsole();
myDoubleArray.WriteToConsole();
myFooList.WriteToConsole();
// Using our own delimiter ~
myIntList.WriteToConsole("~");
Console.Read();
}
}
}
=======================================================
You might think that this should be the end of the answer. However there is a further piece of generalisation that can be done. It's not clear from fatcat's question if he is always writing to the console. Perhaps something else is to be done in the foreach. In that case Jason Bunting's answer is going to give that generality. Here is his answer again:
list.ForEach(i => Console.Write("{0}\t", i));
That is unless we make one more refinement to our extension methods and add FastForEach as below:
public static void FastForEach<T>(this IList<T> collection, Action<T> actionToPerform)
{
    int count = collection.Count();
    for (int i = 0; i < count; ++i)
    {
        actionToPerform(collection[i]);
    }
    // No Console.WriteLine here: a general-purpose loop should not write to the console itself.
}
This allows us to execute arbitrary code against every element in the collection using the for loop with a cached count.
We can even change the WriteToConsole function to use FastForEach:
public static void WriteToConsole<T>(this IList<T> collection, string delimiter)
{
    collection.FastForEach(item => Console.Write("{0}{1}", item.ToString(), delimiter));
    Console.WriteLine(); // end the line once all items have been written
}
So now the entire source code, including an example usage of FastForEach is:
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleWritelineTest
{
public static class Extensions
{
public static void WriteToConsole<T>(this IList<T> collection)
{
WriteToConsole<T>(collection, "\t");
}
public static void WriteToConsole<T>(this IList<T> collection, string delimiter)
{
collection.FastForEach(item => Console.Write("{0}{1}", item.ToString(), delimiter));
// End the line here rather than inside FastForEach.
Console.WriteLine();
}
public static void FastForEach<T>(this IList<T> collection, Action<T> actionToPerform)
{
int count = collection.Count();
for (int i = 0; i < count; ++i)
{
actionToPerform(collection[i]);
}
}
}
internal class Foo
{
override public string ToString()
{
return "FooClass";
}
}
internal class Program
{
static void Main(string[] args)
{
var myIntList = new List<int> {1, 2, 3, 4, 5};
var myDoubleList = new List<double> {1.1, 2.2, 3.3, 4.4};
var myDoubleArray = new Double[] {12.3, 12.4, 12.5, 12.6};
var myFooList = new List<Foo> {new Foo(), new Foo(), new Foo()};
// Using the standard delimiter \t
myIntList.WriteToConsole();
myDoubleList.WriteToConsole();
myDoubleArray.WriteToConsole();
myFooList.WriteToConsole();
// Using our own delimiter ~
myIntList.WriteToConsole("~");
// What if we want to write them to separate lines?
myIntList.FastForEach(item => Console.WriteLine(item.ToString()));
Console.Read();
}
}
}
List<int> list = new List<int> { 1, 3, 5 };
list.ForEach(x => Console.WriteLine(x));
Edit: Dammit! took too long to open visual studio to test it.
List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
a.ForEach(p => Console.WriteLine(p));
edit: ahhh he beat me to it.
list.ForEach(x=>Console.WriteLine(x));
Also you can do join:
var qwe = new List<int> {5, 2, 3, 8};
Console.WriteLine(string.Join("\t", qwe));
public static void WriteLine(this List<int> theList)
{
    // Iterate the parameter (theList) and format the loop variable (i).
    foreach (int i in theList)
    {
        Console.Write("{0}\t", i.ToString());
    }
    Console.WriteLine();
}
Then, later...
list.WriteLine();
My question is basically about good programming practice. With IEnumerable each item is evaluated one at a time, whereas with ToList the whole collection is iterated before the foreach loop starts.
Given the code below, which function (GetBool1 or GetBool2) should be used, and why?
public class TestListAndEnumerable1
{
public static void Test()
{
GetBool1();
GetBool2();
Console.ReadLine();
}
private static void GetBool1()
{
var list = new List<int> {0,1,2,3,4,5,6,7,8,9};
foreach (var item in list.Where(PrintAndEvaluate))
{
Thread.Sleep(1000);
}
}
private static bool PrintAndEvaluate(int x)
{
Console.WriteLine("Hi from " + x);
return x%2==0;
}
private static void GetBool2()
{
List<int> list = new List<int> { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
foreach (var item in list.Where(PrintAndEvaluate).ToList())
{
Thread.Sleep(1000);
}
}
}
The behaviour of the two loops is different. In the first case the Console will be written to as each item is iterated and evaluated, and a Sleep will occur after each item that passes the filter, before the next item is evaluated.
In the second case the Console writes will also happen, but they will all occur before the Sleeps: the loop body only starts running once all the PrintAndEvaluate calls have finished.
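The difference in ordering can be made visible without console output or sleeping. The following is a small sketch, with the hypothetical names EvaluationOrderDemo and Trace, that records when the predicate runs and when the loop body runs, with and without ToList.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo: logs predicate evaluations ("eval") and loop iterations ("body").
public static class EvaluationOrderDemo
{
    public static List<string> Trace(bool materialize)
    {
        var log = new List<string>();
        var source = new List<int> { 0, 1, 2, 3 };

        IEnumerable<int> query = source.Where(x =>
        {
            log.Add("eval " + x);
            return x % 2 == 0;
        });

        if (materialize)
        {
            query = query.ToList(); // runs the predicate for every element right now
        }

        foreach (var item in query)
        {
            log.Add("body " + item);
        }
        return log;
    }
}
```

With materialize set to false the log interleaves the two kinds of entries (eval 0, body 0, eval 1, eval 2, body 2, eval 3); with true, all four eval entries come first, followed by body 0 and body 2.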
The second case enumerates twice (once over the source to fill the new list, then once over that list) and allocates the extra list as it does so.
If your question is "which is most efficient" then the answer is the first example, but if you want to know "is there another more efficient method" then just use a loop like:
for (int counter = 0; counter < list.Count; counter++) // "<", not "<=", to stay inside the list
{
    if (PrintAndEvaluate(list[counter]))
    {
        Thread.Sleep(1000);
    }
}
This avoids constructing an iterator object for the Where call, so there is one less heap allocation.
GetBool1 should be used.
The sole difference between the two methods is the presence of the ToList() call. Right?
Let's look at the significance of the ToList call by first reading its docs:
Creates a List<T> from an IEnumerable<T>.
This means that a new list will be created when you call ToList. As you may know, creating a new list takes time and memory.
On the other hand GetBool1 does not have a ToList call, so it does not take as much time to execute.
GetBool1 is the better option. In the second option, even though you convert the IEnumerable to a List, foreach still calls GetEnumerator again on the new list. But the difference is very small. I made a small change to your code to output the execution time:
public static void Test()
{
var list = new List<int>();
for (int i = 0; i < 10000; i++)
{
list.Add(i);
}
GetBool1(list);
GetBool2(list);
GetBool3(list); // GetBool3 is not shown in this answer
Console.ReadLine();
}
private static void GetBool1(List<int> list)
{
System.Diagnostics.Stopwatch watcher = new System.Diagnostics.Stopwatch();
watcher.Start();
foreach (var item in list.Where(PrintAndEvaluate))
{
Thread.Sleep(1);
}
watcher.Stop();
Console.WriteLine("GetBool1 - {0}", watcher.ElapsedMilliseconds);
}
private static bool PrintAndEvaluate(int x)
{
return x % 2 == 0;
}
private static void GetBool2(List<int> list)
{
System.Diagnostics.Stopwatch watcher = new System.Diagnostics.Stopwatch();
watcher.Start();
foreach (var item in list.Where(PrintAndEvaluate).ToList())
{
Thread.Sleep(1);
}
watcher.Stop();
Console.WriteLine("GetBool2 - {0}", watcher.ElapsedMilliseconds);
}
The output is:
I have the following code:
private void AddMissingValue(ref string[] someArray) {
string mightbemissing="Another Value";
if (!someArray.Contains(mightbemissing)) {
var lst=someArray.ToList();
lst.Add(mightbemissing);
someArray=lst.ToArray();
}
}
While this works (Add an item to an array if missing), I wonder if this can be done in a smarter way? I don't like converting the array twice and writing so many lines for such a simple task.
Is there a better way? Maybe using LINQ?
The general idea is right: an array is a fixed-size collection, and you cannot add an item to it without creating a new array.
Your method can be written slightly more elegantly using the LINQ Concat method, without creating a List:
private void AddMissingValue(ref string[] someArray)
{
string mightbemissing = "Another Value";
if (!someArray.Contains(mightbemissing))
{
someArray = someArray.Concat(new[] { mightbemissing }).ToArray();
}
}
This implementation takes roughly 2N operations, which is better than your 3N, but it still enumerates the array more than once, and adding N items this way is quadratic overall.
If you are going to perform this operation often, then changing your code to use a dynamic-size collection (for instance, List<T>) would be more efficient.
Even if you decide to keep using arrays, it will probably (in my opinion) look better if you return the modified array instead of using ref:
private string[] AddMissingValue(string[] someArray)
{
string mightbemissing = "Another Value";
return someArray.Contains(mightbemissing)
? someArray
: someArray.Concat(new[] { mightbemissing }).ToArray();
}
// Example usage:
string[] yourInputArray = ...;
yourInputArray = AddMissingValue(yourInputArray);
LINQ-style and the most performant
Another implementation that comes to mind is LINQ-styled and O(N), which is the best of the solutions so far in terms of performance (though not compared with dynamic-size collections):
public static class CollectionExtensions
{
public static IEnumerable<T> AddIfNotExists<T>(this IEnumerable<T> enumerable, T value)
{
bool itemExists = false;
foreach (var item in enumerable)
{
if (!itemExists && value.Equals(item))
itemExists = true;
yield return item;
}
if (!itemExists)
yield return value;
}
}
// Example usage:
string[] arr = ...;
arr = arr.AddIfNotExists("Another Value").ToArray();
This implementation with yield is used to prevent multiple enumeration.
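The single-pass idea can be checked with a small sketch. It differs from the code above in one deliberate way: it compares with EqualityComparer<T>.Default so that passing a null value does not throw. The names AppendIfMissingSketch and AppendIfMissing are hypothetical, chosen to avoid clashing with AddIfNotExists.

```csharp
using System.Collections.Generic;

// Hypothetical null-tolerant variant of the single-pass approach.
public static class AppendIfMissingSketch
{
    public static IEnumerable<T> AppendIfMissing<T>(this IEnumerable<T> source, T value)
    {
        var comparer = EqualityComparer<T>.Default; // copes with null on either side
        bool found = false;
        foreach (var item in source)
        {
            if (!found && comparer.Equals(item, value))
            {
                found = true;
            }
            yield return item; // every source item is passed through exactly once
        }
        if (!found)
        {
            yield return value; // appended only when the single pass did not see it
        }
    }
}
```

As with the original, nothing runs until the result is enumerated, for example by calling ToArray().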
If you need to add multiple items, then it can even be rewritten this way, and it seems to still be linear:
public static IEnumerable<T> AddIfNotExists<T>(this IEnumerable<T> enumerable, params T[] value)
{
HashSet<T> notExistentItems = new HashSet<T>(value);
foreach (var item in enumerable)
{
if (notExistentItems.Contains(item))
notExistentItems.Remove(item);
yield return item;
}
foreach (var notExistentItem in notExistentItems)
yield return notExistentItem;
}
// Usage example:
int[] arr = new[] { 1, 2, 3 };
arr = arr.AddIfNotExists(2, 3, 4, 5).ToArray(); // 1, 2, 3, 4, 5
You have to resize the array, see
https://msdn.microsoft.com/en-us/library/bb348051(v=vs.110).aspx
for details. Implementation:
// static: it seems that you don't want "this" in the method
private static void AddMissingValue(ref string[] someArray) {
string mightbemissing = "Another Value";
if (!someArray.Contains(mightbemissing)) {
Array.Resize(ref someArray, someArray.Length + 1);
someArray[someArray.Length - 1] = mightbemissing;
}
}
In your current implementation you copy all the items twice, which can be unwanted if the array is large:
...
var lst=someArray.ToList(); // first: all data copied from array to list
lst.Add(mightbemissing);
someArray=lst.ToArray(); // second: all data copied from list to array
A better design, however, is to switch from fixed size array string[] to, say, List<string>:
List<string> someList = ...
if (!someList.Contains(mightbemissing))
someList.Add(mightbemissing); // <- just Add
If all the values should be non-null and unique, you can go further:
HashSet<string> someHash = ...
someHash.Add(mightbemissing);
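The reason no Contains check is needed with a HashSet<T> is that Add itself reports whether the value was new. A minimal sketch (HashSetAddDemo and AddIfMissing are illustrative names only):

```csharp
using System.Collections.Generic;

// Hypothetical wrapper that just surfaces the return value of HashSet<T>.Add.
public static class HashSetAddDemo
{
    // Returns true only when the candidate was not already in the set.
    public static bool AddIfMissing(HashSet<string> values, string candidate)
    {
        return values.Add(candidate);
    }
}
```

One thing to keep in mind is that HashSet<T> does not promise any particular enumeration order, so this fits only when order does not matter.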
I'm looking for some advice on writing some thread safe, optimized, elegant code to do the following:
I want a static method to return a sequence of integers. So for example, the application starts, thread 1 calls the GetSequence method and says it wants to take 3, so it gets an integer array back consisting of 0,1,2. Thread 2 then calls the method and says give me 4, so it returns 3,4,5,6. Multiple threads can simultaneously call this method.
To give an idea of the sort of thing I'm thinking of, here's my attempt at this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace SequenceNumberService
{
class Program
{
static void Main(string[] args)
{
int[] numbers = NumberSequenceService.GetSequence(3);
foreach (var item in numbers)
{
Console.WriteLine(item.ToString());
}
// Writes out:
// 0
// 1
// 2
Console.ReadLine();
}
}
public static class NumberSequenceService
{
private static int m_LastNumber;
private static object m_Lock = new Object();
public static int[] GetSequence(int take)
{
int[] returnVal = new int[take];
int lastNumber;
// Increment the last audit number, based on the take value.
// It is here where I am concerned that there is a threading issue, as multiple threads
// may hit these lines of code at the same time. Should I just put a lock around these two lines
// of code, or is there a nicer way to achieve this.
lock (m_Lock)
{
m_LastNumber = m_LastNumber + take;
lastNumber = m_LastNumber;
}
for (int i = take; i > 0; i--)
{
returnVal[take - i] = lastNumber - i;
}
return returnVal;
}
}
}
My questions therefore are:
Am I approaching this in the best way, or is there another way to achieve this?
Any suggestions for optimizing this code?
Many thanks in advance for any help.
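You probably want to look into the Interlocked class and its Increment and Add methods:
public static Int32 num = 0;
public static Int32 GetSequence()
{
    // Increment returns the value after incrementing, so subtract 1 to hand out
    // the number that was next in line (keeps the sequence gap-free alongside GetSequenceRange).
    return Interlocked.Increment(ref num) - 1;
}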
public static IEnumerable<Int32> GetSequenceRange(Int32 count)
{
var newValue = Interlocked.Add(ref num, count);
return Enumerable.Range(newValue - count, count);
}
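To connect this back to the signature in the question, here is a sketch of GetSequence(int take) built on Interlocked.Add. It assumes the sequence starts at 0 and that an int[] is wanted, as in the question; the class name SequenceSketch is hypothetical, and overflow of the counter is not handled.

```csharp
using System.Threading;

// Hypothetical lock-free variant of the question's NumberSequenceService.
public static class SequenceSketch
{
    private static int next; // the next number that has not been handed out yet

    public static int[] GetSequence(int take)
    {
        if (take <= 0)
        {
            return new int[0];
        }
        // Atomically reserve [end - take, end) for this caller.
        int end = Interlocked.Add(ref next, take);
        int start = end - take;

        var result = new int[take];
        for (int i = 0; i < take; i++)
        {
            result[i] = start + i;
        }
        return result;
    }
}
```

Only the reservation of the range needs to be atomic; filling the array works on values that no other thread can receive.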
Context: C# 3.0, .Net 3.5
Suppose I have a method that generates random numbers (forever):
private static IEnumerable<int> RandomNumberGenerator() {
while (true) yield return GenerateRandomNumber(0, 100);
}
I need to group those numbers in groups of 10, so I would like something like:
foreach (IEnumerable<int> group in RandomNumberGenerator().Slice(10)) {
Assert.That(group.Count() == 10);
}
I have defined Slice method, but I feel there should be one already defined. Here is my Slice method, just for reference:
private static IEnumerable<T[]> Slice<T>(IEnumerable<T> enumerable, int size) {
var result = new List<T>(size);
foreach (var item in enumerable) {
result.Add(item);
if (result.Count == size) {
yield return result.ToArray();
result.Clear();
}
}
}
Question: is there an easier way to accomplish what I'm trying to do? Perhaps Linq?
Note: above example is a simplification, in my program I have an Iterator that scans given matrix in a non-linear fashion.
EDIT: Why Skip+Take is no good.
Effectively what I want is:
var group1 = RandomNumberGenerator().Skip(0).Take(10);
var group2 = RandomNumberGenerator().Skip(10).Take(10);
var group3 = RandomNumberGenerator().Skip(20).Take(10);
var group4 = RandomNumberGenerator().Skip(30).Take(10);
without the overhead of generating (10+20+30+40) numbers. I need a solution that will generate exactly 40 numbers and break those into 4 groups of 10.
Are Skip and Take of any use to you?
Use a combination of the two in a loop to get what you want.
So,
list.Skip(10).Take(10);
Skips the first 10 records and then takes the next 10.
I have done something similar. But I would like it to be simpler:
//Remove "this" if you don't want it to be a extension method
public static IEnumerable<IList<T>> Chunks<T>(this IEnumerable<T> xs, int size)
{
var curr = new List<T>(size);
foreach (var x in xs)
{
curr.Add(x);
if (curr.Count == size)
{
yield return curr;
curr = new List<T>(size);
}
}
}
I think yours is flawed. You return the same array for all your chunks/slices, so only the last chunk/slice you take would have the correct data.
Addition: Array version:
public static IEnumerable<T[]> Chunks<T>(this IEnumerable<T> xs, int size)
{
var curr = new T[size];
int i = 0;
foreach (var x in xs)
{
curr[i % size] = x;
if (++i % size == 0)
{
yield return curr;
curr = new T[size];
}
}
}
Addition: Linq version (not C# 2.0). As pointed out, it will not work on infinite sequences and will be a great deal slower than the alternatives:
public static IEnumerable<T[]> Chunks<T>(this IEnumerable<T> xs, int size)
{
return xs.Select((x, i) => new { x, i })
.GroupBy(xi => xi.i / size, xi => xi.x)
.Select(g => g.ToArray());
}
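The two hand-written Chunks versions above drop any items left over when the source length is not a multiple of size. If the remainder is wanted, a variant along these lines might do; ChunkSketch and ChunksWithRemainder are hypothetical names, and size is assumed to be at least 1.

```csharp
using System.Collections.Generic;

// Hypothetical variant of the List-based Chunks that also yields a trailing partial chunk.
public static class ChunkSketch
{
    // Assumes size >= 1.
    public static IEnumerable<IList<T>> ChunksWithRemainder<T>(this IEnumerable<T> source, int size)
    {
        var current = new List<T>(size);
        foreach (var item in source)
        {
            current.Add(item);
            if (current.Count == size)
            {
                yield return current;
                current = new List<T>(size); // fresh list so earlier chunks are not overwritten
            }
        }
        if (current.Count > 0)
        {
            yield return current; // leftover items that did not fill a whole chunk
        }
    }
}
```

For an infinite source such as the random number generator in the question, the final branch is simply never reached.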
Using Skip and Take would be a very bad idea. Calling Skip on an indexed collection may be fine, but calling it on any arbitrary IEnumerable<T> is liable to result in enumeration over the number of elements skipped, which means that if you're calling it repeatedly you're enumerating over the sequence an order of magnitude more times than you need to be.
Complain of "premature optimization" all you want; but that is just ridiculous.
I think your Slice method is about as good as it gets. I was going to suggest a different approach that would provide deferred execution and obviate the intermediate array allocation, but that is a dangerous game to play (i.e., if you try something like ToList on such a resulting IEnumerable<T> implementation, without enumerating over the inner collections, you'll end up in an endless loop).
(I've removed what was originally here, as the OP's improvements since posting the question have since rendered my suggestions here redundant.)
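The cost of repeated Skip and Take described in this answer can be observed by counting how many values the generator actually produces. The sketch below uses a deterministic counter in place of the question's random numbers; SkipTakeCostDemo and its members are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo: counts how many values the generator produces under each strategy.
public static class SkipTakeCostDemo
{
    private static int generated;

    private static IEnumerable<int> Numbers()
    {
        int value = 0;
        while (true)
        {
            generated++; // one increment per value handed to the consumer
            yield return value++;
        }
    }

    // Restarts the generator for every group, as in the Skip+Take version from the question.
    public static int GeneratedBySkipTake(int groups, int size)
    {
        generated = 0;
        for (int g = 0; g < groups; g++)
        {
            Numbers().Skip(g * size).Take(size).ToList();
        }
        return generated;
    }

    // Walks a single enumerator once and fills each group from it.
    public static int GeneratedBySinglePass(int groups, int size)
    {
        generated = 0;
        using (IEnumerator<int> e = Numbers().GetEnumerator())
        {
            for (int g = 0; g < groups; g++)
            {
                var group = new List<int>(size);
                while (group.Count < size && e.MoveNext())
                {
                    group.Add(e.Current);
                }
            }
        }
        return generated;
    }
}
```

For four groups of ten, the first method produces the 10+20+30+40 values mentioned in the question, while the second produces 40.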
Let's see if you even need the complexity of Slice. If your random number generator is stateless, I would assume each call to it would generate unique random numbers, so perhaps this would be sufficient:
var group1 = RandomNumberGenerator().Take(10);
var group2 = RandomNumberGenerator().Take(10);
var group3 = RandomNumberGenerator().Take(10);
var group4 = RandomNumberGenerator().Take(10);
Each call to Take returns a new group of 10 numbers.
Now, if your random number generator re-seeds itself with a specific value each time it's iterated, this won't work. You'll simply get the same 10 values for each group. So instead, you would use:
var generator = RandomNumberGenerator();
var group1 = generator.Take(10);
var group2 = generator.Take(10);
var group3 = generator.Take(10);
var group4 = generator.Take(10);
This maintains an instance of the generator so that you can continue retrieving values without re-seeding the generator.
You could use the Skip and Take methods with any Enumerable object.
For your edit :
How about a function that takes a slice number and a slice size as a parameter?
private static IEnumerable<T> Slice<T>(IEnumerable<T> enumerable, int sliceSize, int sliceNumber) {
return enumerable.Skip(sliceSize * sliceNumber).Take(sliceSize);
}
It seems like we'd prefer for an IEnumerable<T> to have a fixed position counter so that we can do
var group1 = items.Take(10);
var group2 = items.Take(10);
var group3 = items.Take(10);
var group4 = items.Take(10);
and get successive slices rather than getting the first 10 items each time. We can do that with a new implementation of IEnumerable<T> which keeps one instance of its Enumerator and returns it on every call of GetEnumerator:
public class StickyEnumerable<T> : IEnumerable<T>, IDisposable
{
private IEnumerator<T> innerEnumerator;
public StickyEnumerable( IEnumerable<T> items )
{
innerEnumerator = items.GetEnumerator();
}
public IEnumerator<T> GetEnumerator()
{
return innerEnumerator;
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return innerEnumerator;
}
public void Dispose()
{
if (innerEnumerator != null)
{
innerEnumerator.Dispose();
}
}
}
Given that class, we could implement Slice with
public static IEnumerable<IEnumerable<T>> Slices<T>(this IEnumerable<T> items, int size)
{
using (StickyEnumerable<T> sticky = new StickyEnumerable<T>(items))
{
IEnumerable<T> slice;
do
{
slice = sticky.Take(size).ToList();
yield return slice;
} while (slice.Count() == size);
}
yield break;
}
That works in this case, but StickyEnumerable<T> is generally a dangerous class to have around if the consuming code isn't expecting it. For example,
using (var sticky = new StickyEnumerable<int>(Enumerable.Range(1, 10)))
{
var first = sticky.Take(2);
var second = sticky.Take(2);
foreach (int i in second)
{
Console.WriteLine(i);
}
foreach (int i in first)
{
Console.WriteLine(i);
}
}
prints
1
2
3
4
rather than
3
4
1
2
Take a look at Take(), TakeWhile() and Skip()
I think the use of Slice() would be a bit misleading. I think of that as a means to give me a chunk of an array as a new array without causing side effects. In this scenario you would actually move the enumerable forward 10.
A possibly better approach is to just use the LINQ extension Take(). I don't think you would need to use Skip() with a generator.
Edit: Dang, I have been trying to test this behavior with the following code
Note: this wasn't really correct; I leave it here so others don't fall into the same mistake.
var numbers = RandomNumberGenerator();
var slice = numbers.Take(10);
public static IEnumerable<int> RandomNumberGenerator()
{
yield return random.Next();
}
but the Count() for slice is always 1. I also tried running it through a foreach loop, since I know that the LINQ extensions are generally lazily evaluated, and it only looped once. I eventually wrote the code below instead of using Take(), and it works:
public static IEnumerable<int> Slice(this IEnumerable<int> enumerable, int size)
{
var list = new List<int>();
foreach (var count in Enumerable.Range(0, size)) list.Add(enumerable.First());
return list;
}
If you notice I am adding the First() to the list each time, but since the enumerable that is being passed in is the generator from RandomNumberGenerator() the result is different every time.
So again with a generator using Skip() is not needed since the result will be different. Looping over an IEnumerable is not always side effect free.
Edit: I'll leave the last edit just so no one falls into the same mistake, but it worked fine for me just doing this:
var numbers = RandomNumberGenerator();
var slice1 = numbers.Take(10);
var slice2 = numbers.Take(10);
The two slices were different.
I had made some mistakes in my original answer, but some of the points still stand. Skip() and Take() are not going to work the same with a generator as they would with a list. Looping over an IEnumerable is not always side effect free. Anyway, here is my take on getting a list of slices.
public static IEnumerable<int> RandomNumberGenerator()
{
while(true) yield return random.Next();
}
public static IEnumerable<IEnumerable<int>> Slice(this IEnumerable<int> enumerable, int size, int count)
{
var slices = new List<List<int>>();
foreach (var iteration in Enumerable.Range(0, count)){
var list = new List<int>();
list.AddRange(enumerable.Take(size));
slices.Add(list);
}
return slices;
}
I got this solution for the same problem:
int[] ints = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
IEnumerable<IEnumerable<int>> chunks = Chunk(ints, 2, t => t.Dump());
//won't enumerate, so won't do anything unless you force it:
chunks.ToList();
IEnumerable<T> Chunk<T, R>(IEnumerable<R> src, int n, Func<IEnumerable<R>, T> action){
IEnumerable<R> head;
IEnumerable<R> tail = src;
while (tail.Any())
{
head = tail.Take(n);
tail = tail.Skip(n);
yield return action(head);
}
}
If you just want the chunks returned, without doing anything to them, use chunks = Chunk(ints, 2, t => t). What I would really like is to have t => t as the default action, but I haven't found out how to do that yet.
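One way to get that default might be a second overload that passes the identity function itself. The sketch below repeats the Chunk method from this answer (made static so it compiles on its own) and adds a hypothetical two-argument overload; ChunkOverloads is an assumed class name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkOverloads
{
    // The method from the answer, made static so the sketch is self-contained.
    public static IEnumerable<T> Chunk<T, R>(IEnumerable<R> src, int n, Func<IEnumerable<R>, T> action)
    {
        IEnumerable<R> tail = src;
        while (tail.Any())
        {
            IEnumerable<R> head = tail.Take(n);
            tail = tail.Skip(n);
            yield return action(head);
        }
    }

    // Hypothetical overload: supplies t => t so callers can omit the action.
    public static IEnumerable<IEnumerable<R>> Chunk<R>(IEnumerable<R> src, int n)
    {
        return Chunk<IEnumerable<R>, R>(src, n, t => t);
    }
}
```

With this in place, ChunkOverloads.Chunk(ints, 2) returns the chunks unchanged. Because the method re-enumerates src through Any, Take and Skip, it fits in-memory collections like the array above rather than generators.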