Thoughts on foreach with Enumerable.Range vs traditional for loop - C#

In C# 3.0, I'm liking this style:
// Write the numbers 1 thru 7
foreach (int index in Enumerable.Range(1, 7))
{
    Console.WriteLine(index);
}
over the traditional for loop:
// Write the numbers 1 thru 7
for (int index = 1; index <= 7; index++)
{
    Console.WriteLine(index);
}
Assuming 'n' is small so performance is not an issue, does anyone object to the new style over the traditional style?

I find the latter's "minimum-to-maximum" format a lot clearer than Range's "minimum-count" style for this purpose. Also, I don't think it's really a good practice to make a change like this from the norm that is not faster, not shorter, not more familiar, and not obviously clearer.
That said, I'm not against the idea in general. If you came up to me with syntax that looked something like foreach (int x from 1 to 8) then I'd probably agree that that would be an improvement over a for loop. However, Enumerable.Range is pretty clunky.

This is just for fun. (I'd just use the standard "for (int i = 1; i <= 10; i++)" loop format myself.)
foreach (int i in 1.To(10))
{
    Console.WriteLine(i); // 1,2,3,4,5,6,7,8,9,10
}
// ...
public static IEnumerable<int> To(this int from, int to)
{
    if (from < to)
    {
        while (from <= to)
        {
            yield return from++;
        }
    }
    else
    {
        while (from >= to)
        {
            yield return from--;
        }
    }
}
You could also add a Step extension method too:
foreach (int i in 5.To(-9).Step(2))
{
    Console.WriteLine(i); // 5,3,1,-1,-3,-5,-7,-9
}
// ...
public static IEnumerable<T> Step<T>(this IEnumerable<T> source, int step)
{
    if (step == 0)
    {
        throw new ArgumentOutOfRangeException("step", "Param cannot be zero.");
    }
    return source.Where((x, i) => (i % step) == 0);
}

In C# 6.0 with the use of
using static System.Linq.Enumerable;
you can simplify it to
foreach (var index in Range(1, 7))
{
    Console.WriteLine(index);
}

You can actually do this in C# (by providing To and Do as extension methods on int and IEnumerable<T> respectively):
1.To(7).Do(Console.WriteLine);
Smalltalk forever!
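(The Do half isn't shown anywhere in the thread; a minimal sketch, pairing with the To extension from the earlier answer, might look like this:)
public static void Do<T>(this IEnumerable<T> source, Action<T> action)
{
    foreach (T item in source)
        action(item);
}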

I kind of like the idea. It's very much like Python. Here's my version in a few lines:
static class Extensions
{
    public static IEnumerable<int> To(this int from, int to, int step = 1) {
        if (step == 0)
            throw new ArgumentOutOfRangeException("step", "step cannot be zero");
        // stop when the next `step` reaches or oversteps `to`, in either +/- direction
        while (!(step > 0 ^ from < to) && from != to) {
            yield return from;
            from += step;
        }
    }
}
It works like Python's:
0.To(4) → [ 0, 1, 2, 3 ]
4.To(0) → [ 4, 3, 2, 1 ]
4.To(4) → [ ]
7.To(-3, -3) → [ 7, 4, 1, -2 ]

I think the foreach + Enumerable.Range combination is less error prone: you have less control and fewer ways to get it wrong, like decrementing the index inside the body so the loop never ends, etc.
The readability problem is with the semantics of the Range function, which can change from one language to another (e.g., if given just one parameter, does it begin from 0 or 1? Is the end included or excluded? Is the second parameter a count or an end value?).
About performance, the compiler should be smart enough to optimize both loops so they execute at a similar speed, even with large ranges (Range does not create a collection, of course, just an iterator).
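That supposition is easy to confirm with a quick sketch (my example, not from the original post): Range hands back a lazy sequence, so even an enormous range costs nothing until it is consumed.
// A two-billion-element range is created instantly; no collection is allocated.
var huge = Enumerable.Range(1, int.MaxValue);
// Elements are produced only as they are consumed.
foreach (int n in huge.Take(3))
    Console.WriteLine(n); // 1, 2, 3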

I think Range is useful for working with some range inline:
var squares = Enumerable.Range(1, 7).Select(i => i * i);
You can also foreach over it. That requires converting to a list first, but it keeps things compact when that's what you want.
Enumerable.Range(1, 7).ToList().ForEach(i => Console.WriteLine(i));
But other than something like this, I'd use the traditional for loop.

It seems like quite a long-winded approach to a problem that's already solved. There's a whole state machine behind Enumerable.Range that isn't really needed.
The traditional format is fundamental to development and familiar to all. I don't really see any advantage to your new style.

I'd like to have the syntax of some other languages like Python, Haskell, etc.
// Write the numbers 1 thru 7
foreach (int index in [1..7])
{
    Console.WriteLine(index);
}
Fortunately, we have F# now :)
As for C#, I'll have to stick with the Enumerable.Range method.

@Luke:
I reimplemented your To() extension method and used the Enumerable.Range() method to do it.
This way it comes out a little shorter and uses as much infrastructure given to us by .NET as possible:
public static IEnumerable<int> To(this int from, int to)
{
    return from < to
        ? Enumerable.Range(from, to - from + 1)
        : Enumerable.Range(to, from - to + 1).Reverse();
}
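For example, the descending case still works:
foreach (int i in 7.To(3))
    Console.Write(i + " "); // 7 6 5 4 3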

How to use a new syntax today
Because of this question I tried out some things to come up with a nice syntax without waiting for first-class language support. Here's what I have:
using static Enumerizer;
// prints: 0 1 2 3 4 5 6 7 8 9
foreach (int i in 0 <= i < 10)
    Console.Write(i + " ");
Note the difference between <= and <.
I also created a proof of concept repository on GitHub with even more functionality (reversed iteration, custom step size).
A minimal and very limited implementation of the above loop would look something like this:
public readonly struct Enumerizer
{
    public static readonly Enumerizer i = default;

    public Enumerizer(int start) =>
        Start = start;

    public readonly int Start;

    // `0 <= i` captures the inclusive lower bound...
    public static Enumerizer operator <=(int start, Enumerizer _) =>
        new Enumerizer(start);

    public static Enumerizer operator >=(int _, Enumerizer __) =>
        throw new NotImplementedException();

    // ...and `< end` then yields up to the exclusive upper bound.
    public static IEnumerable<int> operator <(Enumerizer start, int end)
    {
        for (int i = start.Start; i < end; i++)
            yield return i;
    }

    public static IEnumerable<int> operator >(Enumerizer _, int __) =>
        throw new NotImplementedException();
}

There is no significant performance difference between traditional iteration and range iteration, as Nick Chapsas pointed out in his excellent YouTube video. His benchmark showed only a difference of nanoseconds for small iteration counts; as the loop gets bigger, the difference all but disappears.
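If you want to reproduce that comparison yourself, a minimal BenchmarkDotNet harness along these lines should do (my sketch, not the code from the video; it assumes the BenchmarkDotNet NuGet package):
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class LoopBenchmarks
{
    [Params(100, 100_000)]
    public int N;

    [Benchmark(Baseline = true)]
    public int For()
    {
        int sum = 0;
        for (int i = 0; i < N; i++) sum += i;
        return sum;
    }

    [Benchmark]
    public int ForeachEnumerableRange()
    {
        int sum = 0;
        foreach (int i in Enumerable.Range(0, N)) sum += i;
        return sum;
    }
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<LoopBenchmarks>();
}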
Here is an elegant way of iterating in a range loop from his content:
private static void Test()
{
    foreach (var i in 1..5)
    {
    }
}
Using this extension:
public static class Extension
{
    public static CustomIntEnumerator GetEnumerator(this Range range)
    {
        return new CustomIntEnumerator(range);
    }

    public static CustomIntEnumerator GetEnumerator(this int number)
    {
        return new CustomIntEnumerator(new Range(0, number));
    }
}

public ref struct CustomIntEnumerator
{
    private int _current;
    private readonly int _end;

    public CustomIntEnumerator(Range range)
    {
        if (range.End.IsFromEnd)
        {
            throw new NotSupportedException();
        }
        _current = range.Start.Value - 1;
        _end = range.End.Value;
    }

    public int Current => _current;

    public bool MoveNext()
    {
        _current++;
        return _current <= _end;
    }
}
Benchmark result: (benchmark chart omitted; as noted above, the measured difference was a matter of nanoseconds.)
I love this way of implementing it. But the biggest issue with this approach is that it cannot be used in async methods.
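The limitation comes from CustomIntEnumerator being a ref struct: a ref struct local cannot live across an await, so the compiler rejects code along these lines (a sketch of the failure mode, assuming the extension above is in scope):
public static async Task PrintAsync()
{
    // Does not compile: the ref struct enumerator would have to survive the await.
    foreach (var i in 1..5)
    {
        await Task.Delay(10);
        Console.WriteLine(i);
    }
}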

I'm sure everybody has their personal preferences (many would prefer the latter just because it is familiar across almost all programming languages), but like you I am starting to like the foreach more and more, especially now that you can define a range.

In my opinion the Enumerable.Range() way is more declarative. New and unfamiliar to people? Certainly. But I think this declarative approach yields the same benefits as most other LINQ-related language features.

I imagine there could be scenarios where Enumerable.Range(index, count) is clearer when dealing with expressions for the parameters, especially if some of the values in those expressions are altered within the loop. With for, the condition would be re-evaluated against the state left by the previous iteration, whereas Enumerable.Range() is evaluated up-front.
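A contrived sketch of that difference (my example, not the poster's):
int limit = 3;
// The range is computed once, up-front; mutating limit in the body changes nothing.
foreach (int i in Enumerable.Range(0, limit))
    limit++; // still iterates exactly 3 times
// The for condition is re-evaluated every iteration; this version never terminates.
for (int i = 0; i < limit; i++)
    limit++;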
Other than that, I'd agree that sticking with for would normally be better (more familiar/readable to more people... readable is a very important value in code that needs to be maintained).

I agree that in many (or even most) cases foreach is much more readable than a standard for loop when simply iterating over a collection. However, your choice of Enumerable.Range(index, count) isn't a strong example of the value of foreach over for.
For a simple range starting from 1, Enumerable.Range(index, count) looks quite readable. However, if the range starts with a different index, it becomes less readable because you have to mentally compute index + count - 1 to determine what the last element will be. For example…
// Write the numbers 2 thru 8
foreach (var index in Enumerable.Range(2, 7))
{
    Console.WriteLine(index);
}
In this case, I much prefer the second example.
// Write the numbers 2 thru 8
for (int index = 2; index <= 8; index++)
{
    Console.WriteLine(index);
}

Strictly speaking, you misuse enumeration.
An enumerator provides the means to access all the objects in a container one by one, but it does not guarantee any particular order.
It is OK to use enumeration to find the biggest number in an array. If you are using it to find, say, the first non-zero element, you are relying on an implementation detail you should not need to know about. In your example, the order seems to be important to you.
Edit: I am wrong. As Luke pointed out (see comments), it is safe to rely on the order when enumerating an array in C#. This is different from, for example, using "for in" to enumerate an array in JavaScript.

I do like the foreach + Enumerable.Range approach and use it sometimes.
// does anyone object to the new style over the traditional style?
foreach (var index in Enumerable.Range(1, 7))
I object to the var abuse in your proposal. I appreciate var, but, damn, just write int in this case! ;-)

Just throwing my hat into the ring.
I define this...
using System.Collections;
using System.Collections.Generic;

namespace CustomRanges {
    public record IntRange(int From, int Thru, int step = 1) : IEnumerable<int> {
        public IEnumerator<int> GetEnumerator() {
            for (var i = From; i <= Thru; i += step)
                yield return i;
        }

        IEnumerator IEnumerable.GetEnumerator()
            => GetEnumerator();
    };

    public static class Definitions {
        public static IntRange FromTo(int from, int to, int step = 1)
            => new IntRange(from, to - 1, step);

        public static IntRange FromThru(int from, int thru, int step = 1)
            => new IntRange(from, thru, step);

        public static IntRange CountFrom(int from, int count)
            => new IntRange(from, from + count - 1);

        // Count(n) yields 0 .. n - 1 (the record's upper bound is inclusive).
        public static IntRange Count(int count)
            => new IntRange(0, count - 1);

        // Add more to suit your needs. For instance, you could add in reversing ranges, etc.
    }
}
Then anywhere I want to use it, I add this at the top of the file...
using static CustomRanges.Definitions;
And use it like this...
foreach (var index in FromTo(1, 4))
    Debug.WriteLine(index);
// Prints 1, 2, 3

foreach (var index in FromThru(1, 4))
    Debug.WriteLine(index);
// Prints 1, 2, 3, 4

foreach (var index in FromThru(2, 10, 2))
    Debug.WriteLine(index);
// Prints 2, 4, 6, 8, 10

foreach (var index in CountFrom(7, 4))
    Debug.WriteLine(index);
// Prints 7, 8, 9, 10

foreach (var index in Count(5))
    Debug.WriteLine(index);
// Prints 0, 1, 2, 3, 4

foreach (var _ in Count(4))
    Debug.WriteLine("A");
// Prints A, A, A, A
The nice thing about this approach is that, thanks to the names, you know exactly whether the end is included or not.

Related

How to Order By or Sort an integer List and select the Nth element

I have a list, and I want to select the fifth highest element from it:
List<int> list = new List<int>();
list.Add(2);
list.Add(18);
list.Add(21);
list.Add(10);
list.Add(20);
list.Add(80);
list.Add(23);
list.Add(81);
list.Add(27);
list.Add(85);
But OrderByDescending is not working for this int list...
int fifth = list.OrderByDescending(x => x).Skip(4).First();
Depending on how serious it is for the list to have fewer than 5 elements, you have two options.
If the list should never have fewer than 5 elements, I would catch it as an exception:
int fifth;
try
{
    fifth = list.OrderByDescending(x => x).ElementAt(4);
}
catch (ArgumentOutOfRangeException)
{
    // Handle the exception
}
If you expect that it may have fewer than 5 elements, you could fall back to the default value and check for that:
int fifth = list.OrderByDescending(x => x).ElementAtOrDefault(4);
if (fifth == 0)
{
    // handle default
}
This is still somewhat flawed, because the fifth element could legitimately be 0. That can be solved by casting the list to a list of nullable ints before the LINQ query:
var newList = list.Select(i => (int?)i).ToList();
int? fifth = newList.OrderByDescending(x => x).ElementAtOrDefault(4);
if (fifth == null)
{
    // handle default
}
Without LINQ expressions:
int result;
if (list != null && list.Count >= 5)
{
    list.Sort();
    result = list[list.Count - 5];
}
else // define behavior when list is null OR has fewer than 5 elements
This has a better performance compared to LINQ expressions, although the LINQ solutions presented in my second answer are comfortable and reliable.
In case you need extreme performance for a huge List of integers, I'd recommend a more specialized algorithm, like in Matthew Watson's answer.
Attention: The List gets modified when the Sort() method is called. If you don't want that, you must work with a copy of your list, like this:
List<int> copy = new List<int>(original);
// or
List<int> copy = original.ToList();
The easiest way to do this is to just sort the data and take the first N items. This is the recommended way for small data sets; anything more complicated is just not worth it.
However, for large data sets it can be a lot quicker to do what's known as a Partial Sort.
There are two main ways to do this: Use a heap, or use a specialised quicksort.
The article I linked describes how to use a heap. I shall present a partial sort below:
public static IList<T> PartialSort<T>(IList<T> data, int k) where T : IComparable<T>
{
    int start = 0;
    int end = data.Count - 1;
    while (end > start)
    {
        var index = partition(data, start, end);
        var rank = index + 1;
        if (rank >= k)
        {
            end = index - 1;
        }
        else if ((index - start) > (end - index))
        {
            quickSort(data, index + 1, end);
            end = index - 1;
        }
        else
        {
            quickSort(data, start, index - 1);
            start = index + 1;
        }
    }
    return data;
}

static int partition<T>(IList<T> lst, int start, int end) where T : IComparable<T>
{
    T x = lst[start];
    int i = start;
    for (int j = start + 1; j <= end; j++)
    {
        if (lst[j].CompareTo(x) > 0) // Descending, so the first k items are the largest; use "< 0" to reverse.
        {
            i = i + 1;
            swap(lst, i, j);
        }
    }
    swap(lst, start, i);
    return i;
}

static void swap<T>(IList<T> lst, int p, int q)
{
    T temp = lst[p];
    lst[p] = lst[q];
    lst[q] = temp;
}

static void quickSort<T>(IList<T> lst, int start, int end) where T : IComparable<T>
{
    if (start >= end)
        return;
    int index = partition(lst, start, end);
    quickSort(lst, start, index - 1);
    quickSort(lst, index + 1, end);
}
Then to access the 5th largest element in a list you could do this:
PartialSort(list, 5);
Console.WriteLine(list[4]);
For large data sets, a partial sort can be significantly faster than a full sort.
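For comparison, the heap approach mentioned earlier can be sketched with .NET 6's PriorityQueue (my sketch under that assumption, not the linked article's code): keep a min-heap of the k largest values seen so far, which is O(n log k) instead of a full O(n log n) sort.
public static int NthLargest(IEnumerable<int> source, int k)
{
    var heap = new PriorityQueue<int, int>(); // min-heap keyed on the value itself
    foreach (int value in source)
    {
        if (heap.Count < k)
            heap.Enqueue(value, value);
        else if (value > heap.Peek())
        {
            heap.Dequeue(); // evict the smallest of the current top k
            heap.Enqueue(value, value);
        }
    }
    if (heap.Count < k)
        throw new ArgumentException("Not enough elements.", nameof(source));
    return heap.Peek(); // the k'th largest value
}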
Addendum
See here for another (probably better) solution that uses a QuickSelect algorithm.
This LINQ approach retrieves the 5th biggest element OR throws an exception WHEN the list is null or contains fewer than 5 elements:
int fifth = list?.Count >= 5 ?
    list.OrderByDescending(x => x).Take(5).Last() :
    throw new Exception("list is null OR has not enough elements");
This one retrieves the 5th biggest element OR null WHEN the list is null or contains fewer than 5 elements:
int? fifth = list?.Count >= 5 ?
    list.OrderByDescending(x => x).Take(5).Last() :
    default(int?);
if (fifth == null) // define behavior
This one retrieves the 5th biggest element OR the smallest element WHEN the list contains fewer than 5 elements:
if (list == null || list.Count <= 0)
    throw new Exception("Unable to retrieve Nth biggest element");
int fifth = list.OrderByDescending(x => x).Take(5).Last();
All these solutions are reliable, they should NEVER throw "unexpected" exceptions.
PS: I'm using .NET 4.7 in this answer.
Here is a C# implementation of the QuickSelect algorithm for selecting the nth element of an unordered IList<>.
You have to put all the code contained in that page in a static class, like:
public static class QuickHelpers
{
    // Put the code here
}
Given that "library" (in truth a big fat block of code), then you can:
int resA = list.QuickSelect(2, (x, y) => Comparer<int>.Default.Compare(y, x));
int resB = list.QuickSelect(list.Count - 1 - 2);
Now... normally QuickSelect selects the nth lowest element; we reverse that in two ways:
For resA we create a reverse comparer based on the default int comparer, simply by swapping the parameters of the Compare method. Note that the index is 0-based, so there is a 0th, 1st, 2nd element and so on.
For resB we count from the back instead: the highest element of an ordered list would be at index list.Count - 1, the next one at list.Count - 1 - 1, then list.Count - 1 - 2, and so on.
Theoretically, using QuickSelect should be better than ordering the list and then picking the nth element: ordering a list is on average an O(N log N) operation, and picking the nth element afterwards is O(1), so the composite is O(N log N), while QuickSelect is on average O(N). Clearly there is a but: the O notation hides the constant factor, so an O(k1 * N log N) with a small k1 could still beat an O(k2 * N) with a big k2. Only multiple real-life benchmarks can tell you which is better, and it depends on the size of the collection.
A small note about the algorithm:
As with quicksort, quickselect is generally implemented as an in-place algorithm, and beyond selecting the k'th element, it also partially sorts the data. See selection algorithm for further discussion of the connection with sorting.
So it modifies the ordering of the original list.

Code efficiency and accuracy

I'm trying to solve a problem on Codewars and the unit tests provided make absolutely no sense...
The problem is as follows, and it sounds simple enough to have something working in 5 minutes:
Consider a sequence u where u is defined as follows:
The number u(0) = 1 is the first one in u.
For each x in u, then y = 2 * x + 1 and z = 3 * x + 1 must be in u too.
There are no other numbers in u.
Ex: u = [1, 3, 4, 7, 9, 10, 13, 15, 19, 21, 22, 27, ...]
1 gives 3 and 4, then 3 gives 7 and 10, 4 gives 9 and 13, then 7 gives 15 and 22 and so on...
Task:
Given parameter n the function dbl_linear (or dblLinear...) returns the element u(n) of the ordered (with <) sequence u.
Example:
dbl_linear(10) should return 22
At first I used a SortedSet with a LINQ query, as I didn't really care about efficiency; I quickly learned that this operation has to handle ranges where n can equal ~100,000 in under 12 seconds.
So this abomination was born, then butchered time and time again, since a for loop would generate issues for some reason. It was then "upgraded" to a while loop, which gave slightly more passing unit tests (4 -> 8).
public class DoubleLinear {
    public static int DblLinear(int n) {
        ListSet<int> table = new ListSet<int> {1};
        for (int i = 0; i < n; i++) {
            table.Put(Y(table[i]));
            table.Put(Z(table[i]));
        }
        table.Sort();
        return table[n];
    }

    private static int Y(int y) {
        return 2 * y + 1;
    }

    private static int Z(int z) {
        return 3 * z + 1;
    }
}

public class ListSet<T> : List<T> {
    public void Put(T item) {
        if (!this.Contains(item))
            this.Add(item);
    }
}
With this code it still fails the calculation for anything in excess of n = 75,000, but it passes up to 8 tests.
I've checked whether other people have passed this, and they have; however, I cannot view what they wrote to learn from it.
Can anyone provide insight into what could be wrong here? I'm sure the answer is blatantly obvious and I'm being dumb.
Also, is using a custom list in this way a bad idea? Is there a better way?
ListSet is slow for sorting, and you constantly get memory reallocation as you build the set. I would start by allocating the table at its full size first, though honestly I would also tell you that using a bare-bones array of the size you need is best for performance.
If you know you need n = 75,000+, allocate a ListSet (or an array!) of that size up front. If the unit tests start taking you into the stratosphere, there is a binary segmentation technique we can discuss, but that's a bit involved and logically tougher to build.
I don't see anything logically wrong with the code. The numbers it generates are correct from where I'm standing.
EDIT: Since 3x + 1 > 2x + 1 for the same x, you can build u in sorted order by merging the two derived sequences as you go. Besides the list built so far, you only have to maintain a handful of values:
Target index in u
Current count of u
Index of the next x fed into y = 2x + 1
Index of the next x fed into z = 3x + 1
Current candidate value for y
Current candidate value for z
public static int DblLinear(int target) {
    if (target < 1)
        return 1;
    // Pre-size the list to avoid reallocation while it grows.
    var u = new List<int>(target + 1) { 1 };
    int ind_y = 0;                      // index of the next x used for y = 2x + 1
    int ind_z = 0;                      // index of the next x used for z = 3x + 1
    while (u.Count <= target) {
        int val_y = 2 * u[ind_y] + 1;   // next candidate from the y rule
        int val_z = 3 * u[ind_z] + 1;   // next candidate from the z rule
        if (val_y <= val_z) {
            u.Add(val_y);
            ind_y++;
            if (val_y == val_z)         // same value from both rules:
                ind_z++;                // skip the duplicate
        } else {
            u.Add(val_z);
            ind_z++;
        }
    }
    return u[target];
}
You could modify the val_y branch to be a while loop (a more efficient critical path) if you either widen the branch to two conditions or implement a backstep loop for when you blow past your target index.
Avoiding repeated reallocation will definitely speed your calculations up, even if people want to (incorrectly) bellyache about branch prediction in such an easily predictable case.
Also, did you turn optimization on in your Visual Studio project? If you're submitting a binary and not a code file, that can also shave off quite a bit of time.

Why does failing to recognise equality mess up C# List<T> sort?

This is a somewhat obscure question, but after wasting an hour tracking down the bug, I thought it worth asking...
I wrote a custom ordering for a struct, and made one mistake:
My struct has a special state, let us call this "min".
If the struct is in the min state, then it's smaller than any other struct.
My CompareTo method made one mistake: a.CompareTo(b) would return -1 whenever a was "min", but of course if b is also "min" it should return 0.
Now, this mistake completely messed up a List<MyStruct> Sort() method: the whole list would (sometimes) come out in a random order.
My list contained exactly one object in "min" state.
It seems my mistake could only affect things if the one "min" object was compared to itself.
Why would this even happen when sorting?
And even if it did, how can it cause the relative order of two "non-min" objects to be wrong?
Using the LINQ OrderBy method can cause an infinite loop...
Small, complete, test example:
using System;
using System.Collections.Generic;
using System.Linq;

struct MyStruct : IComparable<MyStruct>
{
    public int State;
    public MyStruct(int s) { State = s; }

    public int CompareTo(MyStruct rhs)
    {
        // 10 is the "min" state. Otherwise order as usual
        if (State == 10) { return -1; } // Incorrect
        /*if (State == 10) // Correct version
        {
            if (rhs.State == 10) { return 0; }
            return -1;
        }*/
        if (rhs.State == 10) { return 1; }
        return this.State - rhs.State;
    }

    public override string ToString()
    {
        return String.Format("MyStruct({0})", State);
    }
}

class Program
{
    static int Main()
    {
        var list = new List<MyStruct>();
        var rnd = new Random();
        for (int i = 0; i < 20; ++i)
        {
            int x = rnd.Next(15);
            if (x >= 10) { ++x; }
            list.Add(new MyStruct(x));
        }
        list.Add(new MyStruct(10));

        list.Sort();
        // Never returns...
        //list = list.OrderBy(item => item).ToList();

        Console.WriteLine("list:");
        foreach (var x in list) { Console.WriteLine(x); }

        for (int i = 1; i < list.Count; ++i)
        {
            Console.Write("{0} ", list[i].CompareTo(list[i - 1]));
        }
        return 0;
    }
}
It seems my mistake could only affect things if the one "min" object was compared to itself.
Not quite. It could also be caused if there were two different "min" objects. In the case of the list sorted this particular time, it can only happen if the item is compared to itself. But the other case is worth considering generally in terms of why supplying a non-transitive comparer to a method that expects a transitive comparer is a very bad thing.
Why would this even happen when sorting?
Why not?
List<T>.Sort() works by using Array.Sort<T> on its items. Array.Sort<T> in turn uses a mixture of insertion sort, heapsort and quicksort, but to simplify, let's consider a general quicksort. For simplicity, we'll use IComparable<T> directly rather than going via System.Collections.Generic.Comparer<T>.Default:
public static void Quicksort<T>(IList<T> list) where T : IComparable<T>
{
    Quicksort<T>(list, 0, list.Count - 1);
}

public static void Quicksort<T>(IList<T> list, int left, int right) where T : IComparable<T>
{
    int i = left;
    int j = right;
    T pivot = list[(left + right) / 2];
    while (i <= j)
    {
        while (list[i].CompareTo(pivot) < 0)
            i++;
        while (list[j].CompareTo(pivot) > 0)
            j--;
        if (i <= j)
        {
            T tmp = list[i];
            list[i] = list[j];
            list[j] = tmp;
            i++;
            j--;
        }
    }
    if (left < j)
        Quicksort(list, left, j);
    if (i < right)
        Quicksort(list, i, right);
}
This works as follows:
Pick an element, called a pivot, from the list (we use the middle).
Reorder the list so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it.
The pivot is now in its final position, with an unsorted sub-list before and after it. Recursively apply the same steps to these two sub-lists.
Now, note something about the example code above: we do not prevent pivot being compared with itself. We could do so, but why would we? For one thing, we would need some sort of comparison code to do it, which is precisely what you've already provided in your CompareTo() method. In order to avoid the wasted CompareTo we'd have to either call CompareTo()* an extra time for each comparison (!) or else track the position of pivot, which would add more waste than it removed.
And even if it did, how can it cause the relative order of two "non-min" objects to be wrong?
Because quicksort partitions, it doesn't do one massive sort, but a series of mini-sorts. Therefore an incorrect comparison gets a series of opportunities to mess up parts of those sorts, each time leading to a sub-list of incorrectly sorted values that the algorithm considers "dealt with". So in those cases where the bug in the comparer hits, its damage can be spread throughout much of the list. Just as it does its sort by a series of mini-sorts, so it will do a buggy sort by a series of buggy mini-sorts.
Using the LINQ OrderBy method can cause an infinite loop
It uses a variant of Quicksort that guarantees stability; two equivalent items will still have the same relative order after the sort as before. The extra complexity is presumably leading to it not only comparing the item to itself, but then continuing to do so forever, as it tries to make sure that it is both in front of itself, and also in the same order relative to itself as it was before. (Yes, that last sentence makes no sense, and that's exactly why it never returns.)
*If this was a reference type rather than a value type, we could do a quick ReferenceEquals check. But aside from the fact that this is no good with structs, and the fact that if it really were a time-saver for the type in question it should have if (ReferenceEquals(this, other)) return 0; in the CompareTo anyway, it still wouldn't fix the bug once there was more than one "min" item in the list.

Get all possible distinct triples using LINQ

I have a List containing these values: {1, 2, 3, 4, 5, 6, 7}. And I want to be able to retrieve unique combinations of three. The result should be like this:
{1,2,3}
{1,2,4}
{1,2,5}
{1,2,6}
{1,2,7}
{2,3,4}
{2,3,5}
{2,3,6}
{2,3,7}
{3,4,5}
{3,4,6}
{3,4,7}
{3,4,1}
{4,5,6}
{4,5,7}
{4,5,1}
{4,5,2}
{5,6,7}
{5,6,1}
{5,6,2}
{5,6,3}
I already have two for loops that are able to do this:
for (int first = 0; first < test.Count - 2; first++)
{
    int second = first + 1;
    for (int offset = 1; offset < test.Count; offset++)
    {
        int third = (second + offset) % test.Count;
        if (Math.Abs(first - third) < 2)
            continue;
        List<int> temp = new List<int>();
        temp.Add(test[first]);
        temp.Add(test[second]);
        temp.Add(test[third]);
        result.Add(temp);
    }
}
But since I'm learning LINQ, I wonder if there is a smarter way to do this?
UPDATE: I used this question as the subject of a series of articles starting here; I'll go through two slightly different algorithms in that series. Thanks for the great question!
The two solutions posted so far are correct but inefficient when the numbers get large. They both use the same algorithm: first enumerate all the possibilities:
{1, 1, 1 }
{1, 1, 2 },
{1, 1, 3 },
...
{7, 7, 7}
And while doing so, filter out any where the second is not larger than the first, or the third is not larger than the second. This performs 7 x 7 x 7 filtering operations, which is not that many, but if you were trying to get, say, combinations of ten elements from thirty, that's 30 x 30 x 30 x 30 x 30 x 30 x 30 x 30 x 30 x 30 (about 590 trillion), which is rather a lot. You can do better than that.
I would solve this problem as follows. First, produce a data structure which is an efficient immutable set. Let me be very clear about what an immutable set is, because you are likely not familiar with them. You normally think of a set as something you add items to and remove items from. An immutable set has an Add operation, but it does not change the set; it gives you back a new set which has the added item. The same for removal.
Here is an implementation of an immutable set where the elements are integers from 0 to 31:
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System;

// A super-cheap immutable set of integers from 0 to 31;
// just a convenient wrapper around bit operations on an int.
internal struct BitSet : IEnumerable<int>
{
    public static BitSet Empty { get { return default(BitSet); } }

    private readonly int bits;
    private BitSet(int bits) { this.bits = bits; }

    public bool Contains(int item)
    {
        Debug.Assert(0 <= item && item <= 31);
        return (bits & (1 << item)) != 0;
    }

    public BitSet Add(int item)
    {
        Debug.Assert(0 <= item && item <= 31);
        return new BitSet(this.bits | (1 << item));
    }

    public BitSet Remove(int item)
    {
        Debug.Assert(0 <= item && item <= 31);
        return new BitSet(this.bits & ~(1 << item));
    }

    IEnumerator IEnumerable.GetEnumerator() { return this.GetEnumerator(); }

    public IEnumerator<int> GetEnumerator()
    {
        for (int item = 0; item < 32; ++item)
            if (this.Contains(item))
                yield return item;
    }

    public override string ToString()
    {
        return string.Join(",", this);
    }
}
Read this code carefully to understand how it works. Again, always remember that adding an element to this set does not change the set. It produces a new set that has the added item.
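For instance (a usage sketch):
var s = BitSet.Empty.Add(1).Add(2);
var t = s.Add(3);
Console.WriteLine(s); // 1,2 -- s is unchanged
Console.WriteLine(t); // 1,2,3 -- the added item lives only in the new set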
OK, now that we've got that, let's consider a more efficient algorithm for producing your combinations.
We will solve the problem recursively. A recursive solution always has the same structure:
Can we solve a trivial problem? If so, solve it.
If not, break the problem down into a number of smaller problems and solve each one.
Let's start with the trivial problems.
Suppose you have a set and you wish to choose zero items from it. The answer is clear: there is only one possible combination with zero elements, and that is the empty set.
Suppose you have a set with n elements in it and you want to choose more than n elements. Clearly there is no solution, not even the empty set.
We have now taken care of the cases where the set is empty or the number of elements chosen is more than the number of elements total, so we must be choosing at least one thing from a set that has at least one thing.
Of the possible combinations, some of them have the first element in them and some of them do not. Find all the ones that have the first element in them and yield them. We do this by recursing to choose one fewer element from the set that is missing the first element.
The ones that do not have the first element in them we find by enumerating the same-size combinations of the set without the first element.
static class Extensions
{
    public static IEnumerable<BitSet> Choose(this BitSet b, int choose)
    {
        if (choose < 0) throw new InvalidOperationException();
        if (choose == 0)
        {
            // Choosing zero elements from any set gives the empty set.
            yield return BitSet.Empty;
        }
        else if (b.Count() >= choose)
        {
            // We are choosing at least one element from a set that has
            // a first element. Get the first element, and the set
            // lacking the first element.
            int first = b.First();
            BitSet rest = b.Remove(first);
            // These are the combinations that contain the first element:
            foreach (BitSet r in rest.Choose(choose - 1))
                yield return r.Add(first);
            // These are the combinations that do not contain the first element:
            foreach (BitSet r in rest.Choose(choose))
                yield return r;
        }
    }
}
Now we can ask the question that you need the answer to:
class Program
{
    static void Main()
    {
        BitSet b = BitSet.Empty.Add(1).Add(2).Add(3).Add(4).Add(5).Add(6).Add(7);
        foreach (BitSet result in b.Choose(3))
            Console.WriteLine(result);
    }
}
And we're done. We have generated only as many sequences as we actually need. (Though we did a lot of set operations to get there, set operations are cheap.) The point here is that understanding how this algorithm works is extremely instructive. Recursive programming on immutable structures is a powerful tool that many professional programmers do not have in their toolbox.
You can do it like this:
var data = Enumerable.Range(1, 7);
var r = from a in data
        from b in data
        from c in data
        where a < b && b < c
        select new { a, b, c };

foreach (var x in r) {
    Console.WriteLine("{0} {1} {2}", x.a, x.b, x.c);
}
Edit: Thanks Eric Lippert for simplifying the answer!
var ints = new int[] { 1, 2, 3, 4, 5, 6, 7 };
var permutations = ints.SelectMany(a => ints.Where(b => (b > a)).
    SelectMany(b => ints.Where(c => (c > b)).
    Select(c => new { a = a, b = b, c = c })));

Simple Sequence Generation?

I'm looking for an ultra-easy way to generate a list of numbers, 1-200.
(it can be a List, Array, Enumerable... I don't really care about the specific type)
Apparently .Net 4.0 has a Sequence.Range(min,max) method.
But I'm currently on .Net 3.5.
Here is a sample usage, of what I'm after, shown with Sequence.Range.
public void ShowOutput(Sequence.Range(1,200));
For the moment, I need consecutive numbers 1-200. In future iterations, I may need arbitrary lists of numbers, so I'm trying to keep the design flexible.
Perhaps there is a good LINQ solution? Any other ideas?
.NET 3.5 has Range too. It's actually Enumerable.Range and returns IEnumerable<int>.
The page you linked to is very much out of date - it's talking about .NET 3.5 as a "future version", and the Enumerable static class was called Sequence at one point prior to release.
If you wanted to implement it yourself in C# 2 or later, it's easy - here's one:
IEnumerable<int> Range(int count)
{
    for (int n = 0; n < count; n++)
        yield return n;
}
You can easily write other methods that further filter lists:
IEnumerable<int> Double(IEnumerable<int> source)
{
    foreach (int n in source)
        yield return n * 2;
}
But as you have 3.5, you can use the extension methods in System.Linq.Enumerable to do this:
var evens = Enumerable.Range(0, someLimit).Select(n => n * 2);
var r = Enumerable.Range(1, 200);
Check out System.Linq.Enumerable.Range.
Regarding the second part of your question, what do you mean by "arbitrary lists"? If you can define a function from an int to the new values, you can use the result of Range with other LINQ methods:
var squares = from i in Enumerable.Range(1, 200)
              select i * i;
