Get all possible distinct triples using LINQ - c#

I have a List that contains these values: {1, 2, 3, 4, 5, 6, 7}, and I want to retrieve the unique combinations of three. The result should look like this:
{1,2,3}
{1,2,4}
{1,2,5}
{1,2,6}
{1,2,7}
{2,3,4}
{2,3,5}
{2,3,6}
{2,3,7}
{3,4,5}
{3,4,6}
{3,4,7}
{3,4,1}
{4,5,6}
{4,5,7}
{4,5,1}
{4,5,2}
{5,6,7}
{5,6,1}
{5,6,2}
{5,6,3}
I already have two for loops that are able to do this:
for (int first = 0; first < test.Count - 2; first++)
{
    int second = first + 1;
    for (int offset = 1; offset < test.Count; offset++)
    {
        int third = (second + offset) % test.Count;
        if (Math.Abs(first - third) < 2)
            continue;
        List<int> temp = new List<int>();
        temp.Add(test[first]);
        temp.Add(test[second]);
        temp.Add(test[third]);
        result.Add(temp);
    }
}
But since I'm learning LINQ, I wonder if there is a smarter way to do this?

UPDATE: I used this question as the subject of a series of articles starting here; I'll go through two slightly different algorithms in that series. Thanks for the great question!
The two solutions posted so far are correct but inefficient for the cases where the numbers get large. The solutions posted so far use the algorithm: first enumerate all the possibilities:
{1, 1, 1 }
{1, 1, 2 },
{1, 1, 3 },
...
{7, 7, 7}
And while doing so, filter out any where the second is not larger than the first, and the third is not larger than the second. This performs 7 x 7 x 7 filtering operations, which is not that many, but if you were trying to get, say, combinations of ten elements from thirty, that's 30^10 (30 multiplied by itself ten times), which is rather a lot. You can do better than that.
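To put numbers on that comparison, here is a small sketch that counts both quantities exactly. The class and method names (CombinationCounts, Choose, Tuples) are invented for this illustration and are not part of the original answer.

```csharp
using System;
using System.Numerics;

static class CombinationCounts
{
    // Number of k-element subsets of an n-element set: n! / (k! (n-k)!).
    public static BigInteger Choose(int n, int k)
    {
        if (k < 0 || k > n) return BigInteger.Zero;
        k = Math.Min(k, n - k); // C(n, k) == C(n, n - k); use the smaller k
        BigInteger result = BigInteger.One;
        for (int i = 1; i <= k; i++)
        {
            // After this step result equals C(n - k + i, i), so the division is exact.
            result = result * (n - k + i) / i;
        }
        return result;
    }

    // Number of candidate tuples the enumerate-then-filter approach examines: n^k.
    public static BigInteger Tuples(int n, int k) => BigInteger.Pow(n, k);
}
```

For the question's data, Tuples(7, 3) is 343 while Choose(7, 3) is 35. For ten elements from thirty, Tuples(30, 10) is 590,490,000,000,000 while Choose(30, 10) is 30,045,015.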
I would solve this problem as follows. First, produce a data structure which is an efficient immutable set. Let me be very clear what an immutable set is, because you are likely not familiar with them. You normally think of a set as something you add items and remove items from. An immutable set has an Add operation but it does not change the set; it gives you back a new set which has the added item. The same for removal.
Here is an implementation of an immutable set where the elements are integers from 0 to 31:
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System;
// A super-cheap immutable set of integers from 0 to 31;
// just a convenient wrapper around bit operations on an int.
internal struct BitSet : IEnumerable<int>
{
public static BitSet Empty { get { return default(BitSet); } }
private readonly int bits;
private BitSet(int bits) { this.bits = bits; }
public bool Contains(int item)
{
Debug.Assert(0 <= item && item <= 31);
return (bits & (1 << item)) != 0;
}
public BitSet Add(int item)
{
Debug.Assert(0 <= item && item <= 31);
return new BitSet(this.bits | (1 << item));
}
public BitSet Remove(int item)
{
Debug.Assert(0 <= item && item <= 31);
return new BitSet(this.bits & ~(1 << item));
}
IEnumerator IEnumerable.GetEnumerator() { return this.GetEnumerator(); }
public IEnumerator<int> GetEnumerator()
{
for(int item = 0; item < 32; ++item)
if (this.Contains(item))
yield return item;
}
public override string ToString()
{
return string.Join(",", this);
}
}
Read this code carefully to understand how it works. Again, always remember that adding an element to this set does not change the set. It produces a new set that has the added item.
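As a small illustration of the point about immutability, here is a sketch that uses ImmutableHashSet<int> from System.Collections.Immutable instead of the BitSet above. That type is a substitution made only for the example: it ships with recent .NET versions, while on .NET Framework it requires the System.Collections.Immutable package. The class and method names are invented here.

```csharp
using System.Collections.Immutable;

static class ImmutableSetDemo
{
    // Returns the sizes of the original set and of the set produced by Add,
    // showing that Add leaves the original untouched.
    public static (int before, int after) AddLeavesOriginalUntouched()
    {
        ImmutableHashSet<int> original = ImmutableHashSet<int>.Empty.Add(1).Add(2);
        ImmutableHashSet<int> extended = original.Add(3); // a new set; original is unchanged
        return (original.Count, extended.Count);
    }
}
```

The original set still has two elements after the call to Add; only the returned set has three.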
OK, now that we've got that, let's consider a more efficient algorithm for producing your combinations.
We will solve the problem recursively. A recursive solution always has the same structure:
Can we solve a trivial problem? If so, solve it.
If not, break the problem down into a number of smaller problems and solve each one.
Let's start with the trivial problems.
Suppose you have a set and you wish to choose zero items from it. The answer is clear: there is only one possible combination with zero elements, and that is the empty set.
Suppose you have a set with n elements in it and you want to choose more than n elements. Clearly there is no solution, not even the empty set.
We have now taken care of the cases where the set is empty or the number of elements chosen is more than the number of elements total, so we must be choosing at least one thing from a set that has at least one thing.
Of the possible combinations, some of them have the first element in them and some of them do not. Find all the ones that have the first element in them and yield them. We do this by recursing to choose one fewer element from the set that is missing the first element, and adding the first element to each result.
The ones that do not have the first element in them we find by enumerating the combinations of the set without the first element.
static class Extensions
{
public static IEnumerable<BitSet> Choose(this BitSet b, int choose)
{
if (choose < 0) throw new InvalidOperationException();
if (choose == 0)
{
// Choosing zero elements from any set gives the empty set.
yield return BitSet.Empty;
}
else if (b.Count() >= choose)
{
// We are choosing at least one element from a set that has
// a first element. Get the first element, and the set
// lacking the first element.
int first = b.First();
BitSet rest = b.Remove(first);
// These are the combinations that contain the first element:
foreach(BitSet r in rest.Choose(choose-1))
yield return r.Add(first);
// These are the combinations that do not contain the first element:
foreach(BitSet r in rest.Choose(choose))
yield return r;
}
}
}
Now we can ask the question that you need the answer to:
class Program
{
static void Main()
{
BitSet b = BitSet.Empty.Add(1).Add(2).Add(3).Add(4).Add(5).Add(6).Add(7);
foreach(BitSet result in b.Choose(3))
Console.WriteLine(result);
}
}
And we're done. We have generated only as many sequences as we actually need. (Though we have done a lot of set operations to get there, but set operations are cheap.) The point here is that understanding how this algorithm works is extremely instructive. Recursive programming on immutable structures is a powerful tool that many professional programmers do not have in their toolbox.
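The same recursion carries over to ordinary lists of any element type. The following is a sketch rather than part of the original answer: ListCombinations and its start parameter are names invented here, and an index into the list stands in for "the set without its first element".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ListCombinations
{
    // Yields every k-element combination of items[start..], using the same
    // two cases as the BitSet version: with and without the first element.
    public static IEnumerable<IReadOnlyList<T>> Choose<T>(IReadOnlyList<T> items, int k, int start = 0)
    {
        if (k < 0) throw new ArgumentOutOfRangeException(nameof(k));
        if (k == 0)
        {
            // Choosing zero elements gives exactly one result: the empty combination.
            yield return Array.Empty<T>();
        }
        else if (items.Count - start >= k)
        {
            T first = items[start];
            // Combinations that contain the first element.
            foreach (IReadOnlyList<T> rest in Choose(items, k - 1, start + 1))
                yield return rest.Prepend(first).ToArray();
            // Combinations that do not contain the first element.
            foreach (IReadOnlyList<T> rest in Choose(items, k, start + 1))
                yield return rest;
        }
    }
}
```

Calling ListCombinations.Choose<int>(new[] { 1, 2, 3, 4, 5, 6, 7 }, 3) yields 35 triples, beginning with 1, 2, 3. Copying each result with ToArray keeps the sketch simple at the cost of extra allocations.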

You can do it like this:
var data = Enumerable.Range(1, 7);
var r = from a in data
from b in data
from c in data
where a < b && b < c
select new {a, b, c};
foreach (var x in r) {
Console.WriteLine("{0} {1} {2}", x.a, x.b, x.c);
}
Demo.
Edit: Thanks Eric Lippert for simplifying the answer!

var ints = new int[] { 1, 2, 3, 4, 5, 6, 7 };
var permutations = ints.SelectMany(a => ints.Where(b => (b > a)).
SelectMany(b => ints.Where(c => (c > b)).
Select(c => new { a = a, b = b, c = c })));

Related

Shortest list from a two dimensional array

This question is more about an algorithm than actual code, but example code would be appreciated.
Let's say I have a two-dimensional array such as this:
A B C D E
--------------
1 | 0 2 3 4 5
2 | 1 2 4 5 6
3 | 1 3 4 5 6
4 | 2 3 4 5 6
5 | 1 2 3 4 5
I am trying to find the shortest list that would include a value from each row. Currently, I am going row by row and column by column, adding each value to a SortedSet and then checking the length of the set against the shortest set found so far. For example:
Adding cells {1A, 2A, 3A, 4A, 5A} would add the values {0, 1, 1, 2, 1} which would result in a sorted set {0, 1, 2}. {1B, 2A, 3A, 4A, 5A} would add the values {2, 1, 1, 2, 1} which would result in a sorted set {1, 2}, which is shorter than the previous set.
Obviously, adding {1D, 2C, 3C, 4C, 5D} or {1E, 2D, 3D, 4D, 5E} would be the shortest sets, having only one item each, and I could use either one.
I don't have to include every number in the array. I just need to find the shortest set while including at least one number from every row.
Keep in mind that this is just an example array, and the arrays that I'm using are much, much larger. The smallest is 495x28. Brute force will take a VERY long time (28^495 passes). Is there a shortcut that someone knows, to find this in the least number of passes? I have C# code, but it's kind of long.
Edit:
Posting current code, as per request:
// Set an array of counters, Add enough to create largest initial array
int ListsCount = MatrixResults.Count();
int[] Counters = new int[ListsCount];
SortedSet<long> CurrentSet = new SortedSet<long>();
for (long X = 0; X < ListsCount; X++)
{
Counters[X] = 0;
CurrentSet.Add(X);
}
while (true)
{
// Compile sequence list from MatrixResults[]
SortedSet<long> ThisSet = new SortedSet<long>();
for (int X = 0; X < ListsCount; X++)
{
ThisSet.Add(MatrixResults[X][Counters[X]]);
}
// if Sequence Length less than current low, set ThisSet as Current
if (ThisSet.Count() < CurrentSet.Count())
{
CurrentSet.Clear();
long[] TSI = ThisSet.ToArray();
for (int Y = 0; Y < ThisSet.Count(); Y ++)
{
CurrentSet.Add(TSI[Y]);
}
}
// Increment Counters
int Index = 0;
bool EndReached = false;
while (true)
{
Counters[Index]++;
if (Counters[Index] < MatrixResults[Index].Count()) break;
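// This counter has overflowed: reset it and carry into the next one.
Counters[Index] = 0;
Index++;
if (Index >= ListsCount)
{
EndReached = true;
break;
}
// No extra increment here: the next pass of the loop increments Counters[Index].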
}
// If all counters are fully incremented, then break
if (EndReached) break;
}
With all computations there is always a tradeoff, and several factors are in play, such as whether you will get paid for getting it perfect (in this case, for me, no). This is a case of the best being the enemy of the good: how long can we spend on solving a problem, and will the result be close enough to fulfil the use case? In my opinion, when we can get the key idea across without hand-painting pixels in UHD resolution, let's!
So my choice is an approach that gets a covering set which is small and, ahem, sometimes will be the smallest :) To be spot on, one would have to iterate over different strategies and compare the lengths of the sets they produce. For this evening of fun I chose to give one strategy which I find defensible as being close to or equal to the minimal set.
The strategy is to view the multidimensional array as a sequence of lists, each reduced to its distinct values. We then iteratively reduce the remaining lists using the smallest list among them, and after each iteration weed out any values of that smallest list that were not used in the reduction. This gives a path which is close enough to the ideal to be effective, and it completes in milliseconds.
A critique of this approach up front is that the direction in which you pass through the minimal list would really have to be varied iteratively to pick the best one (left to right, right to left, position sequences X, Y, Z, ...), because the amount of potential reduction is not equal. To get close to the ideal, those sequences would have to be tried for each iteration too, until all combinations were covered, choosing the most reducing sequence. Right, but I chose left to right only!
I chose not to compare execution against your code, because your MatrixResults is an array of int arrays and not a multidimensional array, which is what your drawing shows. I went by your drawing, and so couldn't share a data source with your code. No matter, you can make that conversion if you wish. Onwards to generating sample data:
private int[,] CreateSampleArray(int xDimension, int yDimensions, Random rnd)
{
Debug.WriteLine($"Created sample array of dimensions ({xDimension}, {yDimensions})");
var array = new int[xDimension, yDimensions];
for (int x = 0; x < array.GetLength(0); x++)
{
for(int y = 0; y < array.GetLength(1); y++)
{
array[x, y] = rnd.Next(0, 4000);
}
}
return array;
}
The overall structure, with some logging (I'm using xUnit to run the code):
[Fact]
public void SetCoverExperimentTest()
{
var rnd = new Random((int)DateTime.Now.Ticks);
var sw = Stopwatch.StartNew();
int[,] matrixResults = CreateSampleArray(rnd.Next(100, 500), rnd.Next(100, 500), rnd);
//So first requirement is that you must have one element per row, so lets get our unique rows
var listOfAll = new List<List<int>>();
List<int> listOfRow;
for (int y = 0; y < matrixResults.GetLength(1); y++)
{
listOfRow = new List<int>();
for (int x = 0; x < matrixResults.GetLength(0); x++)
{
listOfRow.Add(matrixResults[x, y]);
}
listOfAll.Add(listOfRow.Distinct().ToList());
}
var setFound = new HashSet<int>();
List<List<int>> allUniquelyRequired = GetDistinctSmallestList(listOfAll, setFound);
// This set now has all rows that are either distinctly different
// Or have a reordering of distinct values of that length value lists
// our HashSet has the unique value range
//Meaning any combination of sets with those values,
//grabbing any one for each set, preferring already chosen ones, should give a covering total set
var leastSet = new LeastSetData
{
LeastSet = setFound,
MatrixResults = matrixResults,
};
List<Coordinate>? minSet = leastSet.GenerateResultsSet();
sw.Stop();
Debug.WriteLine($"Completed in {sw.Elapsed.TotalMilliseconds:0.00} ms");
Assert.NotNull(minSet);
//There is one for each row
Assert.False(minSet.Select(s => s.y).Distinct().Count() < minSet.Count());
//We took less than 25 milliseconds
var timespan = new TimeSpan(0, 0, 0, 0, 25);
Assert.True(sw.Elapsed < timespan);
//Outputting to debugger for the fun of it
var sb = new StringBuilder();
foreach (var coordinate in minSet)
{
sb.Append($"({coordinate.x}, {coordinate.y}) {matrixResults[coordinate.x, coordinate.y]},");
}
var debugLine = sb.ToString();
debugLine = debugLine.Substring(0, debugLine.Length - 1);
Debug.WriteLine("Resulting set: " + debugLine);
}
Now the more meaty iterative bits
private List<List<int>> GetDistinctSmallestList(List<List<int>> listOfAll, HashSet<int> setFound)
{
// Our smallest set must be a subset of the distinct sum of all our smallest lists for value range,
// plus unknown
var listOfShortest = new List<List<int>>();
int shortest = int.MaxValue;
foreach (var list in listOfAll)
{
if (list.Count < shortest)
{
listOfShortest.Clear();
shortest = list.Count;
listOfShortest.Add(list);
}
else if (list.Count == shortest)
{
if (listOfShortest.Contains(list))
continue;
listOfShortest.Add(list);
}
}
var setFoundAddition = new HashSet<int>(setFound);
foreach (var list in listOfShortest)
{
foreach (var item in list)
{
if (setFound.Contains(item))
continue;
if (setFoundAddition.Contains(item))
continue;
setFoundAddition.Add(item);
}
}
//Now we can remove all rows with those found, we'll add the smallest later
var listOfAllRemainder = new List<List<int>>();
bool foundInList;
List<int> consumedWhenReducing = new List<int>();
foreach (var list in listOfAll)
{
foundInList = false;
foreach (int item in list)
{
if (setFound.Contains(item))
{
//Covered by data from last iteration(s)
foundInList = true;
break;
}
else if (setFoundAddition.Contains(item))
{
consumedWhenReducing.Add(item);
foundInList = true;
break;
}
}
if (!foundInList)
{
listOfAllRemainder.Add(list); //adding what lists did not have elements found
}
}
//Remove any from these smallestset lists that did not get consumed in the favour used pass before
if (consumedWhenReducing.Count == 0)
{
throw new Exception($"Shouldn't be possible to remove the row itself without using one of its values, please investigate");
}
var removeArray = setFoundAddition.Where(a => !consumedWhenReducing.Contains(a)).ToArray();
setFoundAddition.RemoveWhere(x => removeArray.Contains(x));
foreach (var value in setFoundAddition)
{
setFound.Add(value);
}
if (listOfAllRemainder.Count != 0)
{
//Do the whole thing again until there is no list left
listOfShortest.AddRange(GetDistinctSmallestList(listOfAllRemainder, setFound));
}
return listOfShortest; //Here we will ultimately have the sum of shortest lists per iteration
}
To conclude: I hope to have inspired you; at least I had fun coming up with a best approximation, and should you feel like completing the code, you're very welcome to grab what you like.
Obviously we should really track the sequence in which we go through the shortest lists. After all, it matters whether we start reducing the total set of distinct lists by the element at position 0 or at position 0+N, and which one we reduce with after that. We must have one of those values, but once consuming a value has removed most of the total list, all that is really produced is a value range, and the sequence in which the range is consumed matters to the later iterations: a position we did not reach before no lists were left could, for example, have removed more lists than some that were covered. You get the picture, I'm sure.
And this is just one strategy. One might as well have chosen the largest distinct list, even within the same framework, and if you do not iteratively cover enough strategies, there is only brute force left.
Anyway, you'd want an AI to act, just like a human, and not to contemplate the existence of the universe first. After all, we can reconsider pretty often with silicon brains, as long as we can do so fast.
With any moving object at least, I'd much rather be 90% on target, correcting every second while taking 14 ms to get there, than spend 2 seconds reaching 99% or the elusive 100%. Meaning we should stop the vehicle before the concrete pillar or the pram, or conversely buy the equity when it is a good time to do so, rather than figuring out that we should have stopped when we are already on the other side of the obstacle, or that we should have bought 5 seconds ago when the spot price has already jumped again...
Thus the defense rests on the notion that it is a matter of opinion whether this solution is good enough or simply incomplete at best :D
I realize it's pretty random, but just to say that although this sketch is not indisputably correct, it is easy to read and maintain, and anyway the question is wrong B-] We will very rarely need the absolute minimal set, and when we do, the answer will be much longer :D
... woopsie, forgot the support classes
public struct Coordinate
{
public int x;
public int y;
public override string ToString()
{
return $"({x},{y})";
}
}
public struct CoordinateValue
{
public int Value { get; set; }
public Coordinate Coordinate { get; set; }
public override string ToString()
{
return string.Concat(Coordinate.ToString(), " ", Value.ToString());
}
}
public class LeastSetData
{
public HashSet<int> LeastSet { get; set; }
public int[,] MatrixResults { get; set; }
public List<Coordinate> GenerateResultsSet()
{
HashSet<int> chosenValueRange = new HashSet<int>();
var chosenSet = new List<Coordinate>();
for (int y = 0; y < MatrixResults.GetLength(1); y++)
{
var candidates = new List<CoordinateValue>();
for (int x = 0; x < MatrixResults.GetLength(0); x++)
{
if (LeastSet.Contains(MatrixResults[x, y]))
{
candidates.Add(new CoordinateValue
{
Value = MatrixResults[x, y],
Coordinate = new Coordinate { x = x, y = y }
}
);
continue;
}
}
if (candidates.Count == 0)
throw new Exception($"OMG Something's wrong! (this row did not have any of derived range [y: {y}])");
var done = false;
foreach (var c in candidates)
{
if (chosenValueRange.Contains(c.Value))
{
chosenSet.Add(c.Coordinate);
done = true;
break;
}
}
if (!done)
{
var firstCandidate = candidates.First();
chosenSet.Add(firstCandidate.Coordinate);
chosenValueRange.Add(firstCandidate.Value);
}
}
return chosenSet;
}
}
This problem is NP-hard.
To show that, we have to take a known NP-hard problem, and reduce it to this one. Let's do that with the Set Cover Problem.
We start with a universe U of things, and a collection S of sets that covers the universe. Assign each thing a row, and each set a number; a row contains the numbers of the sets that contain its thing. This will fill different numbers of columns for each row. Fill in a rectangle by padding the shorter rows with new numbers.
Now solve your problem.
For each new number in your solution that didn't come from a set in the original problem, we can replace it with another number in the same row that did come from a set.
And now we turn numbers back into sets and we have a solution to the Set Cover Problem.
The transformations from set cover to your problem and back again are both O(number_of_elements * number_of_sets), which is polynomial in the input. And therefore your problem is NP-hard.
Conversely if you replace each number in the matrix with the set of rows covered, your problem turns into the Set Cover Problem. Using any existing solver for set cover then gives a reasonable approach for your problem as well.
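As one example of what such a solver might do, the standard greedy heuristic for set cover repeatedly picks the value that appears in the most rows not yet covered. The sketch below is an illustration, not an exact solver: it is not guaranteed to return a minimal set, and GreedyCover and Cover are names invented for it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GreedyCover
{
    // rows[i] holds the values available in row i. Returns values such that
    // every row contains at least one of them (greedy, so not always minimal).
    public static List<int> Cover(IReadOnlyList<IReadOnlyCollection<int>> rows)
    {
        // Map each value to the indices of the rows it appears in.
        var rowsByValue = new Dictionary<int, HashSet<int>>();
        for (int r = 0; r < rows.Count; r++)
        {
            foreach (int v in rows[r])
            {
                if (!rowsByValue.TryGetValue(v, out HashSet<int> set))
                    rowsByValue[v] = set = new HashSet<int>();
                set.Add(r);
            }
        }

        var uncovered = new HashSet<int>(Enumerable.Range(0, rows.Count));
        var chosen = new List<int>();
        while (uncovered.Count > 0)
        {
            // Pick the value that covers the most still-uncovered rows.
            int best = 0, bestGain = 0;
            foreach (KeyValuePair<int, HashSet<int>> pair in rowsByValue)
            {
                int gain = pair.Value.Count(r => uncovered.Contains(r));
                if (gain > bestGain) { best = pair.Key; bestGain = gain; }
            }
            if (bestGain == 0)
                throw new InvalidOperationException("A row has no values, so it cannot be covered.");
            chosen.Add(best);
            uncovered.ExceptWith(rowsByValue[best]);
        }
        return chosen;
    }
}
```

On the 5x5 example from the question, a single value (4 or 5) covers every row, and the greedy choice finds one of them in the first round.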
The code is not particularly tidy or optimised, but illustrates the approach I think @btilly is suggesting in his answer (E&OE) using a bit of recursion (I was going for intuitive rather than ideal for scaling, so you may have to work out an iterative equivalent).
From the rows with their values make a "values with the rows that they appear in" counterpart. Now pick a value, eliminate all rows in which it appears and solve again for the reduced set of rows. Repeat recursively, keeping only the shortest solutions.
I know this is not terribly readable (or well explained) and may come back to tidy up in the morning, so let me know if it does what you want (is worth a bit more of my time;-).
// Setup
var rowValues = new Dictionary<int, HashSet<int>>
{
[0] = new() { 0, 2, 3, 4, 5 },
[1] = new() { 1, 2, 4, 5, 6 },
[2] = new() { 1, 3, 4, 5, 6 },
[3] = new() { 2, 3, 4, 5, 6 },
[4] = new() { 1, 2, 3, 4, 5 }
};
Dictionary<int, HashSet<int>> ValueRows(Dictionary<int, HashSet<int>> rv)
{
var vr = new Dictionary<int, HashSet<int>>();
foreach (var row in rv.Keys)
{
foreach (var value in rv[row])
{
if (vr.ContainsKey(value))
{
if (!vr[value].Contains(row))
vr[value].Add(row);
}
else
{
vr.Add(value, new HashSet<int> { row });
}
}
}
return vr;
}
List<int> FindSolution(Dictionary<int, HashSet<int>> rAndV)
{
if (rAndV.Count == 0) return new List<int>();
var bestSolutionSoFar = new List<int>();
var vAndR = ValueRows(rAndV);
foreach (var v in vAndR.Keys)
{
var copyRemove = new Dictionary<int, HashSet<int>>(rAndV);
foreach (var r in vAndR[v])
copyRemove.Remove(r);
var solution = new List<int>{ v };
solution.AddRange(FindSolution(copyRemove));
if (bestSolutionSoFar.Count == 0 || solution.Count > 0 && solution.Count < bestSolutionSoFar.Count)
bestSolutionSoFar = solution;
}
return bestSolutionSoFar;
}
var solution = FindSolution(rowValues);
Console.WriteLine($"Optimal solution has values {{ {string.Join(',', solution)} }}");
Output: Optimal solution has values { 4 }

How to Order By or Sort an integer List and select the Nth element

I have a list, and I want to select the fifth highest element from it:
List<int> list = new List<int>();
list.Add(2);
list.Add(18);
list.Add(21);
list.Add(10);
list.Add(20);
list.Add(80);
list.Add(23);
list.Add(81);
list.Add(27);
list.Add(85);
But OrderbyDescending is not working for this int list...
int fifth = list.OrderByDescending(x => x).Skip(4).First();
Depending on how serious it is for the list to have fewer than 5 elements, you have 2 options.
If the list should never have fewer than 5 elements, I would catch that case as an exception:
int fifth;
try
{
fifth = list.OrderByDescending(x => x).ElementAt(4);
}
catch (ArgumentOutOfRangeException)
{
//Handle the exception
}
If you expect that it will sometimes have fewer than 5 elements, then you could let it fall back to the default value and check for that:
int fifth = list.OrderByDescending(x => x).ElementAtOrDefault(4);
if (fifth == 0)
{
//handle default
}
This is still somewhat flawed, because the fifth element could itself be 0. This can be solved by casting the list into a list of nullable ints before the LINQ query:
var newList = list.Select(i => (int?)i).ToList();
int? fifth = newList.OrderByDescending(x => x).ElementAtOrDefault(4);
if (fifth == null)
{
//handle default
}
Without LINQ expressions:
int result;
if(list != null && list.Count >= 5)
{
list.Sort();
result = list[list.Count - 5];
}
else // define behavior when list is null OR has less than 5 elements
This has a better performance compared to LINQ expressions, although the LINQ solutions presented in my second answer are comfortable and reliable.
In case you need extreme performance for a huge List of integers, I'd recommend a more specialized algorithm, like in Matthew Watson's answer.
Attention: The List gets modified when the Sort() method is called. If you don't want that, you must work with a copy of your list, like this:
List<int> copy = new List<int>(original);
or, using LINQ:
List<int> copy = original.ToList();
The easiest way to do this is to just sort the data and take N items from the front. This is the recommended way for small data sets - anything more complicated is just not worth it otherwise.
However, for large data sets it can be a lot quicker to do what's known as a Partial Sort.
There are two main ways to do this: Use a heap, or use a specialised quicksort.
The article I linked describes how to use a heap. I shall present a partial sort below:
public static IList<T> PartialSort<T>(IList<T> data, int k) where T : IComparable<T>
{
int start = 0;
int end = data.Count - 1;
while (end > start)
{
var index = partition(data, start, end);
var rank = index + 1;
if (rank >= k)
{
end = index - 1;
}
else if ((index - start) > (end - index))
{
quickSort(data, index + 1, end);
end = index - 1;
}
else
{
quickSort(data, start, index - 1);
start = index + 1;
}
}
return data;
}
static int partition<T>(IList<T> lst, int start, int end) where T : IComparable<T>
{
T x = lst[start];
int i = start;
for (int j = start + 1; j <= end; j++)
{
if (lst[j].CompareTo(x) < 0) // Or "> 0" to reverse sort order.
{
i = i + 1;
swap(lst, i, j);
}
}
swap(lst, start, i);
return i;
}
static void swap<T>(IList<T> lst, int p, int q)
{
T temp = lst[p];
lst[p] = lst[q];
lst[q] = temp;
}
static void quickSort<T>(IList<T> lst, int start, int end) where T : IComparable<T>
{
if (start >= end)
return;
int index = partition(lst, start, end);
quickSort(lst, start, index - 1);
quickSort(lst, index + 1, end);
}
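Then to access the 5th largest element in a list you could do this (with the comparison in partition changed to "> 0" so that the sort is descending; as written, the sort is ascending and list[4] is the 5th smallest):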
PartialSort(list, 5);
Console.WriteLine(list[4]);
For large data sets, a partial sort can be significantly faster than a full sort.
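The heap-based alternative mentioned above could look roughly like this. It is a sketch that assumes .NET 6 or later, where PriorityQueue<TElement, TPriority> is available; HeapSelect and KthLargest are names invented for the example.

```csharp
using System;
using System.Collections.Generic;

static class HeapSelect
{
    // Returns the k-th largest element (k = 1 is the maximum) in one pass,
    // without modifying or fully sorting the input.
    public static int KthLargest(IEnumerable<int> source, int k)
    {
        if (k < 1) throw new ArgumentOutOfRangeException(nameof(k));

        // Min-heap of the k largest values seen so far; its root is the smallest of them.
        var heap = new PriorityQueue<int, int>();
        foreach (int value in source)
        {
            if (heap.Count < k)
                heap.Enqueue(value, value);
            else if (value > heap.Peek())
                heap.EnqueueDequeue(value, value); // add value, drop the current smallest
        }

        if (heap.Count < k)
            throw new InvalidOperationException("The sequence has fewer than k elements.");
        return heap.Peek();
    }
}
```

For the list in the question, HeapSelect.KthLargest(list, 5) returns 23. The heap never holds more than k items, so the cost is roughly N log k comparisons and the original list is left in its original order.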
Addendum
See here for another (probably better) solution that uses a QuickSelect algorithm.
This LINQ approach retrieves the 5th biggest element OR throws an exception WHEN the list is null or contains less than 5 elements:
int fifth = list?.Count >= 5 ?
list.OrderByDescending(x => x).Take(5).Last() :
throw new Exception("list is null OR has not enough elements");
This one retrieves the 5th biggest element OR null WHEN the list is null or contains less than 5 elements:
int? fifth = list?.Count >= 5 ?
list.OrderByDescending(x => x).Take(5).Last() :
default(int?);
if(fifth == null) // define behavior
This one retrieves the 5th biggest element OR the smallest element WHEN the list contains less than 5 elements:
if(list == null || list.Count <= 0)
throw new Exception("Unable to retrieve Nth biggest element");
int fifth = list.OrderByDescending(x => x).Take(5).Last();
All these solutions are reliable, they should NEVER throw "unexpected" exceptions.
PS: I'm using .NET 4.7 in this answer.
Here there is a C# implementation of the QuickSelect algorithm to select the nth element in an unordered IList<>.
You have to put all the code contained in that page in a static class, like:
public static class QuickHelpers
{
// Put the code here
}
Given that "library" (in truth a big fat block of code), then you can:
int resA = list.QuickSelect(2, (x, y) => Comparer<int>.Default.Compare(y, x));
int resB = list.QuickSelect(list.Count - 1 - 2);
Now... Normally the QuickSelect would select the nth lowest element. We reverse it in two ways:
For resA we create a reverse comparer based on the default int comparer. We do this by reversing the parameters of the Compare method. Note that the index is 0-based, so there is a 0th, 1st, 2nd element and so on.
For resB we use the fact that the 0th element is the (list.Count - 1)th element in the reverse order. So we count from the back. The highest element would be at list.Count - 1 in an ordered list, the next one at list.Count - 1 - 1, then list.Count - 1 - 2 and so on.
Theoretically, using QuickSelect should be better than ordering the list and then picking the nth element, because ordering a list is on average an O(NlogN) operation and picking the nth element is then an O(1) operation, so the composite is an O(NlogN) operation, while QuickSelect is on average an O(N) operation. Clearly there is a but. The O notation doesn't show the k factor... So an O(k1 * NlogN) with a small k1 could be better than an O(k2 * N) with a big k2. Only multiple real-life benchmarks can tell us (you) what is better, and it depends on the size of the collection.
A small note about the algorithm:
As with quicksort, quickselect is generally implemented as an in-place algorithm, and beyond selecting the k'th element, it also partially sorts the data. See selection algorithm for further discussion of the connection with sorting.
So it modifies the ordering of the original list.
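For readers without access to the linked code, here is a minimal QuickSelect sketch. It is not the implementation referred to above: QuickSelectSketch and NthSmallest are names invented here, it works on a copy so the caller's list keeps its order, and it uses a fixed pivot, which degrades to quadratic time on already-sorted input.

```csharp
using System;
using System.Collections.Generic;

static class QuickSelectSketch
{
    // Returns the element that would sit at index n (0-based) if the data were
    // sorted ascending. Works on a copy, so the source is not reordered.
    public static T NthSmallest<T>(IEnumerable<T> source, int n) where T : IComparable<T>
    {
        var data = new List<T>(source);
        if (n < 0 || n >= data.Count) throw new ArgumentOutOfRangeException(nameof(n));

        int lo = 0, hi = data.Count - 1;
        while (true)
        {
            int p = Partition(data, lo, hi);
            if (p == n) return data[p];
            if (n < p) hi = p - 1; // target is left of the pivot
            else lo = p + 1;       // target is right of the pivot
        }
    }

    // Lomuto partition around data[hi]; returns the pivot's final index.
    private static int Partition<T>(List<T> data, int lo, int hi) where T : IComparable<T>
    {
        T pivot = data[hi];
        int store = lo;
        for (int i = lo; i < hi; i++)
        {
            if (data[i].CompareTo(pivot) < 0)
            {
                (data[store], data[i]) = (data[i], data[store]);
                store++;
            }
        }
        (data[store], data[hi]) = (data[hi], data[store]);
        return store;
    }
}
```

The fifth highest element of the question's list is then QuickSelectSketch.NthSmallest(list, list.Count - 5), which is 23.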

Avoiding 'System.OutOfMemoryException' while iterating over a IEnumerable<IEnumerable<object>>

I have the following code to get the cheapest List of objects which satisfy the requiredNumbers criteria. This list of objects can have a length varying from 1 to maxLength, i.e. there can be a combination of 1 to maxLength objects, with repetition allowed. Right now, this iterates over the whole list of combinations (IEnumerable of IEnumerable of OBJECT) fine up to maxLength = 9, and breaks after that with a "System.OutOfMemoryException" at
t1.Concat(new OBJECT[] { t2 }
I tried another approach to solve this (mentioned in the code comments), but that seems to have its own demons. What I understand right now is that I'll have to somehow know the least priced combination of objects without iterating over the whole list of combinations, which doesn't seem feasible to me.
Could someone suggest any changes that let maxLength be higher (much higher, ideally) without hindering the performance? Any help is much appreciated. Please let me know if I am not clear.
private static int leastPrice = int.MaxValue;
private IEnumerable<IEnumerable<OBJECT>> CombinationOfObjects(IEnumerable<OBJECT> objects, int length)
{
if (length == 1)
return objects.Select(t => new OBJECT[] { t });
return CombinationOfObjects(objects, length - 1).SelectMany(t => objects, (t1, t2) => t1.Concat(new OBJECT[] { t2 }));
}
//Gets the least priced Valid combination out of all possible
public IEnumerable<OBJECT> GetValidCombination(IEnumerable<OBJECT> list, int maxLength, int[] matArray)
{
IEnumerable<IEnumerable<OBJECT>> tempList = null;
List<IEnumerable<OBJECT>> validList = new List<IEnumerable<OBJECT>>();
for (int i = 1; i <= maxLength; i++)
{
tempList = CombinationOfObjects(list, i);
tempList = from alist in tempList
orderby alist.Sum(x => x.Price)
select alist;
foreach (var lst in tempList)
{
//This check will not be required if the least priced value is returned as soon as found
int price = lst.Sum(c => c.Price);
if (price < leastPrice)
{
if (CheckMaterialSum(lst, matArray))
{
validList.Add(lst);
leastPrice = price;
break;
//return lst;
//returning lst as soon as valid combo is found is fastest
//Con being it also returns the least priced least item containing combo
//i.e. even if a 4 item combo is cheaper than the 2 item combo satisfying the need,
//it'll never even check for the 4 item combo
}
}
}
}
//This whole thing would go too if lst was returned earlier
foreach (IEnumerable<OBJECT> combination in validList)
{
int priceTotal = combination.Sum(combo => combo.Price);
if (priceTotal == leastPrice)
{
return combination;
}
}
return new List<OBJECT>();
}
//Checks if the given combination satisfies the requirement
private bool CheckMaterialSum(IEnumerable<OBJECT> combination, int[] matArray)
{
int[] sumMatProp = new int[matArray.Count()];
for (int i = 0; i < matArray.Count(); i++)
{
sumMatProp[i] = combination.Sum(combo => combo.Numbers[i]);
}
bool isCombinationValid = matArray.Zip(sumMatProp, (requirement, c) => c >= requirement).All(comboValid => comboValid);
return isCombinationValid;
}
static void Main(string[] args)
{
List<OBJECT> testList = new List<OBJECT>();
OBJECT object1 = new OBJECT();
object1.Name = "object1";
object1.Price = 2000;
object1.Numbers = new int[] { 2, 3, 4 };
testList.Add(object1);
OBJECT object2 = new OBJECT();
object2.Name = "object2";
object2.Price = 1900;
object2.Numbers = new int[] { 3, 2, 4 };
testList.Add(object2);
OBJECT object3 = new OBJECT();
object3.Name = "object3";
object3.Price = 1600;
object3.Numbers = new int[] { 4, 3, 2 };
testList.Add(object3);
int[] requiredNumbers = new int[] { 10, 10, 10 };
int maxLength = 9; // This is the max length possible; an OutOfMemoryException is thrown beyond this
IEnumerable<OBJECT> resultCombination = GetValidCombination(testList, maxLength, requiredNumbers);
}
EDIT
Requirement:
I have a number of objects, each with several properties, namely Price, Name, and Materials. I need to find a combination of these objects such that the sum of each material across the combination satisfies the quantity of materials entered by the user. The combination must also have the lowest possible price.
There is a constraint of maxLength and it sets the maximum total number of objects that can be in a combination, i.e. for a maxLength = 8, the combination may contain anywhere from 1 to 8 objects.
Approaches tried:
1.
-I find all combinations of objects possible (valid + invalid)
-Iterate over them to find the least priced combination. This goes out of memory while iterating.
2.
-I find all combinations possible (valid + invalid)
-Apply a validity check (i.e if it fulfills the user requirement)
-Add only valid combinations in a List of List
-Iterate over this valid List of lists to find the cheapest list and return that. Also goes out of memory
3.
-I find combinations in increasing order of objects (i.e. first all combinations having 1 object, then 2 then so on...)
-Sort the combinations according to price
-Apply validity check and return the first valid combination
-Now this works fine performance wise, but does not always return the cheapest possible combination.
If I could somehow get the optimal solution without iterating over the whole list, that would solve it. But everything I've tried either has to iterate over all combinations or does not produce the optimal solution.
Any help regarding even some other approach that I can't seem to think of is most welcome.

Variable number of for loops without recursion but with Stack?

I know the usual approach for a "variable number of for loops" is to use a recursive method. But I wonder if I could solve it without recursion by using a Stack instead, since recursion can be replaced with an explicit stack.
My example:
I have a variable number of collections and I need to combine every item of every collection with every other item of the other collections.
// example for collections A, B and C:
A (4 items) + B (8 items) + C (10 items)
4 * 8 * 10 = 320 combinations
I need to run through all those 320 combinations. Yet at compile time I don't know whether B, C, or D exist. What would a solution look like that uses no recursive method but an instance of Stack instead?
Edit:
I realized Stack is not necessary here at all, since you can avoid recursion with a simple int array and a few while loops. Thanks for the help and info.
Not with a stack but without recursion.
void Main()
{
var l = new List<List<int>>()
{
new List<int>(){ 1,2,3 },
new List<int>(){ 4,5,6 },
new List<int>(){ 7,8,9 }
};
var result = CartesianProduct(l);
}
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>()};
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] {item})
);
}
Function taken from "Computing a Cartesian Product with LINQ".
Here is an example of how to do this. The algorithm is taken from this question - https://stackoverflow.com/a/2419399/5311735 and converted to C#. Note that it can be made more efficient, but I converted the inefficient version to C# because it better illustrates the concept (you can see a more efficient version in the linked question):
static IEnumerable<T[]> CartesianProduct<T>(IList<IList<T>> collections) {
// this contains the indexes of elements from each collection to combine next
var indexes = new int[collections.Count];
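// nothing to yield if there are no collections or if any collection is empty
bool done = collections.Count == 0;
foreach (var c in collections)
if (c.Count == 0) done = true;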
while (!done) {
// initialize array for next combination
var nextProduct = new T[collections.Count];
// fill it
for (int i = 0; i < collections.Count; i++) {
var collection = collections[i];
nextProduct[i] = collection[indexes[i]];
}
yield return nextProduct;
// now we need to calculate indexes for the next combination
// for that, increase last index by one, until it becomes equal to the length of last collection
// then increase second last index by one until it becomes equal to the length of second last collection
// and so on - basically the same how you would do with regular numbers - 09 + 1 = 10, 099 + 1 = 100 and so on.
var j = collections.Count - 1;
while (true) {
indexes[j]++;
if (indexes[j] < collections[j].Count) {
break;
}
indexes[j] = 0;
j--;
if (j < 0) {
done = true;
break;
}
}
}
}
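Since the question asked specifically about a Stack, here is a minimal sketch of that variant, for comparison with the index-array version above. The names `StackProduct`, `Cartesian` and `Demo` are made up for this illustration, and the sketch favours clarity over speed: every stack entry is a partial combination that gets copied whenever it is extended.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StackProduct
{
    // Depth-first walk over all choices, driven by an explicit Stack
    // instead of the call stack. Each stack entry is a partial combination.
    public static IEnumerable<T[]> Cartesian<T>(IReadOnlyList<IReadOnlyList<T>> collections)
    {
        var pending = new Stack<List<T>>();
        pending.Push(new List<T>());

        while (pending.Count > 0)
        {
            List<T> partial = pending.Pop();

            // One item has been picked from every collection: emit it.
            if (partial.Count == collections.Count)
            {
                yield return partial.ToArray();
                continue;
            }

            // Extend the partial combination with each item of the next collection.
            // Pushing in reverse makes the items pop in their original order.
            IReadOnlyList<T> next = collections[partial.Count];
            for (int i = next.Count - 1; i >= 0; i--)
            {
                pending.Push(new List<T>(partial) { next[i] });
            }
        }
    }

    // Hypothetical usage with the sizes from the question: 4 * 8 * 10 combinations.
    public static int Demo()
    {
        var sets = new List<List<int>>
        {
            Enumerable.Range(0, 4).ToList(),
            Enumerable.Range(0, 8).ToList(),
            Enumerable.Range(0, 10).ToList()
        };
        return Cartesian<int>(sets).Count();
    }
}
```

The stack never holds more than roughly the sum of the collection sizes, because only one branch is expanded at a time. If any collection is empty, nothing is yielded.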

Thoughts on foreach with Enumerable.Range vs traditional for loop

In C# 3.0, I'm liking this style:
// Write the numbers 1 thru 7
foreach (int index in Enumerable.Range( 1, 7 ))
{
Console.WriteLine(index);
}
over the traditional for loop:
// Write the numbers 1 thru 7
for (int index = 1; index <= 7; index++)
{
Console.WriteLine( index );
}
Assuming 'n' is small so performance is not an issue, does anyone object to the new style over the traditional style?
I find the latter's "minimum-to-maximum" format a lot clearer than Range's "minimum-count" style for this purpose. Also, I don't think it's really a good practice to make a change like this from the norm that is not faster, not shorter, not more familiar, and not obviously clearer.
That said, I'm not against the idea in general. If you came up to me with syntax that looked something like foreach (int x from 1 to 8) then I'd probably agree that that would be an improvement over a for loop. However, Enumerable.Range is pretty clunky.
This is just for fun. (I'd just use the standard "for (int i = 1; i <= 10; i++)" loop format myself.)
foreach (int i in 1.To(10))
{
Console.WriteLine(i); // 1,2,3,4,5,6,7,8,9,10
}
// ...
public static IEnumerable<int> To(this int from, int to)
{
if (from < to)
{
while (from <= to)
{
yield return from++;
}
}
else
{
while (from >= to)
{
yield return from--;
}
}
}
You could also add a Step extension method too:
foreach (int i in 5.To(-9).Step(2))
{
Console.WriteLine(i); // 5,3,1,-1,-3,-5,-7,-9
}
// ...
public static IEnumerable<T> Step<T>(this IEnumerable<T> source, int step)
{
if (step == 0)
{
throw new ArgumentOutOfRangeException("step", "Param cannot be zero.");
}
return source.Where((x, i) => (i % step) == 0);
}
In C# 6.0 with the use of
using static System.Linq.Enumerable;
you can simplify it to
foreach (var index in Range(1, 7))
{
Console.WriteLine(index);
}
You can actually do this in C# (by providing To and Do as extension methods on int and IEnumerable<T> respectively):
1.To(7).Do(Console.WriteLine);
Smalltalk forever!
I kind of like the idea. It's very much like Python. Here's my version in a few lines:
static class Extensions
{
public static IEnumerable<int> To(this int from, int to, int step = 1) {
if (step == 0)
throw new ArgumentOutOfRangeException("step", "step cannot be zero");
// stop if next `step` reaches or oversteps `to`, in either +/- direction
while (!(step > 0 ^ from < to) && from != to) {
yield return from;
from += step;
}
}
}
It works like Python's:
0.To(4) → [ 0, 1, 2, 3 ]
4.To(0, -1) → [ 4, 3, 2, 1 ]
4.To(4) → [ ]
7.To(-3, -3) → [ 7, 4, 1, -2 ]
I think foreach + Enumerable.Range is less error-prone (you have less control and fewer ways to get it wrong, such as decreasing the index inside the body so that the loop never ends).
The readability problem is about the semantics of the Range function, which can differ from one language to another (e.g. if given just one parameter, does it begin from 0 or 1? Is the end included or excluded? Is the second parameter a count instead of an end value?).
About the performance, I think the compiler should be smart enough to optimize both loops so they execute at a similar speed, even with large ranges (I suppose that Range does not create a collection, but of course an iterator).
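The supposition that Range produces an iterator rather than a collection is easy to check. The sketch below (the class and method names are made up for this illustration) takes three values from a range of `int.MaxValue` elements; if Range built the whole collection up front, this would need gigabytes of memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyRangeDemo
{
    // Values are produced on demand, so only the three requested
    // elements are ever generated.
    public static List<int> FirstThreeOfHugeRange()
    {
        return Enumerable.Range(1, int.MaxValue).Take(3).ToList();
    }
}
```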
I think Range is useful for working with some range inline:
var squares = Enumerable.Range(1, 7).Select(i => i * i);
You can also call ForEach over it. This requires converting to a list, but keeps things compact when that's what you want.
Enumerable.Range(1, 7).ToList().ForEach(i => Console.WriteLine(i));
But other than for something like this, I'd use traditional for loop.
It seems like quite a long-winded approach to a problem that's already solved. There's a whole state machine behind Enumerable.Range that isn't really needed.
The traditional format is fundamental to development and familiar to all. I don't really see any advantage to your new style.
I'd like to have the syntax of some other languages like Python, Haskell, etc.
// Write the numbers 1 thru 7
foreach (int index in [1..7])
{
Console.WriteLine(index);
}
Fortunately, we've got F# now :)
As for C#, I'll have to stick with the Enumerable.Range method.
@Luke:
I reimplemented your To() extension method and used the Enumerable.Range() method to do it.
This way it comes out a little shorter and uses as much infrastructure given to us by .NET as possible:
public static IEnumerable<int> To(this int from, int to)
{
return from < to
? Enumerable.Range(from, to - from + 1)
: Enumerable.Range(to, from - to + 1).Reverse();
}
How to use a new syntax today
Because of this question I tried out some things to come up with a nice syntax without waiting for first-class language support. Here's what I have:
using static Enumerizer;
// prints: 0 1 2 3 4 5 6 7 8 9
foreach (int i in 0 <= i < 10)
Console.Write(i + " ");
Note the difference between <= and <.
I also created a proof of concept repository on GitHub with even more functionality (reversed iteration, custom step size).
A minimal and very limited implementation of the above loop would look something like this:
public readonly struct Enumerizer
{
public static readonly Enumerizer i = default;
public Enumerizer(int start) =>
Start = start;
public readonly int Start;
public static Enumerizer operator <=(int start, Enumerizer _) =>
new Enumerizer(start);
public static Enumerizer operator >=(int _, Enumerizer __) =>
throw new NotImplementedException();
// the upper bound is exclusive, matching the `<` in `0 <= i < 10`
public static IEnumerable<int> operator <(Enumerizer start, int end)
{
for (int i = start.Start; i < end; i++)
yield return i;
}
public static IEnumerable<int> operator >(Enumerizer _, int __) =>
throw new NotImplementedException();
}
There is no significant performance difference between traditional iteration and range iteration, as Nick Chapsas pointed out in his excellent YouTube video. His benchmark showed a difference of only a few nanoseconds for a small number of iterations, and as the loop gets bigger the difference almost disappears.
Here is an elegant way of iterating in a range loop from his content:
private static void Test()
{
foreach (var i in 1..5)
{
}
}
Using this extension:
public static class Extension
{
public static CustomIntEnumerator GetEnumerator(this Range range)
{
return new CustomIntEnumerator(range);
}
public static CustomIntEnumerator GetEnumerator(this int number)
{
return new CustomIntEnumerator(new Range(0, number));
}
}
public ref struct CustomIntEnumerator
{
private int _current;
private readonly int _end;
public CustomIntEnumerator(Range range)
{
if (range.End.IsFromEnd)
{
throw new NotSupportedException();
}
_current = range.Start.Value - 1;
_end = range.End.Value;
}
public int Current => _current;
public bool MoveNext()
{
_current++;
return _current <= _end;
}
}
Benchmark result:
I love this implementation. Its biggest drawback is that it cannot be used inside an async method.
I'm sure everybody has their personal preferences (many would prefer the latter just because it is familiar across almost all programming languages), but I am like you and am starting to like the foreach more and more, especially now that you can define a range.
In my opinion the Enumerable.Range() way is more declarative. New and unfamiliar to people? Certainly. But I think this declarative approach yields the same benefits as most other LINQ-related language features.
I imagine there could be scenarios where Enumerable.Range(index, count) is clearer when dealing with expressions for the parameters, especially if some of the values in that expression are altered within the loop. In the case of for the expression would be evaluated based on the state after the current iteration, whereas Enumerable.Range() is evaluated up-front.
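A small sketch of that difference (the class and method names are made up for this illustration): both loops start with a bound of 5 and shrink it to 3 during the first iteration. The for loop re-reads the bound before every iteration, while the arguments to Enumerable.Range are evaluated once, before the loop starts.

```csharp
using System;
using System.Linq;

public static class BoundEvaluationDemo
{
    // The condition `i < limit` is re-evaluated before each iteration,
    // so shrinking `limit` inside the body shortens the loop.
    public static int CountWithFor()
    {
        int limit = 5, iterations = 0;
        for (int i = 0; i < limit; i++)
        {
            if (i == 0) limit = 3;
            iterations++;
        }
        return iterations; // 3
    }

    // `limit` is read once, when Range is called, so changing it
    // inside the body has no effect on the number of iterations.
    public static int CountWithRange()
    {
        int limit = 5, iterations = 0;
        foreach (int i in Enumerable.Range(0, limit))
        {
            if (i == 0) limit = 3;
            iterations++;
        }
        return iterations; // 5
    }
}
```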
Other than that, I'd agree that sticking with for would normally be better (more familiar/readable to more people... readability is a very important value in code that needs to be maintained).
I agree that in many (or even most) cases foreach is much more readable than a standard for loop when simply iterating over a collection. However, your choice of using Enumerable.Range(index, count) isn't a strong example of the value of foreach over for.
For a simple range starting from 1, Enumerable.Range(index, count) looks quite readable. However, if the range starts with a different index, it becomes less readable because you have to properly perform index + count - 1 to determine what the last element will be. For example…
// Write the numbers 2 thru 8
foreach (var index in Enumerable.Range( 2, 7 ))
{
Console.WriteLine(index);
}
In this case, I much prefer the second example.
// Write the numbers 2 thru 8
for (int index = 2; index <= 8; index++)
{
Console.WriteLine(index);
}
Strictly speaking, you misuse enumeration.
Enumerator provides the means to access all the objects in a container one-by-one, but it does not guarantee the order.
It is OK to use enumeration to find the biggest number in an array. If you are using it to find, say, first non-zero element, you are relying on the implementation detail you should not know about. In your example, the order seems to be important to you.
Edit: I am wrong. As Luke pointed out (see comments), it is safe to rely on the order when enumerating an array in C#. This is different from, for example, using "for in" to enumerate an array in JavaScript.
I do like the foreach + Enumerable.Range approach and use it sometimes.
// does anyone object to the new style over the traditional style?
foreach (var index in Enumerable.Range(1, 7))
I object to the var abuse in your proposal. I appreciate var, but, damn, just write int in this case! ;-)
Just throwing my hat into the ring.
I define this...
namespace CustomRanges {
public record IntRange(int From, int Thru, int step = 1) : IEnumerable<int> {
public IEnumerator<int> GetEnumerator() {
for (var i = From; i <= Thru; i += step)
yield return i;
}
IEnumerator IEnumerable.GetEnumerator()
=> GetEnumerator();
};
public static class Definitions {
public static IntRange FromTo(int from, int to, int step = 1)
=> new IntRange(from, to - 1, step);
public static IntRange FromThru(int from, int thru, int step = 1)
=> new IntRange(from, thru, step);
public static IntRange CountFrom(int from, int count)
=> new IntRange(from, from + count - 1);
public static IntRange Count(int count)
=> new IntRange(0, count - 1);
// Add more to suit your needs. For instance, you could add in reversing ranges, etc.
}
}
Then anywhere I want to use it, I add this at the top of the file...
using static CustomRanges.Definitions;
And use it like this...
foreach(var index in FromTo(1, 4))
Debug.WriteLine(index);
// Prints 1, 2, 3
foreach(var index in FromThru(1, 4))
Debug.WriteLine(index);
// Prints 1, 2, 3, 4
foreach(var index in FromThru(2, 10, 2))
Debug.WriteLine(index);
// Prints 2, 4, 6, 8, 10
foreach(var index in CountFrom(7, 4))
Debug.WriteLine(index);
// Prints 7, 8, 9, 10
foreach(var index in Count(5))
Debug.WriteLine(index);
// Prints 0, 1, 2, 3, 4
foreach(var _ in Count(4))
Debug.WriteLine("A");
// Prints A, A, A, A
The nice thing about this approach is that the names tell you exactly whether the end is included or not.
