How to generate powersets with a maximum size in C#?

I am trying to generate all subsets of a given list, limited to a given maximum size. I've found some great answers on how to generate power sets in general, and I admire the bitmap-based solutions found in "All Possible Combinations of a list of Values" and "Computing Powersets in C#".
Is there a way to generate only the sets with at most 'maxSize' elements? E.g. my input is {1, 2, 3, 4, 5, 6}, but I only want results with 3 or fewer items. Is it possible to do this in a single step? I have found a solution where I iterate over all items of the full result, but this is quite inefficient for large inputs with a small maxSize.

It's easy with recursion:
public static IEnumerable<List<int>> FindAllCombos(List<int> list, int maxSize, int minIndex = 0)
{
    yield return new List<int>();
    if (maxSize > 0)
        for (int i = minIndex; i < list.Count; i++)
            foreach (var set in FindAllCombos(list, maxSize - 1, i + 1))
            {
                set.Add(list[i]);
                yield return set;
            }
}
Note that the elements of each output set will be in reverse order relative to the input list.
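If the reversed element order is a problem, the same depth-first idea can be written so that each subset is built front to back. The following is only a sketch, written as a C# 9+ top-level program; BoundedSubsets and Walk are hypothetical names that do not appear in the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Yields every subset of `items` with at most `maxSize` elements,
// keeping the elements of each subset in input order.
static IEnumerable<List<T>> BoundedSubsets<T>(IReadOnlyList<T> items, int maxSize)
{
    var current = new List<T>();
    return Walk(0);

    IEnumerable<List<T>> Walk(int start)
    {
        // Hand out a snapshot so callers can keep the list they receive.
        yield return new List<T>(current);
        if (current.Count >= maxSize) yield break;
        for (int i = start; i < items.Count; i++)
        {
            current.Add(items[i]);
            foreach (var subset in Walk(i + 1)) yield return subset;
            current.RemoveAt(current.Count - 1);
        }
    }
}

var input = new List<int> { 1, 2, 3, 4, 5, 6 };
var subsets = BoundedSubsets(input, 3).ToList();
Console.WriteLine(subsets.Count); // 1 + 6 + 15 + 20 = 42
```

Because the recursion stops as soon as a subset reaches maxSize, larger subsets are never constructed, which is the saving the question asks about.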

Related

Subset pattern implementation

I am trying to write a C# implementation of the Subsets pattern described in "14 Patterns to Ace Any Coding Interview Question":
It looks obvious but confuses me. My research suggests it should be implemented with jagged arrays (not multidimensional arrays). I started:
int[] input = { 1, 5, 3 };
int[][] set = new int[4][];
// ...
Could someone help with steps 2, 3 and 4?

The instructions provided seem to lend themselves more to a C++ style than a C# style. I believe there are better ways than manually building arrays to get a list of subsets in C#. That said, here's how I would go about implementing the instructions as they are written.
To avoid having to repeatedly grow the array of subsets, we should calculate its length before we allocate it.
Assuming n elements in the input, we can determine the number of possible subsets by adding:
All subsets with 0 elements (the empty set)
All subsets with 1 element
All subsets with 2 elements
...
All subsets with n-1 elements
All subsets with n elements (the set itself)
Mathematically, this is the summation of the binomial coefficient. We take the sum from 0 to n of n choose k which evaluates to 2^n.
The jagged array should then contain 2^n arrays whose length will vary from 0 to n.
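For reference, the sum described above can be written out as follows. This is just the binomial theorem with both terms set to 1, shown with the n = 3 case that matches the three-element input used in this answer.

```latex
\sum_{k=0}^{n} \binom{n}{k} = (1 + 1)^n = 2^n,
\qquad \text{e.g. } n = 3:\quad
\binom{3}{0} + \binom{3}{1} + \binom{3}{2} + \binom{3}{3} = 1 + 3 + 3 + 1 = 8
```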
var input = new int[] { 1, 3, 5 };
var numberOfSubsets = (int)Math.Pow(2, input.Length);
var subsets = new int[numberOfSubsets][];
As the instructions in your article state, we start by adding the empty set to our list of subsets.
int nextEmptyIndex = 0;
subsets[nextEmptyIndex++] = new int[0];
Then, for each element in our input, we record the end of the existing subsets (so we don't end up in an infinite loop chasing the new subsets we will be adding) and add the new subset(s).
foreach (int element in input)
{
    int stopIndex = nextEmptyIndex - 1;
    // Build a new subset by adding the new element
    // to the end of each existing subset.
    for (int i = 0; i <= stopIndex; i++)
    {
        int newSubsetLength = subsets[i].Length + 1;
        int newSubsetIndex = nextEmptyIndex++;
        // Allocate the new subset array.
        subsets[newSubsetIndex] = new int[newSubsetLength];
        // Copy the elements from the existing subset.
        Array.Copy(subsets[i], subsets[newSubsetIndex], subsets[i].Length);
        // Add the new element at the end of the new subset.
        subsets[newSubsetIndex][newSubsetLength - 1] = element;
    }
}
With some logging at the end, we can see our result:
for (int i = 0; i < subsets.Length; i++)
{
    Console.WriteLine($"subsets[{ i }] = { string.Join(", ", subsets[i]) }");
}
subsets[0] =
subsets[1] = 1
subsets[2] = 3
subsets[3] = 1, 3
subsets[4] = 5
subsets[5] = 1, 5
subsets[6] = 3, 5
subsets[7] = 1, 3, 5

What I find easiest is translating the problem from a word problem into a more logical one.
Start with an empty set : [[]]
So the trick here is that this word problem tells you to create an empty set but immediately shows you a set that contains an element.
If we break this down into arrays instead (because I personally find it more intuitive), we can translate it to:
Start with an array of arrays whose first element is an empty array (instead of null).
So basically
int[][] result = new int[][] { new int[0] };
Now we have somewhere to start from, we can start to translate the other parts of the word problem.
Add the First Number (1) to all existing subsets to create subsets: [[],[1]]
Add the Second Number (5) to all existing subsets ...
Add the Third Number (3) to all existing subsets ...
There's a lot of information here. Let's translate different parts
Add the 1st Number ...
Add the 2nd Number ...
Add the nth Number ...
The repetition of these instructions and the fact that each number 1, 5, 3 matches our starting set of {1, 5, 3} tells us we should use a loop of some kind to build our result.
for(int i = 0; i < set.Length; i++)
{
    int number = set[i];
    // add subsets somehow
}
Add the number to all existing subsets to create subsets: [[],[1]]
A couple of things stand out here. Notice they use the word "Add" but give an example where the number wasn't added to one of the existing subsets: [[]] turned into [[],[1]]. Why is one of them still empty if we added 1 to all of them?
The reason is that when we create the new subsets and all their variations, we want to keep the old ones. So we do add the 1 to [] (the first element), but we make a copy of [] first. That way, when we add 1 to that copy, we still have the original [] and a brand new [1], and we can combine them to create [[],[1]].
Using these clues we can work out that "Add the number to all existing subsets" actually means: make copies of all existing subsets, add the number to each of the copies, then add those copies at the end of the result array.
int[][] result = new int[][] { new int[0] };
// Enumerable.Append (System.Linq) returns a new sequence, so result[0] stays empty
int[] copy = result[0].Append(1).ToArray();
result = result.Append(copy).ToArray();
// result: [[],[1]]
Let's put those pieces together into the final solution, hopefully!
Here's an example that I threw together that works (at least according to your example data).
object[] set = { 1, 5, 3 };
// start with an empty outer array: []
object[][] result = Array.Empty<object[]>();
// add an empty subset to it, creating [[]]
Append(ref result, Array.Empty<object>());

// create a method so we can add things to the end of an array
void Append<T>(ref T[] array, T itemToAdd)
{
    int size = array.Length;
    // Array.Resize allocates a new array and copies the old elements into it
    Array.Resize(ref array, size + 1);
    array[size] = itemToAdd;
}

// create a method that goes through all the existing subsets and copies them, adds the item, and adds those copies to the result array
void AddSubsets(object item)
{
    // store the length of the result because if we don't we will infinitely expand (because we resize the array)
    int currentLength = result.Length;
    for (int i = 0; i < currentLength; i++)
    {
        // take the existing subset; Append resizes into a new array, so the original is not changed
        object[] previousItemArray = result[i]; // []
        // add the item to it
        Append(ref previousItemArray, item); // [1]
        // add that copy to the results
        Append(ref result, previousItemArray); // [[]] -> [[],[1]]
    }
}

// Loop over the set and add the subsets to the result
for (int i = 0; i < set.Length; i++)
{
    object item = set[i];
    AddSubsets(item);
}

Is there a C# equivalent to C++ std::partial_sort?

I'm trying to implement a paging algorithm for a dataset sortable via many criteria. Unfortunately, while some of those criteria can be implemented at the database level, some must be done at the app level (we have to integrate with another data source). We have a paging (actually infinite scroll) requirement and are looking for a way to minimize the pain of sorting the entire dataset at the app level with every paging call.
What is the best way to do a partial sort, only sorting the part of the list that absolutely needs to be sorted? Is there an equivalent to C++'s std::partial_sort function available in the .NET libraries? How should I go about solving this problem?
EDIT: Here's an example of what I'm going for:
Let's say I need to get elements 21-40 of a 1000 element set, according to some sorting criteria. In order to speed up the sort, and since I have to go through the whole dataset every time anyway (this is a web service over HTTP, which is stateless), I don't need the whole dataset ordered. I only need elements 21-40 to be correctly ordered. It is sufficient to create 3 partitions: Elements 1-20, unsorted (but all less than element 21); elements 21-40, sorted; and elements 41-1000, unsorted (but all greater than element 40).

OK. Here's what I would try based on what you said in reply to my comment.
I want to be able to say "4th through 6th" and get something like: 3, 2, 1 (unsorted, but all less than proper 4th element); 4, 5, 6 (sorted and in the same place they would be for a sorted list); 8, 7, 9 (unsorted, but all greater than proper 6th element).
Let's add 10 to our list to make it easier: 10, 9, 8, 7, 6, 5, 4, 3, 2, 1.
So, what you could do is use the quickselect algorithm to find the ith and kth elements. In your case above i is 4 and k is 6. That will of course return the values 4 and 6. That's going to take two passes through your list, so the runtime so far is O(2n) = O(n) on average. The next part is easy, of course. We have lower and upper bounds on the data we care about. All we need to do is make another pass through our list looking for any element that is between our lower and upper bounds. If we find such an element, we throw it into a new List. Finally, we sort that List, which contains only the ith through kth elements that we care about.
So, I believe the total runtime ends up being O(N) + O((k-i)lg(k-i))
// Requires: using System; using System.Collections.Generic; using System.Diagnostics; using System.Linq;
static void Main(string[] args) {
    //create a list of 10 million items that are randomly ordered
    var list = Enumerable.Range(1, 10000000).OrderBy(x => Guid.NewGuid()).ToList();
    var sw = Stopwatch.StartNew();
    var slowOrder = list.OrderBy(x => x).Skip(10).Take(10).ToList();
    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds);
    //Took ~8 seconds on my machine
    sw.Restart();
    var smallVal = Quickselect(list, 11);
    var largeVal = Quickselect(list, 20);
    // ToList() forces the lazy query to run before the timer is read
    var elements = list.Where(el => el >= smallVal && el <= largeVal).OrderBy(el => el).ToList();
    Console.WriteLine(sw.ElapsedMilliseconds);
    //Took ~1 second on my machine
}

// Returns the k-th smallest element (k is 1-based).
public static T Quickselect<T>(IList<T> list, int k) where T : IComparable<T> {
    Random rand = new Random();
    int r = rand.Next(0, list.Count);
    T pivot = list[r];
    List<T> smaller = new List<T>();
    List<T> larger = new List<T>();
    foreach (T element in list) {
        // CompareTo only guarantees the sign of its result, not exactly -1 or 1
        var comparison = element.CompareTo(pivot);
        if (comparison < 0) {
            smaller.Add(element);
        }
        else if (comparison > 0) {
            larger.Add(element);
        }
    }
    if (k <= smaller.Count) {
        return Quickselect(smaller, k);
    }
    else if (k > list.Count - larger.Count) {
        return Quickselect(larger, k - (list.Count - larger.Count));
    }
    else {
        return pivot;
    }
}

You can use List<T>.Sort(int, int, IComparer<T>):
inputList.Sort(startIndex, count, Comparer<T>.Default);

Array.Sort() has an overload that accepts index and length arguments that lets you sort a subset of an array. The same exists for List.
You cannot sort an IEnumerable directly, of course.
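As a small illustration of the range overload mentioned above, here is a sketch using made-up sample data, written as a C# 9+ top-level program. Note that it only orders the elements already inside the given range; it does not first move the smallest elements into that range the way std::partial_sort does.

```csharp
using System;

int[] data = { 9, 7, 5, 3, 1, 8, 6 };
// Sort only the 3 elements starting at index 2 (the values 5, 3, 1).
Array.Sort(data, 2, 3);
Console.WriteLine(string.Join(", ", data)); // 9, 7, 1, 3, 5, 8, 6
```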

Top 5 values from three given arrays

Recently I came across this question in C#:
There are three int arrays:
Array1={88,65,09,888,87}
Array2={1,49,921,13,33}
Array3={22,44,66,88,110}
Now I have to get an array of the highest 5 values from all three arrays. What is the most optimized way of doing this in C#?
The way I can think of is to take an array of size 15, add the elements of all three arrays, sort it, and take the last 5.

An easy way with LINQ:
int[] top5 = array1.Concat(array2).Concat(array3).OrderByDescending(i => i).Take(5).ToArray();
An optimal way:
List<int> highests = new List<int>(); // Keep the current top 5 sorted
// Traverse each array. No need to put them together in an int[][]..it's just for simplicity
foreach (int[] array in new int[][] { array1, array2, array3 }) {
    foreach (int i in array) {
        int index = highests.BinarySearch(i); // where should i be?
        if (highests.Count < 5) { // if not 5 yet, add anyway
            if (index < 0) {
                highests.Insert(~index, i);
            } else { //add (duplicate)
                highests.Insert(index, i);
            }
        }
        else if (index < 0) { // not in top-5 yet, add
            highests.Insert(~index, i);
            highests.RemoveAt(0);
        } else if (index > 0) { // already in top-5, add (duplicate)
            highests.Insert(index, i);
            highests.RemoveAt(0);
        }
    }
}
Keep a sorted list of the top-5 and traverse each array just once.
You may even check the lowest of the top-5 each time, avoiding the BinarySearch for values that cannot make the top 5:
List<int> highests = new List<int>();
foreach (int[] array in new int[][] { array1, array2, array3 }) {
    foreach (int i in array) {
        if (highests.Count < 5) { // if not 5 yet, add anyway
            int index = highests.BinarySearch(i);
            highests.Insert(index < 0 ? ~index : index, i);
        } else if (highests[0] < i) { // only search if larger than lowest top-5
            int index = highests.BinarySearch(i);
            // a negative index means "not found"; ~index is the insertion point
            highests.Insert(index < 0 ? ~index : index, i);
            highests.RemoveAt(0);
        }
    }
}

The most optimized way for a fixed K=5 is going through all arrays five times, picking the highest element not taken so far on each pass. You need to mark the element that you take in order to skip it on subsequent passes. This has the complexity of O(N1+N2+N3) (you go through all N1+N2+N3 elements five times), which is asymptotically as fast as it can get.
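The repeated-pass idea can be sketched roughly as follows, written as a C# 9+ top-level program. TopKByRepeatedScan and the `taken` marker arrays are hypothetical names chosen for this illustration; the sample data is taken from the question.

```csharp
using System;
using System.Linq;

// Returns the k largest values by making k passes over all arrays,
// marking each picked element so that later passes skip it.
static int[] TopKByRepeatedScan(int k, params int[][] arrays)
{
    var taken = arrays.Select(a => new bool[a.Length]).ToArray();
    int total = arrays.Sum(a => a.Length);
    var result = new int[Math.Max(0, Math.Min(k, total))];

    for (int pass = 0; pass < result.Length; pass++)
    {
        int bestArray = -1, bestIndex = -1;
        for (int a = 0; a < arrays.Length; a++)
            for (int i = 0; i < arrays[a].Length; i++)
                if (!taken[a][i] &&
                    (bestArray < 0 || arrays[a][i] > arrays[bestArray][bestIndex]))
                {
                    bestArray = a;
                    bestIndex = i;
                }
        taken[bestArray][bestIndex] = true; // skip this element on later passes
        result[pass] = arrays[bestArray][bestIndex];
    }
    return result;
}

int[] array1 = { 88, 65, 9, 888, 87 };
int[] array2 = { 1, 49, 921, 13, 33 };
int[] array3 = { 22, 44, 66, 88, 110 };
int[] top5 = TopKByRepeatedScan(5, array1, array2, array3);
Console.WriteLine(string.Join(", ", top5)); // 921, 888, 110, 88, 88
```

The marker arrays cost one extra bool per element; the result comes out in descending order because each pass picks the largest remaining value.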

You can combine the arrays using LINQ, sort them, then reverse.
int[] a1 = new int[] { 1, 10, 2, 9 };
int[] a2 = new int[] { 3, 8, 4, 7 };
int[] a3 = new int[] { 2, 9, 8, 4 };
int[] a4 = a1.Concat(a2).Concat(a3).ToArray();
Array.Sort(a4);
Array.Reverse(a4);
for (int i = 0; i < 5; i++)
{
    Console.WriteLine(a4[i].ToString());
}
Console.ReadLine();
Prints: 10, 9, 9, 8, 8 from the sample I provided as input for the arrays.

Maybe you could have an array of 5 elements which would be the "max values" array.
Initially fill it with the first 5 values, which in your case would just be the first array. Then loop through the rest of the values. For each value, check it against the 5 max values from least to greatest. If you find the current value from the main list is greater than the value in the max values array, insert it above that element in the array, which would push the last element out. At the end you should have an array of the 5 max values.

For three arrays of length N1, N2, N3, the fastest way should be combining the 3 arrays and then finding the (N1+N2+N3-4)th order statistic using a modified quicksort (quickselect).
In the resultant array, the elements with indices (N1+N2+N3-5) to the maximum (N1+N2+N3-1) should be your 5 largest. You can also sort them later.
The time complexity of this approach is O(N1+N2+N3) on average.

Here are the two ways for doing this task. The first one is using only basic types. This is the most efficient way, with no extra loop, no extra comparison, and no extra memory consumption. You just pass the index of elements that need to be matched with another one and calculate which is the next index to be matched for each given array.
First Way -
http://www.dotnetbull.com/2013/09/find-max-top-5-number-from-3-sorted-array.html
Second Way -
int[] Array1 = { 09, 65, 87, 89, 888 };
int[] Array2 = { 1, 13, 33, 49, 921 };
int[] Array3 = { 22, 44, 66, 88, 110 };
int[] MergeArr = Array1.Concat(Array2).Concat(Array3).ToArray();
Array.Sort(MergeArr);
int[] Top5Number = MergeArr.Reverse().Take(5).ToArray();
Taken From -
Find max top 5 number from three given sorted array

Short answer: Use a SortedList from Sorted Collection Types in .NET as a min-heap.
Explanation:
From the first array, add 5 elements to this SortedList/min-heap;
Now iterate through all the rest of the elements of arrays:
If an array element is bigger than the smallest element in min-heap then remove the min element and push this array element in the heap;
Else, continue to next array element;
In the end, your min-heap has the 5 biggest elements of all arrays.
Complexity: Takes Log k time to find the minimum when you have a SortedList of k elements. Multiply that by total elements in all arrays because you are going to perform this 'find minimum operation' that many times.
Brings us to overall complexity of O(n * Log k) where n is the total number of elements in all your arrays and k is the number of highest numbers you want.
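One caveat with the steps above: SortedList<TKey, TValue> rejects duplicate keys, so repeated values such as the two 88s in the question would need extra handling. The sketch below swaps in PriorityQueue<TElement, TPriority> instead, which is an assumption on my part (it requires .NET 6 or later) and not what this answer proposes. It is written as a C# 9+ top-level program, and TopKWithHeap is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Keeps the k largest values seen so far in a min-heap of size k.
static int[] TopKWithHeap(int k, params int[][] arrays)
{
    if (k <= 0) return Array.Empty<int>();
    // PriorityQueue dequeues the lowest priority first, so it acts as a min-heap.
    var heap = new PriorityQueue<int, int>();
    foreach (int value in arrays.SelectMany(a => a))
    {
        if (heap.Count < k)
            heap.Enqueue(value, value);
        else if (value > heap.Peek())
            heap.EnqueueDequeue(value, value); // add value, drop the current minimum
    }
    var result = new int[heap.Count];
    for (int i = result.Length - 1; i >= 0; i--)
        result[i] = heap.Dequeue(); // smallest comes out first, so fill from the back
    return result;
}

int[] array1 = { 88, 65, 9, 888, 87 };
int[] array2 = { 1, 49, 921, 13, 33 };
int[] array3 = { 22, 44, 66, 88, 110 };
int[] top5 = TopKWithHeap(5, array1, array2, array3);
Console.WriteLine(string.Join(", ", top5)); // 921, 888, 110, 88, 88
```

Each element costs at most one O(log k) heap operation, which matches the O(n * Log k) bound stated above.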

Getting all possible combinations from a list of numbers

I'm looking for an efficient way to achieve this:
you have a list of numbers 1.....n (typically: 1..5 or 1..7 or so - reasonably small, but can vary from case to case)
you need all combinations of all lengths for those numbers, e.g. all combinations of just one number ({1}, {2}, .... {n}), then all combinations of two distinct numbers ({1,2}, {1,3}, {1,4} ..... {n-1, n} ), then all combinations of three of those numbers ({1,2,3}, {1,2,4}) and so forth
Basically, within the group, the order is irrelevant, so {1,2,3} is equivalent to {1,3,2} - it's just a matter of getting all groups of x numbers from that list
Seems like there ought to be a simple algorithm for this - but I have searched in vain so far. Most combinatorics and permutation algorithms seem to a) take order into account (e.g. 123 is not equal to 132), and b) always operate on a single string of characters or numbers....
Anyone have a great, nice'n'quick algorithm up their sleeve??
Thanks!

Not my code, but you're looking for the powerset. Google gave me this solution, which seems to work:
public IEnumerable<IEnumerable<T>> GetPowerSet<T>(List<T> list)
{
    return from m in Enumerable.Range(0, 1 << list.Count)
           select
               from i in Enumerable.Range(0, list.Count)
               where (m & (1 << i)) != 0
               select list[i];
}
Source: http://rosettacode.org/wiki/Power_set#C.23

Just increment a binary number and take the elements corresponding to bits that are set.
For instance, 00101101 would mean take the elements at indexes 0, 2, 3, and 5. Since your list is simply 1..n, the element is simply the index + 1.
This will generate in-order permutations. In other words, only {1, 2, 3} will be generated. Not {1, 3, 2} or {2, 1, 3} or {2, 3, 1}, etc.

This is something I have written in the past to accomplish such a task.
List<T[]> CreateSubsets<T>(T[] originalArray)
{
    List<T[]> subsets = new List<T[]>();
    for (int i = 0; i < originalArray.Length; i++)
    {
        int subsetCount = subsets.Count;
        subsets.Add(new T[] { originalArray[i] });
        for (int j = 0; j < subsetCount; j++)
        {
            T[] newSubset = new T[subsets[j].Length + 1];
            subsets[j].CopyTo(newSubset, 0);
            newSubset[newSubset.Length - 1] = originalArray[i];
            subsets.Add(newSubset);
        }
    }
    return subsets;
}
It's generic, so it will work for ints, longs, strings, Foos, etc.

What is fastest way to find number of matches between arrays?

Currently, I am testing every integer element against each other to find which ones match. The arrays do not contain duplicates within their own set. Also, the arrays are not always of equal length. Are there any tricks to speed this up? I am doing this thousands of times, so it's starting to become a bottleneck in my program, which is in C#.

You could use LINQ:
var query = firstArray.Intersect(secondArray);
Or if the arrays are already sorted you could iterate over the two arrays yourself:
int[] a = { 1, 3, 5 };
int[] b = { 2, 3, 4, 5 };
List<int> result = new List<int>();
int ia = 0;
int ib = 0;
while (ia < a.Length && ib < b.Length)
{
    if (a[ia] == b[ib])
    {
        result.Add(a[ia]);
        ib++;
        ia++;
    }
    else if (a[ia] < b[ib])
    {
        ia++;
    }
    else
    {
        ib++;
    }
}

Use a HashSet
var set = new HashSet<int>(firstArray);
set.IntersectWith(secondArray);
The set now contains only the values that exist in both arrays.

If such a comparison is a bottleneck in your program, you are perhaps using an inappropriate data structure. The simplest way might be to keep your data sorted. Then for finding out the common entries, you would need to traverse both arrays only once. Another option would be to keep the data in a HashSet.
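If the data for one side is kept in a HashSet as suggested, counting the matches needs only one pass over the other array and leaves the set unmodified, so it can be reused across the thousands of comparisons. A minimal sketch, written as a C# 9+ top-level program: CountMatches is a hypothetical helper name, and the code assumes (as the question states) that neither input contains duplicates.

```csharp
using System;
using System.Collections.Generic;

// Counts how many values in `candidates` are also present in `lookup`.
static int CountMatches(HashSet<int> lookup, int[] candidates)
{
    int matches = 0;
    foreach (int value in candidates)
        if (lookup.Contains(value)) matches++;
    return matches;
}

var firstSet = new HashSet<int> { 1, 3, 5 };
int[] secondArray = { 2, 3, 4, 5 };
Console.WriteLine(CountMatches(firstSet, secondArray)); // 2
```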
