Loop Through Every Possible Combination of an Array - c#

I am attempting to loop through every combination of an array in C# dependent on size, but not order. For example: var states = ["NJ", "AK", "NY"];
Some Combinations might be:
states = [];
states = ["NJ"];
states = ["NJ","NY"];
states = ["NY"];
states = ["NJ", "NY", "AK"];
and so on...
It is also true in my case that states = ["NJ","NY"] and states = ["NY","NJ"] are the same thing, as order does not matter.
Does anyone have any idea on the most efficient way to do this?

The combination of the following two methods should do what you want. The idea is that a set of n items has 2^n subsets. If you iterate from 0 to 2^n - 1 and look at each number in binary, you get one digit per item: if the digit is 1 you include the item, and if it is 0 you don't. I'm using BigInteger (from System.Numerics) here because int would only work for a collection of fewer than 32 items and long for fewer than 64. Both are extension methods, so they need to live in a static class.
public static IEnumerable<IEnumerable<T>> PowerSets<T>(this IList<T> set)
{
var totalSets = BigInteger.Pow(2, set.Count);
for (BigInteger i = 0; i < totalSets; i++)
{
yield return set.SubSet(i);
}
}
public static IEnumerable<T> SubSet<T>(this IList<T> set, BigInteger n)
{
for (int i = 0; i < set.Count && n > 0; i++)
{
if ((n & 1) == 1)
{
yield return set[i];
}
n = n >> 1;
}
}
With that the following code
var states = new[] { "NJ", "AK", "NY" };
foreach (var subset in states.PowerSets())
{
Console.WriteLine("[" + string.Join(",", subset.Select(s => "'" + s + "'")) + "]");
}
Will give you this output.
[]
['NJ']
['AK']
['NJ','AK']
['NY']
['NJ','NY']
['AK','NY']
['NJ','AK','NY']

You can use back-tracking where in each iteration you'll either (1) take the item in index i or (2) do not take the item in index i.
Pseudo code for this problem:
Main code:
Define a bool array (e.g. named picked) of the length of states
Call the backtracking method with index 0 (the first item)
Backtracking function:
Receives the states array, its length, the current index and the bool array
Halt condition - if the current index is equal to the length then you just need to iterate over the bool array and for each item which is true print the matching string from states
Actual backtracking:
Set picked[i] to true and call the backtracking function with the next index
Set picked[i] to false and call the backtracking function with the next index
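The pseudo code above could be turned into C# roughly as follows. This is only a sketch: the class name SubsetBacktracking and the method names AllSubsets and Backtrack are made up for illustration, and the subsets are collected into a list instead of being printed so that the caller decides what to do with them.

```csharp
using System;
using System.Collections.Generic;

public static class SubsetBacktracking
{
    // Returns every subset of items by deciding, index by index, whether to take the item.
    public static List<List<T>> AllSubsets<T>(IList<T> items)
    {
        var results = new List<List<T>>();
        Backtrack(items, 0, new bool[items.Count], results);
        return results;
    }

    private static void Backtrack<T>(IList<T> items, int index, bool[] picked, List<List<T>> results)
    {
        // Halt condition: a decision has been made for every item.
        if (index == items.Count)
        {
            var subset = new List<T>();
            for (int i = 0; i < items.Count; i++)
            {
                if (picked[i]) subset.Add(items[i]);
            }
            results.Add(subset);
            return;
        }

        picked[index] = true;   // (1) take the item at this index
        Backtrack(items, index + 1, picked, results);

        picked[index] = false;  // (2) do not take the item at this index
        Backtrack(items, index + 1, picked, results);
    }
}
```

Calling SubsetBacktracking.AllSubsets(new[] { "NJ", "AK", "NY" }) yields 8 subsets. The recursion depth equals the number of items, so this is fine for small arrays, but the number of subsets still doubles with every extra item.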

Related

How to find repetitions/duplicates in the array within 10 elements apart

I have an array like following:
{1,5,5,4,5,6,7,8,9,10,11,12,13,14,1,16,17,5}
I want to find duplicates within each 10 elements from one to another.
I need code that can tell me that 5 was duplicated 3 times within 10 elements (there is at most one element, the 4, between consecutive 5s). It should ignore the last 5 as it is too far away; only three 5s are within 10 elements of each other.
I don't want the code to return 1, because there are 13 elements between the two 1s.
I have code that can count duplicates, but how do I change it so that it counts duplicates within 10 elements?
var dict = new Dictionary<string, int>();
foreach (var count in combined2)
{
if (dict.ContainsKey(count))
dict[count]++;
else
dict[count] = 1;
}
foreach (var val in dict)
{
MessageBox.Show(val.Key + " occurred " + val.Value + " times");
}
I'm only concerned with the duplicates that occur the most. If one number is duplicated twice but another is duplicated 3 times, I would only like to know about the number that was duplicated 3 times (within 10 items). Thank you
Make a dictionary max defaulting to 0
Make a dictionary seen defaulting to 0
Loop count from 0 up to N - 1, where N is the number of elements.
Once count >= 10, decrement seen[array[count - 10]]
Increment seen[array[count]]
If that number is higher than max[array[count]], update it
Repeat
Return the key of the highest value in max.
This way, seen always has the accurate count in the 10-element window, and max will have the maximum number of appearances of each element in a 10-element window.
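A minimal C# sketch of this sliding-window idea might look like the following. The names WindowDuplicates and MostRepeatedInWindow are hypothetical, windowSize is assumed to be at least 1, and as a simplification the sketch keeps only the single best value and count instead of a full max dictionary.

```csharp
using System;
using System.Collections.Generic;

public static class WindowDuplicates
{
    // Returns the value with the most occurrences inside any window of
    // windowSize consecutive elements, together with that count.
    // Assumes windowSize >= 1.
    public static (int Value, int Count) MostRepeatedInWindow(int[] array, int windowSize)
    {
        var seen = new Dictionary<int, int>(); // occurrences inside the current window
        int bestValue = 0, bestCount = 0;

        for (int i = 0; i < array.Length; i++)
        {
            // Once the window is full, drop the element that slides out of it.
            if (i >= windowSize)
            {
                seen[array[i - windowSize]]--;
            }

            // TryGetValue leaves current at 0 when the value has not been seen yet.
            seen.TryGetValue(array[i], out int current);
            seen[array[i]] = current + 1;

            // Simplification: track only the overall maximum, not one maximum per value.
            if (current + 1 > bestCount)
            {
                bestCount = current + 1;
                bestValue = array[i];
            }
        }
        return (bestValue, bestCount);
    }
}
```

For the array from the question and a window of 10, this reports the value 5 with a count of 3. Each element is added and removed at most once, so the work grows linearly with the length of the array.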
This code finds the first item with the highest number of occurrences inside the "numbers" array (within n = 10 elements). It needs using System.Linq:
int n = 10;
int[] numbers = new int[] {1,5,5,4,5,6,7,8,9,10,11,12,13,14,1,16,17,5};
int mostDuplicatedNumber = 0, numberOfMaxOccurrences = 0;
for(int count = 0; count <= numbers.Length - n; count++) // "<=" so that the last full window is included
{
var groupOfNumbers = numbers.Skip(count).Take(n).GroupBy(i => i).OrderByDescending(i => i.Count());
if(numberOfMaxOccurrences < groupOfNumbers.First().Count())
{
numberOfMaxOccurrences = groupOfNumbers.First().Count();
mostDuplicatedNumber = groupOfNumbers.First().Key;
}
}
Console.WriteLine("Most duplicated number is " + mostDuplicatedNumber + " with " + numberOfMaxOccurrences + " occurrences");
Try it this way. I have not tested this in an IDE; I wrote it while travelling, so let me know if you encounter any errors. It takes the elements in blocks of 10 and counts the occurrences (repetitions) of each value in the block. To display the most repeated number you would additionally have to keep the counts and compare them to find the most and least repeated values, as asked in the question; I have not implemented that part.
int[] inputArray = { 1, 5, 5, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1, 16, 17, 5 }; // total input elements
const int blockSize = 10;
// walk the input in consecutive blocks of up to 10 elements
for (int start = 0; start < inputArray.Length; start += blockSize)
{
    int length = Math.Min(blockSize, inputArray.Length - start);
    int[] resultArray = new int[length];
    Array.Copy(inputArray, start, resultArray, 0, length);
    findOccurance(resultArray); // call for every block of up to 10 elements
}
public void findOccurance(int[] resultArray)
{
    var dict = new Dictionary<int, int>();
    foreach (var value in resultArray)
    {
        if (dict.ContainsKey(value)) dict[value]++;
        else dict[value] = 1;
    }
    foreach (var pair in dict)
    {
        Console.WriteLine("Value {0} occurred {1} times", pair.Key, pair.Value);
    }
    // pass the counts on as a dictionary; an array indexed by value would overflow for values of 10 or more
    findMostAndLeastOccuranceElements(dict);
}
public void findMostAndLeastOccuranceElements(Dictionary<int, int> occurrences)
{
    // now find the most and least repeated elements within each 10-element block
}
A simpler solution would be to use LINQ. Here is a simple method that counts how many times a value is repeated within the first maxValues elements of the list.
public int CountRepetitions(List<int> myLists, int maxValues, int number)
{
    // Take stops early when the list has fewer than maxValues elements, so no length check is needed
    return myLists.Take(maxValues).Count(v => v == number);
}

How to Order By or Sort an integer List and select the Nth element

I have a list, and I want to select the fifth highest element from it:
List<int> list = new List<int>();
list.Add(2);
list.Add(18);
list.Add(21);
list.Add(10);
list.Add(20);
list.Add(80);
list.Add(23);
list.Add(81);
list.Add(27);
list.Add(85);
But OrderByDescending is not working for this int list...
int fifth = list.OrderByDescending(x => x).Skip(4).First();
Depending on how serious it is for the list to have fewer than 5 elements, you have 2 options.
If the list should never have fewer than 5 elements, I would treat that case as an exception:
int fifth;
try
{
fifth = list.OrderByDescending(x => x).ElementAt(4);
}
catch (ArgumentOutOfRangeException)
{
//Handle the exception
}
If you expect that it can have fewer than 5 elements, you could let it fall back to the default value and check for that.
int fifth = list.OrderByDescending(x => x).ElementAtOrDefault(4);
if (fifth == 0)
{
//handle default
}
This is still somewhat flawed because the fifth element could legitimately be 0. This can be solved by casting the list to a list of nullable ints before the LINQ query:
var newList = list.Select(i => (int?)i).ToList();
int? fifth = newList.OrderByDescending(x => x).ElementAtOrDefault(4);
if (fifth == null)
{
//handle default
}
Without LINQ expressions:
int result;
if(list != null && list.Count >= 5)
{
list.Sort();
result = list[list.Count - 5];
}
else // define behavior when list is null OR has less than 5 elements
This has a better performance compared to LINQ expressions, although the LINQ solutions presented in my second answer are comfortable and reliable.
In case you need extreme performance for a huge List of integers, I'd recommend a more specialized algorithm, like in Matthew Watson's answer.
Attention: The List gets modified when the Sort() method is called. If you don't want that, you must work with a copy of your list, created in either of these ways:
List<int> copy = new List<int>(original);
// or, with LINQ:
List<int> copy = original.ToList();
The easiest way to do this is to just sort the data and take N items from the front. This is the recommended way for small data sets - anything more complicated is just not worth it otherwise.
However, for large data sets it can be a lot quicker to do what's known as a Partial Sort.
There are two main ways to do this: Use a heap, or use a specialised quicksort.
The article I linked describes how to use a heap. I shall present a partial sort below:
public static IList<T> PartialSort<T>(IList<T> data, int k) where T : IComparable<T>
{
int start = 0;
int end = data.Count - 1;
while (end > start)
{
var index = partition(data, start, end);
var rank = index + 1;
if (rank >= k)
{
end = index - 1;
}
else if ((index - start) > (end - index))
{
quickSort(data, index + 1, end);
end = index - 1;
}
else
{
quickSort(data, start, index - 1);
start = index + 1;
}
}
return data;
}
static int partition<T>(IList<T> lst, int start, int end) where T : IComparable<T>
{
T x = lst[start];
int i = start;
for (int j = start + 1; j <= end; j++)
{
if (lst[j].CompareTo(x) < 0) // Or "> 0" to reverse sort order.
{
i = i + 1;
swap(lst, i, j);
}
}
swap(lst, start, i);
return i;
}
static void swap<T>(IList<T> lst, int p, int q)
{
T temp = lst[p];
lst[p] = lst[q];
lst[q] = temp;
}
static void quickSort<T>(IList<T> lst, int start, int end) where T : IComparable<T>
{
if (start >= end)
return;
int index = partition(lst, start, end);
quickSort(lst, start, index - 1);
quickSort(lst, index + 1, end);
}
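Then to access the 5th largest element in a list you could do this (with the comparison in partition switched to "> 0", so that the data is ordered from largest to smallest; as written above, list[4] would be the 5th smallest):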
PartialSort(list, 5);
Console.WriteLine(list[4]);
For large data sets, a partial sort can be significantly faster than a full sort.
Addendum
See here for another (probably better) solution that uses a QuickSelect algorithm.
This LINQ approach retrieves the 5th biggest element OR throws an exception WHEN the list is null or contains less than 5 elements:
int fifth = list?.Count >= 5 ?
list.OrderByDescending(x => x).Take(5).Last() :
throw new Exception("list is null OR has not enough elements");
This one retrieves the 5th biggest element OR null WHEN the list is null or contains less than 5 elements:
int? fifth = list?.Count >= 5 ?
list.OrderByDescending(x => x).Take(5).Last() :
default(int?);
if(fifth == null) // define behavior
This one retrieves the 5th biggest element OR the smallest element WHEN the list contains less than 5 elements:
if(list == null || list.Count <= 0)
throw new Exception("Unable to retrieve Nth biggest element");
int fifth = list.OrderByDescending(x => x).Take(5).Last();
All these solutions are reliable, they should NEVER throw "unexpected" exceptions.
PS: I'm using .NET 4.7 in this answer.
Here is a C# implementation of the QuickSelect algorithm that selects the nth element in an unordered IList<>.
You have to put all the code contained in that page in a static class, like:
public static class QuickHelpers
{
// Put the code here
}
Given that "library" (in truth a big fat block of code), then you can:
int resA = list.QuickSelect(2, (x, y) => Comparer<int>.Default.Compare(y, x));
int resB = list.QuickSelect(list.Count - 1 - 2);
Now... Normally the QuickSelect would select the nth lowest element. We reverse it in two ways:
For resA we create a reverse comparer based on the default int comparer. We do this by reversing the parameters of the Compare method. Note that the index is 0 based. So there is a 0th, 1st, 2nd and so on.
For resB we use the fact that the 0th element in reverse order is the (list.Count - 1)th element in ascending order. So we count from the back. The highest element would be at list.Count - 1 in an ordered list, the next one at list.Count - 1 - 1, then list.Count - 1 - 2 and so on.
Theoretically, using QuickSelect should be better than ordering the list and then picking the nth element, because ordering a list is on average an O(NlogN) operation and picking the nth element is then an O(1) operation, so the composite is an O(NlogN) operation, while QuickSelect is on average an O(N) operation. Clearly there is a but. The O notation doesn't show the k factor... So an O(k1 * NlogN) with a small k1 could be better than an O(k2 * N) with a big k2. Only multiple real-life benchmarks can tell us (you) what is better, and it depends on the size of the collection.
A small note about the algorithm:
As with quicksort, quickselect is generally implemented as an in-place algorithm, and beyond selecting the k'th element, it also partially sorts the data. See selection algorithm for further discussion of the connection with sorting.
So it modifies the ordering of the original list.
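To make the idea concrete, here is a small self-contained QuickSelect sketch. It is not the linked implementation: the names QuickSelectSketch and NthLargest are invented for this example, it only handles int, it works on a copy so the caller's list keeps its order, and it always uses the last element as pivot, which degrades to quadratic time on already sorted input.

```csharp
using System;
using System.Collections.Generic;

public static class QuickSelectSketch
{
    // Returns the n-th largest element (n = 1 is the maximum) without fully sorting.
    // Works on a copy, so the caller's list is left untouched.
    public static int NthLargest(IEnumerable<int> source, int n)
    {
        var data = new List<int>(source);
        if (n < 1 || n > data.Count)
            throw new ArgumentOutOfRangeException(nameof(n));

        int target = data.Count - n; // index of the answer in ascending order
        int low = 0, high = data.Count - 1;
        while (true)
        {
            int pivotIndex = Partition(data, low, high);
            if (pivotIndex == target) return data[pivotIndex];
            // Continue only in the side that contains the target index.
            if (pivotIndex < target) low = pivotIndex + 1;
            else high = pivotIndex - 1;
        }
    }

    // Lomuto partition: moves the pivot to its final sorted position and returns that index.
    private static int Partition(List<int> data, int low, int high)
    {
        int pivot = data[high];
        int store = low;
        for (int i = low; i < high; i++)
        {
            if (data[i] < pivot)
            {
                Swap(data, i, store);
                store++;
            }
        }
        Swap(data, store, high);
        return store;
    }

    private static void Swap(List<int> data, int a, int b)
    {
        int temp = data[a];
        data[a] = data[b];
        data[b] = temp;
    }
}
```

With the list from the question, QuickSelectSketch.NthLargest(list, 5) returns 23. Unlike a full sort, each pass discards one side of the partition, which is where the average O(N) behaviour comes from.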

Variable number of for loops without recursion but with Stack?

I know the usual approach for "variable number of for loops" is said to use a recursive method. But I wonder if I could solve that without recursion and instead with using Stack, since you can bypass recursion with the use of a stack.
My example:
I have a variable number of collections and I need to combine every item of every collection with every other item of the other collections.
// example for collections A, B and C:
A (4 items) + B (8 items) + C (10 items)
4 * 8 * 10 = 320 combinations
I need to run through all those 320 combinations. Yet at compile time I don't know if B or C or D exist. How would a solution with no recursive method but with the use of an instance of Stack look like?
Edit:
I realized Stack is not necessary here at all, since you can avoid recursion with a simple int array and a few while loops. Thanks for the help and info.
Not with a stack but without recursion.
void Main()
{
var l = new List<List<int>>()
{
new List<int>(){ 1,2,3 },
new List<int>(){ 4,5,6 },
new List<int>(){ 7,8,9 }
};
var result = CartesianProduct(l);
}
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>()};
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] {item})
);
}
Function taken from Computing a Cartesian Product with LINQ
Here is an example of how to do this. The algorithm is taken from this question - https://stackoverflow.com/a/2419399/5311735 and converted to C#. Note that it can be made more efficient, but I converted the inefficient version to C# because it better illustrates the concept (you can see the more efficient version in the linked question):
static IEnumerable<T[]> CartesianProduct<T>(IList<IList<T>> collections) {
// this contains the indexes of elements from each collection to combine next
var indexes = new int[collections.Count];
bool done = false;
while (!done) {
// initialize array for next combination
var nextProduct = new T[collections.Count];
// fill it
for (int i = 0; i < collections.Count; i++) {
var collection = collections[i];
nextProduct[i] = collection[indexes[i]];
}
yield return nextProduct;
// now we need to calculate indexes for the next combination
// for that, increase last index by one, until it becomes equal to the length of last collection
// then increase second last index by one until it becomes equal to the length of second last collection
// and so on - basically the same how you would do with regular numbers - 09 + 1 = 10, 099 + 1 = 100 and so on.
var j = collections.Count - 1;
while (true) {
indexes[j]++;
if (indexes[j] < collections[j].Count) {
break;
}
indexes[j] = 0;
j--;
if (j < 0) {
done = true;
break;
}
}
}
}

Dice Sorensen Distance error calculating Bigrams without using Intersect method

I have been programming an object to calculate the Sorensen-Dice distance between two strings. The logic of the operation is not so difficult: you extract the two-letter pairs (bigrams) of each string, count how many the two strings share, and then evaluate this equation
2|x intersect y| / (|x| + |y|)
where |x| and |y| are the numbers of bigram elements in x and y. A reference can be found here for further clarity: https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient
So I have tried looking up how to write the code online in various places, but every method I have come across uses the Intersect method between two lists. As far as I am aware this won't work, because if a bigram already exists in the result it won't be added again. For example, if I had the string
'aaaa'
I would like there to be 3 'aa' bigrams, but the Intersect method will only produce one. If I am incorrect in this assumption please tell me, because I wondered why so many people use the Intersect method. My assumption is based on the MSDN documentation: https://msdn.microsoft.com/en-us/library/bb460136(v=vs.90).aspx
So here is the code I have made
public static double SorensenDiceDistance(this string source, string target)
{
// formula 2|X intersection Y|
// --------------------
// |X| + |Y|
//create variables needed
List<string> bigrams_source = new List<string>();
List<string> bigrams_target = new List<string>();
int source_length;
int target_length;
double intersect_count = 0;
double result = 0;
Console.WriteLine("DEBUG: string length source is " + source.Length);
//base case
if (source.Length == 0 || target.Length == 0)
{
return 0;
}
//extract bigrams from string 1
bigrams_source = source.ListBiGrams();
//extract bigrams from string 2
bigrams_target = target.ListBiGrams();
source_length = bigrams_source.Count();
target_length = bigrams_target.Count();
Console.WriteLine("DEBUG: bigram counts are source: " + source_length + " . target length : " + target_length);
//now we have two sets of bigrams compare them in a non distinct loop
for (int i = 0; i < bigrams_source.Count(); i++)
{
for (int y = 0; y < bigrams_target.Count(); y++)
{
if (bigrams_source.ElementAt(i) == bigrams_target.ElementAt(y))
{
intersect_count++;
//Console.WriteLine("intersect count is :" + intersect_count);
}
}
}
Console.WriteLine("intersect line value : " + intersect_count);
result = (2 * intersect_count) / (source_length + target_length);
if (result < 0)
{
result = Math.Abs(result);
}
return result;
}
In the code you can see that I call a method named ListBiGrams; this is how it looks
public static List<string> ListBiGrams(this string source)
{
return ListNGrams(source, 2);
}
public static List<string> ListTriGrams(this string source)
{
return ListNGrams(source, 3);
}
public static List<string> ListNGrams(this string source, int n)
{
List<string> nGrams = new List<string>();
if (n > source.Length)
{
return null;
}
else if (n == source.Length)
{
nGrams.Add(source);
return nGrams;
}
else
{
for (int i = 0; i < source.Length - n; i++)
{
nGrams.Add(source.Substring(i, n));
}
return nGrams;
}
}
So my understanding of the code step by step is
1) pass in strings
2) 0 length check
3) create list and pass up bigrams into them
4) get the lengths of each bigram list
5) nested loop to check in source position[i] against every bigram in target string and then increment i until no more source list to check against
6) perform equation mentioned above taken from wikipedia
7) if result is negative Math.Abs it to return a positive result (however i know the result should be between 0 and 1 already this is what keyed me into knowing i was doing something wrong)
the source string i used is source = "this is not a correct string" and the target string was, target = "this is a correct string"
the result I got was -0.090909090908
I'm SURE (99%) that what I'm missing is something small like a mis-calculated length somewhere or a count mis-count. If anyone could point out what i'm doing wrong I'd be really grateful. Thank you for your time!
This looks like homework, yet this similarity metric on strings is new to me so I took a look.
Algorithm implementations in various languages
As you may notice the C# version uses HashSet and takes advantage of the IntersectWith method.
"A set is a collection that contains no duplicate elements, and whose elements are in no particular order."
This solves your string 'aaaa' puzzle. Only one bigram there.
My naive implementation on Rextester
If you prefer Linq then I'd suggest Enumerable.Distinct, Enumerable.Union and Enumerable.Intersect. These should mimic very well the duplicate removal capabilities of the HashSet.
Also found this nice StringMetric framework written in Scala.
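For reference, a set-based version along the lines described above could be sketched like this. It is not the Rextester code; the names DiceSketch, Bigrams and Coefficient are assumptions made for this example. Note that the bigram loop runs while i <= text.Length - 2, so the final pair of characters is included, whereas a bound of i < source.Length - n stops one n-gram early.

```csharp
using System;
using System.Collections.Generic;

public static class DiceSketch
{
    // Collects the distinct bigrams of a string; the loop bound includes the final pair.
    private static HashSet<string> Bigrams(string text)
    {
        var set = new HashSet<string>();
        for (int i = 0; i <= text.Length - 2; i++)
        {
            set.Add(text.Substring(i, 2));
        }
        return set;
    }

    // Sorensen-Dice coefficient on bigram sets: 2|X intersect Y| / (|X| + |Y|).
    public static double Coefficient(string source, string target)
    {
        HashSet<string> x = Bigrams(source);
        HashSet<string> y = Bigrams(target);
        if (x.Count + y.Count == 0) return 0.0; // both strings are shorter than two characters

        int sizeX = x.Count;  // remember |X| before the set is modified
        x.IntersectWith(y);   // x now holds only the shared bigrams
        return 2.0 * x.Count / (sizeX + y.Count);
    }
}
```

Because the bigrams are held in sets, the string 'aaaa' contributes a single 'aa' bigram, and the result always stays between 0 and 1. For example, "night" and "nacht" share only "ht" out of four bigrams each, which gives 2 * 1 / (4 + 4) = 0.25.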

How to display the items from a dictionary in a random order but no two adjacent items being the same

First of all, there are actually more restrictions than stated in the title. Please read on.
Say I have a Dictionary<char,int> where the key acts as the item and the value is the number of times it must occur in the output (somewhat like weighting, but without replacement).
e.g. ('a',2) ('b',3) ('c',1)
a possible output would be 'babcab'
I am thinking of the following way to implement it.
build a new list containing (accumulated weightings,char) as its entry.
randomly select an item from the list,
recalculate the accumulated weightings, also set the recent drawn item weighing as 0.
repeat.
At some point a situation like this might arise: 'bacab' has been generated, but I can go no further (only 'b' is left, and its weighting is set to 0 because immediate repetition is not allowed). In this case I discard all the results and start over from the very beginning.
Is there any other good approach?
Also, what if I skip the "set the corresponding weighting to 0" step and instead reject any infeasible draw? E.g. I already have 'bab'. If the next random selection gives 'b', I redo the draw until I get something that is not 'b', and then continue. Does this perform better?
How about this recursive algorithm.
1. Create a list of all characters (candidate list), repeating them according to their weight.
2. Create an empty list of characters (your solution list)
3. Pick a random entry from the candidate list
4. If the selected item (character) is the same as the last in solution list then start scanning for another character in the candidate list (wrapping around if needed).
5. If no such character in step 4 can be found and candidate list is not empty then backtrack
6. Append the selected character to the solution list.
7. If the candidate list is empty print out the solution and 'backtrack', else go to step 3.
I'm not quite sure about the 'backtrack' step yet, but you should get the general idea.
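One way the backtracking could be fleshed out is sketched below. It is a hedged interpretation rather than a literal translation of the steps: the names NoAdjacentRepeats, Arrange and Place are hypothetical, the candidates are kept as remaining counts per character instead of a flattened list, and the random pick is made among the distinct characters still available rather than weighted by their counts.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NoAdjacentRepeats
{
    // Tries to build a random arrangement of the weighted characters in which
    // no two neighbours are equal. Returns null when no such arrangement exists.
    public static string Arrange(IDictionary<char, int> weights, Random random)
    {
        var remaining = new Dictionary<char, int>(weights); // copy, so the input is not modified
        var solution = new List<char>();
        int total = remaining.Values.Sum();
        return Place(remaining, solution, total, random) ? new string(solution.ToArray()) : null;
    }

    private static bool Place(Dictionary<char, int> remaining, List<char> solution, int total, Random random)
    {
        if (solution.Count == total) return true; // every character has been placed

        // Candidates: characters still available that differ from the previous one,
        // visited in random order so that different runs can give different results.
        var candidates = remaining
            .Where(p => p.Value > 0 && (solution.Count == 0 || p.Key != solution[solution.Count - 1]))
            .Select(p => p.Key)
            .OrderBy(_ => random.Next())
            .ToList();

        foreach (char c in candidates)
        {
            remaining[c]--;
            solution.Add(c);
            if (Place(remaining, solution, total, random)) return true;

            // Dead end further down: undo this choice and try the next candidate.
            solution.RemoveAt(solution.Count - 1);
            remaining[c]++;
        }
        return false;
    }
}
```

For the weights ('a',2) ('b',3) ('c',1) this returns some six-character string without equal neighbours, and for an impossible input such as ('a',3) ('b',1) it returns null instead of looping forever. The search can take exponential time in bad cases, so it is only meant for small inputs like the one in the question.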
Try this out; it should generate a (pseudo) random ordering of the elements in your enumerable. I'd recommend flattening your dictionary to a list first, i.e. a dictionary of
{b, 2}, {a, 3} becomes {b} {b} {a} {a} {a}
public static IEnumerable<T> RandomPermutation<T>(this IEnumerable<T> enumerable)
{
    // Copy to a list so that exactly one element can be removed by index;
    // filtering by value would also drop every duplicate of the chosen element
    List<T> remaining = enumerable.ToList();
    if (remaining.Count < 1)
        throw new InvalidOperationException("Must have some elements yo");
    Random random = new Random(DateTime.Now.Millisecond);
    while (remaining.Count > 0)
    {
        int randomIndex = random.Next(0, remaining.Count);
        yield return remaining[randomIndex];
        remaining.RemoveAt(randomIndex);
    }
}
If you need additional permutations, simply call it again for another random ordering. While this won't get you every permutation, you should find something useful. You can also use this as a baseline to get a solution going.
This should be split into some separate methods and could use some refactoring, but the idea is to implement it in such a way that it does not depend on randomly moving things around until you get a valid result, because with that approach you can't predict how long it would take.
1. Concatenate all chars to a string and randomize that string
2. Loop through the randomized string and find any char that violates the rule
3. Remove that char from the string
4. Pick a random number n. Use this number as "put the removed char at the nth valid position"
5. Loop around the remaining string to find the nth valid position to put the char back
6. If there is no valid position left, drop the char
7. Repeat from step 2 until no more violations are found
using System;
using System.Collections.Generic;
namespace RandomString
{
class Program
{
static void Main(string[] args)
{
Random rnd = new Random(DateTime.Now.Millisecond);
Dictionary<char, int> chars = new Dictionary<char, int> { { 'a', 2 }, { 'b', 3 }, { 'c', 1 } };
// Convert to a string with all chars
string basestring = "";
foreach (var pair in chars)
{
basestring += new String(pair.Key, pair.Value);
}
// Randomize the string
string randomstring = "";
while (basestring.Length > 0)
{
int randomIndex = rnd.Next(basestring.Length);
randomstring += basestring.Substring(randomIndex, 1);
basestring = basestring.Remove(randomIndex, 1);
}
// Now fix violations of the rule
// this can be optimized by not starting over each time but this is easier to read
bool done;
do
{
Console.WriteLine("Current string: " + randomstring);
done = true;
char lastchar = randomstring[0];
for (int i = 1; i < randomstring.Length; i++)
{
if (randomstring[i] == lastchar)
{
// uhoh violation of the rule. We pick a random position to move it to
// this means it gets placed at the nth location where it doesn't violate the rule
Console.WriteLine("Violation at position {0} ({1})", i, randomstring[i]);
done = false;
char tomove = randomstring[i];
randomstring = randomstring.Remove(i, 1);
int putinposition = rnd.Next(randomstring.Length);
Console.WriteLine("Moving to {0}th valid position", putinposition);
bool anyplacefound;
do
{
anyplacefound = false;
for (int replace = 0; replace < randomstring.Length; replace++)
{
if (replace == 0 || randomstring[replace - 1] != tomove)
{
// then no problem on the left side
if (randomstring[replace] != tomove)
{
// no problem right either. We can put it here
anyplacefound = true;
if (putinposition == 0)
{
randomstring = randomstring.Insert(replace, tomove.ToString());
break;
}
putinposition--;
}
}
}
} while (putinposition > 0 && anyplacefound);
break;
}
lastchar = randomstring[i];
}
} while (!done);
Console.WriteLine("Final string: " + randomstring);
Console.ReadKey();
}
}
}
