Shuffle an array without creating any runs

I have an array of repeating letters:
AABCCD
and I would like to put them into pseudo-random order. Simple, right? Just use Fisher-Yates and you're done. However, there is a restriction on the output: I don't want any runs of the same letter. I want at least two other characters to appear before the same character reappears. For example:
ACCABD
is not valid because there are two Cs next to each other.
ABCACD
is also not valid because the two Cs (CAC) have only one other character (A) between them; I require at least two other characters.
Every valid sequence for this simple example:
ABCADC ABCDAC ACBACD ACBADC ACBDAC ACBDCA ACDABC ACDACB ACDBAC ACDBCA
ADCABC ADCBAC BACDAC BCADCA CABCAD CABCDA CABDAC CABDCA CADBAC CADBCA
CADCAB CADCBA CBACDA CBADCA CDABCA CDACBA DACBAC DCABCA
I used a brute-force approach for this small array, but my actual problem involves arrays with hundreds of elements. I've tried Fisher-Yates with some suppression: do a normal Fisher-Yates and, if you don't like the character that comes up, try up to X more times for a better one. It generates valid sequences only about 87% of the time and is very slow. I'm wondering if there's a better approach. Obviously this isn't possible for all arrays. An array of just "AAB" has no valid order, so I'd like to fall back to the best available order, "ABA", for something like this.

Here is a modified Fisher-Yates approach. As I mentioned, it is very difficult to generate a valid sequence 100% of the time, because you have to check that you haven't trapped yourself by leaving only AAA at the end of your sequence.
It is possible to create a recursive CanBeSorted method, which tells you whether or not a sequence can be sorted according to your rules. That will be your basis for a full solution, but this function, which returns a boolean value indicating success or failure, should be a starting point.
public static bool Shuffle(char[] array)
{
    var random = new Random();
    // Count how many of each letter still have to be placed.
    var groups = array.GroupBy(e => e).ToDictionary(g => g.Key, g => g.Count());
    char last = '\0';
    char lastButOne = '\0';
    // Fill positions from the end of the array towards the start.
    for (int i = array.Length; i > 0; i--)
    {
        var candidates = groups.Keys.Where(c => groups[c] > 0)
            .Except(new[] { last, lastButOne }).ToList();
        if (!candidates.Any())
            return false;
        var @char = candidates[random.Next(candidates.Count)];
        // Find an unplaced occurrence of the chosen letter in array[0..i-1].
        var j = Array.IndexOf(array, @char, 0, i);
        // Swap.
        var tmp = array[j];
        array[j] = array[i - 1];
        array[i - 1] = tmp;
        lastButOne = last;
        last = @char;
        groups[@char] = groups[@char] - 1;
    }
    return true;
}
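The recursive CanBeSorted check mentioned above could be sketched roughly as follows. This is only an illustration, not the answerer's code: the class name RunFreeCheck and the helper Search are made up for this example, and the plain backtracking search can take exponential time on unlucky inputs, so it is best suited to small arrays or as a starting point for something smarter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RunFreeCheck
{
    // Returns true if the letters can be ordered so that equal letters
    // always have at least two other letters between them.
    public static bool CanBeSorted(IEnumerable<char> letters)
    {
        var counts = letters.GroupBy(c => c).ToDictionary(g => g.Key, g => g.Count());
        return Search(counts, counts.Values.Sum(), null, null);
    }

    // Backtracking: try every letter that is still available and differs
    // from the two most recently placed letters.
    private static bool Search(Dictionary<char, int> counts, int remaining,
                               char? last, char? lastButOne)
    {
        if (remaining == 0) return true;
        foreach (char c in counts.Keys.ToList())
        {
            if (counts[c] == 0 || c == last || c == lastButOne) continue;
            counts[c]--;
            bool ok = Search(counts, remaining - 1, c, last);
            counts[c]++; // undo the choice before trying the next letter
            if (ok) return true;
        }
        return false;
    }
}
```

For the examples in the question, RunFreeCheck.CanBeSorted("AABCCD") reports that a valid order exists, while "AAB" has none.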

Maintain a linked list that keeps track of each letter and its position in the result.
After getting the random number, pick its corresponding character from the input (same as Fisher-Yates), but now search the list to see whether that letter has already occurred.
If not, insert the letter in the result and also in the linked list, together with its position in the result.
If yes, then check its position in the result (which you stored in the linked list when you wrote that letter to the result). Now compare this location with the current insert location. If the absolute difference |currentLocation - previousLocation| is 3 or greater, you can insert that letter in the result; otherwise you cannot, so choose the random number again.
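The distance check described above can be sketched as a small validator. Note that this sketch swaps the linked list for a Dictionary keyed by letter, which is an assumption made for brevity; the names GapCheck and HasMinimumGap are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class GapCheck
{
    // True if every repeated letter is at least minDistance positions away
    // from its previous occurrence (3 means two other letters in between).
    public static bool HasMinimumGap(IEnumerable<char> sequence, int minDistance = 3)
    {
        var lastPosition = new Dictionary<char, int>();
        int position = 0;
        foreach (char c in sequence)
        {
            if (lastPosition.TryGetValue(c, out int previous)
                && position - previous < minDistance)
            {
                return false;
            }
            lastPosition[c] = position;
            position++;
        }
        return true;
    }
}
```

The same lookup can be used while building the result: before writing a letter at the current position, compare that position with the stored one and reject the letter if the difference is too small.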

Related

Find remaining elements in the sequence

Hi everyone. I have this small task to do:
There are two sequences of numbers:
A[0], A[1], ... , A[n].
B[0], B[1], ... , B[m].
Do the following operations with the sequence A:
Remove the items whose indices are divisible by B[0].
In the items remained, remove those whose indices are divisible by B[1].
Repeat this process up to B[m].
Output the items finally remained.
Input is like this: (where -1 is delimiter for two sequences A and B)
1 2 4 3 6 5 -1 2 -1
Here goes my code (explanation done via comments):
List<int> result = new List<int>(); // list for sequence A
List<int> values = new List<int>(); // list for holding value to remove
var input = Console.ReadLine().Split().Select(int.Parse).ToArray();
var len = Array.IndexOf(input, -1); // getting index of the first -1 (delimiter)
result = input.ToList(); // converting input array to List
result.RemoveRange(len, input.Length - len); // and deleting everything beyond first delimiter (including it)
for (var i = len + 1; i < input.Length - 1; i++) // for the number of elements in the sequence B
{
    for (var j = 0; j < result.Count; j++) // going through all elements in sequence A
    {
        if (j % input[i] == 0) // if index is divisible by B[i]
        {
            values.Add(result[j]); // adding associated value to List<int> values
        }
    }
    foreach (var value in values) // after all elements in sequence A have been looked upon, now deleting those who apply to criteria
    {
        result.Remove(value);
    }
}
But the problem is that I'm only passing 5/11 test cases. 25% fail with 'Wrong result' and the other 25% with 'Timed out'. I understand that my code is probably very badly written, but I really can't work out how to improve it.
So, if someone more experienced could clarify the following points for me, it would be very cool:
1. Am I parsing the console input right? I feel like it could be done in a more elegant/efficient way.
2. Is my logic of collecting the values that match the criteria and storing them for later deletion efficient in terms of performance? Or is there another way to do it?
3. Why is this code not passing all test cases, and how would you change it in order to pass all of them?

I'm writing the answer once again, since I misunderstood the problem completely. Undoubtedly the problem in your code is the removal of elements. Let's try to avoid that. Let's make a new array C, where you store all the numbers that should be left in the array A after each removal. So if index id is not divisible by B[i], you add A[id] to the array C. Then, after checking all the indices against the B[i] value, you replace the array A with the array C and do the same for B[i + 1]. Repeat until you reach the end of the array B.
The algorithm:
1. For each value in B:
2. For each id from 1 to length(A):
3. If id % value != 0, add A[id] to C
4. A = C
5. Return A.
EDIT: Be sure to make a new array C for each iteration of the outer loop in step 1 (or clear C after replacing A with it).
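The steps above might look roughly like this in C#. This is a sketch under stated assumptions: indices are taken to be 0-based, as in the A[0], A[1], ... notation of the question, a divisor of 0 is simply skipped, and the names SequenceFilter and RemoveByIndex are invented for the example.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceFilter
{
    // Applies each divisor in b to a in turn, keeping only the items
    // whose current (0-based) index is NOT divisible by that divisor.
    public static List<int> RemoveByIndex(IReadOnlyList<int> a, IEnumerable<int> b)
    {
        var current = new List<int>(a);
        foreach (int divisor in b)
        {
            if (divisor == 0) continue; // assumption: ignore a zero divisor
            var kept = new List<int>(current.Count); // fresh "C" for every pass
            for (int id = 0; id < current.Count; id++)
            {
                if (id % divisor != 0) kept.Add(current[id]);
            }
            current = kept; // A = C
        }
        return current;
    }
}
```

Each pass builds a new list instead of removing items one by one, so a pass costs time proportional to the current length of A, and duplicate values in A are handled correctly because nothing is looked up by value.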

How to add a sign between each letter in a string in C#?

I have a task in which I have to write a function called accum, which transforms a given string into something like this:
Accumul.Accum("abcd"); // "A-Bb-Ccc-Dddd"
Accumul.Accum("RqaEzty"); // "R-Qq-Aaa-Eeee-Zzzzz-Tttttt-Yyyyyyy"
Accumul.Accum("cwAt"); // "C-Ww-Aaa-Tttt"
So far I have only converted each letter to uppercase and... Now that I am writing about it, I think it could be easier to first multiply each letter and then add a dash. Okay, let's say I have already multiplied the letters (I will deal with that later) and now I need to add the dashes. I tried several ways to solve this, including for and foreach (and now that I think of it, I can't use foreach if I want to add a dash after multiplying the letters) with String.Join, String.Insert, or something called StringBuilder with Append (which I don't exactly understand), and none of them does anything to the string.
One of those loops that I tried was:
for (int letter = 0; letter < s.Length-1; letter += 2) {
if (letter % 2 == 0) s.Replace("", "-");
}
and
for (int letter = 0; letter < s.Length; letter++) {
return String.Join(s, "-");
}
The second one gives an "unreachable code" error. What am I doing wrong here, so that nothing happens to the string (after the uppercase conversion)? Also, is there any method to copy each letter in order to increase the number of them?

As you say, string.Join can be used as long as an enumerable is created instead of using a foreach. Since the string itself is enumerable, you can use the LINQ Select overload that includes an index:
var input = "abcd";
var res = string.Join("-", input.Select((c,i) => Char.ToUpper(c) + new string(Char.ToLower(c),i)));
(Repeated characters are handled by position, e.g. "aab" would become "A-Aa-Bbb".)
Explanation:
The Select extension method takes a lambda function as a parameter, with c being a char and i the index. The lambda returns an uppercase version of the char (c) followed by a string consisting of the lowercase char repeated i times (new string(char, length)), which is an empty string for the first index. Finally, string.Join concatenates the resulting enumeration with a - between each element.

Use this code.
string result = String.Empty;
for (int i = 0; i < s.Length; i++)
{
    char c = s[i];
    result += char.ToUpper(c);
    result += new String(char.ToLower(c), i);
    if (i < s.Length - 1)
    {
        result += "-";
    }
}
It would be better to use StringBuilder instead of string concatenation, but this code may be a bit clearer.
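For comparison, a StringBuilder version of the same loop could look like the sketch below. The class and method names Accumul and Accum are taken from the question; everything else is just one possible way to write it.

```csharp
using System;
using System.Text;

public static class Accumul
{
    public static string Accum(string s)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < s.Length; i++)
        {
            if (i > 0) sb.Append('-');          // dash between groups only
            sb.Append(char.ToUpper(s[i]));      // first letter of the group
            sb.Append(char.ToLower(s[i]), i);   // Append(char, repeatCount)
        }
        return sb.ToString();
    }
}
```

StringBuilder collects the pieces in one buffer, whereas every += on a string creates a new string, which matters mostly for long inputs.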

Strings are immutable, which means that you cannot modify them once you have created them. It means that the Replace function returns a new string that you need to capture somehow:
s = s.Replace("x", "-");
You are currently not assigning the result of the Replace method anywhere, which is why you don't see any results.

For the future, the best way to approach problems like this one is not to search for a code snippet, but to write down a step-by-step algorithm of how you can achieve the expected result in plain English or some other pseudocode, e.g.
Given I have input string 'abcd' which should turn into output string 'A-Bb-Ccc-Dddd'.
Copy first character 'a' from the input to Buffer.
Store the index of the character to Index.
If Buffer has only one character make it Upper Case.
If Index is greater than 1, append Index-1 lower case characters to Buffer.
Append dash '-' to the Buffer.
Copy Buffer content to Output and clear Buffer.
Copy second character 'b' from the input to Buffer.
...
etc.
The aha moment often happens on the third iteration. Hope it helps! :)

C# Search array within provided index points

I'm not sure how best to phrase this. I have a text file of almost 80,000 words which I have converted across to a string array.
Basically I want a method where I pass it a word and it checks if it's in the word string array. To save it searching all 80,000 words each time, I have indexed the locations where the words beginning with each letter start and end in a two-dimensional array. So wordIndex[0,0] = 0 is where the 'A' words start and wordIndex[1,0] = 4407 is where they end. Then wordIndex[0,1] = 4408 is where the words beginning with 'B' start, etc.
What I would like to know is how can I present this range to a method to have it search for a value. I know I can give an index and length but is this the only way? Can I say look for x within range y and z?

Look at a trie. It can store many words in little memory and supports quick lookups. Here is a good implementation.
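To give an idea of what such a structure looks like, here is a minimal trie sketch. It is not the implementation referred to above; the class name WordTrie is hypothetical, and a Dictionary per node is chosen for simplicity rather than for the smallest possible memory footprint.

```csharp
using System;
using System.Collections.Generic;

public sealed class WordTrie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsWord; // true if a stored word ends at this node
    }

    private readonly Node root = new Node();

    public void Add(string word)
    {
        Node node = root;
        foreach (char c in word)
        {
            if (!node.Children.TryGetValue(c, out Node next))
            {
                next = new Node();
                node.Children[c] = next;
            }
            node = next;
        }
        node.IsWord = true;
    }

    public bool Contains(string word)
    {
        Node node = root;
        foreach (char c in word)
        {
            // Follow one edge per character; stop as soon as one is missing.
            if (!node.Children.TryGetValue(c, out node)) return false;
        }
        return node.IsWord;
    }
}
```

A lookup takes one step per character of the word, regardless of how many words are stored.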

Basically you could use a for loop to search just a part of the array:
string word = "apple";
int start = 0;
int end = 4407;
bool found = false;
for (int i = start; i <= end; i++)
{
    if (arrayOfWords[i] == word)
    {
        found = true;
        break;
    }
}
But since the description of your index implies that your array is already sorted, a better way might be to go with Array.BinarySearch<T>.
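As a sketch of the binary search variant, Array.BinarySearch has an overload that takes a start index and a length, so the range can be passed directly. The wrapper name ContainsInRange is made up for this example, and it assumes the words in the range are sorted in ordinal order; if the file was sorted with a different comparison, the matching comparer has to be passed instead.

```csharp
using System;

public static class WordLookup
{
    // Searches words[start..end] (both ends inclusive) for word.
    // Assumes that part of the array is sorted in ordinal order.
    public static bool ContainsInRange(string[] words, string word, int start, int end)
    {
        int index = Array.BinarySearch(words, start, end - start + 1, word, StringComparer.Ordinal);
        return index >= 0; // a negative result means "not found"
    }
}
```

With the index from the question, a call for an 'A' word would be ContainsInRange(arrayOfWords, word, 0, 4407).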

Randomly select a specific quantity of indices from an array?

I have an array of boolean values and need to randomly select a specific quantity of indices for values which are true.
What is the most efficient way to generate the array of indices?
For instance,
BitArray mask = GenerateSomeMask(length: 100000);
int[] randomIndices = RandomIndicesForTrue(mask, quantity: 10);
In this case the length of randomIndices would be 10.

There's a faster way to do this that requires only a single scan of the list.
Consider picking a line at random from a text file when you don't know how many lines are in the file, and the file is too large to fit in memory. The obvious solution is to read the file once to count the lines, pick a random number in the range of 0 to Count-1, and then read the file again up to the chosen line number. That works, but requires you to read the file twice.
A faster solution is to read the first line and save it as the selected line. You replace the selected line with the next line with probability 1/2. When you read the third line, you replace with probability 1/3, etc. When you've read the entire file, you have selected a line at random, and every line had equal probability of being selected. The code looks something like this:
string selectedLine = null;
int numLines = 0;
Random rnd = new Random();
foreach (var line in File.ReadLines(filename))
{
    ++numLines;
    double prob = 1.0/numLines;
    // NextDouble() is uniform in [0, 1), so this is true with probability prob.
    if (rnd.NextDouble() < prob)
        selectedLine = line;
}
Now, what if you want to select 2 lines? You select the first two. Then, as each further line is read, the probability that it will replace one of the two lines is 2/n, where n is the number of lines read so far (including the current one). If you determine that you need to replace a line, you randomly select the line to be replaced. You can follow that same basic idea to select any number of lines at random. For example:
string[] selectedLines = new string[M];
int numLines = 0;
Random rnd = new Random();
foreach (var line in File.ReadLines(filename))
{
    ++numLines;
    if (numLines <= M)
    {
        selectedLines[numLines-1] = line;
    }
    else
    {
        // Replace one of the selected lines with probability M/numLines.
        double prob = (double)M/numLines;
        if (rnd.NextDouble() < prob)
        {
            int ix = rnd.Next(M);
            selectedLines[ix] = line;
        }
    }
}
You can apply that to your BitArray quite easily:
int[] selected = new int[quantity];
int num = 0; // number of True items seen
Random rnd = new Random();
for (int i = 0; i < items.Length; ++i)
{
    if (items[i])
    {
        ++num;
        if (num <= quantity)
        {
            selected[num-1] = i;
        }
        else
        {
            // Replace one of the selected indexes with probability quantity/num.
            double prob = (double)quantity/num;
            if (rnd.NextDouble() < prob)
            {
                int ix = rnd.Next(quantity);
                selected[ix] = i;
            }
        }
    }
}
You'll need some special case code at the end to handle the case where there aren't quantity set bits in the array, but you'll need that with any solution.
This makes a single pass over the BitArray, and the only extra memory it uses is for the list of selected indexes. I'd be surprised if it wasn't significantly faster than the LINQ version.
Note that I used the probability calculation to illustrate the math. You can change the inner loop code in the first example to:
if (rnd.Next(numLines+1) == numLines)
{
    selectedLine = line;
}
++numLines;
You can make a similar change to the other examples. That does the same thing as the probability calculation, and should execute a little faster because it eliminates a floating point divide for each item.
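Applied to the BitArray case, the integer-only variant might look like the sketch below. The method name RandomIndicesForTrue comes from the question; the class name MaskSampler, the explicit Random parameter, and the choice to return a shorter array when there are fewer set bits than requested are assumptions made for this example.

```csharp
using System;
using System.Collections;

public static class MaskSampler
{
    public static int[] RandomIndicesForTrue(BitArray mask, int quantity, Random rnd)
    {
        var selected = new int[quantity];
        int seen = 0; // number of true bits seen so far
        for (int i = 0; i < mask.Length; i++)
        {
            if (!mask[i]) continue;
            if (seen < quantity)
            {
                selected[seen] = i; // fill the reservoir first
            }
            else
            {
                // slot is uniform in [0, seen]; it lands in the reservoir
                // with probability quantity / (seen + 1).
                int slot = rnd.Next(seen + 1);
                if (slot < quantity) selected[slot] = i;
            }
            seen++;
        }
        // Special case: fewer set bits than requested.
        if (seen < quantity) Array.Resize(ref selected, seen);
        return selected;
    }
}
```

Drawing one integer per set bit decides both whether to replace and which slot to overwrite, so no floating point arithmetic is needed.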

There are two families of approaches you can use: deterministic and non-deterministic. The first one involves finding all the eligible elements in the collection and then picking N at random; the second involves randomly reaching into the collection until you have found N eligible items.
Since the size of your collection is not negligible at 100K and you only want to pick a few out of those, at first sight non-deterministic sounds like it should be considered because it can give very good results in practice. However, since there is no guarantee that N true values even exist in the collection, going non-deterministic could put your program into an infinite loop (less catastrophically, it could just take a very long time to produce results).
Therefore I am going to suggest going for a deterministic approach, even though you are going to pay for the guarantees you need through the nose with resource usage. In particular, the operation will involve in-place sorting of an auxiliary collection; this will practically undo the nice space savings you got by using BitArray.
Theory aside, let's get to work. The standard way to handle this is:
Filter all eligible indices into an auxiliary collection.
Randomly shuffle the collection with Fisher-Yates (there's a convenient implementation on StackOverflow).
Pick the first N items of the shuffled collection. If there are fewer than N, then your input cannot satisfy your requirements.
Translated into LINQ:
var results = mask
    .Cast<bool>()                          // BitArray only implements the non-generic IEnumerable
    .Select((f, i) => Tuple.Create(i, f))  // project into index/bool pairs
    .Where(t => t.Item2)                   // keep only those where bool == true
    .Select(t => t.Item1)                  // extract indices
    .ToList()                              // prerequisite for next step
    .Shuffle()                             // Fisher-Yates (the extension referred to above)
    .Take(quantity)                        // pick N
    .ToArray();                            // into an int[]
if (results.Length < quantity)
{
    // not enough true values in input
}
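The Shuffle() step in the pipeline above relies on an extension method that is not shown. A minimal Fisher-Yates sketch could look like this; it is named FisherYatesShuffle here (a hypothetical name, as is the class ShuffleExtensions) to avoid clashing with any Shuffle method that may already be in scope, and it takes the Random instance as a parameter.

```csharp
using System;
using System.Collections.Generic;

public static class ShuffleExtensions
{
    // Shuffles the list in place and returns it so calls can be chained.
    public static List<T> FisherYatesShuffle<T>(this List<T> list, Random rnd)
    {
        for (int i = list.Count - 1; i > 0; i--)
        {
            int j = rnd.Next(i + 1); // 0 <= j <= i
            T tmp = list[i];
            list[i] = list[j];
            list[j] = tmp;
        }
        return list;
    }
}
```

In the pipeline, the call would then read .ToList().FisherYatesShuffle(rnd).Take(quantity).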

If you have 10 indices to choose from, you could generate a random number from 0 to 2^10 - 1, and use that as your mask.

Implementing an efficient algorithm to find the intersection of two strings

Implement an algorithm that takes two strings as input, and returns the intersection of the two, with each letter represented at most once.
Algo: (considering language used will be c#)
Convert both strings into char arrays.
Take the smaller array and generate a hash table for it, with the character as the key and 0 as the value.
Now loop through the other array and increment the count in the hash table if that char is present in it.
Now take out all chars from the hash table whose value is > 0.
These are the intersection values.
This is an O(n) solution, but it uses extra space: 2 char arrays and a hash table.
Can you guys think of better solution than this?

How about this ...
var s1 = "aabbccccddd";
var s2 = "aabc";
var ans = s1.Intersect(s2);

Haven't tested this, but here's my thought:
Sort the characters of both strings, so you have two ordered sequences of characters.
Keeping an index into both strings, compare the "next" character from each string. If they are equal, output the character (unless it was the last one output) and increment both indices; otherwise increment the index that points at the smaller character.
Continue until you get to the end of one of the strings; anything left in the other string cannot be part of the intersection.
Apart from the sorted copies, it only needs two integers and an output string (or StringBuilder). As an added bonus, the output values will be sorted too!
Part 2:
This is what I'd write (sorry about the comments, new to stackoverflow):
private static string intersect(string left, string right)
{
    StringBuilder theResult = new StringBuilder();
    // Program.sort is a helper that is not shown here; it is assumed to
    // return the characters of its argument in sorted order.
    string sortedLeft = Program.sort(left);
    string sortedRight = Program.sort(right);
    int leftIndex = 0;
    int rightIndex = 0;
    char lastChar = default(char);
    // Walk both sorted strings in step until either one is exhausted.
    while (leftIndex < sortedLeft.Length && rightIndex < sortedRight.Length)
    {
        char l = sortedLeft[leftIndex];
        char r = sortedRight[rightIndex];
        if (l < r) { leftIndex++; }
        else if (l > r) { rightIndex++; }
        else
        {
            // Common character: output it once, then advance both indices.
            if (theResult.Length == 0 || lastChar != l)
            {
                theResult.Append(l);
                lastChar = l;
            }
            leftIndex++;
            rightIndex++;
        }
    }
    return (theResult.ToString());
}
I hope that makes more sense.

You don't need the 2 char arrays. The System.String data type has a built-in indexer by position that returns the char at that position, so you could just loop from 0 to (String.Length - 1). If you're more interested in speed than in optimizing storage space, you could make a HashSet for one of the strings, then make a second HashSet which will contain your final result. Then you iterate through the second string, testing each char against the first HashSet, and if it exists, add it to the second HashSet. By the end, you already have a single HashSet with all the intersections, and you save yourself the pass through the Hashtable looking for entries with a non-zero value.
EDIT: I entered this before all the comments on the question about not wanting to use any built-in containers at all
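A sketch of the two-HashSet idea is shown below. The names HashSetIntersection and Intersect are invented for this example, and returning the result as a string in order of first appearance in the second string is an extra choice that the answer does not prescribe.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class HashSetIntersection
{
    public static string Intersect(string first, string second)
    {
        var inFirst = new HashSet<char>(first); // every distinct char of first
        var emitted = new HashSet<char>();      // chars already in the result
        var result = new StringBuilder();
        foreach (char c in second)
        {
            // HashSet<T>.Add returns false when the item is already present,
            // so each common char is appended at most once.
            if (inFirst.Contains(c) && emitted.Add(c)) result.Append(c);
        }
        return result.ToString();
    }
}
```

Both passes are linear in the length of their string, at the cost of the two sets.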

Here's how I would do this. It's still O(N) and it doesn't use a hash table, but instead (ideally) one int array of length 26.
Make an array of 26 integers, one element for each letter of the alphabet. Initialize it to 0s.
Iterate over the first string, decrementing the matching element by one each time a letter is encountered.
Iterate over the second string and take the absolute value of whatever is at the index corresponding to any letter you encounter. (edit: thanks to scwagner in comments)
Return all letters corresponding to the indexes holding a value greater than 0.
That is still O(N), with extra space of only 26 ints.
Of course, if you're not limited to only lowercase or only uppercase characters, your array size may need to change.
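The array-based steps above can be sketched as follows, assuming only the lowercase letters a to z matter; any other character is skipped, which is a choice made for this example, as are the names LetterTableIntersection and Intersect.

```csharp
using System;
using System.Text;

public static class LetterTableIntersection
{
    public static string Intersect(string first, string second)
    {
        var table = new int[26]; // one slot per letter, initially 0
        foreach (char c in first)
        {
            // Letters of the first string push their slot below zero.
            if (c >= 'a' && c <= 'z') table[c - 'a']--;
        }
        foreach (char c in second)
        {
            // A letter also seen in the first string becomes positive;
            // a letter seen only here stays at 0.
            if (c >= 'a' && c <= 'z') table[c - 'a'] = Math.Abs(table[c - 'a']);
        }
        var result = new StringBuilder();
        for (int i = 0; i < table.Length; i++)
        {
            if (table[i] > 0) result.Append((char)('a' + i));
        }
        return result.ToString();
    }
}
```

Letters that occur only in the first string keep a negative value and letters that occur only in the second keep 0, so only the common letters end up positive.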

"with each letter represented at most once"
I'm assuming that this means you just need to know the intersections, and not how many times they occurred. If that's so then you can trim down your algorithm by making use of yield. Instead of storing the count and continuing to iterate the second string looking for additional matches, you can yield the intersection right there and continue to the next possible match from the first string.
