Recursive woes - reducing an input string

I'm working on a portion of code that is essentially trying to reduce a list of strings down to a single string recursively.
I have an internal database built up of matching string arrays of varying length (say array lengths of 2-4).
An example input string array would be:
{"The", "dog", "ran", "away"}
And for further example, my database could be made up of string arrays in this manner:
(length 2) {{"The", "dog"},{"dog", "ran"}, {"ran", "away"}}
(length 3) {{"The", "dog", "ran"}.... and so on
So, what I am attempting to do is recursively reduce my input string array down to a single token. So ideally it would parse something like this:
1) {"The", "dog", "ran", "away"}
Say that (seq1) = {"The", "dog"} and (seq2) = {"ran", "away"}
2) { (seq1), "ran", "away"}
3) { (seq1), (seq2)}
In my sequence database I know that, for instance, seq3 = {(seq1), (seq2)}
4) { (seq3) }
So, when it is down to a single token, I'm happy and the function would end.
Here is an outline of my current program logic:
public void Tokenize(Arraylist<T> string_array, int current_size)
{
// retrieve all known sequences of length [current_size] (from global list array)
loc_sequences_by_length = sequences_by_length[current_size-min_size]; // sequences of length 2 are stored in position 0 and so on
// escape cases
if (string_array.Count == 1)
{
// finished successfully
return;
}
else if (string_array.Count < current_size)
{
// checking sequences of greater length than input string, bail
return;
}
else
{
// split input string into chunks of size [current_size] and compare to local database
// of known sequences
// (splitting code works fine)
foreach (comparison)
{
if (match_found)
{
// update input string and recall function to find other matches
string_array[found_array_position] = new_sequence;
string_array.Removerange[found_array_position+1, new_sequence.Length-1];
Tokenize(string_array, current_size)
}
}
}
// ran through unsuccessfully, increment length and try again for new sequence group
current_size++;
if (current_size > MAX_SIZE)
return;
else
Tokenize(string_array, current_size);
}
I thought it was straightforward enough, but have been getting some strange results.
Generally it appears to work, but upon further review of my output data I'm seeing some issues. Mainly, it appears to work up to a certain point, and at that point my 'current_size' counter resets to the minimum value.
So it is called with a size of 2, then 3, then 4, then resets to 2.
My assumption was that it would run up to my predetermined max size, and then bail completely.
I tried to simplify my code as much as possible, so there are probably some simple syntax errors in transcribing. If there is any other detail that may help an eagle-eyed SO user, please let me know and I'll edit.
Thanks in advance

One bug is:
string_array[found_array_position] = new_sequence;
I don't know where new_sequence is defined, and as far as I can tell, if it is defined, it is never changed.
In your if statement, when is match_found ever set to true?
Also, it appears you have an extra close brace here, but you may want the last block of code to be outside of the function:
}
}
}
It would help if you cleaned up the code, to make it easier to read. Once we get past the syntactic errors it will be easier to see what is going on, I think.

Not sure what all the issues are, but the first thing I'd do is have your "catch-all" exit block right at the beginning of your method.
public void Tokenize(Arraylist<T> string_array, int current_size)
{
if (current_size > MAX_SIZE)
return;
// Guts go here
Tokenize(string_array, ++current_size);
}
A couple things:
Your tokens are not clearly separated from your input string values. This makes it more difficult to handle, and to see what's going on.
It looks like you're writing pseudo-code:
loc_sequences_by_length is not used
found_array_position is not defined
Arraylist should be ArrayList.
etc.
Overall I agree with James' statement:
It would help if you cleaned up the code, to make it easier to read.
-Doug
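For what it's worth, here is a minimal iterative sketch of the reduction described in the question. It is not the asker's code: the class name SequenceReducer, the dictionary-based database (joined sequence mapped to its replacement token) and the space-joined keys are all assumptions made for illustration. Because it loops instead of recursing, there are no older stack frames that resume with their own copy of current_size, which is one likely reason the counter appeared to "reset" to 2 in the original.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceReducer
{
    // 'known' is a hypothetical database: space-joined sequence -> replacement token.
    // Repeatedly replaces the first known sequence (smallest sizes first)
    // until one token remains or no known sequence applies.
    public static List<string> Reduce(List<string> tokens,
        Dictionary<string, string> known, int minSize, int maxSize)
    {
        var work = new List<string>(tokens);   // leave the caller's list untouched
        bool replaced = true;
        while (work.Count > 1 && replaced)
        {
            replaced = false;
            for (int size = minSize; size <= maxSize && !replaced; size++)
            {
                for (int i = 0; i + size <= work.Count; i++)
                {
                    string key = string.Join(" ", work.GetRange(i, size));
                    if (known.TryGetValue(key, out string token))
                    {
                        work[i] = token;                    // collapse the match into one token
                        work.RemoveRange(i + 1, size - 1);  // drop the rest of the match
                        replaced = true;
                        break;                              // rescan from the smallest size
                    }
                }
            }
        }
        return work;
    }
}
```

With the rules "The dog" -> "seq1", "ran away" -> "seq2" and "seq1 seq2" -> "seq3", the input {"The", "dog", "ran", "away"} reduces to the single token "seq3". If no rule applies, the method returns whatever tokens are left, so the caller can tell success from failure by checking the count.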

Related

Dealing with duplicate characters

I'm making a wild west duelling game based on typing of the dead. You have a word to write in a certain amount of time. You win if you type out the word in time, you lose if you type it incorrectly/press the wrong button or if the time runs out.
Currently I've got everything working fine. A slight issue, however, is with how I'm dealing with displaying the letters you have to type on the screen.
Each character is stored into an array that is looped through and displayed on the screen. When the player presses the correct button, the corresponding display should turn red which it does most of the time. The times where it doesn't is when there are duplicate characters.
For example if I was typing the word 'dentist', when I type the first t, it won't turn red. However, when I get to the second t and press it, both turn red. I assume this is because I'm looping through each displayed character and checking whether its relevant input is being pressed; because there are two and I can only type one character at a time, one is always false, which 'overrides' the one that is true. I'm not sure how to implement a solution with how I'm currently dealing with input, so any help is appreciated!
Code:
if (Duelling)
{
if (currentWord.Count > 0 && Input.inputString == currentWord[0].ToLower())
{
print(Input.inputString);
string pressedKey = currentWord[0];
currentWord.Remove(currentWord[0]);
}
else if (Input.inputString != "" && Input.inputString != currentWord[0].ToLower())
{
DuelLost();
}
if (currentWord.Count <= 0)
{
DuelWon();
}
foreach(Transform Keypad in keyDisplay.transform)
{
//print(Keypad.Find("KeyText").GetComponent<Text>().text);
Keypad.Find("KeyText").GetComponent<Text>().color = currentWord.Contains(Keypad.Find("KeyText").GetComponent<Text>().text) ? Color.black : Color.red;
}
}

I believe the issue lies in your colour-updating logic. Contains naturally returns true if your array, well, contains the text you're looking for. Since the second T in "dentist" is still present in the array after you type the first one in, the component isn't going to change its colour. When inputting the second T, all instances of Ts are cleared from the list, and since you loop over all of your Text components all the time, both of them will become red.
No offence, but you're going about this rather... crudely. Allow me to suggest a more elegant method:
public string currentWord;
private List<Text> letterViews = new List<Text>();
private int curIndex = 0;
void Start() {
    // Populate the list of views ONCE, don't look for them every single time
    letterViews = ... // How you do this is entirely up to you
}
void Update() {
    // ...
    if (Duelling) {
        // If we've gone through the whole word, we're good
        if (curIndex >= currentWord.Length) {
            DuelWon();
            return;
        }
        // No key was typed this frame, so there is nothing to check yet
        if (Input.inputString.Length == 0) return;
        // Now check input:
        // Note that inputString, which I've never used before, is NOT a single character, but
        // you're using only its first character; I'll do the same, as your solution seems to work.
        if (Input.inputString[0] == currentWord[curIndex]) {
            // If the correct character was typed, make the label red and increment index
            letterViews[curIndex].color = Color.red;
            curIndex++;
        }
        else DuelLost();
    }
}
I daresay that this is a much simpler solution. DuelWon and DuelLost shall reset the index to 0, clear the text in all letterViews and turn them back to black, perhaps.
How to populate the list of views: you can make it public and manually link them by hand through the inspector (boring), or you can do it iteratively using Transform.GetChild(index). You've probably got enough Text views to accommodate your longest words; I recommend filling the list up with them all. You only do it once, you lose no performance by doing so, and you can re-use them for any words in your dictionary.
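To make the index-based idea concrete without depending on Unity, here is a small sketch of just the bookkeeping. The class WordProgress and its members are hypothetical names, not part of Unity or of the code above. The point is that a letter is coloured by its position relative to the index, not by whether the same character still occurs somewhere in the remaining word.

```csharp
using System;

// Hypothetical helper, independent of Unity: tracks how far into the word
// the player has typed, so duplicate letters are told apart by position.
public class WordProgress
{
    private readonly string word;

    public int Index { get; private set; }

    public WordProgress(string word) { this.word = word; }

    public bool IsComplete => Index >= word.Length;

    // Returns true and advances if 'typed' matches the next expected letter.
    public bool TryType(char typed)
    {
        if (IsComplete) return false;
        if (char.ToLowerInvariant(typed) != char.ToLowerInvariant(word[Index])) return false;
        Index++;
        return true;
    }

    // A letter at 'position' should be shown red once it has been typed.
    public bool IsTyped(int position) => position < Index;
}
```

In Update you would call TryType with the typed character and then colour letterViews[i] red wherever IsTyped(i) is true. For "dentist", after typing d, e, n, t only the first t (position 3) counts as typed, while the second t (position 6) stays black.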

Replace char(0x10) with a String (The Optimized way)

This is a common question but I hope this does not get tagged as a duplicate since the nature of the question is different (please read the whole not only the title)
Unaware of the existence of String.Replace I wrote the following:
int theIndex = 0;
while ((theIndex = message.IndexOf(separationChar, theIndex)) != -1) //we found the character
{
theIndex++;
if (theIndex < message.Length)//not in the last position
{
message = message.Insert(theIndex, theTime);
}
else
{
// I don't think this is really necessary
break;
}
} //while finding characters
As you can see I am replacing occurrences of separationChar in the message String with a String called "theTime".
Now, this works ok for small strings but I have been given a really huge String (in the order of several hundred Kbytes- by the way is there a limit for String or StringBuilder??) and it takes a lot of time...
So my questions are:
1) Is it more efficient if I just do
oldString=separationChar.ToString();
newString=oldString+theTime;
message= message.Replace(oldString,newString);
2) Is there any other way I can process very long Strings to insert a String (theTime) when finding some char in a very fast and efficient way??
Thanks a lot

As Danny already mentioned, string.Insert() actually creates a new instance each time you use it, and these also have to be garbage collected at some point.
You could instead start with an empty StringBuilder to construct the result string:
public static string Replace(this string str, char find, string replacement)
{
StringBuilder result = new StringBuilder(str.Length); // initial capacity
int pointer = 0;
int index;
while ((index = str.IndexOf(find, pointer)) >= 0)
{
// Append the unprocessed data up to the character
result.Append(str, pointer, index - pointer);
// Append the replacement string
result.Append(replacement);
// Next unprocessed data starts after the character
pointer = index + 1;
}
// Append the remainder of the unprocessed data
result.Append(str, pointer, str.Length - pointer);
return result.ToString();
}
This will not cause a new string to be created (and garbage collected) for each occurrence of the character. Instead, when the internal buffer of the StringBuilder is full, it will create a new buffer chunk "of sufficient capacity". Quote from reference source, when its buffer is full:
Compute the length of the new block we need
We make the new chunk at least big enough for the current need (minBlockCharCount), but also as big as the current length (thus doubling capacity), up to a maximum
(so we stay in the small object heap, and never allocate really big chunks even if
the string gets really big).
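Since the question actually inserts theTime after each separator rather than replacing it, a single-pass variant might look like the sketch below. The names SeparatorStamper and InsertAfterEach are made up for this example. Unlike the loop in the question, it also appends the text after a separator in the very last position; add a check if that is not wanted.

```csharp
using System;
using System.Text;

public static class SeparatorStamper
{
    // Copies 'message' once, appending 'insert' after every occurrence of 'separator'.
    public static string InsertAfterEach(string message, char separator, string insert)
    {
        var result = new StringBuilder(message.Length);   // initial capacity, grows as needed
        foreach (char c in message)
        {
            result.Append(c);
            if (c == separator) result.Append(insert);
        }
        return result.ToString();
    }
}
```

For the same input this should give the same string as message.Replace(sep.ToString(), sep.ToString() + insert), so which one is faster for a several-hundred-kilobyte string is something to measure rather than assume.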

Thank you for answering my question.
I am writing an answer because I have to report that I tried the solution in my question 1) and it is indeed more efficient according to the results of my program. String.Replace can replace a string (built from a char) with another string very fast.
oldString=separationChar.ToString();
newString=oldString+theTime;
message= message.Replace(oldString,newString);

Can I GET and SET an array in C#?

HOMEWORK QUESTION:
I need to create a simple trivia game that reads from a CSV file. My data for a particular question is structured as follows: "Question;AnswerA;AnswerB;AnswerC;AnswerD;CorrectAnswerLetter".
We're using a series of getters and setters to hold all the relevant data for a single question object, and I'm running into a problem with the array I've created to hold the four answers.
In my constructor, I'm using this code--which I believe instantiates the Answer array in question:
class TriviaQuestionUnit
{
...
const int NUM_ANSWERS = 4;
string[] m_Answers = new String[NUM_ANSWERS];
public string[] Answer
{
get { return m_Answers[]; }
set { m_Answers = value[];
}
...
// Answer array
public string[] GETAnswer(int index)
{
return m_Questions[index].Answer;
}
...
}
I'm accessing the getter and setter from my TriviaQuestionBank method, which includes this code:
...
const int NUM_QUESTIONS = 15;
TriviaQuestionUnit[] m_Questions = new TriviaQuestionUnit[NUM_QUESTIONS];
...
// Answer array
public string[] GETAnswer(int index)
{
return m_Questions[index].Answer;
}
...
I'm using StreamReader to read a line of input from my file
...
char delim = ';';
String[] inputValues = inputText.Split(delim);
...
parses the input in an array from which I create the question data. For my four answers, index 1 through 4 in the inputValues array, I populate this question's array with four answers.
...
for (int i = 0; i < NUM_ANSWERS; i++)
{
m_Questions[questionCounter].Answer[i] = inputValues[i + 1];
}
...
I'm getting errors of Syntax code, value expected on the getters/setters in my constructor, and if I change the variable to m_Answers[NUM_QUESTIONS] I get an error that I can't implicitly convert string to String[].
Hopefully I've posted enough code for someone to help point me in the right direction. I feel like I'm missing something obvious, but I just cannot make this work.

Your code has some errors that will cause compilation errors, so my first lesson for you is going to be: listen to the compiler. Some of the errors might seem a bit hard to understand sometimes, but I can assure you that a lot of other people have had the same problems before; googling a compiler error often gives you examples from other people that are similar to your issue.
You say "In my constructor", but the problem is that your code does not have a constructor. You do however initialize fields and properties on your class and surely enough, the compiler will create a default constructor for you, but you have not defined one yourself. I am not saying that your code does not work because you do not have a constructor, but you might be using the wrong terms.
The first problem is in your first code snippet inside TriviaQuestionUnit. Your first two lines are working correctly, you are creating a constant integer with the value 4 that you use to determine how large your array is going to be and then you initialize the array with that given number.
When you do new string[NUM_ANSWERS] this will create an array whose elements all have the default value (null for strings).
The first problem that arises in your code is the getters and setters. The property expects you to return an array of strings which the method signature in fact is telling us:
public string[] Answer
However, looking at the getter and setter, what is it that you return?
m_Answers is a "reference" to your array, hence that whenever you write m_Answers you are referring to that array. So what happens when we add the square brackets?
Adding [] after the variable name of an array indicates that we want to retrieve a value from within the array. This is called the indexer, we supply it with an index of where we want to retrieve the value from within the array (first value starts at index 0). However, you don't supply it with a value? So what is returned?
Listen to the compiler!
Indexer has 1 parameter(s) but is invoked with (0) argument(s)
What does this tell you? It tells you that it doesn't expect the empty [] but it would expect you to supply the indexer with a number, for instance 0 like this: [0]. The problem with doing that here, though, is that this would be a mismatch with the property's declared type, string[].
So what is it that we want?
We simply want to return the array that we created, so just remove [] and return m_Answers directly like this:
public string[] Answer
{
get { return m_Answers; }
set { m_Answers = value; }
}
Note that you were also missing a curly bracket at the end of the set.
When fixing this, there might be more issues in your code, but trust the compiler and try to listen to it!
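Putting the pieces together, a minimal sketch of the class could look like this. The Question and CorrectAnswer properties and the Parse method are assumptions added for illustration; they are not in your post. Also keep in mind that new TriviaQuestionUnit[NUM_QUESTIONS] only creates empty slots: each element of m_Questions is null until you assign a new TriviaQuestionUnit to it, so that may well be the next error you run into.

```csharp
using System;

public class TriviaQuestionUnit
{
    public const int NUM_ANSWERS = 4;
    private string[] m_Answers = new string[NUM_ANSWERS];

    public string Question { get; set; }
    public string CorrectAnswer { get; set; }

    // The property exposes the whole array; callers index into it.
    public string[] Answer
    {
        get { return m_Answers; }
        set { m_Answers = value; }
    }

    // Hypothetical factory: builds a unit from one "Question;A;B;C;D;Letter" line.
    public static TriviaQuestionUnit Parse(string line)
    {
        string[] inputValues = line.Split(';');
        var unit = new TriviaQuestionUnit { Question = inputValues[0] };
        for (int i = 0; i < NUM_ANSWERS; i++)
        {
            unit.Answer[i] = inputValues[i + 1];   // answers sit at index 1 through 4
        }
        unit.CorrectAnswer = inputValues[NUM_ANSWERS + 1];
        return unit;
    }
}
```

In the question bank you would then write something like m_Questions[questionCounter] = TriviaQuestionUnit.Parse(inputText); for each line read from the file.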

All the paths between 2 nodes in graph

I have to make an uninformed search (Breadth-First Search) program which takes two nodes and returns all the paths between them.
public void BFS(Nod start, Nod end) {
Queue<Nod> queue = new Queue<Nod>();
queue.Enqueue(start);
while (queue.Count != 0)
{
Nod u = queue.Dequeue();
if (u == end) break;
else
{
u.data = "Visited";
foreach (Edge edge in u.getChildren())
{
if (edge.getEnd().data == "")
{
edge.getEnd().data = "Visited";
if (edge.getEnd() != end)
{
edge.getEnd().setParent(u);
}
else
{
edge.getEnd().setParent(u);
cost = 0;
PrintPath(edge.getEnd(), true);
edge.getEnd().data = "";
//return;
}
}
queue.Enqueue(edge.getEnd());
}
}
}
}
My problem is that I only get two paths instead of all of them, and I don't know what to edit in my code to get them all. The input of my problem is based on this map:

In the BFS algorithm you must not stop after you find a solution. One idea is to set data to null for all the cities you visited except the first one and let the function run a little bit longer. I don't have time to write you a snippet, but if you don't get it I will write at least some pseudocode. If you didn't understand my idea, post a comment with your question and I will try to explain better.

Breadth first search is a strange way to generate all possible paths for the following reason: you'd need to keep track of whether each individual path in the BFS had traversed the node, not that it had been traversed at all.
Take a simple example
1----2
 \    \
  3---4----5
We want all paths from 1 to 5. We queue up 1, then 2 and 3, then 4, then 5. We've lost the fact that there are two paths through 4 to 5.
I would suggest trying to do this with DFS, though this may be fixable for BFS with some thinking. Each thing queued would be a path, not a single node, so one could see if that path had visited each node. This is wasteful memory-wise, though.

A path is a sequence of vertices in which no vertex is repeated. Given this definition, you could write a recursive algorithm which works as follows: Pass four parameters to the function, call it F(u, v, intermediate_list, no_of_vertices), where u is the current source (which changes as we recurse), v is the destination, intermediate_list is a list of vertices which is initially empty (every time we use a vertex, we add it to the list to avoid using a vertex more than once in our path), and no_of_vertices is the length of the path that we would like to find, which is bounded below by 2 and above by V, the number of vertices. Essentially, the function returns the list of paths whose source is u, whose destination is v, and whose length is no_of_vertices. Create an initially empty list and make calls to F(u, v, {}, 2), F(u, v, {}, 3), ..., F(u, v, {}, V), each time merging the output of F into the list where we intend to store all paths. Try to implement this, and if you still face trouble, I'll write the pseudo-code for you.
Edit: Solving the above problem using BFS: Breadth first search is an algorithm that could be used to explore all the states of a graph. You could explore the graph of all paths of the given graph, using BFS, and select the paths that you want. For each vertex v, add the following states to the queue: (v, {v}, {v}), where each state is defined as: (current_vertex, list_of_vertices_already_visited, current_path). Now, while the queue is not empty, pop off the top element of the queue, for each edge e of the current_vertex, if the tail vertex x doesn't already exist in the list_of_vertices_already_visited, push the new state (x, list_of_vertices_already_visited + {x}, current_path -> x) to the queue, and process each path as you pop it off the queue. This way you can search the entire graph of paths for a graph, whether directed, or undirected.
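A rough C# sketch of that BFS over path states follows. It makes a few simplifying assumptions that are not in the description above: vertices are plain ints, the graph is an adjacency dictionary, only the start vertex is seeded into the queue, and the path itself doubles as the visited list. PathSearch and AllPathsBfs are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class PathSearch
{
    // Breadth-first enumeration of all simple paths from start to end.
    // Each queue entry is a whole path, so "visited" is tracked per path.
    public static List<List<int>> AllPathsBfs(
        Dictionary<int, List<int>> adjacency, int start, int end)
    {
        var results = new List<List<int>>();
        var queue = new Queue<List<int>>();
        queue.Enqueue(new List<int> { start });

        while (queue.Count > 0)
        {
            List<int> path = queue.Dequeue();
            int current = path[path.Count - 1];
            if (current == end)
            {
                results.Add(path);   // complete path; do not extend it further
                continue;
            }
            if (!adjacency.TryGetValue(current, out List<int> neighbours)) continue;
            foreach (int next in neighbours)
            {
                if (path.Contains(next)) continue;   // keep the path simple
                queue.Enqueue(new List<int>(path) { next });
            }
        }
        return results;
    }
}
```

On the small undirected graph with edges 1-2, 1-3, 2-4, 3-4 and 4-5, asking for paths from 1 to 5 yields both 1, 2, 4, 5 and 1, 3, 4, 5. Copying the path for every queued state costs memory, which is the trade-off for not losing paths that share a vertex.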

Sounds like homework. But the fun kind.
The following is pseudocode, is depth first instead of breadth first (so should be converted to a queue type algorithm), and may contain bugs, but the general gist should be clear.
class Node{
Vector[Link] connections;
String name;
}
class Link{
Node destination;
int distance;
}
Vector[Vector[Node]] paths(Node source, Node end_dest, Vector[Vector[Node]] routes){
for each route in routes{
bool has_next = false;
for each connection in source.connections{
if !(connection.destination in route) {
has_next = true;
route.push(connection.destination);
if (connection.destination != end_dest){
paths(connection.destination, end_dest, routes);
}
}
}
}
if !has_next {
routes.remove(route) //watch out here, might mess up the iteration
}
}
return routes;
}
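If it helps, here is a runnable depth-first take on the same idea in C#. It is a sketch under my own assumptions rather than a translation of the pseudocode: nodes are identified by name, the graph is an adjacency dictionary, distances are ignored, and a single route list is extended and then rolled back (backtracking) instead of keeping a list of routes. DepthFirstPaths, AllPaths and Visit are names invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class DepthFirstPaths
{
    // Returns every simple path from start to end.
    public static List<List<string>> AllPaths(
        Dictionary<string, List<string>> adjacency, string start, string end)
    {
        var results = new List<List<string>>();
        Visit(adjacency, start, end, new List<string> { start }, results);
        return results;
    }

    private static void Visit(Dictionary<string, List<string>> adjacency,
        string current, string end, List<string> route, List<List<string>> results)
    {
        if (current == end)
        {
            results.Add(new List<string>(route));   // copy: route is reused while backtracking
            return;
        }
        if (!adjacency.TryGetValue(current, out List<string> neighbours)) return;
        foreach (string next in neighbours)
        {
            if (route.Contains(next)) continue;     // already on this route
            route.Add(next);
            Visit(adjacency, next, end, route, results);
            route.RemoveAt(route.Count - 1);        // backtrack before trying the next neighbour
        }
    }
}
```

Because only one route is alive at a time, this uses far less memory than queueing whole paths, at the cost of not visiting paths in order of length.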
Edit: Is this actually the answer to the question you are looking for? Or do you actually want to find the shortest path? If it's the latter, use Dijkstra's algorithm: http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm

Piglatin using Arrays

Last night I was messing around with Piglatin using Arrays and found out I could not reverse the process. How would I shift the phrase and take out the chars "a" and "y" at the end of the word and return the original word in the phrase?
For instance if I entered "piggy" it would come out as "iggypay" shifting the word piggy so "p" is at the end of the word and "ay" is appended.
Here is the example code so you can try it as well.
public string ay;
public string PigLatin(string phrase)
{
string[] pLatin;
ArrayList pLatinPhrase = new ArrayList();
int wordLength;
pLatin = phrase.Split();
foreach (string pl in pLatin)
{
wordLength = pl.Length;
pLatinPhrase.Add(pl.Substring(1, wordLength - 1) + pl.Substring(0, 1) + "ay");
}
foreach (string p in pLatinPhrase)
{
ay += p;
}
return ay;
}
You will notice that this example is not programmed to find vowels and append them to the end along with "ay". It is simply a basic way of doing it.
If you were wondering how to reverse the above, try this example of uPigLatinify
public string way;
public string uPigLatinify(string word)
{
string[] latin;
int wordLength;
// Using ArrayList to store split words.
ArrayList Phrase = new ArrayList();
// Split string phrase into words.
latin = word.Split(' ');
foreach (string i in latin)
{
wordLength = i.Length;
if (wordLength > 0)
{
// Grab 3rd letter from the end of word and append to front
// of word chopping off "ay" as it was not included in the indexing.
Phrase.Add(i.Substring(wordLength - 3, 1) + i.Substring(0, wordLength - 3) + " ");
}
}
foreach (string _word in Phrase)
{
// Add words to string and return.
way += _word;
}
return way;
}

Please don’t take this the wrong way, but although you can probably get people here to give you the C# code to implement the algorithm you want, I suspect this is not enough if you want to learn how it works. To learn the basics of programming, there are some good tutorials to delve into (whether websites or books). In particular, if you aspire to be a programmer, you will need to learn not just how to write code. In your example:
You should first write a specification of what your PigLatin function is supposed to do. Think about all the corner-cases: What if the first letter is a vowel? What if there are several consonants at the beginning? What if there are only consonants? What if the input starts with a number, a parenthesis, or a space? What if the input string is empty? Write down exactly what should happen in all of these cases — even if it’s “throw an exception”.
Only then can you implement the algorithm according to the specification (i.e. write the actual C# code). While doing this, you may find that the specification is incomplete, in which case you need to go back and correct it.
Once your code is finished, you need to test it. Run it on several testcases, especially the corner-cases you came up with above: For example, try PigLatin("air"), PigLatin("x"), PigLatin("1"), PigLatin(""), etc. In each case, make yourself aware first what behaviour you expect, and then see if the behaviour matches your expectation. If it doesn’t, you need to go back and fix the code.
Once you have implemented the forward PigLatin algorithm and it works (read: passes all your testcases), then you will already have the skills needed to write the reverse function yourself. I guarantee you that you will feel accomplished and excited then! Whereas, if you just copy the code from this website, you are setting yourself up for feeling dumb because you will think other people can do it and you can’t.
Of course, we are nonetheless happy to help you with specific technical questions, for example “What is the difference between ArrayList and List<string>?” or “What does the scope of a local variable mean?” (but search first — these may have already been asked before) — but you probably shouldn’t ask to have the code fully written and finished for you.

The work to split the phrase into words and recombine the words after transforming them is the same as in the original case. The difficulty is in un-pig-latin-ifying an individual word. With some error checking, I imagine you could do this:
string UnPigLatinify(string word)
{
if ((word == null) || !Regex.IsMatch(word, @"^\w+ay$", RegexOptions.IgnoreCase))
return word;
return word[word.Length - 3] + word.Substring(0, word.Length - 3);
}
The regular expression just checks to make sure the word is at least 3 characters long, composed of word characters, and ends with "ay".
The actual transform takes the third to last letter (the original first letter) and appends the rest of the word minus the "ay" and the original letter.
Is this what you meant?
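For completeness, here is how the per-word transform might be wired into whole phrases, together with the simplified forward rule from the question so the round trip can be checked. The class and method names are my own, and the sketch assumes words are separated by single spaces and that every word ending in "ay" really is pig latin (a word like "play" that was never converted would be mangled).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PigLatinText
{
    // Simplified rule from the question: move the first letter to the end and add "ay".
    public static string ToPigLatin(string phrase)
    {
        return string.Join(" ", phrase
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => w.Substring(1) + w[0] + "ay"));
    }

    // Reverses ToPigLatin word by word.
    public static string FromPigLatin(string phrase)
    {
        return string.Join(" ", phrase
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => UnPigLatinify(w)));
    }

    private static string UnPigLatinify(string word)
    {
        if (!Regex.IsMatch(word, @"^\w+ay$", RegexOptions.IgnoreCase))
            return word;   // not in the expected form; leave it alone
        // Third-to-last letter is the original first letter; drop it and the "ay".
        return word[word.Length - 3] + word.Substring(0, word.Length - 3);
    }
}
```

Joining with string.Join avoids the trailing space that building the result with += leaves behind, and using locals instead of a public field means repeated calls do not accumulate earlier output.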
