I am looking for a c# function or routine that will center justify text.
For example, if I have a sentence, I have noticed that when the sentence is justified to the edges of the screen, that spaces are placed in the line. The inserted spaces start in the center and move out from there on both sides as needed as needed.
Is there a C# function that I can pass my string, say 50 chars, and get back a pretty 56 char string?
Thanks in advance,
Rob
Nice task. Here's a solution based on Linq extension methods. If you do not wish to use them, see history for previous version of code. In this example spaces on the left and right sides from center are 'equal' with respect to order of inserting.
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
public static String Justify(String s, Int32 count)
{
if (count <= 0)
return s;
Int32 middle = s.Length / 2;
IDictionary<Int32, Int32> spaceOffsetsToParts = new Dictionary<Int32, Int32>();
String[] parts = s.Split(' ');
for (Int32 partIndex = 0, offset = 0; partIndex < parts.Length; partIndex++)
{
spaceOffsetsToParts.Add(offset, partIndex);
offset += parts[partIndex].Length + 1; // +1 to count space that was removed by Split
}
foreach (var pair in spaceOffsetsToParts.OrderBy(entry => Math.Abs(middle - entry.Key)))
{
count--;
if (count < 0)
break;
parts[pair.Value] += ' ';
}
return String.Join(" ", parts);
}
static void Main(String[] args) {
String s = "skvb sdkvkd s kc wdkck sdkd sdkje sksdjs skd";
String j = Justify(s, 5);
Console.WriteLine("Initial: " + s);
Console.WriteLine("Result: " + j);
Console.ReadKey();
}
}
As far as I know, there is no "built-in" C# or .net library function for this, so you'd have to implement something on your own (or find some code online whose license suits your needs).
A simple greedy algorithm shouldn't be too difficult to implement, however:
Until the required number of characters is reached:
Extend the shortest sequence of spaces by one
(choose one randomly if there is more than one such sequence).
I'd distribute the spaces randomly rather than starting at the center, to make sure the spaces are evenly distributed among your text (rather than concentrated at one position). Oh, and keep in mind that some people consider fully-justified fixed-font text harder to read than left-justified fixed-font text.
Related
I have a string like this:
-82.9494547,36.2913021,0
-83.0784938,36.2347521,0
-82.9537782,36.079235,0
I need to have output like this:
-82.9494547 36.2913021, -83.0784938 36.2347521, -82.9537782,36.079235
I have tried this following to code to achieve the desired output:
string[] coordinatesVal = coordinateTxt.Trim().Split(new string[] { ",0" }, StringSplitOptions.None);
for (int i = 0; i < coordinatesVal.Length - 1; i++)
{
coordinatesVal[i] = coordinatesVal[i].Trim();
coordinatesVal[i] = coordinatesVal[i].Replace(',', ' ');
numbers.Append(coordinatesVal[i]);
if (i != coordinatesVal.Length - 1)
{
coordinatesVal.Append(", ");
}
}
But this process does not seem to me the professional solution. Can anyone please suggest more efficient way of doing this?
Your code is okay. You could dismiss temporary results and chain method calls
var numbers = new StringBuilder();
string[] coordinatesVal = coordinateTxt
.Trim()
.Split(new string[] { ",0" }, StringSplitOptions.None);
for (int i = 0; i < coordinatesVal.Length - 1; i++) {
numbers
.Append(coordinatesVal[i].Trim().Replace(',', ' '))
.Append(", ");
}
numbers.Length -= 2;
Note that the last statement assumes that there is at least one coordinate pair available. If the coordinates can be empty, you would have to enclose the loop and this last statement in if (coordinatesVal.Length > 0 ) { ... }. This is still more efficient than having an if inside the loop.
You ask about efficiency, but you don't specify whether you mean code efficiency (execution speed) or programmer efficiency (how much time you have to spend on it).
One key part of professional programming is to judge which one of these is more important in any given situation.
The other answers do a good job of covering programmer efficiency, so I'm taking a stab at code efficiency. I'm doing this at home for fun, but for professional work I would need a good reason before putting in the effort to even spend time comparing the speeds of the methods given in the other answers, let alone try to improve on them.
Having said that, waiting around for the program to finish doing the conversion of millions of coordinate pairs would give me such a reason.
One of the speed pitfalls of C# string handling is the way String.Replace() and String.Trim() return a whole new copy of the string. This involves allocating memory, copying the characters, and eventually cleaning up the garbage generated. Do that a few million times, and it starts to add up. With that in mind, I attempted to avoid as many allocations and copies as possible.
enum CurrentField
{
FirstNum,
SecondNum,
UnwantedZero
};
static string ConvertStateMachine(string input)
{
// Pre-allocate enough space in the string builder.
var numbers = new StringBuilder(input.Length);
var state = CurrentField.FirstNum;
int i = 0;
while (i < input.Length)
{
char c = input[i++];
switch (state)
{
// Copying the first number to the output, next will be another number
case CurrentField.FirstNum:
if (c == ',')
{
// Separate the two numbers by space instead of comma, then move on
numbers.Append(' ');
state = CurrentField.SecondNum;
}
else if (!(c == ' ' || c == '\n'))
{
// Ignore whitespace, output anything else
numbers.Append(c);
}
break;
// Copying the second number to the output, next will be the ,0\n that we don't need
case CurrentField.SecondNum:
if (c == ',')
{
numbers.Append(", ");
state = CurrentField.UnwantedZero;
}
else if (!(c == ' ' || c == '\n'))
{
// Ignore whitespace, output anything else
numbers.Append(c);
}
break;
case CurrentField.UnwantedZero:
// Output nothing, just track when the line is finished and we start all over again.
if (c == '\n')
{
state = CurrentField.FirstNum;
}
break;
}
}
return numbers.ToString();
}
This uses a state machine to treat incoming characters differently depending on whether they are part of the first number, second number, or the rest of the line, and output characters accordingly. Each character is only copied once into the output, then I believe once more when the output is converted to a string at the end. This second conversion could probably be avoided by using a char[] for the output.
The bottleneck in this code seems to be the number of calls to StringBuilder.Append(). If more speed were required, I would first attempt to keep track of how many characters were to be copied directly into the output, then use .Append(string value, int startIndex, int count) to send an entire number across in one call.
I put a few example solutions into a test harness, and ran them on a string containing 300,000 coordinate-pair lines, averaged over 50 runs. The results on my PC were:
String Split, Replace each line (see Olivier's answer, though I pre-allocated the space in the StringBuilder):
6542 ms / 13493147 ticks, 130.84ms / 269862.9 ticks per conversion
Replace & Trim entire string (see Heriberto's second version):
3352 ms / 6914604 ticks, 67.04 ms / 138292.1 ticks per conversion
- Note: Original test was done with 900000 coord pairs, but this entire-string version suffered an out of memory exception so I had to rein it in a bit.
Split and Join (see Ćukasz's answer):
8780 ms / 18110672 ticks, 175.6 ms / 362213.4 ticks per conversion
Character state machine (see above):
1685 ms / 3475506 ticks, 33.7 ms / 69510.12 ticks per conversion
So, the question of which version is most efficient comes down to: what are your requirements?
Your solution is fine. Maybe you could write it a bit more elegant like this:
string[] coordinatesVal = coordinateTxt.Trim().Split(new string[] { ",0" },
StringSplitOptions.RemoveEmptyEntries);
string result = string.Empty;
foreach (string line in coordinatesVal)
{
string[] numbers = line.Trim().Split(',');
result += numbers[0] + " " + numbers[1] + ", ";
}
result = result.Remove(result.Count()-2, 2);
Note the StringSplitOptions.RemoveEmptyEntries parameter of Split method so you don't have to deal with empty lines into foreach block.
Or you can do extremely short one-liner. Harder to debug, but in simple cases does the work.
string result =
string.Join(", ",
coordinateTxt.Trim().Split(new string[] { ",0" }, StringSplitOptions.RemoveEmptyEntries).
Select(i => i.Replace(",", " ")));
heres another way without defining your own loops and replace methods, or using LINQ.
string coordinateTxt = #" -82.9494547,36.2913021,0
-83.0784938,36.2347521,0
-82.9537782,36.079235,0";
string[] coordinatesVal = coordinateTxt.Replace(",", "*").Trim().Split(new string[] { "*0", Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
string result = string.Join(",", coordinatesVal).Replace("*", " ");
Console.WriteLine(result);
or even
string coordinateTxt = #" -82.9494540,36.2913021,0
-83.0784938,36.2347521,0
-82.9537782,36.079235,0";
string result = coordinateTxt.Replace(Environment.NewLine, "").Replace($",", " ").Replace(" 0", ", ").Trim(new char[]{ ',',' ' });
Console.WriteLine(result);
In short - I want to convert the first answer to the question here from Python into C#. My current solution to splitting conjoined words is exponential, and I would like a linear solution. I am assuming no spacing and consistent casing in my input text.
Background
I wish to convert conjoined strings such as "wickedweather" into separate words, for example "wicked weather" using C#. I have created a working solution, a recursive function using exponential time, which is simply not efficient enough for my purposes (processing at least over 100 joined words). Here the questions I have read so far, which I believe may be helpful, but I cannot translate their responses from Python to C#.
How can I split multiple joined words?
Need help understanding this Python Viterbi algorithm
How to extract literal words from a consecutive string efficiently?
My Current Recursive Solution
This is for people who only want to split a few words (< 50) in C# and don't really care about efficiency.
My current solution works out all possible combinations of words, finds the most probable output and displays. I am currently defining the most probable output as the one which uses the longest individual words - I would prefer to use a different method. Here is my current solution, using a recursive algorithm.
static public string find_words(string instring)
{
if (words.Contains(instring)) //where words is my dictionary of words
{
return instring;
}
if (solutions.ContainsKey(instring.ToString()))
{
return solutions[instring];
}
string bestSolution = "";
string solution = "";
for (int i = 1; i < instring.Length; i++)
{
string partOne = find_words(instring.Substring(0, i));
string partTwo = find_words(instring.Substring(i, instring.Length - i));
if (partOne == "" || partTwo == "")
{
continue;
}
solution = partOne + " " + partTwo;
//if my current solution is smaller than my best solution so far (smaller solution means I have used the space to separate words fewer times, meaning the words are larger)
if (bestSolution == "" || solution.Length < bestSolution.Length)
{
bestSolution = solution;
}
}
solutions[instring] = bestSolution;
return bestSolution;
}
This algorithm relies on having no spacing or other symbols in the entry text (not really a problem here, I'm not fussed about splitting up punctuation). Random additional letters added within the string can cause an error, unless I store each letter of the alphabet as a "word" within my dictionary. This means that "wickedweatherdykjs" would return "wicked weather d y k j s" using the above algorithm, when I would prefer an output of "wicked weather dykjs".
My updated exponential solution:
static List<string> words = File.ReadLines("E:\\words.txt").ToList();
static Dictionary<char, HashSet<string>> compiledWords = buildDictionary(words);
private void btnAutoSpacing_Click(object sender, EventArgs e)
{
string text = txtText.Text;
text = RemoveSpacingandNewLines(text); //get rid of anything that breaks the algorithm
if (text.Length > 150)
{
//possibly split the text up into more manageable chunks?
//considering using textSplit() for this.
}
else
{
txtText.Text = find_words(text);
}
}
static IEnumerable<string> textSplit(string str, int chunkSize)
{
return Enumerable.Range(0, str.Length / chunkSize)
.Select(i => str.Substring(i * chunkSize, chunkSize));
}
private static Dictionary<char, HashSet<string>> buildDictionary(IEnumerable<string> words)
{
var dictionary = new Dictionary<char, HashSet<string>>();
foreach (var word in words)
{
var key = word[0];
if (!dictionary.ContainsKey(key))
{
dictionary[key] = new HashSet<string>();
}
dictionary[key].Add(word);
}
return dictionary;
}
static public string find_words(string instring)
{
string bestSolution = "";
string solution = "";
if (compiledWords[instring[0]].Contains(instring))
{
return instring;
}
if (solutions.ContainsKey(instring.ToString()))
{
return solutions[instring];
}
for (int i = 1; i < instring.Length; i++)
{
string partOne = find_words(instring.Substring(0, i));
string partTwo = find_words(instring.Substring(i, instring.Length - i));
if (partOne == "" || partTwo == "")
{
continue;
}
solution = partOne + " " + partTwo;
if (bestSolution == "" || solution.Length < bestSolution.Length)
{
bestSolution = solution;
}
}
solutions[instring] = bestSolution;
return bestSolution;
}
How I would like to use the Viterbi Algorithm
I would like to create an algorithm which works out the most probable solution to a conjoined string, where the probability is calculated according to the position of the word in a text file that I provide the algorithm with. Let's say the file starts with the most common word in the English language first, and on the next line the second most common, and so on until the least common word in my dictionary. It looks roughly like this
the
be
and
...
attorney
Here is a link to a small example of such a text file I would like to use.
Here is a much larger text file which I would like to use
The logic behind this file positioning is as follows...
It is reasonable to assume that they follow Zipf's law, that is the
word with rank n in the list of words has probability roughly 1/(n log
N) where N is the number of words in the dictionary.
Generic Human, in his excellent Python solution, explains this much better than I can. I would like to convert his solution to the problem from Python into C#, but after many hours spent attempting this I haven't been able to produce a working solution.
I also remain open to the idea that perhaps relative frequencies with the Viterbi algorithm isn't the best way to split words, any other suggestions for creating a solution using C#?
Written text is highly contextual and you may wish to use a Markov chain to model sentence structure in order to estimate joint probability. Unfortunately, sentence structure breaks the Viterbi assumption -- but there is still hope, the Viterbi algorithm is a case of branch-and-bound optimization aka "pruned dynamic programming" (something I showed in my thesis) and therefore even when the cost-splicing assumption isn't met, you can still develop cost bounds and prune your population of candidate solutions. But let's set Markov chains aside for now... assuming that the probabilities are independent and each follows Zipf's law, what you need to know is that the Viterbi algorithm works on accumulating additive costs.
For independent events, joint probability is the product of the individual probabilities, making negative log-probability a good choice for the cost.
So your single-step cost would be -log(P) or log(1/P) which is log(index * log(N)) which is log(index) + log(log(N)) and the latter term is a constant.
Can't help you with the Viterbi Algorithm but I'll give my two cents concerning your current approach. From your code its not exactly clear what words is. This can be a real bottleneck if you don't choose a good data structure. As a gut feeling I'd initially go with a Dictionary<char, HashSet<string>> where the key is the first letter of each word:
private static Dictionary<char, HashSet<string>> buildDictionary(IEnumerable<string> words)
{
var dictionary = new Dictionary<char, HashSet<string>>();
foreach (var word in words)
{
var key = word[0];
if (!dictionary.ContainsKey(key))
{
dictionary[key] = new HashSet<string>();
}
dictionary[key].Add(word);
}
return dictionary;
}
And I'd also consider serializing it to disk to avoid building it up every time.
Not sure how much improvement you can make like this (dont have full information of you current implementation) but benchmark it and see if you get any improvement.
NOTE: I'm assuming all words are cased consistently.
I would like to preface this by saying it is for an assessment, so I don't want you to directly give me the answer, I would like you to point me in the right direction, slightly tweak what I have done or just tell me what I should look into doing.
I am trying to create a Caesar Cipher to decipher a text document we have been given. It needs to print all the possible shifts in the console and output the final one to a text document. I'm thinking of trying to add frequency analysis to this later to find the correct one, but I have some problems now that I need to sort out before I can do that.
This is what I have made so far:
using System;
using System.IO;
class cipher
{
public static void Main(string[] args)
{
string output = "";
int shift;
bool userright = false;
string cipher = File.ReadAllText("decryptme.txt");
char[] decr = cipher.ToCharArray();
do {
Console.WriteLine("How many times would you like to shift? (Between 0 and 26)");
shift = Convert.ToInt32(Console.ReadLine());
if (shift > 26) {
Console.WriteLine("Over the limit");
userright = false;
}
if (shift < 0) {
Console.WriteLine("Under the limit");
userright = false;
}
if (shift <= 26 && shift >= 0)
{
userright = true;
}
} while (userright == false);
for (int i = 0; i < decr.Length; i++)
{
{
char character = decr[i];
character = (char)(character + shift);
if (character == '\'' || character == ' ')
continue;
if (character > 'Z')
character = (char)(character - 26);
else if (character < 'A')
character = (char)(character + 26);
output = output + character;
}
Console.WriteLine("\nShift {0} \n {1}", i + 1, output);
}
StreamWriter file = new StreamWriter("decryptedtext.txt");
file.WriteLine(output);
file.Close();
}
}
Right now it compiles and read the document but when it runs in the console it prints shift one as 1 letter from the encoded text, shift 2 as 2 letters from it, etc.
I have no idea what I have done wrong and any help would be greatly appreciated. I have also started thinking about ASCII values for letters but have no idea how to implement this.
Once again, please don't just give me the answer or I will not have learned anything from this - and I have been trying to crack this myself but had no luck.
Thanks.
Break the problem down into smaller bite-sized chunks. Start by printing a single shifted line, say with a shift of 1.
When you have that part working correctly (and only then) extend your code to print 26 lines with shifts of 0, 1, 2, 3, ... 26. I am not sure if your instructor wants either or both of shift 0 at the start and shift 26 at the end. You will need to ask.
Again, get that working correctly, and write new code to analyse one line only, and give it some sort of score. Get that working properly.
Now calculate the scores for all the lines and pick out the line with the best score. That should be the right answer. If it isn't then you will need to check your scoring method.
Writing small incremental changes to a very simple starting program is usually a lot easier than trying to go straight from a blank screen to the full, complex, program. Add the complexity gradually, one piece at a time, testing as you go.
I've searched online for a diff algorithm but none of them do what I am looking for. It is for a texting contest (as in cell phone) and I need the entry text compared to the master text recording the errors along the way. I am semi-new to C# and I get most of the string functions and didn't think this was going to be that hard of a problem, but alas I just can't wrap my head around it.
I have a form with 2 rich-text-boxes (one on top of the other) and 2 buttons. The top box is the master text (string) and the bottom box is the entry text (string). Every contestant is sending a text to an email account, from the email we copy and paste the text into the Entry RTB and compare to the Master RTB. For each single word and single space counts as a thing to check. A word, no matter how many errors it has, is still 1 error. And for every error add 1 sec. to their time.
Examples:
Hello there! <= 3 checks (2 words and 1 space)
Helothere! <= 2 errors (Helo and space)
Hello there!! <= 1 error (extra ! at end of there!)
Hello there! How are you? <= 9 checks (5 words and 4 spaces)
Helothere!! How a re you? <= still 9 checks, 4 errors(helo, no space, extra !, and a space in are)
Hello there!# Ho are yu?? <= 3 errors (# at end of there!, no w, no o and extra ? (all errors are still under the 1 word)
What I have so far:
I've created 6 arrays (3 for master, 3 for entry) and they are
CharArray of all chars
StringArray of all strings(words) including the spaces
IntArray with length of the string in each StringArray
My biggest trouble is if the entry text is wrong and it's shorter or longer than the master. I keep getting IndexOutOfRange exceptions (understandably) but can't fathom how to go about checking and writing the code to compensate.
I hope I have made myself clear enough as to what I need help with. If anyone could give some code examples or something to shoot me in the right path would be very helpful.
Have you looked into the Levenshtein distance algorithm? It returns the number of differences between two strings, which, in your case would be texting errors. Implementing the algorithm based off the pseudo-code found on the wikipedia page passes the first 3 of your 4 use cases:
Assert.AreEqual(2, LevenshteinDistance("Hello there!", "Helothere!");
Assert.AreEqual(1, LevenshteinDistance("Hello there!", "Hello there!!"));
Assert.AreEqual(4, LevenshteinDistance("Hello there! How are you?", "Helothere!! How a re you?"));
Assert.AreEqual(3, LevenshteinDistance("Hello there! How are you?", "Hello there!# Ho are yu??")); //fails, returns 4 errors
So while not perfect out of the box, it is probably a good starting point for you. Also, if you have too much trouble implementing your scoring rules, it might be worth revisiting them.
hth
Update:
Here is the result of the string you requested in the comments:
Assert.AreEqual(7, LevenshteinDistance("Hello there! How are you?", "Hlothere!! Hw a reYou?"); //fails, returns 8 errors
And here is my implementation of the Levenshtein Distance algorithm:
int LevenshteinDistance(string left, string right)
{
if (left == null || right == null)
{
return -1;
}
if (left.Length == 0)
{
return right.Length;
}
if (right.Length == 0)
{
return left.Length;
}
int[,] distance = new int[left.Length + 1, right.Length + 1];
for (int i = 0; i <= left.Length; i++)
{
distance[i, 0] = i;
}
for (int j = 0; j <= right.Length; j++)
{
distance[0, j] = j;
}
for (int i = 1; i <= left.Length; i++)
{
for (int j = 1; j <= right.Length; j++)
{
if (right[j - 1] == left[i - 1])
{
distance[i, j] = distance[i - 1, j - 1];
}
else
{
distance[i, j] = Min(distance[i - 1, j] + 1, //deletion
distance[i, j - 1] + 1, //insertion
distance[i - 1, j - 1] + 1); //substitution
}
}
}
return distance[left.Length, right.Length];
}
int Min(int val1, int val2, int val3)
{
return Math.Min(val1, Math.Min(val2, val3));
}
You need to come up with a scoring systems that works for you're situation.
I would make a word array after each space.
If a word is found on the same index +5.
If a word is found on the same index +-1 index location +3 (keep a counter how much words differ to increase the +- correction
If a needed word is found as part of another word +2
etc.etc. Matching words is hard, getting up with a rules engine that works is 'easier'
I once implemented an algorithm (which I can't find at the moment, I'll post code when I find it) which looked at the total number of PAIRS in the target string. i.e. "Hello, World!" would have 11 pairs, { "He", "el", "ll",...,"ld", "d!" }.
You then do the same thing on an input string such as "Helo World" so you have { "He",...,"ld" }.
You can then calculate accuracy as a function of correct pairs (i.e. input pairs that are in the list of target pairs), incorrect pairs (i.e. input pairs that do not exists in the list of target pairs), compared to the total list of target pairs. Over long enough sentences, this measure would be very accurate fair.
A simple algorithm would be to check letter by letter. If the letters differ increment the num of errors. If the next pairing of letters match, its a switched letter so just continue. If the messup matches the next letter, it is an omission and treat it accordingly. If the next letter matches the messed up one, its an insertion and treat it accordingly. Else the person really messed up and continue.
This doesn't get everything but with a few modifications this could become comprehensive.
a weak attempt at pseudocode:
edit: new idea. look at comments. I don't know the string functions off the top of my head so you'll have to figure that part out. The algorithm kinda fails for words that repeat a lot though...
string entry; //we'll pretend that this has stuff inside
string master; // this too...
string tempentry = entry; //stuff will be deleted so I need a copy to mess up
int e =0; //word index for entry
int m = 0; //word index for master
int errors = 0;
while(there are words in tempentry) //!tempentry.empty() ?
string mword = the next word in master;
m++;
int eplace = find mword in tempentry; //eplace is the index of where the mword starts in tempentry
if(eplace == -1) //word not there...
continue;
else
errors += m - e;
errors += find number of spaces before eplace
e = m // there is an error
tempentry = stripoff everything between the beginning and the next word// substring?
all words and spaces left in master are considered errors.
There are a couple of bounds checking errors that need to be fixed here but its a good start.
i'm quite a beginner in C# , i tried to write a program that extract words from an entered string, the user has to enter a minimum length for the word to filter the words output ... my code doesn't look good or intuitive, i used two arrays countStr to store words , countArr to store word length corresponding to each word .. but the problem is i need to use hashtables instead of those two arrays , because both of their sizes are depending on the string length that the user enter , i think that's not too safe for the memory or something ?
here's my humble code , again i'm trying to replace those two arrays with one hashtable , how can this be done ?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections;
namespace ConsoleApplication2
{
class Program
{
static void Main(string[] args)
{
int i = 0 ;
int j = 0;
string myString = "";
int counter = 0;
int detCounter = 0;
myString = Console.ReadLine();
string[] countStr = new string[myString.Length];
int[] countArr = new int[myString.Length];
Console.Write("Enter minimum word length:");
detCounter = int.Parse(Console.ReadLine());
for (i = 0; i < myString.Length; i++)
{
if (myString[i] != ' ')
{
counter++;
countStr[j] += myString[i];
}
else
{
countArr[j] = counter;
counter = 0;
j++;
}
}
if (i == myString.Length)
{
countArr[j] = counter;
}
for (i = 0; i < myString.Length ; i++)
{
if (detCounter <= countArr[i])
{
Console.WriteLine(countStr[i]);
}
}
Console.ReadLine();
}
}
}
You're not doing too badly for a first attempt but this could be a lot better.
First thing: use TryParse rather than Parse when parsing an integer input by a human. If the human types in "HELLO" instead of an integer, your program will crash if you use Parse; only use Parse when you know that it is an integer.
Next thing: consider using String.Split to split the string into an array of words, and then process the array of words.
Next thing: code like yours that has a whole lot of array mutations is hard to read and understand. Consider characterizing your problem as a query. What are you trying to ask? I'm not sure I completely understand your code but it sounds to me like you are trying to say "take this string of words separated by spaces. Take a minimum length. Give me all the words in that string which are more than the minimum length." Yes?
In that case, write the code that looks like that:
string sentence = whatever;
int minimum = whatever;
var words = sentence.Split(' ');
var longWords = from word in words
where word.Length >= minimum
select word;
foreach(var longWord in longWords)
Console.WriteLine(longWord);
And there you go. Notice how the code reads like what it is doing. Try to write code so that the code conveys the meaning of the code, not the mechanism of the code.
One word. Dictioary (or HashTable). Both are standard datatypes you can use
Use a Dictionary for this (in your case, you are looking for a Dictionary).
Your extracted string will be the key, the length of it the Value.
Dictionary<string, int> words = new Dictionary<string,int>();
//algorithm
words.Add(word, length);