Calculating number of words from input text - c#

I split the input paragraph by . and store it in an array like this:
string[] totalSentences = inputPara.Split('.');
Then the function below is called, which calculates the total number of words across all sentences:
public void initParaMatrix()
{
    int size = 0;
    for (int i = 0; i < totalSentences.Length; i++)
    {
        string[] words = totalSentences[i].Split();
        size = size + words.Length;
        //rest of the logic here...
    }
    matrixSize = size;
    paraMatrix = new string[matrixSize, matrixSize];
}
paraMatrix is a 2D matrix whose dimensions equal the total number of words, which I need for my logic.
The problem is that when I input only one sentence with 5 words, the size variable gets the value 7. I checked in the debugger and I was getting a total of 2 sentences instead of 1.
Sentence 1. "Our present ideas about motion." > this is the actual sentence, which has only 5 words
Sentence 2. " " > this is the exact second sentence I'm getting.
Why am I getting two sentences here, and how is size getting the value 7?

This makes perfect sense. If the second sentence contains nothing but a " ", and you split on the " ", then you'll get two empty strings as a result. The easiest thing to do here is to change the line where you do the split, and add a trim:
string[] words = totalSentences[i].Trim().Split();
Calling Split() with no arguments uses the params char[] overload of String.Split, which splits on white space. If you use an overload that takes a second parameter, you can have empty entries removed automatically by passing the option StringSplitOptions.RemoveEmptyEntries.
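A minimal sketch of the RemoveEmptyEntries approach, assuming the goal is simply to count the words across all sentences. The names WordCounter and CountWords are made up for illustration. Note that Trim() on its own still leaves one empty entry for the blank trailing sentence, because "".Split() returns an array holding a single empty string, so removing empty entries is the more robust fix.

```csharp
using System;

static class WordCounter
{
    // Hypothetical helper: counts whitespace-separated words across all
    // sentences, where sentences are delimited by '.'.
    public static int CountWords(string inputPara)
    {
        int size = 0;
        // Drop completely empty pieces, e.g. the "" after a final '.'.
        string[] sentences = inputPara.Split(new[] { '.' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string sentence in sentences)
        {
            // A null separator array means "split on any white space";
            // a sentence that is only " " therefore yields zero words.
            string[] words = sentence.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
            size += words.Length;
        }
        return size;
    }
}
```

With the input from the question, "Our present ideas about motion. ", the blank second piece contributes no words, so only the five real words are counted.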

You're not resetting the size integer to zero. So that's why you get 7 for the second sentence.
For the second sentence, which is a space, try inputPara.Trim() which should remove the space at the end of the string.

Related

C# only use arrays if it exists

I am a beginner in programming. And now I'm facing a task where I can't get any further. Probably it is relatively easy to solve.
This is what I want to do: I read out a .txt file and there are several lines of content.
Example what is in the .txt file:
text1,text2,text3
text1,text2,
text1,text2,text3,text4
I am already able to find the right line and use it. Next I want to split the line and assign each text to its own string.
I can do this if I know that the line has 4 words. But what if I don't know how many words the line has?
For example, if I want to assign 5 strings but there are only 4 elements in the array, I get an error.
My program currently looks like this:
string reader = "text1,text2,text3,text4";
string[] words = reader.Split(',');
string word1 = words[0].ToString();
string word2 = words[1].ToString();
string word3 = words[2].ToString();
string word4 = words[3].ToString();
textBox1.Text = word3;
My goal is to find out how many words are in the string. And then pass each word to a separate string.
Thank you in advance
To get the length of the array, you can simply use .Length.
In your example, you just write
int arraylength = words.Length;
I don't understand why you want to create a new string for every value of the string array. You can just use the values directly from the array.
In your example you always use .ToString(); this isn't necessary because the elements are already strings.
An array is just multiple variables (in your example, strings) which are grouped together under one name.
I doubt that you want separate local variables like word1, word2 etc. To see why, let's take the idea to the point of absurdity. Imagine that we have a short narrative of just 1234 words. Do we really want to create local variables word1, word2, ..., word1234?
So, let's stick to a single words array only:
string[] words = reader.Split(',');
Now you can easily get the array Length (i.e. the number of items):
textBoxCount.Text = $"We have {words.Length} words in total";
Or get the N-th word (let N be one-based) from the words array:
string wordN = words.Length >= N ? words[N - 1] : "SomeDefaultValue";
In your case (the 3rd word) it can be
// either the 3rd word or an empty string (when we have two or fewer words)
textBox1.Text = words.Length >= 3 ? words[3 - 1] : "";
Technically, you can use Linq and query the reader string:
using System.Linq;
...
// 3rd word or empty string
textBox1.Text = reader.Split(',').ElementAtOrDefault(3 - 1) ?? "";
But Linq seems to be overkill here.
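The length check can be wrapped in a small helper so that asking for a word that does not exist never throws. This is only a sketch: WordPicker, NthWord and the fallback parameter are hypothetical names, not part of any library.

```csharp
using System;

static class WordPicker
{
    // Hypothetical helper: returns the n-th (one-based) comma-separated item
    // of the line, or the fallback value when the line has fewer than n items.
    public static string NthWord(string line, int n, string fallback = "")
    {
        string[] words = line.Split(',');
        return n >= 1 && words.Length >= n ? words[n - 1] : fallback;
    }
}
```

For example, textBox1.Text = WordPicker.NthWord(reader, 3); would show the third word when it exists and an empty string otherwise.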

grouping adjacent similar substrings

I am writing a program in which I want to group the adjacent substrings, e.g ABCABCBC can be compressed as 2ABC1BC or 1ABCA2BC.
Among all the possible options I want to find the resultant string with the minimum length.
Here is the code I have written so far, but it does not do the job. Kindly help me with this.
using System;
using System.Collections.Generic;
using System.Linq;
namespace EightPrgram
{
    class Program
    {
        static void Main(string[] args)
        {
            string input;
            Console.WriteLine("Please enter the set of operations: ");
            input = Console.ReadLine();
            char[] array = input.ToCharArray();
            List<string> list = new List<string>();
            string temp = "";
            string firstTemp = "";
            foreach (var x in array)
            {
                if (temp.Contains(x))
                {
                    firstTemp = temp;
                    if (list.Contains(firstTemp))
                    {
                        list.Add(firstTemp);
                    }
                    temp = "";
                    list.Add(firstTemp);
                }
                else
                {
                    temp += x;
                }
            }
            /*foreach (var item in list)
            {
                Console.WriteLine(item);
            }*/
            Console.ReadLine();
        }
    }
}
You can do this with recursion. I cannot give you a C# solution, since I do not have a C# compiler here, but the general idea together with a python solution should do the trick, too.
So you have an input string ABCABCBC, and you want to transform it into an advanced variant of run-length encoding (let's call it advanced RLE).
My idea consists of a general first idea onto which I then apply recursion:
The overall target is to find the shortest representation of the string using advanced RLE, let's create a function shortest_repr(string).
You can divide the string into a prefix and a suffix and then check if the prefix can be found at the beginning of the suffix. For your input example this would be:
(A, BCABCBC)
(AB, CABCBC)
(ABC, ABCBC)
(ABCA, BCBC)
...
This input can be put into a function shorten_prefix, which checks how often the suffix starts with the prefix. For example, for the prefix ABC and the suffix ABCBC, the prefix occurs only once at the beginning of the suffix, making a total of 2 ABC following each other. So we can compact this prefix/suffix combination to the output (2ABC, BC).
This function shorten_prefix will be used on each of the above tuples in a loop.
After using the function shorten_prefix once, there is still a suffix left for most of the string combinations. E.g. in the output (2ABC, BC), the string BC remains as the suffix. So we need to find the shortest representation for this remaining suffix. Conveniently, we already have a function for this called shortest_repr, so let's just call it on the remaining suffix.
This image displays how this recursion works (I only expanded one of the node after the 3rd level, but in fact all of the orange circles would go through recursion):
We start at the top with a call of shortest_repr on the string ABABB (I selected a shorter sample for the image). Then we split this string at all possible split positions and get a list of prefix/suffix pairs in the second row. On each element of this list we first call the prefix/suffix optimization (shorten_prefix) and retrieve a shortened prefix/suffix combination, which already has the run-length numbers in the prefix (third row). Now, on each of the suffixes, we call our recursive function shortest_repr.
I did not display the upward-direction of the recursion. When a suffix is the empty string, we pass an empty string into shortest_repr. Of course, the shortest representation of the empty string is the empty string, so we can return the empty string immediately.
When the result of the call to shortest_repr was received inside our loop, we just select the shortest string inside the loop and return this.
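This is some quickly hacked code that does the trick (in the code, shorten_prefix is named shorten_beginning and shortest_repr is named find_shortest_repr):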
def shorten_beginning(beginning, ending):
    count = 1
    while ending.startswith(beginning):
        count += 1
        ending = ending[len(beginning):]
    return str(count) + beginning, ending
def find_shortest_repr(string):
    possible_variants = []
    if not string:
        return ''
    for i in range(1, len(string) + 1):
        beginning = string[:i]
        ending = string[i:]
        shortened, new_ending = shorten_beginning(beginning, ending)
        shortest_ending = find_shortest_repr(new_ending)
        possible_variants.append(shortened + shortest_ending)
    return min([(len(x), x) for x in possible_variants])[1]
print(find_shortest_repr('ABCABCBC'))
print(find_shortest_repr('ABCABCABCABCBC'))
print(find_shortest_repr('ABCABCBCBCBCBCBC'))
Open issues
I think this approach has the same problem as the recursive Levenshtein distance calculation: it calculates the same suffixes multiple times. So it would be a nice exercise to try to implement this with dynamic programming.
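The dynamic-programming idea can be sketched in C# by caching the result for every suffix that has already been solved (memoization). This is only an illustrative port of the Python code above, not a tuned implementation; the class name AdvancedRle, the method ShortestRepr and the Cache dictionary are names invented for this sketch.

```csharp
using System;
using System.Collections.Generic;

static class AdvancedRle
{
    // Memoization table: maps an already-solved string to its shortest encoding.
    private static readonly Dictionary<string, string> Cache = new Dictionary<string, string>();

    public static string ShortestRepr(string s)
    {
        if (s.Length == 0) return "";
        if (Cache.TryGetValue(s, out string cached)) return cached;

        string best = null;
        for (int i = 1; i <= s.Length; i++)
        {
            string prefix = s.Substring(0, i);
            string rest = s.Substring(i);

            // Count how many times the prefix repeats back to back.
            int count = 1;
            while (rest.StartsWith(prefix, StringComparison.Ordinal))
            {
                count++;
                rest = rest.Substring(prefix.Length);
            }

            // Encode this run, then solve the remaining suffix recursively.
            string candidate = count + prefix + ShortestRepr(rest);
            if (best == null || candidate.Length < best.Length) best = candidate;
        }

        Cache[s] = best;
        return best;
    }
}
```

Unlike the Python version, which breaks ties between equally short results alphabetically, this sketch keeps the first shortest candidate it finds. For the example input, AdvancedRle.ShortestRepr("ABCABCBC") gives "2ABC1BC".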
If this is not a school assignment or a performance-critical part of the code, a regular expression might be enough:
using System.Text.RegularExpressions;
...
string input = "ABCABCBC";
var re = new Regex(@"(.+)\1+|(.+)", RegexOptions.Compiled); // RegexOptions.Compiled is optional; it helps if you use the regex more than once
string output = re.Replace(input,
    m => (m.Length / m.Result("$1$2").Length) + m.Result("$1$2")); // "2ABC1BC" (case sensitive by default)

Clip string inside of Array

Let's say I have a string array full of items such as:
string[] letters = new string[4] {"A1","B1","C1","D1"};
Later, I want to set the contents of a textbox to the first value in the array:
Letter.Content = letters[0];
Is there a way to 'clip' the number out of the String in the Array? For example, in my above code, currently the Letter textbox would be set to 'A1'. What I want however is to set it to just 'A'.
It depends on whether the strings' length is always two and the digit is at the second position. If so, it's simple:
Letter.Content = letters[0][0];
If you don't know the length but you want to take all letters from the left until there is a non-letter, you could use string.Concat + LINQ (this needs using System.Linq;):
Letter.Content = string.Concat(letters[0].TakeWhile(Char.IsLetter));
Or you could do it the old-fashioned way using the Substring method:
Letter.Content = letters[0].Substring(0,1);

split string to string array without losing text order

I have a problem that I have been racking my brain over for 7 days, so I decided to ask you for help. Here is my problem:
I read data from a DataGridView (only 2 cells per row) and collect it all in a StringBuilder; it is an article and its price, like on an invoice (bill). I then convert the StringBuilder's contents to a single string, intending to split it line by line. That part of my code works, but not the way I want. The articles are listed one below another, but the prices are not aligned in one vertical column: some are further left and some further right, something like this:
Bread 10$
Egg 4$
Milk 5$
My code:
string[] lines;
StringBuilder sbd = new StringBuilder();
foreach (DataGridViewRow rowe in dataGridView2.Rows)
{
    sbd.Append(rowe.Cells[0].Value).Append(rowe.Cells[10].Value);
    sbd.Append("\n");
}
sbd.Remove(sbd.Length - 1, 1);
string userOutput = sbd.ToString();
lines = userOutput.Split(new string[] { "\r", "\n" },
    StringSplitOptions.RemoveEmptyEntries);
You can use the Trim method in order to remove existing leading and trailing spaces. With PadRight you can automatically add the right number of spaces in order to get a specified total length.
Also, use a List<string> that grows automatically instead of an array obtained by splitting what you have just put together:
List<string> lines = new List<string>();
foreach (DataGridViewRow row in dataGridView2.Rows) {
    lines.Add( row.Cells[0].Value.ToString().Trim().PadRight(25) +
               row.Cells[10].Value.ToString().Trim());
}
But keep in mind that this way of formatting works only if you display the string in a monospaced font (like Courier New or Consolas). Proportional fonts like Arial will yield jagged columns.
Alternatively, you can create an array with the right size by reading the number of lines from the Count property:
string[] lines = new string[dataGridView2.Rows.Count];
for (int i = 0; i < lines.Length; i++) {
    DataGridViewRow row = dataGridView2.Rows[i];
    lines[i] = row.Cells[0].Value.ToString().Trim().PadRight(25) +
               row.Cells[10].Value.ToString().Trim();
}
You can also use the PadLeft method in order to right-align the amounts:
row.Cells[10].Value.ToString().Trim().PadLeft(10)
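The padding idea can be tried out without a DataGridView. The following is a small sketch on plain strings; InvoiceFormatter and FormatLine are hypothetical names, and the column widths 25 and 10 are simply the values used in the answer above.

```csharp
using System;

static class InvoiceFormatter
{
    // Hypothetical helper: left-aligns the article in a 25-character column
    // and right-aligns the price in a 10-character column.
    public static string FormatLine(string article, string price)
    {
        return article.Trim().PadRight(25) + price.Trim().PadLeft(10);
    }
}
```

PadRight and PadLeft only add spaces and never truncate, so an article name longer than 25 characters would still push its price to the right. As with the code above, the columns only line up in a monospaced font.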
Have you tried the String.Split method?
string myString = "Bread ;10$;";
string articleName = myString.Split(';')[0];
string price = myString.Split(';')[1];

getting current line text is on

I have a list of lines that looks like this:
textbox.text += "p"+b+" the rest\r\np"+b+" more text";
b is supposed to represent the number of the line in the textbox that the text is on. I have tried using textbox.lines.count(), but it only gives the last line number.
Is there another way of going about this, or do I have to switch to a different method?
If you are assigning the text yourself, I think you can do it manually (calculate the line number). There is no function that could "guess" on which line the text will appear.
You can create an integer variable, increment it whenever you append a line (or lines), and use the variable when you need to display the current line number.
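The counter idea could look roughly like this. It is only a sketch: the class NumberedText and its members are invented for illustration, and it builds the text in a StringBuilder rather than in the textbox itself.

```csharp
using System.Text;

sealed class NumberedText
{
    private readonly StringBuilder _text = new StringBuilder();
    private int _lineNumber = 0; // number of lines appended so far

    // Appends one line, prefixed with "p" and its one-based line number.
    public void AppendLine(string rest)
    {
        _lineNumber++;
        if (_text.Length > 0) _text.Append("\r\n");
        _text.Append("p").Append(_lineNumber).Append(' ').Append(rest);
    }

    public override string ToString() => _text.ToString();
}
```

After calling AppendLine("the rest") and AppendLine("more text"), the result of ToString() can be assigned to the textbox's text in one go.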
I split the lines by the line breaks ("\r\n") and used a for loop to replace "b" (I changed it to string rather than a variable)
for (int i = 0; i < da.Length; i++)
{
    //replace char with number
    string f = da[i].Replace("n", (i + 1).ToString());
    disp.Text += f + "v";
}
I added "v" so that I can replace it outside of the loop with "\r\n" again.
