In the following code I split text into words, insert each word into a table, and count the number of letters in each word.
The problem is that the counter also counts the whitespace at the beginning of each line, which gives me the wrong value for some of the words.
How can I count only the letters of each word?
var str = reader1.ReadToEnd();
char[] separators = new char[] {' ', ',', '/', '?'}; //Clean punctuation from copying
var words = str.Split(separators, StringSplitOptions.RemoveEmptyEntries).ToArray(); //Insert all the song words into "words" string
string constring1 = "datasource=localhost;port=3306;username=root;password=123";
using (var conDataBase1 = new MySqlConnection(constring1))
{
conDataBase1.Open();
for (int i = 0; i < words.Length; i++)
{
int numberOfLetters = words[i].ToCharArray().Length; //Calculate the numbers of letters in each word
var songtext = "insert into myproject.words (word_text,word_length) values('" + words[i] + "','" + numberOfLetters + "');"; //Insert words list and length into words table
MySqlCommand cmdDataBase1 = new MySqlCommand(songtext, conDataBase1);
try
{
cmdDataBase1.ExecuteNonQuery();
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
}
}
This will be a simple and fast way of doing so:
int numberOfLetters = words[i].Count(word => !Char.IsWhiteSpace(word));
Another simple solution is to call Trim() first and then do your normal calculation, since you say the problem only occurs at the beginning of every line.
var words = str.Trim().Split(separators, StringSplitOptions.RemoveEmptyEntries);
Then all you will need is (without the redundant conversion):
int numberOfLetters = words[i].Length;
See String.Trim()
int numberOfLetters = words[i].Trim().ToCharArray().Length; //Calculate the numbers of letters in each word
Instead of ' ', use '\s+': it matches one or more whitespace characters at once, so it splits on any run of whitespace.
Regex.Split(myString, @"\s+");
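The suggestions above can be combined. The following is only a sketch under stated assumptions: the class and method names (LetterCounter, SplitWords, LetterCount) are hypothetical, and tab and line-break characters are added to the question's separator list so that whitespace at the start of a line never ends up inside a word.

```csharp
using System;
using System.Linq;

static class LetterCounter
{
    // Separators from the question, plus tab and line breaks (an assumption:
    // the stray "spaces" at the start of a line are really line-break characters).
    static readonly char[] Separators = { ' ', ',', '/', '?', '\t', '\r', '\n' };

    // Splits the text into words and drops the empty entries
    // produced by consecutive separators.
    public static string[] SplitWords(string text)
    {
        return text.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
    }

    // Counts only the characters that are letters, ignoring
    // whitespace, digits and punctuation.
    public static int LetterCount(string word)
    {
        return word.Count(char.IsLetter);
    }
}
```

In the loop from the question, `LetterCounter.LetterCount(words[i])` would replace `words[i].ToCharArray().Length`. If digits should count too, `char.IsLetterOrDigit` is the likely substitute.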
const string Duom = "Text.txt";
char[] seperators = { ' ', '.', ',', '!', '?', ':', ';', '(', ')', '\t' };
string[] lines = File.ReadAllLines(Duom, Encoding.GetEncoding(1257));
for (int i = 0; i < lines.Length; i++)
{
string GLine = " " + lines[i];
GLine = Regex.Replace(GLine, @"\s+", " ");
GLine = GLine.PadRight(5, ' ');
Console.WriteLine(GLine);
}
This reads a text file and, for each line, adds a space at the start and collapses every run of whitespace into a single space. I also want to move each line to the right, but PadRight doesn't do anything.
Result :
Expected Result:
PadLeft and PadRight don't add characters to the start/end of your string if the specified length has already been reached.
From the docs for String.PadRight (emphasis mine):
Returns a new string that left-aligns the characters in this string by padding them on the right with a specified Unicode character, for a specified total length.
All of your strings are larger than 5, the specified total length, so PadRight/PadLeft won't do anything.
"Padding" the string is adding spaces (or some other character) so that the new string is at least as large as the number you want.
Instead, just manually add 5 spaces before your string.
GLine = "     " + GLine;
Or more programmatically:
GLine = new string(' ', 5) + GLine;
You could replace the body of your loop like this:
string GLine = new string(' ', 1 + i * 5) + Regex.Replace(lines[i], @"\s+", " ");
Console.WriteLine(GLine);
This will add 1 space and then 5 more spaces for each line.
for (int i = 0; i < lines.Count(); i++)
{
string GLine = new string(' ',5*i) + lines[i];
Console.WriteLine(GLine);
}
This should add 5 extra spaces for each line you have, which I believe is what you are trying to accomplish, if I understand correctly.
You need to left pad a tab depending on how many lines of text you have. The best increment to use is the i variable.
string GLine = " " + lines[i];
change this to
string GLine = new String('\t', i) + lines[i];
By the way, PadLeft should work but keep in mind you need to execute it i times
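To make the PadLeft behaviour concrete, here is a small sketch. The Indenter class and its method names are hypothetical and the width of 5 is taken from the question. The key point is that the argument to PadLeft is the total length of the result, so the current length has to be added to the desired indent.

```csharp
using System;

static class Indenter
{
    // Prefixes the line with level * width spaces.
    public static string Indent(string line, int level, int width = 5)
    {
        return new string(' ', level * width) + line;
    }

    // Same result via PadLeft: the argument is the TOTAL length of the
    // returned string, not the number of characters to add.
    public static string IndentWithPadLeft(string line, int level, int width = 5)
    {
        return line.PadLeft(line.Length + level * width);
    }
}
```

Calling `"abcdefgh".PadLeft(5)` returns the string unchanged, because it is already longer than 5 characters. That is why the original loop appeared to do nothing.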
I want to count the characters in a large text. I do it with this code:
string s = textBox.Text;
int chars = 0;
int words = 0;
foreach(var v in s.ToCharArray())
chars++;
foreach(var v in s.Split(' '))
words++;
This code works, but it seems pretty slow with large text. How can I improve it?
You don't need another char-array, you can use String.Length directly:
int chars = s.Length;
int words = s.Split().Length;
Side-note: if you call String.Split without an argument all white-space characters are used as delimiter. Those include spaces, tab-characters and new-line characters. This is not a complete list of possible word delimiters but it's better than " ".
You are also counting consecutive spaces as different "words". Use StringSplitOptions.RemoveEmptyEntries:
string[] wordSeparators = { "\r\n", "\n", ",", ".", "!", "?", ";", ":", " ", "-", "/", "\\", "[", "]", "(", ")", "<", ">", "@", "\"", "'" }; // this list is probably too extensive, tim.schmelter@myemail.com would count as 4 words, but it should give you an idea
string[] words = s.Split(wordSeparators, StringSplitOptions.RemoveEmptyEntries);
int wordCount = words.Length;
You can do this in a single pass through without making a copy of your string:
int chars = 0;
int words = 0;
//keep track of spaces so as to only count nonspace-space-nonspace transitions
//it is initialized to true to count the first word only when we come to it
bool lastCharWasSpace = true;
foreach (var c in s)
{
chars++;
if (c == ' ')
{
lastCharWasSpace = true;
}
else if (lastCharWasSpace)
{
words++;
lastCharWasSpace = false;
}
}
Note the reason I do not use string.Split here is that it does a bunch of string copies under the hood to return the resulting array. Since you're not using the contents but instead are only interested in the count, this is a waste of time and memory - especially if you have a big enough text that has to be shuffled off to main memory, or worse yet swap space.
Do be aware that string.Split does on the other hand by default use a longer list of delimiters than just ' ', so you may want to add other conditions to the if statement.
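One way to add those "other conditions" is to test with char.IsWhiteSpace instead of comparing against ' '. This is a sketch only; the TextStats class name is hypothetical, and punctuation is still treated as part of a word.

```csharp
using System;

static class TextStats
{
    // Single pass, no allocations: a word starts wherever a
    // non-whitespace character follows whitespace (or the start of the text).
    public static int CountWords(string s)
    {
        int words = 0;
        bool lastWasSpace = true; // true so the first word is counted when we reach it
        foreach (char c in s)
        {
            bool isSpace = char.IsWhiteSpace(c); // covers space, tab, CR, LF, ...
            if (!isSpace && lastWasSpace)
            {
                words++;
            }
            lastWasSpace = isSpace;
        }
        return words;
    }
}
```

The character count needs no loop at all, since `s.Length` already holds it.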
You can simply use
int numberOfLetters = textBox.Text.Length;
or use LINQ
int numberOfLetters = textBox.Text.ToCharArray().Count();
or
int numberOfLetters = 0;
foreach (char letter in textBox.Text)
{
numberOfLetters++;
}
var chars = textBox.Text.Length;
var words = textBox.Text.Count(c => c == ' ') + 1;
I want to give my program a text and have it count the words correctly.
I tried to use an array to store the words:
string[] words = richTextBox1.Text.Split(' ');
But this code has a problem: it also counts the spaces in the text,
so I tried the following code:
string[] checkwords= richTextBox1.Text.Split(' ');
for (int i = 0; i < checkwords.Length; i++)
{
if (richTextBox1.Text.EndsWith(" ") )
{
return;
}
else
{
string[] words = richTextBox1.Text.Split(' ');
toolStripStatusLabel1.Text = "Words" + " = " + words.Length.ToString();
}
}
But now it doesn't work correctly.
I'd recommend using Regex here, with the 'word boundary' anchor (\b).
Otherwise your code may not correctly take into account things like tabs and new lines; \b will take care of that for you.
var words = Regex
.Split("hello world", @"\b")
.Where(s => !string.IsNullOrWhiteSpace(s));
var wordCount = words.Count();
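One caveat with splitting on \b: a run of punctuation between two words, such as the ", " in "hello, world", is itself a non-blank piece and so appears to be counted as a word. An alternative sketch is to match the words instead of splitting around them. The RegexWordCount name is hypothetical, and treating an apostrophe inside a word (as in "that's") as part of the word is an assumption.

```csharp
using System.Text.RegularExpressions;

static class RegexWordCount
{
    // Counts runs of word characters (letters, digits, underscore).
    // Assumption: an apostrophe between word characters belongs to the word.
    public static int CountWords(string text)
    {
        return Regex.Matches(text, @"\w+(?:'\w+)*").Count;
    }
}
```

Because only word characters are matched, spaces, tabs, new lines and punctuation never contribute to the count.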
You can use the overload of String.Split with StringSplitOptions.RemoveEmptyEntries to ignore multiple consecutive spaces.
string text = "a b  c   d"; // 4 "words"
int words = text.Split(new char[]{}, StringSplitOptions.RemoveEmptyEntries).Length;
I'm using an empty char[] (you can also use new string[]{}) because that takes all white-space characters into account, so not only ' ' but also tabs or new-line characters.
I do not know why you wish to return if the textbox ends in " ". Maybe it should be a next or continue instead.
If multiple spaces are a possibility:
Regex regex = new Regex(@"[ ]{2,}");
string myText = regex.Replace(richTextBox1.Text, " ");
string[] words = myText.Split(' ');
toolStripStatusLabel1.Text = "Words" + " = " + words.Length.ToString();
Just for fun
private string[] GetCount(string bodyText)
{
// Collapse each pair of spaces into one and recurse until no double spaces remain
bodyText = bodyText.Replace("  ", " ");
if (bodyText.Contains("  "))
return GetCount(bodyText);
return bodyText.Split(' ');
}
string[] words = GetCount(richTextBox1.Text);
toolStripStatusLabel1.Text = "Words" + " = " + words.Length.ToString();
I use Visual Studio 2010.
I have an array of strings: { "eat and go" }.
I display it with foreach.
I want to convert each string like this: EAT and GO
Here is my code:
Console.Write(myString.First().ToString().ToUpper() + String.Join("", myString.Skip(1)).ToLower() + "\n");
But the output is: Eat and go
Could you help me? I would appreciate it. Thanks
While .ToUpper() will convert a string to its upper case equivalent, calling .First() on a string object actually returns the first element of the string (since it's effectively a char[] under the hood). First() is actually exposed as a LINQ extension method and works on any collection type.
As with many string handling functions, there are a number of ways to handle it, and this is my approach. Obviously you'll need to validate value to ensure it's being given a long enough string.
using System.Text;
public string CapitalizeFirstAndLast(string value)
{
string[] words = value.Split(' '); // break into individual words
StringBuilder result = new StringBuilder();
// Add the first word capitalized
result.Append(words[0].ToUpper());
// Add everything else, restoring the spaces that Split removed
for (int i = 1; i < words.Length - 1; i++)
result.Append(' ').Append(words[i]);
// Add the last word capitalized
result.Append(' ').Append(words[words.Length - 1].ToUpper());
return result.ToString();
}
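The validation mentioned above could look roughly like this. It is a sketch rather than the answerer's code: the Capitalizer class name is hypothetical, and returning empty or blank input unchanged is an assumption about the desired behaviour.

```csharp
using System;

static class Capitalizer
{
    public static string CapitalizeFirstAndLast(string value)
    {
        // Assumption: null, empty or blank input is returned as-is.
        if (string.IsNullOrWhiteSpace(value))
        {
            return value;
        }

        // RemoveEmptyEntries guards against repeated spaces.
        string[] words = value.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // For a single word, first and last are the same element,
        // so it is simply upper-cased once.
        words[0] = words[0].ToUpper();
        words[words.Length - 1] = words[words.Length - 1].ToUpper();

        return string.Join(" ", words);
    }
}
```

With the question's input, `Capitalizer.CapitalizeFirstAndLast("eat and go")` yields "EAT and GO", and a one-word string such as "eat" becomes "EAT" rather than being appended twice.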
If it's always going to be a three-word string, then you can simply do it like this:
string[] mystring = {"eat and go", "fast and slow"};
foreach (var s in mystring)
{
string[] toUpperLower = s.Split(' ');
Console.Write(toUpperLower.First().ToUpper() + " " + toUpperLower[1].ToLower() +" " + toUpperLower.Last().ToUpper());
}
If you want to continuously alternate, you can do the following:
private static string alternateCase(string phrase)
{
string[] words = phrase.Split(' ');
StringBuilder builder = new StringBuilder();
//create a flag that keeps track of the case change
bool upperToggle = true;
//loops through the words
for (int i = 0; i < words.Length; i++)
{
if (upperToggle)
//converts to upper if flag is true
words[i] = words[i].ToUpper();
else
//converts to lower if flag is false
words[i] = words[i].ToLower();
upperToggle = !upperToggle;
//puts back the space that Split removed, except before the first word
if (i > 0)
builder.Append(' ');
//adds the word to the string builder
builder.Append(words[i]);
}
//returns the new string
return builder.ToString();
}
Quickie using ScriptCS:
scriptcs (ctrl-c to exit)
> var input = "Eat and go";
> var words = input.Split(' ');
> var result = string.Join(" ", words.Select((s, i) => i % 2 == 0 ? s.ToUpperInvariant() : s.ToLowerInvariant()));
> result
"EAT and GO"
I'm stuck on how to count how many words are in each sentence. For example, given string sentence = "hello how are you. I am good. that's good."
and have it come out like:
//sentence1: 4 words
//sentence2: 3 words
//sentence3: 2 words
I can get the number of sentences
public int GetNoOfWords(string s)
{
return s.Split(new char[] { '.' }, StringSplitOptions.RemoveEmptyEntries).Length;
}
label2.Text = (GetNoOfWords(sentance).ToString());
and I can get the number of words in the whole string
public int CountWord (string text)
{
int count = 0;
for (int i = 0; i < text.Length; i++)
{
if (text[i] != ' ')
{
if ((i + 1) == text.Length)
{
count++;
}
else
{
if(text[i + 1] == ' ')
{
count++;
}
}
}
}
return count;
}
then button1
int words = CountWord(sentance);
label4.Text = (words.ToString());
But I can't count how many words are in each sentence.
Instead of looping over the string as you do in CountWord, I would just use:
int words = s.Split(' ').Length;
It's much more clean and simple. You split on white spaces which returns an array of all the words, the length of that array is the number of words in the string.
Why not use Split instead?
var sentences = "hello how are you. I am good. that's good.";
foreach (var sentence in sentences.TrimEnd('.').Split('.'))
Console.WriteLine(sentence.Trim().Split(' ').Count());
If you want the number of words in each sentence, you need to split the text into sentences first and then count the words in each one:
string s = "This is a sentence. Also this counts. This one is also a thing.";
string[] sentences = s.Split(new char[] { '.' }, StringSplitOptions.RemoveEmptyEntries);
foreach(string sentence in sentences)
{
Console.WriteLine(sentence.Split(' ').Length + " words in sentence *" + sentence + "*");
}
Use CountWord on each element of the array returned by s.Split:
string text = "hello how are you. I am good. that's good.";
string[] sentences = text.Split(new char[] { '.' }, StringSplitOptions.RemoveEmptyEntries);
foreach (string sentence in sentences)
{
int noOfWordsInSentence = CountWord(sentence);
}
string text = "hello how are you. I am good. that's good.";
string[] sentences = text.Split(new char[] { '.' }, StringSplitOptions.RemoveEmptyEntries);
IEnumerable<int> wordsPerSentence = sentences.Select(s => s.Trim().Split(' ').Length);
As noted in several answers here, look at String functions like Split, Trim, Replace, etc. to get you going. All answers here will solve your simple example, but here are some sentences which they may fail to analyse correctly:
"Hello, how are you?" (no '.' to parse on)
"That apple costs $1.50." (a '.' used as a decimal)
"I like whitespace . "
"Word"
If you only need a count, I'd avoid Split() -- it takes up unnecessary space. Perhaps:
static int WordCount(string s)
{
int wordCount = 0;
for(int i = 0; i < s.Length - 1; i++)
if (Char.IsWhiteSpace(s[i]) && !Char.IsWhiteSpace(s[i + 1]) && i > 0)
wordCount++;
return ++wordCount;
}
public static void Main()
{
Console.WriteLine(WordCount(" H elloWor ld g ")); // prints "4"
}
It counts based on the number of spaces (1 space = 2 words). Consecutive spaces are ignored.
Does your spelling of sentence in:
int words = CountWord(sentance);
have anything to do with it?