How to ignore punctuation when counting words in C#

I want to ignore punctuation. I'm trying to write a program that counts the occurrences of every word in my text without taking punctuation marks into consideration.
My program is:
static void Main(string[] args)
{
    string text = "This my world. World, world,THIS WORLD ! Is this - the world .";
    IDictionary<string, int> wordsCount =
        new SortedDictionary<string, int>();
    text = text.ToLower();
    text = text.replaceAll("[^0-9a-zA-Z\text]", "X");
    string[] words = text.Split(' ', ',', '-', '!', '.');
    foreach (string word in words)
    {
        int count = 1;
        if (wordsCount.ContainsKey(word))
            count = wordsCount[word] + 1;
        wordsCount[word] = count;
    }
    var items = from pair in wordsCount
                orderby pair.Value ascending
                select pair;
    foreach (var p in items)
    {
        Console.WriteLine("{0} -> {1}", p.Key, p.Value);
    }
}
The output is:
is->1
my->1
the->1
this->3
world->5
(here is nothing) -> 8
How can I remove the punctuation here?

You should try specifying StringSplitOptions.RemoveEmptyEntries:
string[] words = text.Split(" ,-!.".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
Note that instead of manually creating a char[] with all the punctuation characters, you may create a string and call ToCharArray() to get the array of characters.
I find it easier to read and to modify later on.

string[] words = text.Split(new char[] { ' ', ',', '-', '!', '.' }, StringSplitOptions.RemoveEmptyEntries);

It is simple: first remove the undesired punctuation with a Replace call, then continue with the splitting as you already have it.
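A minimal sketch of that replace-then-split idea follows. It assumes Regex.Replace is an acceptable stand-in for "Replace" and that every character other than a letter or digit should be treated as a separator; the class and method names (PunctuationDemo, CountWords) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class PunctuationDemo
{
    // Hypothetical helper: turns every run of characters that are not
    // letters or digits into a single space, then splits and counts.
    public static SortedDictionary<string, int> CountWords(string text)
    {
        string cleaned = Regex.Replace(text.ToLowerInvariant(), @"[^\p{L}\p{Nd}]+", " ");
        string[] words = cleaned.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        var counts = new SortedDictionary<string, int>();
        foreach (string word in words)
        {
            int current;
            counts.TryGetValue(word, out current); // current stays 0 when the key is missing
            counts[word] = current + 1;
        }
        return counts;
    }

    static void Main()
    {
        var counts = CountWords("This my world. World, world,THIS WORLD ! Is this - the world .");
        foreach (var pair in counts)
            Console.WriteLine("{0} -> {1}", pair.Key, pair.Value);
    }
}
```

Because the punctuation is replaced with spaces rather than removed outright, "world,THIS" still separates into two words.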

... you can go with the making people cry version ...
"This my world. World, world,THIS WORLD ! Is this - the world ."
    .ToLower()
    .Split(" ,-!.".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
    .GroupBy(i => i)
    .Select(i => new { Word = i.Key, Count = i.Count() })
    .OrderBy(k => k.Count)
    .ToList()
    .ForEach(Console.WriteLine);
... output:
{ Word = my, Count = 1 }
{ Word = is, Count = 1 }
{ Word = the, Count = 1 }
{ Word = this, Count = 3 }
{ Word = world, Count = 5 }

Related

Reading characters from a string and counting each one of them

The issue I have with my code is the following: I cannot work out how to read each character and keep a running total per character, so that after the whole input has been processed I have one int count for each. Here is my code:
class Program
{
    static void Main()
    {
        SortedDictionary<string, int> text = new SortedDictionary<string, int>();
        string[] characters = Console.ReadLine()
            .Split()
            .ToArray();
        foreach (var character in characters)
        {
            if (text.ContainsKey(character))
            {
                text[character]++;
            }
            else
            {
                text.Add(character, 1);
            }
        }
        foreach (var character in text)
        {
            Console.WriteLine($"{character.Key} -> {character.Value}");
        }
    }
}
This counts how many times each string occurs in the dictionary, but what I need is the count of each character, as described above. Please help, thanks!
String.Split() with no arguments splits on whitespace, so characters contains the whitespace-separated words of the line rather than its individual characters. If you want each of the characters, just get rid of the Split (and change the dictionary's key type to char to match the values):
SortedDictionary<char, int> text = new SortedDictionary<char, int>();
char[] characters = Console.ReadLine().ToArray();
// ...
https://www.ideone.com/hnMSv1
Since string implements IEnumerable<char> you actually don't even need to convert the characters into an array:
SortedDictionary<char, int> text = new SortedDictionary<char, int>();
string line = Console.ReadLine();
foreach( char character in line )
// ...
https://www.ideone.com/nLyBfC
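For reference, here is one way the pieces above might fit together as a complete program. This is only a sketch: the CountChars helper and the CharCountDemo class are names invented for this example, and the loop body reuses the ContainsKey pattern from the question.

```csharp
using System;
using System.Collections.Generic;

static class CharCountDemo
{
    // Hypothetical helper: counts how often each character occurs in the line.
    public static SortedDictionary<char, int> CountChars(string line)
    {
        var counts = new SortedDictionary<char, int>();
        foreach (char character in line) // string implements IEnumerable<char>
        {
            if (counts.ContainsKey(character))
                counts[character]++;
            else
                counts.Add(character, 1);
        }
        return counts;
    }

    static void Main()
    {
        // ReadLine returns null at end of input, so fall back to an empty string.
        string line = Console.ReadLine() ?? string.Empty;
        foreach (var pair in CountChars(line))
            Console.WriteLine($"{pair.Key} -> {pair.Value}");
    }
}
```

Spaces are counted like any other character here; filter them out first if that is not wanted.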
You can use LINQ here because a string is a sequence of char elements: the string type implements the IEnumerable<char> interface.
string str = "aaabbc";
var res = str
    .GroupBy(c => c)
    .ToDictionary(g => g.Key, g => g.Count());
The example below shows how to get the same counts without building a dictionary, by projecting to an anonymous type and sorting by count in descending order:
var res2 = str
    .GroupBy(c => c)
    .Select(d => new { d.Key, Count = d.Count() })
    .OrderByDescending(x => x.Count);

How to count 2 or 3 letter words in a string using asp c#

How can I count the 2- or 3-letter words in a string using ASP.NET and C#? For example, given
string value="This is my string value";
the output should look like this:
2 letter words = 2
3 letter words = 0
4 letter words = 1
Please help, Thanks in advance.
You can try something like this:
Split the sentence on spaces to get an array of words.
Group the words by length (and order the groups by that length).
Iterate through every group and write the letter count and the number of words with that letter count.
Code:
using System.Linq;
using System.Diagnostics;
...
var words = value.Split(' ');
var groupedByLength = words.GroupBy(w => w.Length).OrderBy(x => x.Key);
foreach (var grp in groupedByLength)
{
    Debug.WriteLine(string.Format("{0} letter words: {1}", grp.Key, grp.Count()));
}
First of all, you need to decide what counts as a word. A naive approach is to split the string on spaces, but then punctuation such as commas stays attached to the words and is counted as part of them. Another approach is to use the following regex
\b\w+?\b
and collect all the matches.
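Collecting the matches could look roughly like the sketch below. It uses the greedy form \b\w+\b, which should find the same words as the lazy pattern above; ExtractWords and WordLengthDemo are illustrative names, and the sample sentence with added punctuation is an assumption made to show the difference from a plain split.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class WordLengthDemo
{
    // Hypothetical helper: returns every run of word characters as one word.
    public static string[] ExtractWords(string value)
    {
        return Regex.Matches(value, @"\b\w+\b")
                    .Cast<Match>()        // MatchCollection is non-generic on older frameworks
                    .Select(m => m.Value)
                    .ToArray();
    }

    static void Main()
    {
        string[] words = ExtractWords("This is my string, value.");
        foreach (var group in words.GroupBy(w => w.Length).OrderBy(g => g.Key))
            Console.WriteLine("{0} letter words = {1}", group.Key, group.Count());
    }
}
```

Note that \w also matches digits and underscores, so "abc_1" would count as a single five-character word.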
Now that you have all the words in a words array, you can write a LINQ query:
var query = words.Where(x => x.Length >= 2 && x.Length <= 4)
    .GroupBy(x => x.Length)
    .Select(x => new { CharCount = x.Key, WordCount = x.Count() });
Then you can print the query results like this:
query.ToList().ForEach(Console.WriteLine);
This prints:
{ CharCount = 4, WordCount = 1 }
{ CharCount = 2, WordCount = 2 }
You can write some code yourself to produce a more formatted output.
If I understood your question correctly, you can do it using a dictionary.
First, split the string on spaces:
string value = "This is my string value";
string[] words = value.Split(' ');
Then loop through the array of words and use the length of each word as the dictionary key. Note that I've used a string as the key, but you can modify this to suit your needs.
Dictionary<string, int> letterWords = new Dictionary<string, int>();
for (int i = 0; i < words.Length; i++)
{
    string key = words[i].Length + " letter word";
    if (letterWords.ContainsKey(key))
        letterWords[key] += 1;
    else
        letterWords.Add(key, 1);
}
And to print the output:
foreach (var ind in letterWords)
{
    Console.WriteLine(ind.Key + " = " + ind.Value);
}
Modify this as you wish.

Finding number of instances of exact word of "x" in text

I'm working in C# and want to find the number of instances of the exact word "x" in a text.
For example:
List<string> words = new List<string> {"Mode", "Model", "Model:"};
Text= "This is Model: x Type: y aa: e";
I've used Regex:
for (int i = 0; i < words.Count; i++)
{
    string word = words[i];
    int count = Regex.Matches(Text, word).Count;
}
But it's not working: the code above gives count = 1 for each of Mode, Model, and Model:.
I want the count to be 0 for Mode, 0 for Model, and 1 for Model:, so that only instances of the exact word are counted.
I forgot to mention that I can't use Split in my case. Is there a way to do this without Split?
I use LINQ for this purpose:
List<string> words = new List<string> { "Mode", "Model", "Model:" };
Text = "This is Model: x Type: Model: y aa: Mode e Model:";
var textArray = Text.Split(' ');
var countt = words.Select(item => textArray.ToList().Contains(item) ?
textArray.Count(d => d == item) : 0).ToArray();
Result:
For Mode => count = 1
For Model => count = 0
For Model: => count = 3
EDIT: I prefer to use LINQ for this purpose because, as you can see, it is easier and cleaner in this scenario, but if you are still looking for a Regex solution you could try this:
List<int> count = new List<int>();
foreach (var word in words)
{
    var regex = new Regex(string.Format(@"\b{0}(\s|$)", Regex.Escape(word)), RegexOptions.IgnoreCase);
    count.Add(regex.Matches(Text).Count);
}
EDIT2: Or, by combining LINQ and Regex and without Split, you can:
List<int> count = words.Select(word => new Regex(string.Format(@"\b{0}(\s|$)", Regex.Escape(word)), RegexOptions.IgnoreCase))
    .Select(regex => regex.Matches(Text).Count).ToList();
Although @S.Akhbari's solution works, I think using LINQ like this is cleaner:
var splitted = Text.Split(' ');
var items = words.Select(x => new { Word = x, Count = splitted.Count(y => y == x) });
Each item will have Word and Count properties.
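\b matches on word boundaries. Note that a boundary only exists between a word character and a non-word character, so a term that ends in punctuation, such as "Model:", will not match when it is followed by whitespace or the end of the text.
for (int i = 0; i < words.Count; i++)
{
    string word = words[i];
    var regex = new Regex(string.Format(@"\b{0}\b", Regex.Escape(word)),
        RegexOptions.IgnoreCase);
    int count = regex.Matches(Text).Count;
}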
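If "exact word" is taken to mean a whitespace-delimited token, one possible way to count it without Split is to use lookarounds instead of \b. This is a sketch under that assumption, not something taken from the answers above; CountExact and ExactTokenDemo are made-up names, and the match is case-sensitive.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ExactTokenDemo
{
    // Hypothetical helper: counts occurrences of word that are surrounded
    // by whitespace or by the start/end of the text.
    public static int CountExact(string text, string word)
    {
        // (?<!\S) : the match is not preceded by a non-whitespace character
        // (?!\S)  : the match is not followed by a non-whitespace character
        string pattern = @"(?<!\S)" + Regex.Escape(word) + @"(?!\S)";
        return Regex.Matches(text, pattern).Count;
    }

    static void Main()
    {
        string text = "This is Model: x Type: y aa: e";
        foreach (string word in new List<string> { "Mode", "Model", "Model:" })
            Console.WriteLine("{0} => {1}", word, CountExact(text, word));
    }
}
```

With the sample text from the question this gives 0 for Mode, 0 for Model and 1 for Model:, because the lookarounds do not care whether the term itself ends in a word character.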

Using IndexOf to check if a string contains a character

What I'm trying to do is type in random words into box1, click a button and then print all the words that start with "D" in box2. So if I was to type in something like "Carrots Doors Apples Desks Dogs Carpet" and click the button "Doors Desks Dogs" would print in box2.
string s = box1.Text;
int i = s.IndexOf("D");
string e = s.Substring(i);
box2.Text = (e);
When I use this, it prints "Doors Apples Desks Dogs Carpet" instead of just the D words.
NOTE: These words are an example, I could type anything into box1.
Any help?
You could simplify this by using LINQ
var allDWords = box1.Text.Split(' ').Where(w => w.StartsWith("D"));
box2.Text = String.Join(" ", allDWords);
Try this
box2.Text = String.Join(" ",
box1.Text.Split(' ')
.Where(p => p.StartsWith("D")));
You can match the D words with a regular expression and iterate over the results.
Try this regex (the leading \b keeps it from matching a capital D in the middle of a word):
\bD\w*
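A short sketch of how the matches might be collected and joined. It uses a pattern anchored with \b so that only words beginning with D match; FilterDWords and DWordsDemo are hypothetical names, and the input string stands in for box1.Text.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class DWordsDemo
{
    // Hypothetical helper: returns the words starting with an upper-case D,
    // joined by single spaces.
    public static string FilterDWords(string input)
    {
        var matches = Regex.Matches(input, @"\bD\w*")
                           .Cast<Match>()
                           .Select(m => m.Value);
        return string.Join(" ", matches);
    }

    static void Main()
    {
        // In the form, the result would be assigned to box2.Text instead.
        Console.WriteLine(FilterDWords("Carrots Doors Apples Desks Dogs Carpet"));
        // prints: Doors Desks Dogs
    }
}
```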
First you need to split up the text into words and then check to see if each word starts with D. When looking for the first character it's easier to just check it directly.
string s = box1.Text;
StringBuilder builder = new StringBuilder();
foreach (var cur in s.Split(new char[] { ' ' })) {
    if (cur.Length > 0 && cur[0] == 'D') {
        builder.Append(cur);
        builder.Append(' ');
    }
}
box2.Text = builder.ToString();
One thing you could do is the following.
Let's suppose:
string str = "Dog Cat Man etc";
string[] words = str.Split(' ');
List<string> wordStartWithD = new List<string>();
foreach (string strTemp in words)
    if (strTemp.StartsWith("D"))
        wordStartWithD.Add(strTemp);
Hope this helps.

Splitting strings from ListBox items

I am using a ListBox to store different strings that the user gives as input, but I want to split each ListBox item so that the first word of the item is one string and the rest is another.
I am iterating over the ListBox items like this:
foreach (ListItem item in lstboxColumnList.Items)
{
    column_name = temp + "\" " + item + "\"";
    temp = column_name + "," + Environment.NewLine;
}
How can I get the split strings?
Assuming the first word ends with a space, you can use something like the following:
string firstWord = sentence.Substring(0, sentence.IndexOf(' '));
string remainingSentence = sentence.Substring(sentence.IndexOf(' ') + 1);
I don't know the format of your listbox items, but I assumed that each item has at least two words separated by a space. In that case you can do the splitting using Substring and IndexOf:
string first = sentence.Substring(0, sentence.IndexOf(" "));
string second = sentence.Substring(sentence.IndexOf(" ") + 1);
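The Substring and IndexOf approach throws when an item contains no space, because IndexOf then returns -1. A possible alternative, sketched here with invented names (SplitFirstWord, FirstWordDemo), is the Split overload that takes a maximum number of parts:

```csharp
using System;

static class FirstWordDemo
{
    // Hypothetical helper: returns { firstWord, rest }. rest is empty
    // when the sentence contains only one word.
    public static string[] SplitFirstWord(string sentence)
    {
        // A count of 2 makes Split stop after the first separator,
        // so everything after the first space stays in one piece.
        string[] parts = sentence.Trim().Split(new[] { ' ' }, 2);
        string rest = parts.Length > 1 ? parts[1].TrimStart() : string.Empty;
        return new[] { parts[0], rest };
    }

    static void Main()
    {
        string[] result = SplitFirstWord("key2 some more data");
        Console.WriteLine("first: '{0}', rest: '{1}'", result[0], result[1]);
    }
}
```

In the ListBox loop, the same call could be applied to the text of each item.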
public void Test()
{
    List<string> source = new List<string> {
        "key1 some data",
        "key2 some more data",
        "key3 yada..."};
    Dictionary<string, string> resultDictionary = source.ToDictionary(n => n.Split(' ').First(), n => n.Substring(n.IndexOf(' ')));
    List<string> resultStrings = source.Select(n => string.Format("\"{0}\",{1}", n.Split(' ').First(), n.Substring(n.IndexOf(' ')))).ToList();
}
resultDictionary is a dictionary whose keys are the first word of each string in the source list.
The second result matches the requirements in your question more closely, in that it outputs a list of strings in the format you specified.
EDIT: Apologies, posted in VB first time round.
Check out:
var parts = lstboxColumnList.Items.OfType<ListItem>().Select(i => new {
    Part1 = i.Text.Split(' ').FirstOrDefault(),
    Part2 = i.Text.Substring(i.Text.IndexOf(' '))
});
foreach (var part in parts)
{
    var p1 = part.Part1;
    var p2 = part.Part2;
    // TODO: use p1, p2 in magic code!!
}
