Counting number of words in a text file - c#

I'm trying to count the number of words from a text file, namely this, to start.
This is a test of the word count program. This is only a test. If your
program works successfully, you should calculate that there are 30
words in this file.
I am using StreamReader to put everything from the file into a string, and then use the .Split method to get the number of individual words, but I keep getting the wrong value when I compile and run the program.
using System;
using System.IO;
class WordCounter
{
static void Main()
{
string inFileName = null;
Console.WriteLine("Enter the name of the file to process:");
inFileName = Console.ReadLine();
StreamReader sr = new StreamReader(inFileName);
int counter = 0;
string delim = " ,.";
string[] fields = null;
string line = null;
while(!sr.EndOfStream)
{
line = sr.ReadLine();
}
fields = line.Split(delim.ToCharArray());
for(int i = 0; i < fields.Length; i++)
{
counter++;
}
sr.Close();
Console.WriteLine("The word count is {0}", counter);
}
}

Try using a regular expression, e.g.:
int count = Regex.Matches(input, @"\b\w+\b").Count;
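To show the regex suggestion in context, here is a minimal sketch run against the sample text from the question. The class name RegexWordCount and the method Count are made up for illustration; only Regex.Matches comes from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexWordCount
{
    // Counts maximal runs of word characters (letters, digits, underscore).
    public static int Count(string input)
    {
        return Regex.Matches(input, @"\b\w+\b").Count;
    }

    static void Main()
    {
        // The sample file contents from the question, joined with newlines.
        string sample =
            "This is a test of the word count program. This is only a test. If your\n" +
            "program works successfully, you should calculate that there are 30\n" +
            "words in this file.";
        Console.WriteLine(Count(sample)); // prints 30
    }
}
```

Note that this pattern treats an apostrophe as a word boundary, so "don't" would count as two words. That may or may not be what you want.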

This should work for you:
using System;
using System.IO;
class WordCounter
{
static void Main()
{
string inFileName = null;
Console.WriteLine("Enter the name of the file to process:");
inFileName = Console.ReadLine();
StreamReader sr = new StreamReader(inFileName);
int counter = 0;
string delim = " ,."; //maybe some more delimiters like ?! and so on
string[] fields = null;
string line = null;
while(!sr.EndOfStream)
{
line = sr.ReadLine(); // each time you read a line, split it into words
line = line.Trim(); // Trim returns a new string, so keep the result
fields = line.Split(delim.ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
counter += fields.Length; // and just add how many there are
}
sr.Close();
Console.WriteLine("The word count is {0}", counter);
}
}

A couple hints.
What if the file contains just the single word "hi"? What would your output be?
Your counter calculation is: from 0 through fields.Length, increment counter. How are fields.Length and your counter related?
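One way to explore these hints is to look at what Split actually returns for a single line. The sketch below is only an experiment, with made-up names (SplitDemo, RawCount, WordCount), using the same delimiters as the question.

```csharp
using System;

static class SplitDemo
{
    static readonly char[] Delims = " ,.".ToCharArray();

    // Number of pieces Split returns, including empty strings.
    public static int RawCount(string line)
    {
        return line.Split(Delims).Length;
    }

    // Number of non-empty pieces, i.e. actual words.
    public static int WordCount(string line)
    {
        return line.Split(Delims, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    static void Main()
    {
        Console.WriteLine(RawCount("hi"));                    // prints 1
        Console.WriteLine(RawCount("This is only a test."));  // prints 6: the trailing "." yields an empty entry
        Console.WriteLine(WordCount("This is only a test.")); // prints 5
    }
}
```

So the loop over fields adds nothing beyond fields.Length, and without RemoveEmptyEntries every delimiter that ends a line or follows another delimiter inflates the count.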

You're probably getting an off-by-one error. Try something like this:
counter = 0;
while(!sr.EndOfStream)
{
line = sr.ReadLine();
// RemoveEmptyEntries keeps delimiters at line ends or next to each other from adding empty "words"
fields = line.Split(delim.ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
counter += fields.Length;
}
There is no need to iterate over the array to count the elements when you can get the number directly.

using System.IO;
using System;
namespace solution
{
class Program
{
static void Main(string[] args)
{
var readFile = File.ReadAllText(@"C:\test\my.txt");
var str = readFile.Split(new char[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
System.Console.WriteLine("Number of words: " + str.Length);
}
}
}

//Easy method using Linq to Count number of words in a text file
/// www.techhowdy.com
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace FP_WK13
{
static class Util
{
public static IEnumerable<string> GetLines(string yourtextfile)
{
TextReader reader = new StreamReader(yourtextfile);
string result = string.Empty;
string line;
while ((line = reader.ReadLine()) != null)
{
yield return line;
}
reader.Close();
}
// Word Count
public static int GetWordCount(string str)
{
int words = 0;
string s = string.Empty;
var lines = GetLines(str);
foreach (var item in lines)
{
s = item.ToString();
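words = words + s.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length; // skip empty entries from blank lines and repeated spaces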
}
return words;
}
}
}
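A shorter LINQ variant is possible with File.ReadLines, which streams the file and disposes the reader for you. This is only a sketch: the names LinqWordCount and Count are invented here, and it assumes words are separated by spaces or tabs.

```csharp
using System;
using System.IO;
using System.Linq;

static class LinqWordCount
{
    static readonly char[] Whitespace = { ' ', '\t' };

    // Sums the number of non-empty tokens on each line of the file.
    public static int Count(string path)
    {
        return File.ReadLines(path)
                   .Sum(line => line.Split(Whitespace, StringSplitOptions.RemoveEmptyEntries).Length);
    }

    static void Main()
    {
        string path = Path.GetTempFileName(); // throwaway file for the demo
        File.WriteAllText(path, "one two  three\n\nfour\n");
        Console.WriteLine(Count(path)); // prints 4
        File.Delete(path);
    }
}
```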

Related

Remove Non-ASCII characters from XML file C#

I am trying to write a program that opens an XML file containing non-ASCII characters, replaces those characters with spaces, and then saves and closes the file.
That's basically it: just open the file, remove all the non-ASCII characters, and save/close the file.
Here is my code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.IO;
using System.Text.RegularExpressions;
namespace RemoveSpecial
{
class Program
{
static void Main(string[] args)
{
string pth_input = string.Empty;
string pth_output = string.Empty;
for (int i = 1; i < args.Length; i++)
{
//input one
string p_input = args[0];
pth_input = p_input;
pth_input = pth_input.Replace(@"\", @"\\");
//output
string p_output = args[2];
pth_output = p_output;
pth_output = pth_output.Replace(@"\", @"\\");
}
//s = Regex.Replace(s, @"[^\u0000-\u007F]+", string.Empty);
string lx;
using (StreamReader sr = new StreamReader(pth_input))
{
using (StreamWriter x = new StreamWriter(pth_output))
{
while ((lx = sr.ReadLine()) != null)
{
string text = sr.ReadToEnd();
Regex.Replace(text, @"[^\u0000-\u007F]+", "", RegexOptions.Compiled);
x.Write(text);
} sr.Close();
}
}
}
}
}
Thanks in advance guys.
According to documentation, the first string is an input parameter (and not passed by reference, so it could not change anyway). The result of the replacement is in the return value, like so:
var result = Regex.Replace(text, @"[^\u0000-\u007F]+", "", RegexOptions.Compiled);
x.Write(result);
Note that RegexOptions.Compiled might decrease performance here. It makes sense only if you reuse the same regular expression instance on multiple strings. You can still do that if you create the Regex instance outside of the loop:
var regex = new Regex(@"[^\u0000-\u007F]+", RegexOptions.Compiled);
using (var sr = new StreamReader(pth_input))
{
using (var x = new StreamWriter(pth_output))
{
while ((lx = sr.ReadLine()) != null)
{
// Process the line just read; calling sr.ReadToEnd() here would skip it.
var result = regex.Replace(lx, String.Empty);
x.WriteLine(result);
}
}
}
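The question asks for non-ASCII characters to be replaced with spaces, while the code above removes them. If spaces are what you want, a sketch along these lines might do; AsciiFilter and ReplaceNonAscii are hypothetical names, and the pattern drops the + so that each character becomes exactly one space.

```csharp
using System;
using System.Text.RegularExpressions;

static class AsciiFilter
{
    // One shared instance, so the cost of RegexOptions.Compiled is paid only once.
    static readonly Regex NonAscii = new Regex(@"[^\u0000-\u007F]", RegexOptions.Compiled);

    // Replaces every non-ASCII UTF-16 code unit with a single space.
    public static string ReplaceNonAscii(string text)
    {
        return NonAscii.Replace(text, " ");
    }

    static void Main()
    {
        Console.WriteLine(ReplaceNonAscii("a\u00e9b")); // prints "a b"
    }
}
```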

Open a txt file using C# and read the numbers on the file

How can I open a .txt file and read numbers separated by enters or spaces into an array list?
Example:
Now what I want to do is to search (for 1 2 9 ) and send to the console.
I have tried a lot of code but nothing seems to work :(
This is my current code :
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
namespace Padroes
{
class Program
{
static void Main(string[] args)
{
try
{
// Open the text file using a stream reader.
const string FILENAME = @"Example.txt";
List<List<int>> data = new List<List<int>>();
string inputLine = "";
StreamReader reader = new StreamReader(FILENAME);
while ((inputLine = reader.ReadLine()) != null)
{
inputLine = inputLine.Trim();
if (inputLine.Length > 0)
{
List<int> inputArray = inputLine.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Select(x => int.Parse(x)).ToList();
data.Add(inputArray);
Console.WriteLine(inputLine);
}
}
}
catch (Exception e)
{
Console.WriteLine("The file could not be read:");
Console.WriteLine(e.Message);
}
Console.ReadKey();
}
}
}
With this code this is my output:
Now what can I do to search only for ( 1 2 9 ) and send only the 1 2 9 to console ?
I believe this would do the trick. I simply used a StreamReader and looped through each line. I'm not sure I understood the condition 100%, but if I did, it should look something like this:
StreamReader file = new StreamReader(@"test.txt");
string line= file.ReadLine();
while(line!=null)
{
if (line.Equals("5 8 1 7"))
MessageBox.Show(line);
line = file.ReadLine();
}
Good luck.
Try this
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
const string FILENAME = @"c:\temp\test.txt";
static void Main(string[] args)
{
List<List<int>> data = new List<List<int>>();
string inputLine = "";
StreamReader reader = new StreamReader(FILENAME);
while((inputLine = reader.ReadLine()) != null)
{
inputLine = inputLine.Trim();
if (inputLine.Length > 0)
{
List<int> inputArray = inputLine.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries).Select(x => int.Parse(x)).ToList();
data.Add(inputArray);
}
}
}
}
}
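The code above only loads the numbers; it does not do the search yet. Assuming "search for 1 2 9" means "find the lines whose numbers are exactly 1, 2, 9 in that order" (the question does not spell this out), the search could be sketched like this. RowSearch, ParseRow and FindRows are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RowSearch
{
    // Parses one line of space-separated integers.
    public static List<int> ParseRow(string line)
    {
        return line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                   .Select(s => int.Parse(s))
                   .ToList();
    }

    // Returns the rows whose numbers equal the target, element by element.
    public static List<List<int>> FindRows(IEnumerable<string> lines, int[] target)
    {
        return lines.Where(l => l.Trim().Length > 0)
                    .Select(l => ParseRow(l))
                    .Where(row => row.SequenceEqual(target))
                    .ToList();
    }

    static void Main()
    {
        // Made-up sample data; the question's actual file contents are not shown.
        string[] lines = { "5 8 1 7", "1 2 9", "", "3 4" };
        foreach (var row in FindRows(lines, new[] { 1, 2, 9 }))
            Console.WriteLine(string.Join(" ", row)); // prints "1 2 9"
    }
}
```

With a real file you could pass File.ReadLines(FILENAME) as the lines argument.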

Load items from file and split them into array

I have to write a C# Windows Forms program that loads, from a file, an ID number and a DNA sequence of 20 letters for each person. The data has to look something like this:
//Edit: I'll try to explain it better. It's a C# Forms program that has to load 20 people from a town (a file), each with their DNA, ID number and name. After that I have to load a single DNA sequence from a file, without a name or ID number (this is the murderer's; the program is a CSI game: you have a town of 20 people, someone commits a murder, and I have to find them). Then I have to COMPARE the single DNA sequence with all 20 DNA sequences, compute a match percentage for each, and so find the murderer.
1;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A
2;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A
3;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A;A
...
The file has 20 lines.
I've tried this so far, but it doesn't work:
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using System.IO;
namespace CSI_Marconi_FORM
{
public partial class DNAabitanti : Form
{
public DNAabitanti()
{
InitializeComponent();
}
private void DNAabitanti_Load(object sender, EventArgs e)
{
StreamReader reader = new StreamReader(@"\\Repo\Studenti$\Informatica\SezCi\4Ci\Corneliu.Cotet\Documenti\Visual Studio 2012\Projects\CSI Marconi FORM\CSI Marconi FORM\bin\Debug\DNAabitanti.txt");
reader = File.OpenText(@"\\Repo\Studenti$\Informatica\SezCi\4Ci\Corneliu.Cotet\Documenti\Visual Studio 2012\Projects\CSI Marconi FORM\CSI Marconi FORM\bin\Debug\DNAabitanti.txt");
FormPrincipale.utenti = File.ReadAllLines(@"\\Repo\Studenti$\Informatica\SezCi\4Ci\Corneliu.Cotet\Documenti\Visual Studio 2012\Projects\CSI Marconi FORM\CSI Marconi FORM\bin\Debug\DNAabitanti.txt").Length;
string abitanti = reader.ReadToEnd();
richTextBox1.Text = abitanti;
reader.Close();
FormPrincipale.database = new FormPrincipale.Persona[FormPrincipale.utenti];
FormPrincipale.corrispondenze = new int [FormPrincipale.utenti];
for (int i = 0; i < FormPrincipale.utenti; i++)
{
string letto = "";
letto = reader.ReadToEnd();
string[] aus = letto.Split(new char[] { ';' });
FormPrincipale.database[i].dna = new string[20];
for (int j = 0; j < 22; j++)
{
if (j < 20)
{
FormPrincipale.database[i].dna[j] = aus[j];
}
if (j == 20)
{
FormPrincipale.database[i].nome = aus[j];
}
if (j == 21)
{
FormPrincipale.database[i].cognome = aus[j];
}
}
}
}
}
}
Try this:
You first have to replace all the ';'s with spaces and then make the individual change to ':' after the first digit. That puts the whole string in the correct format.
string line;
System.IO.StreamReader file = new System.IO.StreamReader(@"d:\textFile.txt");
while ((line = file.ReadLine()) != null)
{
string output = "";
//replacing all ';' with space
output = line.Replace(";", " ");
StringBuilder sb = new StringBuilder(output);
//replacing character after number with ':'
sb[1] = ':';
output = sb.ToString();
MessageBox.Show(output);
}
file.Close();
Without seeing your code, try something like:
var myArray = myFileContents.Split(new char [] { '\n' });
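For the comparison step described in the question, here is a rough sketch. It assumes each line has the form id;letter;letter;... as in the sample lines, and that the sample to compare is a ';'-separated list of letters without an ID. DnaMatch, ParseDna and MatchPercent are hypothetical helper names, not part of the asker's project.

```csharp
using System;
using System.Linq;

static class DnaMatch
{
    // Splits "id;A;C;..." into the ID (first field) and the DNA letters (the rest).
    public static string[] ParseDna(string line, out string id)
    {
        string[] parts = line.Split(';');
        id = parts[0];
        return parts.Skip(1).ToArray();
    }

    // Percentage of positions where the two sequences hold the same letter.
    public static double MatchPercent(string[] a, string[] b)
    {
        int length = Math.Min(a.Length, b.Length);
        if (length == 0) return 0.0;
        int same = 0;
        for (int i = 0; i < length; i++)
            if (a[i] == b[i]) same++;
        return 100.0 * same / length;
    }

    static void Main()
    {
        string id;
        string[] resident = ParseDna("7;A;C;G;T", out id); // hypothetical 4-letter record
        string[] sample = "A;C;G;A".Split(';');            // sample without an ID
        Console.WriteLine("{0}: {1}%", id, MatchPercent(resident, sample)); // prints "7: 75%"
    }
}
```

Running MatchPercent against every line from File.ReadLines and keeping the highest score would then point at the most likely match.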

split string into char and add "......" after in C#

I want to split a string into characters.
This is my string:
"the stack overflow in very good website"
and I want to convert this string so that each word is followed by the same word split into characters, like this:
the.. .. .. t.. ..h.. ..e.. .. stack.. .. ..s.. ..t.. ..a.. ..c.. ..k.. .. overflow.. .. ..o.. ..v.. ..e.. ..r.. ..f.. ..l.. ..o.. ..w.. .. in.. .. ..i.. ..n.. .. very.. .. ..v.. ..e.. ..r.. ..y.. .. good.. .. ..g.. ..o.. ..o.. ..d.. .. website.. .. ..w.. ..e.. ..b.. ..s.. ..i.. ..t.. ..e.. ..
I am using the Natural Reader software to make a dictation MP3 file with spelling.
This is my program:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
namespace file
{
class Program
{
public static string fileLoc = @"C:\Users\Administrator\Desktop\sample1.txt";
public static string s;
public static string data = "the stack overflow in very good website";
private static void Main(string[] args)
{
Create_File();
Wrint_in_File();
Read_file();
add_comma();
s = Console.ReadLine();
}
public static void Wrint_in_File()
{
if (File.Exists(fileLoc))
{
using (StreamWriter sw = new StreamWriter(fileLoc))
{
sw.WriteLine(DateTime.Now);
sw.WriteLine(data);
Console.WriteLine("Data is successfully save in File");
}
}
}
public static void Create_File()
{
FileStream fs = null;
if (!File.Exists(fileLoc))
{
using (fs = File.Create(fileLoc))
{
Console.WriteLine(@"File is Successfully Created at C:\Users\Administrator\Desktop\sample1.txt");
Console.ReadLine();
}
}
}
public static void Read_file()
{
if (File.Exists(fileLoc))
{
using (TextReader tr = new StreamReader(fileLoc))
{
string s= tr.ReadToEnd();
Console.WriteLine(s);
Console.ReadLine();
}
}
}
public static void add_comma()
{
if (File.Exists(fileLoc))
{
using (StreamWriter sw = new StreamWriter(fileLoc))
{
sw.WriteLine(DateTime.Now);
string txt =data.Replace(" ", ".. .. .. .. .. .. .. ..");
sw.WriteLine(txt);
Console.WriteLine(txt);
}
}
}
}
}
Using LINQ you can do:
string str = "the stock overflow in very good website";
string separator = "...";
string joinedString = string.Join("", (str.Split()
.Select(r=> r + separator +
(string.Join(separator, r.ToCharArray()))
+separator)));
Console.WriteLine(joinedString);
(By the way, it's "stack overflow".)
Output would be:
the...t...h...e...stock...s...t...o...c...k...overflow...o...v...e...r...f...l...o...w...in...i...n...very...v...e...r...y...good...g...o...o...d...website...w...e...b...s...i...t...e...
(Remember to include using System.Linq;)
You can use LINQ:
string data = "the stock overflow in very good website";
IEnumerable<string> tokens = data.Split()
.Select(w => string.Format("{0}...{1}", w
, string.Join("...", w.Select(c => string.Format("{0}...", c)))));
string result = string.Join(" ", tokens);
Keep it simple:
string data = "the stack overflow is a very good website";
string []words = data.Split(' ');
string finalString = string.Empty;
string separator ="...";
foreach (string word in words)
{
finalString += word + separator;
string temp = string.Empty;
foreach (char c in word)
{
temp += c + separator;
}
finalString += temp + separator;
temp = string.Empty;
}
//do whatever you want to do with finalString

The old switcheroo (switch position in file)

I would really appreciate if somebody could help me/offer advice on this.
I have a file, probably about 50,000 lines long; these files are generated on a weekly basis. Each line is identical in terms of type of content.
original file:
address^name^notes
But I need to perform a switch: on each and every line, I need to swap the address with the name. After the switch, the names will come first, then the addresses, then the notes, like so:
result file:
name^address^notes
50,000 isn't that much these days, so simply reading in the whole file and outputting the wanted format should work fine for you:
string[] lines = File.ReadAllLines(fileName);
string newLine = string.Empty;
foreach (string line in lines)
{
string[] items = line.Split('^');
newLine = string.Format("{0}^{1}^{2}", items[1], items[0], items[2]);
// Append to new file here...
}
How about this?
StreamWriter sw = new StreamWriter("c:\\output.txt");
StreamReader sr = new StreamReader("c:\\input.txt");
string inputLine = "";
while ((inputLine = sr.ReadLine()) != null)
{
String[] values = null;
values = inputLine.Split('^');
sw.WriteLine("{0}^{1}^{2}", values[1], values[0], values[2]);
}
sr.Close();
sw.Close();
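Both loops above assume every line has at least three fields and will throw on a short or blank line. A slightly more defensive sketch is below; Switcheroo, SwapFirstTwo and ConvertFile are invented names. It passes malformed lines through unchanged and keeps any extra fields after the second one.

```csharp
using System;
using System.IO;
using System.Linq;

static class Switcheroo
{
    // Swaps the first two '^'-separated fields; lines with fewer than two fields are returned unchanged.
    public static string SwapFirstTwo(string line)
    {
        string[] parts = line.Split('^');
        if (parts.Length < 2) return line;
        string tmp = parts[0];
        parts[0] = parts[1];
        parts[1] = tmp;
        return string.Join("^", parts);
    }

    // Streams the input file line by line into the output file (the two paths must differ).
    public static void ConvertFile(string inputPath, string outputPath)
    {
        File.WriteAllLines(outputPath, File.ReadLines(inputPath).Select(l => SwapFirstTwo(l)));
    }

    static void Main()
    {
        Console.WriteLine(SwapFirstTwo("address^name^notes")); // prints "name^address^notes"
    }
}
```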
Go go gadget REGEX!
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
static string Switcheroo(string input)
{
return System.Text.RegularExpressions.Regex.Replace
(input,
@"^([^^]+)\^([^^]+)\^(.+)$",
"$2^$1^$3",
System.Text.RegularExpressions.RegexOptions.Multiline);
}
static void Main(string[] args)
{
string input = "address 1^name 1^notes1\n" +
"another address^another name^more notes\n" +
"last address^last name^last set of notes";
string output = Switcheroo(input);
Console.WriteLine(output);
Console.ReadKey(true);
}
}
}
