I am creating a website which includes a comment area for users, for example a guestbook or a product review section. I want to prevent users from posting inappropriate language, such as vulgarities, in the comment area.
If the user enters any vulgarities, the characters should be replaced by *. Example: "stupid" becomes "s*****".
I have been researching related websites, but without success. Suggestions or tutorials on this would be greatly appreciated.
There is no way to fully stop "bad language" from being used, but you can try to prevent it by creating a text file that contains one bad word per line. Then load the list of words from the file into a List<string> in your program. You can do so as follows:
// The list of swear words
List<string> swearWords = new List<string>();
private void GetSwearWords()
{
// Get the path to the file that has the swear words list
string path = <File Path>;
// Open the text file; the using block closes it when done
using (TextReader reader = new StreamReader(path))
{
// Loop through each line in the file.
string line;
while ((line = reader.ReadLine()) != null)
{
// Lower cases word and removes whitespaces
string word = line.Trim().ToLower();
// Adds the word to the list
swearWords.Add(word);
}
}
}
Then, to determine if a string has one of these bad words you do the following:
private bool HasSwearWord(string text)
{
// Splits words, removes whitespace and any punctuation
string[] wordArray = Regex.Split(text, @"\W+");
// Check if any word in the string is a swear word
foreach (string word in wordArray)
{
if (swearWords.Contains(word.ToLower()))
{
return true;
}
}
return false;
}
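The question also asked for the offending characters to be replaced with asterisks rather than just detected. Here is a minimal sketch of that, assuming the same word list as above; the names SwearFilter and Mask are made up for illustration, and keeping the first letter is only a guess based on the example in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SwearFilter
{
    // Replaces every listed word in the text with its first letter followed by asterisks.
    // The word list is supplied by the caller; matching ignores case.
    public static string Mask(string text, IEnumerable<string> swearWords)
    {
        var lookup = new HashSet<string>(swearWords, StringComparer.OrdinalIgnoreCase);

        // \w+ matches each run of word characters, so spacing and punctuation are preserved.
        return Regex.Replace(text, @"\w+", match =>
        {
            string word = match.Value;
            if (!lookup.Contains(word))
                return word;
            return word[0] + new string('*', word.Length - 1);
        });
    }
}
```

For example, SwearFilter.Mask(comment, swearWords) could be called just before the comment is saved. Like the detection code, this only catches exact words from the list, so deliberate misspellings will still get through.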
I am working on a final-year project. I have a file that contains some text. I need to get the words from this file that carry the "//jj" tag, e.g. abc//jj, bcd//jj, etc.
Suppose the file contains the following text:
ffafa adada//bb adad ssss//jj aad adad adadad aaada dsdsd//jj
dsdsd sfsfhf//vv
dfdfdf
I need all the words that are associated with the //jj tag. I have been stuck on this for the past few days.
This is the code I am trying:
// Create OpenFileDialog
Microsoft.Win32.OpenFileDialog dlg = new Microsoft.Win32.OpenFileDialog();
// Set filter for file extension and default file extension
dlg.DefaultExt = ".txt";
dlg.Filter = "Text documents (.txt)|*.txt";
// Display OpenFileDialog by calling ShowDialog method
Nullable<bool> result = dlg.ShowDialog();
// Get the selected file name and display in a TextBox
string filename = string.Empty;
if (result == true)
{
// Open document
filename = dlg.FileName;
FileNameTextBox.Text = filename;
}
string text;
using (var streamReader = new StreamReader(filename, Encoding.UTF8))
{
text = streamReader.ReadToEnd();
}
string FilteredText = string.Empty;
string pattern = @"(?<before>\w+) //jj (?<after>\w+)";
MatchCollection matches = Regex.Matches(text, pattern);
for (int i = 0; i < matches.Count; i++)
{
FilteredText="before:" + matches[i].Groups["before"].ToString();
//Console.WriteLine("after:" + matches[i].Groups["after"].ToString());
}
textbx.Text = FilteredText;
I can't get the result I expect. Please help me.
With LINQ you could do this with one line:
string[] taggedwords = input.Split(' ').Where(x => x.EndsWith(@"//jj")).ToArray();
And all your //jj words will be there...
Personally I think Regex is overkill if that's definitely how the string will look. You haven't specified that you definitely need to use Regex so why not try this instead?
// A list that will hold the words ending with '//jj'
List<string> results = new List<string>();
// The text you provided
string input = @"ffafa adada//bb adad ssss//jj aad adad adadad aaada dsdsd//jj dsdsd sfsfhf//vv dfdfdf";
// Split the string on the space character to get each word
string[] words = input.Split(' ');
// Loop through each word
foreach (string word in words)
{
// Does it end with '//jj'?
if(word.EndsWith(@"//jj"))
{
// Yes, add to the list
results.Add(word);
}
}
// Show the results
foreach(string result in results)
{
MessageBox.Show(result);
}
Results are:
ssss//jj
dsdsd//jj
Obviously this is not quite as robust as a regex, but you didn't provide any more detail for me to go on.
You have an extra space in your regex: it assumes there's a space before "//jj". What you want is:
string pattern = @"(?<before>\w+)//jj (?<after>\w+)";
This regular expression will yield the words you are looking for:
string pattern = "(\\S*)\\/\\/jj";
A bit nicer without backslash escaping:
(\S*)\/\/jj
Matches will include the //jj but you can get the word from the first bracketed group.
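To show how the bracketed group is read back out, here is a small sketch. TagExtractor and WordsTaggedJj are names I made up; I also used \S+ instead of \S* and added a lookahead so that something like "//jjx" is not picked up, which is my own variation rather than part of the pattern above.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagExtractor
{
    // Returns the words that are immediately followed by the "//jj" tag, without the tag itself.
    public static List<string> WordsTaggedJj(string text)
    {
        var words = new List<string>();
        // (\S+) captures the run of non-whitespace characters before the tag;
        // the lookahead (?!\S) requires the tag to end the token.
        foreach (Match match in Regex.Matches(text, @"(\S+)//jj(?!\S)"))
        {
            words.Add(match.Groups[1].Value);
        }
        return words;
    }
}
```

Because \S stops at any whitespace, this also works when the tagged word is followed by a line break rather than a space, which is the case for the last //jj word in the sample file.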
I am trying to read characters from a file and then append them to another file after removing the comments (which start at a semicolon).
Sample data from the parent file:
Name- Harley Brown ;Name is Harley Brown
Age- 20 ;Age is 20 years
Desired result:
Name- Harley Brown
Age- 20
I am trying the following code:
StreamReader infile = new StreamReader(floc + "G" + line + ".NC0");
while (infile.Peek() != -1)
{
letter = Convert.ToChar(infile.Read());
if (letter == ';')
{
infile.ReadLine();
}
else
{
System.IO.File.AppendAllText(path, Convert.ToString(letter));
}
}
But the output I am getting is:
Name- Harley Brown Age-20
I think it's because AppendAllText is not handling the newline. Is there any alternative?
Sure, why not use File.AppendAllLines? The documentation describes it as follows:
Appends lines to a file, and then closes the file. If the specified file does not exist, this method creates a file, writes the specified lines to the file, and then closes the file.
It takes in any IEnumerable<string> and adds every line to the specified file. So it always adds the line on a new line.
Small example:
const string originalFile = @"D:\Temp\file.txt";
const string newFile = @"D:\Temp\newFile.txt";
// Retrieve all lines from the file.
string[] linesFromFile = File.ReadAllLines(originalFile);
List<string> linesToAppend = new List<string>();
foreach (string line in linesFromFile)
{
// 1. Split the line at the semicolon.
// 2. Take the first index, because the first part is your required result.
// 3. Trim the trailing and leading spaces.
string appendAbleLine = line.Split(';').FirstOrDefault().Trim();
// Add the line to the list of lines to append.
linesToAppend.Add(appendAbleLine);
}
// Append all lines to the file.
File.AppendAllLines(newFile, linesToAppend);
Output:
Name- Harley Brown
Age- 20
You could even change the foreach-loop into a LINQ-expression, if you prefer LINQ:
List<string> linesToAppend = linesFromFile.Select(line => line.Split(';').FirstOrDefault().Trim()).ToList();
Why use char-by-char comparison when the .NET Framework is full of useful string manipulation functions?
Also, don't call a file write function many times when you can call it just once; it wastes time and resources!
string str = "";
using (StreamReader stream = new StreamReader("file1.txt"))
{
string line;
while ((line = stream.ReadLine()) != null) { // Get every line of the file.
line = line.Split(';')[0].Trim(); // Remove comment (right part of ;) and useless white characters.
str += line + "\n"; // Add it to our final file contents.
}
}
File.WriteAllText("file2.txt", str); // Write it to the new file.
You could do this with LINQ, System.IO.File.ReadLines(string), and System.IO.File.WriteAllLines(string, IEnumerable<string>). You could also use System.IO.File.AppendAllLines(string, IEnumerable<string>) if appending was, in fact, the functionality you were going for. The difference, as the names suggest, is whether it writes everything out as a new file or just appends to an existing one.
System.IO.File.WriteAllLines(newPath, System.IO.File.ReadLines(oldPath).Select(c =>
{
int semicolon = c.IndexOf(';');
if (semicolon > -1)
return c.Remove(semicolon);
else
return c;
}));
In case you aren't super familiar with LINQ syntax, the idea here is to loop through each line in the file, and if it contains a semicolon (that is, IndexOf returns something that is over -1) we cut that off, and otherwise, we just return the string. Then we write all of those to the file. The StreamReader equivalent to this would be:
using (StreamReader reader = new StreamReader(oldPath))
using (StreamWriter writer = new StreamWriter(newPath))
{
string line;
while ((line = reader.ReadLine()) != null)
{
int semicolon = line.IndexOf(';');
if (semicolon > -1)
line = line.Remove(semicolon);
writer.WriteLine(line);
}
}
Both versions end every line, including the last one, with a line terminator, so the two produce the same output.
Another important thing to note, just looking at your original file, you might want to add in some Trim calls, since it looks like you can have spaces before your semicolons, and I don't imagine you want those copied through.
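A minimal sketch of the trimmed version, assuming you only want to drop the whitespace that sits in front of the semicolon. CommentStripper, StripComment, and StripFile are illustrative names, not anything from the framework.

```csharp
using System.IO;
using System.Linq;

public static class CommentStripper
{
    // Removes everything from the first ';' onward, then trims trailing whitespace.
    public static string StripComment(string line)
    {
        int semicolon = line.IndexOf(';');
        string kept = semicolon >= 0 ? line.Substring(0, semicolon) : line;
        return kept.TrimEnd();
    }

    // Streams the source file line by line and writes the cleaned lines to the target file.
    public static void StripFile(string sourcePath, string targetPath)
    {
        File.WriteAllLines(targetPath, File.ReadLines(sourcePath).Select(line => StripComment(line)));
    }
}
```

I used TrimEnd rather than Trim so that any leading indentation in the original lines is kept; swap it for Trim if you want both ends cleaned.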
I know I've been a bit of a pain these last couple of days with all my questions, but I've been developing this project and I'm (figuratively) inches away from finishing it.
That being said, I would like your help on one more matter. It kind of relates to my previous questions, but you do not need the code for those. The problem lies exactly on this bit of code. What I want from you is to help me identify it and, consequently, solve it.
Before I show you the code I'd been working on, I'd like to say a few extra things:
My application has a file merging feature, merging two files together and handling duplicate entries.
In any given file, each line can have one of these four formats (the last three are optional): Card Name|Amount, .Card Name|Amount, ..Card Name|Amount, _Card Name|Amount.
If a line is not appropriately formatted, the program will skip it (ignore it altogether).
So, basically, a sample file could be as follows:
Blue-Eyes White Dragon|3
..Blue-Eyes Ultimate Dragon|1
.Dragon Master Knight|1
_Kaibaman|1
Now, when it comes to using the file merger, if a line starts with one of the special characters . .. _, it should act accordingly. For ., it operates normally. For lines starting with .., it moves the index to the second dot and, finally, it ignores _ lines completely (they have another use not related to this discussion).
Here is my code for the merge function (for some odd reason, the code inside the second loop won't execute at all):
if (openFileDialog1.ShowDialog() == DialogResult.OK)
{
// Save file names to array.
string[] fileNames = openFileDialog1.FileNames;
// Loop through the files.
foreach (string fileName in fileNames)
{
// Save all lines of the current file to an array.
string[] lines = File.ReadAllLines(fileName);
// Loop through the lines of the file.
for (int i = 0; i < lines.Length; i++)
{
// Split current line.
string[] split = lines[i].Split('|');
// If the current line is badly formatted, skip to the next one.
if (split.Length != 2)
continue;
string title = split[0];
string times = split[1];
if (lines[i].StartsWith("_"))
continue;
// If newFile (list used to store contents of the card resource file) contains the current line of the file that we're currently looping through...
for (int k = 0; k < newFile.Count; k++)
{
if (lines[i].StartsWith(".."))
{
string newTitle = lines[i].Substring(
lines[i].IndexOf("..") + 1);
if (newFile[k].Contains(newTitle))
{
// Split the line once again.
string[] secondSplit = newFile.ElementAt(
newFile.IndexOf(newFile[k])).Split('|');
string secondTimes = secondSplit[1];
// Replace the newFile element at the specified index.
newFile[newFile.IndexOf(newFile[k])] =
string.Format("{0}|{1}", newTitle, int.Parse(times) + int.Parse(secondTimes));
}
// If newFile does not contain the current line of the file we're looping through, just add it to newFile.
else
newFile.Add(string.Format(
"{0}|{1}",
newTitle, times));
continue;
}
if (newFile[k].Contains(title))
{
string[] secondSplit = newFile.ElementAt(
newFile.IndexOf(newFile[k])).Split('|');
string secondTimes = secondSplit[1];
newFile[newFile.IndexOf(newFile[k])] =
string.Format("{0}|{1}", title, int.Parse(times) + int.Parse(secondTimes));
}
else
{
newFile.Add(string.Format("{0}|{1}", title, times));
}
}
}
}
// Overwrite resources file with newFile.
using (StreamWriter sw = new StreamWriter("CardResources.ygodc"))
{
foreach (string line in newFile)
sw.WriteLine(line);
}
I know this is quite a long piece of code, but I believe all of it is relevant to a point. I skipped some unimportant bits (after all of this is executed) as they are completely irrelevant.
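One observation, assuming newFile starts out empty: the loop for (int k = 0; k < newFile.Count; k++) has nothing to iterate over, and since entries are only added inside that loop, it never gets a chance to run. A dictionary keyed by card name avoids the inner loop altogether. Below is a rough sketch of that idea; CardMerger and MergeLines are made-up names, and treating a non-numeric amount as a badly formatted line is my assumption.

```csharp
using System;
using System.Collections.Generic;

public static class CardMerger
{
    // Adds the amounts from the given lines into the totals, keyed by card name.
    public static void MergeLines(IEnumerable<string> lines, IDictionary<string, int> totals)
    {
        foreach (string line in lines)
        {
            // Lines starting with '_' are ignored, as described in the question.
            if (line.StartsWith("_", StringComparison.Ordinal))
                continue;

            string[] split = line.Split('|');
            int amount;
            // Skip badly formatted lines (assumption: a non-numeric amount also counts as bad).
            if (split.Length != 2 || !int.TryParse(split[1], out amount))
                continue;

            // For "..Name", drop the first dot so the title becomes ".Name".
            string title = split[0].StartsWith("..", StringComparison.Ordinal)
                ? split[0].Substring(1)
                : split[0];

            // TryGetValue leaves existing at 0 when the title is not in the dictionary yet.
            int existing;
            totals.TryGetValue(title, out existing);
            totals[title] = existing + amount;
        }
    }
}
```

You would call MergeLines once for the current contents of the resource file and once per selected file, then write each entry back out as string.Format("{0}|{1}", name, amount). Note that a Dictionary does not promise to keep insertion order, so sort the entries if the order in the file matters.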
I have a .txt file with a list of 174 different strings. Each string has a unique identifier.
For example:
123|this data is variable|
456|this data is variable|
789|so is this|
etc..
I wish to write a program in C# that will read the .txt file and display only one of the 174 strings when I specify the ID of the string I want. All the other data in the file is variable, so only the ID can be used to pull the string. So instead of ending up with the example above, I get just one line.
E.g. just:
123|this data is variable|
I seem to be able to write a program that pulls just the ID from the .txt file and not the entire string, or a program that merely reads the whole file and displays it, but I have yet to write one that does exactly what I need. HELP!
The actual string I get out of the .txt file has no '|'; they were just in the example. An example of the real string would be 0111111(0010101), where the data in the brackets is variable. The brackets don't exist in the real string either.
namespace String_reader
{
class Program
{
static void Main(string[] args)
{
String filepath = @"C:\my file name here";
string line;
if(File.Exists(filepath))
{
StreamReader file = null;
try
{
file = new StreamReader(filepath);
while ((line = file.ReadLine()) !=null)
{
string regMatch = "ID number here"; //this is where it all falls apart.
Regex.IsMatch (line, regMatch);
Console.WriteLine (line);// When program is run it just displays the whole .txt file
}
}
finally{
if (file !=null)
file.Close();
}
}
Console.ReadLine();
}
}
}
Use a Regex. Something along the lines of Regex.Match("|"+inputString+"|",@"\|[ ]*\d+\|(.+?)\|").Groups[1].Value
Oh, I almost forgot; you'll need to substitute the \d+ with the actual index you want. Right now, that'll just get you the first one.
The "|" before and after the input string makes sure both the index and the value are enclosed in a | for all elements, including the first and last. There's ways of doing a Regex without it, but IMHO they just make your regex more complicated, and less readable.
Assuming you have path and id.
Console.WriteLine(File.ReadAllLines(path).Where(l => l.StartsWith(id + "|")).FirstOrDefault());
Use File.ReadAllLines to get a string array of lines, then split each string on the |.
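You could use the Split method:
FileInfo info = new FileInfo("filename.txt");
string[] lines;
using (StreamReader reader = info.OpenText())
{
// Split on line breaks, not spaces: the text itself contains spaces
lines = reader.ReadToEnd().Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
}
foreach (string line in lines)
{
string[] parts = line.Split('|');
int id = Convert.ToInt32(parts[0]);
string text = parts[1];
}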
Read the data into a string
Split the string on "|"
Read the items 2 by 2: key:value,key:value,...
Add them to a dictionary
Now you can easily find your string with dictionary[key].
First load the whole file into a string.
Then try this:
string s = "123|this data is variable| 456|this data is also variable| 789|so is this|";
int index = s.IndexOf("123", 0);
string temp = s.Substring(index,s.Length-index);
string[] splitStr = temp.Split('|');
Console.WriteLine(splitStr[1]);
Hope this is what you are looking for.
private static IEnumerable<string> ReadLines(string fspec)
{
using (var reader = new StreamReader(new FileStream(fspec, FileMode.Open, FileAccess.Read, FileShare.Read)))
{
while (!reader.EndOfStream)
yield return reader.ReadLine();
}
}
var dict = ReadLines("input.txt")
.Select(s =>
{
var split = s.Split("|".ToArray(), 2);
return new {Id = Int32.Parse(split[0]), Text = split[1]};
})
.ToDictionary(kv => kv.Id, kv => kv.Text);
Please note that with .NET 4.0 you don't need the ReadLines function above, because there is File.ReadLines.
You can now work with that as any dictionary:
Console.WriteLine(dict[12]);
Console.WriteLine(dict[999]);
No error handling here, please add your own
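As a rough idea of what that error handling could look like, here is a sketch that skips lines without a numeric ID and returns null for an unknown ID instead of throwing. IdLookup, Parse, and Find are names invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class IdLookup
{
    // Builds an id -> text dictionary from lines shaped like "123|some text|".
    // Lines without a numeric id before the first '|' are skipped.
    public static Dictionary<int, string> Parse(IEnumerable<string> lines)
    {
        var result = new Dictionary<int, string>();
        foreach (string line in lines)
        {
            string[] split = line.Split(new[] { '|' }, 2);
            int id;
            if (split.Length == 2 && int.TryParse(split[0], out id))
                result[id] = split[1]; // a repeated id overwrites the earlier entry
        }
        return result;
    }

    // Returns the text for the id, or null when the id is not present.
    public static string Find(Dictionary<int, string> dict, int id)
    {
        string text;
        return dict.TryGetValue(id, out text) ? text : null;
    }
}
```

As in the code above, the text keeps everything after the first '|', including the trailing one; trim it off if you don't want it.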
You can use the Split method to divide the entire text into parts separated by '|'. Then all even-indexed elements will correspond to the numbers and all odd-indexed elements to the strings.
StreamReader sr = new StreamReader(filename);
string text = sr.ReadToEnd();
string[] data = text.Split('|');
Then convert the data elements to numbers and strings, i.e. a List<int> IDs and a List<string> Strs. Find the index of the given ID with idx = IDs.IndexOf(ID) and the corresponding string will be Strs[idx]:
List<int> IDs = new List<int>();
List<string> Strs = new List<string>();
for (int i = 0; i < data.Length - 1; i += 2)
{
IDs.Add(int.Parse(data[i]));
Strs.Add(data[i + 1]);
}
int idx = IDs.IndexOf(ID); // we get ID from input; -1 means the ID was not found
string answer = idx >= 0 ? Strs[idx] : null;
I have a file created from a directory listing. For each item a user selects from a ListBox, the application reads the directory and writes out a file listing all the matching contents. Once that is done, it goes through each item in the ListBox and copies out the entries that match the ListBox selection. Example:
Selecting 0001 matches:
0001456.txt
0001548.pdf
The code I am using isn't handling 0s very well and is giving bad results.
var listItems = listBox1.Items.OfType<string>().ToArray();
var writers = new StreamWriter[listItems.Length];
for (int i = 0; i < listItems.Length; i++)
{
writers[i] = File.CreateText(
Path.Combine(destinationfolder, listItems[i] + "ANN.TXT"));
}
var reader = new StreamReader(File.OpenRead(masterdin + "\\" + "MasterANN.txt"));
string line;
while ((line = reader.ReadLine()) != null)
{
for (int i = 0; i < listItems.Length; i++)
{
if (line.StartsWith(listItems[i].Substring(0, listItems[i].Length - 1)))
writers[i].WriteLine(line);
}
}
Advice for correcting this?
Another Sample:
I have 00001 in my listbox: it returns these values:
00008771~63.txt
00002005~3.txt
00009992~1.txt
00001697~1.txt
00000001~1.txt
00009306~2.txt
00000577~1.txt
00001641~1.txt
00001647~1.txt
00001675~1.txt
00001670~1.txt
It should only return:
00001641~1.txt
00001647~1.txt
00001675~1.txt
00001670~1.txt
00001697~1.txt
Or if someone could just suggest a better method for taking each line in my listbox searching for line + "*" and whatever matches writes out a textfile...
This is based pretty much on the one example you gave, but I believe the problem is that when you perform your matching, you take a substring of your list item value and chop off the last character.
In your sample you are attempting to match files starting with "00001", but when you do the match you take the substring starting at zero with value.Length - 1 characters, which in this case would be "0000". For example:
string s = "00001";
Console.WriteLine(s.Substring(0,s.Length-1));
results in
0000
So I think if you just changed this line:
if (line.StartsWith(listItems[i].Substring(0, listItems[i].Length - 1)))
writers[i].WriteLine(line);
to this
if (line.StartsWith(listItems[i]))
writers[i].WriteLine(line);
you would be in good shape.
Sorry if I misunderstood your question, but let's start with this:
string line = String.Empty;
string selectedValue = "00001";
List<string> matched = new List<string>();
StreamReader reader = new StreamReader(Path.Combine(masterdin, "MasterANN.txt"));
while((line = reader.ReadLine()) != null)
{
if(line.StartsWith(selectedValue))
{
matched.Add(line);
}
}
This will match all lines from your MasterANN.txt file that begin with "00001" and add them to a collection (later we'll work on writing this into a file, if required).
Does this clarify things?
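If you need this for every item in the ListBox at once, the same StartsWith check can be run per prefix. A minimal sketch, with PrefixMatcher and Match being names I picked for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixMatcher
{
    // Groups the lines by the prefix they start with; a line can match more than one prefix.
    public static Dictionary<string, List<string>> Match(IEnumerable<string> lines, IEnumerable<string> prefixes)
    {
        var result = prefixes.Distinct().ToDictionary(p => p, p => new List<string>());
        foreach (string line in lines)
        {
            foreach (var pair in result)
            {
                // Ordinal comparison compares the characters exactly, leading zeros included.
                if (line.StartsWith(pair.Key, StringComparison.Ordinal))
                    pair.Value.Add(line);
            }
        }
        return result;
    }
}
```

Each entry could then be written out with something like File.WriteAllLines(Path.Combine(destinationfolder, prefix + "ANN.TXT"), matches), reusing the file naming from the question.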