Replace string with multiple different options - c#

Hi there, wonderful people of Stack Overflow.
I am currently totally stuck. What I want to do is take a word out of a text and replace it with a synonym. I thought about it for a while and figured out how to do it if I have ONLY one possible synonym, with this code:
string pathToDesk = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
string text = System.IO.File.ReadAllText(pathToDesk + "/Text.txt");
string replacementsText = System.IO.File.ReadAllText(pathToDesk + "/Replacements.txt");
string wordsToReplace = System.IO.File.ReadAllText(pathToDesk + "/WordsToReplace.txt");
string[] words = text.Split(' ');
string[] reWords = wordsToReplace.Split(' ');
string[] replacements = replacementsText.Split(' ');
for(int i = 0; i < words.Length; i++) {//for each word
for(int j = 0; j < replacements.Length; j++) {//compare with the possible synonyms
if (words[i].Equals(reWords[j], StringComparison.InvariantCultureIgnoreCase)) {
words[i] = replacements[j];
}
}
}
string newText = "";
for(int i = 0; i < words.Length; i++) {
newText += words[i] + " ";
}
txfInput.Text = newText;
But let's say we get the word hi. Then I want to be able to replace it with any of {"Hello","Yo","Hola"} (for example).
In that case my code is no good, since the words and their replacements no longer share the same position in the arrays.
Is there a smart solution to this? I would really like to know.

You need to store your synonyms differently.
In your file you need something like
hello yo hola hi
awesome fantastic great
Then, for each line, split the words and put them in an array, giving you an array of arrays.
Now use that to find replacement words.
This won't be super optimized, but you can easily index each word to its group of synonyms as well.
Something like
public class SynonymReplacer
{
private Dictionary<string, List<string>> _synonyms;
public void Load(string s)
{
_synonyms = new Dictionary<string, List<string>>();
var lines = s.Split(new[] {'\r', '\n'}, StringSplitOptions.RemoveEmptyEntries);
foreach (var line in lines)
{
var words = line.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries).ToList();
words.ForEach(word => _synonyms.Add(word, words));
}
}
public string Replace(string word)
{
if (_synonyms.ContainsKey(word))
{
return _synonyms[word].OrderBy(a => Guid.NewGuid())
.FirstOrDefault(w => w != word) ?? word;
}
return word;
}
}
The OrderBy gets you a random synonym...
then
var s = new SynonymReplacer();
s.Load("hi hello yo hola\r\nawesome fantastic great\r\n");
Console.WriteLine(s.Replace("hi"));
Console.WriteLine(s.Replace("ok"));
Console.WriteLine(s.Replace("awesome"));
var words = new string[] {"hi", "you", "look", "awesome"};
Console.WriteLine(string.Join(" ", words.Select(s.Replace)));
and you get something like this (the synonyms are picked at random, so your output may differ):
hello
ok
fantastic
hello you look fantastic

Your first task will be to build a list of words and synonyms. A Dictionary will be perfect for this. The text file containing this list might look like this:
word1|synonym11,synonym12,synonym13
word2|synonym21,synonym22,synonym23
word3|synonym31,synonym32,synonym33
Then you can construct the dictionary like this:
public Dictionary<string, string[]> GetSynonymSet(string synonymSetTextFileFullPath)
{
var dict = new Dictionary<string, string[]>();
string line;
// Read the file line by line and add each word with its synonyms.
using (var file = new StreamReader(synonymSetTextFileFullPath))
{
while((line = file.ReadLine()) != null)
{
var split = line.Split('|');
if (!dict.ContainsKey(split[0]))
{
dict.Add(split[0], split[1].Split(','));
}
}
}
return dict;
}
The eventual code will look like this
public string ReplaceWordsInText(Dictionary<string, string[]> synonymSet, string text)
{
var newText = new StringBuilder();
string[] words = text.Split(' ');
for (int i = 0; i < words.Length; i++) //for each word
{
string[] synonyms;
if (synonymSet.TryGetValue(words[i], out synonyms))
{
// The exact synonym you wish to use is up to you.
// I will just use the first one
words[i] = synonyms[0];
}
newText.AppendFormat("{0} ", words[i]);
}
return newText.ToString();
}
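The comment above leaves the choice of synonym open. As a minimal sketch of one option, the following picks a synonym at random with a shared Random instance. The names SynonymPicker and Pick are hypothetical, and the case-insensitive dictionary comparer is an assumption, chosen only because the question compares words ignoring case.

```csharp
using System;
using System.Collections.Generic;

public static class SynonymPicker
{
    // One shared Random instance (an assumption of this sketch, not part of the answer).
    private static readonly Random Rng = new Random();

    // Returns a randomly chosen synonym, or the word itself if none is known.
    public static string Pick(Dictionary<string, string[]> synonymSet, string word)
    {
        string[] synonyms;
        if (synonymSet.TryGetValue(word, out synonyms) && synonyms.Length > 0)
        {
            return synonyms[Rng.Next(synonyms.Length)];
        }
        return word;
    }

    public static void Main()
    {
        // Case-insensitive keys, so "Hi" and "hi" find the same entry.
        var synonymSet = new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase)
        {
            { "hi", new[] { "Hello", "Yo", "Hola" } }
        };

        Console.WriteLine(Pick(synonymSet, "Hi"));    // one of Hello, Yo, Hola
        Console.WriteLine(Pick(synonymSet, "there")); // there
    }
}
```

In ReplaceWordsInText, the line that assigns synonyms[0] could call a helper like this instead.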

Related

Remove Identical Words from a string array

The goal is to remove a certain prefix word from a string in string array example: ["Market1", "Market2", "Market3"]. The prefix word Market is dominant in string array, so we have to remove Market from string array so the result should be ["1", "2", "3"]. Please take note that the Market prefix word in string could be anything.
Look for the first character that is not identical among all strings and select a substring starting at that position to remove the prefix.
string[] words = new string[] { "Market1", "Market2", "Market3" };
int i = 0;
while (words.All(word => word.Length > i && word[i] == words[0][i])) ++i;
var wordsWithoutPrefixes = words.Select(word => word.Substring(i)).ToArray();
Make a delimited string and then replace all the Market with an empty string and then split the string to an array.
string[] arr = new string[] { "Market1", "Market2", "Market3" };
string[] result = string.Join(".", arr).Replace("Market", "").Split('.');
Loop through each item in the array and for each item chop off the beginning the start matches.
var commonPrefix = "Market";
for (int i = 0; i < arr.Length; i++) {
if(arr[i].IndexOf(commonPrefix) == 0) {
arr[i] = arr[i].Substring(commonPrefix.Length);
}
}
You can use LINQ:
string[] myArray = { "Market1", "Market2", "Market3" };
string prefix = myArray[0];
foreach (var s in myArray)
{
while (!s.StartsWith(prefix))
prefix = prefix.Substring(0, prefix.Length - 1);
}
string[] result = myArray
.Select(s => s.Substring(prefix.Length))
.ToArray();
Loop through the array of string and replace the substring containing prefix with an empty substring.
string[] s=new string[]{"Market1","Market2","Market3"};
string prefix="Market";
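// A foreach iteration variable cannot be assigned to, so index into the array instead.
for (int i = 0; i < s.Length; i++)
{
if (s[i].Contains(prefix))
{
s[i] = s[i].Replace(prefix, "");
}
}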

String split by every 3 words

I've got a problem. I need to split each of my strings like this:
For example:
"Economic drive without restrictions"
I need an array of substrings like this:
"Economic drive without"
"drive without restrictions"
For now I have this:
List<string> myStrings = new List<string>();
foreach(var text in INPUT_TEXT) //here is Economic drive without restrictions
{
myStrings.DefaultIfEmpty();
var textSplitted = text.Split(new char[] { ' ' });
int j = 0;
foreach(var textSplit in textSplitted)
{
int i = 0 + j;
string threeWords = "";
while(i != 3 + j)
{
if (i >= textSplitted.Count()) break;
threeWords = threeWords + " " + textSplitted[i];
i++;
}
myStrings.Add(threeWords);
j++;
}
}
You could use this LINQ query:
string text = "Economic drive without restrictions";
string[] words = text.Split();
List<string> myStrings = words
.Where((word, index) => index + 3 <= words.Length)
.Select((word, index) => String.Join(" ", words.Skip(index).Take(3)))
.ToList();
Because others commented that it would be better to show a loop version since OP is learning this language, here is a version that uses no LINQ at all:
List<string> myStrings = new List<string>();
for (int index = 0; index + 3 <= words.Length; index++)
{
string[] slice = new string[3];
Array.Copy(words, index, slice, 0, 3);
myStrings.Add(String.Join(" ", slice));
}
I'll try to give a simple solution, so I hope you can understand it better.
List<string> myStrings = new List<string>();
string input = "Economic drive without restrictions";
var allWords = input.Split(new char[] {' '});
for (int i = 0; i < allWords.Length - 2; i++)
{
var textSplitted = allWords.Skip(i).Take(3);
string threeString = string.Join(" ", textSplitted);
myStrings.Add(threeString);
}
foreach (var myString in myStrings)
{
Console.WriteLine(myString);
}
Take(n) is a LINQ method that returns the first n elements of a sequence. For example, given the array a,b,c,d,e, Take(3) yields a,b,c.
Skip(n) is also a LINQ method. It returns the sequence without its first n elements: given a,b,c,d,e, Skip(1) yields b,c,d,e.
Combining these two methods, you can slide a window of 3 words across the array and get the words you want.
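A short sketch of the two calls described above, on the same a,b,c,d,e example. The class name SkipTakeDemo and the helper Window are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq;

public static class SkipTakeDemo
{
    // Joins `size` words starting at position `start` into one string.
    public static string Window(string[] words, int start, int size)
    {
        return string.Join(" ", words.Skip(start).Take(size));
    }

    public static void Main()
    {
        string[] letters = { "a", "b", "c", "d", "e" };

        Console.WriteLine(string.Join(",", letters.Take(3)));         // a,b,c
        Console.WriteLine(string.Join(",", letters.Skip(1)));         // b,c,d,e
        Console.WriteLine(string.Join(",", letters.Skip(1).Take(3))); // b,c,d

        string[] words = "Economic drive without restrictions".Split(' ');
        Console.WriteLine(Window(words, 1, 3)); // drive without restrictions
    }
}
```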
Just for comparative purposes, here's another solution that doesn't use Linq:
string[] words = INPUT_TEXT.Split();
List<string> myStrings = new List<string>();
for (int i = 0; i < words.Length - 2; ++i)
myStrings.Add(string.Join(" ", words[i], words[i+1], words[i+2]));
Or using ArraySegment<string>:
string[] words = INPUT_TEXT.Split();
List<string> myStrings = new List<string>();
for (int i = 0; i < words.Length - 2; ++i)
myStrings.Add(string.Join(" ", new ArraySegment<string>(words, i, 3)));
I would use one of the methods described here; for instance, the following, which takes the elements 3 by 3.
var groups = myStrings.Select((p, index) => new {p,index})
.GroupBy(a =>a.index/3);
Warning, it is not the most memory efficient, if you start parsing big strings, it might blow up on you. Try and observe.
Then you only need to handle the last element. If it has less than 3 strings, fill it up from the left.

Split string by commas ignoring any punctuation marks (including ',') in quotation marks

How can I split string (from a textbox) by commas excluding those in double quotation marks (without getting rid of the quotation marks), along with other possible punctuation marks (e.g. ' . ' ' ; ' ' - ')?
E.g. If someone entered the following into the textbox:
apple, orange, "baboons, cows", rainbow, "unicorns, gummy bears"
How can I split the above string into the following (say, into a List)?
apple
orange
"baboons, cows"
rainbow
"Unicorns, gummy bears..."
Thank you for your help!
You could try the below regex which uses positive lookahead,
string value = @"apple, orange, ""baboons, cows"", rainbow, ""unicorns, gummy bears""";
string[] lines = Regex.Split(value, @", (?=(?:""[^""]*?(?: [^""]*)*))|, (?=[^"",]+(?:,|$))");
foreach (string line in lines) {
Console.WriteLine(line);
}
Output:
apple
orange
"baboons, cows"
rainbow
"unicorns, gummy bears"
Try this:
Regex str = new Regex("(?:^|,)(\"(?:[^\"]+|\"\")*\"|[^,]*)", RegexOptions.Compiled);
foreach (Match m in str.Matches(input))
{
Console.WriteLine(m.Value.TrimStart(','));
}
You may also try to look at FileHelpers
Much like a CSV parser, instead of Regex, you can loop through each character, like so:
public List<string> ItemStringToList(string inputString)
{
var itemList = new List<string>();
var currentIem = "";
var quotesOpen = false;
for (int i = 0; i < inputString.Length; i++)
{
if (inputString[i] == '"')
{
quotesOpen = !quotesOpen;
continue;
}
if (inputString[i] == ',' && !quotesOpen)
{
itemList.Add(currentIem);
currentIem = "";
continue;
}
if (currentIem == "" && inputString[i] == ' ') continue;
currentIem += inputString[i];
}
if (currentIem != "") itemList.Add(currentIem);
return itemList;
}
Example test usage:
var test1 = ItemStringToList("one, two, three");
var test2 = ItemStringToList("one, \"two\", three");
var test3 = ItemStringToList("one, \"two, three\"");
var test4 = ItemStringToList("one, \"two, three\", four, \"five six\", seven");
var test5 = ItemStringToList("one, \"two, three\", four, \"five six\", seven");
var test6 = ItemStringToList("one, \"two, three\", four, \"five six, seven\"");
var test7 = ItemStringToList("\"one, two, three\", four, \"five six, seven\"");
You could change it to use StringBuilder if you want faster character joining.
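A minimal sketch of the same character loop with a StringBuilder. Two things here are assumptions that go beyond the answer: the quote characters are kept in the output (the question asked for that), and each item is trimmed rather than skipping leading spaces one at a time. The names QuoteAwareSplitter and SplitOutsideQuotes are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuoteAwareSplitter
{
    public static List<string> SplitOutsideQuotes(string input)
    {
        var items = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        foreach (char c in input)
        {
            if (c == '"')
            {
                // Toggle the quoted state, but keep the quote character itself.
                inQuotes = !inQuotes;
                current.Append(c);
            }
            else if (c == ',' && !inQuotes)
            {
                // A comma outside quotes ends the current item.
                items.Add(current.ToString().Trim());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // Add the last item, if there is one.
        string last = current.ToString().Trim();
        if (last.Length > 0) items.Add(last);
        return items;
    }

    public static void Main()
    {
        string input = "apple, orange, \"baboons, cows\", rainbow, \"unicorns, gummy bears\"";
        foreach (string item in SplitOutsideQuotes(input))
        {
            Console.WriteLine(item);
        }
    }
}
```

Unlike ItemStringToList above, this sketch adds an empty string for two consecutive commas; filter those out if they are not wanted.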
Try this. You can split the strings in an array in several ways; if you want to split by white space, just pass a space (' ') to Split.
namespace LINQExperiment1
{
class Program
{
static void Main(string[] args)
{
string[] sentence = new string[] { "apple", "orange", "baboons cows", " rainbow", "unicorns gummy bears" };
Console.WriteLine("option 1:"); Console.WriteLine("————-");
// option 1: Select returns three string[]’s with
// three strings in each.
IEnumerable<string[]> words1 =
sentence.Select(w => w.Split(' '));
// to get each word, we have to use two foreach loops
foreach (string[] segment in words1)
foreach (string word in segment)
Console.WriteLine(word);
Console.WriteLine();
Console.WriteLine("option 2:"); Console.WriteLine("————-");
// option 2: SelectMany returns nine strings
// (sub-iterates the Select result)
IEnumerable<string> words2 =
sentence.SelectMany(segment => segment.Split(','));
// with SelectMany we have every string individually
foreach (var word in words2)
Console.WriteLine(word);
// option 3: identical to Opt 2 above written using
// the Query Expression syntax (multiple froms)
IEnumerable<string> words3 =from segment in sentence
from word in segment.Split(' ')
select word;
}
}
}
This was trickier than I thought, a good practical problem I think.
Below is the solution I came up with for this. One thing I don't like about my solution is having to add double quotations back and the other one being names of the variables :p:
internal class Program
{
private static void Main(string[] args)
{
string searchString =
@"apple, orange, ""baboons, cows. dogs- hounds"", rainbow, ""unicorns, gummy bears"", abc, defghj";
char delimeter = ',';
char excludeSplittingWithin = '"';
string[] splittedByExcludeSplittingWithin = searchString.Split(excludeSplittingWithin);
List<string> splittedSearchString = new List<string>();
for (int i = 0; i < splittedByExcludeSplittingWithin.Length; i++)
{
if (i == 0 || splittedByExcludeSplittingWithin[i].StartsWith(delimeter.ToString()))
{
string[] splitttedByDelimeter = splittedByExcludeSplittingWithin[i].Split(delimeter);
for (int j = 0; j < splitttedByDelimeter.Length; j++)
{
splittedSearchString.Add(splitttedByDelimeter[j].Trim());
}
}
else
{
splittedSearchString.Add(excludeSplittingWithin + splittedByExcludeSplittingWithin[i] +
excludeSplittingWithin);
}
}
foreach (string s in splittedSearchString)
{
if (s.Trim() != string.Empty)
{
Console.WriteLine(s);
}
}
Console.ReadKey();
}
}
Another Regex solution:
private static IEnumerable<string> Parse(string input)
{
// if used frequently, should be instantiated with Compiled option
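Regex regex = new Regex(@"(?<=^|,\s)(\""(?:[^\""]|\""\"")*\""|[^,\s]*)");
// Cast is needed because MatchCollection is non-generic on older frameworks; requires System.Linq.
return regex.Matches(input).Cast<Match>().Where(m => m.Success).Select(m => m.Value);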
}

split string in to several strings at specific points

I have a text file with lines of text laid out like so
12345MLOL68
12345MLOL68
12345MLOL68
I want to read the file and add commas to the 5th point, 6th point and 9th point and write it to a different text file so the result would be.
12345,M,LOL,68
12345,M,LOL,68
12345,M,LOL,68
This is what I have so far
public static void ToCSV(string fileWRITE, string fileREAD)
{
int count = 0;
string x = "";
StreamWriter commas = new StreamWriter(fileWRITE);
string FileText = new System.IO.StreamReader(fileREAD).ReadToEnd();
var dataList = new List<string>();
IEnumerable<string> splitString = Regex.Split(FileText, "(.{1}.{5})").Where(s => s != String.Empty);
foreach (string y in splitString)
{
dataList.Add(y);
}
foreach (string y in dataList)
{
x = (x + y + ",");
count++;
if (count == 3)
{
x = (x + "NULL,NULL,NULL,NULL");
commas.WriteLine(x);
x = "";
count = 0;
}
}
commas.Close();
}
The problem I'm having is trying to figure out how to split the original string lines I read in at several points. The line
IEnumerable<string> splitString = Regex.Split(FileText, "(.{1}.{5})").Where(s => s != String.Empty);
Is not working in the way I want to. It's just adding up the 1 and 5 and splitting all strings at the 6th char.
Can anyone help me split each string at specific points?
Simpler code:
public static void ToCSV(string fileWRITE, string fileREAD)
{
string[] lines = File.ReadAllLines(fileREAD);
string[] splitLines = lines.Select(s => Regex.Replace(s, "(.{5})(.)(.{3})(.*)", "$1,$2,$3,$4")).ToArray();
File.WriteAllLines(fileWRITE, splitLines);
}
Just insert at the right place in descending order like this.
string str = "12345MLOL68";
int[] indices = {5, 6, 9};
indices = indices.OrderByDescending(x => x).ToArray();
foreach (var index in indices)
{
str = str.Insert(index, ",");
}
We insert in descending order because inserting at a lower index first shifts every later position by one, which makes the remaining indices hard to track.
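To make the effect of the ordering concrete, here is a small sketch that runs the same inserts in both orders. The class name InsertOrderDemo and the helper InsertCommas are hypothetical.

```csharp
using System;
using System.Linq;

public static class InsertOrderDemo
{
    // Inserts a comma at each index, visiting the indices in the requested order.
    public static string InsertCommas(string text, int[] indices, bool descending)
    {
        var ordered = descending
            ? indices.OrderByDescending(x => x)
            : indices.OrderBy(x => x);

        foreach (int index in ordered)
        {
            text = text.Insert(index, ",");
        }
        return text;
    }

    public static void Main()
    {
        int[] indices = { 5, 6, 9 };

        // Descending: earlier positions are untouched by later inserts.
        Console.WriteLine(InsertCommas("12345MLOL68", indices, true));  // 12345,M,LOL,68

        // Ascending: each insert shifts the positions that follow it.
        Console.WriteLine(InsertCommas("12345MLOL68", indices, false)); // 12345,,ML,OL68
    }
}
```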
Why not use Substring? For example:
string editedString = input.Substring(0, 5) + "," + input.Substring(5, 1) + "," + input.Substring(6, 3) + "," + input.Substring(9);
This should suit your needs.

C# random char between first letter to special char [=]

My problem is with getting values from a .txt file.
I have this, for example (all on one line, without line breaks):
damage=20 big explosion=50 rangeweapon=50.0
and I want to get the values after "=", that is, to build a string[] with something like this:
damage=20
big explosion=50
rangeweapon=50.0
I have another mechanism already, but I want a universal way to get all the values into a string[] and then just check them in a switch.
Thank you.
I have tried to solve your problem with regex. I found a solution, though it is not the best one.
Maybe it can help or guide you toward a better solution.
Please try it like this:
string operation = "damage=20 big explosion=50 rangeweapon=50.0";
// Splits around the values; the last element of the resulting array is empty, so remove it.
string[] wordText = Regex.Split(operation, @"(\=\d+\.?\d?)");
// Splits around the names; the first element of the resulting array is empty, so remove it.
string[] wordValue = Regex.Split(operation, @"(\s*\D+\=)");
After that you can join or do anything you want with those array.
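As an alternative sketch, a single pattern with two capture groups can pull out each name and value in one pass. This is not the answerer's method; the pattern, the class name KeyValueParser, and the use of the invariant culture for parsing are all assumptions. It also assumes values contain no whitespace and names contain no "=".

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class KeyValueParser
{
    public static Dictionary<string, double> Parse(string input)
    {
        var result = new Dictionary<string, double>();

        // Group 1: the name (everything up to '='). Group 2: the value (no whitespace).
        foreach (Match m in Regex.Matches(input, @"\s*([^=]+)=(\S+)"))
        {
            string key = m.Groups[1].Value.Trim();
            // Invariant culture so that "50.0" parses the same on every machine.
            double value = double.Parse(m.Groups[2].Value, CultureInfo.InvariantCulture);
            result[key] = value;
        }
        return result;
    }

    public static void Main()
    {
        var parsed = Parse("damage=20 big explosion=50 rangeweapon=50.0");
        foreach (var pair in parsed)
        {
            Console.WriteLine("{0} = {1}", pair.Key, pair.Value);
        }
    }
}
```

Like the other answers here, this has no error handling: a value that is not a number makes double.Parse throw.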
This should parse the string you describe, but keep in mind it isn't very robust and has no error handling.
string stringToParse = "damage=20 big explosion=50 rangeweapon=50.0";
string[] values = stringToParse.Split(' ');
Dictionary<string, double> parsedValues = new Dictionary<string, double>();
string temp = "";
foreach (var value in values)
{
if (value.Contains('='))
{
string[] keyValue = value.Split('=');
parsedValues.Add(temp + keyValue[0], Double.Parse(keyValue[1]));
temp = string.Empty;
}
else
{
temp += value + ' ';
}
}
After this, the parsedValues dictionary should have the information you're looking for.
I'm not an expert about it, but what about using a Regex ?
Not the cleanest code in the world, but will work for your situation.
string input = "damage=20 big explosion=50 rangeweapon=50.0";
string[] parts = input.Split('=');
Dictionary<string, double> dict = new Dictionary<string, double>();
for (int i = 0; i < (parts.Length - 1); i++)
{
string key = i==0?parts[i]:parts[i].Substring(parts[i].IndexOf(' '));
string value = i==parts.Length-2?parts[i+1]:parts[i + 1].Substring(0, parts[i + 1].IndexOf(' '));
dict.Add(key.Trim(), Double.Parse(value));
}
foreach (var el in dict)
{
Console.WriteLine("Key {0} contains value {1}", el.Key, el.Value);
}
Console.ReadLine();
You want to read numbers from text. You can save your data in the text file like this:
damage=20,big explosion=50,rangeweapon=50
and read it with File.ReadAllLines().
// filePath is a placeholder for the path to your text file.
string[] lines = File.ReadAllLines(filePath);
foreach (string line in lines)
{
string[] myArray = line.Split(',');
foreach (string item in myArray)
{
// Strip everything except digits and the decimal point.
string x = Regex.Replace(item, "[^0-9.]", "");
Console.WriteLine(x);
}
}
