Checking for and removing any characters in a string - c#

I am wondering what would be the best way to specify an array of characters like,
{
}
[
]
and then check a string for these and if they are there, to completely remove them.
if (compiler.Parser.GetErrors().Count == 0)
{
AstNode root = compiler.Parse(phrase.ToLower());
if (compiler.Parser.GetErrors().Count == 0)
{
try
{
fTextSearch = SearchGrammar.ConvertQuery(root, SearchGrammar.TermType.Inflectional);
}
catch
{
fTextSearch = phrase;
}
}
else
{
fTextSearch = phrase;
}
}
else
{
fTextSearch = phrase;
}
string[] brackets = new string[]
{
"{",
"}",
"[",
"]"
};
string[] errorChars = new string[]
{
"'",
"&"
};
StringBuilder sb = new StringBuilder();
string[] splitString = fTextSearch.Split(errorChars, StringSplitOptions.None);
int numNewCharactersAdded = 0;
foreach (string itm in splitString)
{
sb.Append(itm); //append string
if (fTextSearch.Length > (sb.Length - numNewCharactersAdded))
{
sb.Append(fTextSearch[sb.Length - numNewCharactersAdded]); //append splitting character
sb.Append(fTextSearch[sb.Length - numNewCharactersAdded - 1]); //append it again
numNewCharactersAdded++;
}
}
string newString = sb.ToString();

A regular expression can do this far more easily:
var result = Regex.Replace(input, @"[[\]()]", "");
This uses a character set ([...]) to match any one of the characters in it and replaces each match with nothing. Regex.Replace replaces all matches. Add { and } to the set to remove braces as well.
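As a minimal sketch, the same character-set idea applied to the braces and square brackets from the question might look like this. The class and method names are illustrative only and do not come from the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketStripper
{
    // Removes every '{', '}', '[' and ']' from the input.
    // Inside the character set, '[' and ']' are escaped with a backslash;
    // '{' and '}' have no special meaning there and can stay as they are.
    public static string Strip(string input)
    {
        return Regex.Replace(input, @"[{}\[\]]", "");
    }
}
```

Calling BracketStripper.Strip("a12{drr[ferr]vgb}rtg") should then give a12drrferrvgbrtg.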

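Another concise way is LINQ (assuming brackets is a char[]). Enumerable.Except looks tempting, but it computes a set difference and so also drops repeated characters from the string; filtering with Where leaves everything else intact:
String newString = new String(oldString.Where(c => !brackets.Contains(c)).ToArray());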

string str = "faslkjnro(fjrmn){ferqwe}{{";
char[] separators = new []{'[', ']','{','}' };
var sb = new StringBuilder();
foreach (var c in str)
{
if (!separators.Contains(c))
{
sb.Append(c);
}
}
return sb.ToString();

How about this:
string myString = "a12{drr[ferr]vgb}rtg";
myString = myString.Replace("[", "").Replace("{", "").Replace("]", "").Replace("}", "");
You end up with:
a12drrferrvgbrtg

I don't know if I understand your problem, but you can solve your problem with this:
string toRemove = "{}[]";
string result = your_string_to_be_searched;
foreach(char c in toRemove)
result = result.Replace(c.ToString(), "");
or with an extension method
static class Extensions
{
public static string RemoveAll(this string src, string chars)
{
foreach(char c in chars)
src= src.Replace(c.ToString(), "");
return src;
}
}
With this you can use string result = your_string_to_be_searched.RemoveAll("{}[]");

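string charsToRemove = @"[]{}";
// Regex.Escape leaves ']' unescaped, so escape it by hand for use inside [...]
string pattern = string.Format("[{0}]", Regex.Escape(charsToRemove).Replace("]", @"\]"));
var result = Regex.Replace(input, pattern, "");
The primary advantage of this over some of the other similar answers is that you mostly don't have to work out which characters need to be escaped in a regex; the library takes care of that for you. The exceptions are ] and -, which Regex.Escape does not escape but which are still special inside a character set.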
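One way to sidestep the escaping rules entirely, sketched here on the assumption that only single characters need removing, is to write each character as a \uXXXX escape, which .NET regular expressions accept inside a character set. CharSetRemover and RemoveAny are hypothetical names used for illustration.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class CharSetRemover
{
    // Builds a character set such as [\u005B\u005D] from the given characters
    // and removes every match from the input.
    public static string RemoveAny(string input, string charsToRemove)
    {
        if (string.IsNullOrEmpty(input) || string.IsNullOrEmpty(charsToRemove))
            return input;

        var pattern = new StringBuilder("[");
        foreach (char c in charsToRemove)
        {
            // Four hex digits per character, so no character is ever
            // interpreted as a regex metacharacter.
            pattern.Append(@"\u").Append(((int)c).ToString("X4"));
        }
        pattern.Append(']');

        return Regex.Replace(input, pattern.ToString(), "");
    }
}
```

The pattern is harder to read than a hand-written one, but characters such as ], - and ^ need no special treatment.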

You can do this in a pretty compact fashion like this:
string s = "ab{c[d]}";
char[] ca = new char[] {'{', '}', '[', ']'};
Array.ForEach(ca, e => s = s.Replace(e.ToString(), ""));
Or this:
StringBuilder s = new StringBuilder("ab{c[d]}");
char[] ca = new char[] {'{', '}', '[', ']'};
Array.ForEach(ca, e => s.Replace(e.ToString(), ""));

Taken from this answer: https://stackoverflow.com/a/12800424/1498669
Just use .Split() with the char[] of your desired removeables and recapture it with .Join() or .Concat()
char[] delChars = "[]{}<>()".ToCharArray();
string input = "some (crazy) string with brac[et]s in{si}de";
string output = string.Join(string.Empty, input.Split(delChars));
//or
string output = string.Concat(input.Split(delChars));
References:
https://learn.microsoft.com/en-us/dotnet/csharp/how-to/parse-strings-using-split
https://learn.microsoft.com/en-us/dotnet/csharp/how-to/concatenate-multiple-strings#code-try-4

Related

Removing special characters from a string with RegEx

I am reading a text file that contains words, numbers and special characters, and I want to remove certain special characters such as: [](),'
I have this code, but it is not working:
using (var reader = new StreamReader ("C://Users//HP//Documents//result2.txt")) {
string line = reader.ReadToEnd ();
Regex rgx = new Regex ("[^[]()',]");
string res = rgx.Replace (line, "");
Message1.text = res;
}
What am I missing? Thanks.
Some of the characters in your Regex, specifically [ ] ( ) ^, hold special meaning in Regex and in order to use them literally they must be escaped.
Use the following properly escaped Regex:
Regex rgx = new Regex (@"[\^\[\]\(\)',]");
Note that it is necessary to use the @ verbatim string, because we don't want to escape these characters from the string, only from the Regex.
Alternatively, double escape the backslashes:
Regex rgx = new Regex ("[\\^\\[\\]\\(\\)',]");
But that's less readable in this case.
You could skip Regex and just maintain a list of characters you want to remove and then replace the old fashioned way:
string[] specialCharsToRemove = new [] { "[", "]", "(", ")", "'", "," };
using (var reader = new StreamReader ("C://Users//HP//Documents//result2.txt"))
{
string line = reader.ReadToEnd();
foreach(string s in specialCharsToRemove)
{
line = line.Replace(s, string.Empty);
}
Message1.text = line;
}
Ideally this would be in its own method, something like this:
private static string RemoveCharacters(string input, string[] specialCharactersToRemove)
{
foreach(string s in specialCharactersToRemove)
{
input = input.Replace(s, string.Empty);
}
return input;
}
I made a fiddle here
Replace them one at a time with String.Replace:
using (var reader = new StreamReader ("C://Users//HP//Documents//result2.txt"))
{
string line = reader.ReadToEnd ();
string res = line.Replace("[", "");
res = res.Replace("]", "");
res = res.Replace("(", "");
res = res.Replace(")", "");
res = res.Replace("'", "");
res = res.Replace(",", "");
Message1.text = res;
}
I agree with avoiding regex for this, but I would not use string.Replace multiple times, either.
Consider implementing a Replace or Remove method that accepts an array of characters to replace, and scan the input string only once. For example:
var builder = new StringBuilder();
foreach (char ch in input)
{
if (!chars.Contains(ch))
{
builder.Append(ch);
}
}
return builder.ToString();
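The fragment above can be fleshed out into a complete method. The following is only a sketch: the names SinglePassRemover and RemoveChars are made up for illustration, and a HashSet<char> is used for the lookup on the assumption that the list of characters may grow.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SinglePassRemover
{
    // Scans the input once and copies every character that is not
    // in the removal set into a new string.
    public static string RemoveChars(string input, params char[] chars)
    {
        if (string.IsNullOrEmpty(input) || chars == null || chars.Length == 0)
            return input;

        var toRemove = new HashSet<char>(chars);
        var builder = new StringBuilder(input.Length);
        foreach (char ch in input)
        {
            if (!toRemove.Contains(ch))
            {
                builder.Append(ch);
            }
        }
        return builder.ToString();
    }
}
```

For the file from the question, this could be called as SinglePassRemover.RemoveChars(line, '[', ']', '(', ')', '\'', ',').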

Replace all occurrences of a string (in array) with a single value

I have a string array:
string[] arr2 = { "/", "#", "&" };
I have another string (i.e. strValue). Is there a clean way to replace all instances of the array contents with a single value (i.e. an underscore)? So before:
strValue = "a/ new string, with some# values&"
And after:
strValue = "a_ new string, with some_ values_"
I considered doing this:
strValue = strValue.Replace("/", "_");
strValue = strValue.Replace("#", "_");
strValue = strValue.Replace("&", "_");
But my array of characters to replace may become a lot bigger.
Instead of using the Replace over and over you could just write your own. This might even be a performance gain since you mentioned
But my array may get a lot bigger.
public string Replace(string original, char replacement, params char[] replaceables)
{
StringBuilder builder = new StringBuilder(original.Length);
HashSet<char> replaceable = new HashSet<char>(replaceables);
foreach(Char character in original)
{
if (replaceable.Contains(character))
builder.Append(replacement);
else
builder.Append(character);
}
return builder.ToString();
}
public string Replace(string original, char replacement, string replaceables)
{
return Replace(original, replacement, replaceables.ToCharArray());
}
Can be called like this:
Debug.WriteLine(Replace("a/ new string, with some# values&", '_', '/', '#', '&'));
Debug.WriteLine(Replace("a/ new string, with some# values&", '_', new[] { '/', '#', '&' }));
Debug.WriteLine(Replace("a/ new string, with some# values&", '_', existingArray));
Debug.WriteLine(Replace("a/ new string, with some# values&", '_',"/#&"));
Output:
a_ new string, with some_ values_
a_ new string, with some_ values_
a_ new string, with some_ values_
a_ new string, with some_ values_
As @Sebi pointed out, this would also work as an extension method:
public static class StringExtensions
{
public static string Replace(this string original, char replacement, params char[] replaceables)
{
StringBuilder builder = new StringBuilder(original.Length);
HashSet<Char> replaceable = new HashSet<char>(replaceables);
foreach (Char character in original)
{
if (replaceable.Contains(character))
builder.Append(replacement);
else
builder.Append(character);
}
return builder.ToString();
}
public static string Replace(this string original, char replacement, string replaceables)
{
return Replace(original, replacement, replaceables.ToCharArray());
}
}
Usage:
"a/ new string, with some# values&".Replace('_', '/', '#', '&');
existingString.Replace('_', new[] { '/', '#', '&' });
// etc.
This is how I'd do it: build a regex clause from the list of delimiters and replace each match with an underscore.
string[] delimiters = { "/", "#", "&" };
string clause = $"[{string.Join("]|[", delimiters)}]";
string strValue = "a/ new string, with some# values&";
Regex chrsToReplace = new Regex(clause);
string output = chrsToReplace.Replace(strValue, "_");
You'll probably want to wrap this in if (delimiters.Any()); otherwise it will throw if the array is empty, because the resulting pattern [] is not a valid regex.
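A variant of this idea, sketched under the assumption that delimiters may contain regex metacharacters or be longer than one character, escapes each delimiter with Regex.Escape and joins them with | instead of building character sets. It also returns the input unchanged when there is nothing to replace. DelimiterReplacer and ReplaceAll are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DelimiterReplacer
{
    // Replaces every occurrence of any delimiter with the replacement text.
    public static string ReplaceAll(string input, string[] delimiters, string replacement)
    {
        if (string.IsNullOrEmpty(input) || delimiters == null)
            return input;

        // Skip empty entries: an empty alternative would match at every position.
        string[] escaped = delimiters
            .Where(d => !string.IsNullOrEmpty(d))
            .Select(d => Regex.Escape(d))
            .ToArray();
        if (escaped.Length == 0)
            return input;

        // Outside a character set, Regex.Escape is enough to make each delimiter literal.
        string pattern = string.Join("|", escaped);
        return Regex.Replace(input, pattern, replacement);
    }
}
```

Note that the replacement argument is a regex replacement pattern, so a literal $ in it would need to be written as $$.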
Sure. Here's one approach:
var newString = arr2.Aggregate(strValue, (net, curr) => net.Replace(curr, "_"));
If you're only substituting individual characters and have large enough input sizes to need optimization, you can create a set from which to substitute:
var substitutions = new HashSet<char>() { '/', '#', '&' };
var strValue = "a/ new string, with some# values&";
var newString = new string(strValue.Select(c => substitutions.Contains(c) ? '_' : c).ToArray());
Maybe not the fastest but the easiest would be a Select with a Contains.
Something like this : source.Select(c => blacklist.Contains(c) ? letter : c)
Demo on .NetFiddle.
using System;
using System.Linq;
public class Program
{
public static void Main()
{
var strValue = "a/ new string, with some# values&";
Console.WriteLine(strValue.Replace("/#&", '_'));
}
}
public static class Extensions {
public static string Replace(this string source, string blacklist, char letter) =>
new string(source.Select(c => blacklist.Contains(c) ? letter : c).ToArray());
}
You can split your string with your list of string []:
string[] arr2 = { "/", "#", "&" };
string strValue = "a/ new string, with some# values&";
string Output = null;
string[] split = strValue.Split(arr2, StringSplitOptions.RemoveEmptyEntries);
foreach (var item in split)
{
Output += item + "_";
}
Console.WriteLine(Output);
//-> a_ new string, with some_ values_
Updated answer with @aloisdg comment (interesting article, thank you).
string[] arr2 = { "/", "#", "&" };
string strValue = "a/ new string, with some# values&";
string[] split = strValue.Split(arr2, StringSplitOptions.RemoveEmptyEntries);
StringBuilder Output = new StringBuilder();
foreach (var item in split)
{
Output.Append(item + "_");
}
Console.WriteLine(Output);
//-> a_ new string, with some_ values_
You could use a foreach in a single line to achieve what you want:
arr2.ToList().ForEach(x => strValue = strValue.Replace(x, "_"));

How can I delete spaces from a text file and replace them with a semicolon?

I have this data in the test text file:
behzad  razzaqi  xezerlooot   abrizii     ast
I want to replace each run of spaces with a single semicolon character. I wrote this code in C# for that:
string[] allLines = File.ReadAllLines(@"d:\test.txt");
using (StreamWriter sw = new StreamWriter(@"d:\test.txt"))
{
foreach (string line in allLines)
{
if (!string.IsNullOrEmpty(line) && line.Length > 1)
{
sw.WriteLine(line.Replace(" ", ";"));
}
}
}
MessageBox.Show("ok");
The result is:
behzad;;razzaqi;;xezerlooot;;;abrizii;;;;;ast
But I want a single semicolon for each run of spaces. How can I solve that?
Regex is an option:
string[] allLines = File.ReadAllLines(@"d:\test.txt");
using (StreamWriter sw = new StreamWriter(@"d:\test.txt"))
{
foreach (string line in allLines)
{
if (!string.IsNullOrEmpty(line) && line.Length > 1)
{
sw.WriteLine(Regex.Replace(line, @"\s+", ";"));
}
}
}
MessageBox.Show("ok");
Use this code:
string[] allLines = File.ReadAllLines(@"d:\test.txt");
using (StreamWriter sw = new StreamWriter(@"d:\test.txt"))
{
foreach (string line in allLines)
{
string[] words = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
string joined = String.Join(";", words);
sw.WriteLine(joined);
}
}
You need to use a regular expression:
(\s\s+)
Usage
var input = "behzad  razzaqi  xezerlooot   abrizii     ast";
var pattern = @"(\s\s+)"; // matches runs of two or more whitespace characters
Regex rgx = new Regex(pattern);
string result = rgx.Replace(input, ";");
You can do that with a regular expression.
using System.Text.RegularExpressions;
and:
string pattern = "\\s+";
string replacement = ";";
Regex rgx = new Regex(pattern);
sw.WriteLine(rgx.Replace(line, replacement));
This regular expression matches any series of one or more whitespace characters, and the call replaces the entire series with a single semicolon.
you can try this
Regex r = new Regex(@"\s+");
string result = r.Replace("YourString", ";");
\s+ matches a run of whitespace: \s is any whitespace character and + means one or more occurrences.
for more information on regular expression see http://www.w3schools.com/jsref/jsref_obj_regexp.asp
You should check a string length after replacement, not before ;-).
const string file = @"d:\test.txt";
var result = File.ReadAllLines(file).Select(line => Regex.Replace(line, @"\s+", ";"));
File.WriteAllLines(file, result.Where(line => line.Length > 1));
...and don't forget that for an input with leading and trailing whitespace, such as " hello ", you will get ;hello;.
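If the leading and trailing semicolons are not wanted, one option is to trim the line before collapsing the whitespace. This is a minimal sketch; SpaceCollapser and ToSemicolons are illustrative names rather than part of any answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpaceCollapser
{
    // Trims surrounding whitespace, then replaces each remaining run of
    // whitespace characters with a single semicolon.
    public static string ToSemicolons(string line)
    {
        if (string.IsNullOrEmpty(line))
            return line;

        return Regex.Replace(line.Trim(), @"\s+", ";");
    }
}
```

With this, a line containing only "  hello  " becomes hello, and words separated by several spaces are joined by exactly one semicolon.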

Get character after certain character from a String

I need to get the character that follows each occurrence of a certain character in a string. Please consider my input string and the expected resulting characters.
Sample String
*This is a string *with more than *one blocks *of values.
Resultant string
Twoo
I have done this
string[] SubIndex = aut.TagValue.Split('*');
string SubInd = "";
foreach (var a in SubIndex)
{
SubInd = SubInd + a.Substring(0,1);
}
Any help to this will be appreciated.
Thanks
LINQ solution:
var str = "*This is a string *with more than *one blocks *of values.";
var chars = str.Split(new char[] {'*'}, StringSplitOptions.RemoveEmptyEntries)
.Select(x => x.First());
var output = String.Join("", chars);
string s = "*This is a string *with more than *one blocks *of values.";
string[] splitted = s.Split(new char[] { '*' }, StringSplitOptions.RemoveEmptyEntries);
string result = "";
foreach (string split in splitted)
result += split[0];
Console.WriteLine(result);
Below code should work
var s = "*This is a string *with more than *one blocks *of values.";
int i = 0;
while ((i = s.IndexOf('*', i)) != -1)
{
// Print the character that follows the '*', if there is one
if (i + 1 < s.Length)
Console.WriteLine(s[i + 1]);
// Move past this '*' before searching again
i++;
}
String.Join("",input.Split(new char[]{'*'},StringSplitOptions.RemoveEmptyEntries)
.Select(x=>x.First())
);
string strRegex = @"(?<=\*).";
Regex myRegex = new Regex(strRegex, RegexOptions.Multiline | RegexOptions.Singleline);
string strTargetString = "*This is a string *with more than *one blocks *of values.";
StringBuilder sb = new StringBuilder();
foreach (Match myMatch in myRegex.Matches(strTargetString))
{
if (myMatch.Success) sb.Append(myMatch.Value);
}
string result = sb.ToString();
please see below...
char[] s3 = "*This is a string *with more than *one blocks *of values.".ToCharArray();
StringBuilder s4 = new StringBuilder();
for (int i = 0; i < s3.Length - 1; i++)
{
if (s3[i] == '*')
s4.Append(s3[i+1]);
}
Console.WriteLine(s4.ToString());

How to trim/replace a char '\' or '"' from a string?

I need to use a string for path for a file but sometimes there are forbidden characters in this string and I must replace them. For example, my string _title is rumbaton jonathan \"racko\" contreras.
Well I should replace the chars \ and ".
I tried this but it doesn't work:
_title.Replace(@"/", "");
_title.Replace(@"\", "");
_title.Replace(@"*", "");
_title.Replace(@"?", "");
_title.Replace(@"<", "");
_title.Replace(@">", "");
_title.Replace(@"|", "");
Strings are immutable, so the Replace method returns a new string; it doesn't modify the instance you call it on. So try this:
_title = _title
.Replace(@"/", "")
.Replace(@"""", "")
.Replace(@"*", "")
.Replace(@"?", "")
.Replace(@"<", "")
.Replace(@">", "")
.Replace(@"|", "");
Also if you want to replace " make sure you have properly escaped it.
Try regex
string illegal = "\"M\"\\a/ry/ h**ad:>> a\\/:*?\"| li*tt|le|| la\"mb.?";
string regexSearch = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
Regex r = new Regex(string.Format("[{0}]", Regex.Escape(regexSearch)));
illegal = r.Replace(illegal, "");
Before: "M"\a/ry/ h**ad:>> a\/:*?"| li*tt|le|| la"mb.?
After: Mary had a little lamb.
Also another answer from same post is much cleaner
private static string CleanFileName(string fileName)
{
return Path.GetInvalidFileNameChars().Aggregate(fileName, (current, c) => current.Replace(c.ToString(), string.Empty));
}
from How to remove illegal characters from path and filenames?
Or you could try this (probably terribly inefficient) method:
string inputString = @"File ~!@#$%^&*()_+|`1234567890-=\[];',./{}:""<>? name";
var badchars = Path.GetInvalidFileNameChars();
foreach (var c in badchars)
inputString = inputString.Replace(c.ToString(), "");
The result will be:
File ~!@#$%^&()_+`1234567890-=[];',.{} name
But feel free to add more chars to the badchars before running the foreach loop on them.
See http://msdn.microsoft.com/cs-cz/library/fk49wtc1.aspx:
Returns a string that is equivalent to the current string except that all instances of oldValue are replaced with newValue.
I have written a method to do the exact operation that you want and with much cleaner code.
The method
public static string Delete(this string target, string samples) {
if (string.IsNullOrEmpty(target) || string.IsNullOrEmpty(samples))
return target;
var tar = target.ToCharArray();
const char deletechar = '♣'; //a char that most likely never to be used in the input
for (var i = 0; i < tar.Length; i++) {
for (var j = 0; j < samples.Length; j++) {
if (tar[i] == samples[j]) {
tar[i] = deletechar;
break;
}
}
}
return new string(tar).Replace(deletechar.ToString(CultureInfo.InvariantCulture), string.Empty);
}
Sample
var input = "rumbaton jonathan \"racko\" contreras";
var cleaned = input.Delete("\"\\/*?><|");
Will result in:
rumbaton jonathan racko contreras
Ok ! I've solved my issue thanks to all your indications. This is my correction :
string newFileName = _artist + " - " + _title;
char[] invalidFileChars = Path.GetInvalidFileNameChars();
char[] invalidPathChars = Path.GetInvalidPathChars();
foreach (char invalidChar in invalidFileChars)
{
newFileName = newFileName.Replace(invalidChar.ToString(), string.Empty);
}
foreach (char invalidChar in invalidPathChars)
{
newFilePath = newFilePath.Replace(invalidChar.ToString(), string.Empty);
}
Thank you so much everybody :)
