Replace all occurrences of a string (in array) with a single value - c#

I have a string array:
string[] arr2 = { "/", "#", "&" };
I have another string (i.e. strValue). Is there a clean way to replace all instances of the array contents with a single value (i.e. an underscore)? So before:
strValue = "a/ new string, with some# values&"
And after:
strValue = "a_ new string, with some_ values_"
I considered doing this:
strValue = strValue.Replace("/", "_");
strValue = strValue.Replace("#", "_");
strValue = strValue.Replace("&", "_");
But my array of characters to replace may become a lot bigger.

Instead of calling Replace over and over, you could just write your own. This might even be a performance gain, since you mentioned:
"But my array may get a lot bigger."
public string Replace(string original, char replacement, params char[] replaceables)
{
StringBuilder builder = new StringBuilder(original.Length);
HashSet<char> replaceable = new HashSet<char>(replaceables);
foreach(Char character in original)
{
if (replaceable.Contains(character))
builder.Append(replacement);
else
builder.Append(character);
}
return builder.ToString();
}
public string Replace(string original, char replacement, string replaceables)
{
return Replace(original, replacement, replaceables.ToCharArray());
}
Can be called like this:
Debug.WriteLine(Replace("a/ new string, with some# values&", '_', '/', '#', '&'));
Debug.WriteLine(Replace("a/ new string, with some# values&", '_', new[] { '/', '#', '&' }));
Debug.WriteLine(Replace("a/ new string, with some# values&", '_', existingArray));
Debug.WriteLine(Replace("a/ new string, with some# values&", '_',"/#&"));
Output:
a_ new string, with some_ values_
a_ new string, with some_ values_
a_ new string, with some_ values_
a_ new string, with some_ values_
As @Sebi pointed out, this would also work as an extension method:
public static class StringExtensions
{
public static string Replace(this string original, char replacement, params char[] replaceables)
{
StringBuilder builder = new StringBuilder(original.Length);
HashSet<Char> replaceable = new HashSet<char>(replaceables);
foreach (Char character in original)
{
if (replaceable.Contains(character))
builder.Append(replacement);
else
builder.Append(character);
}
return builder.ToString();
}
public static string Replace(this string original, char replacement, string replaceables)
{
return Replace(original, replacement, replaceables.ToCharArray());
}
}
Usage:
"a/ new string, with some# values&".Replace('_', '/', '#', '&');
existingString.Replace('_', new[] { '/', '#', '&' });
// etc.

This is how I'd do it: build a regex clause from the list of delimiters and replace every match with an underscore.
string[] delimiters = { "/", "#", "&" };
string clause = $"[{string.Join("]|[", delimiters)}]";
string strValue = "a/ new string, with some# values&";
Regex chrsToReplace = new Regex(clause);
string output = chrsToReplace.Replace(strValue, "_");
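You'll probably want to wrap this in if (delimiters.Any()), otherwise the Regex constructor will throw on the invalid pattern "[]" when the array is empty. Delimiters that are special inside a character class, such as ] or ^, would also need escaping.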
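A minimal sketch of a more defensive variant of the regex approach, assuming the delimiters may be multi-character strings or regex metacharacters. The class name DelimiterReplacer and the method ReplaceAll are hypothetical names chosen for illustration; Regex.Escape and Regex.Replace are the only library calls it relies on.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class DelimiterReplacer
{
    // Hypothetical helper: replaces every occurrence of any delimiter with `replacement`.
    public static string ReplaceAll(string input, string[] delimiters, string replacement)
    {
        // Skip null/empty delimiters; an empty alternative would match at every position.
        string[] usable = (delimiters ?? new string[0])
            .Where(d => !string.IsNullOrEmpty(d))
            .ToArray();
        if (usable.Length == 0) return input;

        // Regex.Escape turns each delimiter into a literal, so "^", "[" or "." are safe.
        // Joining with "|" builds an alternation instead of a character class.
        string pattern = string.Join("|", usable.Select(d => Regex.Escape(d)));
        return Regex.Replace(input, pattern, replacement);
    }
}
```

Calling DelimiterReplacer.ReplaceAll(strValue, delimiters, "_") with the values from the question should give the expected string. The replacement is still interpreted by Regex.Replace, so a literal $ in it would have to be written as $$.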

Sure. Here's one approach:
var newString = arr2.Aggregate(strValue, (net, curr) => net.Replace(curr, "_"));
If you're only substituting individual characters and have large enough input sizes to need optimization, you can create a set from which to substitute:
var substitutions = new HashSet<char>() { '/', '#', '&' };
var strValue = "a/ new string, with some# values&";
var newString = new string(strValue.Select(c => substitutions.Contains(c) ? '_' : c).ToArray());

Maybe not the fastest, but the easiest would be a Select with a Contains.
Something like this: source.Select(c => blacklist.Contains(c) ? letter : c)
Demo on .NET Fiddle:
using System;
using System.Linq;
public class Program
{
public static void Main()
{
var strValue = "a/ new string, with some# values&";
Console.WriteLine(strValue.Replace("/#&", '_'));
}
}
public static class Extensions {
public static string Replace(this string source, string blacklist, char letter) =>
new string(source.Select(c => blacklist.Contains(c) ? letter : c).ToArray());
}

You can split your string with your list of string []:
string[] arr2 = { "/", "#", "&" };
string strValue = "a/ new string, with some# values&";
string Output = null;
string[] split = strValue.Split(arr2, StringSplitOptions.RemoveEmptyEntries);
foreach (var item in split)
{
Output += item + "_";
}
Console.WriteLine(Output);
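//-> a_ new string, with some_ values_
Note that this appends an underscore after every piece, so the result only matches here because the input ends with a delimiter; RemoveEmptyEntries also collapses adjacent delimiters into a single underscore.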
Updated answer following @aloisdg's comment (interesting article, thank you):
string[] arr2 = { "/", "#", "&" };
string strValue = "a/ new string, with some# values&";
string[] split = strValue.Split(arr2, StringSplitOptions.RemoveEmptyEntries);
StringBuilder Output = new StringBuilder();
foreach (var item in split)
{
Output.Append(item + "_");
}
Console.WriteLine(Output);
//-> a_ new string, with some_ values_
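A hedged sketch of the same split-based idea that keeps empty pieces and joins them back together, so that each delimiter maps to exactly one replacement regardless of where it sits in the input. SplitJoinReplacer and ReplaceAll are hypothetical names, not part of the answer above.

```csharp
using System;

public static class SplitJoinReplacer
{
    // Hypothetical helper: split on the delimiters, then join the pieces with `replacement`.
    public static string ReplaceAll(string input, string[] delimiters, string replacement)
    {
        // With a null or empty separator array, Split would fall back to whitespace.
        if (delimiters == null || delimiters.Length == 0) return input;

        // StringSplitOptions.None keeps empty pieces, so "a//b" becomes "a", "", "b".
        string[] pieces = input.Split(delimiters, StringSplitOptions.None);

        // Join places exactly one replacement between consecutive pieces.
        return string.Join(replacement, pieces);
    }
}
```

Because Join only inserts the replacement between pieces, an input that does not end with a delimiter does not gain a trailing underscore.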

You could use a foreach in a single line to achieve what you want:
arr2.ToList().ForEach(x => strValue = strValue.Replace(x, "_"));

Related

Deleting all empty elements from the end of a string array [duplicate]

I want to remove empty and null strings in the split operation:
string number = "9811456789, ";
List<string> mobileNos = number.Split(new string[] { "," }, StringSplitOptions.RemoveEmptyEntries).Select(mobile => mobile.Trim()).ToList();
I tried this, but it does not remove the whitespace-only entry.
var mobileNos = number.Replace(" ", "")
.Split(new string[] { "," }, StringSplitOptions.RemoveEmptyEntries).ToList();
As I understand it, this should help:
string number = "9811456789, ";
List<string> mobileNos = number.Split(',').Where(x => !string.IsNullOrWhiteSpace(x)).ToList();
The result is a list with a single element: [0] = "9811456789".
Hope it helps.
A string extension can do this neatly, as below.
The extension:
public static IEnumerable<string> SplitAndTrim(this string value, params char[] separators)
{
if (value == null) throw new ArgumentNullException(nameof(value));
return value.Trim().Split(separators, StringSplitOptions.RemoveEmptyEntries).Select(s => s.Trim());
}
Then you can use it with any string, as below:
char[] separator = { ',' };
var mobileNos = number.SplitAndTrim(separator);
I know it's an old question, but the following works just fine:
string number = "9811456789, ";
List<string> mobileNos = number.Split(new char[] { ',', ' ' }, StringSplitOptions.RemoveEmptyEntries).ToList();
No need for extension methods or anything else.
"string,,,,string2".Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
returns ["string", "string2"]
The easiest solution on .NET 5 or later is to combine StringSplitOptions.TrimEntries, which trims each result, with StringSplitOptions.RemoveEmptyEntries, which drops empty entries, using the bitwise OR operator (|).
string number = "9811456789, ";
List<string> mobileNos = number
.Split(',', StringSplitOptions.TrimEntries | StringSplitOptions.RemoveEmptyEntries)
.ToList();
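On runtimes where StringSplitOptions.TrimEntries is not available, roughly the same result can be sketched with LINQ. This is an illustration under that assumption; SplitHelpers and SplitTrimmed are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitHelpers
{
    // Hypothetical helper: split, trim each part, and drop parts that end up empty.
    public static List<string> SplitTrimmed(string value, char separator)
    {
        return value
            .Split(separator)
            .Select(part => part.Trim())      // trim first, so " " becomes ""
            .Where(part => part.Length > 0)   // then remove the empty entries
            .ToList();
    }
}
```

Trimming before filtering is what makes the whitespace-only entry after the comma disappear; RemoveEmptyEntries on its own runs before any trimming and therefore keeps it.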

Checking for and removing any characters in a string

I am wondering what would be the best way to specify an array of characters like,
{
}
[
]
and then check a string for these and if they are there, to completely remove them.
if (compiler.Parser.GetErrors().Count == 0)
{
AstNode root = compiler.Parse(phrase.ToLower());
if (compiler.Parser.GetErrors().Count == 0)
{
try
{
fTextSearch = SearchGrammar.ConvertQuery(root, SearchGrammar.TermType.Inflectional);
}
catch
{
fTextSearch = phrase;
}
}
else
{
fTextSearch = phrase;
}
}
else
{
fTextSearch = phrase;
}
string[] brackets = new string[]
{
"{",
"}",
"[",
"]"
};
string[] errorChars = new string[]
{
"'",
"&"
};
StringBuilder sb = new StringBuilder();
string[] splitString = fTextSearch.Split(errorChars, StringSplitOptions.None);
int numNewCharactersAdded = 0;
foreach (string itm in splitString)
{
sb.Append(itm); //append string
if (fTextSearch.Length > (sb.Length - numNewCharactersAdded))
{
sb.Append(fTextSearch[sb.Length - numNewCharactersAdded]); //append splitting character
sb.Append(fTextSearch[sb.Length - numNewCharactersAdded - 1]); //append it again
numNewCharactersAdded++;
}
}
string newString = sb.ToString();
A regular expression can do this far more easily:
var result = Regex.Replace(input, @"[[\]()]", "");
This uses a character set ([...]) to match any one of the characters in it and replaces each match with nothing. Regex.Replace replaces all matches.
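Another concise way is to filter the characters with Enumerable.Where (assuming brackets is a char[]). Enumerable.Except looks similar, but it returns a set difference, so it would also drop repeated characters from the string:
String newString = new String(oldString.Where(c => !brackets.Contains(c)).ToArray());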
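To make the difference between filtering and a set difference concrete, here is a small sketch. CharFilter, RemoveChars and RemoveCharsWithExcept are hypothetical names; the second method exists only for comparison.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CharFilter
{
    // Hypothetical helper: keeps every character that is not in `unwanted`, duplicates included.
    public static string RemoveChars(string source, IEnumerable<char> unwanted)
    {
        var set = new HashSet<char>(unwanted);
        return new string(source.Where(c => !set.Contains(c)).ToArray());
    }

    // For comparison only: Except is a set operation, so repeated characters are returned once.
    public static string RemoveCharsWithExcept(string source, IEnumerable<char> unwanted)
    {
        return new string(source.Except(unwanted).ToArray());
    }
}
```

For the input "a12{drr[ferr]vgb}rtg" and the characters "{}[]", RemoveChars gives a12drrferrvgbrtg, while the Except variant returns a shorter string because each distinct character appears only once.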
string str = "faslkjnro(fjrmn){ferqwe}{{";
char[] separators = new []{'[', ']','{','}' };
var sb = new StringBuilder();
foreach (var c in str)
{
if (!separators.Contains(c))
{
sb.Append(c);
}
}
return sb.ToString();
How about this:
string myString = "a12{drr[ferr]vgb}rtg";
myString = myString.Replace("[", "").Replace("{", "").Replace("]", "").Replace("}", "");
You end up with:
a12drrferrvgbrtg
I don't know if I understand your problem, but you can solve your problem with this:
string toRemove = "{}[]";
string result = your_string_to_be_searched;
foreach(char c in toRemove)
result = result.Replace(c.ToString(), "");
or with an extension method
static class Extensions
{
public static string RemoveAll(this string src, string chars)
{
foreach(char c in chars)
src= src.Replace(c.ToString(), "");
return src;
}
}
With this you can use string result = your_string_to_be_searched.RemoveAll("{}[]");
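string charsToRemove = @"[]{}";
// Regex.Escape does not escape ']', so escape it separately before building the class.
string pattern = "[" + Regex.Escape(charsToRemove).Replace("]", @"\]") + "]";
var result = Regex.Replace(input, pattern, "");
The primary advantage of this over some of the other similar answers is that the library takes care of most of the escaping for you. Regex.Escape leaves ] and } alone, though, and ] would end the character class early, which is why it is escaped by hand here. A - placed between two other characters would likewise need escaping.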
You can do this in a pretty compact fashion like this:
string s = "ab{c[d]}";
char[] ca = new char[] {'{', '}', '[', ']'};
Array.ForEach(ca, e => s = s.Replace(e.ToString(), ""));
Or this:
StringBuilder s = new StringBuilder("ab{c[d]}");
char[] ca = new char[] {'{', '}', '[', ']'};
Array.ForEach(ca, e => s.Replace(e.ToString(), ""));
Taken from this answer: https://stackoverflow.com/a/12800424/1498669
Just use .Split() with the char[] of your desired removeables and recapture it with .Join() or .Concat()
char[] delChars = "[]{}<>()".ToCharArray();
string input = "some (crazy) string with brac[et]s in{si}de";
string output = string.Join(string.Empty, input.Split(delChars));
//or
string output = string.Concat(input.Split(delChars));
References:
https://learn.microsoft.com/en-us/dotnet/csharp/how-to/parse-strings-using-split
https://learn.microsoft.com/en-us/dotnet/csharp/how-to/concatenate-multiple-strings#code-try-4

Trimstart and TrimEnd not working as wanted

I am trying to cut strings in C#, but I am not getting the correct results.
It still shows the full text of exactString.
String exactString = "ABC##^^##DEF";
char[] Delimiter = { '#', '#', '^', '^', '#', '#' };
string getText1 = exactString.TrimEnd(Delimiter);
string getText2 = exactString.TrimStart(Delimiter);
MessageBox.Show(getText1);
MessageBox.Show(getText2);
OUTPUT:
ABC##^^##DEF for both getText1 and getText2.
Correct OUTPUT should be
ABC for getText1 and DEF for getText2.
How do I fix it?
Thanks.
You want to split your string, not trim it. Thus, the correct method to use is String.Split:
String exactString = "ABC##^^##DEF";
var result = exactString.Split(new string[] {"##^^##"}, StringSplitOptions.None);
Console.WriteLine(result[0]); // outputs ABC
Console.WriteLine(result[1]); // outputs DEF
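A short sketch of why Trim does nothing here and how the two halves can be taken without Split. TrimStart and TrimEnd only strip characters from the ends of the string, and "ABC##^^##DEF" starts with A and ends with F, so nothing is removed. DelimiterDemo and SplitOnce are hypothetical names used for illustration.

```csharp
using System;

public static class DelimiterDemo
{
    // Hypothetical helper: returns the text before and after the first occurrence of `delimiter`.
    public static string[] SplitOnce(string text, string delimiter)
    {
        int index = text.IndexOf(delimiter, StringComparison.Ordinal);
        if (index < 0) return new[] { text };   // delimiter not present: return the input unchanged

        return new[]
        {
            text.Substring(0, index),                 // everything before the delimiter
            text.Substring(index + delimiter.Length)  // everything after it
        };
    }
}
```

DelimiterDemo.SplitOnce("ABC##^^##DEF", "##^^##") yields ABC and DEF. By contrast, "##ABC^^".Trim('#', '^') gives ABC, because there the unwanted characters really are at the ends.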
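You are looking for String.Replace, not Trim.
char[] Delimiter = { '#', '^' };
string getText1 = exactString;
foreach (char c in Delimiter)
getText1 = getText1.Replace(c.ToString(), "");
Trim only removes characters at the beginning and end of a string; Replace looks through the whole string. Note that this yields ABCDEF, so use Split if you need ABC and DEF separately.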
You can split strings up in 2 pieces using the (conveniently named) String.Split method.
char[] Delimiter = { '#', '^' };
string[] text = exactString.Split(Delimiter, StringSplitOptions.RemoveEmptyEntries);
//text[0] = "ABC", text[1] = "DEF"
You can use the String.Split method:
String exactString = "ABC##^^##DEF";
string[] splits = exactString.Split(new string[]{"##^^##"}, StringSplitOptions.None);
string getText1 = splits[0];
string getText2 = splits[1];
MessageBox.Show(getText1);
MessageBox.Show(getText2);

how to perform tokenization and stopword removal in C#?

Basically, I want to tokenise each word of a paragraph and then perform stopword removal. The result will be the preprocessed data for my algorithm.
You can remove all punctuation and split the string on whitespace.
string s = "This is, a sentence.";
s = s.Replace(",", "").Replace(".", "");
string[] words = s.Split(' ');
If you read from a text file (or any other text), you can do this, where textFilePath is a placeholder for the path of your file:
char[] dele = { ' ', ',', '.', '\t', ';', '#', '!' };
List<string> allLinesText = File.ReadAllText(textFilePath).Split(dele).ToList();
Then you can put the stop words in a dictionary and remove each one from the list:
foreach (KeyValuePair<string, string> word in StopWords)
{
allLinesText.RemoveAll(s => s == word.Key);
}
You can store all separation symbols and stopwords in constants or db:
public static readonly char[] WordsSeparators = {
' ', '\t', '\n', '\n', '\r', '\u0085'
};
public static readonly string[] StopWords = {
"stop", "word", "is", "here"
};
Remove all punctuation, then split the text and filter:
var words = new List<string>();
var stopWords = new HashSet<string>(TextOperationConstants.StopWords);
foreach (var term in text.Split(TextOperationConstants.WordsSeparators))
{
if (String.IsNullOrWhiteSpace(term)) continue;
if (stopWords.Contains(term)) continue;
words.Add(term);
}
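The steps described above (strip punctuation, split into words, drop stop words) can be combined into one method. This is a minimal sketch under simple assumptions: every character for which char.IsPunctuation returns true is treated as a separator, and stop words are matched case-insensitively. Tokenizer and Tokenize are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Tokenizer
{
    // Hypothetical helper: returns the words of `text` that are not stop words.
    public static List<string> Tokenize(string text, IEnumerable<string> stopWords)
    {
        // Case-insensitive lookup, so "The" and "the" are treated alike.
        var stop = new HashSet<string>(stopWords, StringComparer.OrdinalIgnoreCase);

        // Turn punctuation into spaces so that "is," becomes "is ".
        var cleaned = new string(text.Select(c => char.IsPunctuation(c) ? ' ' : c).ToArray());

        // A null separator array makes Split use whitespace as the delimiter.
        return cleaned
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            .Where(token => !stop.Contains(token))
            .ToList();
    }
}
```

With the sentence "This is, a sentence." and the stop words "is" and "a", this returns This and sentence. Treating all punctuation as separators also splits contractions such as "don't", which may or may not be what the algorithm needs.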

How do I split a string into an array?

I want to split a string into an array. The string is as follows:
:hello:mr.zoghal:
I would like to split it as follows:
hello mr.zoghal
I tried ...
string[] split = string.Split(new Char[] {':'});
and now I want to have:
string something = hello ;
string something1 = mr.zoghal;
How can I accomplish this?
String myString = ":hello:mr.zoghal:";
string[] split = myString.Split(new[] { ':' }, StringSplitOptions.RemoveEmptyEntries);
string newString = string.Empty;
foreach(String s in split) {
newString += "something = " + s + "; ";
}
Your output would be:
something = hello; something = mr.zoghal;
For your original request:
string myString = ":hello:mr.zoghal:";
string[] split = myString.Split(new[] { ':' }, StringSplitOptions.RemoveEmptyEntries);
var somethings = split.Select(s => String.Format("something = {0};", s));
Console.WriteLine(String.Join("\n", somethings.ToArray()));
This will produce
something = hello;
something = mr.zoghal;
in accordance with your request.
Also, the line
string[] split = string.Split(new Char[] {':'});
is not legal C#. String.Split is an instance-level method whereas your current code is either trying to invoke Split on an instance named string (not legal as "string" is a reserved keyword) or is trying to invoke a static method named Split on the class String (there is no such method).
Edit: It isn't exactly clear what you are asking. But I think that this will give you what you want:
string myString = ":hello:mr.zoghal:";
string[] split = myString.Split(new[] { ':' }, StringSplitOptions.RemoveEmptyEntries);
string something = split[0];
string something1 = split[1];
Now you will have
something == "hello"
and
something1 == "mr.zoghal"
both evaluate as true. Is this what you are looking for?
It is much easier than that. There is already an option.
string mystring = ":hello:mr.zoghal:";
string[] split = mystring.Split(new char[] {':'}, StringSplitOptions.RemoveEmptyEntries);
