Sprache parser and characters escaping


I haven't found an example that shows how to handle character escaping. I did find this code example:
static void Main(string[] args)
{
string text = "'test \\\' text'";
var result = Grammar.QuotedText.End().Parse(text);
}
public static class Grammar
{
private static readonly Parser<char> QuoteEscape = Parse.Char('\\');
private static Parser<T> Escaped<T>(Parser<T> following)
{
return from escape in QuoteEscape
from f in following
select f;
}
private static readonly Parser<char> QuotedTextDelimiter = Parse.Char('\'');
private static readonly Parser<char> QuotedContent =
Parse.AnyChar.Except(QuotedTextDelimiter).Or(Escaped(QuotedTextDelimiter));
public static Parser<string> QuotedText = (
from lquot in QuotedTextDelimiter
from content in QuotedContent.Many().Text()
from rquot in QuotedTextDelimiter
select content
).Token();
}
It parses text successfully when the text contains no escapes, but it fails on text that contains escaped characters.

I had a similar problem, parsing strings using " as delimiter and \ as escape character. I wrote a simple parser for this (may not be the most elegant solution) and it seems to work nicely.
You should be able to adapt it, since the only difference appears to be the delimiter.
var escapedDelimiter = Parse.String("\\\"").Text().Named("Escaped delimiter");
var singleEscape = Parse.String("\\").Text().Named("Single escape character");
var doubleEscape = Parse.String("\\\\").Text().Named("Escaped escape character");
var delimiter = Parse.Char('"').Named("Delimiter");
var simpleLiteral = Parse.AnyChar.Except(singleEscape).Except(delimiter).Many().Text().Named("Literal without escape/delimiter character");
var stringLiteral = (from start in delimiter
from v in escapedDelimiter.Or(doubleEscape).Or(singleEscape).Or(simpleLiteral).Many()
from end in delimiter
select string.Concat(start) + string.Concat(v) + string.Concat(end));
The key part is from v in .... It looks for escaped delimiters first, then for double escape characters, and then for single escape characters, before trying to parse the input as a "simpleLiteral" without any escape or delimiter characters. Changing the order here would cause parse errors: if you tried single escapes before escaped delimiters, you would never match the latter, and the same goes for double escapes versus single escapes.
This step is repeated until an unescaped delimiter occurs (from v in ... does not handle unescaped delimiters, but from end in delimiter does, of course).
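The same ordering idea can be applied to the single-quote grammar from the question. The following is only a minimal sketch, assuming the Sprache NuGet package is referenced; the class name EscapedGrammar and its member names are made up for illustration. It tries the escape sequence before the plain-character rule, so a backslash is never consumed as ordinary content.

```csharp
using System;
using Sprache;

public static class EscapedGrammar
{
    private static readonly Parser<char> Delimiter = Parse.Char('\'');
    private static readonly Parser<char> Escape = Parse.Char('\\');

    // A backslash followed by any character yields that character.
    private static readonly Parser<char> EscapedChar =
        from esc in Escape
        from c in Parse.AnyChar
        select c;

    // Try the escape sequence first; otherwise accept any character that is
    // neither the delimiter nor a lone backslash.
    private static readonly Parser<char> Content =
        EscapedChar.Or(Parse.AnyChar.Except(Delimiter.Or(Escape)));

    public static readonly Parser<string> QuotedText =
        (from open in Delimiter
         from content in Content.Many().Text()
         from close in Delimiter
         select content).Token();
}
```

With this sketch, EscapedGrammar.QuotedText.End().Parse("'test \\' text'") should return the content with the backslash removed and the escaped quote kept.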

I had a requirement to parse string literals that can be denoted with single-quote or double-quotes, and moreover also support escaping of those.
The method generating the string literal parser:
private readonly StringBuilder _reusableStringBuilder = new StringBuilder();
private Parser<string> BuildStringLiteralParser(char delimiterChar)
{
var escapeChar = '\\';
var delimiter = Sprache.Parse.Char(delimiterChar);
var escape = Sprache.Parse.Char(escapeChar);
var escapedDelimiter = Sprache.Parse.String($"{escapeChar}{delimiterChar}");
var splitByEscape = Sprache.Parse.AnyChar
.Except(delimiter.Or(escape))
.Many()
.Text()
.DelimitedBy(escapedDelimiter);
string BuildStr(IEnumerable<IEnumerable<string>> splittedByEscape)
{
_reusableStringBuilder.Clear();
var i = 0;
foreach (var splittedByEscapedDelimiter in splittedByEscape)
{
if (i > 0)
{
_reusableStringBuilder.Append(escapeChar);
}
var j = 0;
foreach (var str in splittedByEscapedDelimiter)
{
if (j > 0)
{
_reusableStringBuilder.Append(delimiterChar);
}
_reusableStringBuilder.Append(str);
j++;
}
i++;
}
return _reusableStringBuilder.ToString();
}
return (from ln in delimiter
from splittedByEscape in splitByEscape.DelimitedBy(escape)
from rn in delimiter
select BuildStr(splittedByEscape)).Named("string");
}
Usage:
var stringParser = BuildStringLiteralParser('\"').Or(BuildStringLiteralParser('\''));
var str1 = stringParser.Parse("\"'Hello' \\\"John\\\"\"");
Console.WriteLine(str1);
var str2 = stringParser.Parse("\'\\'Hello\\' \"John\"\'");
Console.WriteLine(str2);
Output:
'Hello' "John"
'Hello' "John"
Check the working demo:
https://dotnetfiddle.net/8wFNbj

Related

How to remove a portion of string

I want to remove the words Test and Leaf from the beginning of the string only, not from anywhere else, so the string Test_AA_234_6874_Test should become AA_234_6874_Test. But when I use .Replace, it replaces the word Test everywhere, which I don't want. How can I do this?
This is the code I have so far:
string st = "Test_AA_234_6874_Test";
st = st.Replace("Test_","");

You could use a regex to do this. The third argument of the regex Replace method specifies the maximum number of replacements to make.
string st = "Test_AA_234_6874_Test";
var regex = new Regex("(Test|Leaf)_");
var value = regex.Replace(st, "", 1);
Or if the string to replace only occurs on the start just use ^ which asserts the position at start of the string.
string st = "Test_AA_234_6874_Test";
var regex = new Regex("^(Test|Leaf)_");
var value = regex.Replace(st, "");
If you know that you always have to remove the first 5 characters, you can also use Substring, which is faster.
string st = "Test_AA_234_6874_Test";
var value = st.Substring(5, st.Length - 5);

The simplest way to do this is by using a Regular Expression like so.
using System;
using System.Text.RegularExpressions;
using System.Text;
namespace RegExTest
{
class Program
{
static void Main(string[] args)
{
var input = "Test_AA_234_6874_Test";
var matchText = "Test";
var replacement = String.Empty;
var regex = new Regex("^" + matchText);
var output = regex.Replace(input, replacement);
Console.WriteLine("Converted String: {0}", output);
Console.ReadKey();
}
}
}
The ^ will match text at the beginning of the string.

Consider checking whether the string starts with "Start" and/or ends with "Trim" and decide the end and start positions you'd like to maintain. Then use Substring method to get only the portion you need.
public string Normalize(string input, string prefix, string suffix)
{
// Validation
int length = input.Length;
int startIndex = 0;
if(input.StartsWith(prefix))
{
startIndex = prefix.Length;
length -= prefix.Length;
}
if (input.EndsWith (suffix))
{
length -= suffix.Length;
}
return input.Substring(startIndex, length);
}
Hope this helps.

string wordToRemoveFromBeginning = "Test_";
int index = st.IndexOf(wordToRemoveFromBeginning);
string cleanPath = (index < 0) ? st : st.Remove(index,
wordToRemoveFromBeginning.Length);

Use a regular expression.
var str1 = "Test_AA_234_6874_Test";
var str2 = "Leaf_AA_234_6874_Test";
str1 = Regex.Replace(str1, "^Test", "");
str2 = Regex.Replace(str2, "^Leaf", "");
Regex.Replace parameters are your input string (str1), the pattern you want to match, and what to replace it with, in this case an empty string. The ^ character anchors the match at the start of the string, so something like "MyTest_AA_234_6874_Test" would still return "MyTest_AA_234_6874_Test" unchanged.

I am gonna use some very simple code here
string str = "Test_AA_234_6874_Test";
string substring = str.Substring(0, 4);
if (substring == "Test" || substring == "Leaf")
{
str= str.Remove(0, 5);
}
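The prefix checks used in the answers above can also be written without regular expressions or hard-coded lengths. This is a minimal sketch, not taken from any of the answers; the names PrefixRemover and RemoveLeadingPrefix are hypothetical. It uses String.StartsWith with an ordinal comparison, so only a prefix at the very start of the input is removed.

```csharp
using System;

public static class PrefixRemover
{
    // Removes the first matching prefix (for example "Test_" or "Leaf_")
    // from the start of the input only; later occurrences are untouched.
    public static string RemoveLeadingPrefix(string input, params string[] prefixes)
    {
        if (string.IsNullOrEmpty(input)) return input;
        foreach (var prefix in prefixes)
        {
            if (prefix.Length > 0 && input.StartsWith(prefix, StringComparison.Ordinal))
            {
                return input.Substring(prefix.Length);
            }
        }
        return input; // no prefix matched
    }
}
```

Usage: PrefixRemover.RemoveLeadingPrefix("Test_AA_234_6874_Test", "Test_", "Leaf_") keeps the trailing _Test.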

How can I convert text to Pascal case?

I have a variable name, say "WARD_VS_VITAL_SIGNS", and I want to convert it to Pascal case format: "WardVsVitalSigns"
WARD_VS_VITAL_SIGNS -> WardVsVitalSigns
How can I make this conversion?

You do not need a regular expression for that.
var yourString = "WARD_VS_VITAL_SIGNS".ToLower().Replace("_", " ");
TextInfo info = CultureInfo.CurrentCulture.TextInfo;
yourString = info.ToTitleCase(yourString).Replace(" ", string.Empty);
Console.WriteLine(yourString);

Here is my quick LINQ & regex solution to save someone's time:
using System;
using System.Linq;
using System.Text.RegularExpressions;
public string ToPascalCase(string original)
{
Regex invalidCharsRgx = new Regex("[^_a-zA-Z0-9]");
Regex whiteSpace = new Regex(@"(?<=\s)");
Regex startsWithLowerCaseChar = new Regex("^[a-z]");
Regex firstCharFollowedByUpperCasesOnly = new Regex("(?<=[A-Z])[A-Z0-9]+$");
Regex lowerCaseNextToNumber = new Regex("(?<=[0-9])[a-z]");
Regex upperCaseInside = new Regex("(?<=[A-Z])[A-Z]+?((?=[A-Z][a-z])|(?=[0-9]))");
// replace white spaces with underscore, then replace all invalid chars with empty string
var pascalCase = invalidCharsRgx.Replace(whiteSpace.Replace(original, "_"), string.Empty)
// split by underscores
.Split(new char[] { '_' }, StringSplitOptions.RemoveEmptyEntries)
// set first letter to uppercase
.Select(w => startsWithLowerCaseChar.Replace(w, m => m.Value.ToUpper()))
// replace second and all following upper case letters to lower if there is no next lower (ABC -> Abc)
.Select(w => firstCharFollowedByUpperCasesOnly.Replace(w, m => m.Value.ToLower()))
// set upper case the first lower case following a number (Ab9cd -> Ab9Cd)
.Select(w => lowerCaseNextToNumber.Replace(w, m => m.Value.ToUpper()))
// lower second and next upper case letters except the last if it follows by any lower (ABcDEf -> AbcDef)
.Select(w => upperCaseInside.Replace(w, m => m.Value.ToLower()));
return string.Concat(pascalCase);
}
Example output:
"WARD_VS_VITAL_SIGNS" "WardVsVitalSigns"
"Who am I?" "WhoAmI"
"I ate before you got here" "IAteBeforeYouGotHere"
"Hello|Who|Am|I?" "HelloWhoAmI"
"Live long and prosper" "LiveLongAndProsper"
"Lorem ipsum dolor..." "LoremIpsumDolor"
"CoolSP" "CoolSp"
"AB9CD" "Ab9Cd"
"CCCTrigger" "CccTrigger"
"CIRC" "Circ"
"ID_SOME" "IdSome"
"ID_SomeOther" "IdSomeOther"
"ID_SOMEOther" "IdSomeOther"
"CCC_SOME_2Phases" "CccSome2Phases"
"AlreadyGoodPascalCase" "AlreadyGoodPascalCase"
"999 999 99 9 " "999999999"
"1 2 3 " "123"
"1 AB cd EFDDD 8" "1AbCdEfddd8"
"INVALID VALUE AND _2THINGS" "InvalidValueAnd2Things"

First off, you are asking for title case and not camel-case, because in camel-case the first letter of the word is lowercase and your example shows you want the first letter to be uppercase.
At any rate, here is how you could achieve your desired result:
string textToChange = "WARD_VS_VITAL_SIGNS";
System.Text.StringBuilder resultBuilder = new System.Text.StringBuilder();
foreach(char c in textToChange)
{
// Replace anything, but letters and digits, with space
if(!Char.IsLetterOrDigit(c))
{
resultBuilder.Append(" ");
}
else
{
resultBuilder.Append(c);
}
}
string result = resultBuilder.ToString();
// Make result string all lowercase, because ToTitleCase does not change all uppercase correctly
result = result.ToLower();
// Creates a TextInfo based on the "en-US" culture.
TextInfo myTI = new CultureInfo("en-US",false).TextInfo;
result = myTI.ToTitleCase(result).Replace(" ", String.Empty);
Note: result is now WardVsVitalSigns.
If you did, in fact, want camel-case, then after all of the above, just use this helper function:
public string LowercaseFirst(string s)
{
if (string.IsNullOrEmpty(s))
{
return string.Empty;
}
char[] a = s.ToCharArray();
a[0] = char.ToLower(a[0]);
return new string(a);
}
So you could call it, like this:
result = LowercaseFirst(result);
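The comment in the code above about lower-casing before calling ToTitleCase can be checked with a short sketch. The class and method names below are hypothetical, and the invariant culture is an assumption chosen to keep the behaviour independent of the machine settings. TextInfo.ToTitleCase leaves words that are entirely uppercase unchanged, so the lowerFirst flag decides whether the result is Pascal case.

```csharp
using System;
using System.Globalization;

public static class TitleCaseDemo
{
    // Replaces underscores with spaces, optionally lower-cases the text,
    // title-cases it, then removes the spaces again.
    public static string ToPascal(string input, bool lowerFirst)
    {
        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;
        string spaced = input.Replace("_", " ");
        if (lowerFirst)
        {
            spaced = spaced.ToLowerInvariant();
        }
        return textInfo.ToTitleCase(spaced).Replace(" ", string.Empty);
    }
}
```

Calling TitleCaseDemo.ToPascal("WARD_VS_VITAL_SIGNS", true) gives WardVsVitalSigns, while passing false leaves the all-uppercase words as they are.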

Single semicolon solution:
public static string PascalCase(this string word)
{
return string.Join("" , word.Split('_')
.Select(w => w.Trim())
.Where(w => w.Length > 0)
.Select(w => w.Substring(0,1).ToUpper() + w.Substring(1).ToLower()));
}

Extension method for System.String with .NET Core compatible code by using System and System.Linq.
Does not modify the original string.
.NET Fiddle for the code below
using System;
using System.Linq;
public static class StringExtensions
{
/// <summary>
/// Converts a string to PascalCase
/// </summary>
/// <param name="str">String to convert</param>
public static string ToPascalCase(this string str){
// Replace all non-letter and non-digits with an underscore and lowercase the rest.
string sample = string.Join("", str?.Select(c => Char.IsLetterOrDigit(c) ? c.ToString().ToLower() : "_").ToArray());
// Split the resulting string by underscore
// Select first character, uppercase it and concatenate with the rest of the string
var arr = sample?
.Split(new []{'_'}, StringSplitOptions.RemoveEmptyEntries)
.Select(s => $"{s.Substring(0, 1).ToUpper()}{s.Substring(1)}");
// Join the resulting collection
sample = string.Join("", arr);
return sample;
}
}
public class Program
{
public static void Main()
{
Console.WriteLine("WARD_VS_VITAL_SIGNS".ToPascalCase()); // WardVsVitalSigns
Console.WriteLine("Who am I?".ToPascalCase()); // WhoAmI
Console.WriteLine("I ate before you got here".ToPascalCase()); // IAteBeforeYouGotHere
Console.WriteLine("Hello|Who|Am|I?".ToPascalCase()); // HelloWhoAmI
Console.WriteLine("Live long and prosper".ToPascalCase()); // LiveLongAndProsper
Console.WriteLine("Lorem ipsum dolor sit amet, consectetur adipiscing elit.".ToPascalCase()); // LoremIpsumDolorSitAmetConsecteturAdipiscingElit
}
}

var xs = "WARD_VS_VITAL_SIGNS".Split('_');
var q =
from x in xs
let first_char = char.ToUpper(x[0])
let rest_chars = new string(x.Skip(1).Select(c => char.ToLower(c)).ToArray())
select first_char + rest_chars;

Some answers lower-case the text first, and that step is needed: ToTitleCase leaves words that are entirely uppercase unchanged (it treats them as acronyms), so the input has to be lower-cased before the call:
var text = "WARD_VS_VITAL_SIGNS".ToLower().Replace("_", " ");
TextInfo textInfo = CultureInfo.CurrentCulture.TextInfo;
text = textInfo.ToTitleCase(text).Replace(" ", string.Empty);
Console.WriteLine(text);

You can use this:
public static string ConvertToPascal(string underScoreString)
{
string[] words = underScoreString.Split('_');
StringBuilder returnStr = new StringBuilder();
foreach (string wrd in words)
{
returnStr.Append(wrd.Substring(0, 1).ToUpper());
returnStr.Append(wrd.Substring(1).ToLower());
}
return returnStr.ToString();
}

This answer relies on Unicode categories to skip connector characters such as _ while processing the text. In regex syntax, \p matches a Unicode category, and {Pc} selects the "Punctuation, Connector" category, giving \p{Pc}. The pattern is paired with a MatchEvaluator, which is invoked for each match.
During matching we capture each word and skip the connector punctuation, so the replace operation removes the connector characters. For each matched word, we lower-case it and then upper-case only its first character as the replacement value:
public static class StringExtensions
{
public static string ToPascalCase(this string initial)
=> Regex.Replace(initial,
// (Match any non punctuation) & then ignore any punctuation
@"([^\p{Pc}]+)[\p{Pc}]*",
new MatchEvaluator(mtch =>
{
var word = mtch.Groups[1].Value.ToLower();
return $"{Char.ToUpper(word[0])}{word.Substring(1)}";
}));
}
Usage:
"TOO_MUCH_BABY".ToPascalCase(); // TooMuchBaby
"HELLO|ITS|ME".ToPascalCase(); // Hello|its|me ('|' is not in the Pc category, so it is not treated as a separator)
See Word Character in Character Classes in Regular Expressions
Pc Punctuation, Connector. This category includes ten characters, the
most commonly used of which is the LOWLINE character (_), u+005F.

If you want to convert an arbitrarily formatted string to Pascal case, you can do this:
public static string ToPascalCase(this string original)
{
string newString = string.Empty;
bool makeNextCharacterUpper = false;
for (int index = 0; index < original.Length; index++)
{
char c = original[index];
if(index == 0)
newString += $"{char.ToUpper(c)}";
else if (makeNextCharacterUpper)
{
newString += $"{char.ToUpper(c)}";
makeNextCharacterUpper = false;
}
else if (char.IsUpper(c))
newString += $" {c}";
else if (char.IsLower(c) || char.IsNumber(c))
newString += c;
else if (char.IsNumber(c))
newString += $"{c}";
else
{
makeNextCharacterUpper = true;
newString += ' ';
}
}
return newString.TrimStart().Replace(" ", "");
}
Tested with strings
I|Can|Get|A|String
ICan_GetAString
i-can-get-a-string
i_can_get_a_string
I Can Get A String
ICanGetAString

I found this gist useful after adding a ToLower() to it.
"WARD_VS_VITAL_SIGNS"
.ToLower()
.Split(new [] {"_"}, StringSplitOptions.RemoveEmptyEntries)
.Select(s => char.ToUpperInvariant(s[0]) + s.Substring(1, s.Length - 1))
.Aggregate(string.Empty, (s1, s2) => s1 + s2)

Find a substring, replace a substring according the case

What's the easiest and fastest way to find a sub-string(template) in a string and replace it with something else following the template's letter case (if all lower case - replace with lowercase, if all upper case - replace with uppercase, if begins with uppercase and so on...)
so if the substring is in curly braces
"{template}" becomes "replaced content"
"{TEMPLATE}" becomes "REPLACED CONTENT" and
"{Template}" becomes "Replaced content" but
"{tEMPLATE}" becomes "rEPLACED CONTENT"

Well, you could use regular expressions and a match evaluator callback like this:
var regex = new Regex(@"\{(?<value>.*?)\}",
RegexOptions.CultureInvariant | RegexOptions.ExplicitCapture);
// text is the input string that contains the placeholders
string replacedText = regex.Replace(text,
new MatchEvaluator(this.EvaluateMatchCallback));
And your evaluator callback would do something like this:
private string EvaluateMatchCallback(Match match) {
string templateInsert = match.Groups["value"].Value;
// or whatever
string replacedText = GetReplacementTextBasedOnTemplateValue(templateInsert);
return replacedText;
}
Once you get the regex match value you can just do a case-sensitive comparison and return the correct replacement value.
EDIT: I assumed you were trying to find the placeholders in a block of text rather than worrying about the casing per se. If your pattern is always valid, you can just check the first two characters of the placeholder itself, which tell you the casing to use in the replacement:
string foo = "teMPLATE";
if (char.IsLower(foo[0])) {
if (char.IsLower(foo[1])) {
// first lower and second lower
}
else {
// first lower and second upper
}
}
else {
if (char.IsLower(foo[1])) {
// first upper and second lower
}
else {
// first upper and second upper
}
}
I would still use a regular expression to match the replacement placeholder, but that's just me.
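Putting the two ideas from this answer together (a regex with a match evaluator, plus a check of the first two characters) might look like the sketch below. The names CaseFollowingReplacer, ApplyCase and Replace are hypothetical, and the \w+ placeholder pattern and the use of a single replacement text for every placeholder are assumptions made to keep the example short.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaseFollowingReplacer
{
    // Applies the casing of the placeholder name to the replacement text.
    // The first letter decides the first character; the second letter decides the rest.
    public static string ApplyCase(string template, string replacement)
    {
        if (template.Length == 0 || replacement.Length == 0) return replacement;
        bool firstUpper = char.IsUpper(template[0]);
        // For a one-letter placeholder, reuse the first letter's case for the rest.
        bool restUpper = template.Length > 1 ? char.IsUpper(template[1]) : firstUpper;

        string first = replacement.Substring(0, 1);
        string rest = replacement.Substring(1);
        first = firstUpper ? first.ToUpperInvariant() : first.ToLowerInvariant();
        rest = restUpper ? rest.ToUpperInvariant() : rest.ToLowerInvariant();
        return first + rest;
    }

    // Replaces every {placeholder} in the input with the replacement text,
    // cased according to the placeholder that was matched.
    public static string Replace(string input, string replacement)
    {
        return Regex.Replace(input, @"\{(?<value>\w+)\}",
            m => ApplyCase(m.Groups["value"].Value, replacement));
    }
}
```

For the four cases in the question, CaseFollowingReplacer.Replace("{template} {TEMPLATE} {Template} {tEMPLATE}", "replaced content") yields "replaced content REPLACED CONTENT Replaced content rEPLACED CONTENT".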

You can check the case of the first two letters of the placeholder and choose one of the four case transforming strategies for the inserted text.
public static string Convert(string input, bool firstIsUpper, bool restIsUpper)
{
string firstLetter = input.Substring(0, 1);
firstLetter = firstIsUpper ? firstLetter.ToUpper() : firstLetter.ToLower();
string rest = input.Substring(1);
rest = restIsUpper ? rest.ToUpper() : rest.ToLower();
return firstLetter + rest;
}
public static string Replace(string input, Dictionary<string, string> valueMap)
{
var ms = Regex.Matches(input, "{(\\w+?)}");
int i = 0;
var sb = new StringBuilder();
for (int j = 0; j < ms.Count; j++)
{
string pattern = ms[j].Groups[1].Value;
string key = pattern.ToLower();
bool firstIsUpper = char.IsUpper(pattern[0]);
bool restIsUpper = char.IsUpper(pattern[1]);
sb.Append(input.Substring(i, ms[j].Index - i));
sb.Append(Convert(valueMap[key], firstIsUpper, restIsUpper));
i = ms[j].Index + ms[j].Length;
}
sb.Append(input.Substring(i)); // append the text that follows the last placeholder
return sb.ToString();
}
public static void DoStuff()
{
Console.WriteLine(Replace("--- {aAA} --- {AAA} --- {Aaa}", new Dictionary<string,string> {{"aaa", "replacement"}}));
}

Ended up doing that:
public static string ReplaceWithTemplate(this string original, string pattern, string replacement)
{
var template = Regex.Match(original, pattern, RegexOptions.IgnoreCase).Value.Remove(0, 1);
template = template.Remove(template.Length - 1);
var chars = new List<char>();
var isLetter = false;
for (int i = 0; i < replacement.Length; i++)
{
if (i < (template.Length)) isLetter = Char.IsUpper(template[i]);
chars.Add(Convert.ToChar(
isLetter ? Char.ToUpper(replacement[i])
: Char.ToLower(replacement[i])));
}
return new string(chars.ToArray());
}

Replace placeholders in order

I have a part of a URL like this:
/home/{value1}/something/{anotherValue}
Now I want to replace everything between the braces with values from a string array.
I tried this RegEx pattern: \{[a-zA-Z_]\} but it doesn't work.
Later (in C#) I want to replace the first match with the first value of the array, and the second match with the second value.
Update: The /'s can't be used as separators. Only the placeholders {...} should be replaced.
Example: /home/before{value1}/and/{anotherValue}
String array: {"Tag", "1"}
Result: /home/beforeTag/and/1
I hoped it could work like this:
string input = @"/home/before{value1}/and/{anotherValue}";
string pattern = @"\{[a-zA-Z_]\}";
string[] values = {"Tag", "1"};
MatchCollection mc = Regex.Match(input, pattern);
for(int i, ...)
{
mc.Replace(values[i];
}
string result = mc.GetResult;
Edit:
Thank you Devendra D. Chavan and ipr101,
both solutions are great!

You can try this code fragment,
// Begin with '{' followed by any number of word like characters and then end with '}'
var pattern = @"{\w*}";
var regex = new Regex(pattern);
var replacementArray = new [] {"abc", "cde", "def"};
var sourceString = @"/home/{value1}/something/{anotherValue}";
var matchCollection = regex.Matches(sourceString);
for (int i = 0; i < matchCollection.Count && i < replacementArray.Length; i++)
{
sourceString = sourceString.Replace(matchCollection[i].Value, replacementArray[i]);
}

[a-zA-Z_] describes a character class, which matches a single character. To match whole words, add * at the end (any number of characters within a-zA-Z_).
Then, to have 'value1' captured, you'll need to add digit support: [a-zA-Z0-9_]*, which can be abbreviated to \w*.
So try this one: {\w*}
But for replacing in C#, string.Split('/') might be easier, as Fredrik proposed.

You could use a delegate, something like this -
string[] strings = {"dog", "cat"};
int counter = -1;
string input = @"/home/{value1}/something/{anotherValue}";
Regex reg = new Regex(@"\{([a-zA-Z0-9]*)\}");
string result = reg.Replace(input, delegate(Match m) {
counter++;
return strings[counter]; // return the bare value so the braces are removed, as in the question's expected result
});

My two cents:
// input string
string txt = "/home/{value1}/something/{anotherValue}";
// template replacements
string[] str_array = { "one", "two" };
// regex to match a template
Regex regex = new Regex("{[^}]*}");
// replace the first template occurrence for each element in array
foreach (string s in str_array)
{
txt = regex.Replace(txt, s, 1);
}
Console.Write(txt);
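A variant of the delegate approach that also guards against running out of values could be sketched as follows. The names OrderedPlaceholders and Fill are hypothetical, and leaving surplus placeholders unchanged is an assumption about the desired behaviour, not something the question specifies.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OrderedPlaceholders
{
    // Replaces the n-th {placeholder} with values[n]; placeholders beyond the
    // end of the array are left unchanged.
    public static string Fill(string input, string[] values)
    {
        int index = 0;
        return Regex.Replace(input, @"\{\w+\}", m =>
        {
            string result = index < values.Length ? values[index] : m.Value;
            index++;
            return result;
        });
    }
}
```

With the example from the question, OrderedPlaceholders.Fill("/home/before{value1}/and/{anotherValue}", new[] { "Tag", "1" }) gives /home/beforeTag/and/1.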

String cleaning and formatting

I have a URL formatter in my application but the problem is that the customer wants to be able to enter special characters like:
: | / - \ “ ‘ & * # @
I have a string:
string myCrazyString = ":|/-\\“‘&*#@";
I have a function where another string is being passed:
public void CleanMyString(string myStr)
{
}
How can I compare the string being passed, "myStr", to "myCrazyString" and remove from "myStr" any characters that appear in "myCrazyString"?
So if I pass to my function:
"this ' is a" cra#zy: me|ssage/ an-d I& want#to clea*n it"
It should return:
"this is a crazy message and I want to clean it"
How can I do this in my CleanMyString function?

Use a regular expression for that, like:
pattern = @"(:|\||\/|\-|\\|\“|\‘|\&|\*|\#|\@)";
System.Text.RegularExpressions.Regex.Replace(inputString, pattern, string.Empty);
Separate each character you want to remove with |.
To remove a special character such as | itself, escape it with \: \| makes the regex treat | as a normal character.
Test:
inputString = @"H\I t&he|r#e!";
//output is: HI there!

A solution without regular expressions, just for completeness:
static string clear(string input)
{
string charsToBeCleared = ":|/-\\“‘&*#@"; // the backslash must be doubled in a regular string literal
string output = "";
foreach (char c in input)
{
if (charsToBeCleared.IndexOf(c) < 0)
{
output += c;
}
}
return output;
}

You can use Regex as others mentioned, or code like this:
char[] myCrazyChars = "\"\':|/-\\“‘&*#@".ToCharArray();
string myCrazyString = "this ' is a\" cra#zy: me|ssage/ an-d I& want#to clea*n it";
string[] splittedCrazyString = myCrazyString.Split(myCrazyChars);
string notCrazyStringAtAll = string.Join("", splittedCrazyString);

Try using a Regular Expression.

Here's a fairly straightforward way to do it. Split the string on all of the characters in your "crazy string" and then join the pieces back together without the bad characters.
string myCrazyString = @":|/-\“‘&*#@";
string str = @"this ' is a"" cra#zy: me|ssage/ an-d I& want#to clea*n it";
string[] arr = str.Split(myCrazyString.ToCharArray(), StringSplitOptions.None);
str = string.Join(string.Empty, arr);

Another possible solution:
namespace RemoveChars
{
class Program
{
static string str = @"this ' is a\“ cra#zy: me|ssage/ an-d I& want#to clea*n it";
static void Main(string[] args)
{
CleanMyString(str);
}
public static void CleanMyString(string myStr)
{
string myCrazyString = @":|/-\“‘&*#@";
var result = "";
foreach (char c in myStr)
{
var t = true; // t will remain true if c is not a crazy char
foreach (char ch in myCrazyString)
if (c == ch)
{
t = false;
break;
}
if (t)
result += c;
}
System.Console.WriteLine(result); // print the cleaned string, which was otherwise discarded
}
}
}

You could use an if statement to check whether a character is present and report it:
if (myCrazyString.Contains("#"))
{
Console.WriteLine("This string is out of control!");
}
Regex is also a good idea (maybe better).

Try this:
1. Define a StringBuilder.
2. Iterate through the characters of the string to be cleaned.
3. Append each wanted character to the StringBuilder and skip the special characters using simple if conditions.
4. Return the contents of the StringBuilder.
Or
Try using a regular expression.
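The StringBuilder steps listed above can be sketched as follows. The names StringCleaner and Clean are hypothetical, and passing the unwanted characters as a second parameter is an assumption made so the sketch does not depend on a particular character list.

```csharp
using System;
using System.Text;

public static class StringCleaner
{
    // Copies every character of the input that is not listed in charsToRemove.
    public static string Clean(string input, string charsToRemove)
    {
        var builder = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (charsToRemove.IndexOf(c) < 0)
            {
                builder.Append(c);
            }
        }
        return builder.ToString();
    }
}
```

For example, StringCleaner.Clean("cra#zy: me|ssage", "#:|") returns "crazy message". Appending to a StringBuilder avoids creating a new string for every kept character, which the += versions above do.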
