Regex to remove multiple consecutive commas and replace with single comma - c#

Given the input string "Test,,test,,,test,test"
and the following C# snippet, I would expect each run of consecutive commas to be replaced by a single comma, resulting in
"Test,test,test,test"
private static string TruncateCommas(string input)
{
    return Regex.Replace(input, @",+", ",");
}
Code was pinched from this answer...
C# replace all occurrences of a character with just a character
But what I am seeing is "Test,,test,,,test,test" as the output from this function.
Do I need to escape the comma in the regex? Or should this regex work as it is?

Do I need to escape the comma in the regex?
No.
Or should this regex be working?
Yes.
Please construct your test the following way:
void Main()
{
    string s = "Test,,test,,,test,test";
    string result = TruncateCommas(s);
    Console.WriteLine(result);
}
Output
Test,test,test,test

Related

Regex to match only numbers , no apostrophes

I want to match only numbers in the following string
String : "40’000"
Match : "40000"
Basically, I am trying to ignore the apostrophe.
I am using C#, in case it matters.
I can't use any other C# methods; I need to use only Regex.
Replace like this; it removes every character except digits:
string input = "40’000";
string result = Regex.Replace(input, @"[^\d]", "");
Since you said "I just want to pick up numbers only", how about doing it without regex?
var s = "40’000";
var result = new string(s.Where(char.IsDigit).ToArray());
Console.WriteLine(result); // 40000
I suggest using a regex to find the non-digit characters rather than the digits, and then replacing them with an empty string.
A simple (?=\S)\D should be enough; the (?=\S) lookahead keeps \D from matching whitespace, such as a space at the end of the number.
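As a quick check of the lookahead idea, the sketch below compares plain \D with (?=\S)\D on an input that ends with a space. The class name, the method names, and the sample input are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitFilterDemo
{
    // Removes every non-digit character, whitespace included.
    public static string StripNonDigits(string input) =>
        Regex.Replace(input, @"\D", "");

    // Removes non-digit characters but leaves whitespace alone:
    // the lookahead (?=\S) only lets \D match a non-whitespace character.
    public static string StripNonDigitsKeepWhitespace(string input) =>
        Regex.Replace(input, @"(?=\S)\D", "");

    public static void Main()
    {
        string input = "40’000 "; // note the trailing space
        Console.WriteLine("[" + StripNonDigits(input) + "]");               // [40000]
        Console.WriteLine("[" + StripNonDigitsKeepWhitespace(input) + "]"); // [40000 ]
    }
}
```

The brackets in the output only make the surviving trailing space visible.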
Replace like this; it removes every character except digits and periods:
string input = "40’000";
string result = Regex.Replace(input, @"[^\d.]", "");
Don't complicate your life, use Regex.Replace
string s = "40'000";
string replaced = Regex.Replace(s, @"\D", "");

Escape parts of string with a character in c# using regex

I have a string such as this:
/one/two/three-four/five six seven/eight/nine ten eleven-twelve
I need to first replace dashes with spaces, and then be able to escape any grouping of words that have a space between them with a "#" symbol, so the above string should be:
/one/two/#three four#/#five six seven#/eight/#nine ten eleven twelve#
I have the following extension method which works great for two words, but how can I make it work for any number of words.
public static string QueryEscape(this string str)
{
    str = str.Replace("-", " ");
    return Regex.Replace(str, @"(\w*) (\w*)", new MatchEvaluator(EscapeMatch));
}
private static string EscapeMatch(Match match)
{
    return string.Format("#{0}#", match.Value);
}
So I guess I really need help with the proper regex, one that takes into account that:
1. There could be any number of spaces.
2. There may or may not be a trailing slash ("/").
3. Words are grouped between slashes, with the exception of #2 above.
4. Dashes are illegal and need to be replaced with spaces.
Thank you in advance for your support.
This should work for you:
public static string QueryEscape(this string str)
{
    return Regex.Replace(str.Replace("-", " "), @"[^/]*(\s[^/]*)+", "#$&#");
}
Basically, the idea is to match each span of non-slash text that contains at least one whitespace character, and then add the pound signs around the match.
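To see the pattern at work on the string from the question, here is a small self-contained sketch. The wrapper class name QueryEscapeDemo and the Main method are added only for illustration; the regex and replacement are the ones from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QueryEscapeDemo
{
    public static string QueryEscape(this string str)
    {
        // Dashes become spaces first, then every slash-free span that
        // contains at least one whitespace character is wrapped in '#'.
        // "$&" in the replacement stands for the whole match.
        return Regex.Replace(str.Replace("-", " "), @"[^/]*(\s[^/]*)+", "#$&#");
    }

    public static void Main()
    {
        string path = "/one/two/three-four/five six seven/eight/nine ten eleven-twelve";
        Console.WriteLine(path.QueryEscape());
        // /one/two/#three four#/#five six seven#/eight/#nine ten eleven twelve#
    }
}
```

Single-word segments such as "one" or "eight" never match, because the group (\s[^/]*)+ requires at least one whitespace character before the next slash.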

Regex.Replace refuses to replace with newline

Hi I wrote a very simple C# program to use the C# Regex from command line instead of relying on the MS Word search and replace. The problem is that even though the Regex recognizes \r and \n fine, when I try to replace the string with either of these, it seems to replace it with the escaped character instead of the character itself.
[STAThread]
static void Main(string[] args)
{
    string initial = Clipboard.GetText();
    Console.Write("Find: ");
    string find = Console.ReadLine();
    Console.Write("Replace: ");
    string replace = Console.ReadLine();
    string final = Regex.Replace(initial, find, replace);
    Clipboard.SetText(final);
}
For example, my input string from the clipboard would be "Woodcock, american" (with a carriage return-newline at the end). The pattern would be @",.+\r", which matches fine, and the replacement string would be @"\r\n". This produces the string "Woodcock\r\n" (with the literal letters r and n, just to be clear). What am I doing wrong?
edit: Anirudh's answer solved my problem partially and I updated the code accordingly. However, it seems that when I input "\r\n" to the ReadLine it also escapes somehow, whereas if I write string replace = "\r\n" it actually replaces the string with a carriage return-newline. Link to new question : C# ReadLine escapes carriage return/newline?
This is because you are using a verbatim string, i.e. @"", which does not process escape sequences: \r\n is treated as the literal characters backslash, r, backslash, n rather than as a carriage return and newline.
The replacement string should be "\r\n", NOT @"\r\n".
To solve your other problem
Output=Regex.Replace(input,"\\r?\\n","\r\n");
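The follow-up in the question's edit (text typed at Console.ReadLine arrives as a literal backslash followed by a letter) is not covered by the answer above. One possible approach, offered here only as a sketch, is to pass the typed replacement through Regex.Unescape before using it. The class and method names below are hypothetical, and the typed input is simulated with a verbatim string instead of a real ReadLine call.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NewlineReplaceDemo
{
    // Replaces matches of 'find' with 'typedReplacement', first turning escape
    // sequences typed as backslash + letter (such as \r and \n) into real characters.
    public static string ReplaceWithTypedEscapes(string input, string find, string typedReplacement)
    {
        string replacement = Regex.Unescape(typedReplacement);
        return Regex.Replace(input, find, replacement);
    }

    public static void Main()
    {
        string typed = @"\r\n"; // stands in for what ReadLine returns when the user types \r\n
        Console.WriteLine(typed.Length);                 // 4
        Console.WriteLine(Regex.Unescape(typed).Length); // 2

        string result = ReplaceWithTypedEscapes("Woodcock, american", @",.+", typed);
        Console.WriteLine(result == "Woodcock\r\n");     // True
    }
}
```

Two caveats with this sketch: Regex.Unescape can throw on malformed input such as a lone trailing backslash, and a "$" in the replacement is still interpreted by Regex.Replace as a substitution marker.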

Need RegEx to remove all alphabets from string

I need a regex to remove all letters (A-Z and a-z) from a string; everything else, including any kind of special character, should remain intact. I tried @"[^\d]", but it returns only the digits in the string.
String : asd!## $%dfdf4545D jasjkd #(*)jdjd56
desired output : !## $%4545 #(*)56
Just replace all undesired characters with an empty string:
string filtered = Regex.Replace(input, "[A-Za-z]", "");
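Try the following regular expression:
[^a-zA-Z]
This matches any character that is not an English letter, so replacing its matches with an empty string would keep only the letters; to remove the letters, use the non-negated class [a-zA-Z].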
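The difference between the two character classes is easiest to see side by side. The following sketch runs both on the sample string from the question; the class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LetterFilterDemo
{
    // Deletes ASCII letters and keeps everything else.
    public static string RemoveLetters(string input) =>
        Regex.Replace(input, "[A-Za-z]", "");

    // Deletes everything that is NOT an ASCII letter (the negated class).
    public static string KeepOnlyLetters(string input) =>
        Regex.Replace(input, "[^a-zA-Z]", "");

    public static void Main()
    {
        string input = "asd!## $%dfdf4545D jasjkd #(*)jdjd56";
        Console.WriteLine(RemoveLetters(input));   // !## $%4545  #(*)56
        Console.WriteLine(KeepOnlyLetters(input)); // asddfdfDjasjkdjdjd
    }
}
```

Note that the first result contains two consecutive spaces after 4545: the spaces on both sides of the removed word "jasjkd" are kept, since only letters are deleted.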

C# string to sentence

Is there a way to convert string without spaces to a proper sentence??
E.g. "WhoAmI" needs to be converted to "Who Am I"
A regex replacement would do this, if you're just talking about inserting a space before each capital letter:
using System;
using System.Text.RegularExpressions;
class Test
{
    static void Main()
    {
        var input = "WhoAmI";
        var output = Regex.Replace(input, @"\p{Lu}", " $0").TrimStart();
        Console.WriteLine(output);
    }
}
However, I suspect there will be significant corner cases. Note that the above uses \p{Lu} instead of just [A-Z] to cope with non-ASCII capital letters; you may find A-Z simpler if you only need to deal with ASCII. The TrimStart() call is to remove the leading space you'd get otherwise.
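One such corner case is a run of capitals, for example an acronym. The sketch below shows what the pattern above does with it, next to a lookaround-based variant that inserts a space only where a lowercase letter is followed by an uppercase one. That variant is an assumption added here for comparison, not part of the answer, and the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SentenceDemo
{
    // The approach from the answer: a space before every uppercase letter.
    public static string SpaceBeforeUpper(string input) =>
        Regex.Replace(input, @"\p{Lu}", " $0").TrimStart();

    // Illustrative alternative: a zero-width match between a lowercase
    // letter and the uppercase letter that follows it, replaced by a space.
    public static string SpaceAtCaseBoundary(string input) =>
        Regex.Replace(input, @"(?<=\p{Ll})(?=\p{Lu})", " ");

    public static void Main()
    {
        Console.WriteLine(SpaceBeforeUpper("WhoAmI"));            // Who Am I
        Console.WriteLine(SpaceBeforeUpper("HTMLParser"));        // H T M L Parser
        Console.WriteLine(SpaceAtCaseBoundary("WhoAmI"));         // Who Am I
        Console.WriteLine(SpaceAtCaseBoundary("parseHTMLPage"));  // parse HTMLPage
    }
}
```

Neither version splits "HTMLPage" into "HTML Page"; deciding where an acronym ends needs extra rules that depend on the input you expect.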
If every word in the string is starting with uppercase you may just convert each part that is starting with uppercase to a space separated string.
You can use LINQ
string words = "WhoAmI";
string sentence = String.Concat(words.Select(letter => Char.IsUpper(letter) ? " " + letter : letter.ToString()))
    .TrimStart();
