How to validate comma separated string with space using Regex - c#

I need to validate a comma-separated string using regex, but I have two problems.
My sample input is as follows:
ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7 - Valid
ERW SW1,ERW SW2,ASA,S4,ERW SW5,ERW SW6,ERWSW7 - Valid (a space between words should be valid)
ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7, - Invalid - Comma at end
,ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7 - Invalid - Comma at beginning
ERWSW1,ERWSW2,,ASA,S4,ERWSW5,ERWSW6,ERWSW7 - Invalid - No value between the 2nd and 3rd commas
I wrote the following regex to validate the input:
^([a-z A-Z0-9 !@#$%?=*&-]+,)*[a-z A-Z0-9 !@#$%?=*&\s-]+$
The first problem is that items consisting only of spaces between the commas are accepted as valid.
E.g.: ERWSW1, , ,ERWSW2,ASA,S4
I need to avoid that. How can I do it?
My second problem is that I also need to remove the extra spaces from the string, and I need a function for that (this is not related to the regex above).
Input: ERWSW1 , ERW SW2,ASA ,S4 ,ERW SW5,ERWSW6,ERWSW7
I need the following output:
ERWSW1,ERW SW2,ASA,S4,ERW SW5,ERWSW6,ERWSW7
Update:
For my second problem, I wrote the following code:
string str = " ERW SW1 , ERW SW2 , ASA";
var ss = Regex.Replace(str, " *, *", ",");
But it's not removing the spaces properly. I need this output:
ERW SW1,ERW SW2,ASA

You could use a character class specifying what you would allow to match. For the spaces between the words you could use a repeating group preceded with a space.
^[\w!@#$%?=*&.-]+(?: [\w!@#$%?=*&.-]+)*(?:,[\w!@#$%?=*&.-]+(?: [\w!@#$%?=*&.-]+)*)*$
To remove the spaces around the commas, you could extend the pattern to allow optional spaces around each comma and at both ends of the string, then replace every match of " *, *" with a single comma and trim the result.
^ *[\w!@#$%?=*&.-]+(?: [\w!@#$%?=*&.-]+)*(?: *, *[\w!@#$%?=*&.-]+(?: [\w!@#$%?=*&.-]+)*)* *$
Code example
using System;
using System.Text.RegularExpressions;

string[] strings = {
    "ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7",
    "ERW SW1,ERW SW2,ASA,S4,ERW SW5,ERW SW6,ERWSW7",
    "ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7,",
    ",ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7",
    "ERWSW1,ERWSW2,,ASA,S4,ERWSW5,ERWSW6,ERWSW7",
    "ERWSW1 , ERW SW2,ASA ,S4 ,ERW SW5,ERWSW6,ERWSW7",
    "ERW*SW1,ERW-SW2,A.SA",
    " ERWSW1 , ERWSW2 ,ASA,S4,ERWSW5 "
};
// Spaces are optional around each comma and at both ends; items may contain single spaces.
string pattern = @"^ *[\w!@#$%?=*&.-]+(?: [\w!@#$%?=*&.-]+)*(?: *, *[\w!@#$%?=*&.-]+(?: [\w!@#$%?=*&.-]+)*)* *$";
foreach (string s in strings) {
    if (Regex.IsMatch(s, pattern)) {
        // Collapse the spaces around each comma, then trim both ends.
        Console.WriteLine(Regex.Replace(s, " *, *", ",").Trim());
    }
}
Output
ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7
ERW SW1,ERW SW2,ASA,S4,ERW SW5,ERW SW6,ERWSW7
ERWSW1,ERW SW2,ASA,S4,ERW SW5,ERWSW6,ERWSW7
ERW*SW1,ERW-SW2,A.SA
ERWSW1,ERWSW2,ASA,S4,ERWSW5
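The two steps above (validate, then clean up) can be wrapped in one helper. This is only a minimal sketch: the names CsvListNormalizer and TryNormalize are made up for illustration, and the character class assumes that @ is among the allowed symbols. The call to Trim() is what handles the leading space that the question's own Regex.Replace attempt left behind.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CsvListNormalizer
{
    // One item: runs of allowed characters separated by single spaces.
    // Assumption: '@' belongs to the allowed symbols.
    private const string Item = @"[\w!@#$%?=*&.-]+(?: [\w!@#$%?=*&.-]+)*";

    // Whole list: items separated by commas, with optional spaces
    // around each comma and at both ends of the string.
    private static readonly Regex ListPattern =
        new Regex("^ *" + Item + "(?: *, *" + Item + ")* *$");

    // Returns true and the cleaned-up list when the input is valid,
    // otherwise false and null.
    public static bool TryNormalize(string input, out string normalized)
    {
        if (!ListPattern.IsMatch(input))
        {
            normalized = null;
            return false;
        }
        // Collapse the spaces around each comma, then trim both ends.
        normalized = Regex.Replace(input, " *, *", ",").Trim();
        return true;
    }

    public static void Main()
    {
        string[] samples = { " ERW SW1 , ERW SW2 , ASA", "ERWSW1, , ,ERWSW2,ASA,S4" };
        foreach (string sample in samples)
        {
            string cleaned;
            Console.WriteLine(TryNormalize(sample, out cleaned)
                ? "valid:   " + cleaned
                : "invalid: " + sample);
        }
    }
}
```

The first sample is accepted and printed as ERW SW1,ERW SW2,ASA; the second is rejected because an item between two commas contains only a space.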

Related

Validating a string with comma-separated alphanumeric words only OR just spaces

I am working on a regex that allows alphanumeric characters separated by comma. Or just spaces. Without a comma as the first character.
What I am trying to do:
"101010101sadadsasd,120120310231023a,adasdads1231,asdasdasda1231"
" " < -- case of just spaces of any number
What I am trying to avoid:
"&###$,asdasdads,asdsd#!#"
",aasdas,asdasd"
" asda asdsad asdasd ,asdasd"
What's acceptable but not wanted: (can live with it)
",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"
"asdasdasdas,asdasd123123,adsasd23123," <-- I can just trim(",")
Below is the code from my implementation, where IsMatch returns true even though the value consists of symbols rather than alphanumeric characters:
bool result = true;
Regex regx = new Regex(@"(^[a-zA-Z0-9]+[a-zA-Z0-9,-,]*$| *)");
if (regx.IsMatch(rowUpdate.ConNoteNumber))
{
result = false;
}
return result;
You can use
^(?:[a-zA-Z0-9]+(?:,[a-zA-Z0-9]+)*|\s*)$
Details:
^ - start of a string
(?:
[a-zA-Z0-9]+(?:,[a-zA-Z0-9]+)* - one or more letters/digits, and then zero or more occurrences of a comma and one or more letters/digits
| - or
\s* - zero or more whitespaces
) - end of the group
$ - end of string (if the regex is executed on the server side, $ can be replaced with \z, too).
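As a rough check of the pattern explained above, here is a small sketch; the class and method names (ConNoteValidator, IsAllowed) are hypothetical. It also hints at why the original pattern always matched: its second alternative, " *", is not anchored, so it can match an empty string anywhere in the input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConNoteValidator
{
    // Both alternatives sit inside one group, so ^ and $ apply to each of them.
    private static readonly Regex Allowed =
        new Regex(@"^(?:[a-zA-Z0-9]+(?:,[a-zA-Z0-9]+)*|\s*)$");

    public static bool IsAllowed(string value)
    {
        return Allowed.IsMatch(value);
    }

    public static void Main()
    {
        string[] samples = { "abc123,def456", "   ", ",abc,def", "abc,def,", "&@#,abc" };
        foreach (string s in samples)
        {
            Console.WriteLine("\"" + s + "\" -> " + IsAllowed(s));
        }
    }
}
```

Only the first two samples are accepted. Note that this pattern also rejects a trailing comma, which the question was willing to tolerate.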

Regex.Split string into substrings by a delimiter while preserving whitespace

I created a Regex to split a string by a delimiter ($), but it's not working the way I want.
var str = "sfdd fgjhk fguh $turn.bak.orm $hahr*____f";
var list = Regex.Split(str, @"(\$\w+)").Where(x => !string.IsNullOrEmpty(x)).ToList();
foreach (var item in list)
{
Console.WriteLine(item);
}
Output:
"sfdd fgjhk fguh "
"$turn"
".bak.orm "
"$hahr"
"*____f"
The problem is \w+ is not matching any periods or stars. Here's the output I want:
"sfdd fgjhk fguh "
"$turn.bak.orm"
" "
"$hahr*____f"
Essentially, I want to split a string by $ and make sure $ appears at the beginning of a substring and nowhere else (it's okay for a substring to be $ only). I also want to make sure whitespace characters are preserved as in the first substring, but any match should not contain whitespace as in the second and fourth cases. I don't care for case sensitivity.
It appears you want to split with a pattern that starts with a dollar and then has any 0 or more chars other than whitespace and dollar chars:
var list = Regex.Split(str, @"(\$[^\s$]*)")
.Where(x => !string.IsNullOrEmpty(x))
.ToList();
Details
( - start of a capturing group (so that Regex.Split keeps the matched delimiters in the resulting array)
\$ - a dollar sign
[^\s$]* - a negated character class matching 0 or more chars other than whitespace (\s) and dollar symbols
) - end of the capturing group.
To include a second delimiter, you may use @"([€$][^\s€$]*)".
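Put together as a runnable sketch (the DollarTokenizer and Tokenize names are only for illustration), the capturing-group split applied to the sample string from the question looks like this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class DollarTokenizer
{
    public static List<string> Tokenize(string input)
    {
        // The capturing group makes Regex.Split return the "$..." tokens
        // together with the text between them.
        return Regex.Split(input, @"(\$[^\s$]*)")
                    .Where(part => !string.IsNullOrEmpty(part))
                    .ToList();
    }

    public static void Main()
    {
        foreach (string part in Tokenize("sfdd fgjhk fguh $turn.bak.orm $hahr*____f"))
        {
            Console.WriteLine("\"" + part + "\"");
        }
    }
}
```

For this input the list contains four entries: "sfdd fgjhk fguh ", "$turn.bak.orm", " " and "$hahr*____f".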

C# Regex split() without removing the split condition character

I am splitting a string with regex using its Split() method.
var splitRegex = new Regex(@"[\s|{]");
string input = "/Tests/ShowMessage { 'Text': 'foo' }";
//second version of the input:
//string input = "/Tests/ShowMessage{ 'Text': 'foo' }";
string[] splittedText = splitRegex.Split(input, 2);
The string is just a sample of the input. The input comes in two forms: with a space before the { or without one. I want to split the input at the { bracket in order to get the following result:
/Tests/ShowMessage
{ 'Text': 'foo' }
If there is a space, the string gets split there (the space is removed) and I get my desired result. But if there isn't a space, the string is split on the {, so the { is removed, which I don't want. How can I use Regex.Split() without removing the split condition character?
The square brackets create a character class, which matches exactly one of the characters inside it (the | in there is a literal pipe, not an alternation). For what you want, start by removing them.
To match any number of whitespace characters, add *, which gives \s*.
\s is a whitespace character
* means zero or more
To avoid removing the split condition character, you can use a lookahead assertion (?=...).
(?=...) is a positive lookahead assertion and (?!...) is a negative one
The combined regex looks like this: \s*(?={)
In order to not include the curly brace in the match, you can put it into a lookahead:
\s*(?={)
That will match any number of whitespace characters up to the position before an open curly brace.
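A short sketch of the lookahead split on both input variants from the question. The BraceSplitter and SplitAtFirstBrace names are hypothetical, and the brace is escaped as \{ here purely as a precaution; the unescaped form shown above means the same thing.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BraceSplitter
{
    // Optional whitespace that is followed by '{'. The lookahead checks for
    // the brace without consuming it, so the brace stays in the second part.
    private static readonly Regex BeforeBrace = new Regex(@"\s*(?=\{)");

    public static string[] SplitAtFirstBrace(string input)
    {
        // A count of 2 stops after the first split point, so later braces are untouched.
        return BeforeBrace.Split(input, 2);
    }

    public static void Main()
    {
        string[] inputs =
        {
            "/Tests/ShowMessage { 'Text': 'foo' }",
            "/Tests/ShowMessage{ 'Text': 'foo' }"
        };
        foreach (string input in inputs)
        {
            string[] parts = SplitAtFirstBrace(input);
            Console.WriteLine(parts[0] + " | " + parts[1]);
        }
    }
}
```

Both inputs should yield /Tests/ShowMessage as the first part and { 'Text': 'foo' } as the second.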
You can use regular string split, on "{" and trim the spaces off:
var bits = "/Tests/ShowMessage { 'Text': 'foo' }".Split("{", StringSplitOptions.RemoveEmptyEntries);
bits[0] = bits[0].TrimEnd();
bits[1] = "{" + bits[1];
If you want to use the RegEx route, you can add the { back if you change the regex a bit:
var splitRegex = new Regex(@"\s*{");
string input = "/Tests/ShowMessage { 'Text': 'foo' }";
//second version of the input:
//string input = "/Tests/ShowMessage{ 'Text': 'foo' }";
string[] splittedText = splitRegex.Split(input, 2);
splittedText[1] = "{" + splittedText[1];
It means "split at the first occurrence of zero or more whitespace characters followed by {". The split removes the spaces (which you want) and the { (which you don't), but since the delimiter always ends with {, you can safely put it back and get the result you want.
var splitedList = srt.Text.Replace(".", ".#").Replace("?", "?#").Replace("!", "!#").Split(new[] { "#"}, StringSplitOptions.RemoveEmptyEntries).ToList();
This splits the text at ., ! and ? without removing those characters. For a more robust result, replace # with a character that cannot occur in the text, for example '®'. It avoids Regex.Split, which can be slower and harder to get right when there are many different splitting criteria.
Passing "Hello. I'am dev!" gives the following result (the split characters are kept, and so is the space after the period):
"Hello."
" I'am dev!"

Regex & C#: Replace all Special Characters except Emojis

I need to replace all special characters in a string except the following (which includes alphabetic characters):
:)
:P
;)
:D
:(
This is what I have now:
string input = "Hi there!!! :)";
string output = Regex.Replace(input, "[^0-9a-zA-Z]+", "");
This replaces all special characters. How can I modify this to not replace mentioned characters (emojis) but replace any other special character?
You may use a known technique: match and capture what you need and match only what you want to remove, and replace with the backreference to Group 1:
(:(?:[D()P])|;\))|[^0-9a-zA-Z\s]
Replace with $1. Note I added \s to the character class, but in case you do not need spaces, remove it.
Pattern explanation:
(:(?:[D()P])|;\)) - Group 1 (what we need to keep):
:(?:[D()P]) - a : followed with either D, (, ) or P
| - or
;\) - a ;) substring
(here, you may extend the capture group with more |-separated branches).
| - or ...
[^0-9a-zA-Z\s] - match any char other than ASCII digits, letters (and whitespace, but as I mentioned, you may remove \s if you do not need to keep spaces).
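The capture-and-keep technique described above can be sketched as follows. The EmoticonCleaner and Clean names are illustrative only, and the inner non-capturing group from the pattern is dropped because it is redundant around a single character class.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmoticonCleaner
{
    // Group 1 captures the emoticons to keep; the second branch matches
    // any other character that is not an ASCII letter, digit or whitespace.
    private static readonly Regex KeepOrDrop =
        new Regex(@"(:[D()P]|;\))|[^0-9a-zA-Z\s]");

    public static string Clean(string input)
    {
        // "$1" puts the emoticon back when group 1 took part in the match
        // and expands to an empty string otherwise.
        return KeepOrDrop.Replace(input, "$1");
    }

    public static void Main()
    {
        Console.WriteLine(Clean("Hi there!!! :)"));        // Hi there :)
        Console.WriteLine(Clean("Wow!!! :D ;) #cool :P")); // Wow :D ;) cool :P
    }
}
```

The order of the alternatives matters: the emoticon branch comes first, so a colon that starts an emoticon is captured before the catch-all branch gets a chance to remove it.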
I would use a regex to match all emojis and select them out of the text:
string input = "Hi there!!! :)";
string output = string.Concat(Regex.Matches(input, "[;:][DP)(]+").Cast<Match>().Select(x => x.Value));
Pattern [;:][DP)(]+
[;:] starts with : or ;
[DP)(] ends with D, P, ) or (
+ one or more
Inside a character class, | is a literal character rather than an alternation, so the allowed characters are simply listed one after another.

How to replace words following certain character and extract rest with REGEX

Assume that I have the following sentence:
select PathSquares from tblPathFinding where RouteId=470
and StartingSquareId=267 and ExitSquareId=13
Now I want to replace the word that follows each = and keep the rest of the sentence.
Let's say I want to replace the word following = with %.
Words are separated by a space character.
So this sentence would become:
select PathSquares from tblPathFinding where RouteId=%
and StartingSquareId=% and ExitSquareId=%
Which regex can I use to achieve this?
.net 4.5 C#
Use a lookbehind to match all the non-space or word characters that come right after an = symbol. Replacing the matched characters with % will give you the desired output.
@"(?<==)\S+"
OR
@"(?<==)\w+"
Replacement string:
%
string str = @"select PathSquares from tblPathFinding where RouteId=470
and StartingSquareId=267 and ExitSquareId=13";
string result = Regex.Replace(str, @"(?<==)\S+", "%");
Console.WriteLine(result);
Explanation:
(?<==) Asserts that the match must be preceded by an = symbol.
\w+ If yes, then match the following one or more word characters.
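As a self-contained sketch of the lookbehind replacement (ValueMasker and MaskValues are hypothetical names), assuming every value is a single token without spaces:

```csharp
using System;
using System.Text.RegularExpressions;

public static class ValueMasker
{
    public static string MaskValues(string sql)
    {
        // (?<==) requires an '=' immediately before the match without consuming it;
        // \S+ then matches the value up to the next whitespace character.
        return Regex.Replace(sql, @"(?<==)\S+", "%");
    }

    public static void Main()
    {
        string sql = "select PathSquares from tblPathFinding where RouteId=470 " +
                     "and StartingSquareId=267 and ExitSquareId=13";
        Console.WriteLine(MaskValues(sql));
    }
}
```

With \S+ any punctuation glued to a value (for example a trailing comma) is replaced as well, while \w+ stops at the first non-word character, so the choice depends on what the values can look like.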
