I need to validate a comma-separated string using a regex, but I have two problems.
My sample input is as follows:
ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7 - Valid
ERW SW1,ERW SW2,ASA,S4,ERW SW5,ERW SW6,ERWSW7 - Valid (spaces between words should be valid)
ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7, - Invalid - Comma at end
,ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7 - Invalid - Comma at beginning
ERWSW1,ERWSW2,,ASA,S4,ERWSW5,ERWSW6,ERWSW7 - Invalid - No value between 2,3 comma
I wrote the following regex to validate the input:
^([a-z A-Z0-9 !@#$%?=*&-]+,)*[a-z A-Z0-9 !@#$%?=*&\s-]+$
The first problem is that items consisting only of spaces between commas are accepted as valid.
Eg: ERWSW1, , ,ERWSW2,ASA,S4
I need to avoid that, how can I do it?
My second problem is that I also need to remove the extra spaces from the string, and I need a function to do that. (This is not related to the regex above.)
Input: ERWSW1 , ERW SW2,ASA ,S4 ,ERW SW5,ERWSW6,ERWSW7
I need the following output,
ERWSW1,ERW SW2,ASA,S4,ERW SW5,ERWSW6,ERWSW7
Update:
For my second problem, I wrote the following code:
string str = " ERW SW1 , ERW SW2 , ASA";
var ss = Regex.Replace(str, " *, *", ",");
But it's not removing the spaces properly. I need this output:
ERW SW1,ERW SW2,ASA
You could use a character class specifying what you would allow to match. For the spaces between the words, you could use a repeated group that starts with a single space.
^[\w!@#$%?=*&.-]+(?: [\w!@#$%?=*&.-]+)*(?:,[\w!@#$%?=*&.-]+(?: [\w!@#$%?=*&.-]+)*)*$
To remove the spaces around the commas, you could allow optional spaces around each comma and at both ends of the string in the validation pattern, then replace every match of " *, *" with a single comma and trim the result.
^ *[\w!@#$%?=*&.-]+(?: [\w!@#$%?=*&.-]+)*(?: *, *[\w!@#$%?=*&.-]+(?: [\w!@#$%?=*&.-]+)*)* *$
Code example
string[] strings = {
"ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7",
"ERW SW1,ERW SW2,ASA,S4,ERW SW5,ERW SW6,ERWSW7",
"ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7,",
",ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7",
"ERWSW1,ERWSW2,,ASA,S4,ERWSW5,ERWSW6,ERWSW7",
"ERWSW1 , ERW SW2,ASA ,S4 ,ERW SW5,ERWSW6,ERWSW7",
"ERW*SW1,ERW-SW2,A.SA",
" ERWSW1 , ERWSW2 ,ASA,S4,ERWSW5 "
};
string pattern = @"^ *[\w!@#$%?=*&.-]+(?: [\w!@#$%?=*&.-]+)*(?: *, *[\w!@#$%?=*&.-]+(?: [\w!@#$%?=*&.-]+)*)* *$";
foreach (String s in strings) {
if (Regex.IsMatch(s, pattern)) {
Console.WriteLine(Regex.Replace(s, " *, *", ",").Trim());
}
}
Output
ERWSW1,ERWSW2,ASA,S4,ERWSW5,ERWSW6,ERWSW7
ERW SW1,ERW SW2,ASA,S4,ERW SW5,ERW SW6,ERWSW7
ERWSW1,ERW SW2,ASA,S4,ERW SW5,ERWSW6,ERWSW7
ERW*SW1,ERW-SW2,A.SA
ERWSW1,ERWSW2,ASA,S4,ERWSW5
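If the pattern feels too opaque, the same check can be sketched without a regex: split on commas, trim each item, and reject the input when any item is empty. This is a different technique from the pattern above, and unlike the pattern it does not restrict which characters an item may contain. The names ListCleaner and TryNormalize are made up for illustration.

```csharp
using System;
using System.Linq;

public static class ListCleaner
{
    // Returns true and the cleaned list when every comma-separated item is
    // non-empty after trimming; returns false for leading, trailing or
    // doubled commas and for items that contain only spaces.
    public static bool TryNormalize(string input, out string cleaned)
    {
        // Split keeps empty entries by default, so ",A" yields "" and "A".
        string[] items = input.Split(',').Select(item => item.Trim()).ToArray();

        if (items.Any(item => item.Length == 0))
        {
            cleaned = string.Empty;
            return false;
        }

        cleaned = string.Join(",", items);
        return true;
    }

    public static void Main()
    {
        string cleaned;
        if (TryNormalize(" ERW SW1 , ERW SW2 , ASA", out cleaned))
        {
            Console.WriteLine(cleaned); // ERW SW1,ERW SW2,ASA
        }
        Console.WriteLine(TryNormalize("ERWSW1, , ,ERWSW2", out cleaned)); // False
    }
}
```

Spaces inside an item (as in "ERW SW1") are left untouched; only the spaces around the commas and at both ends are removed.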
Regexes are simple, yet complex at times. I am stuck trying to replace parts of an expression that contains variables, assuming a variable has the following pattern:
\w+(\.\w+)*
I want to replace the dot (.) in every occurrence of a variable, because I eventually have to tokenize the expression and the tokenizer does not recognize variables that contain dots. So I thought of replacing the dots with underscores before parsing. After tokenizing, however, I want to get each variable token back with its original value.
Expression:
(x1.y2.z3 + 9.99) + y2_z1 - x1.y2.z3
Three Variables:
x1.y2.z3
y2_z1
x1.y2.z3
Desired Output:
(x1_y2_z3 + 9.99) + y2_z1 - x1_y2_z3
Question 1: How do I use Regex replace in this case?
Question 2: Is there a better way to address the problem above? A variable can itself contain an underscore, so replacing dots with underscores does not let me recover the original variable from the tokens.
This regex pattern seems to work: [a-zA-Z]+\d+\S+
To replace a dot found only in a match you use MatchEvaluator:
private static char charToReplaceWith = '_';
static void Main(string[] args)
{
string s = "(x1.y2.z3 + 9.99) + y2_z1 - x1.y2.z3";
Console.WriteLine(Regex.Replace(s, @"[a-zA-Z]+\d+\S+", new MatchEvaluator(ReplaceDotWithCharInMatch)));
Console.Read();
}
private static string ReplaceDotWithCharInMatch(Match m)
{
return m.Value.Replace('.', charToReplaceWith);
}
Which gives this output:
(x1_y2_z3 + 9.99) + y2_z1 - x1_y2_z3
I don't fully understand your second question, or how to deal with tokenizing variables that already contain underscores, but you should be able to choose the replacement character: if the input already contains '_' (i.e. s.Contains('_') is true), pick a different character. You will probably have to maintain a dictionary that records each substitution ("I replaced all dots with underscores, all underscores with ^", and so on).
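The dictionary idea can be sketched as follows. Instead of swapping the dot for a single character, each dotted variable is replaced by a generated placeholder and the original name is kept in a lookup table, so underscores in real names never collide. The placeholder format (__var0__, __var1__, ...) and the names VariableMasker, Mask and Unmask are assumptions for illustration, and the sketch assumes that no real variable is already named like a placeholder and that the tokenizer accepts such names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class VariableMasker
{
    // An identifier followed by one or more ".identifier" parts.
    // Plain numbers such as 9.99 do not match because each part must
    // start with a letter or underscore.
    private static readonly Regex DottedName =
        new Regex(@"\b[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+");

    // Replaces each dotted variable with a placeholder and records
    // placeholder -> original name in 'originals'.
    public static string Mask(string expression, Dictionary<string, string> originals)
    {
        var placeholders = new Dictionary<string, string>(); // original -> placeholder
        return DottedName.Replace(expression, m =>
        {
            string placeholder;
            if (!placeholders.TryGetValue(m.Value, out placeholder))
            {
                placeholder = "__var" + placeholders.Count + "__";
                placeholders[m.Value] = placeholder;
                originals[placeholder] = m.Value;
            }
            return placeholder;
        });
    }

    // Maps a token back to the original variable name, or returns it unchanged.
    public static string Unmask(string token, Dictionary<string, string> originals)
    {
        string original;
        return originals.TryGetValue(token, out original) ? original : token;
    }

    public static void Main()
    {
        var originals = new Dictionary<string, string>();
        string masked = Mask("(x1.y2.z3 + 9.99) + y2_z1 - x1.y2.z3", originals);
        Console.WriteLine(masked);                        // (__var0__ + 9.99) + y2_z1 - __var0__
        Console.WriteLine(Unmask("__var0__", originals)); // x1.y2.z3
    }
}
```

After tokenizing, passing each identifier token through Unmask restores the dotted name, while tokens such as y2_z1 come back unchanged.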
Try this:
string input = "(x1.y2.z3 + 9.99) + y2_z1 - x1.y2.z3";
string output = Regex.Replace(input, "\\.(?=[a-z])", "_");
This uses a positive lookahead, so it replaces only periods that are followed by a lowercase letter (a-z).
Use a regex negative lookahead, which is a group that starts with (?!
A dot followed by something non-numeric would be as simple as this:
// matches any dot NOT followed by a character in the range 0-9
String output = Regex.Replace(input, "\\.(?![0-9])", "_");
This has the advantage that although [0-9] is part of the expression, it is only checked against what follows the match and is not itself part of the match.
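A quick check of the lookahead version on the expression from the question. It is only a sketch: the helper name ReplaceNonNumericDots is made up, and the second call shows a limitation of the approach, not a fix for it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Replaces every dot that is not immediately followed by a digit.
    // The lookahead only inspects the next character; it is not consumed,
    // so only the dot itself is replaced.
    public static string ReplaceNonNumericDots(string input)
    {
        return Regex.Replace(input, @"\.(?![0-9])", "_");
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceNonNumericDots("(x1.y2.z3 + 9.99) + y2_z1 - x1.y2.z3"));
        // (x1_y2_z3 + 9.99) + y2_z1 - x1_y2_z3

        // Caveat: a variable part that starts with a digit keeps its dot.
        Console.WriteLine(ReplaceNonNumericDots("a.1b")); // a.1b
    }
}
```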
I have the following sample cases:
1) "Sample"
2) "[10,25]"
I want to form a single regular expression pattern that, when the above examples are passed to it, returns "Sample" and "10,25".
Note: The input strings do not include the quotes.
I came up with the expression (?<=\[)(.*?)(?=\]). It satisfies the second case and retrieves only "10,25", but for the first case it returns nothing. I want "Sample" to be returned. Can anyone help me?
I am using C#.
Here you go: a small regex using a positive lookbehind. These are sometimes very handy.
Regex
(?<=^|\[)([\w,]+)
Test string
Sample
[10,25]
Result
MATCH 1
[0-6] Sample
MATCH 2
[8-13] 10,25
try at regex101.com
If the " mark is included in your original string, use the regex below, which looks for the " mark as well. You may remove ^| from the lookbehind if the " mark is always present, or leave it as it is if your text mixes inputs with and without " marks.
Regex
(?<=^|\[|\")([\w,]+)
try at regex101.com
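As far as I can tell, the regex below should help:
Regex regex = new Regex(@"^\w+|[[](\w)+\,(\w)+[]]$");
This matches either a run of word characters at the start of the input, or two alphanumeric words separated by a comma inside square brackets at the end of the input. Note that in the bracketed case the match includes the brackets.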
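Another way to get both cases from one pattern is alternation with two capture groups, picking whichever group took part in the match. A minimal C# sketch, with the class and method names (BracketStripper, Extract) invented for illustration; it assumes the whole input is either bracketed or not bracketed at all.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketStripper
{
    // Group 1 captures the text inside [...]; group 2 captures an input
    // that is not wrapped in brackets.
    private static readonly Regex Pattern = new Regex(@"^\[(.*)\]$|^(.+)$");

    // Returns the unwrapped value, or an empty string when nothing matches.
    public static string Extract(string input)
    {
        Match m = Pattern.Match(input);
        if (!m.Success)
        {
            return string.Empty;
        }
        return m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value;
    }

    public static void Main()
    {
        Console.WriteLine(Extract("Sample"));  // Sample
        Console.WriteLine(Extract("[10,25]")); // 10,25
    }
}
```

Checking Groups[1].Success rather than the group's Value keeps an empty pair of brackets ("[]") distinguishable from the unbracketed case inside the method.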
One Java example:
// String input = "Sample";
String input = "[10,25]";
String text = "[^,\\[\\]]+";
Pattern pMod = Pattern.compile("(" + text + ")|(?>\\[(" + text + "," + text + ")\\])");
Matcher mMod = pMod.matcher(input);
while (mMod.find()) {
if(mMod.group(1) != null) {
System.out.println(mMod.group(1));
}
if(mMod.group(2)!=null) {
System.out.println(mMod.group(2));
}
}
If the input is "[hello&bye,25|35]", then the output is hello&bye,25|35.
I need to use a regex in C# to split up something like "21A244", where:
The leading number can be 1-99
The letter can only be one letter, A-Z
The trailing three-digit number can be 111-999
So I wrote this pattern:
"([0-9]+)([A-Z])([0-9]+)"
but for some reason, when used in C#, the match functions just return the input string. So I tried it in Lua, just to make sure the pattern was correct, and it works fine there.
Here's the relevant code:
var m = Regex.Matches( mdl.roomCode, "(\\d+)([A-Z])(\\d+)" );
System.Diagnostics.Debug.Print( "Count: " + m.Count );
And here's the working Lua code, in case you were wondering:
local str = "21A244"
print(string.match( str, "(%d+)([A-Z])(%d+)" ))
Thank you for any help
EDIT: Found the solution
var match = Regex.Match(mdl.roomCode, "(\\d+)([A-Z])(\\d+)");
var group = match.Groups;
System.Diagnostics.Debug.Print( "Count: " + group.Count );
System.Diagnostics.Debug.Print("houseID: " + group[1].Value);
System.Diagnostics.Debug.Print("section: " + group[2].Value);
System.Diagnostics.Debug.Print("roomID: " + group[3].Value);
First, you should make your regex a little more specific and limit how many digits are allowed at the beginning and end. How about:
([1-9]{1,2})([A-Z])([1-9]{1,3})
Next, the results of the captures (i.e. the three parts in parentheses) will be in the Groups property of the Match object, i.e.
m.Groups[1] // First number
m.Groups[2] // Letter
m.Groups[3] // Second number
Regex.Matches(mdl.roomCode, "(\d+)([A-Z])(\d+)") returns a collection of matches. If there is no match, it returns an empty MatchCollection.
Since the regular expression matches the whole string, it returns a collection with one item, whose value is the input string. The captured parts are in that item's Groups property.
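Putting the pieces together, a single Regex.Match call with named groups might look like the sketch below. The group names (house, section, room) follow the labels in the question's edit, while RoomCodeParser and TryParse are hypothetical names. The pattern only checks digit counts (one or two digits, one letter, three digits), not the numeric ranges 1-99 and 111-999, so values such as "00" still pass.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RoomCodeParser
{
    // One or two digits, one capital letter, exactly three digits.
    // [0-9] is used instead of \d because \d also matches non-ASCII digits in .NET.
    private static readonly Regex RoomCode =
        new Regex(@"^(?<house>[0-9]{1,2})(?<section>[A-Z])(?<room>[0-9]{3})$");

    // Returns true and fills the three parts when the code matches;
    // on failure the parts are empty strings.
    public static bool TryParse(string code, out string house, out string section, out string room)
    {
        Match m = RoomCode.Match(code);
        house = m.Groups["house"].Value;
        section = m.Groups["section"].Value;
        room = m.Groups["room"].Value;
        return m.Success;
    }

    public static void Main()
    {
        string house, section, room;
        if (TryParse("21A244", out house, out section, out room))
        {
            Console.WriteLine("houseID: " + house);   // houseID: 21
            Console.WriteLine("section: " + section); // section: A
            Console.WriteLine("roomID: " + room);     // roomID: 244
        }
    }
}
```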
I am new to Stack Overflow (this is my first post) and to regex.
Currently I am working on a simple, dirty app to replace base-class properties with constructor-injected fields (because I need to edit about 400 files).
It should find this:
ClassName(WiredObjectRegistry registry) : base(registry)
{
and replace with:
ClassName(IDependency paramName, ISecondDependency secondParam, ... )
{
_fieldName = paramName;
...
So I need to replace the two old lines with three or more new lines.
Basically I was thinking:
find this ->
className + ctorParams + zero or more whitespaces + newline + zero or more whitespaces + {
replace with ->
className + newCtorParams + newline +
{
my field assignments
I tried this regex for .NET:
className + ctorParam + @"\w*" + "\r|\n" + @"\w*" + @"\{"
which does not replace the "{" and the whitespace correctly.
The replaced file content looks like this:
public CacheManager(ICallManager callManager, ITetraEventManager tetraEventManager, IConferenceManager conferenceManager, IAudioManager audioManager)
{
_callManager = callManager;
_tetraEventManager = tetraEventManager;
_conferenceManager = conferenceManager;
_audioManager = audioManager;
{
Can you please help me with this? :-|
David
If you're translating
className + ctorParams + zero or more whitespaces + newline + zero or more whitespaces + {
into regex as
className + ctorParam + @"\w*" + "\r|\n" + @"\w*" + @"\{"
then you're making several errors.
First, the character class for whitespace is \s. \w means "word character" (a letter, digit or underscore).
Second, "\r|\n" will result in the alternation operator | splitting the entire regex into two alternatives (= "match either the regex before the | or the regex after the |"). In your case, you don't need this bit at all, since \s already matches spaces, tabs and newlines. If you do want a regex that matches a Unix, Mac or DOS newline, use (?:\r\n?|\n); the shorter \r?\n? also matches the empty string.
But, as the comments show, unless you show us what you really want to do, we can't help you further.
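Going only by what the question shows, the two corrections above could be combined roughly like this. It is a sketch under assumptions: CtorRewriter and Rewrite are made-up names, the old constructor header is assumed to appear in the file exactly as passed in, and the new brace is written without the original indentation.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CtorRewriter
{
    // Replaces "<oldHeader><whitespace>{" with the new header, an opening
    // brace and the field assignments.
    public static string Rewrite(string source, string oldHeader, string newHeader, string assignments)
    {
        // Regex.Escape makes the parentheses in the header literal;
        // \s* spans spaces, tabs and newlines, so no "\r|\n" is needed.
        string pattern = Regex.Escape(oldHeader) + @"\s*\{";
        string replacement = newHeader + "\n{\n" + assignments;

        // A MatchEvaluator is used so that a "$" in the replacement text
        // is not interpreted as a substitution.
        return Regex.Replace(source, pattern, m => replacement);
    }

    public static void Main()
    {
        string source = "public ClassName(WiredObjectRegistry registry) : base(registry)\n    {\n    }";
        string result = Rewrite(
            source,
            "ClassName(WiredObjectRegistry registry) : base(registry)",
            "ClassName(IDependency paramName)",
            "    _fieldName = paramName;");
        Console.WriteLine(result);
    }
}
```

Because the opening brace is part of the match, it is consumed and written back once, which avoids the duplicated "{" shown in the question.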