Simple Regular Expression (Regex) issue (.net) - c#

I'm trying to use a .NET Regex to validate the input format of a string. The string can be of the format
single digit 0-9 followed by
single letter A-Z OR 07 OR 03 or AA followed by
two letters A-Z
So 0AAA, 107ZF, 503GH, 0AAAA are all valid. The string with which I construct my Regex is as follows:
"([0-9]{1})" +
"((03$)|(07$)|(AA$)|[A-Z]{1})" +
"([A-Z]{2})"
Yet this does not validate strings in which the second term is one of 03, 07 or AA. Whilst debugging, I removed the third term from the string used to construct the regex, and found that input strings of the form 103, 507, 6AA WOULD validate.
Any ideas why, when I then put the third term back into the Regex, the input strings such as 1AAGM do not match?
Thanks
Tom

This is because your expression requires the strings with 03, 07 and AA to end right there ($ matches the end of input). Remove the $ from these sub-expressions, and move it to the end of the expression.
"^[0-9](03|07|AA|[A-Z])[A-Z]{2}$"

I believe this is because you are using "$" in the regex, which here asserts the position at the end of a line (at the end of the string or before a line break character). Remove it from the alternatives and it should work. From RegexBuddy, here is what you were doing:
([0-9]{1})((03$)|(07$)|(AA$)|[A-Z]{1})([A-Z]{2})
Options: ^ and $ match at line breaks
Match the regular expression below and capture its match into backreference number 1 «([0-9]{1})»
Match a single character in the range between “0” and “9” «[0-9]{1}»
Exactly 1 times «{1}»
Match the regular expression below and capture its match into backreference number 2 «((03$)|(07$)|(AA$)|[A-Z]{1})»
Match either the regular expression below (attempting the next alternative only if this one fails) «(03$)»
Match the regular expression below and capture its match into backreference number 3 «(03$)»
Match the characters “03” literally «03»
Assert position at the end of a line (at the end of the string or before a line break character) «$»
Or match regular expression number 2 below (attempting the next alternative only if this one fails) «(07$)»
Match the regular expression below and capture its match into backreference number 4 «(07$)»
Match the characters “07” literally «07»
Assert position at the end of a line (at the end of the string or before a line break character) «$»
Or match regular expression number 3 below (attempting the next alternative only if this one fails) «(AA$)»
Match the regular expression below and capture its match into backreference number 5 «(AA$)»
Match the characters “AA” literally «AA»
Assert position at the end of a line (at the end of the string or before a line break character) «$»
Or match regular expression number 4 below (the entire group fails if this one fails to match) «[A-Z]{1}»
Match a single character in the range between “A” and “Z” «[A-Z]{1}»
Exactly 1 times «{1}»
Match the regular expression below and capture its match into backreference number 6 «([A-Z]{2})»
Match a single character in the range between “A” and “Z” «[A-Z]{2}»
Exactly 2 times «{2}»
Followed by the revised version:
([0-9]{1})((03)|(07)|(AA)|[A-Z]{1})([A-Z]{2})
Options: ^ and $ match at line breaks
Match the regular expression below and capture its match into backreference number 1 «([0-9]{1})»
Match a single character in the range between “0” and “9” «[0-9]{1}»
Exactly 1 times «{1}»
Match the regular expression below and capture its match into backreference number 2 «((03)|(07)|(AA)|[A-Z]{1})»
Match either the regular expression below (attempting the next alternative only if this one fails) «(03)»
Match the regular expression below and capture its match into backreference number 3 «(03)»
Match the characters “03” literally «03»
Or match regular expression number 2 below (attempting the next alternative only if this one fails) «(07)»
Match the regular expression below and capture its match into backreference number 4 «(07)»
Match the characters “07” literally «07»
Or match regular expression number 3 below (attempting the next alternative only if this one fails) «(AA)»
Match the regular expression below and capture its match into backreference number 5 «(AA)»
Match the characters “AA” literally «AA»
Or match regular expression number 4 below (the entire group fails if this one fails to match) «[A-Z]{1}»
Match a single character in the range between “A” and “Z” «[A-Z]{1}»
Exactly 1 times «{1}»
Match the regular expression below and capture its match into backreference number 6 «([A-Z]{2})»
Match a single character in the range between “A” and “Z” «[A-Z]{2}»
Exactly 2 times «{2}»
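As a quick check of the anchored pattern ^[0-9](03|07|AA|[A-Z])[A-Z]{2}$ suggested above, here is a minimal C# sketch. The class and method names (FormatCodeValidator, IsValid, MatchesOriginal) are made up for illustration; only the two pattern strings come from the thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FormatCodeValidator
{
    // Anchored pattern from the first answer:
    // one digit, then 03 | 07 | AA | one letter, then two letters.
    private static readonly Regex Anchored =
        new Regex("^[0-9](03|07|AA|[A-Z])[A-Z]{2}$");

    // Pattern from the question, with $ inside the alternatives (kept for comparison).
    private static readonly Regex Original =
        new Regex("([0-9]{1})((03$)|(07$)|(AA$)|[A-Z]{1})([A-Z]{2})");

    public static bool IsValid(string input) => Anchored.IsMatch(input);

    public static bool MatchesOriginal(string input) => Original.IsMatch(input);

    public static void Main()
    {
        foreach (var s in new[] { "0AAA", "107ZF", "503GH", "0AAAA", "103" })
        {
            Console.WriteLine($"{s}: anchored={IsValid(s)}, original={MatchesOriginal(s)}");
        }
    }
}
```

With the $ moved to the end, the alternatives 03, 07 and AA no longer have to sit at the end of the input, so the two trailing letters can still be consumed.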

Related

Regular expression that stops at first letter encountered

I want my regex expression to stop matching numbers of length between 2 and 10 after it encounters a letter.
So far I've come up with (\d{2,10})(?![a-zA-Z]), but it continues to match even after letters are encountered.
2216101225 /ROC/PL FCT DIN 24.03.2022 PL ERBICIDE' - this is the text I've been testing the regex on, but it matches 24 03 and 2022 also.
This is tested and intended for C#.
Can you help ? Thanks
Another option is to anchor the pattern and to match any character except chars a-zA-Z or a newline, and then capture the 2-10 digits in a capture group.
Then get the capture group 1 value from the match.
^[^A-Za-z\r\n]*\b([0-9]{2,10})\b
Explanation
^ Start of string
[^A-Za-z\r\n]* Optionally match chars other than a-zA-Z or a newline
\b([0-9]{2,10})\b Capture 2-10 digits between word boundaries in group 1
See a regex demo.
Note that in .NET, \d matches any Unicode decimal digit, not only 0-9 (unless RegexOptions.ECMAScript is used).
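To illustrate the difference between \d and [0-9] in .NET, here is a small sketch. The class and method names are hypothetical, and the Arabic-Indic digit three (U+0663) is just one sample of a non-ASCII decimal digit.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitClassDemo
{
    // Default .NET behaviour: \d covers every Unicode decimal digit.
    public static bool IsDigitDefault(string s) => Regex.IsMatch(s, @"^\d$");

    // With RegexOptions.ECMAScript, \d is restricted to 0-9.
    public static bool IsDigitEcma(string s) =>
        Regex.IsMatch(s, @"^\d$", RegexOptions.ECMAScript);

    // An explicit character class is ASCII-only regardless of options.
    public static bool IsDigitAscii(string s) => Regex.IsMatch(s, @"^[0-9]$");

    public static void Main()
    {
        string arabicIndicThree = "\u0663";
        Console.WriteLine(IsDigitDefault(arabicIndicThree));
        Console.WriteLine(IsDigitEcma(arabicIndicThree));
        Console.WriteLine(IsDigitAscii(arabicIndicThree));
    }
}
```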
You can use the following .NET regex
(?<=^\P{L}*)(?<!\d)\d{2,10}(?!\d)
(?<=^[^a-zA-Z]*)(?<!\d)\d{2,10}(?!\d)
See the regex demo. Details:
(?<=^\P{L}*) - there must be no letters from the current position till the start of string ((?<=^[^a-zA-Z]*) only supports ASCII letters)
(?<!\d) - no digit immediately on the left is allowed.
\d{2,10} - two to ten digits
(?!\d) - no digit immediately on the right is allowed.
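Both approaches above can be tried against the sample text from the question. This is a sketch only: the class and method names are invented for illustration, and the lookbehind variant relies on .NET's support for variable-length lookbehind.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LeadingNumberFinder
{
    // Capture-group variant: anchor at the start, skip non-letters,
    // and read the 2-10 digit number from group 1.
    public static string NumberBeforeLetters(string input)
    {
        Match m = Regex.Match(input, @"^[^A-Za-z\r\n]*\b([0-9]{2,10})\b");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    // Lookbehind variant: every 2-10 digit run that has no ASCII letter before it.
    public static List<string> NumbersBeforeLetters(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, @"(?<=^[^a-zA-Z]*)(?<!\d)\d{2,10}(?!\d)"))
        {
            result.Add(m.Value);
        }
        return result;
    }

    public static void Main()
    {
        string text = "2216101225 /ROC/PL FCT DIN 24.03.2022 PL ERBICIDE";
        Console.WriteLine(NumberBeforeLetters(text));
        Console.WriteLine(string.Join(", ", NumbersBeforeLetters(text)));
    }
}
```

Note that the capture-group variant is greedy, so when several numbers precede the first letter it yields the last of them, whereas the lookbehind variant returns all of them.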

Regular expression to match 1-5 symbols when start symbol letter and at least one number

I tried this expression:
/([a-z]+[0-9]+[a-z]*){1,5}$/
but it matches every word that starts with a letter, contains at least one number and has more than two symbols, for example "re1111e", when it's not supposed to. What am I doing wrong?
One possible way to write your regex uses a positive lookahead to check for a number:
/(?=[^0-9]*[0-9])[a-z][a-z0-9]{0,4}/
This pattern says to:
(?=[^0-9]*[0-9]) assert that at least one digit appears somewhere
[a-z] match an initial letter character
[a-z0-9]{0,4} then match zero to four letter or number characters
In your pattern, the quantifier {1,5} applies to the group, repeating the match of [a-z]+[0-9]+[a-z]* 1 to 5 times.
If I am not mistaken, you want to match [a-z] at the start followed by up to 4 chars, at least one of which is a digit, so the minimum number of characters is 2 and the maximum is 5.
You might use:
^(?=.{2,5}$)[a-z][a-z0-9]*[0-9][a-z0-9]*$
About the pattern
^ Start of string
(?=.{2,5}$) Assert string length 2 - 5 characters
[a-z] Match a-z
[a-z0-9]* Repeat 0+ times matching a-z 0-9
[0-9] Match a digit
[a-z0-9]* Repeat 0+ times matching a-z 0-9
$ End of string
Regex demo
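The anchored pattern with the length lookahead can be exercised from C# roughly like this. The class name ShortTokenValidator and the sample inputs are assumptions for illustration; the pattern itself is the one given above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ShortTokenValidator
{
    // 2-5 characters in total, starting with a letter and containing at least one digit.
    private static readonly Regex Pattern =
        new Regex(@"^(?=.{2,5}$)[a-z][a-z0-9]*[0-9][a-z0-9]*$");

    public static bool IsValid(string input) => Pattern.IsMatch(input);

    public static void Main()
    {
        foreach (var s in new[] { "a1", "re1", "ab12c", "re1111e", "abcde", "1abc" })
        {
            Console.WriteLine($"{s}: {IsValid(s)}");
        }
    }
}
```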

Regular Expression for no repeating special characters (C#)

I am new to regular expressions and need a regular expression for address, in which user cannot enter repeating special characters such as: ..... or ,,,.../// etc and none of the special characters could be entered more than 5 times in the string.
...,,,....// =>No Match
Street no. 40. hello. =>Match
Thanks in advance!
I have tried this:
([a-zA-Z]+|[\s\,\.\/\-]+|[\d]+)|(\(([\da-zA-Z]|[^)^(]+){1,}\))
It selects all alphanumeric and some special characters, with no empty brackets.
You can use Negative lookahead construction that asserts what is invalid to match. Its format is (?! ... )
For your case you can try something like this:
This will not match the input string if it has 2 or more consecutive dots, commas or slashes (or any combination of them)
(?!.*[.,\/]{2}) ... rest of the regex
This will not match the input string if it has 5 or more 'A' characters (use {6} instead of {5} if exactly five should still be allowed).
(?!(.*A.*){5}) ... rest of the regex
This will match everything except your restrictions. Replace the last part (.*) with your regex.
^(?!.*[.,\/]{2})(?!(.*\..*){5})(?!(.*,.*){5})(?!(.*\/.*){5}).*$
Note: This regex may not be optimized. It may be faster to use a loop that iterates over the string's characters and counts their occurrences.
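The loop-based alternative mentioned in the note could look roughly like the sketch below. It assumes the special characters are '.', ',' and '/' as in the lookaheads above; the names AddressPunctuationCheck, IsAcceptable and maxOccurrences are hypothetical. The default limit of 4 mirrors the {5} in the regex, which rejects a character at its fifth occurrence.

```csharp
using System;
using System.Collections.Generic;

public static class AddressPunctuationCheck
{
    // Assumed set of "special" characters, taken from the lookaheads above.
    private const string Specials = ".,/";

    public static bool IsAcceptable(string input, int maxOccurrences = 4)
    {
        var counts = new Dictionary<char, int>();
        bool previousWasSpecial = false;

        foreach (char c in input)
        {
            bool isSpecial = Specials.IndexOf(c) >= 0;
            if (isSpecial)
            {
                // Two special characters in a row (in any combination) are rejected.
                if (previousWasSpecial) return false;

                // Count occurrences per character and reject once the limit is exceeded.
                counts.TryGetValue(c, out int seen);
                if (seen + 1 > maxOccurrences) return false;
                counts[c] = seen + 1;
            }
            previousWasSpecial = isSpecial;
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(IsAcceptable("...,,,....//"));
        Console.WriteLine(IsAcceptable("Street no. 40. hello."));
    }
}
```

Passing maxOccurrences: 5 would match the question's wording ("more than 5 times") more literally.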
You can use this regex:
^(?![^,./-]*([,./-])\1)(?![^,./-]*([,./-])(?:[^,./-]*\2){4})[ \da-z,./-]+$
In C#:
bool foundMatch = Regex.IsMatch(yourString, @"^(?![^,./-]*([,./-])\1)(?![^,./-]*([,./-])(?:[^,./-]*\2){4})[ \da-z,./-]+$", RegexOptions.IgnoreCase);
Explanation
The ^ anchor asserts that we are at the beginning of the string
The negative lookahead (?![^,./-]*([,./-])\1) asserts that it is not possible to match any number of non-special chars, followed by one special char (captured to Group 1), followed by the same special char (the \1 backreference)
The negative lookahead (?![^,./-]*([,./-])(?:[^,./-]*\2){4}) asserts that it is not possible to match any number of non-special chars, followed by one special char (captured to Group 2), then any number of non-special chars and that same char from Group 2, four times (five occurrences in total)
Note that [^,./-]* cannot skip over a special char, so both lookaheads only inspect the first special char in the string; replacing [^,./-]* with .* makes them check every position.
The $ anchor asserts that we are at the end of the string
A regular expression string to detect invalid strings is:
[^\w \-\r\n]{2}|(?:[\w \-]+[^\w \-\r\n]){5}
As C# string literal (regular and verbatim):
"[^\\w \\-\\r\\n]{2}|(?:[\\w \\-]+[^\\w \\-\\r\\n]){5}"
@"[^\w \-\r\n]{2}|(?:[\w \-]+[^\w \-\r\n]){5}"
It is much easier to find a string than to validate if a string does not contain ...
It can be checked with this expression if the string entered by the user is invalid because of a match of 2 special characters in sequence OR 5 special characters used in the string.
Explanation:
[^...] ... a negative character class definition which matches any character NOT being one of the characters listed within the square brackets.
\w ... a word character which is either a letter, a digit or an underscore.
The next character is simply a space character.
\- ... the hyphen character which must be escaped with a backslash within square brackets as otherwise the hyphen character would be interpreted as "FROM x TO z" (except when being the first or the last character within the square brackets).
\r ... carriage return
\n ... line-feed
Therefore [^\w \-\r\n] finds a character which is NOT a letter, NOT a digit, NOT an underscore, NOT a space, NOT a hyphen, NOT a carriage return and also NOT a line-feed.
{2} ... the preceding expression must match 2 such characters.
So with the expression [^\w \-\r\n]{2} it can be checked if the string contains 2 special characters in a sequence which makes the string invalid.
| ... OR
(?:...) ... non-capturing group needed here for applying the multiplier {5} to the expression inside, i.e. repeating it exactly 5 times.
[...] ... a positive character class definition which matches any character being one of the characters listed within the square brackets.
[\w \-]+ ... find a word character, or a space, or a hyphen 1 or more times.
[^\w \-\r\n] ... and next character being NOT a word character, space, hyphen, carriage return or line-feed.
Therefore (?:[\w \-]+[^\w \-\r\n]){5} finds a string with 5 "special" characters between "standard" characters.
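Used from C#, the "detect the invalid string" approach might look like the following sketch. The class and method names are assumptions; the pattern is the verbatim-string version shown above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InvalidAddressDetector
{
    // Matches when the input is INVALID: two special characters in a row,
    // or five special characters each preceded by standard characters.
    private static readonly Regex Invalid =
        new Regex(@"[^\w \-\r\n]{2}|(?:[\w \-]+[^\w \-\r\n]){5}");

    public static bool IsInvalid(string input) => Invalid.IsMatch(input);

    public static void Main()
    {
        foreach (var s in new[] { "...,,,....//", "Street no. 40. hello.", "a.b.c.d.e." })
        {
            Console.WriteLine($"{s}: invalid={IsInvalid(s)}");
        }
    }
}
```

The calling code then negates the result, so the address is accepted only when IsInvalid returns false.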

Using Regular Expression for Phone Number

I know this question has been asked a thousand times here, but I can't get the hang of it yet. I need help with checking whether a textbox matches a phone number format. The format should be like this:
000-000-000 or (+000)00-000-000. Can anybody help me?
Give this pattern a try:
^(\(\+\d{3}\)|\d)\d{2}(-\d{3}){2}$
Generated Explanation:
Assert position at the beginning of a line (at beginning of the string or after a line break character) ^
Match the regular expression below and capture its match into backreference number 1 (\(\+\d{3}\)|\d)
Match either the regular expression below (attempting the next alternative only if this one fails) \(\+\d{3}\)
Match the character “(” literally \(
Match the character “+” literally \+
Match a single digit 0..9 \d{3}
Exactly 3 times {3}
Match the character “)” literally \)
Or match regular expression number 2 below (the entire group fails if this one fails to match) \d
Match a single digit 0..9 \d
Match a single digit 0..9 \d{2}
Exactly 2 times {2}
Match the regular expression below and capture its match into backreference number 2 (-\d{3}){2}
Exactly 2 times {2}
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. {2}
Match the character “-” literally -
Match a single digit 0..9 \d{3}
Exactly 3 times {3}
Assert position at the end of a line (at the end of the string or before a line break character) $
Pattern 1 is \d{3}\-\d{3}\-\d{3}
Pattern 2 is \(\+\d{3}\)\d{2}\-\d{3}\-\d{3}
So you need to match for Pattern1 OR Pattern2:
(\d{3}\-\d{3}\-\d{3})|(\(\+\d{3}\)\d{2}\-\d{3}\-\d{3})
(?:\d|\(\+\d{3}\))\d{2}(?:-\d{3}){2}
Or, if you're concerned about performance, better change it to:
(?:\(\+\d{3}\)|\d)\d{2}(?:-\d{3}){2}
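For validating a textbox value, the non-capturing pattern needs ^ and $ anchors so that the whole input has to match, as in the first answer. A minimal sketch, with a hypothetical class name and sample inputs chosen for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhoneFormatValidator
{
    // Accepts 000-000-000 or (+000)00-000-000; the anchors force a full-string match.
    private static readonly Regex Pattern =
        new Regex(@"^(?:\(\+\d{3}\)|\d)\d{2}(?:-\d{3}){2}$");

    public static bool IsValid(string input) => Pattern.IsMatch(input);

    public static void Main()
    {
        foreach (var s in new[] { "000-000-000", "(+000)00-000-000", "0000-000-000", "000-000" })
        {
            Console.WriteLine($"{s}: {IsValid(s)}");
        }
    }
}
```

In a form, the value passed to IsValid would be the textbox's text.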

Regex statement for only numbers between 0 and 255 in C#

How do I write a regex that matches only numbers between 0 and 255? Both 0 and 255 should be valid.
You can find some numeric ranges here:
http://www.regular-expressions.info/numericranges.html
Your example would be:
^([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])$
^([0-9]{1,2}|1[0-9]{2}|2[0-4][0-9]|25[0-5])$
This tool is quite helpful for such things. A little searching doesn't hurt anyone either.
If you want to require zero-padded, three-digit input (000 to 255), the pattern needs to be adapted, though. E.g.:
^([01][0-9][0-9]|2[0-4][0-9]|25[0-5])$
Try a negative look behind:
(?<!\-)\b0*([0-9]{1,2}|1[0-9]{2}|2[0-4][0-9]|25[0-5])\b
Explanation
(?<!\-)\b0*([0-9]{1,2}|1[0-9]{2}|2[0-4][0-9]|25[0-5])\b
Options: ^ and $ match at line breaks
Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind) «(?<!\-)»
Match the character “-” literally «\-»
Assert position at a word boundary «\b»
Match the character “0” literally «0*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the regular expression below and capture its match into backreference number 1 «([0-9]{1,2}|1[0-9]{2}|2[0-4][0-9]|25[0-5])»
Match either the regular expression below (attempting the next alternative only if this one fails) «[0-9]{1,2}»
Match a single character in the range between “0” and “9” «[0-9]{1,2}»
Between one and 2 times, as many times as possible, giving back as needed (greedy) «{1,2}»
Or match regular expression number 2 below (attempting the next alternative only if this one fails) «1[0-9]{2}»
Match the character “1” literally «1»
Match a single character in the range between “0” and “9” «[0-9]{2}»
Exactly 2 times «{2}»
Or match regular expression number 3 below (attempting the next alternative only if this one fails) «2[0-4][0-9]»
Match the character “2” literally «2»
Match a single character in the range between “0” and “4” «[0-4]»
Match a single character in the range between “0” and “9” «[0-9]»
Or match regular expression number 4 below (the entire group fails if this one fails to match) «25[0-5]»
Match the characters “25” literally «25»
Match a single character in the range between “0” and “5” «[0-5]»
Assert position at a word boundary «\b»
Try
([01]?\d\d?|2[0-4]\d|25[0-5])
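To compare the anchored regex with plain parsing, here is a short sketch. The class and method names are made up; byte.TryParse is the standard .NET call, used here with NumberStyles.None so that only digits are accepted.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class ByteRangeValidator
{
    // Anchored pattern from above: 0-99, 100-199, 200-249, 250-255, no leading zeros beyond two digits.
    private static readonly Regex Pattern =
        new Regex(@"^([0-9]{1,2}|1[0-9]{2}|2[0-4][0-9]|25[0-5])$");

    public static bool IsInRangeRegex(string input) => Pattern.IsMatch(input);

    // Alternative without a regex: a byte holds exactly 0..255.
    // Unlike the pattern above, this also accepts zero-padded input such as "007".
    public static bool IsInRangeParsed(string input) =>
        byte.TryParse(input, NumberStyles.None, CultureInfo.InvariantCulture, out _);

    public static void Main()
    {
        foreach (var s in new[] { "0", "255", "256", "-1", "1000" })
        {
            Console.WriteLine($"{s}: regex={IsInRangeRegex(s)}, parsed={IsInRangeParsed(s)}");
        }
    }
}
```

Without the ^ and $ anchors, as in the last pattern above, Regex.IsMatch would also succeed on inputs like 1000 because a substring matches.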
