Regex - Match first and last character within capturing group - c#

I want to capture the first and last character within a capturing group.
My current RegEx is -
([\w\.]+)#([\w]+)\.com
For example, if there is an email address -
xyz#test.com
This is the output -
Full match 0-12 `xyz#test.com`
Group 1. 0-3 `xyz`
Group 2. 4-8 `test`
The email address can have alphanumeric and period values.
How can I restrict Group 1 so that it starts and ends with an alphanumeric character?
I want to modify this capturing group -
([\w\.]+)
The required output is -
xyz.#test.com Invalid
.xyz#test.com Invalid
xy.z#test.com Valid

To make the engine match English alphanumeric characters at the start and immediately before #, use:
^([a-zA-Z0-9][\.a-zA-Z0-9]*[a-zA-Z0-9])#([a-zA-Z0-9]+)\.com$
Note: \w includes _ that you may not desire.
But this doesn't allow one-character usernames, so you have to modify it a little:
^([a-zA-Z0-9]+(?:\.+[a-zA-Z0-9]+)*)#([a-zA-Z0-9]+)\.com$
Also, this shouldn't be considered a good email validator. Since you narrow matching down to the .com TLD, I assume this is a very specific requirement; otherwise it limits the domain name to alphanumerics and rejects many characters that are valid in an email address according to RFC 822. This would be enough for capturing an email address from user input:
^[^\s#]+#[^\s#]+$
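As a quick sanity check, the second pattern above can be run against the addresses from the question. This is a minimal sketch in Python rather than C# (the pattern uses no engine-specific syntax); the `is_valid` helper name is made up for illustration.

```python
import re

# Pattern from the answer: runs of alphanumerics separated by one or more dots,
# so the local part can never start or end with a dot.
PATTERN = re.compile(r"^([a-zA-Z0-9]+(?:\.+[a-zA-Z0-9]+)*)#([a-zA-Z0-9]+)\.com$")

def is_valid(address: str) -> bool:
    """Return True when the whole address matches PATTERN."""
    return PATTERN.match(address) is not None

for address in ["xyz.#test.com", ".xyz#test.com", "xy.z#test.com", "x#test.com"]:
    print(address, "Valid" if is_valid(address) else "Invalid")
```

The last sample is a one-character username, which the first pattern would have rejected.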

Try this regex - (^[\w][\w\.\w]+[\w])#([\w]+)\.com

This works:
^([0-9a-zA-Z][a-zA-Z0-9_\.]*)(?<!\.)#([a-zA-Z0-9_]+)\.com$
Basically, it matches an alphanumeric character at the start, then [a-zA-Z0-9_\.] zero or more times. Just before #, the negative lookbehind (?<!\.) checks that the preceding character is not a dot. Note that an underscore is still allowed in that position, because the character class includes _.
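The lookbehind behaviour can be checked with a short script. This is a sketch in Python, whose `re` module supports the same `(?<!...)` construct as .NET; the `matches` helper is a name invented for this example.

```python
import re

# Same pattern as above: the negative lookbehind rejects a dot right before '#'.
LOOKBEHIND = re.compile(r"^([0-9a-zA-Z][a-zA-Z0-9_\.]*)(?<!\.)#([a-zA-Z0-9_]+)\.com$")

def matches(address: str) -> bool:
    """Return True when the address satisfies the lookbehind pattern."""
    return LOOKBEHIND.match(address) is not None

for address in ["xyz.#test.com", ".xyz#test.com", "xy.z#test.com", "xy_#test.com"]:
    print(address, matches(address))
```

The last sample ends in an underscore and is accepted, which may or may not be what you want.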

Related

Regex pattern is not correct

I have the following regex to match this:
U$MichaelU$P#$asdqwe123P#$ (this one is valid; the other two are not)
U$NameU$P#$PasswordP#$
U$UserU$P#$ad2P#$
A registration is valid when:
The username is surrounded by "U$"
The username must be at least 3 characters long, start with an uppercase letter, and be followed only by lowercase letters
The password is surrounded by "P#$"
The password must start with at least 5 alphabetical letters (not including digits) and must end with a digit
My regex is
@"^(U\S)([A-z][a-z]{3,})\1(P#\S)([a-z]{5,}[^\d])([\d]+)\3$"
The problem is that it matches the first one, but when I submit it to the judge it passes the first 2 tests and fails the rest. Could you please tell me where my mistake is?
Hello, your regex should be
@"^(U\S)([A-Za-z]{3,})\1(P#\S)([A-Za-z0-9]{5,})\3$"
This should work for you.
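For comparison, here is a sketch of a pattern that follows the four rules as stated in the question, checked against the three sample lines. It is written in Python for convenience and is not the answerer's regex. It assumes the delimiters are the literal strings `U$` and `P#$`, and that anything between the five leading letters and the final digit is alphanumeric (the question does not say). `REGISTRATION` and `is_valid_registration` are hypothetical names.

```python
import re

REGISTRATION = re.compile(
    # Username: one uppercase letter followed by at least two lowercase letters.
    r"^U\$([A-Z][a-z]{2,})U\$"
    # Password: 5+ letters, then (assumed) alphanumerics, ending in a digit.
    r"P#\$([A-Za-z]{5,}[A-Za-z0-9]*\d)P#\$$"
)

def is_valid_registration(line: str) -> bool:
    """Return True when the line satisfies the stated registration rules."""
    return REGISTRATION.match(line) is not None

for line in ["U$MichaelU$P#$asdqwe123P#$",
             "U$NameU$P#$PasswordP#$",
             "U$UserU$P#$ad2P#$"]:
    print(line, is_valid_registration(line))
```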

Regex to match more than one word

I have an ASP.NET MVC application containing a form field called 'First/last name'. I need to add some basic validation to ensure people enter at least two words. It doesn't need to be totally comprehensive in checking word length etc, we essentially just need to prevent people from entering just their first name which is what's happening currently. I don't want to limit to just alphabetic characters as some names include punctuation. I just want to ensure that people have entered at least two words separated by a space.
I have the following regex currently:
[RegularExpression(@"^((\b[a-zA-Z]{2,40}\b)\s*){2,}$", ErrorMessage = "Invalid first/last name")]
This works to an extent (it checks for 2 words) but it's invalid if punctuation is entered, which isn't what I'm looking for.
Could anyone suggest how to modify the above so that it doesn't matter if punctuation is used in the words? I'm not good with the regular expression syntax, hence asking here.
Thanks.
You want two words, so at least one space between them, and beyond that you want to allow everything else (e.g., punctuation). So keep it simple:
\w.*\s.*\w
Or if you must anchor it to start and end:
^.*\w.*\s.*\w.*$
These will match, for example, D' Addario (but not D'Artagnan by itself, since it counts as one word by the space criterion).
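A small check of the unanchored pattern against the names mentioned above, sketched in Python (the pattern itself is engine-neutral). The helper name `has_two_words` is made up for this example.

```python
import re

# A word character, anything, a whitespace character, anything, a word character.
TWO_WORDS = re.compile(r"\w.*\s.*\w")

def has_two_words(name: str) -> bool:
    """Return True when the name contains two word characters split by whitespace."""
    return TWO_WORDS.search(name) is not None

for name in ["D' Addario", "D'Artagnan", "Bill", "Bill Williams"]:
    print(repr(name), has_two_words(name))
```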
Maybe just:
@"\w\s\w"
That is: word character, whitespace, word character.
Hi, you can use this regex for validation:
^[a-zA-Z0-9]+ {1}[a-zA-Z0-9]+$
Demo http://rubular.com/r/YN8eFa1yFE
If you just want to allow a sequence of non-whitespace characters followed by 1 or more sequences of whitespace characters followed by non-whitespace characters, you can use
^\s*\S+(?:\s+\S+)+\s*$
It won't accept just "First" or "First " (a single word with a trailing space).
Regex breakdown:
^ - start of string
\s* - zero or more whitespace
\S+ - 1 or more non-whitespace symbols
(?:\s+\S+)+ - 1 or more sequences of ...
\s+ - 1 or more whitespace characters (remove + to allow only 1 whitespace between words)
\S+ - 1 or more non-whitespace symbols
\s* - zero or more whitespace
$ - end of string
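The breakdown above can be exercised with a few inputs. This is a sketch in Python; the same pattern should behave the same way in a .NET `RegularExpression` attribute, though that is not tested here. `is_full_name` is an illustrative name.

```python
import re

# At least two runs of non-whitespace separated by whitespace,
# with optional leading and trailing whitespace.
FULL_NAME = re.compile(r"^\s*\S+(?:\s+\S+)+\s*$")

def is_full_name(value: str) -> bool:
    """Return True when the value contains at least two whitespace-separated words."""
    return FULL_NAME.match(value) is not None

for value in ["First", "First ", "First Last", "  Anne-Marie  O'Neil  "]:
    print(repr(value), is_full_name(value))
```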

Regular Expression group reversed order

I am reading in a very messy file with very little (if any) format. I am looking for the following three items, two of which I have working properly.
Name (first and last) working
Email addresses (varying types (eg. .edu .net .com) There could be others as well.) working
Employee number (two capital letters followed by 5 digit values then the same two letters as the first but reversed) NOT Working
The code I have currently for the Employee regex:
string employeeNumber = @"(?<grp1>[A-Z]{2})[0-9]{5}[A-Z]{2}";
This finds the required values, but would also find invalid employee numbers since it is not actually looking for the first two capital chars in the opposite order.
What I would like in the end is to somehow use <grp1> again, but in reversed order.
Example of a valid employee number XY12345YX.
I could not find any good documentation on any type of regular expression group reversal. Any Ideas would be great!
EDIT:
This is an example of a line from a text document that I am reading in.
'Name list from PQP-97 system &%$ Bill Williams MK12345KM bwilliams01#msn.com ^ %20%
Fredericka Hanover GW22887WG freddie#verizon.net'
Try this:
/.*?([A-Z][a-z]*)\s+([A-Z][a-z]*)\s+(([A-Z])([A-Z])[0-9]{5}\5\4)\s+(\S+#\S+).*/g
Regex101 Demo: https://regex101.com/r/iB9vF2/2
Match1 = First Name
Match2 = Last Name
Match3 = Employee ID
Match4 = (ignore this; just used for finding employee id)
Match5 = (ignore this; just used for finding employee id)
Match6 = Email
Explanation:
.*? - ignore any rubbish before the first name
([A-Z][a-z]*) - first name begins with a capital followed by any number of lower case letters
\s+ - 1 or more spaces marks the end of the first name
([A-Z][a-z]*) - last name follows first name, and follows the same pattern
\s+ - last name terminated by space(s)
(([A-Z])([A-Z])[0-9]{5}\5\4) - employee id follows last name, in the format Capital1, Capital2 then 5 digits, then a repeat of Capital2 (match5) and Capital1 (match4)
\s+ - space(s) shows the end of the employee id
(\S+#\S+) - non space characters either side of an # symbol make up the email*
.* - this just allows for junk on the end of the string. It won't match the mail, since the \S+ is greedy, but it will cater for any other character, thus also representing the end of the email.
* NB: the email regex is overly simple; should be enough for your needs, but this couldn't check for valid emails, since the rules around those are complex.
Further reading: Using a regular expression to validate an email address
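The key trick, capturing each letter in its own group and back-referencing the groups in swapped order, can be isolated in a short sketch. It is written in Python, but numbered backreferences work the same way in .NET. The `find_employee_ids` helper and the `\b` word boundaries are additions for this example, not part of the answer above.

```python
import re

# Two capitals captured separately, five digits, then the same capitals swapped.
EMPLOYEE_ID = re.compile(r"\b([A-Z])([A-Z])[0-9]{5}\2\1\b")

def find_employee_ids(text: str) -> list[str]:
    """Return every employee number whose trailing letters mirror the leading ones."""
    return [m.group(0) for m in EMPLOYEE_ID.finditer(text)]

sample = (
    "Name list from PQP-97 system &%$ Bill Williams MK12345KM "
    "bwilliams01#msn.com ^ %20%\n"
    "Fredericka Hanover GW22887WG freddie#verizon.net"
)
print(find_employee_ids(sample))       # ['MK12345KM', 'GW22887WG']
print(find_employee_ids("XY12345XY"))  # [] because the letters are not reversed
```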

Regular expression for adding special character to phone number

I have added the following regular expression for validating a mobile phone number:
(^07[1,2,3,4,5,7,8,9][0-9]{7,8}$)
I want to allow the user to enter a # character too and I'm not sure where to fit it in. They may need to enter # character after they have dialed a number, or at the beginning of a number to dial a direct number or an extension.
First, your current regex also matches 'numbers' of the format 07,12345678, because a comma inside a character class is a literal comma. So you need to change [1,2,3,4,5,7,8,9] to [1-9] (a - between two characters in a character class usually denotes a range). Note that the original class omits 6; if that is intentional, use [1-57-9] instead.
If you want to accept an optional # character, you can use the ? quantifier which means 0 or 1 times.
^#?07[1-9][0-9]{7,8}#?$
Except that, as you can see in the demo, it will also match numbers with two hashes; one at the front and one at the end. One option to circumvent this is to use some conditionals (which C# can support).
^(#)?07[1-9][0-9]{7,8}(?(1)|#?)$
(?(1)|#?) basically means that if the first hash was matched, then nothing more should be matched. Otherwise, if no hash was initially matched, then it can match a hash, if there is one at the end of the number.
In C#, it will be a bit like this:
Regex.Match(myString, @"^(#)?07[1-9][0-9]{7,8}(?(1)|#?)$");
Or you could use a negative lookahead to make sure there's never more than one hash in the number:
^(?!.*#.*#.*$)#?07[1-9][0-9]{7,8}#?$
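Both variants can be compared side by side. The sketch below uses Python, whose `re` module also supports the `(?(1)...|...)` conditional and negative lookahead; the sample numbers are invented for illustration.

```python
import re

# Variant 1: if group 1 (leading '#') matched, allow nothing more; else allow a trailing '#'.
CONDITIONAL = re.compile(r"^(#)?07[1-9][0-9]{7,8}(?(1)|#?)$")
# Variant 2: reject up front any input that contains two '#' characters.
LOOKAHEAD = re.compile(r"^(?!.*#.*#.*$)#?07[1-9][0-9]{7,8}#?$")

def accepted(pattern: re.Pattern, number: str) -> bool:
    """Return True when the number matches the given pattern."""
    return pattern.match(number) is not None

samples = ["0712345678", "#0712345678", "0712345678#", "#0712345678#", "07,12345678"]
for number in samples:
    print(number, accepted(CONDITIONAL, number), accepted(LOOKAHEAD, number))
```

The two patterns agree on all five samples: a single hash at either end is accepted, while two hashes or a comma are rejected.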

How to match a comma separated list of emails with regex?

Trying to validate a comma-separated email list in the textbox with asp:RegularExpressionValidator, see below:
<asp:RegularExpressionValidator ID="RegularExpressionValidator1"
runat="server" ErrorMessage="Wrong email format (separate multiple email by comma [,])" ControlToValidate="txtEscalationEmail"
Display="Dynamic" ValidationExpression="([\w+-.%]+#[\w-.]+\.[A-Za-z]{2,4},?)" ValidationGroup="vgEscalation"></asp:RegularExpressionValidator>
It works just fine when I test it at http://regexhero.net/tester/, but it doesn't work on my page.
Here's my sample input:
test#test.com,test1#test.com
I've tried a suggestion in this post, but couldn't get it to work.
p.s. I don't want a discussion on proper email validation
This Regex will allow emails with spaces after the commas.
^[\W]*([\w+\-.%]+#[\w\-.]+\.[A-Za-z]{2,4}[\W]*,{1}[\W]*)*([\w+\-.%]+#[\w\-.]+\.[A-Za-z]{2,4})[\W]*$
Playing around with this, a colleague came up with this RegEx that's more accurate. The above answer seems to let through an email address list where the first element is not an email address. Here's the update which also allows spaces after the commas.
Try this:
^([\w+-.%]+#[\w-.]+\.[A-Za-z]{2,4},?)+$
Adding the + after the parentheses means that the preceding group can be present 1 or more times.
Adding the ^ and $ means that anything between the start of the string and the start of the match (or the end of the match and the end of the string) causes the validation to fail.
The first answer, which is selected as best, also matches strings like abc#xyz.comxyz#abc.com, which are invalid.
The following regex works well for comma-separated email IDs.
^([\w+-.%]+#[\w.-]+\.[A-Za-z]{2,4})(,[\w+-.%]+#[\w.-]+\.[A-Za-z]{2,4})*$
It matches a single email ID or a comma-separated list of email IDs, but not a list where a comma is missing.
The first group matches a single email ID. The second group is made optional by the '*' quantifier (zero or more repetitions), but each repetition must begin with ',', which is what forces the email IDs to be comma-separated.
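The claims above can be checked with a short script, sketched here in Python with the pattern copied as-is. The `is_email_list` name is illustrative.

```python
import re

# One email, then zero or more ",email" repetitions; nothing else is allowed.
EMAIL_LIST = re.compile(
    r"^([\w+-.%]+#[\w.-]+\.[A-Za-z]{2,4})(,[\w+-.%]+#[\w.-]+\.[A-Za-z]{2,4})*$"
)

def is_email_list(value: str) -> bool:
    """Return True when the value is a comma-separated list of addresses."""
    return EMAIL_LIST.match(value) is not None

for value in ["test#test.com",
              "test#test.com,test1#test.com",
              "abc#xyz.comxyz#abc.com",
              "test#test.com,"]:
    print(value, is_email_list(value))
```

One caveat: inside the character class, `+-.` is a range from `+` to `.`, which also includes the comma, so an input such as `a,b#c.com` is accepted as a single address. Writing the class as `[\w+.%-]` avoids that.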
A simple modification of @Donut's answer allows adjacent commas, all TLDs of two characters or more, and arbitrary whitespace between email addresses and commas.
^([\w+-.%]+#[\w-.]+\.[A-Za-z]{2,}(\s*,?\s*)*)+$
You will need to split and remove whitespace and empty strings on your side, but this should be an overall better user experience.
Examples of matched lists:
person#example.co,chris#o.com,simon#example.capetown
person#example.co ,, chris#o.com, simon#example.capetown
^([a-zA-Z0-9_.+-]+#[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+,*[\W]*)+$
This will also work. It's a little bit stricter on emails, and doesn't require that more than one email address be entered or that a comma be present at all.
The following RegEx will work even with some of the weirdest emails out there, and it supports a comma between emails.
((?:[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")#(?:(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\.)+[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-zA-Z0-9-]*[a-zA-Z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\]),?)+
A few Examples:
Valid: planetearth#solar.com
Valid: planet.earth#solar.com
Valid: planet.earth#solar.com,blue.planet#solar.com
Valid: planet-earth#solar-system.com,/#!$%&'*+-/=?^_`{}|~#solar.org,"!#$%&'-/=^_`{}|~.a"#solar.org
Invalid: planet earth#solar.com
Hope This helps.
^([\w+.%-]+#[\w.-]+\.[A-Za-z]{2,})( *,+ *(?1))*( *,* *)$
The point about requiring a comma between groups, but not necessarily at the end, is handled here. I'm mostly adding this because it reuses the first group via (?1), so you only define the actual email address regex once and can then muck about with delimiters. Note that (?1) is a PCRE-style subroutine call and is not supported by the .NET regex engine.
Email address ref here: https://www.regular-expressions.info/email.html
The regex below is less restrictive and more appropriate for validating a manually-entered list of comma-separated email addresses. It allows for adjacent commas.
^([\w+-.%]+#[\w-.]+\.[A-Za-z]{2,4},*[\W]*)+$
Use the following regex; it should resolve your problem. It also allows spaces before and after the comma.
/^((([a-zA-Z0-9_-.]+)#([a-zA-Z0-9_-.]+).([a-zA-Z\s?]{2,5}){1,25})(\s?,\s*?))$/
I'm a bit late to the party, I know, but I figured I'd add my two cents, since the accepted answer has the problem of matching email addresses next to each other without a comma.
My proposed regex is this:
^[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,}(,[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,})*$
It's similar to the accepted answer, but solves the problem I was talking about. The solution I came up with was instead of searching for "an email address followed by an optional comma" one or more times, which is what the accepted answer does, this regex searches for "an email address followed by an optional comma prefixed email address any number of times".
That solves the problem by grouping the comma with the email address after it, and making the entire group optional, instead of just the comma.
Notes:
This regex is meant to be used with the insensitive flag enabled.
You can use whichever regex to match an email address you please, I just used the one that I was already using. You would just replace each [A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,} with whichever regex you want to use.
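The second note suggests a natural way to build the pattern: define the single-address regex once and compose the list regex from it. A minimal sketch in Python, with `re.IGNORECASE` standing in for the insensitive flag; `EMAIL`, `EMAIL_LIST` and `is_email_list` are names chosen for this example.

```python
import re

# Single-address pattern; swap in whichever email regex you prefer.
EMAIL = r"[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,}"

# "An email address followed by any number of comma-prefixed email addresses."
EMAIL_LIST = re.compile(rf"^{EMAIL}(,{EMAIL})*$", re.IGNORECASE)

def is_email_list(value: str) -> bool:
    """Return True when the value is one address or a comma-separated list of them."""
    return EMAIL_LIST.match(value) is not None

for value in ["test#test.com,test1#test.com",
              "abc#xyz.comxyz#abc.com",
              "test#test.com,"]:
    print(value, is_email_list(value))
```

Because the comma belongs to the repeated group, neither adjacent addresses without a comma nor a trailing comma can match.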
The solution that worked for me is the following:
^([a-zA-Z0-9_.+-]+#[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)(,([a-zA-Z0-9_.+-]+#[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+))*
The easiest solution would be as following. This will match the string with comma-separated list. Use the following regex in your code.
Regex: '[^,]+,?'
^([\w+-.%]+#[\w-.]+\.[A-Za-z]+)(, ?[\w+-.%]+#[\w-.]+\.[A-Za-z]+)*$
Works correctly with 0 or 1 spaces after each comma and also for long domain extensions
This works for me in JS and TS
^([a-z0-9!#$%&'*+/=?^_`{|}~.-]+#[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z0-9]([a-z0-9-]*[a-z0-9])?)*)(([, ]+[a-z0-9!#$%&'*+/=?^_`{|}~.-]+#[a-z0-9]([a-z0-9-]*[a-z0-9])\.([a-z0-9]([a-z0-9-]*[a-z0-9]))*)?)*$
You can check it out here
https://regex101.com/r/h0l9ks/1
The regex I have for this issue works well, except that it requires a comma after every email address.
^((\s*?)[a-zA-Z0-9._%-]+#[a-zA-Z0-9.-]+\.[a-zA-Z,]{2,4}(\s*?),)*
The explanation for this will be like this:
(\s*?) will allow spaces at the start.
[a-zA-Z0-9._%-]+#[a-zA-Z0-9.-]+\.[a-zA-Z,]{2,4} is common email pattern.
(\s*?) will allow space at the end too.
, requires a comma after each address.
For me, this one works perfectly for multiple emails:
^(\w+((-\w+)|(\.\w+))*\#[A-Za-z0-9]+((\.|-)[A-Za-z0-9]+)*\.[A-Za-z0-9]{2,4}\s*?,?\s*?)+$
Regex breakdown:
^ - matches the start of the string
\w+ - matches one or more word characters (letters, digits or underscores)
((-\w+)|(\.\w+))* - matches zero or more occurrences of a hyphen followed by one or more word characters, or a period followed by one or more word characters
\# - matches the # symbol
[A-Za-z0-9]+ - matches one or more letters or digits
((\.|-)[A-Za-z0-9]+)* - matches zero or more occurrences of a period or hyphen followed by one or more letters or digits
\.[A-Za-z0-9]{2,4} - matches a period followed by two to four letters or digits
\s*?,?\s*? - matches optional whitespace followed by an optional comma followed by optional whitespace
+ - matches one or more occurrences of the entire group
$ - matches the end of the string
