regular expression validation not to allow asterisk - c#

I am trying to disallow the asterisk character in my validation.
My regex is
addressFormat="^[a-zA-Z0-9 \~\!\@\#\$\%\^\*\(\)_\'\-\+\=\{\}\[\]\|\:\;\,\.\?\/]{0,45}$"
Following the advice in a linked answer, I tried adding [^\*] as below.
"^[a-zA-Z0-9 \~\!\@\#\$\%\^\*\(\)_\'\-\+\=\{\}\[\]\|\:\;\,\.\?\/][^\*]{0,45}$"
"^[^\*][a-zA-Z0-9 \~\!\@\#\$\%\^\*\(\)_\'\-\+\=\{\}\[\]\|\:\;\,\.\?\/]{0,45}$"
But the asterisk character * is still allowed in my textbox. What is the mistake in my code? Any suggestions?

Your regex can be simplified to:
"^[a-zA-Z0-9 ~!@#$%^*()_'+={}\[\]|:;,.?/-]{0,45}$"
and, as [a-zA-Z0-9_] is roughly what \w matches (in .NET, \w also matches letters and digits from other scripts unless RegexOptions.ECMAScript is set):
"^[\w ~!@#$%^*()'+={}\[\]|:;,.?/-]{0,45}$"
then you could remove the *:
"^[\w ~!@#$%^()'+={}\[\]|:;,.?/-]{0,45}$"

First, for your information, you can simplify your regex to:
^(?i)[-a-z0-9 ~!@#$%^()_'+={}[\]|:;,.?/]{0,45}$
Since you are using C#, resist the temptation to replace [0-9a-z_] with \w unless you use the ECMAScript option: .NET regexes are Unicode-aware by default, so \w will happily match Arabic-Indic digits, Nepalese characters and so forth, which you might not want... Unless this is okay:
abcdᚠᚱᚩᚠᚢᚱტყაოსdᚉᚔమరמטᓂᕆᔭᕌसられま래도654۳۲١८৮੪૯୫୬१७੩௮௫౫೮൬൪๘໒໕២៧៦᠖
(But that's 60 chars, over your 45 limit anyway... Whew.)
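The \w behaviour described above can be checked with a short sketch. It uses Python's re module as a stand-in (an assumption, since the question is about C#): there too \w is Unicode-aware by default, and the re.ASCII flag plays roughly the role that the ECMAScript option plays in .NET.

```python
import re

ARABIC_INDIC_THREE = "\u0663"  # a digit from another script
sample = "abc" + ARABIC_INDIC_THREE

# \w is Unicode-aware by default, so the non-ASCII digit is accepted.
unicode_word = re.fullmatch(r"\w+", sample) is not None

# Restricting \w to ASCII (comparable in spirit to the ECMAScript option).
ascii_word = re.fullmatch(r"\w+", sample, re.ASCII) is not None

# An explicit class never matches characters it does not list.
explicit_class = re.fullmatch(r"[a-zA-Z0-9_]+", sample) is not None

print(unicode_word, ascii_word, explicit_class)  # True False False
```

Spelling out the class explicitly keeps the accepted alphabet visible in the pattern itself, which is usually what an address-format validator wants.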
More interestingly:
What was wrong before?
When you have a regex such as [^*][a-z] (simplifying your earlier expression), the [^*] matches exactly one character, then the [a-z] matches exactly one other character (the next one). They do not combine to impose a joint condition on a single character. Each of them is a character class, and each character class specifies the next character to be matched, subject to an optional quantifier (in your case, the {0,45}).
Would this work?
On the surface, this might look like the ticket, but I do not recommend it:
^[^*]{0,45}$
Why not? This matches any character that is not an asterisk, zero to 45 times. That sounds good, but eligible characters would include tabs, new lines, and any glyph in any language... Probably not what you are looking for.
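Both points made in this answer (consecutive classes each consume their own character, and [^*] on its own is too permissive) can be tried out with a small sketch. It is written in Python rather than C#, and the patterns are shortened stand-ins for the ones in the question; the accepts helper is a hypothetical name.

```python
import re

# Shortened stand-ins for the patterns in the question (assumption: most of
# the punctuation list is omitted to keep the example readable).
attempt = re.compile(r"^[^*][a-zA-Z0-9 *]{0,45}$")  # [^*] only guards the first character
star_removed = re.compile(r"^[a-zA-Z0-9 ]{0,45}$")  # * is simply not listed in the class
any_but_star = re.compile(r"^[^*]{0,45}$")          # excludes * but allows everything else


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True when the text satisfies the anchored pattern."""
    return pattern.match(text) is not None


print(accepts(attempt, "ab*c"))            # True: the second class still lists *
print(accepts(attempt, "*abc"))            # False: only the first character is guarded
print(accepts(star_removed, "ab*c"))       # False: * is no longer an allowed character
print(accepts(any_but_star, "tab\there"))  # True: tabs and other characters slip through
```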

Delete \* from your expression.
Also look at this link - it's really helpful when you are writing regular expressions.
jsFiddle example
HTML
<form>
<input type="text" required pattern="^[a-zA-Z0-9 \~\!\#\#\$\%\^\(\)_\'\-\+\=\{\}\[\]\|\:\;\,\.\?\/]{0,45}$" title="incorrect format"/>
<input type="submit"/>
</form>

Related

Difficulty finding where to insert "word exclusion" in a regex

I know the regex for excluding words, roughly anyway. It would be (?!wordToIgnore|wordToIgnore2|wordToIgnore3)
But I have an existing, complicated regex that I need to add this to, and I am a bit confused about how to go about that. I'm still pretty new to regex, and it took me a very long time to make this particular one, but I'm not sure where to insert it or how ...
The regex I have is ...
^(?!.*[ ]{2})(?!.*[']{2})(?!.*[-]{2})(?:[a-zA-Z0-9 \:/\p{L}'-]{1,64}$)$
This should only allow the person typing to insert between 1 and 64 letters that match that pattern, cannot start with a space, quote, double quote, special character, a dash, an escape character, etc, and only allows a-z both upper and lowercase, can include a space, ":", a dash, and a quote anywhere but the beginning.
But I want to forbid them from using certain words, so I have this list of words that I want to be forbidden, I just cannot figure out how to get that to fit into here.. I tried just pasting the whole .. "block" in, and that didn't work.
?!the|and|or|a|given|some|that|this|then|than
Has anyone encountered this before?
ciel, first off, congratulations for getting this far trying to build your regex rule. If you want to read something detailed about all kinds of exclusions, I suggest you have a look at Match (or replace) a pattern except in situations s1, s2, s3 etc
Next, in your particular situation, here is how we could approach your regex.
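For concision, let's make all the negative lookarounds more compact, replacing them with a single (?!.*(?: |-|'){2}). Note that this is slightly stricter than the original three lookaheads: it also rejects mixed pairs such as a space followed by a dash. To reject only doubled characters, use a backreference instead: (?!.*([ '-])\1)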
In your character class, the \: just escapes the colon, needlessly so as : is enough. I assume you wanted to add a backslash character, and if so we need to use \\
\p{L} includes [a-zA-Z], so you can drop [a-zA-Z]. But are you sure you want to match all letters in any script? (Thai etc). If so, remember to set the u flag after the regex string.
For your "bad word exclusion" applying to the whole string, place it at the same position as the other lookarounds, i.e., at the head of the string, but using the .* as in your other exclusions: (?!.*(?:wordToIgnore|wordToIgnore2|wordToIgnore3)) It does not matter which lookahead comes first because lookarounds do not change your position in the string. For more on this, see Mastering Lookahead and Lookbehind
This gives us this glorious regex (I added the case-insensitive flag):
^(?i)(?!.*(?:wordToIgnore|wordToIgnore2|wordToIgnore3))(?!.*(?: |-|'){2})(?:[\\0-9 :/\p{L}'-]{1,64}$)$
Of course if you don't want unicode letters, replace \p{L} with a-z
Also, if you want to make sure that the wordToIgnore is a real word, as opposed to an embedded string (for instance you don't want cat but you are okay with catalog), add boundaries to the lookahead rule: (?!.*\b(?:wordToIgnore|wordToIgnore2|wordToIgnore3)\b)
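The difference between an embedded-string exclusion and a whole-word exclusion can be seen in a short sketch. It is written in Python as a stand-in; the word list is a shortened, hypothetical one, a-z replaces \p{L} because Python's re module has no \p{L}, and case-insensitivity is passed as a flag rather than inline.

```python
import re

# Assumptions: shortened word list, ASCII letters instead of \p{L}.
CHARS = r"[a-z0-9 :/'-]{1,64}"
NO_DOUBLES = r"(?!.*(?: |-|'){2})"

embedded = re.compile(
    r"^(?!.*(?:the|and|or|a))" + NO_DOUBLES + CHARS + "$", re.IGNORECASE
)
whole_word = re.compile(
    r"^(?!.*\b(?:the|and|or|a)\b)" + NO_DOUBLES + CHARS + "$", re.IGNORECASE
)


def allowed(pattern, text):
    """Return True when the text passes every lookahead and the character check."""
    return pattern.match(text) is not None


print(allowed(embedded, "catalog entry"))    # False: the letter "a" inside "catalog" is enough
print(allowed(whole_word, "catalog entry"))  # True: no forbidden word stands alone
print(allowed(whole_word, "a catalog"))      # False: "a" appears as a whole word
print(allowed(whole_word, "blue  sky"))      # False: doubled space
```

With short entries such as "a" or "or" in the list, the \b version is almost certainly the one that is wanted, since the embedded version rejects nearly every ordinary phrase.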
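use this (with \b word boundaries, so that short entries such as a and or are rejected only as whole words rather than as letters inside other words):
^(?!.*\b(?:the|and|or|a|given|some|that|this|then|than)\b)(?!.*[ ]{2})(?!.*[']{2})(?!.*[-]{2})(?:[a-zA-Z0-9 \:/\p{L}'-]{1,64}$)$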
see demo

What does .* do in regex?

After extensive search, I am unable to find an explanation for the need to use .* in regex. For example, MSDN suggests a password regex of
@\"(?=.{6,})(?=(.*\d){1,})(?=(.*\W){1,})"
for length >= 6, 1+ digit and 1+ special character.
Why can't I just use:
@\"(?=.{6,})(?=(\d){1,})(?=(\W){1,})"
.* just means "0 or more of any character"
It's broken down into two parts:
. - a "dot" indicates any character
* - means "0 or more instances of the preceding regex token"
In your example above, this is important, since they want to force the password to contain a special character and a number, while still allowing all other characters. If you used \d instead of .*\d, for example, that lookahead would only succeed when the very next character is a decimal digit (\d is shorthand for a decimal digit such as [0-9]). Similarly, \W instead of .*\W would only succeed when the very next character is a non-word character.
A good reference containing many of these tokens for .NET can be found on the MSDN here: Regular Expression Language - Quick Reference
Also, if you're really looking to delve into regex, take a look at http://www.regular-expressions.info/. While it can sometimes be difficult to find what you're looking for on that site, it's one of the most complete and beginner-friendly regex references I've seen online.
Just FYI, that regex doesn't do what they say it does, and the way it's written is needlessly verbose and confusing. They say it's supposed to match more than seven characters, but it really matches as few as six. And while the other two lookaheads correctly match at least one each of the required character types, they can be written much more simply.
Finally, the string you copied isn't just a regex, it's an XML attribute value (including the enclosing quotes) that seems to represent a C# string literal (except the closing quote is missing). I've never used a Membership object, but I'm pretty sure that syntax is faulty. In any case, the actual regex is:
(?=.{6,})(?=(.*\d){1,})(?=(.*\W){1,})
..but it should be:
(?=.{8,})(?=.*\d)(?=.*\W)
The first lookahead tries to match eight or more of any characters. If it succeeds, the match position (or cursor, if you prefer) is reset to the beginning and the second lookahead scans for a digit. If it finds one, the cursor is reset again and the third lookahead scans for a special character. (Which, by the way, includes whitespace, control characters, and a boatload of other esoteric characters; probably not what the author intended.)
If you left the .* out of the latter two lookaheads, you would have (?=\d) asserting that the first character is a digit, and (?=\W) asserting that it's not a digit. (Digits are classed as word characters, and \W matches anything that's not a word character.) The .* in each lookahead causes it to initially gobble up the whole string, then backtrack, giving back one character at a time until it reaches a spot where the \d or \W can match. That's how they can match the digit and the special character anywhere in the string.
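The effect of dropping .* from the lookaheads can be checked with a short sketch, written here in Python as a stand-in for the .NET regex; the passes helper is a hypothetical name.

```python
import re

with_dot_star = re.compile(r"(?=.{8,})(?=.*\d)(?=.*\W)")
without_dot_star = re.compile(r"(?=.{8,})(?=\d)(?=\W)")


def passes(pattern, password):
    """Test the lookaheads at the start of the password only."""
    return pattern.match(password) is not None


print(passes(with_dot_star, "passw0rd!"))     # True: digit and special found further along
print(passes(with_dot_star, "password!"))     # False: no digit anywhere
print(passes(without_dot_star, "passw0rd!"))  # False: first character must be a digit AND a non-word character
# No single character is both \d and \W, so the second pattern cannot match anywhere.
print(without_dot_star.search("1!passw0rd") is None)  # True
```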
The .* portion just allows any combination of characters to be entered. It essentially lets the user include any other characters in the password on top of the ones you require.
Note: I don't think that MSDN page is actually suggesting that as a password validator. It is just providing an example of a possible one.

ASP.Net: RegularExpressionValidator ValidationExpression to prevent `;` and `–` at a time

I want to write an ASP.Net RegularExpressionValidator ValidationExpression that prevents the input of ; and –-.
I can do it for – and it is ValidationExpression="[^-]*".
<asp:RegularExpressionValidator ID="RegularExpressionValidator1"
runat="server"
ControlToValidate="TextBox1" Display="Dynamic" ErrorMessage="**"
ValidationExpression="[^-]*"></asp:RegularExpressionValidator>
The above expression blocks even a single – character. I want to allow a single – character and block only double – characters (--).
I need to block both ; and -- at the same time.
Can anyone help me?
The Regex needs to get smarter. You want to block multiple hyphens, the long dash (–), and semicolons.
So, try @"^(?=[^–;]*$)((?!--).)*$"
Breaking it down:
The ^ matches the start of a line, and helps ensure that the validator is used to match the entire string.
The expression in the first set of parentheses will match any run of characters that does not include the long dash or a semicolon; the $ inside it forces that run to extend to the end of the string (without it, [^–;]* could match zero characters and the check would always pass). You may want to substitute a Unicode escape for the dash: \u2013 for the en dash shown here, or \u2014 for an em dash (\x takes only two hex digits in .NET, so \x2014 would not work). It is defined as non-consuming with the ?=, meaning the Regex engine must match this pattern but will not advance its index when it finds the match, so the same set of characters will be tested for the next pattern as well.
The expression in the second set of parentheses uses a negative look-ahead; it will match any run of characters not containing two (or more) adjacent hyphens. This may be a bit slow, however; this regex basically forces the Regex engine to consider each character one at a time, looking ahead from that point to ensure that the next two characters are not a double hyphen.
The trailing $ marks the end of a line; together with the ^ it ensures that you are looking at everything in a single string (or line in multiline data) when determining a match.
Plug this into a Regex tester like Derek Slater's and play around with it to make sure it will stop all the scenarios you want.
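In the spirit of playing around with it, here is a small sketch (Python's re module standing in for the ASP.NET validator, which is an assumption) that compares the first lookahead with and without an end anchor inside it. The variable names are hypothetical.

```python
import re

EN_DASH = "\u2013"

# Lookahead without an end anchor: [^...]* may match zero characters,
# so this lookahead can never fail.
unanchored = re.compile(r"^(?=[^\u2013;]*)((?!--).)*$")
# Lookahead with $ inside: the run of allowed characters must reach the end.
anchored = re.compile(r"^(?=[^\u2013;]*$)((?!--).)*$")

for text in ["a-b", "a;b", "a--b", "a" + EN_DASH + "b"]:
    print(repr(text), bool(unanchored.match(text)), bool(anchored.match(text)))
```

Only the anchored variant rejects the semicolon and the en dash; both variants reject the double hyphen, because that part is handled by the second group.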

Password Regex (client side javascript)

I need a regex for the following criteria:
At least 7 alphanumeric characters with 1 special character
I used this:
^.*(?=.{7,})(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$!%^&+=]).*$
It works fine if I type Password1! but doesn't work for PASSWORD1!.
It won't work for: Stmaryshsp1tal!
I am using the jQuery validation plugin where I specify the regex.
When I use a regular expression validator and specify the following regex:
^.*(?=.{7,})(?=(.*\W){1,}).*$
It works perfectly without any issues. When I set this regex in the jQuery validation I am using, it doesn't work.
Please can someone shed some light on this? I want to understand why my first regex doesn't work.
(?=.\d)(?=.[a-z])
tries to match a digit and a lowercase letter at the same place (the second character). Remember that (?= ... ) does not consume anything. (This applies to the pattern as written here, with a bare dot; with .* before each class, as in the question, the two lookaheads can be satisfied at different positions.)
What you want is probably:
^(?=.*\W)(?=(.*\w){7})
This is exactly the same as verifying that your string matches both ^.*\W (at least one special character) and ^(.*\w){7} (7 alphanumeric characters). Note that it also matches if there are more.
Try this regex:
\S*[@#$!%^&+=]+\S*(?<=\S{7,})
EDIT3: Ok, this is last edit ;).
This will also match other special characters. So if you want to limit the set of valid characters, change \S to a range of all valid characters.
Here is the regex; I think it can handle all possible combinations.
^(?=.{7,})\w*[.@#$!%^&+=]+(\w*[.@#$!%^&+=]*)*$
here is the link for this regex, http://regexr.com?2tuh5
As a good tool for quickly testing regular expressions I'd suggest http://regexpal.com/ (no relations ;) ). Sometimes simplifying your expression helps a lot.
Then you might want to try something like ^[a-zA-Z0-9@#$!%^&+=]{7,}$
Update 2 now including digits
^.*(?=.{7,})(?=.*\d)(?=.*[a-zA-Z])(?=.*[@#$%^&+=!]).*$
This matches:
Stmarysh3sptal!, password1!, PASSWORD1P!!!!!!@#^^ASSWORD1, 122ss121a212!!
... but not:
Password1, PASSWORD1PASSWORD1, PASSWORD!, Password!, 1221121212!! etc
The reason it matches Password1! but not PASSWORD1! is this clause:
(?=.*[a-z])
That requires at least one lowercase letter in the password. The pattern says that the password must be at least 7 characters long, and contain both uppercase and lowercase letters, at least one number, and at least one of @#$!%^&+=. PASSWORD1! fails because there are no lowercase letters in it.
The second pattern accepts PASSWORD1! because it's a far, far weaker password requirement. All it requires is that the password is 7+ characters and has at least one special character in it (other than _). The {1,} is unnecessary, by the way.
If I were you, I'd avoid weakening the password and just leave it as it is. If I wanted to allow all-lowercase or all-uppercase passwords for some reason, I'd simply change it to
^(?=.*\d)(?=.*[a-zA-Z])(?=.*[@#$!%^&+=]).{7,}$
...thus not weakening the password requirements any more than I had to.
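The explanation above can be verified with a short sketch. It is written in Python rather than client-side JavaScript (an assumption made for testability; these particular constructs behave the same way in both), and the variable names are hypothetical.

```python
import re

# The original pattern: needs a digit, a lowercase letter, an uppercase
# letter and a special character, with at least 7 characters in total.
strict = re.compile(
    r"^.*(?=.{7,})(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$!%^&+=]).*$"
)
# The relaxed pattern: any letter will do, regardless of case.
relaxed = re.compile(r"^(?=.*\d)(?=.*[a-zA-Z])(?=.*[@#$!%^&+=]).{7,}$")

for password in ["Password1!", "PASSWORD1!", "Password1", "Pw1!"]:
    print(password, bool(strict.match(password)), bool(relaxed.match(password)))
```

PASSWORD1! fails only the strict pattern, while Password1 (no special character) and Pw1! (too short) fail both.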

How Can I Check If a C# Regular Expression Is Trying to Match 1-(and-only-1)-Character Strings?

Maybe this is a very rare (or even dumb) question, but I do need it in my app.
How can I check if a C# regular expression is trying to match 1-character strings?
That means I only allow users to search for 1-character strings. If the user tries to search for a multi-character string, an error message will be displayed.
Did I make myself clear?
Thanks.
Peter
P.S.: I saw an answer about calculating the final matched strings' length, but for some unknown reason, the answer is gone.
I thought about it for a while; I think calculating the final matched string's length is okay, though it's going to be kind of slow.
Yet, the original question is very rare and tedious.
A regexp would be .{1}
This will allow any char though. If you only want alphanumeric characters, you can use [a-z0-9]{1} or, more loosely, the shorthand \w{1} (which also accepts uppercase letters and the underscore).
Another option is to limit the number of chars a user can type in an input field: set a maxlength on it.
Yet another option is to save the form's input field to a char and not a string, although you may need some handling around this to prevent errors.
Why not use maxlength and save to a char?
You can look for unescaped *, +, {}, ? etc. and count the number of characters (don't forget to flatten the [] as one character).
Basically you have to parse your regex.
Instead of validating the regular expression, which could be complicated, you could apply it only on single characters instead of the whole string.
If this is not possible, you may want to limit the possibilities of regular expression to some certain features. For instance the user can only enter characters to match or characters to exclude. Then you build up the regex in your code.
eg (user input on the left, generated regex on the right):
ABC matches [ABC]
^ABC matches [^ABC]
A-Z matches [A-Z]
# matches [0-9]
\w matches \w
AB#x-z\w matches [AB]|[0-9]|[x-z]|\w
which cases do you need to support?
This would be somewhat easy to parse and validate.
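A minimal sketch of such a builder follows, written in Python rather than C#. The function name build_single_char_pattern and the exact handling of edge cases are assumptions; it emits one combined character class instead of an alternation of classes, which matches the same single characters.

```python
import re


def build_single_char_pattern(spec: str) -> str:
    """Translate the restricted notation into one character class."""
    negate = spec.startswith("^")
    body = spec[1:] if negate else spec
    if not body:
        raise ValueError("empty specification")
    parts = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "#":                                   # '#' stands for any digit
            parts.append("0-9")
            i += 1
        elif ch == "\\" and i + 1 < len(body):          # keep shorthands such as \w
            parts.append(body[i:i + 2])
            i += 2
        elif i + 2 < len(body) and body[i + 1] == "-":  # a range such as x-z
            parts.append(re.escape(ch) + "-" + re.escape(body[i + 2]))
            i += 3
        else:                                           # a literal character
            parts.append(re.escape(ch))
            i += 1
    return "[" + ("^" if negate else "") + "".join(parts) + "]"


pattern = build_single_char_pattern(r"AB#x-z\w")
print(pattern)                              # [AB0-9x-z\w]
print(bool(re.fullmatch(pattern, "y")))     # True
print(bool(re.fullmatch(pattern, "yy")))    # False: only one character can match
```

Because the output is always a single character class, every generated regex matches exactly one character by construction, so no further validation of the user's expression is needed.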
