I need a regex pattern (must be a single pattern) to match any text that contains a number, excluding a specific literal (i.e. "SomeText1").
I have the match any text containing a number part:
^.*[0-9]+.*$
But am having a problem excluding a specific literal.
Update: This is for .NET Regex.
Thanks in advance.
As a verbose regex:
^ # Start of string
(?=.*[0-9]) # Assert presence of at least one digit
(?!SomeText1$) # Assert that the string is not "SomeText1"
.* # If so, then match any characters
$ # until the end of the string
If your regex flavor doesn't support those:
^(?=.*[0-9])(?!SomeText1$).*$
Use negative-look-ahead:
^(?!.*?SomeText1).*?[0-9]+.*$
Related
Currently i using this pattern: [HelloWorld]{1,}.
So if my input is: Hello -> It will be match.
But if my input is WorldHello -> Still match but not right.
So how to make input string must match exactly will value inside pattern?
Just get rid of the square brackets, and the comma and you're good to go!
HelloWorld{1}
In regex what's between square brackets is a character set.
So [HelloWorld] matches 1 character that's in the set [edlorHW].
And .{1,} or .+ both match 1 or more characters.
What you probably want is the literal word.
So the regex would simple be "HelloWorld".
That would match HelloWord in the string "blaHelloWorldbla".
If you want the word to be a single word, and not part of a word?
Then you could use wordboundaries \b, which indicate the transition between a word character (\w = [A-Za-z0-9_]) and a non-word character (\W = [^A-Za-z0-9_]) or the beginning of a line ^ or the end of a line $.
For example #"\bHelloWorld\b" to get a match from "bla HelloWorld bla" but not from "blaHelloWorldbla".
Note that the regex string this time was proceeded by #.
Because by using a verbatim string the backslashes don't have to be backslashed.
it seems you need to use online regex tester web sites to check your pattern. for example you could find one of them here and also you could study c# regex reference here
Try this pattern:
[a-zA-Z]{1,}
You can test it online
I'm trying to build a regular expression in c# to check whether a string follow a specific format.
The format i want is: [digit][white space][dot][letters]
For example:
123 .abc follow the format
12345 .def follow the format
123 abc does not follow the format
I write this expression but it not works completelly well
Regex.IsMatch(exampleString, #"^\d+ .")
^ matches the start of the string, and you got it right.
\d+ matches one or more digits, and you got that one right as well.
A space in a regex matches a literal space, so that works too!
However, a . is a wildcard and will match any one character. You will need to escape it with a backslash like this if you want to match a literal period: \..
To match letters now, you can use [a-z]+ right after the period.
#"^\d+ \.[a-z]+"
The dot is a special character in regex, which matches any character (except, typically, newlines). To match a literal ., you need to escape it:
Regex.IsMatch(exampleString, #"^\d+ \.")
If you want to include the condition for the succeeding letters, use:
Regex.IsMatch(exampleString, #"^\d+ \.[A-Za-z]+$")
For you to get yours to match, keep in mind that the period in regular expressions is a special character that will match any character, so you'll need to escape that.
In addition, \s is a match for any white-space character (tabs, line breaks).
^\d+\s+ \..+
(untested)
I created the following regex expression for my C# file. Bascily I want the user's input to only be regular characters (A-Z lower or upper) and numbers. (spaces or symbols ).
[a-zA-Z0-9]
For some reason it only fails when its a symbol on its own. if theres characters mixed with it then the expression passes.
I can show you my code of how I implment it but I think its my expression.
Thanks!
The problem is that it can match anywhere. You need anchors:
^[a-zA-Z0-9]+\z
^ matches the start of a string, and \z matches the end of a string.
(Note: in .NET regex, $ matches the end of a string with an optional newline.)
This is because it will match any character in the string you need the following.
Forces it to match the entire string not just part of it
^[0-9a-zA-Z]*$
That regex will match every single alphanumeric character in the string as separate matches.
If you want to make sure the whole string the user entered only has alphanumeric characters you need to do something like:
^[a-zA-Z0-9]+$
Are you making sure to check the whole string? That is are you using an expression like
^[a-zA-Z0-9]*$
where ^ means the start of the string and $ means the end of the string?
I have this password regex for an application that is being built its purpose is to:
Make sure users use between 6 - 12 characters.
Make sure users use either one special character or one number.
Also that its case insensitive.
The application is in .net I have the following regex:
I have the following regex for the password checker, bit lengthy but for your viewing if you feel any of this is wrong please let me know.
^(?=.*\d)(?=.*[A-Za-z]).{6-12}$|^(?=.*[A-Za-z])(?=.*[!#$%&'\(\)\*\+-\.:;<=>\?#\[\\\]\^_`\{\|\}~0x0022]|.*\s).{6,12}$
Just a break down of the regex to make sure your all happy it’s correct.
^ = start of string ”^”
(?=.*\d) = must contain “?=” any set of characters “.*” but must include a digit “\d”.
(?=.*[A-Za-z]) = must contain “?=” any set of characters “.*” but must include an insensitive case letter.
.{6-12}$ = must contain any set of characters “.” but must have between 6-12 characters and end of string “$”.
|^ = or “|” start of string “^”
(?=.*[A-Za-z]) = must contain “?=” any set of characters “.*” but must include an insensitive case letter.
(?=.*[!#$%&'\(\)\*\+-\.:;<=>\?#\[\\\]\^_`\{\|\}~0x0022]|.*\s) = must contain “?=” any set of characters “.*” but must include at least special character we have defined or a space ”|.*\s)”. “0x0022” is Unicode for single quote “ character.
.{6,12}$ = set of characters “.” must be between 6 – 12 and this is the end of the string “$”
It's quite long winded, seems to be doing the job but I want to know if there is simpler methods to write this sort of regex and I want to know how I can shorten it if its possible?
Thanks in Advanced.
Does it have to be regex? Looking at the requirements, all you need is String.Length and String.IndexOfAny().
First, good job at providing comments for your regex. However, there is a much better way. Simply write your regex from the get-go in free-spacing mode with lots of comments. This way you can document your regex right in the source code (and provide indentation to improve readability when there are lots of parentheses). Here is how I would write your original regex in C# code:
if (Regex.IsMatch(usernameString,
#"# Validate username having a digit and/or special char.
^ # Either... Anchor to start of string.
(?=.*\d) # Assert there is a digit AND
(?=.*[A-Za-z]) # assert there is an alpha.
.{6-12} # Match any name with length from 6 to 12.
$ # Anchor to end of string.
| ^ # Or... Anchor to start of string
(?=.*[A-Za-z]) # Assert there is an alpha AND
(?=.* # assert there is either a special char
[!#$%&'\(\)\*\+-\.:;<=>\?#\[\\\]\^_`\{\|\}~\x22]
| .*\s # or a space char.
) # End specialchar-or-space assertion.
.{6-12} # Match any name with length from 6 to 12.
$ # Anchor to end of string.
", RegexOptions.IgnorePatternWhitespace)) {
// Valid username.
} else {
// Invalid username.
}
The code snippet above uses the preferable #"..." string syntax which simplifies the escaping of metacharacters. This original regex erroneously separates the two numbers of the curly brace quantifier using a dash, i.e. .{6-12}. The correct syntax is to separate these numbers with a comma, i.e. .*{6,12}. (Maybe .NET allows using the .{6-12} syntax?) I've also changed the 0x0022 (the " double quote char) to \x22.
That said, yes the original regex can be improved a bit:
if (Regex.IsMatch(usernameString,
#"# Validate username having a digit and/or special char.
^ # Anchor to start of string.
(?=.*?[A-Za-z]) # Assert there is an alpha.
(?: # Group for assertion alternatives.
(?=.*?\d) # Either assert there is a digit
| # or assert there is a special char
(?=.*?[!#$%&'()*+-.:;<=>?#[\\\]^_`{|}~\x22\s]) # or space.
) # End group of assertion alternatives.
.{6,12} # Match any name with length from 6 to 12.
$ # Anchor to end of string.
", RegexOptions.IgnorePatternWhitespace)) {
// Valid username.
} else {
// Invalid username.
}
This regex eliminates the global alternative and instead uses a non-capture group for the "digit or specialchar" assertion alternatives. Also, you can eliminate the non-capture group for the "special char or whitespace" alternatives by simply adding the \s to the list of special chars. I've also added a lazy modifier to the dot-stars in the assertions, i.e. .*? - (this may make the regex match a bit faster.) A bunch of unnecessary escapes were removed from the specialchar character class.
But as Stema cleverly pointed out, you can combine the digit and special char to simplify this even further:
if (Regex.IsMatch(usernameString,
#"# Validate username having a digit and/or special char.
^ # Anchor to start of string
(?=.*?[A-Za-z]) # Assert there is an alpha.
# Assert there is a special char, space
(?=.*?[!#$%&'()*+-.:;<=>?#[\\\]^_`{|}~\x22\s\d]) # or digit.
.{6,12} # Match any name with length from 6 to 12.
$ # Anchor to end of string.
", RegexOptions.IgnorePatternWhitespace)) {
// Valid username.
} else {
// Invalid username.
}
Other than that, there is really nothing wrong with your original regex with regard to accuracy. However, logically, this formula allows a username to end with whitespace which is probably not a good idea. I would also explicitly specify a whitelist of allowable chars in the name rather than using the overly permissive "." dot.
I am not sure if it makes sense what you are doing, but to achieve that, your regex can be simpler
^(?=.*[A-Za-z])(?=.*[\d\s!#$%&'\(\)\*\+-\.:;<=>\?#\[\\\]\^_`\{\|\}~0x0022]).{6,12}$
Why using alternatives? Just Add \d and \s to the character class.
I need to match this string 011Q-0SH3-936729 but not 345376346 or asfsdfgsfsdf
It has to contain characters AND numbers AND dashes
Pattern could be 011Q-0SH3-936729 or 011Q-0SH3-936729-SDF3 or 000-222-AAAA or 011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729 and I want it to be able to match anyone of those. Reason for this is that I don't really know if the format is fixed and I have no way of finding out either so I need to come up with a generic solution for a pattern with any number of dashes and the pattern recurring any number of times.
Sorry this is probably a stupid question, but I really suck at Regular expressions.
TIA
foundMatch = Regex.IsMatch(subjectString,
#"^ # Start of the string
(?=.*\p{L}) # Assert that there is at least one letter
(?=.*\p{N}) # and at least one digit
(?=.*-) # and at least one dash.
[\p{L}\p{N}-]* # Match a string of letters, digits and dashes
$ # until the end of the string.",
RegexOptions.IgnorePatternWhitespace);
should do what you want. If by letters/digits you meant "only ASCII letters/digits" (and not international/Unicode letters, too), then use
foundMatch = Regex.IsMatch(subjectString,
#"^ # Start of the string
(?=.*[A-Z]) # Assert that there is at least one letter
(?=.*[0-9]) # and at least one digit
(?=.*-) # and at least one dash.
[A-Z0-9-]* # Match a string of letters, digits and dashes
$ # until the end of the string.",
RegexOptions.IgnorePatternWhitespace | RegexOptions.IgnoreCase);
EDIT:
this will match any of the key provided in your comments:
^[0-9A-Z]+(-[0-9A-Z]+)+$
this means the key starts with the digit or letter and have at leats one dash symbol:
Without more info about the regularity of the dashes or otherwise, this is the best we can do:
Regex.IsMatch(input,#"[A-Z0-9\-]+\-[A-Z0-9]")
Although this will also match -A-0
Most naive implementation EVER (might get you started):
([0-9]|[A-Z])+(-)([0-9]|[A-Z])+(-)([0-9]|[A-Z])+
Tested with Regex Coach.
EDIT:
That does match only three groups; here another, slightly better:
([0-9A-Z]+\-)+([0-9A-Z]+)
Are you applying the regex to a whole string (i.e., validating or filtering)? If so, Tim's answer should put you right. But if you're plucking matches from a larger string, it gets a bit more complicated. Here's how I would do that:
string input = #"Pattern could be 011Q-0SH3-936729 or 011Q-0SH3-936729-SDF3 or 000-222-AAAA or 011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729 but not 345-3763-46 or ASFS-DFGS-FSDF or ASD123FGH987.";
Regex pluckingRegex = new Regex(
#"(?<!\S) # start of 'word'
(?=\S*\p{L}) # contains a letter
(?=\S*\p{N}) # contains a digit
(?=\S*-) # contains a hyphen
[\p{L}\p{N}-]+ # gobble up letters, digits and hyphens only
(?!\S) # end of 'word'
", RegexOptions.IgnorePatternWhitespace);
foreach (Match m in pluckingRegex.Matches(input))
{
Console.WriteLine(m.Value);
}
output: 011Q-0SH3-936729
011Q-0SH3-936729-SDF3
000-222-AAAA
011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729
The negative lookarounds serve as 'word' boundaries: they insure the matched substring starts either at the beginning of the string or after a whitespace character ((?<!\S)), and ends either at the end of the string or before a whitespace character ((?!\S)).
The three positive lookaheads work just like Tim's, except they use \S* to skip whatever precedes the first letter/digit/hyphen. We can't use .* in this case because that would allow it to skip to the next word, or the next, etc., defeating the purpose of the lookahead.