Regular Expression C# IsMatch() - c#

I'm trying to use a regular expression to check whether a string contains only: 0-9, A-Z, a-z, \, / or -.
I used Regex validator = new Regex(@"[0-9a-zA-Z\-/]*"); and no matter what string I enter, it is reported as valid.
The check looks like this: if(!validator.IsMatch(myString))
What's wrong?

If I understand what you want, I believe your pattern should be
new Regex(@"^[0-9a-zA-Z\\\-/]*$");
The ^ and $ symbols are anchors that match the beginning and end of the string, respectively. Without them, IsMatch succeeds if the pattern matches anywhere in the string, and because * also allows an empty match, that means every string. With them, it matches only strings made up entirely of characters in that class.
You also specified you wanted to include backslash characters, but the original pattern had \- in the character class. This is simply an escape sequence for the hyphen within the character class. To also include backslash in the character class you need to specify that separately (escaped as well). So the resulting character class has \\ (backslash) followed by \- (hyphen).
Now, this will still match empty strings because * means "zero-or-more". If you want to match only non-empty strings, use:
new Regex(@"^[0-9a-zA-Z\\\-/]+$");
The + means "one-or-more".
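To see the effect of the anchors, here is a minimal sketch that runs the unanchored and the anchored pattern against a few inputs. The sample strings are made up for illustration and are not from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Unanchored with "*": an empty match is always possible, so IsMatch is true for any input.
var unanchored = new Regex(@"[0-9a-zA-Z\\\-/]*");
// Anchored with "+": the whole string must consist of one or more allowed characters.
var anchored = new Regex(@"^[0-9a-zA-Z\\\-/]+$");

// Hypothetical sample inputs: one valid, one containing a space and '!', one empty.
foreach (var input in new[] { @"abc-123/x\y", "hello world!", "" })
{
    Console.WriteLine($"\"{input}\": unanchored={unanchored.IsMatch(input)}, anchored={anchored.IsMatch(input)}");
}
```

Only the first input should pass the anchored check. One caveat: in .NET, $ also matches just before a trailing newline, so use \z instead if a final "\n" must be rejected as well.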

Use + instead of *
new Regex(@"[0-9a-zA-Z\-/]+");

If I write a Regex of the form
"[some character class]*"
it will match every string. Every string contains 0 to many of a character class.
Perhaps you wanted to use
new Regex(@"[0-9a-zA-Z\-/]+")
to specify 1 to many of your character class.

Related

How to match string by using regular expression which will not allow same special character at same time?

I'm trying to match a string that does not allow two special characters in a row.
My regular expression is:
[RegularExpression(@"^+[a-zA-Z0-9]+[a-zA-Z0-9.&' '-]+[a-zA-Z0-9]$")]
This meets all my requirements except for the two issues below.
this is my string : bracks
acceptable :
bra-cks, b-r-a-c-ks, b.r.a.c.ks, bra cks (by the way above regular expression solved this)
not acceptable:
issue 1: b.. or bra..cks, b..racks, bra...cks (two or more special characters together),
issue 2: bra  cks (two or more whitespace characters together)
You can use a negative lookahead to invalidate strings containing two consecutive special characters:
^(?!.*[.&' -]{2})[a-zA-Z0-9.&' -]+$
Demo: https://regex101.com/r/7j14bu/1
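As a quick check of the lookahead pattern, the following sketch runs it against the examples from the question. Note that, as written, this pattern does not require the string to start and end with an alphanumeric character.

```csharp
using System;
using System.Text.RegularExpressions;

// The lookahead (?!.*[.&' -]{2}) rejects any string with two adjacent special characters.
var noDoubleSpecials = new Regex(@"^(?!.*[.&' -]{2})[a-zA-Z0-9.&' -]+$");

foreach (var input in new[] { "bra-cks", "b.r.a.c.ks", "bra cks", "bra..cks", "bra  cks" })
{
    Console.WriteLine($"{input,-12} {noDoubleSpecials.IsMatch(input)}");
}

// A leading special character is still accepted by this pattern.
Console.WriteLine(noDoubleSpecials.IsMatch("-bracks"));
```

The first three inputs should be accepted and the last two rejected.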
The goal
From what I can tell from your description and pattern, you are trying to match text that starts and ends with an alphanumeric character (due to ^+[a-zA-Z0-9] and [a-zA-Z0-9]$ in your original pattern), and inside it you don't want any two consecutive (adjacent) special characters, which, again guessing from the regex, are . & ' -
What was wrong
^+ - I think you wanted to ensure that the match starts at the beginning of the line/string, so you don't need the + here
[a-zA-Z0-9.&' '-] - in this character class you doubled ', which is unnecessary
Solution
Please try pattern
^[a-zA-Z0-9](?:(?![.& '-]{2,})[a-zA-Z0-9.& '-])*[a-zA-Z0-9]$
Pattern explanation
^ - anchor, match the beginning of the string
[a-zA-Z0-9] - character class, match one of the characters inside []
(?:...) - non capturing group
(?!...) - negative lookahead
[.& '-]{2,} - match 2 or more of characters inside character class
[a-zA-Z0-9.& '-] - character class, match one of the characters inside []
* - match zero or more repetitions of the preceding pattern
$ - anchor, match the end of the string
Some remarks on your current regex:
It looks like you placed the + quantifiers before the pattern you wanted to quantify, instead of after. For instance, ^+ doesn't make much sense, since ^ is just the start of the input, and most regex engines would not even allow that.
The pattern [a-zA-Z0-9.&' '-]+ doesn't distinguish between alphanumerical and other characters, while you want the rules for them to be different. Especially for the other characters you don't want them to repeat, so that + is not desired for those.
In a character class it doesn't make sense to repeat the same character, like you have a repeat of a quote ('). Maybe you wanted to somehow delimit the space, but realise that those quotes are interpreted literally. So probably you should just remove them. Or if you intended to allow for a quote, only list it once.
Here is a correction (add the quote if you still need it):
^[a-zA-Z0-9]+(?:[.& -][a-zA-Z0-9]+)*$
Follow-up
Based on a comment, I suspect you would allow a non-alphanumerical character to be surrounded by single spaces, even if that gives a sequence of more than one non-alphanumerical character. In that case use this:
^[a-zA-Z0-9]+(?:(?:[ ]|[ ]?[.&-][ ]?)[a-zA-Z0-9]+)*$
So here the space gets a different role: it can optionally occur before and after a delimiter (one of ".&-"), or it can occur on its own. The brackets around the spaces are not needed, but I used them to stress that the space is intended and not a typo.
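The difference between the two corrected patterns can be sketched like this. The input "bra - cks" is a made-up example of a delimiter surrounded by single spaces.

```csharp
using System;
using System.Text.RegularExpressions;

// Exactly one delimiter between alphanumeric runs.
var strict = new Regex(@"^[a-zA-Z0-9]+(?:[.& -][a-zA-Z0-9]+)*$");
// Follow-up variant: a delimiter may have one optional space on either side.
var spaced = new Regex(@"^[a-zA-Z0-9]+(?:(?:[ ]|[ ]?[.&-][ ]?)[a-zA-Z0-9]+)*$");

foreach (var input in new[] { "bra-cks", "bra - cks", "bra..cks", "bra  cks" })
{
    Console.WriteLine($"{input,-10} strict={strict.IsMatch(input)} spaced={spaced.IsMatch(input)}");
}
```

Both patterns should accept "bra-cks" and reject the doubled dot and the doubled space. Only the follow-up pattern should accept "bra - cks".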

Regex. Check string is in these characters always true

I've searched quite a bit but can't work out why my regex is always returning true.
I need to validate that a whole string contains only numbers, letters, spaces - and _
I have the ^ and $ to match from the start to the end and the + so it's at least one character.
But it always returns true when I test it with #[]<>/., and so on.
Regex rg = new Regex(@"^[a-zA-Z0-9 -_]+$");
return rg.IsMatch(strToCheck);
You need to escape the hyphen since it is at that position inside of the character class.
Regex rg = new Regex(@"^[a-zA-Z0-9 \-_]+$");
Note: Inside a character class the hyphen has special meaning. You can place it as the first or last character of the class. In some regex implementations, you can also place it directly after a range. If you place the hyphen anywhere else, you need to precede it with a backslash to add it to your character class.
It's because of the - symbol in the middle of the character class. A - in the middle acts as a range operator, i.e. it allows all characters that fall within the range from space to _. To prevent - from acting as a range operator, put it first or last inside the character class, or escape it.
@"^[a-zA-Z0-9 _-]+$"
OR
@"^[-a-zA-Z0-9 _]+$"
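A small sketch of the accidental range, using the test string from the question. The second input, "my-file_01", is a made-up valid example.

```csharp
using System;
using System.Text.RegularExpressions;

// " -_" is read as the range from space (U+0020) to underscore (U+005F).
var accidentalRange = new Regex(@"^[a-zA-Z0-9 -_]+$");
// With the hyphen last, it is a literal hyphen.
var literalHyphen = new Regex(@"^[a-zA-Z0-9 _-]+$");

foreach (var input in new[] { "my-file_01", "#[]<>/.," })
{
    Console.WriteLine($"{input,-12} range={accidentalRange.IsMatch(input)} literal={literalHyphen.IsMatch(input)}");
}
```

Every character in "#[]<>/.," lies between space and underscore in the character table, which is why the original pattern accepts it.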

Regex does not match with string containing 4 groups

I want to match a string which I divide into 4 groups:
1.) group has a "-"
2.) group has any char
3.) group has a ":"
4.) group has any char
I have tried this:
Regex regex = new Regex("^[-][.*][:][.*]*$");
bool isMatch = regex.IsMatch("-jobid:3");
isMatch is false.
What is wrong in my pattern?
The error here is that .* should not be enclosed in brackets.
This:
[.*]
Means this:
The dot
or the asterisk
This:
.*
Means this:
Any character, zero or more times
Additionally, if there is only 1 legal character in a spot, you generally don't need to enclose it in brackets.
So try this expression instead:
new Regex("^-.*:.*$");
You have misunderstood the character class notion. A character class is only a collection of characters without any order. So when you write something like [.*], that means a literal dot or a literal asterisk.
An important clarification: most regex special characters lose their meaning and are treated as literal characters inside a character class. However, some characters have a special meaning there, like ^ in first position (to negate the class) or - to define a range. Some other syntaxes can also be used inside character classes, like shorthand or POSIX character classes, and character class subtraction.
You can write the whole pattern without all these (useless) character classes:
^-.*:.*$
However, to be more efficient, you can use a negated character class before the colon:
^-[^:]*:.*$
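Here is a minimal sketch comparing the original pattern with a corrected one. The capture groups around the two variable parts are an addition for illustration and are not part of the patterns above.

```csharp
using System;
using System.Text.RegularExpressions;

// Original pattern: each [.*] matches one literal '.' or '*', not "any characters".
var bracketed = new Regex("^[-][.*][:][.*]*$");
// Corrected pattern; the two capture groups are an assumption added for this example.
var corrected = new Regex("^-([^:]*):(.*)$");

Console.WriteLine(bracketed.IsMatch("-jobid:3")); // False: 'j' is neither '.' nor '*'
Console.WriteLine(bracketed.IsMatch("-.:*"));     // True: only literal dots and asterisks fit

Match jobMatch = corrected.Match("-jobid:3");
Console.WriteLine($"{jobMatch.Groups[1].Value} / {jobMatch.Groups[2].Value}"); // jobid / 3
```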

Why is giving me true the regular expression [^%()*+-\/=?#[\\]ªº´`¿'.]* with the comma (,)?

I have a problem with this regular expression: [^%()*+-\/=?#[\\]ªº´`¿'.]*
I want to avoid the characters inside it. The regular expression is working, but when I enter something like DAVID, SC I can save the form because it has a comma, but this character is not inside the regular expression.
Could you help me please?
You are not accounting for the special meaning of - inside a character class [.....].
You must either place the dash at the very end, or else escape it with a backslash:
[^%()*+\/=?#\[\]ªº´¿'.-]*
In your original regex, +-\/ disallows any characters between + and / in the ASCII table; these are the comma, dot and dash. Your example input contains a comma so the regex did not match all of the input at once.
I have also fixed the escaping for the [] characters from [\\] to \[\], which I presume was a mistake.
Because you're using * in [^%()*+\/=?#[\\]ªº´¿'.-]* without line start/end anchors. * means match 0 or more of the preceding character class, so your regex can even match an empty string.
Use this regex:
^[^%()*+\/=?#\[\]ªº´¿'.-]+$
PS: The hyphen - should be at either the first or the last position in a character class to avoid escaping.
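To illustrate why the comma is caught, here is a small sketch in C# syntax (the question does not say which regex engine is in use, so treat this as an assumption). It first shows the range +-/ on its own, then an anchored pattern with the hyphen moved to the end and the brackets escaped.

```csharp
using System;
using System.Text.RegularExpressions;

// "+-/" inside a class is the range from '+' (U+002B) to '/' (U+002F).
var plusToSlash = new Regex(@"^[+-/]$");
foreach (var ch in new[] { "+", ",", "-", ".", "/", "a" })
{
    Console.WriteLine($"'{ch}' in range: {plusToSlash.IsMatch(ch)}");
}

// Hyphen last (literal), brackets escaped, anchored, at least one character.
var allowedText = new Regex(@"^[^%()*+\/=?#\[\]ªº´¿'.-]+$");
Console.WriteLine(allowedText.IsMatch("DAVID, SC")); // True: comma and space are not excluded
Console.WriteLine(allowedText.IsMatch("a+b"));       // False: '+' is excluded
```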

How do I specify a wildcard (for ANY character) in a c# regex statement?

Trying to use a wildcard in C# to grab information from a webpage source, but I cannot seem to figure out what to use as the wildcard character. Nothing I've tried works!
The wildcard only needs to allow for numbers, but as the page is generated the same every time, I may as well allow for any characters.
Regex statement in use:
Regex guestbookWidgetIDregex = new Regex("GuestbookWidget(' INSERT WILDCARD HERE ', '(.*?)', 500);", RegexOptions.IgnoreCase);
If anyone can figure out what I'm doing wrong, it would be greatly appreciated!
The wildcard character is . (a dot).
To match any number of arbitrary characters, use .* (which means zero or more .) or .+ (which means one or more .)
Note that you need to escape your parentheses as \\( and \\) (or \( and \) in an @"" string).
On the dot
In regular expression, the dot . matches almost any character. The only characters it doesn't normally match are the newline characters. For the dot to match all characters, you must enable what is called the single line mode (aka "dot all").
In C#, this is specified using RegexOptions.Singleline. You can also embed this as (?s) in the pattern.
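A minimal sketch of the difference, using a made-up two-line string:

```csharp
using System;
using System.Text.RegularExpressions;

string twoLines = "first\nsecond";

// By default the dot does not match the newline character.
bool dotDefault = Regex.IsMatch(twoLines, "first.second");
// With Singleline, the dot matches every character, including '\n'.
bool dotSingleline = Regex.IsMatch(twoLines, "first.second", RegexOptions.Singleline);
// The inline form (?s) has the same effect.
bool dotInline = Regex.IsMatch(twoLines, "(?s)first.second");

Console.WriteLine($"{dotDefault} {dotSingleline} {dotInline}"); // False True True
```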
References
regular-expressions.info/The Dot Matches (Almost) Any Character
On metacharacters and escaping
The . isn't the only regex metacharacter. The full list is:
( ) { } [ ] ? * + - ^ $ . | \
Depending on where they appear, if you want these characters to mean literally (e.g. . as a period), you may need to do what is called "escaping". This is done by preceding the character with a \.
Of course, a \ is also an escape character for C# string literals. To get a literal \, you need to double it in your string literal (i.e. "\\" is a string of length one). Alternatively, C# also has what is called @-quoted string literals, where escape sequences are not processed. Thus, the following two strings are equal:
"c:\\Docs\\Source\\a.txt"
@"c:\Docs\Source\a.txt"
Since \ is used a lot in regular expressions, @-quoting is often used to avoid excessive doubling.
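The equivalence can be checked directly. The decimal-number pattern in the second half is a made-up example.

```csharp
using System;
using System.Text.RegularExpressions;

string regularLiteral = "c:\\Docs\\Source\\a.txt";
string verbatimLiteral = @"c:\Docs\Source\a.txt";

Console.WriteLine(regularLiteral == verbatimLiteral); // True
Console.WriteLine("\\".Length);                       // 1

// The same regex written both ways (hypothetical pattern for a decimal number).
var doubled = new Regex("\\d+\\.\\d+");
var quoted = new Regex(@"\d+\.\d+");
Console.WriteLine(doubled.ToString() == quoted.ToString()); // True
```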
References
regular-expressions.info/Metacharacters
MSDN - C# Programmer's Reference - string
On character classes
Regular expression engines allow you to define character classes, e.g. [aeiou] is a character class containing the 5 vowel letters. You can also use the - metacharacter to define a range, e.g. [0-9] is a character class containing all 10 digit characters.
Since digit characters are so frequently used, regex also provides a shorthand notation for it, which is \d. In C#, this will also match decimal digits from other Unicode character sets, unless you're using RegexOptions.ECMAScript where it's strictly just [0-9].
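The Unicode behaviour of \d can be sketched with a non-ASCII digit. The Arabic-Indic digit three is just one example chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

string arabicIndicThree = "\u0663"; // ARABIC-INDIC DIGIT THREE

// Default .NET behaviour: \d matches any Unicode decimal digit.
bool unicodeDigit = Regex.IsMatch(arabicIndicThree, @"\d");
// With ECMAScript, \d is restricted to [0-9].
bool ecmaDigit = Regex.IsMatch(arabicIndicThree, @"\d", RegexOptions.ECMAScript);
bool ecmaAsciiDigit = Regex.IsMatch("7", @"\d", RegexOptions.ECMAScript);

Console.WriteLine($"{unicodeDigit} {ecmaDigit} {ecmaAsciiDigit}"); // True False True
```

If only ASCII digits should be accepted, writing [0-9] explicitly avoids depending on the option.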
References
regular-expressions.info/Character Classes
MSDN - Character Classes - Decimal Digit Character
Related questions
.NET regex: What is the word character \w
Putting it all together
It looks like the following will work for you:
      @-quoting          digits_      _____anything but ', captured
          |                   / \    /     \
new Regex(@"GuestbookWidget\('\d*', '([^']*)', 500\);", RegexOptions.IgnoreCase);
                           \/                     \/
                        escape (               escape )
Note that I've modified the pattern slightly so that it uses a negated character class instead of reluctant (lazy) wildcard matching. This causes a slight difference in behavior if you allow ' to be escaped in your input string, but neither pattern handles this case perfectly. If you're not allowing ' to be escaped, however, this pattern is definitely better.
References
regular-expressions.info/An Alternative to Laziness and Capturing Groups
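As a usage sketch, the pattern can be applied to a page source like this. The HTML snippet and the ID values in it are hypothetical, since the question does not show the real page.

```csharp
using System;
using System.Text.RegularExpressions;

var guestbookWidgetIdRegex = new Regex(
    @"GuestbookWidget\('\d*', '([^']*)', 500\);",
    RegexOptions.IgnoreCase);

// Hypothetical page source; the real markup is not shown in the question.
string pageSource = "<script>GuestbookWidget('12345', 'abc-widget-id', 500);</script>";

Match widgetMatch = guestbookWidgetIdRegex.Match(pageSource);
if (widgetMatch.Success)
{
    // Group 1 is the text captured between the second pair of quotes.
    Console.WriteLine(widgetMatch.Groups[1].Value); // abc-widget-id
}
```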
