regular expression not working: repeated strings of digits - c#

I was trying to create a regular expression to find repeated strings of digits.
eg:
1 -not matching
11 -matching
122 -matching
1234 -not matching
What I used is \d+. The tutorial says
the "+" is similar to "*", except it requires at least one repetition.
But when I tried it, it matched any number. Any idea why?
Update
The tutorial I tried: http://www.codeproject.com/Articles/9099/The-Minute-Regex-Tutorial

The repetition constructs in Regular Expressions, +, *, {x}, do not repeat "what you found the first time around", they repeat "the pattern that finds things".
So this:
\d+
This will not find one digit and then match a sequence of that same digit. Instead, it will first find one digit, then try to find another digit, then another, etc.
If you want it to repeat "what it found" you have to explicitly say so:
(\d)\1+
The \1 here says "I will match whatever is in the first group again". This regular expression should match sequences of the same digit, instead of sequences of digits.
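To make the difference concrete, here is a minimal sketch that runs both patterns against the four inputs from the question. The class and method names are made up for illustration; only Regex.IsMatch is a real .NET call.

```csharp
using System;
using System.Text.RegularExpressions;

static class RepeatedDigitDemo
{
    // \d+ : one or more digits; each digit may differ from the previous one.
    public static bool HasAnyDigits(string s) => Regex.IsMatch(s, @"\d+");

    // (\d)\1+ : a digit followed by one or more copies of that same digit.
    public static bool HasRepeatedDigit(string s) => Regex.IsMatch(s, @"(\d)\1+");

    static void Main()
    {
        foreach (var s in new[] { "1", "11", "122", "1234" })
        {
            Console.WriteLine($"{s}: \\d+ = {HasAnyDigits(s)}, (\\d)\\1+ = {HasRepeatedDigit(s)}");
        }
    }
}
```

With these inputs, \d+ reports a match for all four strings, while (\d)\1+ only reports a match for 11 and 122.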

^\d*(\d)\1+\d*$
You can use this. See demo. \d+ would match any digits, one or more times. You need to use \1 to find repeated digits.
https://regex101.com/r/hI0qP0/4

It works properly. \d+ is not a repetition of a specific digit, it is a repetition of one or more \d. \d+ will match 1 (one or more digits), 12 (one or more digits), 122 (one or more digits)... you see the idea. If you want to see two or more repetitions, you'd need to say \d\d+ or \d{2,} - but this, too, says that you want two or more digits, not two or more of the same digit. To say that, you need backreferences: (\d)\1+ is two or more of the same digit: a digit we remember, then one or more of that remembered thing.
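A small sketch of the same point, this time listing what each pattern actually captures. The input string and the helper name Find are assumptions chosen for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class DigitRuns
{
    // Returns the text of every match of the pattern in the input.
    public static string[] Find(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    static void Main()
    {
        const string input = "1122333445";

        // Two or more digits of any kind: the whole string is one match.
        Console.WriteLine(string.Join(", ", Find(input, @"\d{2,}")));

        // Two or more of the same digit: one match per run.
        Console.WriteLine(string.Join(", ", Find(input, @"(\d)\1+")));
    }
}
```

The first call yields a single match covering the whole input; the second yields the runs 11, 22, 333 and 44, and skips the lone 5.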

Related

Is there a regular expression for matching a string that has no more than 2 repeating characters? [duplicate]

I want to match strings that do not contain more than 3 of the same character repeated in a row. So:
abaaaa [no match]
abawdasd [match]
abbbbasda [no match]
bbabbabba [match]
Yes, it would be much easier and neater to do a regex match for containing the consecutive characters, and then negate that in the code afterwards. However, in this case that is not possible.
I would like to open out the question to x consecutive characters so that it can be extended to the general case to make the question and answer more useful.
Negative lookahead is supported in this case.
Use a negative lookahead with back references:
^(?:(.)(?!\1\1))*$
See live demo using your examples.
(.) captures each character in group 1 and the negative look ahead asserts that the next 2 chars are not repeats of the captured character.
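As a quick check, this sketch runs the pattern over the four sample strings from the question. The class name is hypothetical. Note that this pattern already rejects three identical characters in a row, which is stricter than the "more than 3" wording.

```csharp
using System;
using System.Text.RegularExpressions;

static class NoTripleRepeat
{
    // Each character is consumed only if it is not followed by two more copies of itself.
    static readonly Regex Pattern = new Regex(@"^(?:(.)(?!\1\1))*$");

    public static bool IsValid(string s) => Pattern.IsMatch(s);

    static void Main()
    {
        foreach (var s in new[] { "abaaaa", "abawdasd", "abbbbasda", "bbabbabba" })
        {
            Console.WriteLine($"{s}: {IsValid(s)}");
        }
    }
}
```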
To match strings not containing a character repeated more than 3 times consecutively:
^((.)\2?(?!\2\2))+$
How it works:
^ Start of string
(
(.) Match any character (not a new line) and store it for back reference.
\2? Optionally match one more exact copy of that character.
(?! Make sure the upcoming character(s) is/are not the same character.
\2\2 Repeat '\2' for as many times as you need
)
)+ Do ad nauseam
$ End of string
So, the number of \2 in your whole expression will be the number of times you allow a character to be repeated consecutively; any more and you won't get a match.
E.g.
^((.)\2?(?!\2\2\2))+$ will match all strings that don't repeat a character more than 4 times in a row.
^((.)\2?(?!\2\2\2\2))+$ will match all strings that don't repeat a character more than 5 times in a row.
Please be aware this solution uses negative lookahead, but not all regex flavors support it.
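For the general case of x consecutive characters, one possible sketch builds the pattern from a number. This is not the exact pattern from the answers above: it swaps the repeated \2 for a quantified backreference \1{n}, and the helper name AtMost is made up. It assumes a flavor with lookahead and backreferences, which .NET has.

```csharp
using System;
using System.Text.RegularExpressions;

static class MaxRun
{
    // Hypothetical helper: builds a regex that accepts strings in which no
    // character appears more than maxRun times in a row.
    public static Regex AtMost(int maxRun)
    {
        if (maxRun < 1) throw new ArgumentOutOfRangeException(nameof(maxRun));

        // A character may be consumed only if it is NOT followed by maxRun
        // further copies of itself (that would be a run of maxRun + 1).
        return new Regex(@"^(?:(.)(?!\1{" + maxRun + "}))*$");
    }

    static void Main()
    {
        var atMostThree = AtMost(3);
        foreach (var s in new[] { "abaaaa", "abawdasd", "abbbbasda", "bbabbabba" })
        {
            Console.WriteLine($"{s}: {atMostThree.IsMatch(s)}");
        }
    }
}
```

Because . does not match a newline by default, a string containing a line break would be rejected by this sketch; RegexOptions.Singleline would change that if it matters.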
I'm answering this question :
Is there a regular expression for matching a string that has no more than 2 repeating characters?
which was marked as an exact duplicate of this question.
It's much quicker to negate the match instead
if (!Regex.Match("hello world", @"(.)\1{2}").Success) Console.WriteLine("No dups");

Simple phone number regex to match numbers, spaces, etc

I'm trying to modify a fairly basic regex pattern in C# that tests for phone numbers.
The pattern is -
[0-9]+(\.[0-9][0-9]?)?
I have two questions -
1) The existing expression does work (although it is fairly restrictive) but I can't quite understand how it works. Regexps for similar issues seem to look more like this one -
/^[0-9()]+$/
2) How could I extend this pattern to allow brackets, periods and a single space to separate numbers. I tried a few variations to include -
[0-9().+\s?](\.[0-9][0-9]?)?
Although I can't seem to create a valid pattern.
Any help would be much appreciated.
Thanks,
[0-9]+(\.[0-9][0-9]?)?
First of all, I recommend checking out either regexr.com or regex101.com, so you yourself get an understanding of how regex works. Both websites will give you a step-by-step explanation of what each symbol in the regex does.
Now, one of the main things you have to understand is that regex has special characters. This includes, among others, the following: []().-+*?\^$. So, if you want your regex to match a literal ., for example, you would have to escape it, since it's a special character. To do so, either use \. or [.]. Backslashes serve to escape other characters, while [] means "match any one of the characters in this set". Some special characters don't have a special meaning inside these brackets and don't require escaping.
Therefore, the regex above will match any combination of digits of length 1 or more, followed by an optional suffix (foobar)?, which has to be a dot, followed by one or two digits. In fact, this regex seems more like it's supposed to match decimal numbers with up to two digits behind the dot - not phone numbers.
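A short sketch that shows what this pattern picks out of a few inputs. The sample inputs and the helper name FirstMatch are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class DecimalPatternDemo
{
    // Returns the first substring matched by the original pattern,
    // or an empty string when nothing matches.
    public static string FirstMatch(string s) =>
        Regex.Match(s, @"[0-9]+(\.[0-9][0-9]?)?").Value;

    static void Main()
    {
        foreach (var s in new[] { "7", "12.3", "12.345", "555 1234" })
        {
            Console.WriteLine($"'{s}' -> '{FirstMatch(s)}'");
        }
    }
}
```

For 12.345 only 12.34 is matched, and for 555 1234 the match stops at the space, which fits the reading that the pattern describes decimal numbers rather than phone numbers.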
/^[0-9()]+$/
What this does is pretty simple - match any combination of digits or round brackets that has the length 1 or greater.
[0-9().+\s?](\.[0-9][0-9]?)?
What you're matching here is:
one of: a digit, round bracket, dot, plus sign, whitespace or question mark; but exactly once only!
optionally followed by a dot and one or two digits
A suitable regex for your purpose could be:
(\+\d{2})?((\(0\)\d{2,3})|\d{2,3})?\d+
Enter this in one of the websites mentioned above to understand how it works. I modified it a little to also allow, for example +49 123 4567890.
Also, for simplicity, I didn't include spaces - so when using this regex, you have to remove all the spaces in your input first. In C#, that should be possible with yourString.Replace(" ", ""); (simply replacing all spaces with nothing = deleting spaces)
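Put together, the suggestion could look roughly like the sketch below. The ^ and $ anchors are my addition, so that the whole input has to fit the pattern; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class PhoneCheck
{
    // The suggested pattern, anchored (an assumption) so that stray
    // characters anywhere in the input cause a rejection.
    static readonly Regex Pattern =
        new Regex(@"^(\+\d{2})?((\(0\)\d{2,3})|\d{2,3})?\d+$");

    // Spaces are removed first, as described above.
    public static bool IsValid(string input) =>
        Pattern.IsMatch(input.Replace(" ", ""));

    static void Main()
    {
        foreach (var s in new[] { "+49 123 4567890", "+49 (0)123 4567890", "12-34" })
        {
            Console.WriteLine($"{s}: {IsValid(s)}");
        }
    }
}
```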
The + after the character set is a quantifier: it means the preceding character, character set or group is repeated at least once and up to an unlimited number of times. It is also greedy (it matches as much as possible).
Then [0-9().+\s]+ will match any character in set one or more times.

repeat a group of characters

I have the following input to be matched by a regex:
1.1.1.1
1.01.1.1
01.01.091.01
1.10.100.0010
So I always have four groups consisting of digits. While the first three should match, the last one should not.
So I wrote this regex:
^(\d*[1-9]+\.){4}$
In general this regex should return all those strings where any of the digits in any of the groups is not followed by a zero. Or more easily: I want to not match all numbers with trailing zeros.
However, this doesn't match anything. regex101.com tells me this:
A repeated capturing group will only capture the last iteration. Put a
capturing group around the repeated group to capture all iterations or
use a non-capturing group instead if you're not interested in the data
But when I add a further capturing group I get the same message:
^((\d*[1-9]+\.)){4}$
The same applies to a non-capturing group:
^(?:\d*[1-9]+\.){4}$
Of course I could just write the same group four times, but that's fairly clumsy and hard to read.
As mentioned by others the dot is the point, so we have three identical groups and one without the dot.
So this regex does it for me:
(?:\d*[1-9]\.){3}(?:\d*[1-9])
You never specify the dot in your patterns. What you ask for is, in fact, not a repetition of four, it is a specific single pattern of four numbers separated with dots.
^(\d*[1-9]+\.\d*[1-9]+\.\d*[1-9]+\.\d*[1-9]+)$
The only thing in there you could consider a repetition is the "number + dot" part, but then you repeat that three times and add another number. Then the regex would become this:
^((\d*[1-9]+\.){3}\d*[1-9]+)$
However, your third line contains a space at the end, so you may want to add extra checks to trim those off.
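A minimal sketch of this version against the four inputs from the question. The Trim() call is one way to handle the trailing space mentioned above; the class name is made up.

```csharp
using System;
using System.Text.RegularExpressions;

static class NoTrailingZeros
{
    // Three "number + dot" groups followed by one final number;
    // every number has to end in a non-zero digit.
    static readonly Regex Pattern = new Regex(@"^((\d*[1-9]+\.){3}\d*[1-9]+)$");

    // Leading and trailing whitespace is trimmed before matching.
    public static bool IsValid(string s) => Pattern.IsMatch(s.Trim());

    static void Main()
    {
        foreach (var s in new[] { "1.1.1.1", "1.01.1.1", "01.01.091.01 ", "1.10.100.0010" })
        {
            Console.WriteLine($"'{s}': {IsValid(s)}");
        }
    }
}
```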
The problem with your regex is that, by not including the ., it fails to find four matches of digits, because they always have dots in between.
Try this instead:
(?:(\d*[1-9])\.?){4}

Regular Expression to not allow 3 consecutive characters

I have the following regex:
Regex pattern = new Regex(@"^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])[0-9a-zA-Z]{8,20}/(.)$");
(?=.*\d) //should contain at least one digit
(?=.*[a-z]) //should contain at least one lower case
(?=.*[A-Z]) //should contain at least one upper case
[a-zA-Z0-9]{8,20} //should contain at least 8 characters and maximum of 20
My problem is I also need to check if 3 consecutive characters are identical. Upon searching, I saw this solution:
/(.)\1\1/
However, I can't make it to work if I combined it to my existing regex, still no luck:
Regex(@"^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])[0-9a-zA-Z]{8,20}$/(.)\1\1/");
What did I miss here? Thanks!
The problem is that /(.)\1\1/ includes the surrounding / characters which are used to quote literal regular expressions in some languages (like Perl). But even if you don't use the quoting characters, you can't just add it to a regular expression.
At the beginning of your regex, you have to say "What follows cannot contain a character followed by itself and then itself again", like this: (?!.*(.)\1\1). The (?! starts a zero-width negative lookahead assertion. The "zero-width" part means that it does not consume any characters in the input string, and the "negative lookahead assertions" means that it looks forward in the input string to make sure that the given pattern does not appear anywhere.
All told, you want a regex like this:
new Regex(@"^(?!.*(.)\1\1)(?=.*\d)(?=.*[a-z])(?=.*[A-Z])[0-9a-zA-Z]{8,20}$")
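Here is a small sketch that exercises the combined pattern. The sample passwords and the class name are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class PasswordRule
{
    // Negative lookahead first (no character three times in a row), then the
    // three "must contain" lookaheads, then the allowed characters and length.
    static readonly Regex Pattern = new Regex(
        @"^(?!.*(.)\1\1)(?=.*\d)(?=.*[a-z])(?=.*[A-Z])[0-9a-zA-Z]{8,20}$");

    public static bool IsValid(string password) => Pattern.IsMatch(password);

    static void Main()
    {
        foreach (var p in new[] { "Passw0rdOk", "Paaassw0rd", "password1", "Ab1" })
        {
            Console.WriteLine($"{p}: {IsValid(p)}");
        }
    }
}
```

Only the first sample passes: the second contains aaa, the third has no upper-case letter, and the fourth is too short.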
I solved it by trial and error:
Regex pattern = new Regex(@"^(?!.*(.)\1\1)(?=.*\d)(?=.*[a-z])(?=.*[A-Z])[0-9a-zA-Z]{8,20}$");

Regular expression with repetition?

I need a regular expression that accepts only digits and dots, with these conditions:
between digits there must be only one dot '132.632.55'
digits can be repeated between two dots '.112234563456789.'
the string starts with digits
digits with "." like this '123346547987.' can repeat many times
length of these digits is less than 50 characters
For example: 123456.258469.5467.15546
Given all the information in the question, I think this is the regular expression you need:
^(\d{1,50}\.)*\d{1,50}$
This will:
require that the string begins and ends with a digit
not require that there is a dot in there at all
ensure that each run of digits between dots is no longer than 50 digits
If you need it to have at least one dot, change the * to a +:
^(\d{1,50}\.)+\d{1,50}$
From what I can tell from your requirements, you want something like this:
^(\d{1,50}\.)*\d{1,50}$
That is, from one to 50 digits, optionally preceded by any number of groups of one to 50 digits, each group followed by a fullstop. I can't quite tell if you want something like 1233.456 to be invalid, since your requirement #2 implies that only digit groups between dots can contain repeat digits. In such a case, it'd be much simpler to perform the validation of individual digit groups after the fact.
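A brief sketch to try the pattern on the example from the question and on a few edge cases. The class name and the extra inputs are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class DottedDigits
{
    // Zero or more "digits + dot" groups, then a final run of digits;
    // each run is between 1 and 50 digits long.
    static readonly Regex Pattern = new Regex(@"^(\d{1,50}\.)*\d{1,50}$");

    public static bool IsValid(string s) => Pattern.IsMatch(s);

    static void Main()
    {
        var samples = new[]
        {
            "123456.258469.5467.15546", // example from the question
            "123..456",                 // two dots in a row
            "123.",                     // ends with a dot
            new string('1', 51)         // a single run of 51 digits
        };

        foreach (var s in samples)
        {
            Console.WriteLine($"{s}: {IsValid(s)}");
        }
    }
}
```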
