Regex Not Finding All Matches C# [duplicate] - c#

I researched, tested, and used regex101, and still could not figure this out. I spent hours on it.
I am looking for 16-digit numbers.
Here is my regular expression: "[0-9]{16}"
Here is the code:
Regex ItemRegex = new Regex("[0-9]{16}");
int i = 0;
// test with a 30 digit number
foreach (Match ItemMatch in ItemRegex.Matches("564654564553314342340968580654"))
{
    i++;
}
I am expecting multiple matches but I am only getting the first 16 digits. How do I get the first 16 digits starting at position one, then the second 16 digits starting at position 2, then the third, etc?
Any thoughts, ideas, suggestions or solutions would be greatly appreciated and +1'd.

As ctwheels alluded to in his/her comment, to get overlapping matches like you want, you need to use a lookahead assertion: an expression that tests whether a condition is satisfied (positive lookahead) or is not satisfied (negative lookahead) at the current position, without consuming any characters. Consider the following expression:
\d(?=(\d{15}))
The first \d will match a single digit and consume that character in the expression. This is followed by a positive lookahead assertion (denoted by (?=expression)) that tests whether that single digit is followed by another 15 digits, without consuming those 15 characters. Not consuming those characters means that the expression can find additional matches starting with the first character after the one matched by the initial \d. So:
var expression = @"\d(?=(\d{15}))";
var testString = "564654564553314342340968580654";
var regex = new Regex(expression);
foreach (Match match in regex.Matches(testString))
{
    Console.WriteLine($"{match.Groups[0].Value}{match.Groups[1].Value}");
}
In my Console.WriteLine I am concatenating the two groups that appear in each match: Groups[0] is the whole match (the leading digit) and Groups[1] is the captured group of 15 digits that follows it. The output of the above code is:
5646545645533143
6465456455331434
4654564553314342
6545645533143423
5456455331434234
4564553314342340
5645533143423409
6455331434234096
4553314342340968
5533143423409685
5331434234096858
3314342340968580
3143423409685806
1434234096858065
4342340968580654

See regex in use here
(?=(\d{16}))
(?=(\d{16})) Positive lookahead ensuring the following follows the current position
(\d{16}) Capture 16 digits into capture group 1
Result:
5646545645533143
6465456455331434
4654564553314342
6545645533143423
5456455331434234
4564553314342340
5645533143423409
6455331434234096
4553314342340968
5533143423409685
5331434234096858
3314342340968580
3143423409685806
1434234096858065
4342340968580654
So how does this work? Well, a lookahead (?=) is a zero-width assertion that checks whether or not the subpattern it contains matches at that specific location in the string. Since we haven't anchored our regex, this will attempt to match every position in the string.
So what does it mean to be a zero-width assertion? A lookaround actually matches characters and then gives up the match, returning only the result: match or no match. In our case, we've also added a capturing group to the positive lookahead assertion, thus allowing it to capture the result. What we end up with are empty matches (only matches at the particular locations where 16 digits follow) and the result (our 16 digits) in the capturing group.
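As a rough sketch of how this pattern could be used from C# (the variable names are illustrative and not from the answer), the 16 digits are read from capture group 1, because the match itself is empty:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "564654564553314342340968580654";

// Each match is zero-width; the 16 digits live in capture group 1.
string[] windows = Regex.Matches(input, @"(?=(\d{16}))")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

foreach (string window in windows)
{
    Console.WriteLine(window);
}
```

For the 30-digit test string this should produce the same 15 overlapping windows listed above.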

Here is an approach without regex (it needs using System.Linq):
// test with a 30 digit number
string input = "564654564553314342340968580654";
for (int i = 0; i < input.Length - 16 + 1; i++)
{
    // take the 16-character window starting at position i
    string result = string.Concat(input.Skip(i).Take(16));
    if (result.All(x => char.IsDigit(x)))
    {
        Console.WriteLine(result);
    }
}
https://dotnetfiddle.net/igGMSL

Related

Regex to match 7 same digits in a number regardless of position

I want to match an 8 digit number. Currently, I have the following regex but It is failing in some cases.
(\d+)\1{6}
It matches only when a number is different at the end such as 44444445 or 54444444. However, I am looking to match cases where at least 7 digits are the same regardless of their position.
It is failing in cases like
44454444
44544444
44444544
What modification is needed here?
It's probably a bad idea to use this in a performance-sensitive location, but you can use a capture reference to achieve this.
The Regex you need is as follows:
(\d)(?:.*?\1){6}
Breaking it down:
(\d) Capture group of any single digit
.*? means match any character, zero or more times, lazily
\1 means match the first capture group
We enclose that in a non-capturing group (?:
And add a quantifier {6} to match six times
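A small check of this pattern against the numbers from the question could look like the following. This is only a sketch: the last sample, 44554444, is an extra case added here to show a non-match, and the pattern on its own does not enforce the 8-digit length.

```csharp
using System;
using System.Text.RegularExpressions;

// One captured digit followed by six more occurrences of it, with anything in between.
var sameDigit = new Regex(@"(\d)(?:.*?\1){6}");

string[] samples = { "44444445", "54444444", "44454444", "44544444", "44444544", "44554444" };
foreach (string s in samples)
{
    Console.WriteLine($"{s}: {sameDigit.IsMatch(s)}");
}
```

Only the last sample, which contains just six 4s, should report False.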
You can sort the digits before matching
string input = "44444445 54444444 44454444 44544444 44444544";
string[] numbers = input.Split(' ');
foreach (var number in numbers)
{
    // sort the digits so that equal digits become adjacent
    // (the foreach variable is read-only, so use a new variable)
    string sorted = String.Concat(number.OrderBy(c => c));
    if (Regex.IsMatch(sorted, @"(\d+)\1{6}"))
    {
        // do something
    }
}
Still not a good idea to use regex for this though
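The pattern that you tried, (\d+)\1{6}, only matches when the repeats are adjacent: the captured digit followed immediately by 6 more copies, so 7 of the same digit in a row. If you want to stretch the match over the same digit appearing in several places, you have to match optional digits in between.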
Note that in .NET \d matches more digits than 0-9 only.
If you want to match only digits 0-9 using C# without matching other characters in between the digits:
([0-9])(?:[0-9]*?\1){6}
The pattern matches:
([0-9]) Capture group 1
(?: Non capture group
[0-9]*?\1 Match optional digits 0-9 and a backreference to group 1
){6} Close non capture group and repeat 6 times
See a .NET Regex demo
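The note about \d in .NET can be illustrated with a short sketch. It assumes the default regex options and uses the Arabic-Indic digit three (U+0663) as an arbitrary non-ASCII decimal digit; RegexOptions.ECMAScript is the standard option that restricts \d to 0-9.

```csharp
using System;
using System.Text.RegularExpressions;

string arabicIndicThree = "\u0663"; // a Unicode decimal digit outside 0-9

bool defaultD = Regex.IsMatch(arabicIndicThree, @"\d");                          // \d is Unicode-aware by default
bool asciiClass = Regex.IsMatch(arabicIndicThree, @"[0-9]");                     // explicit ASCII range
bool ecmaD = Regex.IsMatch(arabicIndicThree, @"\d", RegexOptions.ECMAScript);    // \d restricted to [0-9]

Console.WriteLine($"{defaultD} {asciiClass} {ecmaD}");
```

Only the first check should succeed, which is why the patterns in this answer spell out [0-9].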
If you want to match only 8 digits, you can use a positive lookahead (?= to assert 8 digits and word boundaries \b
\b(?=\d{8}\b)[0-9]*([0-9])(?:[0-9]*?\1){6}\d*\b
See another .NET Regex demo
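A possible way to try the 8-digit version from C# is sketched below. The input sentence is made up for illustration: it contains an 8-digit number without seven equal digits, one that qualifies, and a 9-digit number that should be rejected by the lookahead.

```csharp
using System;
using System.Text.RegularExpressions;

var eightDigits = new Regex(@"\b(?=\d{8}\b)[0-9]*([0-9])(?:[0-9]*?\1){6}\d*\b");

// Hypothetical input with three candidate numbers.
Match m = eightDigits.Match("ids: 12345678 44454444 444544441");

Console.WriteLine(m.Success ? $"{m.Value} (repeated digit: {m.Groups[1].Value})" : "no match");
```

Group 1 holds the digit that occurs at least seven times, and the whole match is the 8-digit number.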

Is there a regular expression for matching a string that has no more than 2 repeating characters? [duplicate]

I want to match strings that do not contain more than 3 of the same character repeated in a row. So:
abaaaa [no match]
abawdasd [match]
abbbbasda [no match]
bbabbabba [match]
Yes, it would be much easier and neater to do a regex match for containing the consecutive characters, and then negate that in the code afterwards. However, in this case that is not possible.
I would like to open out the question to x consecutive characters so that it can be extended to the general case to make the question and answer more useful.
Negative lookahead is supported in this case.
Use a negative lookahead with back references:
^(?:(.)(?!\1\1))*$
See live demo using your examples.
(.) captures each character in group 1 and the negative look ahead asserts that the next 2 chars are not repeats of the captured character.
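To see the effect, the pattern can be run over the examples from the question. This is a minimal sketch; note that this particular pattern already rejects three identical characters in a row, so "aaab" is added here as an extra case.

```csharp
using System;
using System.Text.RegularExpressions;

// No character may be followed by two more copies of itself.
var noTriple = new Regex(@"^(?:(.)(?!\1\1))*$");

string[] samples = { "abaaaa", "abawdasd", "abbbbasda", "bbabbabba", "aaab" };
foreach (string s in samples)
{
    Console.WriteLine($"{s}: {(noTriple.IsMatch(s) ? "match" : "no match")}");
}
```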
To match strings not containing a character repeated more than 3 times consecutively:
^((.)\2?(?!\2\2))+$
How it works:
^ Start of string
(
(.) Match any character (not a new line) and store it for back reference.
\2? Optionally match one more exact copy of that character.
(?! Make sure the upcoming character(s) is/are not the same character.
\2\2 Repeat '\2' for as many times as you need
)
)+ Do ad nauseam
$ End of string
So, the number of \2 in your whole expression will be the number of times you allow a character to be repeated consecutively; any more and you won't get a match.
E.g.
^((.)\2?(?!\2\2\2))+$ will match all strings that don't repeat a character more than 4 times in a row.
^((.)\2?(?!\2\2\2\2))+$ will match all strings that don't repeat a character more than 5 times in a row.
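Since the only thing that changes between these variants is the number of \2 inside the lookahead, the pattern could be generated for any limit. The helper below, BuildMaxRunPattern, is a hypothetical name introduced for this sketch, and it assumes a limit of at least 2.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: builds the answer's pattern for a maximum run length (maxRun >= 2).
static string BuildMaxRunPattern(int maxRun)
{
    if (maxRun < 2) throw new ArgumentOutOfRangeException(nameof(maxRun));
    // The lookahead needs maxRun - 1 back references, e.g. \2\2 for a limit of 3.
    string repeats = string.Concat(Enumerable.Repeat(@"\2", maxRun - 1));
    return @"^((.)\2?(?!" + repeats + "))+$";
}

string atMostThree = BuildMaxRunPattern(3); // same text as the pattern at the top of this answer
Console.WriteLine(atMostThree);
Console.WriteLine(Regex.IsMatch("abaaaa", atMostThree));    // run of four 'a'
Console.WriteLine(Regex.IsMatch("bbabbabba", atMostThree)); // longest run is two
```

Because of the + quantifier, an empty string does not match these patterns.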
Please be aware that this solution uses negative lookahead, and not all regex flavors support it.
I'm answering this question :
Is there a regular expression for matching a string that has no more than 2 repeating characters?
which was marked as an exact duplicate of this question.
It's much quicker to negate the match instead
if (!Regex.Match("hello world", @"(.)\1{2}").Success) Console.WriteLine("No dups");

How to extract numbers from a string using regular expressions?

This little challenge just screams regular expressions to me, but so far I am stumped.
I have an arbitrary string that contains two numbers embedded in it. I need to extract those two numbers, which will be n and m digits long (n,m are unknown in advance). The format of the string is always
FixedWord[n digits]anotherfixedword[m digits]alotmorestuffontheend
The first number is of the format 1.2.3.4 (the number of digits varying) eg 5.3.20 or 5.3.10.1 or 5.4.
and the second is a simpler 'm' digits (eg 25 or 2)
eg "AppName5.2.6dbVer44Oracle.Group"
It shouts 'pattern matching' and hence "extraction using regexes". Can anyone guide me further?
TIA
The following pattern:
(\d+(?>\.\d+)*)\w+?(\d+)
Will match this:
AppName5.2.6dbVer44Oracle.Group
       \__________/    <-- match
       \___/     \/    <-- captures
Demo
And will capture the two values you're interested in in capture groups.
Use it like this:
var match = Regex.Match(input, @"(\d+(?>\.\d+)*)\w+?(\d+)");
if (match.Success)
{
    var first = match.Groups[1].Value;
    var second = match.Groups[2].Value;
    // ...
}
Pattern explanation:
( # Start of group 1
\d+ # a series of digits
(?> # start of atomic group
\.\d+ # dot followed by digits
)* # .. 0 to n times
)
\w+? # some word characters (as few as possible)
(\d+) # a series of digits captured in group 2
Try this:
\w*?([\d|\.]+)\w*?([\d{1,4}]+).*
You could start from the following:
^[a-zA-Z]+((?:\d+\.)+\d+)[a-zA-Z]+(\d+).*$
I assumed that the fixed words are just made of letters and that you want to match the entire string. If you prefer, you could substitute the parts not in parentheses with the actual fixed words or change the character sets as desired. I recommend using a tool like https://regex101.com to fine-tune the expression.
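A minimal sketch of using an anchored pattern of this shape from C# follows. It writes the last version component as \d+ so that a version such as 5.3.20 is captured whole, and it keeps the answer's assumption that the fixed words consist only of letters; the second input string is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

var anchored = new Regex(@"^[a-zA-Z]+((?:\d+\.)+\d+)[a-zA-Z]+(\d+).*$");

string[] inputs = { "AppName5.2.6dbVer44Oracle.Group", "AppName5.3.20dbVer2Oracle.Group" };
foreach (string input in inputs)
{
    Match match = anchored.Match(input);
    if (match.Success)
    {
        // Group 1 is the dotted version, group 2 the plain number.
        Console.WriteLine($"{match.Groups[1].Value} / {match.Groups[2].Value}");
    }
}
```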
Keep it basic by specifying a match ( ) that looks for a digit \d, then zero or more * digits or periods in a set [\d.] (the set is \d -or- a literal period):
var data = "AppName5.2.6dbVer44Oracle.Group";
var pattern = @"(\d[\d.]*)";
// Outputs:
// 5.2.6
// 44
// Join the values; passing the sequence itself to WriteLine would print its type name
Console.WriteLine(string.Join(Environment.NewLine,
    Regex.Matches(data, pattern)
         .OfType<Match>()
         .Select(mt => mt.Groups[1].Value)));
Each match will be a number within the string. So if the count of numbers changes, the pattern will not fail and will dutifully report however many numbers (1 to N) it finds.
Simply look for the numbers, since you only care for the numbers and don't want to check the syntax of the whole input string.
MatchCollection matches = Regex.Matches(input, @"\d+(\.\d+)*");
if (matches.Count >= 2) {
    string number1 = matches[0].Value;
    string number2 = matches[1].Value;
} else {
    // Less than two numbers found
}
The expression \d+(\.\d+)* means:
\d+ one or more digits.
( )* repeat zero, one or more times.
\.\d+ one decimal point (escaped with \) followed by one or more digits.
and
\d one digit.
( ) grouping.
+ repeat the expression to the left one or more times.
* repeat the expression to the left zero, one or more times.
\ escapes characters that have a special meaning in regex.
. any character (without escaping).
\. period character (".").

Get a number with exactly x digits from string

I'm looking for a regex pattern which matches a number with a length of exactly x (say x is 2-4) and nothing else.
Examples:
"foo.bar 123 456789", "foo.bar 456789 123", " 123", "foo.bar123 " each have to match only "123"
So. Only digits, no spaces, letters or other stuff.
How do I have to do it?
EDIT: I want to use the Regex.Matches() function in c# to extract this 2-4 digit number and use it in additional code.
Any pattern followed by {m,n} allows the pattern to occur m to n times. So in your case, use \d{m,n} for the required values of m and n. If it has to occur exactly m times, use \d{m}
If you want to match 123 in x123y and not in 1234, use \d{3}(?=\D|$)(?<=(\D|^)\d{3})
It has a lookahead to ensure that the character following the 3 digits is a non-digit or nothing at all, and a lookbehind to ensure that the character before the 3 digits is a non-digit or nothing at all.
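Applied to the example strings from the question, the lookaround pattern could be exercised as in the sketch below. FindExactThree is a hypothetical helper name used only here, and the pattern is the fixed 3-digit version from this answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: returns every run of exactly 3 digits in the input.
static string[] FindExactThree(string input)
{
    return Regex.Matches(input, @"\d{3}(?=\D|$)(?<=(\D|^)\d{3})")
        .Cast<Match>()
        .Select(m => m.Value)
        .ToArray();
}

string[] inputs = { "foo.bar 123 456789", "foo.bar 456789 123", " 123", "foo.bar123 " };
foreach (string input in inputs)
{
    string found = string.Join(", ", FindExactThree(input));
    Console.WriteLine($"[{input}] -> {found}");
}
```

Each of the four inputs should yield only 123, because no 3-digit slice of 456789 satisfies both the lookahead and the lookbehind.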
You can achieve this with basic RegEx:
\b(\d\d\d)\b or \b(\d{3})\b - for matching a number with exactly 3 digits
If you want variable digits: \b(\d{2,4})\b (explained demo here)
If you want to capture matches next to words: \D(\d{2,4})\D (explained demo here)
\b is a word boundary (a zero-width assertion: it matches a position, not a character)
\d matches only digits
\D matches any character that is NOT a digit
() everything in round brackets will capture a match

Regular expression for accepting alphanumeric characters (6-10 chars) .NET, C#

I am building a user registration form using C# with .NET.
I have a requirement to validate user entered password fields.
Validation requirement is as below.
It should be alphanumeric (a-z , A-Z , 0-9)
It should accept 6-10 characters (minimum 6 characters, maximum 10 characters)
With at least 1 letter and 1 number (example: stack1over)
I am using a regular expression as below.
^([a-zA-Z0-9]{6,10})$
It satisfies my first 2 conditions.
It fails when I enter only characters or numbers.
Pass it through multiple regexes if you can. It'll be a lot cleaner than those look-ahead monstrosities :-)
^[a-zA-Z0-9]{6,10}$
[a-zA-Z]
[0-9]
Though some might consider it clever, it's not necessary to do everything with a single regex (or even with any regex, sometimes - just witness the people who want a regex to detect numbers between 75 and 4093).
Would you rather see some nice clean code like:
if not checkRegex(str,"^[0-9]+$")
    return false
val = string_to_int(str);
return (val >= 75) and (val <= 4093)
or something like:
return checkRegex(str,"^7[5-9]$|^[89][0-9]$|^[1-9][0-9][0-9]$|^[1-3][0-9][0-9][0-9]$|^40[0-8][0-9]$|^409[0-3]$")
I know which one I'd prefer to maintain :-)
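For the password requirement, the three simple regexes listed at the start of this answer could be combined in C# roughly as follows. IsValidPassword is a hypothetical name chosen for this sketch.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: applies the three simple checks one after another.
static bool IsValidPassword(string password)
{
    return Regex.IsMatch(password, "^[a-zA-Z0-9]{6,10}$") // only letters and digits, 6-10 characters
        && Regex.IsMatch(password, "[a-zA-Z]")            // at least one letter
        && Regex.IsMatch(password, "[0-9]");              // at least one digit
}

Console.WriteLine(IsValidPassword("stack1over")); // letters and a digit, 10 characters
Console.WriteLine(IsValidPassword("stackover"));  // no digit
Console.WriteLine(IsValidPassword("123456"));     // no letter
```

One caveat: in .NET, $ also matches just before a final newline, so \z may be a stricter choice for the end anchor if trailing newlines must be rejected.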
Use positive lookahead
^(?=.*[a-zA-Z])(?=.*[0-9])[a-zA-Z0-9]{6,10}$
Look arounds are also called zero-width assertions. They are zero-width just like the start and end of line (^, $). The difference is that lookarounds will actually match characters, but then give up the match and only return the result: match or no match. That is why they are called "assertions". They do not consume characters in the string, but only assert whether a match is possible or not.
The syntax for look around:
(?=REGEX) Positive lookahead
(?!REGEX) Negative lookahead
(?<=REGEX) Positive lookbehind
(?<!REGEX) Negative lookbehind
string r = @"^(?=.*[A-Za-z])(?=.*[0-9])[A-Za-z0-9]{6,10}$";
Regex x = new Regex(r);
var z = x.IsMatch(password);
http://www.regular-expressions.info/refadv.html
http://www.regular-expressions.info/lookaround.html
