Regex pattern infinite number of times except last one different - c#

I'm trying to build a regex to check if a text input is valid.
The pattern is [NumberBetween1And999]['x'][NumberBetween1And999][','][White space Optional] repeated infinite times.
I need this to make an order from a string: the first number is the product id and the second number is the quantity for the product.
Examples: of good texts:
1x1
2x1,3x1
1x3, 4x1
Should not catch:
1x1,
1,1, 1x1,
9999x1
1x1,99999x1
I'm blocked there: ^(([1-9][0-9]{0,2})x([1-9][0-9]{0,2}),)*$
Thanks for helping me

You can use
^[1-9][0-9]{0,2}x[1-9][0-9]{0,2}(?:,\s*[1-9][0-9]{0,2}x[1-9][0-9]{0,2})*$
The pattern matches:
^ Start of string
[1-9][0-9]{0,2}x[1-9][0-9]{0,2} Match a digit 1-9 and 2 optional digits 0-9, then x and again the digits part
(?: Non capture group to repeat as a whole
,\s* Match a comma and optional whitespace char
[1-9][0-9]{0,2}x[1-9][0-9]{0,2} Match the same pattern as at the beginning
)* Close the non capture group and optionally repeat it to also match a single part without a comma
$ End of string
Regex demo

Related

Regex to match 7 same digits in a number regardless of position

I want to match an 8 digit number. Currently, I have the following regex but It is failing in some cases.
(\d+)\1{6}
It matches only when a number is different at the end such as 44444445 or 54444444. However, I am looking to match cases where at least 7 digits are the same regardless of their position.
It is failing in cases like
44454444
44544444
44444544
What modification is needed here?
It's probably a bad idea to use this in a performance-sensitive location, but you can use a capture reference to achieve this.
The Regex you need is as follows:
(\d)(?:.*?\1){6}
Breaking it down:
(\d) Capture group of any single digit
.*? means match any character, zero or more times, lazily
\1 means match the first capture group
We enclose that in a non-capturing group {?:
And add a quantifier {6} to match six times
You can sort the digits before matching
string input = "44444445 54444444 44454444 44544444 44444544";
string[] numbers = input.Split(' ');
foreach (var number in numbers)
{
number = String.Concat(str.OrderBy(c => c));
if (Regex.IsMatch(number, #"(\d+)\1{6}"))
// do something
}
Still not a good idea to use regex for this though
The pattern that you tried (\d+)\1{6} matches 6 of the same digits in a row. If you want to stretch the match over multiple same digits, you have to match optional digits in between.
Note that in .NET \d matches more digits than 0-9 only.
If you want to match only digits 0-9 using C# without matching other characters in between the digits:
([0-9])(?:[0-9]*?\1){6}
The pattern matches:
([0-9]) Capture group 1
(?: Non capture group
[0-9]*?\1 Match optional digits 0-9 and a backreference to group 1
){6} Close non capture group and repeat 6 times
See a .NET Regex demo
If you want to match only 8 digits, you can use a positive lookahead (?= to assert 8 digits and word boundaries \b
\b(?=\d{8}\b)[0-9]*([0-9])(?:[0-9]*?\1){6}\d*\b
See another .NET Regex demo

Regex start new match at specific pattern

Hello im kinda new to regex and have a small, maybe simple question.
I have the given text:
17.11.2020 15:32 typical Pat. seems sleeping
Additional test
17.11.2020 15:32 typical Pat. seems sleeping
Additional test
17.11.2020 15:32 typical Pat. seems sleeping
Additional test
My current regex (\d{2}.\d{2}.\d{4}\s\d{2}:\d{2})\s?(.*)
matches only till sleeping but reates 3 matches correctly.
But i need the Additional test text also in the second group.
i tried something like (\d{2}.\d{2}.\d{4}\s\d{2}:\d{2})\s?([,.:\w\s]*) but now i have only one huge match because the second group takes everything until the end.
How can i match everything until a new line with a date starts and create a new match from there on?
If you are sure there is only one additional line to be matched you can use
(?m)^(\d{2}\.\d{2}\.\d{4}\s\d{2}:\d{2})\s*(.*(?:\n.*)?)
See the regex demo. Details:
(?m) - a multiline modifier
^ - start of a line
(\d{2}\.\d{2}\.\d{4}\s\d{2}:\d{2}) - Group 1: a datetime string
\s* - zero or more whitespaces
(.*(?:\n.*)?) - Group 2: any zero or more chars other than a newline char as many as possible and then an optional line, a newline followed with any zero or more chars other than a newline char as many as possible.
If there can be any amount of lines, you may consider
(?m)^(\d{2}\.\d{2}\.\d{4}[\p{Zs}\t]\d{2}:\d{2})[\p{Zs}\t]*(?s)(.*?)(?=\n\d{2}\.\d{2}\.\d{4}|\z)
See this regex demo. Here,
(?m)^(\d{2}\.\d{2}\.\d{4}[\p{Zs}\t]\d{2}:\d{2}) - matches the same as above, just \s is replaced with [\p{Zs}\t] that only matches horizontal whitespace
[\p{Zs}\t]* - 0+ horizontal whitespace chars
(?s) - now, . will match any chars including a newline
(.*?) - Group 2: any zero or more chars, as few as possible
(?=\n\d{2}\.\d{2}\.\d{4}|\z) - up to the leftmost occurrence of a newline, followed with a date string, or up to the end of string.
You are using \s repeatedly using the * quantifier with the character class [,.:\w\s]* and \s also matches newlines and will match too much.
You can just match the rest of the line using (.*\r?\n.*) which would not match a newline, then match a newline and the next line in the same group.
^(\d{2}.\d{2}.\d{4}\s\d{2}:\d{2})\s?(.*\r?\n.*)
Regex demo
If multiple lines can follow, match all following lines that do not start with a date like pattern.
^(\d{2}\.\d{2}\.\d{4})\s*(.*(?:\r?\n(?!\d{2}\.\d{2}\.\d{4}).*)*)
Explanation
^ Start of the string
( Capture group1
\d{2}\.\d{2}\.\d{4} Match a date like pattern
) Close group 1
\s* Match 0+ whitespace chars (Or match whitespace chars without newlines [^\S\r\n]*)
( Capture group 2
.* Match the whole line
(?:\r?\n(?!\d{2}\.\d{2}\.\d{4}).*)* Optionally repeat matching the whole line if it does not start with a date like pattern
) Close group 2
Regex demo

Trying to capture a decimal number out of a string

In a text file, I'm looking for a part of a document that contains a piece like min ISO 1133 0.2-0.35. What I want to capture is the ranged decimal part of that piece of text (0.2-0.35). Since there are other ranged decimal numbers, I cannot simply use a regular expression to look only for the ranged part. Till now, I could make min.*(\d+)((?:\.)?)(\d*)-(\d+)((?:\.)?)(\d*) but the result is not correct and I'm stuck. Can anyone please help me with this?
Below, you can see the final result (yellow part):
Maybe the following would work for you?
\bmin\s.*?(\d+(?:\.\d+)?)-(\d+(?:\.\d+)?)
See the online demo
The answer is currently based on the assumption (looking at your current attempt) you'd want these ranges in seperate groups. However, if not, this answer can be swiftly transformed to capture the whole substring (or see #TheFourthBird's answer).
\b - Match word boundary.
min - Literally match 'min'.
\s - Match a whitespace character.
.*? - Match any character other than newline up to (lazy):
( - Open 1st capture group
\d+ - At least a single digit.
(?: - Open non-capturing group.
\.\d+ - Match a literal dot and at least a single digit.
)? - Close non-capturing group and make it optional.
) - Close 1st capture group.
- Match a literal hyphen.
( - Open 2nd capture group
\d+ - At least a single digit.
(?: - Open non-capturing group.
\.\d+ - Match a literal dot and at least a single digit.
)? - Close non-capturing group and make it optional.
) - Close 2nd capture group.
You could get the decimal part matching 1+ digit in the optional part and making the quantifier non greedy. The value is in capture group 1.
\bmin [A-Z]+ [0-9]+ ([0-9]+(?:\.[0-9]+)?-[0-9](?:\.[0-9]+)?)\b
Regex demo
Or a bit more specific pattern
\bmin [A-Z]+ [0-9]+ ([0-9]+(?:\.[0-9]+)?-[0-9]+(?:\.[0-9]+)?)\b
Regex demo

C# regex string that is not another string

I want to match an at least 3 letter word, preceded by any character from class [-_ :] any amount of times, that is not this specific 3 letter word string2.
Ex:
if string2="VER"
in
" ODO VER7"
matched " ODO"
or
"_::ATTPQ VER7"
matched "_::ATTPQ"
but if
" VER7"
it shoudn't match " VER"
so I thought about
Regex.Match(inputString, #"[-_:]*[A-Z]{3,}[^(VER)]", RegexOptions.IgnoreCase);
where
[-_:]* checks for any character in class, appearing 0 or more times
[A-Z] the range of letters that could form the word
{3,} the minimum amount of letters to form the word
[^(VER)] the grouping construct that shouldn't appear
I believe however that [A-Z]{3,} results in any letter at least 3 times (not what i want)
and [^(VER)] not sure what it's doing
Using [^(VER)] means a negated character class where you would match any character except ( ) V E or R
For you example data, you could match 0+ spaces or tabs (or use \s to also match a newline).
Then use a negative lookahead before matching 3 or more times A-Z to assert what is on the right is not VER.
If that is the case, match 3 or more times A-Z followed by a space and VER itself.
^[ \t]*[-_:]*(?!VER)[A-Z]{3,} VER
Regex demo
^\s*[-_:]*(?!VER)[A-Z]{3,}
This regex asserts that between the start and end of the string, there's zero or more of your characters, followed by at least 3 letters. It uses a negative lookahead to make sure that VER (or whatever you want) is not present.
Demo
This would match the preceding class characters [-_ :] of 3 or more letters/numbers
that do not start with VER (as in the samples given) :
[-_ :]+(?!VER)[^\W_]{3,}
https://regex101.com/r/wLw23I/1

Using Regular Expression for Phone Number

I know this question is asked like a thousand times in here, but I can't get the hang of it yet. I need help with checking a textbox if it matches a Phone Number format. The format should be likes this :
000-000-000 or (+000)00-000-000. Can anybody help me ?
give this pattern a try,
^(\(\+\d{3}\)|\d)\d{2}(-\d{3}){2}$
ScreenShot:
Generated Explanation:
Assert position at the beginning of a line (at beginning of the string or after a line break character) ^
Match the regular expression below and capture its match into backreference number 1 (\(\+\d{3}\)|\d)
Match either the regular expression below (attempting the next alternative only if this one fails) \(\+\d{3}\)
Match the character “(” literally \(
Match the character “+” literally \+
Match a single digit 0..9 \d{3}
Exactly 3 times {3}
Match the character “)” literally \)
Or match regular expression number 2 below (the entire group fails if this one fails to match) \d
Match a single digit 0..9 \d
Match a single digit 0..9 \d{2}
Exactly 2 times {2}
Match the regular expression below and capture its match into backreference number 2 (-\d{3}){2}
Exactly 2 times {2}
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. {2}
Match the character “-” literally -
Match a single digit 0..9 \d{3}
Exactly 3 times {3}
Assert position at the end of a line (at the end of the string or before a line break character) $
Pattern 1 is \d{3}\-\d{3}\-\d{3}
Pattern 2 is \(\+\d{3}\)\d{2}\-d{3}\-\d{3}
So you need to match for Pattern1 OR Pattern2:
(\d{3}\-\d{3}\-\d{3})|(\(\+\d{3}\)\d{2}\-d{3}\-\d{3})
(?:\d|\(\+\d{3}\))\d{2}(?:-\d{3}){2}
Or, if you're regarding of performance, better change it to:
(?:\(\+\d{3}\)|\d)\d{2}(?:-\d{3}){2}

Categories

Resources