C# regex string that is not another string

I want to match a word of at least 3 letters, preceded by any number of characters from the class [-_ :], as long as that word is not a specific 3-letter string, string2.
Ex:
if string2="VER"
in
" ODO VER7"
matched " ODO"
or
"_::ATTPQ VER7"
matched "_::ATTPQ"
but if
" VER7"
it shouldn't match " VER"
so I thought about
Regex.Match(inputString, @"[-_:]*[A-Z]{3,}[^(VER)]", RegexOptions.IgnoreCase);
where
[-_:]* checks for any character in class, appearing 0 or more times
[A-Z] the range of letters that could form the word
{3,} the minimum amount of letters to form the word
[^(VER)] the grouping construct that shouldn't appear
I believe, however, that [A-Z]{3,} results in any letter at least 3 times (not what I want),
and I'm not sure what [^(VER)] is doing.

Using [^(VER)] gives you a negated character class, which matches any single character except (, ), V, E or R.
For your example data, you could match 0+ spaces or tabs (or use \s to also match a newline).
Then, before matching A-Z 3 or more times, use a negative lookahead to assert that what is directly to the right is not VER.
If that assertion holds, match A-Z 3 or more times, followed by a space and VER itself.
^[ \t]*[-_:]*(?!VER)[A-Z]{3,} VER

^\s*[-_:]*(?!VER)[A-Z]{3,}
This regex is anchored at the start of the string only (there is no end anchor). It matches optional whitespace, then zero or more of your class characters, followed by at least 3 letters. The negative lookahead makes sure that those letters do not start with VER (or whatever word you want to exclude).
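A minimal sketch of how this pattern could be used from C#, written as a top-level program. The helper name MatchWord and the use of Regex.Escape to plug in the excluded word are assumptions for illustration, not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: builds the pattern from the answer for any excluded word.
static Match MatchWord(string input, string excluded)
{
    // ^\s*[-_:]*(?!VER)[A-Z]{3,} with the excluded word escaped and inserted.
    string pattern = @"^\s*[-_:]*(?!" + Regex.Escape(excluded) + @")[A-Z]{3,}";
    return Regex.Match(input, pattern, RegexOptions.IgnoreCase);
}

foreach (string sample in new[] { " ODO VER7", "_::ATTPQ VER7", " VER7" })
{
    Match m = MatchWord(sample, "VER");
    Console.WriteLine(m.Success
        ? $"'{sample}' -> '{m.Value}'"
        : $"'{sample}' -> no match");
}
```

The lookahead only looks at the three characters after the class characters, so a longer word that merely starts with VER (for example VERSION) is rejected as well.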

This matches one or more of the class characters [-_ :] followed by 3 or more letters or digits
that do not start with VER (as in the samples given):
[-_ :]+(?!VER)[^\W_]{3,}
https://regex101.com/r/wLw23I/1

Related

Regex pattern infinite number of times except last one different

I'm trying to build a regex to check if a text input is valid.
The pattern is [NumberBetween1And999]['x'][NumberBetween1And999][','][White space Optional] repeated infinite times.
I need this to make an order from a string: the first number is the product id and the second number is the quantity for the product.
Examples of good texts:
1x1
2x1,3x1
1x3, 4x1
Should not catch:
1x1,
1,1, 1x1,
9999x1
1x1,99999x1
I'm stuck here: ^(([1-9][0-9]{0,2})x([1-9][0-9]{0,2}),)*$
Thanks for helping me
You can use
^[1-9][0-9]{0,2}x[1-9][0-9]{0,2}(?:,\s*[1-9][0-9]{0,2}x[1-9][0-9]{0,2})*$
The pattern matches:
^ Start of string
[1-9][0-9]{0,2}x[1-9][0-9]{0,2} Match a digit 1-9 and 2 optional digits 0-9, then x and again the digits part
(?: Non capture group to repeat as a whole
,\s* Match a comma and optional whitespace char
[1-9][0-9]{0,2}x[1-9][0-9]{0,2} Match the same pattern as at the beginning
)* Close the non capture group and optionally repeat it to also match a single part without a comma
$ End of string
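As a rough sketch of how the pattern could feed the order-building step mentioned in the question: validate the whole string first, then pull out each pair. The name TryParseOrder and the tuple field names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: validates the text, then extracts (product id, quantity) pairs.
static bool TryParseOrder(string text, out List<(int ProductId, int Quantity)> items)
{
    // Whole-string validation pattern from the answer.
    const string orderPattern =
        @"^[1-9][0-9]{0,2}x[1-9][0-9]{0,2}(?:,\s*[1-9][0-9]{0,2}x[1-9][0-9]{0,2})*$";
    // Same item pattern, with capture groups for the two numbers.
    const string itemPattern = @"([1-9][0-9]{0,2})x([1-9][0-9]{0,2})";

    items = new List<(int ProductId, int Quantity)>();
    if (!Regex.IsMatch(text, orderPattern))
        return false;

    foreach (Match m in Regex.Matches(text, itemPattern))
        items.Add((int.Parse(m.Groups[1].Value), int.Parse(m.Groups[2].Value)));
    return true;
}

foreach (string sample in new[] { "1x1", "1x3, 4x1", "1x1,", "9999x1" })
{
    bool valid = TryParseOrder(sample, out var parsed);
    Console.WriteLine($"'{sample}': valid={valid}, items={parsed.Count}");
}
```

Extracting with a second, simpler pattern is only safe here because the first check has already confirmed that the string consists of well-formed items and separators.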

Regex to match 7 same digits in a number regardless of position

I want to match an 8-digit number. Currently, I have the following regex, but it fails in some cases.
(\d+)\1{6}
It matches only when a number is different at the end such as 44444445 or 54444444. However, I am looking to match cases where at least 7 digits are the same regardless of their position.
It is failing in cases like
44454444
44544444
44444544
What modification is needed here?
It's probably a bad idea to use this in a performance-sensitive location, but you can use a capture reference to achieve this.
The Regex you need is as follows:
(\d)(?:.*?\1){6}
Breaking it down:
(\d) Capture group of any single digit
.*? means match any character, zero or more times, lazily
\1 means match the first capture group
We enclose that in a non-capturing group (?:
And add a quantifier {6} to match six times
You can sort the digits before matching
using System.Linq;
using System.Text.RegularExpressions;
string input = "44444445 54444444 44454444 44544444 44444544";
string[] numbers = input.Split(' ');
foreach (var number in numbers)
{
    // The foreach variable is read-only, so store the sorted digits in a new variable
    string sorted = String.Concat(number.OrderBy(c => c));
    if (Regex.IsMatch(sorted, @"(\d+)\1{6}"))
    {
        // do something
    }
}
Still not a good idea to use regex for this though
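For comparison, a regex-free sketch of the same check using LINQ. This is a swapped-in technique rather than something taken from the answers, and the helper name HasSevenSameDigits is hypothetical. It assumes the input must be exactly 8 ASCII digits.

```csharp
using System;
using System.Linq;

// Hypothetical helper: true when the string is 8 digits and one digit occurs at least 7 times.
static bool HasSevenSameDigits(string number) =>
    number.Length == 8
    && number.All(c => c >= '0' && c <= '9')
    && number.GroupBy(c => c).Any(g => g.Count() >= 7);

foreach (string n in "44444445 54444444 44454444 44544444 44444544 44554444".Split(' '))
    Console.WriteLine($"{n}: {HasSevenSameDigits(n)}");
```

Counting occurrences avoids backtracking entirely, which matters if the check runs in a hot path.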
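The pattern that you tried, (\d+)\1{6}, only matches 7 of the same digits in a row: the captured group followed by 6 consecutive repeats of it. If you want the match to span same digits that are not adjacent, you have to allow optional digits in between.
Note that in .NET, \d matches any Unicode decimal digit, not only 0-9 (unless RegexOptions.ECMAScript is used).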
If you want to match only digits 0-9 using C# without matching other characters in between the digits:
([0-9])(?:[0-9]*?\1){6}
The pattern matches:
([0-9]) Capture group 1
(?: Non capture group
[0-9]*?\1 Match optional digits 0-9 and a backreference to group 1
){6} Close non capture group and repeat 6 times
If you want to match only 8 digits, you can use a positive lookahead (?= to assert 8 digits and word boundaries \b
\b(?=\d{8}\b)[0-9]*([0-9])(?:[0-9]*?\1){6}\d*\b
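A short sketch exercising the 8-digit pattern above from C#. The helper name ContainsNumberWithSevenSame is an assumption for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper wrapping the pattern from the answer.
static bool ContainsNumberWithSevenSame(string text) =>
    Regex.IsMatch(text, @"\b(?=\d{8}\b)[0-9]*([0-9])(?:[0-9]*?\1){6}\d*\b");

foreach (string sample in new[] { "44454444", "44544444", "44444544", "44554444", "444444445" })
    Console.WriteLine($"{sample}: {ContainsNumberWithSevenSame(sample)}");
```

The lookahead with the two word boundaries is what rejects the 9-digit sample: the number has to be exactly 8 digits long before the repeated-digit part is even tried.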

Regular expression to match 1-5 symbols when start symbol letter and at least one number

I tried this expression:
/([a-z]+[0-9]+[a-z]*){1,5}$/
but it works for every word that starts with a letter, contains at least one number and has more than two symbols, for example "re1111e", when it's not supposed to. What am I doing wrong?
One possible way to write your regex uses a positive lookahead to check for a number:
/(?=[^0-9]*[0-9])[a-z][a-z0-9]{0,4}/
This pattern says to:
(?=[^0-9]*[0-9]) assert that a single digit appears somewhere
[a-z] match an initial letter character
[a-z0-9]{0,4} then match zero to four letter or number characters
In your pattern, the quantifier {1,5} applies to the whole group, so it repeats the match [a-z]+[0-9]+[a-z]* 1 to 5 times instead of limiting the length to 5 characters.
If I am not mistaken, you want to match [a-z] at the start, followed by up to 4 characters of which at least 1 is a digit, so the minimum number of characters is 2 and the maximum is 5.
You might use:
^(?=.{2,5}$)[a-z][a-z0-9]*[0-9][a-z0-9]*$
About the pattern
^ Start of string
(?=.{2,5}$) Assert string length 2 - 5 characters
[a-z] Match a-z
[a-z0-9]* Repeat 0+ times matching a-z 0-9
[0-9] Match a digit
[a-z0-9]* Repeat 0+ times matching a-z 0-9
$ End of string
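A small sketch that runs the anchored pattern against a few inputs, including the "re1111e" case from the question. The helper name IsValidCode is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: 2 to 5 characters, starts with a-z, contains at least one digit.
static bool IsValidCode(string s) =>
    Regex.IsMatch(s, @"^(?=.{2,5}$)[a-z][a-z0-9]*[0-9][a-z0-9]*$");

foreach (string sample in new[] { "a1", "ab1cd", "re1111e", "abcde", "1abc", "a" })
    Console.WriteLine($"{sample}: {IsValidCode(sample)}");
```

The length limit lives in the lookahead, so the rest of the pattern can use unbounded * quantifiers without letting longer strings through.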

Parsing text between quotes with .NET regular expressions

I have the following input text:
@"This is some text #foo=bar #name=""John \""The Anonymous One\"" Doe"" #age=38"
I would like to parse the values with the #name=value syntax as name/value pairs. Parsing the previous string should result in the following named captures:
name:"foo"
value:"bar"
name:"name"
value:"John \""The Anonymous One\"" Doe"
name:"age"
value:"38"
I tried the following regex, which got me almost there:
@"(?:(?<=\s)|^)#(?<name>\w+[A-Za-z0-9_-]+?)\s*=\s*(?<value>[A-Za-z0-9_-]+|(?="").+?(?=(?<!\\)""))"
The primary issue is that it captures the opening quote in "John \""The Anonymous One\"" Doe". I feel like this should be a lookbehind instead of a lookahead, but that doesn't seem to work at all.
Here are some rules for the expression:
Name must start with a letter and can contain any letter, number, underscore, or hyphen.
Unquoted must have at least one character and can contain any letter, number, underscore, or hyphen.
Quoted value can contain any character including any whitespace and escaped quotes.
Edit:
Here's the result from regex101.com:
(?:(?<=\s)|^)#(?<name>\w+[A-Za-z0-9_-]+?)\s*=\s*(?<value>(?<!")[A-Za-z0-9_-]+|(?=").+?(?=(?<!\\)"))
(?:(?<=\s)|^) Non-capturing group
# matches the character # literally
(?<name>\w+[A-Za-z0-9_-]+?) Named capturing group name
\s* match any white space character [\r\n\t\f ]
= matches the character = literally
\s* match any white space character [\r\n\t\f ]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
(?<value>(?<!")[A-Za-z0-9_-]+|(?=").+?(?=(?<!\\)")) Named capturing group value
1st Alternative: [A-Za-z0-9_-]+
[A-Za-z0-9_-]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
A-Z a single character in the range between A and Z (case sensitive)
a-z a single character in the range between a and z (case sensitive)
0-9 a single character in the range between 0 and 9
_- a single character in the list _- literally
2nd Alternative: (?=").+?(?=(?<!\\)")
(?=") Positive Lookahead - Assert that the regex below can be matched
" matches the characters " literally
.+? matches any character (except newline)
Quantifier: +? Between one and unlimited times, as few times as possible, expanding as needed [lazy]
(?=(?<!\\)") Positive Lookahead - Assert that the regex below can be matched
(?<!\\) Negative Lookbehind - Assert that it is impossible to match the regex below
\\ matches the character \ literally
" matches the characters " literally
You can use a very useful .NET regex feature where multiple same-named captures are allowed. Also, there is an issue with your (?<name>) capture group: it allows a digit in the first position, which does not meet your 1st requirement.
So, I suggest:
(?si)(?:(?<=\s)|^)#(?<name>\w+[a-z0-9_-]+?)\s*=\s*(?:(?<value>[a-z0-9_-]+)|(?:"")?(?<value>.+?)(?=(?<!\\)""))
Note that you cannot debug .NET-specific regexes at regex101.com; you need to test them in a .NET-compliant environment.
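A sketch of how the suggested pattern might be run in C#, relying on .NET allowing the same group name (value) in both alternatives. The helper name ParsePairs is hypothetical, and the input is the sample string from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: collects name/value pairs using the pattern from the answer.
static List<KeyValuePair<string, string>> ParsePairs(string text)
{
    // In a verbatim string, "" stands for a single double-quote character.
    const string pattern =
        @"(?si)(?:(?<=\s)|^)#(?<name>\w+[a-z0-9_-]+?)\s*=\s*(?:(?<value>[a-z0-9_-]+)|(?:"")?(?<value>.+?)(?=(?<!\\)""))";

    var pairs = new List<KeyValuePair<string, string>>();
    foreach (Match m in Regex.Matches(text, pattern))
    {
        // Only one of the two same-named groups takes part in a given match.
        pairs.Add(new KeyValuePair<string, string>(
            m.Groups["name"].Value, m.Groups["value"].Value));
    }
    return pairs;
}

string input = @"This is some text #foo=bar #name=""John \""The Anonymous One\"" Doe"" #age=38";
foreach (var pair in ParsePairs(input))
    Console.WriteLine($"{pair.Key} = {pair.Value}");
```

Because the opening quote is consumed by the optional non-capturing group rather than by the value group, the captured value no longer starts with a quote.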
Use string methods.
Split
string myLongString = @"This is some text #foo=bar #name=""John \""The Anonymous One\"" Doe"" #age=38";
string[] nameValues = myLongString.Split('#');
From there, either use the Split function with "=" or use IndexOf("=") on each element.
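One way the Split/IndexOf idea could be fleshed out, as a sketch only. The helper name SplitPairs is hypothetical, and it assumes that neither the leading text nor any value contains a # or a stray = character.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: splits on '#', then on the first '=' of each part.
static Dictionary<string, string> SplitPairs(string text)
{
    var result = new Dictionary<string, string>();
    foreach (string part in text.Split('#'))
    {
        int eq = part.IndexOf('=');
        if (eq <= 0)
            continue; // leading text without a name=value pair

        string name = part.Substring(0, eq).Trim();
        // Drop surrounding whitespace, then the enclosing quotes (if any).
        string value = part.Substring(eq + 1).Trim().Trim('"');
        result[name] = value;
    }
    return result;
}

string input = @"This is some text #foo=bar #name=""John \""The Anonymous One\"" Doe"" #age=38";
foreach (var pair in SplitPairs(input))
    Console.WriteLine($"{pair.Key} = {pair.Value}");
```

Note that Trim('"') removes every leading and trailing quote, so a quoted value that itself ends with an escaped quote would lose it; the regex approach handles that case more precisely.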

Regex matching recurring patterns

I want to validate input in a C# TextBox by using regular expressions. The expected input is in this format:
CCCCC-CCCCC-CCCCC-CCCCC-CCCCC-CCCCC-C
So I've got six elements of five separated characters and one separated character at the end.
Now my regex matches any character between five and 255 chars: .{5,255}
How do I need to modify it in order to match the format mentioned above?
Update: -
If you want to match any letter or digit, then you can use: -
^(?:[a-zA-Z0-9]{5}-){6}[a-zA-Z0-9]$
Explanation: -
(?: // Non-capturing group
[a-zA-Z0-9]{5} // Match exactly 5 letters or digits
- // Followed by a `-`
){6} // Match the pattern 6 times (ABCD4-) -> 6 times
[a-zA-Z0-9] // At the end match a single letter or digit.
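A minimal sketch of the validation call. In the question's scenario the input would come from the TextBox (for example its Text property); here plain strings stand in for it, and the helper name IsValidKey is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: six groups of five letters/digits plus one trailing letter/digit.
static bool IsValidKey(string input) =>
    Regex.IsMatch(input, @"^(?:[a-zA-Z0-9]{5}-){6}[a-zA-Z0-9]$");

foreach (string sample in new[]
{
    "ABCD4-ABCD4-ABCD4-ABCD4-ABCD4-ABCD4-A",
    "ABCD4-ABCD4-A",
    "ABCD4-ABCD4-ABCD4-ABCD4-ABCD4-ABCD4-AB"
})
{
    Console.WriteLine($"{sample}: {IsValidKey(sample)}");
}
```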
Note: - The regex below will only match a pattern like the one you posted, where the characters within a group are all the same: -
CCCCC-CCCCC-CCCCC-CCCCC-CCCCC-CCCCC-C
You can try this regex: -
^(?:([a-zA-Z0-9])\1{4}-){6}\1$
Explanation: -
(?: // Non-capturing group
( // First capture group
[a-zA-Z0-9] // Match any letter or digit, and capture in group 1
)
\1{4} // Match the same character as in group 1 - 4 times
- // Followed by a `-`
){6} // Match the pattern 6 times (CCCCC-) -> 6 times
\1 // At the end match a single character.
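A quick sketch to show what the backreference version accepts. Group 1 is captured again on every repetition, so \1 always refers to the most recent group; each block must consist of one repeated character, but different blocks may use different characters. The helper name IsRepeatedKey is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper wrapping the backreference pattern from the answer.
static bool IsRepeatedKey(string input) =>
    Regex.IsMatch(input, @"^(?:([a-zA-Z0-9])\1{4}-){6}\1$");

foreach (string sample in new[]
{
    "CCCCC-CCCCC-CCCCC-CCCCC-CCCCC-CCCCC-C",
    "AAAAA-BBBBB-CCCCC-DDDDD-EEEEE-FFFFF-F",
    "AAAAA-BBBBB-CCCCC-DDDDD-EEEEE-FFFFF-A",
    "ABCDE-ABCDE-ABCDE-ABCDE-ABCDE-ABCDE-A"
})
{
    Console.WriteLine($"{sample}: {IsRepeatedKey(sample)}");
}
```

The final \1 has to equal the character of the last five-character block, which is why the third sample is rejected.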
Untested, but I think this will work:
([A-Za-z0-9]{5}-){6}[A-Za-z0-9]
For your example, in general replace C with the character class you want:
^(C{5}-){6}C$
^([a-z]{5}-){6}[a-z]$ # Just letter, use case insensitive modifier
^([a-z0-9]{5}-){6}[a-z0-9]$ # Letters and digits..
Try this:
^(C{5}-){6}C$
The ^ and $ denote the beginning and end of the string respectively and make sure that no additional characters are entered.
