Regex for football teams representing two participants playing against each other - c#

My task is to have:
EventName is a string representing two participants playing against each other.
How can I do this with Data Annotations? So far I have this:
[RegularExpression("^[^0-9]+$", ErrorMessage = "Name cannot contain numbers")]
[Required, MinLength(10), MaxLength(150)]
I think I should only allow one delimiter (either "-", ":" or a space) between the two teams. Anything else should not be allowed: no special characters, no numbers.
Could someone point me in the right direction?
Also, does anyone know if football teams can have digits in their names?
Something like Villa1874 - Levski1914?

To allow a single delimiter (":", "-" or a single space, optionally surrounded by more spaces) and numbers/underscores in the team names:
^\w+ *[ :-] *\w+$
Explanation:
^ asserts position at start of a line
\w+ matches any word character (equal to [a-zA-Z0-9_]) (1 or more)
" *" matches a space character (0 or more)
[ :-] matches a single character present in the list: space, colon or hyphen. The hyphen goes last so that it is literal; written as [ -:] it would be a range from space to colon, which also covers digits and most punctuation.
" *" matches a space character (0 or more)
\w+ matches any word character (equal to [a-zA-Z0-9_]) (1 or more)
$ asserts position at end of a line
Test: https://regex101.com/r/gBZDLE/6
If you don't want to allow numbers or underscores in the team names, you can replace \w with [A-Za-z] (with the hyphen last in the delimiter class so that it stays literal):
^[A-Za-z]+ *[ :-] *[A-Za-z]+$
Test: https://regex101.com/r/gBZDLE/7
The previous pattern doesn't allow spaces in the team names. To allow them, match [A-Za-z]+(?: [A-Za-z]+)* instead. This matches any name that consists of one or more alphabetic words separated by single spaces (e.g. "Manchester" or "Manchester United"). Full regex:
^[A-Za-z]+(?: [A-Za-z]+)* *[ :-] *[A-Za-z]+(?: [A-Za-z]+)*$
Test: https://regex101.com/r/gBZDLE/5
Explanation of [A-Za-z]+(?: [A-Za-z]+)*:
[A-Za-z]+ matches an alphabetic character, one or more times
(?: [A-Za-z]+)* is a non-capturing group (?: ) that matches a single space followed by one or more alphabetic characters; the trailing * repeats the group zero or more times
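Since the question asks about Data Annotations, here is a minimal sketch of trying the letters-only pattern through RegularExpressionAttribute, the class behind [RegularExpression(...)]. The variable names and the error message are made up for illustration, and the delimiter class is written as [ :-]. Note that because a space is both a valid delimiter and allowed inside a name, "Manchester United" on its own also passes (it is read as two one-word teams).

```csharp
using System;
using System.ComponentModel.DataAnnotations;

// Letters-only team names (words separated by single spaces), one delimiter
// out of space, ':' or '-', optionally padded with spaces.
const string EventNamePattern =
    @"^[A-Za-z]+(?: [A-Za-z]+)* *[ :-] *[A-Za-z]+(?: [A-Za-z]+)*$";

// Calling IsValid directly lets us try the pattern without a model class.
// On a model property the same pattern would go into [RegularExpression(...)].
var validator = new RegularExpressionAttribute(EventNamePattern)
{
    ErrorMessage = "Expected two team names separated by '-', ':' or a space" // hypothetical message
};

Console.WriteLine(validator.IsValid("Manchester United - Levski Sofia")); // True
Console.WriteLine(validator.IsValid("Villa1874 - Levski1914"));           // False: digits
Console.WriteLine(validator.IsValid("Villa -- Levski"));                  // False: two delimiters
```

RegularExpressionAttribute only accepts a value when the match covers the whole string, so the ^ and $ anchors are redundant there, but they keep the pattern usable with Regex.IsMatch as well.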

Related

C# regex string that is not another string

I want to match an at least 3 letter word, preceded by any character from class [-_ :] any amount of times, that is not this specific 3 letter word string2.
Ex:
if string2="VER"
in
" ODO VER7"
matched " ODO"
or
"_::ATTPQ VER7"
matched "_::ATTPQ"
but if
" VER7"
it shouldn't match " VER"
so I thought about
Regex.Match(inputString, @"[-_:]*[A-Z]{3,}[^(VER)]", RegexOptions.IgnoreCase);
where
[-_:]* checks for any character in class, appearing 0 or more times
[A-Z] the range of letters that could form the word
{3,} the minimum amount of letters to form the word
[^(VER)] the grouping construct that shouldn't appear
I believe however that [A-Z]{3,} results in any letter at least 3 times (not what i want)
and [^(VER)] not sure what it's doing
Using [^(VER)] means a negated character class where you would match any character except ( ) V E or R
For your example data, you could match 0+ spaces or tabs (or use \s to also match a newline).
Then use a negative lookahead before matching A-Z 3 or more times, to assert that what is on the right is not VER.
If that assertion holds, match A-Z 3 or more times, followed by a space and VER itself.
^[ \t]*[-_:]*(?!VER)[A-Z]{3,} VER
^\s*[-_:]*(?!VER)[A-Z]{3,}
This regex matches, from the start of the string, optional whitespace and zero or more of your characters, followed by at least 3 letters. It uses a negative lookahead to make sure that those letters do not start with VER (or whatever you want to exclude).
This matches one or more of the class characters [-_ :] followed by 3 or more letters/digits
that do not start with VER (as in the samples given):
[-_ :]+(?!VER)[^\W_]{3,}
https://regex101.com/r/wLw23I/1
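A small sketch of how the negative-lookahead idea from these answers could be wrapped in C# when the excluded word is only known at run time (string2 in the question). The helper name MatchWordExcept is hypothetical; the pattern is anchored at the start of the string and uses the [-_ :] class from the question text.

```csharp
using System;
using System.Text.RegularExpressions;

// Returns the prefix characters plus the word, or "" when the first word
// starts with the excluded string. Regex.Escape keeps the excluded word literal.
static string MatchWordExcept(string input, string excluded)
{
    var pattern = @"^[-_ :]*(?!" + Regex.Escape(excluded) + @")[A-Z]{3,}";
    var match = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
    return match.Success ? match.Value : "";
}

Console.WriteLine("[" + MatchWordExcept(" ODO VER7", "VER") + "]");     // [ ODO]
Console.WriteLine("[" + MatchWordExcept("_::ATTPQ VER7", "VER") + "]"); // [_::ATTPQ]
Console.WriteLine("[" + MatchWordExcept(" VER7", "VER") + "]");         // []
```

As with the patterns above, the lookahead rejects any word that merely starts with the excluded string, so a longer word such as VERONA would be rejected too.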

Regular expression to match 1-5 symbols when start symbol letter and at least one number

I tried this expression:
/([a-z]+[0-9]+[a-z]*){1,5}$/
but it matches every word that starts with a letter and contains at least one number, regardless of its length, for example "re1111e", when it's not supposed to. What am I doing wrong?
One possible way to write your regex uses a positive lookahead to check for a number:
/(?=[^0-9]*[0-9])[a-z][a-z0-9]{0,4}/
This pattern says to:
(?=[^0-9]*[0-9]) assert that a single digit appears somewhere
[a-z] match an initial letter character
[a-z0-9]{0,4} then match zero to four letter or number characters
In your pattern, the quantifier {1,5} applies to the group, repeating the match of [a-z]+[0-9]+[a-z]* 1 to 5 times.
If I am not mistaken, you want to match [a-z] at the start, followed by up to 4 more characters of which at least one is a digit, so the minimum length is 2 and the maximum is 5.
You might use:
^(?=.{2,5}$)[a-z][a-z0-9]*[0-9][a-z0-9]*$
About the pattern
^ Start of string
(?=.{2,5}$) Assert string length 2 - 5 characters
[a-z] Match a-z
[a-z0-9]* Repeat 0+ times matching a-z 0-9
[0-9] Match a digit
[a-z0-9]* Repeat 0+ times matching a-z 0-9
$ End of string
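To see the anchored pattern in action, here is a short C# check against a few inputs, including the "re1111e" example from the question. The variable names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

// Length 2-5, first character a lowercase letter, at least one digit after it.
var codePattern = new Regex(@"^(?=.{2,5}$)[a-z][a-z0-9]*[0-9][a-z0-9]*$");

foreach (var candidate in new[] { "a1", "ab1cd", "re1111e", "abc", "1abc" })
{
    // Prints True for "a1" and "ab1cd", False for the other three.
    Console.WriteLine($"{candidate}: {codePattern.IsMatch(candidate)}");
}
```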

Parsing text between quotes with .NET regular expressions

I have the following input text:
@"This is some text #foo=bar #name=""John \""The Anonymous One\"" Doe"" #age=38"
I would like to parse the values with the #name=value syntax as name/value pairs. Parsing the previous string should result in the following named captures:
name:"foo"
value:"bar"
name:"name"
value:"John \""The Anonymous One\"" Doe"
name:"age"
value:"38"
I tried the following regex, which got me almost there:
@"(?:(?<=\s)|^)#(?<name>\w+[A-Za-z0-9_-]+?)\s*=\s*(?<value>[A-Za-z0-9_-]+|(?="").+?(?=(?<!\\)""))"
The primary issue is that it captures the opening quote in "John \""The Anonymous One\"" Doe". I feel like this should be a lookbehind instead of a lookahead, but that doesn't seem to work at all.
Here are some rules for the expression:
Name must start with a letter and can contain any letter, number, underscore, or hyphen.
Unquoted must have at least one character and can contain any letter, number, underscore, or hyphen.
Quoted value can contain any character including any whitespace and escaped quotes.
Edit:
Here's the result from regex101.com:
(?:(?<=\s)|^)#(?<name>\w+[A-Za-z0-9_-]+?)\s*=\s*(?<value>(?<!")[A-Za-z0-9_-]+|(?=").+?(?=(?<!\\)"))
(?:(?<=\s)|^) Non-capturing group
# matches the character # literally
(?<name>\w+[A-Za-z0-9_-]+?) Named capturing group name
\s* match any white space character [\r\n\t\f ]
= matches the character = literally
\s* match any white space character [\r\n\t\f ]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
(?<value>(?<!")[A-Za-z0-9_-]+|(?=").+?(?=(?<!\\)")) Named capturing group value
1st Alternative: [A-Za-z0-9_-]+
[A-Za-z0-9_-]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
A-Z a single character in the range between A and Z (case sensitive)
a-z a single character in the range between a and z (case sensitive)
0-9 a single character in the range between 0 and 9
_- a single character in the list _- literally
2nd Alternative: (?=").+?(?=(?<!\\)")
(?=") Positive Lookahead - Assert that the regex below can be matched
" matches the characters " literally
.+? matches any character (except newline)
Quantifier: +? Between one and unlimited times, as few times as possible, expanding as needed [lazy]
(?=(?<!\\)") Positive Lookahead - Assert that the regex below can be matched
(?<!\\) Negative Lookbehind - Assert that it is impossible to match the regex below
\\ matches the character \ literally
" matches the characters " literally
You can use a very useful .NET regex feature where multiple same-named captures are allowed. Also, there is an issue with your (?<name>) capture group: it allows a digit in the first position, which does not meet your 1st requirement.
So, I suggest:
(?si)(?:(?<=\s)|^)#(?<name>\w+[a-z0-9_-]+?)\s*=\s*(?:(?<value>[a-z0-9_-]+)|(?:"")?(?<value>.+?)(?=(?<!\\)""))
Note that you cannot debug .NET-specific regexes at regex101.com, you need to test them in .NET-compliant environment.
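As an illustration of the same-named capture feature, here is a sketch that uses two (?<value>...) groups, one for quoted and one for unquoted values. It is a variant written for this example, not the exact pattern suggested above: the name group is changed to [A-Za-z][A-Za-z0-9_-]* so that it must start with a letter, and the quoted alternative consumes the opening quote itself and treats a backslash as escaping the next character.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

const string Input =
    @"This is some text #foo=bar #name=""John \""The Anonymous One\"" Doe"" #age=38";

// Both alternatives capture into the same group name "value" (.NET allows this).
var pairPattern = new Regex(
    @"(?<=^|\s)#(?<name>[A-Za-z][A-Za-z0-9_-]*)\s*=\s*" +
    @"(?:""(?<value>(?:\\.|[^""\\])*)""|(?<value>[A-Za-z0-9_-]+))",
    RegexOptions.Singleline);

var pairs = new List<(string Name, string Value)>();
foreach (Match m in pairPattern.Matches(Input))
{
    pairs.Add((m.Groups["name"].Value, m.Groups["value"].Value));
}

foreach (var (name, value) in pairs)
{
    Console.WriteLine($"{name} => {value}");
}
// foo => bar
// name => John \"The Anonymous One\" Doe
// age => 38
```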
Use string methods.
Split
string myLongString = @"This is some text #foo=bar #name=""John \""The Anonymous One\"" Doe"" #age=38";
string[] nameValues = myLongString.Split('#');
From there either use Split function with "=" or use IndexOf("=").

Regular Expression for no repeating special characters (C#)

I am new to regular expressions and need a regular expression for an address field in which the user cannot enter repeating special characters such as ..... or ,,,.../// etc., and in which no special character may be entered more than 5 times in the string.
...,,,....// =>No Match
Street no. 40. hello. =>Match
Thanks in advance!
I have tried this:
([a-zA-Z]+|[\s\,\.\/\-]+|[\d]+)|(\(([\da-zA-Z]|[^)^(]+){1,}\))
It selects all alphanumeric and some special characters, with no empty brackets.
You can use a negative lookahead, which asserts what must not match. Its format is (?! ... )
For your case you can try something like this:
This will not match the input string if it has 2 or more consecutive dots, commas or slashes (or any combination of them):
(?!.*[.,\/]{2}) ... rest of the regex
This will not match the input string if it contains the character 'A' 5 or more times (use {6} if exactly 5 should still be allowed):
(?!(.*A.*){5}) ... rest of the regex
This will match everything except your restrictions. Replace the last part (.*) with your regex.
^(?!.*[.,\/]{2})(?!(.*\..*){5})(?!(.*,.*){5})(?!(.*\/.*){5}).*$
Note: This regex may not be optimized. It may be faster to loop over the string's characters and count their occurrences.
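The loop-based alternative mentioned in the note could look roughly like this. It is a sketch under stated assumptions: the helper name IsValidAddress, the set of special characters (".,/-") and the limit of 5 per character are illustrative defaults, and any two adjacent special characters are treated as invalid, as in the first lookahead above.

```csharp
using System;
using System.Collections.Generic;

// Rejects text with two special characters in a row, or with any single
// special character used more than maxPerChar times.
static bool IsValidAddress(string text, string specials = ".,/-", int maxPerChar = 5)
{
    var counts = new Dictionary<char, int>();
    char previous = '\0';
    foreach (char c in text)
    {
        if (specials.IndexOf(c) >= 0)
        {
            if (specials.IndexOf(previous) >= 0) return false; // adjacent specials
            counts.TryGetValue(c, out int seen);               // seen stays 0 if absent
            if (seen + 1 > maxPerChar) return false;           // too many of this char
            counts[c] = seen + 1;
        }
        previous = c;
    }
    return true;
}

Console.WriteLine(IsValidAddress("Street no. 40. hello."));           // True
Console.WriteLine(IsValidAddress("...,,,....//"));                    // False
Console.WriteLine(IsValidAddress("1/2, 3/4, 5/6, 7/8, 9/10, 11/12")); // False: six slashes
```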
You can use this regex:
^(?![^,./-]*([,./-])\1)(?![^,./-]*([,./-])(?:[^,./-]*\2){4})[ \da-z,./-]+$
In C#:
foundMatch = Regex.IsMatch(yourString, @"^(?![^,./-]*([,./-])\1)(?![^,./-]*([,./-])(?:[^,./-]*\2){4})[ \da-z,./-]+$", RegexOptions.IgnoreCase);
Explanation
The ^ anchor asserts that we are at the beginning of the string
The negative lookahead (?![^,./-]*([,./-])\1) asserts that it is not possible to match any number of non-special chars, followed by one special char (captured to Group 1), followed by the same special char (the \1 backreference)
The negative lookahead (?![^,./-]*([,./-])(?:[^,./-]*\2){4}) asserts that it is not possible to match any number of non-special chars, followed by one special char (captured to Group 2), then any number of non-special chars and that same char from Group 2, four times (five times total)
Note that [^,./-]* cannot skip over a special char, so both lookaheads only inspect the first special char that occurs in the string.
The $ anchor asserts that we are at the end of the string
A regular expression string to detect invalid strings is:
[^\w \-\r\n]{2}|(?:[\w \-]+[^\w \-\r\n]){5}
As C# string literal (regular and verbatim):
"[^\\w \\-\\r\\n]{2}|(?:[\\w \\-]+[^\\w \\-\\r\\n]){5}"
@"[^\w \-\r\n]{2}|(?:[\w \-]+[^\w \-\r\n]){5}"
It is much easier to find a string than to validate if a string does not contain ...
It can be checked with this expression if the string entered by the user is invalid because of a match of 2 special characters in sequence OR 5 special characters used in the string.
Explanation:
[^...] ... a negative character class definition which matches any character NOT being one of the characters listed within the square brackets.
\w ... a word character which is either a letter, a digit or an underscore.
The next character is simply a space character.
\- ... the hyphen character which must be escaped with a backslash within square brackets as otherwise the hyphen character would be interpreted as "FROM x TO z" (except when being the first or the last character within the square brackets).
\r ... carriage return
\n ... line-feed
Therefore [^\w \-\r\n] finds a character which is NOT a letter, NOT a digit, NOT an underscore, NOT a space, NOT a hyphen, NOT a carriage return and also NOT a line-feed.
{2} ... the preceding expression must match 2 such characters.
So with the expression [^\w \-\r\n]{2} it can be checked if the string contains 2 special characters in a sequence which makes the string invalid.
| ... OR
(?:...) ... a non-capturing group, needed here to apply the quantifier {5} to the whole expression inside, i.e. to match it 5 times in a row.
[...] ... a positive character class definition which matches any character being one of the characters listed within the square brackets.
[\w \-]+ ... find a word character, or a space, or a hyphen 1 or more times.
[^\w \-\r\n] ... and next character being NOT a word character, space, hyphen, carriage return or line-feed.
Therefore (?:[\w \-]+[^\w \-\r\n]){5} finds a string with 5 "special" characters between "standard" characters.
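A brief sketch of how this "detect invalid" expression might be used from C#: a match means the input is rejected. The helper name IsInvalid and the sample inputs are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// A match means the text is INVALID: two special characters in a row, or
// five runs of standard characters each followed by a special character.
var invalidPattern = new Regex(@"[^\w \-\r\n]{2}|(?:[\w \-]+[^\w \-\r\n]){5}");

bool IsInvalid(string text) => invalidPattern.IsMatch(text);

Console.WriteLine(IsInvalid("Street no. 40. hello.")); // False: accepted
Console.WriteLine(IsInvalid("...,,,....//"));          // True: specials in sequence
Console.WriteLine(IsInvalid("a.b.c.d.e."));            // True: five specials
```

The check is inverted compared with the other answers, so a validation routine would accept the input when IsInvalid returns false.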

What regex for matching words with keyword '('?

In my c# code I need to get a word if the words before match specific words:
var match= Regex.Match(someLine, @"^(FIRST WORDS) (\w+) (SECOND WORDS | PROBLEM KEYWORD \() (\w+)", RegexOptions.IgnoreCase);
var neededWord= match.Groups[4].Value;
If the string equals "FIRST WORDS SOME WORDS PROBLEM KEYWORD (SOMETHING AGAIN)", I would like to get 'SOMETHING' as my needed word. But this does not work. It returns an empty string.
What am I doing wrong?
^FIRST WORDS[^\(]+\(([^\)]+)\)
Description
^ asserts position at start of the string
FIRST WORDS matches the characters FIRST WORDS literally (case sensitive)
[^\(]+ matches one or more characters other than ( (greedy: as many as possible, giving back as needed)
\( matches the character ( literally
([^\)]+) 1st capturing group: one or more characters other than ) (greedy)
\) matches the character ) literally
Note: Group 1 will contain the text between the parentheses (here SOMETHING AGAIN). If you need only the word SOMETHING, I can edit the regex.
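For reference, a short C# sketch applying this pattern to the string from the question. The second pattern, which captures only the first word inside the parentheses with (\w+), is an assumption about what "only the word SOMETHING" would look like, not something the answer spelled out.

```csharp
using System;
using System.Text.RegularExpressions;

var line = "FIRST WORDS SOME WORDS PROBLEM KEYWORD (SOMETHING AGAIN)";

// Everything between the parentheses.
var whole = Regex.Match(line, @"^FIRST WORDS[^(]+\(([^)]+)\)", RegexOptions.IgnoreCase);

// Only the first word after the opening parenthesis (assumed variant).
var firstWord = Regex.Match(line, @"^FIRST WORDS[^(]+\((\w+)", RegexOptions.IgnoreCase);

Console.WriteLine(whole.Groups[1].Value);     // SOMETHING AGAIN
Console.WriteLine(firstWord.Groups[1].Value); // SOMETHING
```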
