This seems like it should be easy, but I'm not so good with regex, and this doesn't seem to be easy to find on google.
I need a regex for strings that start with 'Sp-' followed by multiple digits and end with '-' followed by multiple digits.
For example, I have to match '-12' in "Sp-1234-12".
My attempt was [^*-]*$, which matches everything after the last minus, but I need the minus included.
For that digit and hyphen format, you could use a capture group for the part of the string that you want:
^Sp(?:-\d+)*(-\d+)$
Explanation
^ Start of string
Sp Match literally
(?:-\d+)* Optionally repeat - and 1+ digits
(-\d+) Capture group 1, match - and 1+ digits
$ End of string
Note that in .NET, \d also matches Unicode decimal digits from other scripts; use [0-9] (or RegexOptions.ECMAScript) to match only the ASCII digits 0-9.
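To illustrate the capture-group approach above, here is a minimal C# sketch that reads group 1 from the match. The class and method names (SuffixDemo, LastDashNumber) are made up for this example, and it uses [0-9] rather than \d so that only ASCII digits are accepted.

```csharp
using System.Text.RegularExpressions;

public static class SuffixDemo
{
    // Hypothetical helper: returns the trailing "-digits" part of an
    // "Sp-123-45" style string, or "" when the input has a different shape.
    public static string LastDashNumber(string input)
    {
        Match m = Regex.Match(input, @"^Sp(?:-[0-9]+)*(-[0-9]+)$");
        // Group 1 holds the last hyphen plus the digits that follow it.
        return m.Success ? m.Groups[1].Value : "";
    }
}
```

Calling SuffixDemo.LastDashNumber("Sp-1234-12") should give "-12", while an input such as "Sp-12-x" gives an empty string because the whole pattern fails.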
Related
How can I get the string after the last comma or last number using regex for these examples:
"Flat 1, Asker Horse Sports" -- get the string after ",", result: "Asker Horse Sports"
"9 Walkers Barn" -- get the string after "9", result: "Walkers Barn"
I need the regex to support both cases, or two different regex rules, one per case.
I tried /,[^,]*$/ and (.*),[^,]*$ to get the strings after the last comma but no luck.
You can use
[^,\d\s][^,\d]*$
Details
[^,\d\s] - any char other than a comma, a digit or whitespace
[^,\d]* - zero or more chars other than a comma or a digit
$ - end of string.
In C#, you may also tell the regex engine to search for the match from the end of the string with the RegexOptions.RightToLeft option. This can make matching more efficient, although it might not be necessary in this case if the input strings are short:
var output = Regex.Match(text, @"[^,\d\s][^,\d]*$", RegexOptions.RightToLeft)?.Value;
You were on the right track with the capture group in (.*),[^,]*$, but the group should be the part that you are looking for.
If there has to be a comma or digit present, you could match until the last occurrence of either of them, and capture what follows in the capturing group.
^.*[\d,]\s*(.+)$
^ Start of string
.* Match any char except a newline 0+ times
[\d,] Match either , or a digit
\s* Match 0+ whitespace chars
(.+) Capture group 1, match any char except a newline 1+ times
$ End of string
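As a rough sketch of how the capturing-group variant could be used from C#, the helper below returns group 1 for both sample inputs. The names TailDemo and AfterLastCommaOrDigit are invented for illustration.

```csharp
using System.Text.RegularExpressions;

public static class TailDemo
{
    // Hypothetical helper: text after the last comma or digit,
    // or "" when the input contains neither.
    public static string AfterLastCommaOrDigit(string input)
    {
        // The greedy .* backtracks to the last comma or digit;
        // \s* then skips whitespace before the captured part.
        Match m = Regex.Match(input, @"^.*[\d,]\s*(.+)$");
        return m.Success ? m.Groups[1].Value : "";
    }
}
```

With this, "Flat 1, Asker Horse Sports" should yield "Asker Horse Sports" and "9 Walkers Barn" should yield "Walkers Barn".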
I need to find a Regex that gets hold of the
81.03
part (varies, but always has the structure XX.XX) in following string variations:
Projects/75100/75120/75124/AR1/75124_AR1_HM2_81.03-testing-b405.tgz
Projects/75100/75130/75138/LM1/75138_LM1_HM2_81.03.tgz
I've come up with:
var regex = new Regex("(.*_)(.*?)-");
but this only matches up to the first example string whereas
var regex = new Regex(@"(.*_)(.*?)(.*\.)");
only matches the second string.
The path to the file constantly changes as does the "-testing..." postfix.
Any ideas to point me out in the right direction?
You can use
var result = Regex.Match(text, @".*_(\d+\.\d+)")?.Groups[1].Value;
Or, if the string can have more dot+number parts:
var result = Regex.Match(text, @".*_(\d+(?:\.\d+)+)")?.Groups[1].Value;
In general, the regex will extract dot-separated digit chunks after the last _.
Details
.* - any 0 or more chars other than a newline, as many as possible
_ - a _ char
(\d+(?:\.\d+)+) - Group 1: one or more digits followed with one or more occurrences of a dot followed with one or more digits
\d+\.\d+ - one or more digits, . and one or more digits.
To match the value 81.03, another option is to match digits with an optional decimal part that directly follow an underscore in the last path segment, i.e. with no forward slash between the match and the end of the string.
_(\d+(?:\.\d+)?)[^/\r\n]*$
Explanation
_ Match literally
(\d+(?:\.\d+)?) Capture group 1, match 1+ digits with an optional decimal part
[^/\r\n]* Match 0+ chars except / or a newline
$ End of string
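A small sketch of how the pattern explained just above might be applied to the two sample paths from the question. VersionDemo and ExtractVersion are hypothetical names; the helper returns "" when nothing matches.

```csharp
using System.Text.RegularExpressions;

public static class VersionDemo
{
    // Hypothetical helper: pulls the "81.03"-style number out of the
    // file-name part of a path, or returns "" when there is none.
    public static string ExtractVersion(string path)
    {
        // [^/\r\n]*$ keeps the match inside the last path segment.
        Match m = Regex.Match(path, @"_(\d+(?:\.\d+)?)[^/\r\n]*$");
        return m.Success ? m.Groups[1].Value : "";
    }
}
```

Both "Projects/75100/75120/75124/AR1/75124_AR1_HM2_81.03-testing-b405.tgz" and "Projects/75100/75130/75138/LM1/75138_LM1_HM2_81.03.tgz" should give "81.03". Note that the regex engine returns the leftmost match, so the first underscore in the file name that is followed by a digit is the one that counts.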
I'm scanning a string and a period is allowed but if there is a period it has to be in the following format alphanumber.numeric or numeric.numeric. Here are some possible acceptable formats:
5555.1312
ajfdkd.555
Here is what I have so far:
private const string containsPeroidRegularExpress = @"([a-zA-Z]+\.[0-9]+)|([0-9]+\.[0-9]+)";
validator.RuleFor(x => x.myString)
.Matches(containsPeroidRegularExpress)
.When(x => x.myString.Contains("."), ApplyConditionTo.CurrentValidator)
When you have an example like this it works fine:
This is my example 1 555.1212
But in this example it does not
This is my example 2 555.1212 .
You can see the extra period at the end of the 2nd example. It should fail validation because the extra period is not in the format stated above. The 1st example should pass validation. Both pass the validation though.
Your pattern is still capturing exactly what you want, however it doesn't "know" that it needs to keep going.
private const string containsPeroidRegularExpress =
    @"^([a-zA-Z]+\.[0-9]+)$|^([0-9]+\.[0-9]+)$";
The $ tells it to check right up until the end of the line (I also added ^ to tell it to start at the beginning for completeness so that ". 555.1212" doesn't pass as well).
I definitely won't say this is the best solution. As others mention, you can definitely simplify it. However regex isn't my forte...
I also noticed you mention that the pattern could be alphanumber.numeric. Your pattern does not allow both alpha and numeric characters mixed in the first part. You could use the following:
private const string containsPeroidRegularExpress =
    @"^([a-zA-Z0-9]+\.[0-9]+)$|^([0-9]+\.[0-9]+)$";
You might check that after matching the value, there is no space followed by a dot on the right.
You can shorten the pattern a bit by matching either 1+ digits or 1+ chars a-zA-Z, and then a dot and 1+ digits:
(?<!\.[^\S\r\n]+)\b(?:[a-zA-Z]+|[0-9]+)\.[0-9]+\b(?![^\S\r\n]+\.)
The pattern matches
(?<! Negative lookbehind, assert what is on the left is not
\.[^\S\r\n]+ Match a dot and 1+ whitespace chars without a newline
) Close lookbehind
\b Word boundary
(?: Non capture group
[a-zA-Z]+|[0-9]+ Match either 1+ chars a-zA-Z or 1+ digits
) Close group
\.[0-9]+ Match a dot and 1+ digits 0-9
\b Word boundary
(?! Negative lookahead, assert that on the right is not
[^\S\r\n]+\. Match 1+ whitespaces without newlines followed by a dot
) Close lookahead
If you want to match mixed char a-zA-Z and digits:
(?<!\.[^\S\r\n]+)\b[a-zA-Z0-9]+\.[0-9]+\b(?![^\S\r\n]+\.)
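Here is a minimal sketch of the mixed letters/digits pattern used with Regex.IsMatch. PeriodDemo and HasValidPeriodToken are names I made up. It relies on .NET supporting variable-length lookbehind, and it only checks that at least one well-formed token without a stray neighbouring dot is present.

```csharp
using System.Text.RegularExpressions;

public static class PeriodDemo
{
    // The mixed letters/digits pattern: a token such as abc.123 or 555.1212
    // that has no lone dot (separated by spaces or tabs) on either side.
    private const string Pattern =
        @"(?<!\.[^\S\r\n]+)\b[a-zA-Z0-9]+\.[0-9]+\b(?![^\S\r\n]+\.)";

    // Hypothetical helper: true when the input contains such a token.
    public static bool HasValidPeriodToken(string input)
    {
        return Regex.IsMatch(input, Pattern);
    }
}
```

For the question's examples, "This is my example 1 555.1212" should pass and "This is my example 2 555.1212 ." should fail, because the negative lookahead sees the space and dot after the number.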
I have a phone number field with the following regex:
[RegularExpression(@"^[0-9]{10,10}$")]
This checks that the input is exactly 10 numeric characters. How should I change the regex to allow spaces, so that all of the following examples validate?
1234567890
12 34567890
123 456 7890
cheers!
This works:
^(?:\s*\d\s*){10,10}$
Explanation:
^ - start line
(?: - start noncapturing group
\s* - any spaces
\d - a digit
\s* - any spaces
) - end noncapturing group
{10,10} - repeat exactly 10 times
$ - end line
This way of constructing this regex is also fairly extensible in case you will have to ignore any other characters.
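A quick sketch for checking that pattern against the examples in the question. PhoneDemo and IsTenDigitPhone are hypothetical names; I wrote {10} instead of {10,10} (they mean the same thing) and [0-9] instead of \d to stick to ASCII digits.

```csharp
using System.Text.RegularExpressions;

public static class PhoneDemo
{
    // Hypothetical helper: true when the input holds exactly ten digits,
    // with any amount of whitespace before, between or after them.
    public static bool IsTenDigitPhone(string input)
    {
        return Regex.IsMatch(input, @"^(?:\s*[0-9]\s*){10}$");
    }
}
```

"1234567890", "12 34567890" and "123 456 7890" should all be accepted, while nine or eleven digits, or a separator such as "-", should be rejected.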
Use this:
^([\s]*\d){10}\s*$
I cheated :) I just modified this regex here:
Regular expression to count number of commas in a string
I tested. It works fine for me.
Use this simple regex
var matches = Regex.Matches(inputString, @"([\s\d]{10})");
EDIT
var matches = Regex.Matches(inputString, @"^((?:\s*\d){10})$");
explain:
^ the beginning of the string
(?: ){10} group, but do not capture (10 times):
\s* whitespace (0 or more times, matching the most amount possible)
\d digits (0-9)
$ before an optional \n, and the end of the string
Depending on your problem, you might consider using a Match Evaluator delegate, as described in http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.matchevaluator.aspx
That would make short work of the issue of counting digits and/or spaces
Something like this, I think: ^\d{2}\s?\d\s?\d{3}\s?\d{4}$
The variants are: 10 digits, or 2 digits, a space and 8 digits, or 3 digits, a space, 3 digits, a space and 4 digits.
But if you want only these three variants, use something like this, with the alternatives grouped so that ^ and $ apply to each of them:
^(?:\d{10}|\d{2}\s\d{8}|\d{3}\s\d{3}\s\d{4})$
I need to match this string 011Q-0SH3-936729 but not 345376346 or asfsdfgsfsdf
It has to contain characters AND numbers AND dashes
Pattern could be 011Q-0SH3-936729 or 011Q-0SH3-936729-SDF3 or 000-222-AAAA or 011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729 and I want it to be able to match anyone of those. Reason for this is that I don't really know if the format is fixed and I have no way of finding out either so I need to come up with a generic solution for a pattern with any number of dashes and the pattern recurring any number of times.
Sorry this is probably a stupid question, but I really suck at Regular expressions.
TIA
foundMatch = Regex.IsMatch(subjectString,
    @"^                 # Start of the string
    (?=.*\p{L})         # Assert that there is at least one letter
    (?=.*\p{N})         # and at least one digit
    (?=.*-)             # and at least one dash.
    [\p{L}\p{N}-]*      # Match a string of letters, digits and dashes
    $                   # until the end of the string.",
    RegexOptions.IgnorePatternWhitespace);
should do what you want. If by letters/digits you meant "only ASCII letters/digits" (and not international/Unicode letters, too), then use
foundMatch = Regex.IsMatch(subjectString,
    @"^                 # Start of the string
    (?=.*[A-Z])         # Assert that there is at least one letter
    (?=.*[0-9])         # and at least one digit
    (?=.*-)             # and at least one dash.
    [A-Z0-9-]*          # Match a string of letters, digits and dashes
    $                   # until the end of the string.",
    RegexOptions.IgnorePatternWhitespace | RegexOptions.IgnoreCase);
EDIT:
This will match any of the keys provided in your comments:
^[0-9A-Z]+(-[0-9A-Z]+)+$
This means the key starts with a digit or letter and has at least one dash.
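For what it's worth, here is a small sketch of that pattern in C#, with KeyDemo and LooksLikeKey as made-up names. One caveat: the pattern only requires dash-separated groups of uppercase letters and digits, so it does not by itself insist on both letters and digits being present.

```csharp
using System.Text.RegularExpressions;

public static class KeyDemo
{
    // Hypothetical helper: true for dash-separated groups of
    // uppercase ASCII letters and digits, e.g. 011Q-0SH3-936729.
    public static bool LooksLikeKey(string input)
    {
        return Regex.IsMatch(input, @"^[0-9A-Z]+(-[0-9A-Z]+)+$");
    }
}
```

"011Q-0SH3-936729" and "000-222-AAAA" should be accepted, and "345376346" and "asfsdfgsfsdf" rejected since they contain no dash. A digits-only key such as "000-222" is accepted too, which may or may not be what you want.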
Without more info about the regularity of the dashes or otherwise, this is the best we can do:
Regex.IsMatch(input, @"[A-Z0-9\-]+\-[A-Z0-9]")
Although this will also match -A-0
Most naive implementation EVER (might get you started):
([0-9]|[A-Z])+(-)([0-9]|[A-Z])+(-)([0-9]|[A-Z])+
Tested with Regex Coach.
EDIT:
That does match only three groups; here another, slightly better:
([0-9A-Z]+\-)+([0-9A-Z]+)
Are you applying the regex to a whole string (i.e., validating or filtering)? If so, Tim's answer should put you right. But if you're plucking matches from a larger string, it gets a bit more complicated. Here's how I would do that:
string input = @"Pattern could be 011Q-0SH3-936729 or 011Q-0SH3-936729-SDF3 or 000-222-AAAA or 011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729 but not 345-3763-46 or ASFS-DFGS-FSDF or ASD123FGH987.";
Regex pluckingRegex = new Regex(
    @"(?<!\S)           # start of 'word'
    (?=\S*\p{L})        # contains a letter
    (?=\S*\p{N})        # contains a digit
    (?=\S*-)            # contains a hyphen
    [\p{L}\p{N}-]+      # gobble up letters, digits and hyphens only
    (?!\S)              # end of 'word'
    ", RegexOptions.IgnorePatternWhitespace);
foreach (Match m in pluckingRegex.Matches(input))
{
    Console.WriteLine(m.Value);
}
output: 011Q-0SH3-936729
011Q-0SH3-936729-SDF3
000-222-AAAA
011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729-011Q-0SH3-936729
The negative lookarounds serve as 'word' boundaries: they ensure the matched substring starts either at the beginning of the string or after a whitespace character ((?<!\S)), and ends either at the end of the string or before a whitespace character ((?!\S)).
The three positive lookaheads work just like Tim's, except they use \S* to skip whatever precedes the first letter/digit/hyphen. We can't use .* in this case because that would allow it to skip to the next word, or the next, etc., defeating the purpose of the lookahead.