Regex pattern mismatches while checking time format - c#

I would love to accept the following formats:
"23:59", "2:3:04", "02:00:09", "23:07:00"
with this regex pattern:
return Regex.IsMatch(timeIn, @"^([0-2]?[0-9]:[0-5]?[0-9])|([0-2]?[0-9]:[0-5]?[0-9]:[0-5]?[0-9])$") ? true : false;
Unfortunately, it accepts other formats as well, e.g. 00:00:99.
What am I doing wrong? Thank you.

You are missing a set of enclosing parentheses around the whole expression, between the start-of-line and end-of-line anchors:
^(([0-2]?[0-9]:[0-5]?[0-9])|([0-2]?[0-9]:[0-5]?[0-9]:[0-5]?[0-9]))$
Without the enclosing parentheses, your regex was essentially saying:
match ^([0-2]?[0-9]:[0-5]?[0-9])
or ([0-2]?[0-9]:[0-5]?[0-9]:[0-5]?[0-9])$
So, 00:00:99 would be valid, matching against the first part of the regex. And, so would something like: 99:00:00:00, which would match against the second part.
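A minimal sketch of the difference, assuming a hypothetical helper class AnchorDemo (the class and method names are mine, not from the question). It runs the original pattern and the fully parenthesized one against the same input:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the original post.
public static class AnchorDemo
{
    // Original pattern: ^ binds only to the first alternative, $ only to the second.
    public const string Original =
        @"^([0-2]?[0-9]:[0-5]?[0-9])|([0-2]?[0-9]:[0-5]?[0-9]:[0-5]?[0-9])$";

    // Fixed pattern: the outer group makes ^ and $ apply to both alternatives.
    public const string Fixed =
        @"^(([0-2]?[0-9]:[0-5]?[0-9])|([0-2]?[0-9]:[0-5]?[0-9]:[0-5]?[0-9]))$";

    public static bool MatchesOriginal(string s) => Regex.IsMatch(s, Original);

    public static bool MatchesFixed(string s) => Regex.IsMatch(s, Fixed);
}
```

With this sketch, AnchorDemo.MatchesOriginal returns true for both "00:00:99" and "99:00:00:00", while AnchorDemo.MatchesFixed returns false for both and still accepts "23:59" and "2:3:04".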
That said, your regex is still going to match some unwanted patterns, such as: 29:00
An improved version, which groups the hour alternatives so that the hour is limited to 0-23 and makes the seconds optional, would be:
^(([0-1]?[0-9]|2[0-3]):[0-5]?[0-9](:[0-5]?[0-9])?)$

Although this does not answer the question directly, in such standard cases I would rather use a built-in function than create really scary regular expressions:
DateTime.TryParseExact(input, "H:m:s", CultureInfo.InvariantCulture, DateTimeStyles.None, out d);
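Since the question also wants to accept times without seconds, here is a hedged sketch that uses the DateTime.TryParseExact overload taking an array of formats. The class name TimeParseDemo and the choice of the two formats "H:m:s" and "H:m" are my assumptions about what the asker needs:

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of the original answer.
public static class TimeParseDemo
{
    // Assumed formats: hours and minutes, with optional seconds.
    // Single-letter specifiers accept one or two digits.
    private static readonly string[] Formats = { "H:m:s", "H:m" };

    public static bool IsValidTime(string input)
    {
        // The parsed value is discarded; only success or failure is reported.
        return DateTime.TryParseExact(
            input, Formats, CultureInfo.InvariantCulture,
            DateTimeStyles.None, out _);
    }
}
```

Because the parser checks value ranges itself, inputs such as "00:00:99" or "29:00" should be rejected without any hand-written range logic.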

Related

Regular expression for matching both HH:MM and HH:MM:SS time format

I'd like to write a regular expression for matching both HH:MM:SS and MM:SS.
Match
99:43:22
1:43:22
01:43:22
1:43:22
43:22
so I've tried
(([0-5]?[0-9]):([0-5]?[0-9]))|(([0-9]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9]))
What I wanted to add was just a single OR(|) syntax with two time regex.
But it doesn't match for HH:MM:SS
what am I missing?
I've already looked into those articles:
Regular expression for matching HH:MM time format
https://www.oreilly.com/library/view/regular-expressions-cookbook/9781449327453/ch04s06.html
http://regexlib.com/Search.aspx?k=time&AspxAutoDetectCookieSupport=1
Your question is unclear. From whatever little I understand, I would suggest the following regular expression:
^([01]\d?|2[0-4]):[0-5]\d(:[0-5]\d)?$
It makes sure that HH is between 00 and 24.
It makes sure that MM is between 00 and 59.
SS is optional (so it can match both HH:MM:SS and HH:MM), and if it is there, it is between 00 and 59.
There might be a more efficient method out there, but I can only think of this!
The expression before the alternation operator is matched before the expression after the alternation operator can even be tested.
If you anchor the start and end of these expressions, like so:
(^([0-5]?[0-9]):([0-5]?[0-9]))$|^(([0-9]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])$)
... you should get the behavior you expect.
If you are matching inside of a longer string, then you could put the longest match first:
((([0-9]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])|([0-5]?[0-9]):([0-5]?[0-9])))
or rewrite the regex a bit, as follows:
((\d{1,2}:)?[0-5]?\d:[0-5]?\d)
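To make the effect of alternation order concrete, here is a small sketch. The class AlternationDemo and its member names are hypothetical; the two patterns are the asker's alternatives in both orders, with the redundant outer groups dropped:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the original post.
public static class AlternationDemo
{
    // Short alternative first: on "01:43:22" the MM:SS branch succeeds on
    // "01:43", so the HH:MM:SS branch is never tried at that position.
    public const string ShortFirst =
        @"([0-5]?[0-9]):([0-5]?[0-9])|([0-9]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])";

    // Long alternative first: HH:MM:SS is tried before MM:SS.
    public const string LongFirst =
        @"([0-9]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])|([0-5]?[0-9]):([0-5]?[0-9])";

    // Returns the text of the first match, or an empty string if none.
    public static string FirstMatch(string pattern, string input) =>
        Regex.Match(input, pattern).Value;
}
```

FirstMatch(AlternationDemo.ShortFirst, "01:43:22") yields "01:43", whereas FirstMatch(AlternationDemo.LongFirst, "01:43:22") yields the full "01:43:22", and "43:22" still matches through the second branch.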

YYYY/MM/DD date format regular expression

I want to use regular expression for matching these date formats as below in C#.
YYYY/MM/DD 2013/11/12
YYYY/M/DD 2013/5/11
YYYY/MM/D 2013/10/5
YYYY/M/D 2013/5/6
I have tried some regular expressions but they can't match the 4 date formats.
such as
^(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])
Check this to get an idea of the complexity of validating dates with a regex. So I would use
\d{4}(?:/\d{1,2}){2}
and then validate the match in C#. While full validation can be done in a regex, you will spend a lot of time trying to achieve it. There is a regex in that post that, with a bit of fiddling, will supposedly validate dates, but it is a scary-looking regex.
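One way to do the C# half of that, as a sketch: check the shape with the loose regex, then let DateTime.TryParseExact decide whether the date exists. The class name DateCheckDemo, the added ^ and $ anchors, and the "yyyy/M/d" format string are my assumptions, not part of the answer:

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class DateCheckDemo
{
    // Shape check only: four digits, then two groups of "/" plus one or two digits.
    private static readonly Regex Shape = new Regex(@"^\d{4}(?:/\d{1,2}){2}$");

    public static bool IsValidDate(string input)
    {
        if (!Shape.IsMatch(input))
        {
            return false;
        }

        // "M" and "d" accept one or two digits; the parser rejects impossible
        // dates such as month 13 or February 30.
        return DateTime.TryParseExact(
            input, "yyyy/M/d", CultureInfo.InvariantCulture,
            DateTimeStyles.None, out _);
    }
}
```

All four formats from the question (for example "2013/11/12" and "2013/5/6") should pass, while "2013/2/30" passes the regex but is rejected by the parser.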
Try
^\d{4}[-/.]\d{1,2}[-/.]\d{1,2}$
The curly braces {} give the number allowed. E.g., \d{1,2} means either one or two digits.
You may need more than that to match date. Try this:
(19|20)\d\d([-/.])(0?[1-9]|1[012])\2(0?[1-9]|[12][0-9]|3[01])
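The \2 in that pattern is a backreference to the second group, the separator, so the second separator has to be the same character as the first. A small sketch of that behavior follows; the class name SeparatorDemo and the added ^ and $ anchors are my assumptions:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class SeparatorDemo
{
    // Same pattern as in the answer, anchored so the whole input must match.
    // Group 2 captures the first separator; \2 requires the same one again.
    private static readonly Regex DatePattern = new Regex(
        @"^(19|20)\d\d([-/.])(0?[1-9]|1[012])\2(0?[1-9]|[12][0-9]|3[01])$");

    public static bool IsMatch(string input) => DatePattern.IsMatch(input);
}
```

Under these assumptions "2013/5/6" and "2013-05-06" match, but the mixed form "2013/5-6" does not. Note that this pattern only accepts years starting with 19 or 20.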
Ajit's regex is nearly perfect but misses the leap years that end in 12 and 16. Here is the correction:
((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))-((0[13578])|(1[02]))-((0[1-9])|([12][0-9])|(3[01])))|((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))-((0[469])|11)-((0[1-9])|([12][0-9])|(30)))|(((000[48])|([0-9]0-9)|([0-9][1-9][02468][048])|([1-9][0-9][02468][048])|([0-9]0-9)|([0-9][1-9][13579][26])|([1-9][0-9][13579][26]))-02-((0[1-9])|([12][0-9])))|((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))-02-((0[1-9])|([1][0-9])|([2][0-8])))
((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))\-((0[13578])|(1[02]))\-((0[1-9])|([12][0-9])|(3[01])))|((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))\-((0[469])|11)\-((0[1-9])|([12][0-9])|(30)))|(((000[48])|([0-9][0-9](([13579][26])|([2468][048])))|([0-9][1-9][02468][048])|([1-9][0-9][02468][048]))\-02\-((0[1-9])|([12][0-9])))|((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))\-02\-((0[1-9])|([1][0-9])|([2][0-8])))
This is the regex for the yyyy-MM-dd format.
You can replace - with \/ for yyyy/MM/dd.
Tested and working.
Try this. It accepts all four patterns:
@"\d{4}[- /.]([1-9]|0[1-9]|1[012])[- /.]([1-9]|0[1-9]|[12][0-9]|3[01])"

parsing a method Signature using regular expressions

I am trying to use regular expressions to parse a method in the following format from a text:
mvAddSell[value, type1, reference(Moving, 60)]
so using the regular expressions, I am doing the following
tokensizedStrs = Regex.Split(target, "([A-Za-z ]+[\\[ ][A-Za-z0-9 ]+[ ,][A-Za-z0-9 ]+[ ,][A-Za-z0-9 ]+[\\( ][A-Za-z0-9 ]+[, ].+[\\) ][\\] ])");
It is working, but the problem is that it always gives me an empty entry at the beginning of the array if the string starts with a method in the given format, and the same happens if the method comes at the end. Also, if two methods appear in the string, it catches only the first one! Why is that?
I think what prevents the parser from catching two methods is the ".+" in my pattern. What I wanted to say is that there will be a number or a date in that location, so I told it to expect a sequence of any characters. Is that wrong?
It worked for me =D ... I replaced ".+" with ".+?", which means as few characters as possible ;)
Your goal is quite unclear to me. What do you want as result? If you split on that method pattern, you will get the part before your pattern and the part after your pattern in an array, but not the method itself.
Answer to your question
To answer your concrete question: your .+ is greedy, which means it will match everything up to the last )] (on the same line; . does not match newline characters by default).
You can change this behaviour by adding a ? after the quantifier to make it lazy, then it matches only till the first )].
tokensizedStrs = Regex.Split(target, "([A-Za-z ]+[\\[ ][A-Za-z0-9 ]+[ ,][A-Za-z0-9 ]+[ ,][A-Za-z0-9 ]+[\\( ][A-Za-z0-9 ]+[, ].+?[\\) ][\\] ])");
Problems in your regex
There are several other problems in your regex.
I think you misunderstood character classes. When you write e.g. [\\[ ], this construct will match either a [ or a space. If you want to allow optional whitespace after the [ (which would seem logical to me), do it this way: \\[\\s*
Use a verbatim string (with a leading @) to define your regex to avoid excessive escaping.
tokensizedStrs = Regex.Split(target, @"([A-Za-z ]+\[\s*[A-Za-z0-9 ]+\s*,\s*[A-Za-z0-9 ]+\s*,\s*[A-Za-z0-9 ]+\(\s*[A-Za-z0-9 ]+\s*,\s*.+?\)\s*\]\s*)");
You can simplify your regex by avoiding repeated parts:
tokensizedStrs = Regex.Split(target, @"([A-Za-z ]+\[\s*[A-Za-z0-9 ]+(?:\s*,\s*[A-Za-z0-9 ]+){2}\(\s*[A-Za-z0-9 ]+\s*,\s*.+?\)\s*\]\s*)");
This is a non-capturing group, (?:\s*,\s*[A-Za-z0-9 ]+){2}, repeated two times.
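The greedy versus lazy difference can be seen by counting matches on a string that contains two methods. This is only a sketch: the class LazyDemo is hypothetical, it uses Regex.Matches rather than Regex.Split so that the methods themselves are returned, and it narrows the method-name part to [A-Za-z]+ (no spaces), which is my simplification:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class LazyDemo
{
    // Everything up to the last argument inside the inner parentheses.
    private const string Head =
        @"[A-Za-z]+\[\s*[A-Za-z0-9 ]+(?:\s*,\s*[A-Za-z0-9 ]+){2}\(\s*[A-Za-z0-9 ]+\s*,\s*";

    // The closing ")" and "]", with optional whitespace between them.
    private const string Tail = @"\)\s*\]";

    // Greedy ".+" runs to the last ")]" on the line, merging both methods.
    public static int CountGreedy(string target) =>
        Regex.Matches(target, Head + ".+" + Tail).Count;

    // Lazy ".+?" stops at the first ")]", so each method is its own match.
    public static int CountLazy(string target) =>
        Regex.Matches(target, Head + ".+?" + Tail).Count;
}
```

For the input "mvAddSell[value, type1, reference(Moving, 60)] then mvAddBuy[value, type2, reference(Fast, 5)]", CountGreedy finds one match spanning both methods and CountLazy finds two.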

Regex.Replace(string, MatchEvaluator) not working as expected

I've got the following string:
[global::System.CodeDom.Compiler.GeneratedCodeAttribute("System.Data.Design.TypedDataSetGenerator", "2.0.0.0")]
I need to alter it to look like the following:
[global::System.CodeDom.Compiler.GeneratedCodeAttribute("myClass", "myVersion")]
The simplest way to achieve this, obviously, is to use a Regex to capture the pieces that I want from that string, and then concatenate the results with my extra text. However I'm looking to use the Regex.Replace() method to make the code a bit cleaner:
Regex generatedCodeAttributeRegex = new Regex("\\[[?:global::|]System.CodeDom.Compiler.GeneratedCodeAttribute\\((\"System.Data.Design.TypedDataSetGenerator\",[\\s+]\"2.0.0.0\")\\)\\]");
inputFileContent = generatedCodeAttributeRegex.Replace(inputFileContent, delegate(Match m)
{
return string.Format("\"{0}\", \"{1}\"",
this.GetType(),
Assembly.GetExecutingAssembly().GetName().Version);
});
From my understanding, this should replace the captured group with the text specified in the delegate... the problem is that it doesn't. What am I doing wrong? And is it possible to achieve this with the Regex.Replace(string, string) overload?
The way I would do this is with a lookbehind @"(?<=)" and a lookahead @"(?=)", like so:
"(?<=\[global::System\.CodeDom\.Compiler\.GeneratedCodeAttribute\()([^\)]*)(?=\)\])"
Then your replacement string should work as is.
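A sketch of how the lookaround approach could be wired up. The class LookaroundReplaceDemo and its parameter names are hypothetical, and I have made the global:: prefix optional, which the pattern above does not do:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class LookaroundReplaceDemo
{
    // Only the argument list is matched; the attribute name and the closing
    // ")]" are asserted by the lookbehind and lookahead but not consumed.
    private static readonly Regex Args = new Regex(
        @"(?<=\[(?:global::)?System\.CodeDom\.Compiler\.GeneratedCodeAttribute\()[^)]*(?=\)\])");

    public static string ReplaceArgs(string input, string name, string version)
    {
        // A MatchEvaluator is used so that a "$" in name or version is not
        // interpreted as a substitution token.
        return Args.Replace(input, m => "\"" + name + "\", \"" + version + "\"");
    }
}
```

Calling ReplaceArgs on the attribute line from the question with "myClass" and "myVersion" should leave everything outside the parentheses untouched.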
Your regex does not match because you wrote [?:global::|], which is a character class containing the characters ?, :, g, l, o, b, a, and |. You probably meant (?:global::|), which is the same as (?:global::)?, i.e. "global:: or nothing".
Also note that by not escaping the dots, they will match anything - not just literal dots. Though that is unlikely to cause problems.
If you fix that, it will work, but not quite as you want, since Regex.Replace replaces the whole match, not just the part in the capturing group.
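Because the whole match is replaced, one option is to capture the parts you want to keep and put them back in the evaluator. The following is a sketch under that assumption; the class GroupReplaceDemo and its parameter names are hypothetical:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class GroupReplaceDemo
{
    // Group 1: everything up to and including "(". Group 2: the closing ")]".
    // The middle part (two quoted arguments) is matched but not captured.
    // Inside a verbatim string, "" stands for a single double quote.
    private static readonly Regex Attr = new Regex(
        @"(\[(?:global::)?System\.CodeDom\.Compiler\.GeneratedCodeAttribute\()" +
        @"""[^""]*"",\s*""[^""]*""" +
        @"(\)\])");

    public static string ReplaceArgs(string input, string name, string version)
    {
        // Rebuild the match: kept prefix + new arguments + kept suffix.
        return Attr.Replace(input, m =>
            m.Groups[1].Value +
            "\"" + name + "\", \"" + version + "\"" +
            m.Groups[2].Value);
    }
}
```

The same idea can be expressed with the Regex.Replace(string, string) overload by using "$1" and "$2" in the replacement string, as long as the new text contains no "$" of its own.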
Not sure about C# specialties, but I'd guess that the regular expression is plain wrong. The square brackets are used in a wrong way. They can only define character classes and cannot be used for capturing of subpatterns. All this assuming C# uses regular perl-style regexps.
The correct regular expression for this would be (attention, not escaped):
\[(global::)?System\.CodeDom\.Compiler\.GeneratedCodeAttribute\("([^"]+)", "([^"]+)"\)\]
My advice: go to http://regexpal.com/ to test your regular expressions first, then implement them in your code. Saves a lot of trouble.

Difficulty with Simple Regex (match prefix/suffix)

I'm trying to develop a regex that will be used in a C# program.
My initial regex was:
(?<=\()\w+(?=\))
Which successfully matches "(foo)" - matching but excluding from output the open and close parens, to produce simply "foo".
However, if I modify the regex to:
\[(?<=\()\w+(?=\))\]
and I try to match against "[(foo)]", it fails to match. This is surprising. I'm simply prepending and appending the literal open and close brackets around my previous expression. I'm stumped. I use Expresso to develop and test my expressions.
Thanks in advance for your kind help.
Rob Cecil
Your look-behinds are the problem. Here's how the string is being processed:
We see [ in the string, and it matches the regex.
The look-behind in the regex asks us to check whether the previous character was a '('. This fails, because it was a '['.
At least that's what I would guess is causing the problem.
Try this regex instead:
(?<=\[\()\w+(?=\)\])
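A quick sketch to compare the two expressions on the same input. The class LookbehindDemo and its member names are hypothetical:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class LookbehindDemo
{
    // After "\[" consumes '[', the lookbehind needs '(' immediately before
    // the current position, but that character is the '[' just consumed.
    public const string Broken = @"\[(?<=\()\w+(?=\))\]";

    // Both brackets are moved inside the assertions, so nothing is consumed
    // before "\w+" and only the word itself becomes the match.
    public const string Working = @"(?<=\[\()\w+(?=\)\])";

    // Returns the matched text, or null when the pattern does not match.
    public static string Extract(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }
}
```

Extract(LookbehindDemo.Broken, "[(foo)]") returns null, while Extract(LookbehindDemo.Working, "[(foo)]") returns "foo".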
Out of context, it is hard to judge, but the look-behind here is probably overkill. They are useful to exclude strings (as in strager's example) and in some other special circumstances where simple REs fail, but I often see them used where simpler expressions are easier to write, work in more RE flavors and are probably faster.
In your case, you could probably write (\b\w+\b) for example, or even (\w+) using natural bounds, or if you want to distinguish (foo) from -foo- (for example), using \((\w+)\).
Now, perhaps the context dictates this convoluted use (or perhaps you were just experimenting with look-behind), but it is good to know alternatives.
Now, if you are just curious why the second expression doesn't work: these are known as "zero-width assertions". They check that what follows or precedes conforms to what is expected, but they don't consume any of the string, so whatever comes after a lookahead (or before a lookbehind) in the regex must match the same text as the assertion. E.g. if you put something after a positive lookahead that doesn't match what is asserted, you can be sure the RE will fail.
