.NET regex for mm-dd-yy and others + yyyymmdd - C#

I feel like I am chasing my tail.
I am trying to arrive at a .Net regex that will match on the following:
mm-dd-yy
m-dd-yy
mm-d-yy
m-d-yy
and (no dashes)
yyyymmdd

One or two digits followed by a dash, followed by one or two digits, followed by a dash, followed by two digits; or else eight digits:
(\d{1,2}-\d{1,2}-\d{2})|(\d{8})

A very simple RegEx that just matches to the digits being in the correct places and matches all your formats:
^[0-9]{1,2}-[0-9]{1,2}-[0-9]{2}$|^[0-9]{8}$
This does not validate that the values are actual possible dates. For that, you would be better off using DateTime.TryParse.

Do you need to use Regex, or do you just care that it is a valid date?
DateTime result;
if(DateTime.TryParse(input, out result))
{
// you have your date in result
}
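DateTime.TryParse depends on the current culture and may not accept every format from the question, so here is a minimal sketch using DateTime.TryParseExact with an explicit format list instead. The class name DateFormats and the helper TryParseDate are made up for illustration, and the two format strings are my reading of the formats listed in the question.

```csharp
using System;
using System.Globalization;

public static class DateFormats
{
    // Assumed formats: "M" and "d" accept one or two digits when parsing,
    // so "M-d-yy" covers mm-dd-yy, m-dd-yy, mm-d-yy and m-d-yy.
    private static readonly string[] Formats = { "M-d-yy", "yyyyMMdd" };

    // Hypothetical helper: true only if the input is a real calendar date
    // written in one of the formats above.
    public static bool TryParseDate(string input, out DateTime result)
    {
        return DateTime.TryParseExact(
            input,
            Formats,
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out result);
    }
}
```

With this, "1-5-19" and "20190105" both parse to 5 January 2019, while "99-99-99" is rejected without any regex.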

Well it all comes down to how strict you want the matching to be.
[0-9]+\-[0-9]+\-[0-9]+
will match all the four top ones. But you can enter "99-99-99".
You can be a bit more strict with:
([0-1][0-2]|[0-9])\-(3[0-1]|[0-2]?[0-9]+)\-[0-9]{1,2}
This will only match dates where each component is in a valid range, but only to an extent. It would still match a February with 31 days (which it won't have in the Gregorian calendar). It will also match inside 13-04-18, starting from the 3. You can use anchors to make it match the whole text (add ^ at the beginning and $ at the end of the regex), but then it won't be able to find dates inside a longer text.
You can add a precondition to make sure there are no stray digits around the match, using a negative look-behind and a negative look-ahead:
(?<![0-9])([0-1][0-2]|[0-9])\-(3[0-1]|[0-2]?[0-9]+)\-[0-9]{1,2}(?![0-9])
And so forth, but this regex has already become a proper behemoth. I would go with the second or third version and use DateTime.Parse to validate it.
There's only so far you can go with regex for dates before they become write-once and insane :) (what about leap years for example, etc. etc.)
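To illustrate the "loose regex plus a real date parser" approach, here is a rough sketch. It uses a simpler candidate pattern than the ones above (my own simplification), keeps the digit look-arounds, and leaves the range and leap-year checks to DateTime.TryParseExact. The names DateFinder and FindDates are hypothetical, and the month-first "M-d-yy" format is an assumption based on the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class DateFinder
{
    // The look-arounds stop a match from starting or ending inside a longer run of digits.
    private static readonly Regex Candidate =
        new Regex(@"(?<![0-9])[0-9]{1,2}-[0-9]{1,2}-[0-9]{2}(?![0-9])");

    // Returns only the candidates that are real calendar dates.
    public static List<DateTime> FindDates(string text)
    {
        var dates = new List<DateTime>();
        foreach (Match m in Candidate.Matches(text))
        {
            DateTime parsed;
            if (DateTime.TryParseExact(m.Value, "M-d-yy",
                    CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed))
            {
                dates.Add(parsed);
            }
        }
        return dates;
    }
}
```

The regex stays short and readable, and things like February 31 or month 13 are filtered out by the parser rather than by the pattern.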


How to Allow Characters in Certain Spot in String

I'm trying to parse a date. The problem is that my regular expression strips every letter, because I want to avoid input like 01-28-2019 UTC or any letters outside of the main date. It works fine when the date is formatted like the one I just listed, but it fails when we get a date formatted like 28-JAN-19.
var sourceValue = Regex.Replace("28-JAN-19", @"[A-Za-z]", "");
var parsed = DateTime.Parse(sourceValue);
The date I need to parse can be in a few different formats. Can a regular expression be used to handle this? If so, what tweaks are needed to trim any letters outside of the xx-xx-xx part of the string?
28-JAN-19
28-01-19
28-JAN-19 13:15:00
28-01-19 13:15:00
28-01-2019 13:15:00
This RegEx should match all the examples you provided:
[0-9]{2}-([A-Za-z]{3}|[0-9]{2})-[0-9]{2,4}( [0-9][0-9]?:[0-9][0-9]?:[0-9][0-9])?
It does make a couple of assumptions though, based on your examples. First, it assumes all your dates will always start with a 2-digit day. It also assumes that your month abbreviations will be 3 letters long. Finally, it allows 1- or 2-digit hours and minutes but assumes seconds are always 2 digits long. Let me know if any of these assumptions are incorrect.
Regular expressions are likely not your best bet. If you know the full set of formats you might encounter, you can use DateTime.ParseExact with a format string and catch FormatException to detect input that did not parse. If your months use English abbreviations, be sure to pass in an English culture:
DateTime.ParseExact("28-JAN-19", "dd-MMM-yy", new CultureInfo("en"));
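The two answers can also be combined: use the regex to cut the date out of the surrounding text (so a trailing "UTC" is ignored), then hand the match to DateTime.TryParseExact with a list of formats. This is only a sketch. The class name LooseDateParser and the format list are my assumptions based on the five sample strings, and I use CultureInfo.InvariantCulture, whose month abbreviations are the English ones.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class LooseDateParser
{
    // Pattern from the answer above: 2-digit day, month as digits or 3 letters,
    // 2-4 digit year, optional time.
    private static readonly Regex DatePart = new Regex(
        @"[0-9]{2}-([A-Za-z]{3}|[0-9]{2})-[0-9]{2,4}( [0-9][0-9]?:[0-9][0-9]?:[0-9][0-9])?");

    // Assumed formats, derived from the sample inputs in the question.
    private static readonly string[] Formats =
    {
        "dd-MMM-yy", "dd-MM-yy", "dd-MM-yyyy",
        "dd-MMM-yy H:m:ss", "dd-MM-yy H:m:ss", "dd-MM-yyyy H:m:ss"
    };

    public static bool TryParse(string input, out DateTime result)
    {
        result = default(DateTime);
        Match m = DatePart.Match(input);
        // Only the matched part is parsed, so letters outside the date are ignored.
        return m.Success && DateTime.TryParseExact(
            m.Value, Formats, CultureInfo.InvariantCulture, DateTimeStyles.None, out result);
    }
}
```

Unlike stripping every letter with Regex.Replace, this keeps the JAN in 28-JAN-19 because it is part of the match.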

Ignore date in a string with numbers using regular expression

I have a little problem.
I use [0-9\,.]*
to find a decimal in a string,
and ([^\s]+) to find the text after the first number.
The string normally looks like this: a number, some text, and then a date:
1.023,45 stück
24.05.10
But sometimes I have just the date, and then I get 240510 as the decimal.
And sometimes I have just the decimal.
How should I modify the regex to find the date, if it exists, and remove it,
and then look for a decimal and select it if it exists?
Thanks in advance.
Divide and conquer
Check for the date first and remove the match from the string
([0-9]{1,2}\.){2}[0-9]{1,2}
Find the number using your original regex
[0-9\,.]*
If you need it, find the unit of quantity (assuming it only contains lowercase letters and the umlaut ü)
([a-zü]+)
See http://regexe.de/ (German) and http://www.regexr.com/ (English) for some useful information and tools for dealing with regex.
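The divide-and-conquer steps could look roughly like this in C#. Treat it as a sketch: QuantityParser is a made-up name, the number pattern is a slightly stricter variant of the original (at least one digit, "." between groups of three, "," before the decimals), the unit is matched with \p{L}+ instead of [a-zü]+, and the date is assumed to have a two-digit year as in the example.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class QuantityParser
{
    private static readonly Regex DateRx = new Regex(@"([0-9]{1,2}\.){2}[0-9]{1,2}");

    // Assumed stricter number pattern, plus an optional unit made of letters.
    private static readonly Regex NumberRx = new Regex(
        @"(?<number>[0-9]+(?:\.[0-9]{3})*(?:,[0-9]+)?)(?:\s+(?<unit>\p{L}+))?");

    // German-style separators, set explicitly so no installed culture data is needed.
    private static readonly NumberFormatInfo German = new NumberFormatInfo
    {
        NumberDecimalSeparator = ",",
        NumberGroupSeparator = "."
    };

    public static void Parse(string input, out string date, out decimal? number, out string unit)
    {
        date = null;
        number = null;
        unit = null;

        // Step 1: find the date and remove it from the string.
        string rest = input;
        Match d = DateRx.Match(input);
        if (d.Success)
        {
            date = d.Value;
            rest = input.Remove(d.Index, d.Length);
        }

        // Step 2: look for the number (and the optional unit) in what is left.
        Match n = NumberRx.Match(rest);
        if (n.Success)
        {
            number = decimal.Parse(n.Groups["number"].Value, NumberStyles.Number, German);
            if (n.Groups["unit"].Success)
            {
                unit = n.Groups["unit"].Value;
            }
        }
    }
}
```

Because the date is removed before the number search runs, an input that contains only 24.05.10 yields a date and no number, instead of 240510.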
I suggest matching the number in a more restricted way (1-3 digits, then . + 3 digits groups if any, and a decimal separator with digits, optional).
(?s)(?<number>\d{1,3}(?:\.\d{3})*(?:,\d+)?)\s+(.*?)(?:$|\n|(?<date>\d{2}\.?\d{2}\.?(?:\d{4}|\d{2})))
The number will be held in ${number}, and the date in ${date}. If the string starts with something very similar to a date (6 or 8 digits with optional periods), it won't be captured. If the date format is known (say, the periods are always present), remove the ? from each \.? in the pattern.
The (?s) at the beginning makes the period . match a newline as well (it may not be necessary).

Validating Positive number with comma and period

I need a regular expression validation expression that will
ALLOW
positive number(0-9)
, and .
DISALLOW
letter(a-z)
any other letter or symbol except . and ,
For example, in my ASP.NET text box, if I type anything#!#--, the regular expression validation should reject it; if I type 10.000,50 or 10,000.50, it should be allowed.
I've been trying to use this regex:
^\d+(\.\d\d)?$
But my text box must also allow the , symbol. I also tried an integer-only regex validation. It rejected letters, but it also rejected the . and , symbols, whereas it should allow digits (0-9) as well as . and ,
Don't Use \d to match [0-9] in .NET
First off, in .NET, \d will match any digits in any script, such as:
654۳۲١८৮੪૯୫୬१७੩௮௫౫೮൬൪๘໒໕២៧៦᠖
So you really want to be using [0-9]
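A quick way to see the difference for yourself (the class and method names here are just for the demo). As far as I know, RegexOptions.ECMAScript is the other way to restrict \d to ASCII digits.

```csharp
using System.Text.RegularExpressions;

public static class DigitClasses
{
    // U+06F3 is EXTENDED ARABIC-INDIC DIGIT THREE, a Unicode decimal digit.
    public const string PersianThree = "\u06F3";

    // \d with default options: matches decimal digits from any script.
    public static bool MatchesBackslashD(string s)
    {
        return Regex.IsMatch(s, @"^\d+$");
    }

    // Explicit ASCII range: only 0-9.
    public static bool MatchesAsciiRange(string s)
    {
        return Regex.IsMatch(s, @"^[0-9]+$");
    }

    // With RegexOptions.ECMAScript, \d behaves like [0-9].
    public static bool MatchesBackslashDEcma(string s)
    {
        return Regex.IsMatch(s, @"^\d+$", RegexOptions.ECMAScript);
    }
}
```

MatchesBackslashD(DigitClasses.PersianThree) returns true, while the other two methods return false for the same input. All three accept "123".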
Incomplete Spec
You say you want to only allow "digits, commas and periods", but I don't think that's the whole spec. That would be ^[0-9,.]+$, and that would match
...,,,
Tweaking the Spec
It's hard to guess what you really want to allow: would 10,1,1,1 be acceptable?
We could start with something like this, to get some fairly well-formed strings:
^(?:[0-9]+(?:[.,][0-9]+)?|[1-9][0-9]{0,2}(?:(?:\.[0-9]{3})*|(?:,[0-9]{3})*)(?:\.[0-9]+)?)$
Play with the demo, see what should and shouldn't match... When you are sure about the final spec, we can tweak the regex.
Sample Matches:
0
12
12.123
12,12
12,123,123
12,123,123.12456
12.125.457.22
Sample Non-Matches:
12,
123.
1,1,1,1
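If you want to try the samples above in code rather than in an online tester, a small wrapper like this is enough (NumberSpec and IsWellFormed are names I made up; the pattern is the one from this answer, unchanged):

```csharp
using System.Text.RegularExpressions;

public static class NumberSpec
{
    // Either plain digits with one optional separator and fraction,
    // or 1-3 leading digits followed by consistent groups of three.
    private static readonly Regex WellFormed = new Regex(
        @"^(?:[0-9]+(?:[.,][0-9]+)?|[1-9][0-9]{0,2}(?:(?:\.[0-9]{3})*|(?:,[0-9]{3})*)(?:\.[0-9]+)?)$");

    public static bool IsWellFormed(string s)
    {
        return WellFormed.IsMatch(s);
    }
}
```

Feeding it the sample lists is a cheap regression test once the spec is settled: every entry under Sample Matches should return true and every entry under Sample Non-Matches should return false.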
Your regex would be,
(?:\d|[,\.])+
OR
^(?:\d|[,\.])+$
It matches one or more characters, each of which is a digit, a comma, or a period.
Maybe you can use this one (starts with digit, ends with digit):
(\d+[\,\.])*\d+
If you need more sophisticated price Regex you should use:
(?:(?:[1-9]\d?\d?([ \,\.]?\d{3})*)|0)(?:[\.\,]\d+)?
Edit: To make it more reliable (and not accept 00.50) you can add a check for the start and end of the value:
(^|\s)(?:(?:[1-9]\d?\d?([ \,\.]?\d{3})*)|0)(?:[\.\,]\d+)?($|\s)
I think the best regex for your condition is:
^[\d]+(?:,\d+)*(?:\.\d+)?$
It accepts digits with optional comma-separated groups and an optional decimal part.
At the same time,
it rejects:
numbers ending in ,
numbers ending in .
numbers having . before comma
numbers having more than one decimal point
check out the demo here : http://regex101.com/r/zI0mJ4
Your format is a bit strange, as it is not a standard format.
My first thought was to use a float instead of a string and add a Range validation attribute to reject negative numbers.
But because of the formatting, I'm not sure it would work.
Another way is the regex, of course.
The one you propose means:
"some digits, then possibly a group formed by a dot and exactly two digits".
This is not what you expected.
Strictly fitting your example of a number lower than 100,000.99, one regex could be:
^[0-9]{1,2}[\.,][0-9]{3}([\.,][0-9]{1,2})?$
A more general regex, which accepts all positive numbers, is the one posted by Avinash Raj: (?:\d|[,\.])+

YYYY/MM/DD date format regular expression

I want to use regular expression for matching these date formats as below in C#.
YYYY/MM/DD 2013/11/12
YYYY/M/DD 2013/5/11
YYYY/MM/D 2013/10/5
YYYY/M/D 2013/5/6
I have tried some regular expressions but they can't match the 4 date formats.
such as
^(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])
Validating dates with a regex gets complex very quickly, so I would use
\d{4}(?:/\d{1,2}){2}
and then do whatever is needed in C# to validate the match. While full validation can be done in a regex with a bit of fiddling, you'll spend a lot of time trying to achieve it, and the result is a scary-looking regex.
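One way the "regex for the shape, C# for the validation" idea could look. This is a sketch: SlashDate is a made-up name, I added ^ and $ so the whole input has to be the date, and "yyyy/M/d" is my guess at the matching format string.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class SlashDate
{
    // Shape check only: four digits, then two groups of "/" plus one or two digits.
    private static readonly Regex Shape = new Regex(@"^\d{4}(?:/\d{1,2}){2}$");

    public static bool TryParse(string input, out DateTime result)
    {
        result = default(DateTime);
        // The regex filters the format; TryParseExact rejects impossible dates.
        return Shape.IsMatch(input) && DateTime.TryParseExact(
            input, "yyyy/M/d", CultureInfo.InvariantCulture, DateTimeStyles.None, out result);
    }
}
```

All four formats from the question pass, while 2013/2/30 has the right shape but is rejected by the parser.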
Try
^\d{4}[-/.]\d{1,2}[-/.]\d{1,2}$
The curly braces {} give the number allowed. E.g., \d{1,2} means either one or two digits.
You may need more than that to match date. Try this:
(19|20)\d\d([-/.])(0?[1-9]|1[012])\2(0?[1-9]|[12][0-9]|3[01])
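The \2 in that pattern is a backreference to the second group, so the second separator has to be the same character as the first one. A small sketch to try it out (SeparatorCheck is a hypothetical name, and the ^ and $ anchors are my addition so the whole input must match):

```csharp
using System.Text.RegularExpressions;

public static class SeparatorCheck
{
    // Group 2 captures the first separator; \2 requires the same one again.
    private static readonly Regex SameSeparator = new Regex(
        @"^(19|20)\d\d([-/.])(0?[1-9]|1[012])\2(0?[1-9]|[12][0-9]|3[01])$");

    public static bool IsMatch(string s)
    {
        return SameSeparator.IsMatch(s);
    }
}
```

So 2013/5/6 and 2013-10-05 are accepted, but a mixed form like 2013/5-6 is not. Note it still only checks ranges, so 2013/2/31 gets through.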
Ajit's regex is close to perfect but misses the leap years that end with 12 and 16. Here is the correction:
((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))-((0[13578])|(1[02]))-((0[1-9])|([12][0-9])|(3[01])))|((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))-((0[469])|11)-((0[1-9])|([12][0-9])|(30)))|(((000[48])|([0-9]0-9)|([0-9][1-9][02468][048])|([1-9][0-9][02468][048])|([0-9]0-9)|([0-9][1-9][13579][26])|([1-9][0-9][13579][26]))-02-((0[1-9])|([12][0-9])))|((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))-02-((0[1-9])|([1][0-9])|([2][0-8])))
((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))\-((0[13578])|(1[02]))\-((0[1-9])|([12][0-9])|(3[01])))|((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))\-((0[469])|11)\-((0[1-9])|([12][0-9])|(30)))|(((000[48])|([0-9][0-9](([13579][26])|([2468][048])))|([0-9][1-9][02468][048])|([1-9][0-9][02468][048]))\-02\-((0[1-9])|([12][0-9])))|((([0-9][0-9][0-9][1-9])|([1-9][0-9][0-9][0-9])|([0-9][1-9][0-9][0-9])|([0-9][0-9][1-9][0-9]))\-02\-((0[1-9])|([1][0-9])|([2][0-8])))
This is the regex for yyyy-MM-dd format.
You can replace - with \/ for yyyy/MM/dd.
Tested and working.
Try this. It accepts all four patterns:
@"\d{4}[- /.]([1-9]|0[1-9]|1[012])[- /.]([1-9]|0[1-9]|[12][0-9]|3[01])"

Regex match zero or one time a string

I'm trying to make a Regex that matches this string {Date HH:MM:ss}, but here's the trick: HH, MM and ss are optional, but it needs to be "HH", not just "H" (the same thing applies to MM and ss). If a single "H" shows up, the string shouldn't be matched.
I know I can use H{2} to match HH, but I can't seem to use that functionality plus the ? to match zero or one time (zero because it's optional, and one time max).
So far I'm doing this (which is obviously not working):
Regex dateRegex = new Regex(@"\{Date H{2}?:M{2}?:s{2}?\}");
Next question. Now that I have the match on the first string, I want to take only the HH:MM:ss part and put it in another string (that will be the format for a TimeStamp object).
I used the same approach, like this:
Regex dateFormatRegex = new Regex(@"(HH)?:?(MM)?:?(ss)?");
But when I try that on "{Date HH:MM}" I don't get any matches. Why?
If I add a space like this Regex dateFormatRegex = new Regex(@" (HH)?:?(MM)?:?(ss)?");, I have the result, but I don't want the space...
I thought that the first parenthesis needed to be escaped, but \( won't work in this case. I guess because it's not a parenthesis that is part of the string to match, but a key-character.
(H{2})? matches zero or two H characters.
However, in your case, writing it twice would be more readable:
Regex dateRegex = new Regex(@"\{Date (HH)?:(MM)?:(ss)?\}");
Besides that, make sure there are no functions available for whatever you are trying to do. Parsing dates is pretty common and most programming languages have functions in their standard library - I'd almost bet 1k of my reputation that .NET has such functions, too.
In your edit you mention an unwanted leading space in the result. To check a leading or trailing condition together with your regex, without including it in the result, you can use the lookaround feature of regex:
new Regex(@"(?<=Date )(HH)?:?(MM)?:?(ss)?")
(?<=...) is a lookbehind pattern.
For input Date HH:MM:ss, it will match both regexes (with or without lookbehind).
But input FooBar HH:MM:ss will still match a simple regex, but the lookbehind will fail here. Lookaround doesn't change the content of the result, but it prevents false matches (e.g., this second input that is not a Date).
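As for the "Why?" in the question: every part of (HH)?:?(MM)?:?(ss)? is optional, so the first match the engine finds is an empty one at the very start of the string, before it ever reaches HH. The lookbehind forces the match to start right after "Date ". A small sketch to see both behaviours (the class and method names are only for the demo):

```csharp
using System.Text.RegularExpressions;

public static class TimeFormatExtractor
{
    // Everything is optional, so the first match is the empty string at index 0.
    public static string WithoutLookbehind(string input)
    {
        return Regex.Match(input, @"(HH)?:?(MM)?:?(ss)?").Value;
    }

    // The lookbehind requires "Date " directly before the match
    // but does not include it in the result. Returns null if there is no match.
    public static string WithLookbehind(string input)
    {
        Match m = Regex.Match(input, @"(?<=Date )(HH)?:?(MM)?:?(ss)?");
        return m.Success ? m.Value : null;
    }
}
```

For "{Date HH:MM}" the first method returns an empty string and the second returns "HH:MM", with no leading space. For "FooBar HH:MM:ss" the second method returns null.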
