Parsing dates without all values specified - c#

I'm using free-form dates as part of a search syntax. I need to parse dates from strings, but only preserve the parts of the date that are actually specified. For instance, "november 1, 2010" is a specific date, but "november 2010" is the range of dates "november 1, 2010" to "november 30, 2010".
Unfortunately, DateTime.Parse and friends parse these dates to the same DateTime:
DateTime.Parse("November 1, 2010") // == {11/1/2010 12:00:00 AM}
DateTime.Parse("November, 2010") // == {11/1/2010 12:00:00 AM}
I need to know which parts of the DateTime were actually parsed and which were guessed by the parser. Essentially, I need DateTime.Parse("November, 2010") == {11/-1/2010 -1:-1:-1}; I can then see that the day portion is missing and calculate the range of dates covering the whole month.
(Internally, C# has the DateTimeParse and DateTimeResult classes that parse the date and preserve exactly the information I need, but by the time the date gets back to the public interfaces it's been stripped off. I'd rather avoid reflecting into these classes, unless that's really the only route.)
Is there some way to get DateTime.Parse to tell me which format it used to parse the date? Or can the returned DateTime have placeholders for unspecified parts? I'm also open to using another date parser, but I'd like it to be as reliable and locale-flexible as the internal one. Thanks in advance.
EDIT: I've also tried ParseExact, but enumerating all of the formats that Parse can handle seems nearly impossible. Parse actually accepts more formats than are returned by DateTimeFormatInfo.GetAllDateTimePatterns, which is about as canonical a source as I can find.

You could try using TryParseExact(), which will fail if the data string isn't in the exact format specified. Try a bunch of different combinations, and when one succeeds you know the format the date was in, and thus you know the parts of the date that weren't there and for which the parser filled in defaults. The downside is you have to anticipate how the user will want to enter dates, so you can expect exactly that.
You could also use a Regex to digest the date string yourself. Again, you'll need different regexes (or a REALLY complex single one), but it is certainly possible to pull the string apart this way as well; then you know what you actually have.

Parse parses a whole lot of stuff that no sane person would enter as a date, like "January / 2010 - 21 12: 00 :2". I think you'll have to write your own date parser if you want to know what exactly the user entered.
Personally I would do it like KeithS suggested: Parse the string with Parse and only call your own parse function if there's a 0 in one of the fields of the DateTime object. There are not that that possibilities you need to check for, because if the day is 0, the time will be 0, too. So start checking year, month, day, etc..
Or simply instruct the user to use specific formats you recognize.

Essentially, I need
DateTime.Parse("November, 2010") ==
{11/-1/2010 -1:-1:-1}; I can then see
that the day portion is missing and
calculate the range of dates covering
the whole month.
What you want is an illegal DateTime because you cannot have a negative hours/seconds/minute/day values. If you want to return something else other then a legal DateTime you have to write your own method which does NOT return a DateTime.
Is there some way to get
DateTime.Parse to tell me which format
it used to parse the date? Or can the
returned DateTime have placeholders
for unspecified parts? I'm also open
to using another date parser, but I'd
like it to be as reliable and
locale-flexible as the internal one.
Take a look here http://msdn.microsoft.com/en-us/library/w2sa9yss.aspx
You are going to have to manually keep track of what is entered to do this task. The only solution is to make sure the input is in the correct format.

I used this method that goes back to the original string in order to check for existence of the day and the year:
For days, the original string must contain a 1 as integer if the day was specified. So, split the string and look for a 1. The only exception occurs when the month is January (#1 month), so you should check for two 1s or a 1 and "January" or "Jan" in the original string.
For years, the original string must contain a number that can be a year (say, from 1900 to 2100). Other possibilities may be the use of an apostrophe, or things like 02-10-16, which you can recognize by the fact that there are exactly three numbers.
I know that this is pretty heuristic, but it's a fast and simple solution that works in most cases. I coded this algorithm in C# in the DateFinder.DayExists() and DateFinder.YearExists() methods in the sharp-datefinder library.

Related

FormatException of DateTime.Parse. Check which part is wrong in my date string

I'd like to validate a date string in my TextBox.
If I have 2018-06-07 string in my TextBox, I can parse it with success with DateTime.TryParse or DateTime.Parse.
But if I have 2018-12-35 in my TextBox, the DateTime.Prase throws FormatException that is absolutely right.
How can I determine which part of DateTime is wrong. For Example how to determine if Day or Month is wrong.
DateTime.Parse tries a number of formats - some to do with the current culture, and some more invariant ones. It looks like it's including an attempt to parse with the ISO-8601 format of "yyyy-MM-dd" - which is valid for your first example, but not for your second. (There aren't 35 days in December.)
As it's trying to parse multiple formats, it doesn't necessarily make sense to isolate which part is "wrong" - different parts could be invalid for different formats.
How you should tackle this depends on where your data comes from. If it's meant to be machine-readable data, it's best to enforce a culture-invariant format (ideally ISO-8601) rather than using DateTime.Parse: specify the format you expect using DateTime.ParseExact or DateTime.TryParseExact. You still won't get information about which part is wrong, but it's at least easier to reason about. (That will also make it easier to know which calendar system is being used.)
If the data is from a user, ideally you'd present them with some form of calendar UI so they don't have to enter the text at all. At that point, only users who are entering the text directly would produce invalid input, and you may well view it as "okay" for them to get a blanket error message of "this value is invalid".
I don't believe .NET provides any indication of where the value was first seen to be invalid, even for a single value. Noda Time provides more information when parsing via the exception, but not in a machine-readable way. (The exception message provides diagnostic information, but that's probably not something to show the user.)
In short: you probably can't do exactly what you want to do here (without writing your own parser or validator) so it's best to try to work out alternative approaches. Writing a general purpose validator for this would be really hard.
If you only need to handle ISO-8601 dates, it's relatively straightforward:
Split the input by dashes. If there aren't three dashes, it's invalid
Parse each piece of the input as an integer. That can fail (on each piece separately)
Validate that the year is one you're happy to handle
Validate that the month is in the range 1-12
Validate that the day is valid for the month (use DateTime.DaysInMonth to help with that)
Each part of that is reasonably straightforward, but what you do with that information will be specific to your application. We don't really have enough context to help write the code without making significant assumptions.

convert string in unknown format to date in c#

I have searched stackoverflow for an answer but no luck. I am developing a windows application and I have some strings in different date formats,
eg.
dd/MM/yyyy
MM/dd/yyyy
MM-dd-yyyy
dd-MM-yyyy
dd/MM/yyyy hh:mm::ss
MM/dd/yyyy hh:mm::ss
etc...
But I need to convert in to a common format - dd/MM/yyyy. The application can run in any windows machines in different culture.
What is the correct way to do it?
EDIT: One more thing I may not know what the format of incoming string.
Thanks in advance.
Use DateTime.ParseExact with the different patterns as formats.
If after parsing you really need to use a string representation, use the ToString method of the DateTime with the explicit format that you're interested in (so that it is culture-invariant). It's better however to keep the DateTime because this is format-agnostic.
You could distinguish between those formats that use different separators (i.e. "/" vs "-"). But how would you know if date such as 10/11/2010 represents 10th of November or 11th of October? If one number is not bigger than 12, there is no reliable way to do this without knowing an exact format.
As others have pointed out, if you do know the exact format, then you can use DateTime.ParseExact.
If you are processing some import file with a lot of dates in the same unknown format, you could try different formats and hope there is exactly one that doesn't give format errors.
Or to put it another way: split the "dates" into three numbers and check the range of values for each of those numbers. Values > 1900 will be years. If you find values from 1 to 31, those will be days.
Values from 1 to 12 might be months, but could also be days. Try and identify each of the parts.
The best way is to ask the supplier of those dates for the format.
To run this program on different culture, i think you should creat a function to indentify the culture of this string format and then use Datetime.Parse

Change date to Universal Time Format changes my date wrongly I can't get error

I am trying to simply change the date format from the datatable to universal time format but it formats it wrongly as if I have date for August 7 it changed it to August 8 after formatting it to universal date time. My code for formatting date is,
DateVar[runs] = DateTime.Parse(Convert.ToString(output.Tables[0].Rows[runs][0])).ToUniversalTime().ToString();
Don't get in to code its correct and its a part of loop so "run" is loop and output is data set having one table I have first data in table is "Sunday, August 07, 2011 10:52 PM" and it was converted to "8/8/2011 5:52:00 AM" after implementing universal time format.
Hopes for your suggestions
Universal time isn't a format - it's a time zone, effectively. It's not clear what you're trying to do, but converting a "local" DateTime to "universal" DateTime will usually change the time. If you don't want that to happen, don't call ToUniversalTime.
It's a pity that the .NET date/time API isn't as clear as it could be - the DateTime type itself has some horrible ambiguities about it. I'm trying to improve the situation with my Noda Time project, but you will need to understand what time zones are about etc.
Personally I would suggest not using simply DateTime.Parse or just calling ToString unless you're absolutely sure that the default format is what you want. I usually call DateTime.ParseExact and specify the expected format (and usually CultureInfo.InvariantCulture unless it's a user-entered string) - and likewise I provide a format string to the ToString call.
In your code you're simply converting a string to a string - what are you attempting to accomplish? If you're just trying to change the format (e.g. to dd/MM/yyyyTHH:mm:ss) then you don't need to call ToUniversalTime but you do need to provide the format string.
I suggest you split your code out into several statements to help you debug this (and for general code clarity):
Fetch the string from the DataTable, if you really need to (if it's already a DateTime, there's no point in converting it to a string and then back again)
Parse the string (again, assuming you need to)
Perform any conversions you need to
Format the DateTime with an explicit format string
Now if any single operation is causing a problem, you can isolate it more easily.
If I run ToUniversalTime() from Greenwich it will give same time but if i do it while I live some where else it will get an offset date time object of + or - hours depending on position.

How to format dates in format yyddd having no leading zeros

I am dealing with dates coming from the AS/400 that are a form of julian date. January 1st 2000 comes back as a string value of "1". If the date were in true julian form it would look like 2000001. The date 12/31/2049 is comes back from the AS/400 as "49365". Is there a way to format these dates in my C# code to look like standard short dates?
What does January 1, 2001 look like?
If it looks like "1001", you can pad on the left with zeros to 5 digits and then extract the 2-digit year as the first two digits and the day-of-year number as the last 3. It should then be a simple matter to convert the day-of-year number to a month and day; if nothing else you can do it with a bunch of if statements on day ranges.
If it looks like "11" because there are no leading zeros in the day number, you're just out of luck as there is no way to differentiate between many dates, such as January 1, 2001 and January 11, 2000.
P.S. These aren't Julian dates, they're a variation on ordinal dates.
IF your dates are always of the format 'yyddd':
If you can write your SQL statement directly, the following will work...
SELECT CAST('20' || julianDate as date)
FROM table
If you don't, consider writing a view that incorporates that behaviour (that's one of the reasons views exist...).
For obvious reasons, all dates will be considered year 2000 and later...
IF (for whatever reason) it's removing any leading zeros in each part (as pointed out in comments for #Anomie's answer), you are indeed simply toast. Frankly, the entire dataset is probably a loss, as I'm not sure how even RPG would be able to differentiate between certain dates properly at that point.
IBM defines *JUL date format as yy/ddd. It is not commonly used, but is is an available standard format supported on the AS/400. You say you have a string, so the assumption here is that it is stored as CHAR(5), or 5A in DDS.
If you column is called jdt, get the right number of digits in your string, in SQL, with:
RIGHT(('00000' || TRIM(jdt)),5)
Now put the slash in:
INSERT( RIGHT(('00000'||TRIM(jdt)),5) ,3,0,'/')
DB2/400 can cast this to a real date field, but it will only work properly if you can SET OPTION DATFMT=*JUL. How to do this from C# on Windows would depend on how you are connecting.
Let's suppose you can't find the way to set the date format on your connection.
Solution: Create a user defined function [UDF] in DB2.
First, choose an appropriate library to store the function in, and set that as your current library. In OS/400 CL, CHGCURLIB yourlib, or in SQL, SET SCHEMA = yourlib. Thus by default anything you create will go into this library.
I recommend storing the SQL definition for your UDF in a source member. (Ask if unfamiliar) You can execute the source with the RUNSQLSTM command.
Your function definition could look something like this:
CREATE FUNCTION CvtJul2Date( jdtin char(5) ) RETURNS DATE
LANGUAGE SQL
SPECIFIC CVTJUL2DT
DETERMINISTIC NO EXTERNAL ACTION
CONTAINS SQL
RETURNS NULL ON NULL INPUT
SET OPTION DATFMT = *JUL
BEGIN
RETURN(
date( insert( right(('00000'||trim(jdtin)),5) ,3,0,'/') )
);
END
The *JUL option is scoped to the UDF. Any SQL query that runs on the AS/400 should be able to do this conversion, regardless of the DATFMT of the job (assuming you have put this function in a library which is on that job's library list).
Oops... My bad. A method will still probably have to be written.
Based on your description, an increase of 1 is a new day? Looks like you will have to do some math to calculate the date. Maybe create a function like
public DateTime ConvertDate(int julianDate)
{
}
This is untested and may need some changes. But this would be my suggestion.

Strategy for Incomplete Dates

Working on an application where we would like the user to be able to enter incomplete dates.
In some cases there will only be a year - say 1854, or there might be a year and a month, for example March 1983, or there may be a complete date - 11 June 2001.
We'd like a single 'date' attribute/column - and to be able to sort on date.
Any suggestions?
Store the date as an integer -- yyyymmdd.
You can then zero out any month or day component that has not been entered
Year only: 1954 => 19540000
Year & Month: April 2004 => 20040400
January 1st, 2011 => 20110101
Of course I am assuming that you do not need to store any time of day information.
You could then create a struct to encapsulate this logic with useful properties indicating which level of granularity has been set, the relevant System.DateTime, etc
Edit: sorting should then work nicely as well
I can't think of a good way of using a single date field.
A problem you would get if you used January as the default month and 1 as the default day like others have suggested is, what happens when they actually pick January? How would you track if it's a selected January or a defaulted January.
I think you're going to have to store a mask along with the date.
You would only need a bit per part of the date, which would only be 6 bits of data.
M|D|Y|H|Min|S
Month Only 1|0|0|0|0|0 = 32
Year Only 0|0|1|0|0|0 = 8
Month+Year 1|0|1|0|0|0 = 40
AllButMinSec 1|1|1|1|0|0 = 60
You could put this into a Flag Enum to make it easier to use in code.
Well, you could do it via a single column and field that says 'IsDateComplete'.
If you only have the date field, then you'll need to encode the "incompleteness" in the date format itself, such that if the date is, say, < 1900, it's considered "Incomplete".
Personally, I'd go with an field on the side, that marks it as such. Easier to follow, easier to make decisions on, and allows for any dates.
It goes without saying, perhaps, that you can just create a date from DateTime.MinValue and then set what you "know".
Of course, my approach doesn't allow you to "know" what you don't know. (That is, you don't know that they've set the month). You could perhaps use a date-format specifier to mask that, and store it alongside as well, but it's potentially getting cumbersome.
Anyway, some thoughts for you.
One option is to use January as the default month, 1 as the default day, and 1900 or something like that as the default year. Incomplete dates would get padded out with those defaults, and incomplete dates would sort before complete ones in the same year.
Another, slightly more complex option is to use -1 for default day and year, and -1, 'NoMonth', or some such as the default month. Pad incomplete dates as above. This may make sorting a little hard depending on how you do it, but it gives you a way of telling which parts of the date are valid.
I know you'd rather have 1 column but, Instead of a single column one can always have a separate column for day, month and year. Not very difficult to do queries against, and it allways any of the components to be null.
Smehow encoding these states in the datetime itself will be harder to query.
What I did when last solving this problem, was to create a custom date type that kept track of which date parts was actually set and provided conversions to and from a DateTime. For storing in database i used one date field and then one boolean/bit to keep track of which date components that were actually set by the user.

Categories

Resources