Quick question about precision with DateTime.Parse in c#?
I have multiple input files that I am consolidating into one list of lines, which I will then order by DateTime.
This works, except that Parse seems to handle the time incorrectly when the milliseconds part has only 2 digits. In the case below it treats 09:57:44.84 as 09:57:44.840 instead of 09:57:44.084:
List<DateTime> lstUnOrdered = new List<DateTime>();
lstUnOrdered.Add(DateTime.Parse("04/09/2020 09:57:44.573", CultureInfo.InvariantCulture));
lstUnOrdered.Add(DateTime.Parse("04/09/2020 09:57:44.84", CultureInfo.InvariantCulture));
var Ordered = lstUnOrdered.OrderBy(x => x.TimeOfDay);
foreach (var item in Ordered)
{
Console.WriteLine(item.ToString("dd/MM/yyyy HH:mm:ss.fff"));
}
When run, you get the following Output
09/04/2020 09:57:44.573
09/04/2020 09:57:44.840
I Expected this Output
09/04/2020 09:57:44.084
09/04/2020 09:57:44.573
Any suggestions on where I may be going wrong?
Thanks
EDIT:
Based on the comments below, just a few updates:
"Fundamentally: whatever is generating these values is broken, and should be fixed. If that's impossible for you, you should patch the data before parsing it, by inserting extra 0s where necessary. "
-- This is correct, I have no control over the data. But I do know that the order in the input file is :
09/04/2020 09:57:44.84
09/04/2020 09:57:44.573
This tells me the milliseconds should be 084 and not 840. I'm not claiming that Parse is incorrect; I'm just looking for alternatives that parse these dates the way I need, rather than having to write another method to sanitize the date string first.
I can of course split the string on the '.' and add 1 or 2 zeroes if needed, but I was hoping .NET had a built-in way of doing this with DateTime.Parse or an alternative.
Thanks
Just an update: this fixes the issue with the bad input string, in case anyone is wondering:
static void Main(string[] args)
{
List<DateTime> lstUnOrdered = new List<DateTime>();
lstUnOrdered.Add(DateTime.Parse(MakeCorrectDate("04/09/2020 09:57:44.573"),
CultureInfo.InvariantCulture));
lstUnOrdered.Add(DateTime.Parse(MakeCorrectDate("04/09/2020 09:57:44.84"), CultureInfo.InvariantCulture));
var Ordered = lstUnOrdered.OrderBy(x => x.TimeOfDay);
foreach (var item in Ordered)
{
Console.WriteLine(item.ToString("dd/MM/yyyy HH:mm:ss.fff"));
}
}
private static string MakeCorrectDate(string strDate)
{
string[] milli = strDate.Split('.');
return milli[0] + "." + milli[1].PadLeft(3, '0');
}
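A slightly more defensive variant of the same idea, as a minimal sketch. It assumes, as the question does, that a short fraction such as .84 really means 84 ms, and that the rest of the string is always MM/dd/yyyy HH:mm:ss. ParsePadded is a made-up name, not a framework method, and the code is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of .NET): left-pads the fraction to three
// digits, then parses with an explicit format so nothing is left to guessing.
static DateTime ParsePadded(string text)
{
    int dot = text.LastIndexOf('.');
    string padded = dot < 0
        ? text + ".000"   // no fractional part at all
        : text.Substring(0, dot + 1) + text.Substring(dot + 1).PadLeft(3, '0');
    return DateTime.ParseExact(padded, "MM/dd/yyyy HH:mm:ss.fff",
                               CultureInfo.InvariantCulture);
}

DateTime fixedUp = ParsePadded("04/09/2020 09:57:44.84");
Console.WriteLine(fixedUp.ToString("dd/MM/yyyy HH:mm:ss.fff", CultureInfo.InvariantCulture));
// 09/04/2020 09:57:44.084
```

Unlike the Split version, this does not throw an IndexOutOfRangeException when the input has no fractional part.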
Related
I have a timestamp from a server of the form 20220505  17:36:29. It has 2 whitespaces, and I do not trust the sender to always send the same number of whitespaces in future revisions; ideally I would like to handle any number of whitespaces the same way.
I tried this with DateTime.ParseExact but failed:
var horribleTimestamp = "20220505  17:36:29";
var timestamp = DateTime.ParseExact(horribleTimestamp, "yyyyMMdd  hh:mm:ss", CultureInfo.InvariantCulture);
// throws `System.FormatException: String '20220505  17:36:29' was not recognized as a valid DateTime.`
To save myself headaches with time zones later, how can I achieve this with Noda Time? I think it makes sense to switch to it already.
The time is local to my PC, and I would like to convert it to a global timestamp (which I believe should be an Instant?) for a given local time zone.
If you want to handle any amount of whitespace, there are two options:
Use a regular expression (or similar) to get it into a canonical format with a single space
Split on spaces and then parse the first and last parts separately. (Or split on spaces, recombine the first and last parts and parse...)
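The second option can be sketched with the base class library alone. This is only an illustration, assuming the date is always the first token and the time the last one; ParseLoose is a hypothetical name, and the code is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Globalization;

// Hypothetical helper: tolerates any amount of whitespace between date and time.
static DateTime ParseLoose(string text)
{
    // RemoveEmptyEntries drops the empty strings produced by repeated separators.
    string[] parts = text.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
    if (parts.Length < 2)
        throw new FormatException("Expected a date and a time: '" + text + "'");

    // Recombine the first and last parts with exactly one space.
    string canonical = parts[0] + " " + parts[parts.Length - 1];
    return DateTime.ParseExact(canonical, "yyyyMMdd HH:mm:ss", CultureInfo.InvariantCulture);
}

Console.WriteLine(ParseLoose("20220505    17:36:29").ToString("s"));
// 2022-05-05T17:36:29
```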
In Noda Time, the value you've got represents a LocalDateTime, so that's what you should parse it to. Here's a complete example using the regex approach:
using NodaTime;
using NodaTime.Text;
using System.Text.RegularExpressions;
// Lots of spaces just to check the canonicalization
string text = "20220505      17:36:29";
// Replace multiple spaces with a single space.
string canonicalized = Regex.Replace(text, " +", " ");
// Note: patterns are immutable; you should generally store them in
// static readonly fields. Note that "uuuu" represents an absolute year number,
// whereas "yyyy" would be "year of era".
LocalDateTimePattern pattern =
LocalDateTimePattern.CreateWithInvariantCulture("uuuuMMdd HH:mm:ss");
ParseResult<LocalDateTime> result = pattern.Parse(canonicalized);
// Note: if you're happy with an exception being thrown on a parsing failure,
// just use result.Value unconditionally. The approach below shows what to do
// if you want to handle parse failures without throwing an exception (or with
// extra behavior).
if (result.Success)
{
LocalDateTime value = result.Value;
Console.WriteLine(value);
}
else
{
// You can also access an exception with more information
Console.WriteLine("Parsing failed");
}
You can pass multiple formats to ParseExact as an array:
var horribleTimestamp = "20220505  17:36:29";
var formats = new[]{"yyyyMMdd HH:mm:ss","yyyyMMdd  HH:mm:ss","yyyyMMdd   HH:mm:ss"};
var timestamp = DateTime.ParseExact(horribleTimestamp, formats, CultureInfo.InvariantCulture, DateTimeStyles.None);
You have an error in your format: use HH instead of hh. See the updated code below:
var horribleTimestamp = "20220505  17:36:29";
var timestamp = DateTime.ParseExact(horribleTimestamp, "yyyyMMdd  HH:mm:ss", CultureInfo.InvariantCulture);
Here is a link that explains what you can use in a format -> https://www.c-sharpcorner.com/blogs/date-and-time-format-in-c-sharp-programming1
You can solve the problem with the whitespaces in this way:
var horribleTimestamp = "20220505  17:36:29";
var date = horribleTimestamp.Substring(0, 8);
var index = horribleTimestamp.LastIndexOf(' ') + 1;
var time = horribleTimestamp.Substring(index, horribleTimestamp.Length - index);
var timestamp = DateTime.ParseExact($"{date} {time}", "yyyyMMdd HH:mm:ss", CultureInfo.InvariantCulture);
This assumes that the date always has 8 characters and that at least one space is always present. Otherwise, check whether LastIndexOf returned -1 (in which case index is 0).
New C# learner here. I've scanned through many questions that have already been posted here; I'm sorry if I missed a question like this that has already been asked.
Background
A program I use produces Excel files which have names that contain the date in which they are created. Thousands of Excel files are produced which need to be sorted. My mission here is to extract information from these file names so I am able to move the file to its appropriate location upon confirmation. I am working with a program that successfully finds all associated files with a particular string. I have stored the names of these files within an array.
Example File Name: IMPORTANT_NAME_LISTED (TEXT) [xx-xx-xx] [HH_MM].xlsx
What Is Known
The date is stored within "[ ]" in month/day/year format and it is 100% consistent (meaning that every file will have the same format, size and location of the date).
I have been trying to develop a solution which targets "." before the file extension and extract the date, but I am struggling.
My Strategy
I have an initial decision, making sure the array that has all of the file names stored contains values.
//code that extracts file names exists above
//file names which interest me are stored within "fileNameArray"
//Determine if the array that collected file names contains values
if (fileNameArray.Length > 0)
{
for (int k = 0; k < fileNameArray.Length; k++)
{
//Extract date from "[xx-xx-xx] [HH-MM]"
//Transform MM/DD/YY to YY/MM/DD and temporarily store
//Compare each date value that exist within the string
//Target the most recent file - find the array index
//(Ex: 20180831 - today's date)
}
}
My problems stem from properly parsing these individual array items while retaining the array index.
Do any of you recommend a method to use?
LINQ?
Array.FindAll functionality?
I greatly appreciate the help.
-Chris
Edit: Further Information about my situation...
I have a directory of Excel files, which can be in excess of ~1-3k files. I have a program which reads the file names of all of Excel files. A lot of the heavy filtering/sorting takes place before the code I have above which I want to implement.
I have been struggling with solving the issue with respect to handling files with the same name. For example:
I have 4 files that contain the same partial name "DILITHIUM_CRYSTYAL_FUEL_TIME"
My program must be able to filter/search file names through the core name "DILITHIUM_CRYSTYAL_FUEL_TIME". If I have more than one file with the same name, I need to be able to parse the file names in a way which isolates the time stamp within the file name and finds the most recent file.
My files will always show the time stamp, to the left of the file extension, in a 100% consistent manner.
I need to be able to extract this time stamp, and run comparisons against the other files, and isolate the file which is most up-to-date.
LINQ is a good choice for this, combined with Regex for parsing.
var dateRE = new Regex(@"\[(\d\d-\d\d-\d\d)\] \[(\d\d-\d\d)\](?=\.xlsx)", RegexOptions.Compiled);
if (fileNameArray.Length > 0) {
var ans = fileNameArray.Select((n, i) => {
var dtMatch = dateRE.Match(n);
return new { Filename = n, Index = i, Filedate = DateTime.ParseExact(dtMatch.Groups[1].Value+" "+dtMatch.Groups[2].Value, "MM-dd-yy HH-mm", CultureInfo.InvariantCulture) };
})
.OrderByDescending(nid => nid.Filedate)
.First();
}
If you want to process the filenames differently, you can replace First() with some other LINQ operation.
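A variation that skips names without a recognisable timestamp instead of throwing, as a rough sketch. It assumes the time part uses an underscore ("[HH_mm]") as in the example file name from the question; the sample names and the ExtractStamp helper are invented for illustration. It is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the timestamp embedded in the name, or null.
static DateTime? ExtractStamp(string fileName)
{
    Match m = Regex.Match(fileName, @"\[(\d\d-\d\d-\d\d)\] \[(\d\d_\d\d)\]\.xlsx$");
    if (!m.Success) return null;

    string text = m.Groups[1].Value + " " + m.Groups[2].Value;
    return DateTime.TryParseExact(text, "MM-dd-yy HH_mm", CultureInfo.InvariantCulture,
                                  DateTimeStyles.None, out DateTime stamp)
        ? stamp
        : (DateTime?)null;
}

// Invented sample data following the layout described in the question.
string[] fileNameArray =
{
    "DILITHIUM_CRYSTYAL_FUEL_TIME (TEXT) [08-30-18] [14_05].xlsx",
    "DILITHIUM_CRYSTYAL_FUEL_TIME (TEXT) [08-31-18] [09_15].xlsx",
    "not a dated file.xlsx",
};

// Keep the original array index next to each parsed timestamp.
var newest = fileNameArray
    .Select((name, index) => new { Name = name, Index = index, Stamp = ExtractStamp(name) })
    .Where(x => x.Stamp.HasValue)
    .OrderByDescending(x => x.Stamp.Value)
    .FirstOrDefault();

Console.WriteLine(newest == null ? "no dated files" : newest.Index + ": " + newest.Name);
// 1: DILITHIUM_CRYSTYAL_FUEL_TIME (TEXT) [08-31-18] [09_15].xlsx
```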
I would also go for regex, string parsing and linq:
Working example here: https://dotnetfiddle.net/veUq2N
using System;
using System.Linq;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class Program
{
private static Random random = new Random();
private static Regex fileNameFragmentPattern = new Regex(@"\[(.*?)\]\.xlsx");
private const string chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
public static void Main()
{
var fileNames = new List<string>();
// Generate random file names
for (var i = 0; i < 10000; i++) {
fileNames.Add(RandomString(random.Next(8,10)) + "_" + RandomString(random.Next(4,5)) + "_" + "(TEXT) [" + RandomDate().ToString("MM-dd-yyyy") + "].xlsx");
}
// sort files by parsed dates
var dateSortedFileNames = fileNames.OrderByDescending( f => ExtractDate(f));
foreach (var fileName in dateSortedFileNames) {
// you can do anything with sorted files here (or anywhere else below :)
Console.WriteLine(fileName);
}
}
public static DateTime ExtractDate(string fileName) {
var fragment = fileNameFragmentPattern.Match(fileName).Value;
var month = int.Parse(fragment.Substring(1,2));
var day = int.Parse(fragment.Substring(4,2));
var year = int.Parse(fragment.Substring(7,4));
return new DateTime(year, month, day);
}
public static string RandomString(int length)
{
return new string(Enumerable.Repeat(chars, length)
.Select(s => s[random.Next(s.Length)]).ToArray());
}
public static DateTime RandomDate(int min = -9999, int max = 9999)
{
return DateTime.Now.AddDays(random.Next(min,max));
}
}
Here's a non-regex solution.
var files = new List<string>
{
"IMPORTANT_NAME_LISTED (TEXT) [05-26-92].xlsx",
"IMPORTANT_NAME_LISTED (TEXT) [11-02-89].xlsx",
"IMPORTANT_NAME_LISTED (TEXT) [02-21-96].xlsx"
};
foreach (var fileName in files)
{
var nameOnly = Path.GetFileNameWithoutExtension(fileName);
var dateStr = nameOnly.Substring(nameOnly.Length - 9, 8);
if (DateTime.TryParseExact(dateStr, "MM-dd-yy", CultureInfo.InvariantCulture, DateTimeStyles.None, out DateTime date))
Console.WriteLine(date.ToShortDateString());
}
You mention that the 'date' part of the file name is 100% consistent, so we know that the length of your 'date' will always be 8. Using that knowledge,
nameOnly.Substring(nameOnly.Length - 9, 8);
will extract the 8 characters that start right after the [ and end just before the ].
And if you're 100% positive that the file extension will always be .xlsx, then you can shorten the code even more.
foreach (var fileName in files)
{
var dateStr = fileName.Substring(fileName.Length - 14, 8);
if (DateTime.TryParseExact(dateStr, "MM-dd-yy", CultureInfo.InvariantCulture, DateTimeStyles.None, out DateTime date))
Console.WriteLine(date.ToShortDateString());
}
I wanted to post here again, with what I used to solve my programming problem. It has been a busy past week or so, I apologize for the delay.
Here is a snippet from my code that solved my problem beautifully:
string scanToolDateFinalStgT1 = "";
DateTime scanToolDateFinalT1 = new DateTime(2000, 1, 1, 1, 1, 00);
for (int k = 0; k < scanToolT1Pass.Count(); k++)
{
string scanToolPassNameOnly = Path.GetFileNameWithoutExtension(scanToolT1Pass[k].ToString());
string scanToolDateStr = scanToolPassNameOnly.Substring(scanToolPassNameOnly.IndexOf("[") + 1, 8);
string scanToolTimeStr = scanToolPassNameOnly.Substring(scanToolPassNameOnly.LastIndexOf("[") + 1, 5);
DateTime currentScanToolDate = DateTime.ParseExact(scanToolDateStr + " " + scanToolTimeStr, "MM-dd-yy HH_mm", null);
if (currentScanToolDate > scanToolDateFinalT1)
{
scanToolDateFinalT1 = currentScanToolDate;
scanToolDateFinalStgT1 = scanToolT1Pass[k].ToString();
}
}
Information:
This snippet is aimed at targeting '[xx-xx-xx] [xx-xx].', which is a partial unique identifier for a file name.
The program is passing in 'scanToolT1Pass', which is an array of file names. My task is to take this array and parse the file names, finding the most recent one.
'DateTime scanToolDateFinalT1' has a generic date of 1/01/2000, 1:01:00, which is strictly used as a base comparison point. I am certain my data will never require dates before the year 2000. I tried to have a reference date reading all zeros to compare to, but Visual Studio did not approve of that.
Explanation:
Are there more advanced and/or proper methods to parse this data? I'm sure there are. But, for a beginner programmer, this method makes a lot of sense to me and I aim to perfect it in the future. It was more important to me to have a program that functions first than to invest a lot of study into polishing it.
I was able to implement similar for loops throughout my program, filtering through large amounts of data at a very rapid pace.
Thanks again to the community and to @Sach & @It Man, whose responses I was able to craft into my solution.
Chris
Simpler alternative :
var regex = new Regex(@".*\[(.*)-(.*)] \[(.*)].*");
string latest = fileNameArray.OrderBy(s => regex.Replace(s, "$2$1$3")).Last();
Demo and explanation of the pattern can be seen on https://regex101.com/r/Ldh0sa
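To make the trick visible, here is a small sketch that prints the sort key produced for each name. The sample names are invented, and StringComparer.Ordinal is an addition of this sketch (not part of the answer) so that the keys are compared character by character regardless of culture. Note that the key starts with the two-digit year, so it only orders correctly while all files fall within the same century.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var regex = new Regex(@".*\[(.*)-(.*)] \[(.*)].*");

// Invented sample data in the "[MM-dd-yy] [HH_mm]" layout.
string[] fileNameArray =
{
    "NAME (TEXT) [08-31-18] [09_15].xlsx",
    "NAME (TEXT) [01-02-19] [07_00].xlsx",
    "NAME (TEXT) [08-31-18] [14_05].xlsx",
};

// $1 = "MM-dd", $2 = "yy", $3 = "HH_mm", so "$2$1$3" reads year, month-day, time.
foreach (string name in fileNameArray)
    Console.WriteLine(regex.Replace(name, "$2$1$3"));
// 1808-3109_15
// 1901-0207_00
// 1808-3114_05

string latest = fileNameArray
    .OrderBy(s => regex.Replace(s, "$2$1$3"), StringComparer.Ordinal)
    .Last();
Console.WriteLine(latest);
// NAME (TEXT) [01-02-19] [07_00].xlsx
```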
public struct DatedExcelOutput
{
public string FullName { get; }
public string Name { get; }
public DateTime CreationDate { get; }
public DatedExcelOutput(string fileName)
{
FullName = fileName;
Name = getName();
CreationDate = parseDate();
}
}
It could be called like this:
IEnumerable<string> fileNames = GetFiles();
var datedFiles = fileNames.Select(f => new DatedExcelOutput(f))
.OrderBy(d => d.CreationDate);
You'll likely end up needing to sort these ascending/descending in a UI right? So I don't think it makes sense to throw the date information away.
Edit: Removed unnecessary IO calls as NetMage pointed out.
I have a list of dates that arrives like this:
string ActualReleaseDates ="7/8/2016, 7/9/2016, 7/11/2016,7/3/2016,7/10/2016,7/17/2016,7/24/2016,7/31/2016";
string NewsReleasedDate ="07/11/2016";
I want to check whether NewsReleasedDate is contained in ActualReleaseDates,
but the following code returns false.
if (ActualReleaseDates.Split(',').Contains(NewsReleasedDate.TrimStart(new Char[] { '0' })))
{
//some code here
}
The immediate problem is that after splitting your ActualReleaseDates string, there isn't an entry of "7/11/2016"... instead, there's an entry of " 7/11/2016"... note the space.
But more fundamentally, just trimming the start of NewsReleasedDate won't help if the value is something like "07/08/2016"... what you should be doing is handling these values as dates, rather than as strings:
Split ActualReleaseDates by comma, then parse each value (after trimming whitespace) in an appropriate format (which I suspect is M/d/yyyy) so that you get a List<DateTime>.
Parse NewsReleasedDate in the appropriate format, which I suspect is MM/dd/yyyy, so you get a DateTime.
See whether the parsed value from the second step occurs in the list from the first step.
(I'd personally recommend using Noda Time and parsing to LocalDate values, but I'm biased...)
Fundamentally, you're trying to see whether one date occurs in a list of dates... so make sure you get your data into its most appropriate representation as early as possible. Ideally, avoid using strings for this at all... we don't know where your data has come from, but if it started off in another representation and was converted into text, see if you can avoid that conversion.
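The three steps above can be sketched as follows. The formats M/d/yyyy and MM/dd/yyyy are the ones suspected in the answer, not confirmed facts about the data, and the HashSet is just one convenient container for the lookup. The code is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

string ActualReleaseDates = "7/8/2016, 7/9/2016, 7/11/2016,7/3/2016,7/10/2016,7/17/2016,7/24/2016,7/31/2016";
string NewsReleasedDate = "07/11/2016";

// Step 1: split, trim, and parse every entry as a date.
var releaseDates = new HashSet<DateTime>(
    ActualReleaseDates.Split(',')
        .Select(s => DateTime.ParseExact(s.Trim(), "M/d/yyyy", CultureInfo.InvariantCulture)));

// Step 2: parse the value to look for.
DateTime newsReleased = DateTime.ParseExact(NewsReleasedDate, "MM/dd/yyyy", CultureInfo.InvariantCulture);

// Step 3: compare dates, not strings.
bool found = releaseDates.Contains(newsReleased);
Console.WriteLine(found); // True
```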
This is a white space problem. You can use Trim(), and ' 7/11/2016' will become '7/11/2016':
var ActualReleaseDates = "7/8/2016, 7/9/2016, 7/11/2016,7/3/2016,7/10/2016,7/17/2016,7/24/2016,7/31/2016";
var NewsReleasedDate = "07/11/2016";
var splitActualReleaseDates = ActualReleaseDates.Split(',').Select(x => x.Trim());
if (splitActualReleaseDates.Contains(NewsReleasedDate.TrimStart(new Char[] { '0' })))
{
}
You can use LINQ to convert your strings into DateTime objects and compare those instead of the strings:
string ActualReleaseDates ="7/8/2016,7/9/2016,7/11/2016,7/3/2016,7/10/2016,7/17/2016,7/24/2016,7/31/2016";
string NewsReleasedDate ="07/11/2016";
var releaseDates = ActualReleaseDates.Split(',').Select(x => DateTime.Parse(x));
var newsReleased = DateTime.Parse(NewsReleasedDate);
if (releaseDates.Contains(newsReleased))
{
//some code here
}
Please note that DateTime.Parse interprets the string according to the current culture. You can use DateTime.ParseExact if you want to specify the exact date format.
You can parse to DateTime before doing the query, like this:
(I think this is the most accurate and guaranteed way to compare dates)
Func<string, DateTime> stringToDate = s => DateTime.ParseExact(s.Trim(), "M/d/yyyy",
CultureInfo.InvariantCulture);
DateTime newReleaseDateTime = stringToDate(NewsReleasedDate);
bool result = ActualReleaseDates.Split(',').Select(x => stringToDate(x))
.Contains(newReleaseDateTime);
It returns false because the date 07/11/2016 in NewsReleasedDate is stored as a string with a '0' at the beginning, and because the ActualReleaseDates string has white space between some of the ',' separators and the numbers.
Try to rewrite these strings like this:
ActualReleaseDates ="7/8/2016,7/9/2016,7/11/2016,7/3/2016,7/10/2016,7/17/2016,7/24/2016,7/31/2016"; // white spaces removed.
and the variable like this :
NewsReleasedDate ="7/11/2016"; // 0 removed
This is my code example :
string ActualReleaseDates = "7/8/2016,7/9/2016,7/11/2016,7/3/2016,7/10/2016,7/17/2016,7/24/2016,7/31/2016";
string NewsReleasedDate = "7/11/2016";
string[] dates = ActualReleaseDates.Split(',');
Console.WriteLine(dates.Contains(NewsReleasedDate));
This is not the best way to compare dates; you can parse the values into DateTime instead, which is well suited to this kind of comparison.
I am trying to query MongoDB using the following -
List<BsonDocument> list = NoSqlBusinessEntityBase.LoadByWhereClause("peoplecounting",
string.Concat("{siteid:\"", siteid, "\", locationid:\"", location._id ,"\",
StartTime: {$gte:ISODate(\"",old.ToString("yyyy-mm-dd hh:mm:ss"),"\")},
EndTime: {$lte:ISODate(\"",current.ToString("yyyy-MM-dd hh:mm:ss"), "\"\")}}"));
The LoadByWhereClause() function is as follows -
public static List<BsonDocument> LoadDataByWhere(string table, string whereClause)
{
var collection = db.GetCollection(table);
QueryDocument whereDoc = new QueryDocument(BsonDocument.Parse(whereClause));
var resultSet = collection.Find(whereDoc);
List<BsonDocument> docs = resultSet.ToList();
if (resultSet.Count() > 0)
{
foreach (BsonDocument doc in docs)
{
doc.Set("_id", doc.GetElement("_id").ToString().Split('=')[1]);
}
return docs;
}
else
{
return null;
}
}
Even though the query runs fine in MongoDB console and returns documents
db.peoplecounting.find({siteid:"53f62abf66455068373665ff", locationid:"53f62bb86645506837366603",
StartTime:{$gte:ISODate("2012-12-03 02:40:00")}, EndTime:{$lte:ISODate("2013-12-03 07:40:00")}})
I get an error when I try to load the data in C# using the LoadByWhereClause function. The error, raised while parsing the whereClause, is: String was not recognized as a valid DateTime.
How can I possibly fix this? I am unable to determine what is going wrong here.
It's not entirely clear, but I suspect the problem may well be how you're formatting the date. This:
old.ToString("yyyy-mm-dd hh:mm:ss")
should almost certainly be this:
old.ToString("yyyy-MM-dd HH:mm:ss")
or possibly
old.ToString("yyyy-MM-dd'T'HH:mm:ss")
Because:
mm means minutes. You don't want the minutes value between your year and day-of-month; you want the month (MM)
hh means "hour of half-day" (i.e. 1-12). You want the hour of full day, 0-23 (HH)
ISO-8601 uses a T literal to separate the date from the time.
I note that your current.ToString is better, but not correct - it gets the month right, but not the hour. The fact that these are inconsistent is a problem to start with - I would advise you to write a separate method to convert a DateTime appropriately.
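Such a method could look like the sketch below. ToIsoText is a hypothetical name; the format string is the third variant suggested above, and passing the invariant culture is an addition here so that the separators do not depend on the machine's regional settings. It is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Globalization;

// Hypothetical helper: one place that decides how a DateTime becomes query text.
static string ToIsoText(DateTime value) =>
    value.ToString("yyyy-MM-dd'T'HH:mm:ss", CultureInfo.InvariantCulture);

DateTime old = new DateTime(2012, 12, 3, 14, 40, 0);
Console.WriteLine(ToIsoText(old)); // 2012-12-03T14:40:00
```

Both old and current would then go through the same helper, so the two format strings can no longer drift apart.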
I have a C# list collection that I'm trying to sort. The strings that I'm trying to sort are dates "10/19/2009","10/20/2009"...etc. The sort method on my list will sort the dates but the problem is when a day has one digit, like "10/2/2009". When this happens the order is off. It will go "10/19/2009","10/20/2009","11/10/2009","11/2/2009","11/21/2009"..etc. This is ordering them wrong because it sees the two as greater than the 1 in 10. How can I correct this?
thanks
The problem is they're strings, but you want to sort them by dates. Use a comparison function that converts them to dates before comparing. Something like this:
List<string> strings = new List<string>();
// TODO: fill the list
strings.Sort((x, y) => DateTime.Parse(x).CompareTo(DateTime.Parse(y)));
Assuming all your strings will parse:
MyList.OrderBy(d => DateTime.Parse(d));
Otherwise, you might need to use ParseExact() or something a little more complicated.
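For example, a ParseExact version might look like this. It is a sketch that assumes every string is in month/day/year order with a four-digit year; the sample values are taken from the question. It is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

var MyList = new List<string>
{
    "10/19/2009", "10/20/2009", "11/10/2009", "11/2/2009", "11/21/2009", "10/2/2009"
};

// "M/d/yyyy" accepts one- or two-digit months and days; the invariant culture
// keeps the result independent of the machine's regional settings.
List<string> sorted = MyList
    .OrderBy(d => DateTime.ParseExact(d, "M/d/yyyy", CultureInfo.InvariantCulture))
    .ToList();

Console.WriteLine(string.Join(", ", sorted));
// 10/2/2009, 10/19/2009, 10/20/2009, 11/2/2009, 11/10/2009, 11/21/2009
```

OrderBy returns a new sequence rather than sorting MyList in place, so the result has to be captured.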
Write a comparison method that converts strings such as "10/2/2009" to dates and then compares them.
I wanted to see how well I could outperform Chris's solution with my own IComparer. The difference was negligible. To sort the same list of one million dates, my solution took 63.2 seconds, and Chris's took 66.2 seconds.
/// <summary>
/// Date strings must be in the format [M]M/[D]D/YYYY
/// </summary>
class DateStringComparer : IComparer<string>
{
private static char[] slash = { '/' };
public int Compare(string Date1, string Date2)
{
// get date component strings
string[] strings1 = Date1.Split(slash);
string[] strings2 = Date2.Split(slash);
// get date component numbers
int[] values1 = { Convert.ToInt32(strings1[0]),
Convert.ToInt32(strings1[1]),
Convert.ToInt32(strings1[2]) };
int[] values2 = { Convert.ToInt32(strings2[0]),
Convert.ToInt32(strings2[1]),
Convert.ToInt32(strings2[2]) };
// compare year, month, day
if (values1[2] == values2[2])
if (values1[0] == values2[0])
return values1[1].CompareTo(values2[1]);
else
return values1[0].CompareTo(values2[0]);
else
return values1[2].CompareTo(values2[2]);
}
}
As for sorting the dates as pre-existing DateTime instances, that took 252 milliseconds.
You need to either use a sort specific for dates, or use something like Natural Sort.
Parse the strings to DateTime objects and use DateTime.Compare.
Chris beat me to it!
If you care about performance and if that is possible for you, you would preferably sort your dates before you generate the strings. You would then use the date objects directly for the sort.
You would then save time manipulating strings back and forth.