I have a series of files that I am attempting to parse the date out of the file name. Here is an example of the files that I am currently trying to parse:
AC SCR063_6.8.15.xlsx
AC SCR064_6.22.15_REVISED.xlsx
AccentCare July 2015 Rent Report 06.26.15 Final.xlsx
AccentCare June 2015 Rent Report 05.26.15 Final.xlsx
In these files, the date will most likely always be in a format of dd.mm.yy or dd.mm.yyyy. I've tried to devise a regex expression to match these dates within the string and I've gotten as far as:
^(\d{1,2})\.(\d{1,2})\.(\d{2,4})$
But due to the variability in the file names and my limited knowledge of regex, I am not sure what else I need to do to get this regex to match all of these file name cases. Do I need an optional capture group before the date portion to match anything preceding it, and another optional capture group after it to exclude the Final.xlsx or _REVISED.xlsx etc.?
EDIT: I should also note that these file names would also have the preceding path information within the string I would be evaluating, although I am sure I could get the bare file name another way if that would make the string easier to evaluate.
EDIT 2: Desired output would be 6.8.15 or 06.26.15 etc, just the date portion that is in dd.mm.yy format. That way I could cast it to a date time within my application.
So the allowed formats are M.d.yyyy and M.d.yy (not dd.mm.yyyy as stated). I would use DateTime.TryParseExact, for example with this LINQ query:
var fileNames = new string[] { "AC SCR063_6.8.15.xlsx", "AC SCR064_6.22.15_REVISED.xlsx", "AccentCare July 2015 Rent Report 06.26.15 Final.xlsx", "AccentCare June 2015 Rent Report 05.26.15 Final.xlsx" };
string[] allowedFormats = { "M.d.yyyy", "M.d.yy" };
DateTime[] dates = fileNames
.Select(fn => Path.GetFileNameWithoutExtension(fn).Split(' ', '_'))
.Select(arr => arr.Select(s => s.TryGetDateTime(null, allowedFormats))
.FirstOrDefault(dt => dt.HasValue))
.Where(nullableDate => nullableDate.HasValue)
.Select(nullableDate => nullableDate.Value)
.ToArray();
which uses this handy extension method to parse strings to DateTime?:
public static DateTime? TryGetDateTime(this string item, DateTimeFormatInfo dfi, params string[] allowedFormats)
{
if (dfi == null) dfi = DateTimeFormatInfo.InvariantInfo;
DateTime dt;
bool success = DateTime.TryParseExact(item, allowedFormats, dfi, DateTimeStyles.None, out dt);
if (success) return dt;
return null;
}
Result is:
08.06.2015 00:00:00 System.DateTime
22.06.2015 00:00:00 System.DateTime
26.06.2015 00:00:00 System.DateTime
26.05.2015 00:00:00 System.DateTime
That looks roughly correct, but your regex has a start-of-line and an end-of-line check (the ^ at the start and the $ at the end), so it only matches when the whole string is the date.
Try this: (\d{1,2})\.(\d{1,2})\.(\d{2,4})
This works with your examples :
[a-zA-Z\d\s]+(?:_|\s)(\d{1,2}\.\d{1,2}\.\d{2,4})
Demo here : https://regex101.com/r/hA6dQ3/1
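The regex and TryParseExact suggestions above can be combined. The following is only a sketch: the class and method names (FileNameDates, ExtractDate) are made up for illustration, and it assumes the date is always month.day.year with a two- or four-digit year, as in the sample file names.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class FileNameDates
{
    // digits.digits.digits with a 2- or 4-digit year. There are no ^/$ anchors,
    // so the date may sit anywhere in the name; the lookarounds keep the match
    // from starting or ending in the middle of a longer run of digits.
    private static readonly Regex DatePattern =
        new Regex(@"(?<![\d.])\d{1,2}\.\d{1,2}\.(?:\d{4}|\d{2})(?!\d)");

    // Assumption: month comes first, as in the sample file names.
    private static readonly string[] AllowedFormats = { "M.d.yyyy", "M.d.yy" };

    // Returns the first candidate in the file name that is a real date, or null.
    public static DateTime? ExtractDate(string path)
    {
        // Drop the directory part and the extension before searching.
        string name = Path.GetFileNameWithoutExtension(path);
        foreach (Match match in DatePattern.Matches(name))
        {
            DateTime parsed;
            if (DateTime.TryParseExact(match.Value, AllowedFormats,
                    CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed))
            {
                return parsed;
            }
        }
        return null;
    }
}
```

Calling FileNameDates.ExtractDate on each path gives a DateTime? that is null when no valid date is found, so the regex only has to locate candidates and TryParseExact decides whether they are real dates.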
Related
I have a string like this:
30/04/2018 o/p=300418
01/03/2017 o/p=010317
10/11/2018 o/p=101118
12/11/2123 o/p=121123
1/1/2018 o/p =010118
Code I tried, which can't handle the last one (1/1/2018):
string a = "31/04/2018";
string b = a.Replace("/","");
b = b.Remove(4, 2);
You should parse to a DateTime and then use the ToString to go back to a string. The following works with your given input.
var dateStrings = new []{"30/04/2018", "01/03/2017","10/11/2018","12/11/2123","1/1/2018"};
foreach(var ds in dateStrings)
{
Console.WriteLine(DateTime.ParseExact(ds, "d/M/yyyy", System.Globalization.CultureInfo.InvariantCulture).ToString("ddMMyy"));
}
The only change I made is to the first date, as 31/04/2018 is not a valid date (April has 30 days, not 31). If that is going to be a problem, use TryParseExact instead; I assumed your example was faulty, not your actual data.
Your structure varies: all but the last example use a two-digit day and month, while the last uses a single digit for each. Your current code replaces each slash with an empty string, but Remove(4, 2) only strips the century digits when day and month are both two digits, so the output deviates for 1/1/2018.
The simplest approach would be:
var date = DateTime.Parse("...");
var filter = $"o/p = {date:ddMMyy}";
Obviously you may have to validate and ensure accuracy of your date conversion, but I don't know how your applications works.
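As a sketch of that validation step, the parse and the formatting can be wrapped in one helper. The names DateReformatter and ToDdMMyy are hypothetical, and the input is assumed to be day/month/year as in the question's samples.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of any library.
public static class DateReformatter
{
    // Parses day/month/year (one or two digits each for day and month) and
    // formats it as ddMMyy. Returns null when the text is not a real date.
    public static string ToDdMMyy(string input)
    {
        DateTime parsed;
        bool ok = DateTime.TryParseExact(input, "d/M/yyyy",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
        return ok ? parsed.ToString("ddMMyy", CultureInfo.InvariantCulture) : null;
    }
}
```

With this, "1/1/2018" becomes "010118", while an impossible date such as "31/04/2018" yields null instead of an exception.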
If you can reasonably expect that the passed in dates are actual dates (hint: there are only 30 days in April) you should make a function that parses the string into DateTimes, then uses string formats to get the output how you want:
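public static string ToDateTimeFormat(string input)
{
    DateTime output;
    // TryParse uses the current culture, so it must be one that reads dd/MM/yyyy
    if (DateTime.TryParse(input, out output))
    {
        return output.ToString("ddMMyy"); // day first, matching the expected o/p
    }
    return input; // parse fails, return original input
}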
My example will still take "bad" dates, but it will not throw an exception like some of the other answers given here (TryParse() vs Parse()).
There is obviously a small bit of overhead with parsing, but it's negligible compared to all the logic you would need to get the string manipulation right.
Parse the string as DateTime. Then run ToString with the format you desire.
var a = "1/1/2018";
var date = DateTime.Parse(a);
var result = date.ToString("ddMMyy");
You can use ParseExact to parse the input, then use ToString to format the output.
For example:
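private static void Main()
{
    var testData = new List<string>
    {
        "30/04/2018", // the question's "31/04/2018" is not a valid date and would throw
        "01/03/2017",
        "10/11/2018",
        "12/11/2123",
        "1/1/2018",
    };
    foreach (var data in testData)
    {
        // "M" is the month; a lowercase "m" would mean minutes
        Console.WriteLine(DateTime.ParseExact(data, "d/M/yyyy", System.Globalization.CultureInfo.InvariantCulture).ToString("ddMMyy"));
    }
    Console.WriteLine("\nDone! Press any key to exit...");
    Console.ReadKey();
}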
You didn't specify whether these are DateTime values or just strings that look like date time values. I'll assume these are DateTime values.
Convert the string to a DateTime. Then use a string formatter. It's important to specify the culture. In this case, dd/MM/yyyy is the common format in the UK.
var culture = new CultureInfo("en-GB");//UK uses the datetime format dd/MM/yyyy
var dates = new List<string>{"30/04/2018", "01/03/2017","10/11/2018","12/11/2123","1/1/2018"};
foreach (var date in dates)
{
//TODO: Do something with these values
DateTime.Parse(date, culture).ToString("ddMMyyyy");
}
Otherwise, running DateTime.Parse on a machine with a different culture could result in a FormatException. See "Parsing dates and times in .NET".
I need to do a simple parse to strip out the actual word March (or whatever month it is) from data saved to a string like this: "March 03/12/2016".
The end result needs to be a string such as: "03/12/2016".
I have been looking through date time formatters and I am not finding a simple method to strip out a month. I was thinking of just cutting the string down to count 11 characters from right to left and then just trimming out the rest but I feel like that is sloppy and there is probably a date format option out there that I'm just not finding.
Any Suggestions?
Just do this:
string input = "March 03/12/2016";
string output = input.Substring(input.IndexOf(' ') + 1);
Another approach:
string result = input.Split(' ')[1];
string input = "March 03/12/2016";
string output = input; // fall back to the original text if there is no space
int index = input.IndexOf(' ');
if (index >= 0) // checks that a space exists
{
    output = input.Substring(index + 1);
}
Check first whether there will always be a space. IndexOf returns -1 when the character is not found rather than raising an error, so Substring(IndexOf(' ') + 1) silently returns the whole input, and Split(' ')[1] throws an IndexOutOfRangeException.
Assuming that month name is proper English name for the month, you can use MMMM to extract month name. Then, you can just format the date however you wish.
var date = "March 03/12/2016";
var parsedDate = DateTime.ParseExact(date, "MMMM MM/dd/yyyy", new CultureInfo("en-US"));
Console.WriteLine(parsedDate.ToString("MM/dd/yyyy"));
See in dotnetfiddle.net.
Bear in mind that if the month name and the numeric month disagree, e.g. October 03/12/2016, an exception will be thrown.
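If an exception is not wanted for mismatched input, the same format string can be used with TryParseExact. This is only a sketch: MonthPrefix and StripMonthName are hypothetical names, and InvariantCulture is used here in place of en-US because it also has English month names.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of any library.
public static class MonthPrefix
{
    // Expects "<English month name> MM/dd/yyyy" and returns only "MM/dd/yyyy".
    // Returns null when the text does not match the format or when the month
    // name and the numeric month disagree.
    public static string StripMonthName(string input)
    {
        DateTime parsed;
        bool ok = DateTime.TryParseExact(input, "MMMM MM/dd/yyyy",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
        return ok ? parsed.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture) : null;
    }
}
```

Compared with Substring or Split, this also validates that the remainder really is a date, at the cost of rejecting inputs the plain string operations would let through.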
I'm testing a piece of code to see if the rules will work each time, so I just made a short console application that has 1 string as an input value which I can replace at any time.
string titleID = "document for the period ended 31 March 2014";
// the other variation in the input will be "document for the period
// ended March 31 2014"
What I'm doing is I take a specific part from it (depending if it contains a specific word - nvm the details, there is a consistency so I don't worry about this condition). Afterwards I'm taking the rest of the string after a specific position in order to do a DateTime.ParseExact
Ultimately I need to figure out how to check whether the first DateTime.ParseExact has failed, so that I can then make a second attempt with a different custom format.
This is what it looks like:
if(titleID.Contains("ended "))
{
// take the fragment of the input string after the word "ended"
TakeEndPeriod = titleID.Substring(titleID.LastIndexOf("ended "));
// get everything after the 6th char -> should be "31 March 2014"
GetEndPeriod = TakeEndPeriod.Substring(6);
format2 = "dd MMMM yyyy"; // custom date format which is mirroring
// the date from the input string
// parse the date in order to convert it to the required output format
try
{
DateTime ModEndPeriod = DateTime.ParseExact(GetEndPeriod, format2, System.Globalization.CultureInfo.InvariantCulture);
NewEndPeriod = ModEndPeriod.ToString("yyyy-MM-ddT00:00:00Z");
// required target output format of the date
// and it also needs to be a string
}
catch
{
}
}
// show output step by step
Console.WriteLine(TakeEndPeriod);
Console.ReadLine();
Console.WriteLine(GetEndPeriod);
Console.ReadLine();
Console.WriteLine(NewEndPeriod);
Console.ReadLine();
Everything works fine until I try a different input string, e.g. "document for the period ended March 31 2014".
So in this case if wanted to parse "March 31 2014" I'd have to switch my custom format to
"MMMM dd yyyy" and I do that and it works, but I cannot figure out how to check if the first parse fails in order to perform the second one.
First parse - > success -> change format and .ToString
|-> check if failed , if true do second parse with different format -> change format and .ToString
I've tried
if (String.IsNullOrEmpty(NewEndPeriod))
{ do second parse }
Or
if (NewEndPeriod == null)
{ do second parse }
But I get a blank result at Console.WriteLine(NewEndPeriod);
Any ideas how to approach this?
** EDIT: **
Adding here an alternative answer I got which is using Parse instead of TryParseExact, since Parse will handle both of the format variations without the need to specify them
DateTime DT = DateTime.Parse(GetEndPeriod);
//The variable DT will now have the parsed date.
NewEndPeriod = DT.ToString("yyyy-MM-ddT00:00:00Z");
but I cannot figure out how to check if the first parse fails in order
to perform the second one
Instead of DateTime.ParseExact use DateTime.TryParseExact that will return a bool indicating if parsing was successful or not.
DateTime ModEndPeriod;
if (!DateTime.TryParseExact(GetEndPeriod,
format,
CultureInfo.InvariantCulture,
DateTimeStyles.None,
out ModEndPeriod))
{
//parsing failed
}
You can also use multiple formats for parsing using the DateTime.TryParseExact overload which takes an array of formats:
string[] formatArray = new [] { "dd MMMM yyyy","MMMM dd yyyy"};
DateTime ModEndPeriod;
if (!DateTime.TryParseExact(GetEndPeriod,
formatArray,
CultureInfo.InvariantCulture,
DateTimeStyles.None,
out ModEndPeriod))
{
//parsing failed
}
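Putting the pieces together, the whole flow from the question might look like the sketch below. PeriodEndParser and ToIsoMidnight are hypothetical names; the "ended " marker, the two input formats and the target output format are taken from the question.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of any library.
public static class PeriodEndParser
{
    // Both date layouts mentioned in the question; TryParseExact tries each in turn.
    private static readonly string[] Formats = { "dd MMMM yyyy", "MMMM dd yyyy" };

    // Returns the date after the last "ended " as yyyy-MM-ddT00:00:00Z,
    // or null when the marker is missing or the date matches neither format.
    public static string ToIsoMidnight(string title)
    {
        const string marker = "ended ";
        int start = title.LastIndexOf(marker, StringComparison.Ordinal);
        if (start < 0)
        {
            return null;
        }

        string candidate = title.Substring(start + marker.Length).Trim();
        DateTime parsed;
        if (!DateTime.TryParseExact(candidate, Formats,
                CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed))
        {
            return null; // neither format matched
        }

        // T and Z are quoted so they are always emitted as literal characters.
        return parsed.ToString("yyyy-MM-dd'T'00:00:00'Z'", CultureInfo.InvariantCulture);
    }
}
```

Because the format array is tried in order, no explicit "did the first parse fail" check is needed; a null result means both formats failed.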
I am trying to find the best possible way to extract a Date and Time string that is stored in a very very strange format out of a file name (string) that was retrieved from an FTP file listing.
The string is as follows:
-rwxr-xr-x 1 ftp ftp 267662 Jun 06 09:13 VendorInventory_20130606_021303.txt\r
The specific data I am trying to extract is 20130606_021303. 021303 is formatted as hours, seconds and milliseconds. DateTime.Parse and DateTime.ParseExact are not willing to cooperate. Any idea on how to get this up and running?
Looks like you've got the entire row of the file listing, including permissions, user, owner, file size, timestamp and filename.
The data you're asking for appears to be just part of the filename. Use some basic string manipulation (Split, Substring, etc...) first. Then when you have just the datetime portion, you can then call DateTime.ParseExact.
Give it a try yourself first. If you run into problems, update your question to show the code you are attempting, and someone will help you further.
...
Oh, fine. What the heck. I'm feeling generous. Here's a one-liner:
string s = "-rwxr-xr-x 1 ftp ftp 267662 Jun 06 09:13 VendorInventory_20130606_021303.txt\r"; // your string as in the question
DateTime dt = DateTime.ParseExact(string.Join(" ", s.Split('_', '.'), 1, 2),
"yyyyMMdd HHmmss", null);
But please, next time, try something on your own first.
UPDATE I assume there is a fixed structure to the file display of the FTP listing, so you could simply use String.Substring to extract the datetime string, and then parse with DateTime.ParseExact:
var s = "-rwxr-xr-x 1 ftp ftp 267662 Jun 06 09:13 VendorInventory_20130606_021303.txt\r";
// offset 72 assumes the fixed column layout of the raw listing; adjust it if the spacing differs
var datetime = DateTime.ParseExact(s.Substring(72,15),"yyyyMMdd_HHmmss",null);
Original Answer
Use a regular expression. Try the following:
var s = "-rwxr-xr-x 1 ftp ftp 267662 Jun 06 09:13 VendorInventory_20130606_021303.txt\r";
/*
The following pattern means:
(\d{8}) 8 digits (\d), captured in a group (the parentheses) for later reference
_ an underscore
(\d{6}) 6 digits in a group
\. a period. The backslash is needed because . has special meaning in regular expressions
.* any character (.), any number of times (*)
\r carriage return
$ the end of the string
*/
var pattern = @"(\d{8})_(\d{6})\..*\r$";
var match = Regex.Match(s, pattern);
string dateString = match.Groups[1].Value;
string timeString = match.Groups[2].Value;
and parse using ParseExact:
var datetime = DateTime.ParseExact(dateString + timeString,"yyyyMMddHHmmss",null);
This might work:
string s = "-rwxr-xr-x 1 ftp ftp 267662 Jun 06 09:13 VendorInventory_20130606_021303.txt\r";
// you might need to adjust the IndexOf method a bit - if the filename/string ever changes...
// or use a regex to check if there's a date in the given string
// however - the first thing to do is extract the dateTimeString:
string dateTimeString = s.Substring(s.IndexOf("_") + 1, 15);
// and now extract the DateTime (you could also use DateTime.TryParseExact)
// this should save you the trouble of substringing and parsing loads of ints manually :)
DateTime dt = DateTime.ParseExact(dateTimeString, "yyyyMMdd_HHmmss", null); // HH is the 24-hour clock; hh would reject hours above 12
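A variant that does not depend on a fixed offset or on the first underscore is sketched below. FtpListingParser and ExtractTimestamp are hypothetical names, and it assumes the stamp is always 8 digits, an underscore and 6 digits (read as hours, minutes and seconds, as the answers above do) sitting between an underscore and the extension dot.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class FtpListingParser
{
    // _yyyyMMdd_HHmmss followed by the dot of the file extension.
    private static readonly Regex Stamp = new Regex(@"_(\d{8}_\d{6})\.");

    // Returns the timestamp embedded in the file name, or null when none is found.
    public static DateTime? ExtractTimestamp(string listingLine)
    {
        Match match = Stamp.Match(listingLine);
        if (!match.Success)
        {
            return null;
        }

        DateTime parsed;
        bool ok = DateTime.TryParseExact(match.Groups[1].Value, "yyyyMMdd_HHmmss",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
        return ok ? parsed : (DateTime?)null;
    }
}
```

The regex only locates the candidate; TryParseExact still rejects digit runs that are not a valid date and time.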
I have a product list, and every product has a creation date of type DateTime. I want to select the products created after a date that I enter as a string.
I enter EnteredDate as a string in this format: 05/16/2012
1. var dates = from d in Products
2. where d.CreateDate >= DateTime.ParseExact( EnteredDate, "mm/dd/yy", null )
3. select d;
On the second line I get the error "String was not recognized as a valid DateTime" for "mm/dd/yy".
I also tried DateTime.Parse(), Convert.ToDateTime() and got same error.
How can I filter this product list by create date?
"mm" is minutes, and your year is 4 digits, not 2. You want "MM/dd/yyyy", if your format is really always that. How confident are you on that front? (In particular, if it's entered by a user, you should probably make your code culture-sensitive...)
I would suggest pulling the parsing part out of the query though, and also probably using the invariant culture for parsing if you've really got a fixed format:
DateTime date = DateTime.ParseExact(EnteredDate, "MM/dd/yyyy",
CultureInfo.InvariantCulture);
var dates = Products.Where(d => d.CreateDate >= date);
Call
DateTime.ParseExact(EnteredDate, "MM/dd/yyyy", CultureInfo.InvariantCulture);
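For completeness, here is a small self-contained sketch of parsing once and then filtering. The Product class (apart from CreateDate, which comes from the question), the Name property and the ProductFilter helper are all hypothetical, and the query runs over an in-memory list.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical product type; only CreateDate comes from the question.
public class Product
{
    public string Name { get; set; }
    public DateTime CreateDate { get; set; }
}

// Hypothetical helper, not part of any library.
public static class ProductFilter
{
    // Parses the entered date once, outside the query, then keeps the
    // products whose CreateDate is on or after it.
    public static List<Product> CreatedOnOrAfter(IEnumerable<Product> products, string enteredDate)
    {
        DateTime cutoff;
        if (!DateTime.TryParseExact(enteredDate, "MM/dd/yyyy",
                CultureInfo.InvariantCulture, DateTimeStyles.None, out cutoff))
        {
            throw new FormatException("Expected a date in MM/dd/yyyy format: " + enteredDate);
        }

        return products.Where(p => p.CreateDate >= cutoff).ToList();
    }
}
```

Parsing before the Where call means the string is converted once rather than once per product, and a badly formatted input fails before any filtering starts.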