Regex and date matching - c#

I want to search for all possible dates in a string using Regex.
In my code I have this:
String dateSearchPattern = @"(?<Day>\d{2}).(?<Month>\d{2}).(?<Year>\d{4})|(?<Day>\d{2}).(?<Month>\d{2}).(?<Year>\d{2})";
// date format: dd.mm.yyyy or d.m.yyyy or dd.mm.yy or d.m.yy
String searchText = "20.03.2010.25.03.10";
Regex.Matches(searchText, dateSearchPattern); // the matching SHOULD give a count of 2
The above code gives only 1 match where it should give 2. I also need a pattern for dates formatted like d.m.yyyy or d.m.yy.

The pattern seems perfectly OK; it gives two matches. By any chance, did you use the following lines to check the count?
var match = Regex.Matches(searchText, dateSearchPattern);
Console.WriteLine(match.Count);
I used SD 3 on .NET 3.5 (without SP1), and your code gives the result you want.

You can change your pattern to this:
"(?<Day>\d{1,2}).(?<Month>\d{1,2}).(?:(?<Year>\d{4})|(?<Year>\d{2}))"

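A minimal sketch of how the single-digit pattern could be checked against the sample text. Note two assumptions that are not in the original pattern: the dots are escaped as `\.` so they match only a literal dot (an unescaped `.` matches any character), and the two year alternatives are folded into one group. The class name is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

class DateMatchDemo
{
    static void Main()
    {
        // Assumption: dots are escaped so that only a literal '.' separates the parts.
        string dateSearchPattern = @"(?<Day>\d{1,2})\.(?<Month>\d{1,2})\.(?<Year>\d{4}|\d{2})";
        string searchText = "20.03.2010.25.03.10";

        MatchCollection matches = Regex.Matches(searchText, dateSearchPattern);
        Console.WriteLine(matches.Count); // 2

        foreach (Match m in matches)
        {
            // Named groups give access to the individual parts of each date.
            Console.WriteLine("{0} / {1} / {2}",
                m.Groups["Day"].Value, m.Groups["Month"].Value, m.Groups["Year"].Value);
        }
    }
}
```

Trying the four-digit year first matters: with `\d{2}|\d{4}` the year of "20.03.2010" would be captured as "20".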
Related

Wrong DateTime format

When I run the following code in a DotNetFiddle, I get the output I expect:
var myDate = new DateTime(2019, 6, 1);
Console.WriteLine(myDate.ToString("MM/dd/yy"));
// Output is: 06/01/19
But when I run the exact same code in a brand new C# Console application (.NET Framework 4.5.2) or in the C# Interactive window, I get this output:
06-01-19
Why do the Console app and the C# Interactive window replace / with - in the output? Based on this answer and the documentation, I would expect the date to be delimited with /, not -. This example from Microsoft's documentation shows the output containing /:
Console.WriteLine("The current date and time: {0:MM/dd/yy H:mm:ss zzz}",
thisDate2);
// The example displays the following output:
// The current date and time: 06/10/11 15:24:16 +00:00
When working with datetime format strings, the / character is special, just like d, M, or y. It means use the system-defined date separator character.
So the system date separator at DotNetFiddle is /, but the separator on your system is -.
If you really always need the / character, this excerpt from the linked documentation will help:
To change the date separator for a particular date and time string, specify the separator character within a literal string delimiter. For example, the custom format string mm'/'dd'/'yyyy produces a result string in which "/" is always used as the date separator.
Be careful with this. Overriding the system's and the user's choices should not be done lightly. For example, I sometimes see people want to do this in order to format a date for use in an SQL command, and that is never okay; if you're formatting dates as strings for SQL, rather than using query parameters, you're doing something very wrong.
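As a rough illustration of the difference, the sketch below formats the same date three ways. The class name is hypothetical; the first line's output depends on the machine's current culture, so no output is claimed for it.

```csharp
using System;
using System.Globalization;

class DateSeparatorDemo
{
    static void Main()
    {
        var myDate = new DateTime(2019, 6, 1);

        // Unquoted '/' is replaced by the current culture's date separator.
        Console.WriteLine(myDate.ToString("MM/dd/yy"));

        // Quoted '/' is a literal, so the separator is always '/'.
        Console.WriteLine(myDate.ToString("MM'/'dd'/'yy"));

        // The invariant culture uses '/' as its date separator: prints 06/01/19.
        Console.WriteLine(myDate.ToString("MM/dd/yy", CultureInfo.InvariantCulture));
    }
}
```

Passing `CultureInfo.InvariantCulture` is usually the safer choice when the string is meant for machines rather than for the user, since it fixes the calendar and digits as well as the separator.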

Get Date From Filename Using ParseExact

I am trying to find the file that has the highest date in a single directory. The problem is that the dates are attached to filenames. I am using the following code to try to pull the max date but am running into trouble with the ParseExact.
//Gather all of the files in the local directory
var files = Directory.EnumerateFiles(r.getLeadLocalFile());
returnDateTime = files.Max(f => DateTime.ParseExact(f, "MMddyyXXXX.csv", CultureInfo.InvariantCulture));
I continue to get the following error:
String was not recognized as a valid DateTime.
I can tell that the value of the file path is being passed in because the value of 'f' is below:
\\\\vamarnas02\\users\\meggleston\\User Files\\Leads\\110716ENH9.csv
The value of ENH9 can change depending on the file.
How can I get the DateTime from my filename?
Here's another approach. No need to split out anything. But one bad filename (as with your current approach) will ruin it:
//Gather all of the files in the local directory
var files = new DirectoryInfo(r.getLeadLocalFile()).GetFiles("*.csv");
returnDateTime = files.Max(f => DateTime.ParseExact(f.Name.Substring(0, 6), "MMddyy", CultureInfo.InvariantCulture));
You need to split out the date text before parsing. The following code snippet should help.
Assume the variable f is the filename.
DateTime.ParseExact(f.Substring( f.LastIndexOf("\\") + 1, 6), "MMddyy", CultureInfo.InvariantCulture);
Do you really need to use ParseExact here? Because it seems that you just need to get Int32 values and compare them afterwards.
So another approach: you can extract your date parts with some regex, from the path provided. For example you can use this one:
\\\d{6} // an escaped backslash followed by 6 digits. I'm not an expert in regex, but this seems to be enough for your task.
Then trim the leading backslash afterwards. So something like this in the loop:
private string ExtractDateFromFilename(string filename) {
var m = Regex.Match(filename, @"\\\d{6}");
if (!string.IsNullOrEmpty(m.Value))
return m.Value.Substring(1);
return "";
}
Try only passing the filename "110716ENH9.csv" instead of the full path of the file.
From MSDN DateTime.ParseExact Documentation:
Converts the specified string representation of a date and time to its DateTime equivalent using the specified format and culture-specific format information. The format of the string representation must match the specified format exactly.
From what you've provided, your format does not match exactly.
--
Only pass the first 6 characters of the filename to the ParseExact function and amend your format to be "MMddyy".
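The answers above can be combined into one sketch that strips the directory, parses the first six characters, and skips files whose names do not start with a valid date instead of throwing. The class and method names are hypothetical. It also assumes the code runs on Windows, where `Path.GetFileName` treats the backslash as a directory separator.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

static class LeadFiles
{
    // Returns the date encoded as MMddyy at the start of the file name, or null if there is none.
    public static DateTime? TryGetDate(string path)
    {
        string name = Path.GetFileName(path);
        if (name == null || name.Length < 6)
            return null;

        DateTime parsed;
        bool ok = DateTime.TryParseExact(name.Substring(0, 6), "MMddyy",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
        return ok ? parsed : (DateTime?)null;
    }

    // Returns the latest date found among the .csv files, or null if no file name parses.
    public static DateTime? LatestDate(string directory)
    {
        return Directory.EnumerateFiles(directory, "*.csv")
            .Select(f => TryGetDate(f))
            .Where(d => d.HasValue)
            .Max();
    }
}
```

Using `TryParseExact` rather than `ParseExact` means that one badly named file no longer ruins the whole query.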

replace items in a string while also confirming format

I have a string input in the format of "string#int" and I want to convert it to "string-int" for web friendliness reasons for an API I am using.
To do this I could obviously just replace the single character # with a - using string.replace, but ideally I'd like to do a check that the input (which is user provided by the way) is in the correct format (string#int) while or before converting to the web friendly version with a "-" instead. Essentially I'm wondering if there is a method in C# that I could use to check that this input is in the correct format and convert it to the required result format.
There is no built-in way, obviously, since the format you describe is quite specific. Also, a string can contain anything, including a hashtag (#), so I guess you need to narrow that down.
You could use regular expressions to check if the string is in the correct format. This would be a possible expression:
[A-Za-z ]+#[0-9]+
Which matches for:
this is a string#123
There's nothing built in, but you could do the following:
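var parts = input.Split(new char[] { '#' });
// exactly one '#' must be present
if (parts.Length != 2)
    throw new FormatException("Expected input in the form string#int.");
int result;
// the part after '#' must be an integer
if (!int.TryParse(parts[1], out result))
    throw new FormatException("The part after '#' is not an integer.");
string output = String.Join("-", parts);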
This takes the input and splits it on the "#" character. If the result isn't two parts then the string is invalid. You then check that the second part is an integer - if the TryParse fails it's not valid. The last step is to rejoin the two parts, but this time with a - as the separator.
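The regular expression approach can do the check and the conversion in one step. The following is only a sketch: the class and method names are hypothetical, and it assumes the string part may contain anything except `#`, which is broader than the `[A-Za-z ]+` expression suggested above.

```csharp
using System;
using System.Text.RegularExpressions;

static class WebFriendly
{
    // Returns true and sets output to "string-int" when input has the form "string#int".
    public static bool TryConvert(string input, out string output)
    {
        // \A and \z anchor the match to the whole input, so extra text is rejected.
        Match m = Regex.Match(input ?? "", @"\A(?<name>[^#]+)#(?<number>[0-9]+)\z");
        output = m.Success
            ? m.Groups["name"].Value + "-" + m.Groups["number"].Value
            : null;
        return m.Success;
    }
}

class Program
{
    static void Main()
    {
        string result;
        Console.WriteLine(WebFriendly.TryConvert("this is a string#123", out result)); // True
        Console.WriteLine(result);                                                    // this is a string-123
        Console.WriteLine(WebFriendly.TryConvert("no number#abc", out result));       // False
    }
}
```

Unlike `int.TryParse`, the pattern does not reject numbers that are too large for an `int`, so add a range check if that matters.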

Simple Double Split [duplicate]

I am reading a text file in order to load it into a database. The text file has no headers, and its contents look like this:
[10-10-2013 11:20:33.444 CDF] 1000020 Incident T This is the error message
[10-10-2013 11:20:33.445 CDF] 1000020 Incident T This is the second error message
How can I store "10-10-2013 11:20:33" in a date column and the milliseconds (444) in an integer column of the database? If I split on spaces first, the date is split into 3 parts. I want to get the date between the brackets and then split the rest on spaces.
Two points to mention here.
1. Here we have spaces in between date column.
2. Also I should be able to get other columns
The real simplest way to do this is to use regular expressions, not gobs of split and indexof operations.
Regular expressions allow you to specify a pattern out of which pieces of a string can be extracted in a straightforward fashion. If the format changes, or there is some subtlety not initially accounted for, you can fix the problem by adjusting the expression, rather than rewriting a bunch of code.
Here's some documentation for regular expressions in .NET: http://msdn.microsoft.com/en-us/library/az24scfc.aspx
This is some sample code that'll probably do what you want. You may need to tweak a little to get the desired results.
var m = Regex.Match(currentLine, @"^\[(?<date>[^\]]*)\]\s+(?<int>[0-9]+)\s+(?<message>.*)\s*$");
if(m.Success) {
// may need to do something fancier to parse the date, but that's an exercise for the reader
var myDate = DateTime.Parse(m.Groups["date"].Value);
var myInt = int.Parse(m.Groups["int"].Value);
var myMessage = m.Groups["message"].Value;
}
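The date parsing left as an exercise above could look roughly like the sketch below. It uses a stricter pattern that captures the milliseconds and the zone label separately. Two assumptions: "10-10-2013" is day-month-year (the sample is ambiguous), and the zone label is kept as plain text rather than interpreted. The class and group names are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

class LogLineDemo
{
    static void Main()
    {
        string currentLine = "[10-10-2013 11:20:33.444 CDF] 1000020 Incident T This is the error message";

        var m = Regex.Match(currentLine,
            @"^\[(?<date>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})\.(?<ms>\d{3}) (?<zone>\w+)\]\s+(?<id>\d+)\s+(?<rest>.*)$");
        if (!m.Success)
        {
            Console.WriteLine("Line did not match the expected layout.");
            return;
        }

        var inv = CultureInfo.InvariantCulture;

        // Assumption: day comes before month; use "MM-dd-yyyy" if the log is month-first.
        DateTime date = DateTime.ParseExact(m.Groups["date"].Value, "dd-MM-yyyy HH:mm:ss", inv);
        int milliseconds = int.Parse(m.Groups["ms"].Value, inv);

        Console.WriteLine(date.ToString("yyyy-MM-dd HH:mm:ss", inv)); // 2013-10-10 11:20:33
        Console.WriteLine(milliseconds);                              // 444
        Console.WriteLine(m.Groups["id"].Value);                      // 1000020
        Console.WriteLine(m.Groups["rest"].Value);                    // Incident T This is the error message
    }
}
```

The remaining columns in the `rest` group can then be split on spaces, with a count limit so that the message text stays in one piece.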
The simplest way to do this is to just use String.Split and String.Substring
Generically I would do this:
//find the indices of the []
var leftIndex = currentLine.IndexOf("[");
var rightIndex = currentLine.IndexOf("]");
//this gets the date portion of the string, without the brackets
var dateSubstring = currentLine.Substring(leftIndex + 1, rightIndex - leftIndex - 1);
var dateParts = dateSubstring.Split(new char[] {'.'});
// get the datetime portion
var dateTime = dateParts[0];
// the text after the '.' is "444 CDF", so keep only the digits before the space
var milliseconds = Int16.Parse(dateParts[1].Split(' ')[0]);
EDIT
Since the date portion is fixed width you could just use Substring for everything.

DotNet DateTime.ToString strange results

Why does:
DateTime.Now.ToString("M")
not return the month number? Instead it returns the full month name with the day on it.
Apparently, this is because "M" is also a standard code for the MonthDayPattern. I don't want this...I want to get the month number using "M". Is there a way to turn this off?
According to MSDN, you can use either "%M", "M " or " M" (note: the last two will also include the space in the result) to force M to be parsed as the month-number custom format specifier.
What's happening here is a conflict between standard DateTime format strings and custom format specifiers. The value "M" is ambiguous in that it is both a standard and a custom format specifier. The DateTime implementation will choose a standard formatter over a custom formatter in the case of a conflict, hence it is winning here.
The easiest way to remove the ambiguity is to prefix the M with the % char. This char is a way of saying that what follows should be interpreted as a custom formatter:
DateTime.Now.ToString("%M");
Why not use
DateTime.Now.Month?
You can also use System.DateTime.Now.Month.ToString(); to accomplish the same thing
You can put an empty string literal in the format to make it a composite format:
DateTime.Now.ToString("''M")
It's worth mentioning that the % prefix is required for any single-character format string when using the DateTime.ToString(string) method, even if that string does not represent one of the built-in format string patterns; I came across this issue when attempting to retrieve the current hour. For example, the code snippet:
DateTime.Now.ToString("h")
will throw a FormatException. Changing the above to:
DateTime.Now.ToString("%h")
gives the current date's hour.
I can only assume the method is looking at the format string's length and deciding whether it represents a built-in or custom format string.
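The behaviours described in these answers can be seen side by side in a short sketch. It uses a fixed date and the invariant culture so the results do not depend on the machine; the class name is hypothetical.

```csharp
using System;
using System.Globalization;

class SingleCharFormatDemo
{
    static void Main()
    {
        var date = new DateTime(2019, 6, 1, 15, 24, 16);
        var inv = CultureInfo.InvariantCulture;

        // "M" alone is the standard month/day pattern, not the month number.
        Console.WriteLine(date.ToString("M", inv));

        // "%M" is the custom month-number specifier.
        Console.WriteLine(date.ToString("%M", inv)); // 6

        // "%h" is the custom 12-hour specifier.
        Console.WriteLine(date.ToString("%h", inv)); // 3

        try
        {
            // "h" alone is read as a standard format string, and no such standard format exists.
            date.ToString("h", inv);
        }
        catch (FormatException)
        {
            Console.WriteLine("\"h\" alone is not a valid standard format string");
        }
    }
}
```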
