Region Agnostic Char.IsSeparator(ch)? - c#

I have a function that parses a string containing a date (and/or time), e.g. "2009-12-10". I get the order of year-month-day from the Short Date pattern. When going through the string I use Char.IsSeparator(ch) to figure out where the numbers end.
In the case of Korean, however, Char.IsSeparator(ch) seems to return false on the separator characters. Is there any way to know whether the chars in between the numbers are separators, regardless of region setting?
(I also parse strings that are more free-form, containing things like "20 May 2009", so calling Char.IsLetterOrDigit() on the separator will not work either, as I basically don't know the content.)
Example inputs: "20.10.2009", "2009-05-20", "20 May 2009", "20.05.2009 10:00 AM", "1/1/2009" (in Singapore it's D/M/Y, in the US it is M/D/Y), "Tisdag, 1 Januari 1962" (all strings localized).
Output would be an equivalent of a DateTime instance filled as much as possible (although we use our own types).
Korean seems to have a couple of characters in front of the time and as separator it looks like the symbols are different depending on position in the string.

If you pick up the format from the current short date pattern, you could perhaps also pick up the separator through DateTimeFormatInfo.CurrentInfo.DateSeparator.
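As a rough sketch of that idea, the separators a culture uses can be read from its DateTimeFormatInfo. The culture names below are only examples picked for illustration, and the values printed depend on the culture data installed on the machine, so no output is shown.

```csharp
using System;
using System.Globalization;

class SeparatorDemo
{
    static void Main()
    {
        // Example cultures only; substitute whatever regions you need to support.
        foreach (var name in new[] { "en-US", "sv-SE", "ko-KR" })
        {
            DateTimeFormatInfo info = CultureInfo.GetCultureInfo(name).DateTimeFormat;
            Console.WriteLine(
                $"{name}: date '{info.DateSeparator}', time '{info.TimeSeparator}', pattern '{info.ShortDatePattern}'");
        }
    }
}
```

Note that DateSeparator is a string, not a char, so it may be longer than one character for some cultures.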

Is there any reason why you need to parse the string manually?
If you used the built-in date/time parsing methods - Parse, ParseExact, TryParse or TryParseExact - then you could pass in the required culture-specific format info and let the framework worry about separators etc.
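A minimal sketch of that approach, assuming you know which culture each input string was written in. The pairing of sample strings and culture names here is made up for illustration.

```csharp
using System;
using System.Globalization;

class ParseDemo
{
    static void Main()
    {
        // Hypothetical pairing of input strings with the culture they came from.
        var samples = new (string Text, string Culture)[]
        {
            ("2009-05-20", "sv-SE"),
            ("20.10.2009", "de-DE"),
            ("1/1/2009", "en-US"),
        };

        foreach (var (text, cultureName) in samples)
        {
            CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);

            // TryParse uses the culture's separators and field order.
            if (DateTime.TryParse(text, culture, DateTimeStyles.None, out DateTime parsed))
                Console.WriteLine($"{text} ({cultureName}) -> {parsed:yyyy-MM-dd}");
            else
                Console.WriteLine($"{text} ({cultureName}) could not be parsed");
        }
    }
}
```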

Related

Float.Parse() ignoring decimal comma

I am getting some percent values from a database and I need to format them to have the correct thousands separator, number of decimal places and a percent sign on the end.
I tried this:
string text = "105,3"; //example, formatting like database input
string format = "#,##0.##";
e.Row.Cells[i].Text = double.Parse(text).ToString(format);
Weirdly this returns 1053,00%. How do I make it so it returns 105,30%? (The decimal comma is because the system locale is german, so it's how it is supposed to be)
edit: replacing the comma with a period results in 10530.00%. Nothing makes sense to me anymore.
edit2: the float.Parse() actually works just fine. the ToString() messes everything up. I played around with using different cultural settings and format strings (switching comma and period) but it only makes it worse again.
Pass the current Culture to the Parse method: double.Parse( text, CultureInfo.CurrentCulture )
However, this only works on systems that use a locale that has the comma as a decimal separator.
If you want this to work on other locales, you should replace CurrentCulture with the specific CultureInfo instance that was used when the data was entered in the first place.
The title is misleading. The actual problem was the ToString() call. My real format string contained the % sign, which, to be fair, I left out of the original post because I forgot about it. An unescaped % in a custom format string automatically multiplies the number by 100. So my format string is now "#,##0.00\%" (in C# source, write it as a verbatim string or double the backslash).
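To make the difference concrete, here is a small sketch that parses the value with an explicit German culture and formats it with and without the escaped percent sign. The de-DE culture and the variable names are assumptions chosen to match the question.

```csharp
using System;
using System.Globalization;

class PercentDemo
{
    static void Main()
    {
        var german = CultureInfo.GetCultureInfo("de-DE");
        double value = double.Parse("105,3", german);   // 105.3

        // Unescaped %: the value is multiplied by 100 before formatting.
        Console.WriteLine(value.ToString("#,##0.00%", german));    // 10.530,00%

        // Escaped or quoted %: printed as a literal, value left alone.
        Console.WriteLine(value.ToString(@"#,##0.00\%", german));  // 105,30%
        Console.WriteLine(value.ToString("#,##0.00'%'", german));  // 105,30%
    }
}
```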

Surprising int.ToString output

I have been working on a project, and found an interesting problem:
2.ToString("TE"+"000"); // output = TE000
2.ToString("TR"+"000"); // output = TR002
I also have tried with several strings other than "TE" but all have the same correct output.
Out of curiosity, I am wondering how come this could have happened?
Simply based on Microsoft's documentation, Custom Numeric Format Strings, your strings "TE000" and "TR000" are both custom format strings, but clearly they are parsed differently.
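2.ToString("TE000") trips over the unescaped "E": an "E" followed by zeros is the exponent (scientific notation) specifier in a custom format string, so the formatter goes down the exponential path and the result only looks as if the whole thing had been taken as a literal.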
2.ToString("TR000") is being interpreted as an implied "TR" literal plus 3 zero-filled digits for an integer value; therefore, you get "TR002".
If you truly want TE and TR verbatim, the expressions 2.ToString("\"TE\"000") and 2.ToString("\"TR\"000") will accomplish that for you by specifying TE and TR as explicit literals, instead of letting the formatter guess if they are valid format specifiers (and getting it wrong).
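A short sketch of the escaping options, using the invariant culture so the result does not depend on machine settings. Single quotes, double quotes and a backslash are all ways of marking literal text in a custom numeric format string.

```csharp
using System;
using System.Globalization;

class LiteralDemo
{
    static void Main()
    {
        var inv = CultureInfo.InvariantCulture;

        // Quoted literal: nothing inside the quotes is treated as a specifier.
        Console.WriteLine(2.ToString("'TE'000", inv));     // TE002
        Console.WriteLine(2.ToString("\"TE\"000", inv));   // TE002

        // A backslash escapes only the character that follows it.
        Console.WriteLine(2.ToString(@"T\E000", inv));     // TE002
    }
}
```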
The ToString needs to PARSE the format string and understand what to do with it.
Let's take a look to the following examples:
2.ToString("TE000"); //output TE000
2.ToString("E000"); //output 2E+000
2.ToString("0TE000"); //output 2TE000
2.ToString("T"); //throws exception
2.ToString("TT"); //output TT
This shows that if the ToString parser can understand at least part of the format, it will assume that the rest is just extra characters to print with it. If the format is invalid for the given number (like when you use a DateTime string format on a number), it will throw an exception. If it cannot make sense of the format, it will return the format string itself as the result.
You cannot rely on a numeric format string to produce an arbitrary prefix; instead, use something like this:
int i = 2;
String.Format("TE{0:X3}", i);
See Custom Numeric Format Strings. The E means the exponent part of the scientific notation of the number. Since 2 is 2E000 in exponential notation, that might explain it.

string date needs format

I have a string date like so:
var sDate = "3/3/2012";
It eventually goes into a DateTime.ParseExact(sDate, "MM/dd/yyyy")
and it fails because of the missing leading zeros.
What's the best way to add the leading zeros?
I know TryParse would have worked but can't refactor at the moment.
What's the best way to add the leading zeros?
Why would you do that? Just use ParseExact with the format it's actually got, which is M/d/yyyy.
The whole point of the format string is to let you declare the format of your data - not to make you change the format of your data.
Note that you can specify multiple patterns with the ParseExact overload that takes a string array, so you could always pass in both M/d/yyyy and MM/dd/yyyy. I believe M/d/yyyy will work with zero-padded values anyway, though.
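Both suggestions can be sketched as follows. The invariant culture is an assumption made so that "/" is matched literally; pass whichever provider fits your data.

```csharp
using System;
using System.Globalization;

class ParseExactDemo
{
    static void Main()
    {
        var inv = CultureInfo.InvariantCulture;

        // "M" and "d" accept one or two digits, so both inputs match.
        DateTime a = DateTime.ParseExact("3/3/2012", "M/d/yyyy", inv);
        DateTime b = DateTime.ParseExact("03/03/2012", "M/d/yyyy", inv);
        Console.WriteLine(a == b);   // True

        // Overload taking several candidate patterns.
        string[] patterns = { "M/d/yyyy", "MM/dd/yyyy" };
        DateTime c = DateTime.ParseExact("3/3/2012", patterns, inv, DateTimeStyles.None);
        Console.WriteLine(c.ToString("yyyy-MM-dd", inv));   // 2012-03-03
    }
}
```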

A regular expression to validate .NET time format

Background
I need to validate user input in some fields, where these are defining how to show time in some views.
Requirements
Time format must be expressed in Microsoft .NET way (check this MSDN Library article if you want to learn more about framework's date and time formatting: http://msdn.microsoft.com/en-us/library/8kb3ddd4.aspx)
Keep in mind I'm looking to validate the format instead of an actual time string.
For example, user may input:
HH:mm
hh:mm
ss
hh:ss
mm:ss
... and so on.
In fact, it should validate everything from the shortest to the longest time format available.
Another point is that I need to do it on the client side using JavaScript. In other words, any regular expression you suggest should work in the browser's JavaScript regular expression engine.
I'll appreciate any self-tailored one, any link or pasted expression!
Thank you in advance.
NOTE (Update)
I can't use ASP.NET validation engine, or any other. Because of project's requirements, I need to avoid that.
As far as I understand, there are not many options: about 20 at most. Why not just enumerate them all in one big regex without many special symbols? Like
'hh:mm|hh:mm:ss|yyyy-MM-dd hh:mm|<etc>'
You could then make it case-sensitive to differentiate between M for month and m for minute, use [hH] for hours, use a character class such as [:/-] wherever you allow different separators (the hyphen goes last so it is not read as a range), and so on. But the main idea is simply to enumerate all options separated by |, with only a small amount of regex syntax between one | and the next.
What is your definition of a "valid" format string? Only once you know that can it be possible to validate a format string.
"K" is also a valid format string
"zz" is also a valid format string
"e" is also a valid format (it would fall into the "The character is copied to the result string unchanged." case)
I'm not even sure what formats would actually cause .NET .ToString() to throw an exception (if that's what you are trying to avoid).

DotNet DateTime.ToString strange results

Why does:
DateTime.Now.ToString("M")
not return the month number? Instead it returns the full month name with the day on it.
Apparently, this is because "M" is also a standard code for the MonthDayPattern. I don't want this...I want to get the month number using "M". Is there a way to turn this off?
According to MSDN, you can use either "%M", "M " or " M" (note: the last two will also include the space in the result) to force M being parsed as the number of month format.
What's happening here is a conflict between standard DateTime format strings and custom format specifiers. The value "M" is ambiguous in that it is both a standard and a custom format specifier. The DateTime implementation will choose a standard format over a custom format in the case of a conflict, hence the standard one wins here.
The easiest way to remove the ambiguity is to prefix the M with the % char. This char is a way of saying that what follows should be interpreted as a custom format specifier:
DateTime.Now.ToString("%M");
Why not use
DateTime.Now.Month?
You can also use System.DateTime.Now.Month.ToString(); to accomplish the same thing
You can put an empty string literal in the format to make it a composite format:
DateTime.Now.ToString("''M")
It's worth mentioning that the % prefix is required for any single-character format string when using the DateTime.ToString(string) method, even if that string does not represent one of the built-in format string patterns; I came across this issue when attempting to retrieve the current hour. For example, the code snippet:
DateTime.Now.ToString("h")
will throw a FormatException. Changing the above to:
DateTime.Now.ToString("%h")
gives the current date's hour.
I can only assume the method is looking at the format string's length and deciding whether it represents a built-in or custom format string.
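The behaviour described in this thread can be checked with a short sketch. The sample date and the use of the invariant culture are arbitrary choices for illustration; the output of the standard "M" pattern is left uncommented because it depends on the culture's MonthDayPattern.

```csharp
using System;
using System.Globalization;

class SingleCharFormatDemo
{
    static void Main()
    {
        var inv = CultureInfo.InvariantCulture;
        var when = new DateTime(2012, 3, 5, 15, 4, 0);   // arbitrary sample value

        Console.WriteLine(when.ToString("M", inv));    // standard month/day pattern
        Console.WriteLine(when.ToString("%M", inv));   // 3
        Console.WriteLine(when.ToString("%h", inv));   // 3 (12-hour clock)

        try
        {
            when.ToString("h", inv);   // "h" is not a standard format specifier
        }
        catch (FormatException)
        {
            Console.WriteLine("\"h\" alone throws FormatException");
        }
    }
}
```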
