C# thousand separator issue with decimal.TryParse

I am not sure how this manages to parse successfully in C#, but I would like parsing to fail when the commas do not separate groups of exactly three digits. For example, 1,123.23 should pass but 11,23.23 should fail, in my view. In practice, TryParse seems to always return true regardless of where the commas appear before the decimal point.
Edit: Answer with regex is being accepted since it is found that this is a bug. Thank you.
string price = "1,1,2,3.23";
decimal outputValue = 0;
var allowedStyles = (NumberStyles.AllowDecimalPoint | NumberStyles.AllowThousands);
if (Decimal.TryParse(price, allowedStyles, CultureInfo.GetCultureInfo("EN-us"), out outputValue))
{
Console.WriteLine("Pass");
}

As you noted, NumberStyles.AllowThousands doesn't enforce that the commas are in the correct places. So I think a regular expression can help you here:
Regex.IsMatch("11,23.23", @"^[+-]?[0-9]{1,3}(,[0-9]{3})*(\.[0-9]*)?$");
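A minimal sketch of how the regex check and the parse could be combined, assuming en-US-style input (comma as group separator, dot as decimal separator). The names StrictDecimal and TryParseStrict are hypothetical, and the pattern is a slight variant of the one above: it also accepts ungrouped digits and requires at least one digit after the dot.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class StrictDecimal
{
    // Either properly grouped digits (1,234,567) or ungrouped digits (1234567),
    // optionally followed by a dot and at least one digit.
    private static readonly Regex Grouped =
        new Regex(@"^[+-]?([0-9]{1,3}(,[0-9]{3})*|[0-9]+)(\.[0-9]+)?$");

    public static bool TryParseStrict(string text, out decimal value)
    {
        value = 0m;

        // Reject misplaced commas before handing the text to the lenient parser.
        if (text == null || !Grouped.IsMatch(text))
            return false;

        return decimal.TryParse(
            text,
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint | NumberStyles.AllowThousands,
            CultureInfo.InvariantCulture,
            out value);
    }
}
```

With this helper, "1,123.23" and "1123.23" are accepted, while "11,23.23" and "1,1,2,3.23" are rejected before TryParse ever sees them.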

I don't know if this helps, but I think it is worth a try. My answer is a little more straightforward: if the concern is the format, format the parsed value with .ToString("format specified") and compare the result to your "price" string. Just my 2 cents.
string price = "1,1,2,3.23";
decimal priceParse = 0;
if (decimal.TryParse(price, out priceParse))
{
string shouldBeFormat = priceParse.ToString("#,##0.00");
if (price == shouldBeFormat)
{
// you're good
}
else
{
// no good
}
}

You have two acceptable formats, so you can check if the number is parseable and, if so, check it is in an acceptable format:
string price = "1,123.23";
decimal outputValue = 0;
var allowedStyles = (NumberStyles.AllowDecimalPoint | NumberStyles.AllowThousands);
var cul = CultureInfo.GetCultureInfo("EN-us");
if (decimal.TryParse(price, allowedStyles, cul, out outputValue))
{
if (outputValue.ToString("N", cul) == price || outputValue.ToString("G", cul) == price)
{
Console.WriteLine("Pass");
}
}

What you discovered is clearly a bug. I strongly recommend not getting stuck here; instead, implement a workaround (and apply KISS).
Unless this code is executed zillions of times in the core of a heavy math algorithm, or is otherwise performance critical, here is a simple workaround.
(Assuming the strings use ',' (comma) as the thousands separator, and not as the decimal separator as some cultures do):
price = price.Replace(",",""); // This will not change the value when comma is thousand separator.
// Go forward to parsing

I ran a few different tests and realized that when you apply AllowThousands, the only constraint on the position of ',' is that it must be in the integer part of the number.
Some results:
"123,,3.12" => pass
"123,,3.1,3" => fail

Related

C# string to Decimal On All style or Culture

Hi, I want to find out whether there is a better way to parse a string to Decimal that covers various formats:
$1.30
£1.50
€2,50
2,50  €
2.500,00  €
I see a lot of examples using culture to convert . & ,. But in my case, I don't have anything to identify the culture.
This display field I get from the client and I need to extract the value.
I tried following (which didn't work for all scenario) but would like to know if we have any best way to handle this.
Decimal.Parse(value,NumberStyles.Currency |
NumberStyles.Number|NumberStyles.AllowThousands |
NumberStyles.AllowTrailingSign | NumberStyles.AllowCurrencySymbol)
I also tried to use Regex to remove the currency sign but unable to convert both 1.8 or 1,8 in one logic.
Well, assuming you always get a valid currency format, and it's only the culture that changes, you could guess which character is used as a decimal point and which is used as a thousands separator by checking which appears the last in the number. Then remove all the thousand separators and parse it like its culture was invariant.
The code would look like the following:
// Replace with your input
var numberString = "2.500,00 €";
// Regex to extract the number part from the string (supports thousands and decimal separators)
// Simple replace of all non numeric and non ',' '.' characters with nothing might suffice as well
// Depends on the input you receive
var regex = new Regex("^[^\\d-]*(-?(?:\\d|(?<=\\d)\\.(?=\\d{3}))+(?:,\\d+)?|-?(?:\\d|(?<=\\d),(?=\\d{3}))+(?:\\.\\d+)?)[^\\d]*$");
char decimalChar;
char thousandsChar;
// Get the numeric part from the string
var numberPart = regex.Match(numberString).Groups[1].Value;
// Try to guess which character is used for decimals and which is used for thousands
if (numberPart.LastIndexOf(',') > numberPart.LastIndexOf('.'))
{
decimalChar = ',';
thousandsChar = '.';
}
else
{
decimalChar = '.';
thousandsChar = ',';
}
// Remove thousands separators as they are not needed for parsing
numberPart = numberPart.Replace(thousandsChar.ToString(), string.Empty);
// Replace decimal separator with the one from InvariantCulture
// This makes sure the decimal parses successfully using InvariantCulture
numberPart = numberPart.Replace(decimalChar.ToString(),
CultureInfo.InvariantCulture.NumberFormat.CurrencyDecimalSeparator);
// Voilá
var result = decimal.Parse(numberPart, NumberStyles.AllowDecimalPoint | NumberStyles.Number, CultureInfo.InvariantCulture);
It does look a bit complicated for simple decimal parsing, but I think it should work for all the input numbers you get, or at least most of them.
If you do this in some sort of loop, you might want to use compiled regex.
The problem here is that in one case . means the decimal point but in another it is a thousands separator. And then you have , as a decimal separator. Clearly, it is impossible for the parser to "guess" what is meant, so the only thing you can do is decide on some rules for how to handle each case.
If you have control over the UI the best approach would be to validate user input and just reject any value that can't be parsed with an explanation on which format is expected.
If you have no control over the UI, the second best option would be to check for some "rules" and then devise which culture is appropriate for that given input and try to run it through decimal.TryParse for that given culture.
For the given input you have, you could have the following rules:
input.StartsWith("$") -> en-US
input.StartsWith("£") -> en-GB
input.StartsWith("€") || input.EndsWith("€") -> de-DE
These could reasonably handle all cases.
In code:
static void Main(string[] args)
{
string[] inputs =
{
"$1.30",
"£1.50",
"€2,50",
"2,50 €",
"2.500,00 €"
};
for (int i = 0; i < inputs.Length; i++)
{
Console.Write((i + 1).ToString() + ". ");
if (decimal.TryParse(inputs[i], NumberStyles.Currency,
GetAppropriateCulture(inputs[i]), out var parsed))
{
Console.WriteLine(parsed);
}
else
{
Console.WriteLine("Can't parse");
}
}
}
private static CultureInfo GetAppropriateCulture(string input)
{
if (input.StartsWith("$"))
return CultureInfo.CreateSpecificCulture("en-US");
if (input.StartsWith("£"))
return CultureInfo.CreateSpecificCulture("en-GB");
if (input.StartsWith("€") || input.EndsWith("€"))
return CultureInfo.CreateSpecificCulture("de-DE");
return CultureInfo.InvariantCulture;
}
Output:
1.30
1.50
2.50
2.50
2500.00
The only way you could do that is to strip the symbols from the string and change . and , to the decimal separator. Something like:
public decimal UniversalConvertDecimal(string str)
{
char currentDecimalSeparator = Convert.ToChar(Thread.CurrentThread.CurrentCulture.NumberFormat.NumberDecimalSeparator);
str = str.Replace('.', currentDecimalSeparator);
str = str.Replace(',', currentDecimalSeparator);
StringBuilder builder = new StringBuilder(str.Length);
foreach(var ch in str)
{
if(Char.IsDigit(ch) || ch == currentDecimalSeparator)
builder.Append(ch);
}
string s = builder.ToString();
return Convert.ToDecimal(s);
}
First you have to get current decimal separator from your system.
Then you have to replace . and , with current decimal separator.
Next, you will have to strip the string from any other char than a digit or decimal separator. At the end you can be sure that Convert.ToDecimal is going to work. But I don't know if it is something you want to achieve.
If you need some mechanism to save currency to a database, there is a far simpler solution. Just convert the amount to the smallest currency unit. For example, instead of $1, save 100 cents.
So if you have $1.99, just multiply it by 100 and you will get: 199 cents. And this integer can be saved to db.
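A minimal sketch of that conversion, assuming a currency with exactly two decimal places. The class and method names (MoneyUnits, ToCents, FromCents) are hypothetical, and the rounding mode is an assumption you may want to change.

```csharp
using System;

public static class MoneyUnits
{
    // Converts an amount in major units (e.g. dollars) to minor units (e.g. cents).
    // Anything beyond two decimal places is rounded away from zero.
    public static long ToCents(decimal amount)
    {
        return (long)Math.Round(amount * 100m, 0, MidpointRounding.AwayFromZero);
    }

    // Converts stored minor units back to major units for display.
    public static decimal FromCents(long cents)
    {
        return cents / 100m;
    }
}
```

Using decimal rather than double for the multiplication avoids binary floating-point artifacts such as 1.99 * 100 not being exactly 199.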

Convert string to decimal with format

I need to convert a string to a decimal in C#, but the string can come in different formats.
For example:
"50085"
"500,85"
"500.85"
Each of these should be converted to the decimal 500,85. Is there a simple way to do this conversion using a format?
Some cultures use a comma as the decimal separator. You can test this with the following code on an aspx page:
var x = decimal.Parse("500,85");
Response.Write(x + (decimal)0.15);
This gives the answer 501 when the thread culture has been set to a culture that uses the comma as the decimal separator. You can force this like so:
var x = decimal.Parse("500,85", new NumberFormatInfo() { NumberDecimalSeparator = "," });
While decimal.Parse() is the method you are looking for, you will have to provide a bit more information to it. It will not automatically pick between the 3 formats you give, you will have to tell it which format you are expecting (in the form of an IFormatProvider). Note that even with an IFormatProvider, I don't think "50085" will be properly pulled in.
The only consistent thing I see is that it appears from your examples that you always expect two decimal places of precision. If that is the case, you could strip out all periods and commas and then divide by 100.
Maybe something like:
public decimal? CustomParse(string incomingValue)
{
decimal val;
if (!decimal.TryParse(incomingValue.Replace(",", "").Replace(".", ""), NumberStyles.Number, CultureInfo.InvariantCulture, out val))
return null;
return val / 100;
}
This will work, depending on your culture settings:
string s = "500.85";
decimal d = decimal.Parse(s);
If your culture does not by default allow , instead of . as a decimal point, you will probably need to:
s = s.Replace(',','.');
But will need to check for multiple .'s... this seems to boil down to more of an issue of input sanitization. If you are able to validate and sanitize the input to all conform to a set of rules, the conversion to decimal will be a lot easier.
Try this code below:
string numValue = "500,85";
System.Globalization.CultureInfo culInfo = new System.Globalization.CultureInfo("fr-FR");
decimal decValue;
bool decValid = decimal.TryParse(numValue, System.Globalization.NumberStyles.Number, culInfo.NumberFormat, out decValue);
if (decValid)
{
lblDecNum.Text = Convert.ToString(decValue, culInfo.NumberFormat);
}
Since I am given a value of 500,85, I assume that the culture is French and hence the decimal separator is ",". Then decimal.TryParse(numValue, System.Globalization.NumberStyles.Number, culInfo.NumberFormat, out decValue);
will store the value 500.85 in decValue. Similarly, if the user is English (US), change the culture name passed to the CultureInfo constructor.
There are numerous ways:
System.Convert.ToDecimal("232.23")
Double.Parse("232.23")
double test;
Double.TryParse("232.23", out test)
Make sure you try and catch...
This is a new feature called Digit Grouping Symbol.
Steps:
Open Region and Language in control panel
Click on Additional setting
On Numbers tab
Set Digit Grouping Symbol as custom setting.
Change comma; replace with (any character as A to Z or {/,}).
Digit Grouping Symbol=e;
Example:
string checkFormate = "123e123";
decimal outPut = 0.0M;
decimal.TryParse(checkFormate, out outPut);
Ans: outPut=123123;
Try This
public decimal AutoParse(string value)
{
if (Convert.ToDecimal("3.3") == ((decimal)3.3))
{
return Convert.ToDecimal(value.Replace(",", "."));
}
else
{
return Convert.ToDecimal(value.Replace(".", ","));
}
}

Get characters behind the dot in of a double

I feel like this is a very noob question.. but I just can't get the right statement for it.
For display purposes, I want to split a double in two: the part before the dot and the first two digits after the dot. I need it as a string. Target language: C#.
E.g.: 2345.1234 becomes "2345" and "12"
I know how to get the part before the dot, that's simply:
Math.Floor(value).ToString()
...but what is the right way to get the part "behind the dot"?
There must be some nice way to do that in a simple way...
I can't think of anything other than:
Math.Round(100 * (value - Math.Floor(value))).ToString("00");
I'm sure there is a better way, but I just can't think of it. Anyone?
Regular expressions (regex) are probably your best bet, but using the mod operator may be another useful solution...
stuffToTheRight = value % 1
Cheers.
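A minimal sketch of the mod-operator idea, assuming the value fits in a decimal and that the two fractional digits should be truncated rather than rounded. The class and method names (DoubleParts, Split) are hypothetical. The double is converted to decimal first so that binary floating-point noise does not leak into the digits.

```csharp
using System;
using System.Globalization;

public static class DoubleParts
{
    // Returns { whole part, first two fractional digits } as strings.
    public static string[] Split(double value)
    {
        // Throws OverflowException for NaN, infinities, or values outside the decimal range.
        decimal d = (decimal)value;
        decimal whole = Math.Truncate(d);

        // % keeps the sign of the dividend, so take the absolute value.
        decimal fraction = Math.Abs(d % 1m);
        decimal twoDigits = Math.Truncate(fraction * 100m);

        return new[]
        {
            whole.ToString("0", CultureInfo.InvariantCulture),
            twoDigits.ToString("00", CultureInfo.InvariantCulture)
        };
    }
}
```

For the value from the question, DoubleParts.Split(2345.1234) yields "2345" and "12".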
//
//Use the Fixed point formatting option. You might have a bit more work to do
//if you need to handle cases where "dot" is not the decimal separator.
//
string s = value.ToString("F2", CultureInfo.InvariantCulture);
var values = s.Split('.');
string v1 = values[0];
string v2 = values[1];
See this link for more about formatting: http://msdn.microsoft.com/en-us/library/dwhawy9k.aspx
Here is some untested code that tries to take current culture into account:
//
//Use the Fixed point formatting option.
//
string s = value.ToString("F2", CultureInfo.CurrentCulture);
var values = s.Split(new[] { CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator }, StringSplitOptions.None);
string v1 = values[0];
string v2 = values[1];
Use the regex "\.[0-9][0-9]"
In one line it will be:
string[] vals = value.ToString("f2").Split(CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator.ToCharArray());
vals[0] : before point.
vals[1] : after point.

universal parsing of strings into float

In the environment that my program is going to run, people use ',' and '.' as decimal separators randomly on PCs with ',' and '.' separators.
How would you implement such a floatparse(string) function?
I tried this one:
try
{
d = float.Parse(s);
}
catch
{
try
{
d = float.Parse(s.Replace(".", ","));
}
catch
{
d = float.Parse(s.Replace(",", "."));
}
}
It doesn't work. When I debug it, it turns out that the first attempt parses it wrongly, treating "." as a thousands separator (like 100.000.000,0).
I'm a noob at C#, so hopefully there is a less overcomplicated solution than that :-)
NB: People are going to use both '.' and ',' on PCs with different separator settings.
If you are sure nobody uses thousand-separators, you can Replace first:
string s = "1,23"; // or s = "1.23";
s = s.Replace(',', '.');
double d = double.Parse(s, CultureInfo.InvariantCulture);
The general pattern would be:
first try to sanitize and normalize. Like Replace(otherChar, myChar).
try to detect a format, maybe using RegEx, and use a targeted conversion. Like counting . and , to guess whether thousand separators are used.
try several formats in order, with TryParse. Do not use exceptions for this.
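A minimal sketch of the last point, trying several formats in order with TryParse instead of catching exceptions. The two formats and their order are assumptions, as are the names FlexibleParser and TryParseAny. Thousand separators are deliberately not allowed, so a comma is never silently read as grouping.

```csharp
using System.Globalization;

public static class FlexibleParser
{
    // Candidate formats, tried in order (hypothetical choice).
    private static readonly NumberFormatInfo[] Candidates =
    {
        new NumberFormatInfo { NumberDecimalSeparator = "." },
        new NumberFormatInfo { NumberDecimalSeparator = "," }
    };

    public static bool TryParseAny(string text, out double value)
    {
        // No AllowThousands: group separators are rejected rather than ignored.
        const NumberStyles Styles =
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint;

        foreach (NumberFormatInfo format in Candidates)
        {
            if (double.TryParse(text, Styles, format, out value))
                return true;
        }

        value = 0;
        return false;
    }
}
```

Both "1.23" and "1,23" parse to 1.23 with this sketch, and "abc" fails without an exception being thrown. It only works under the same assumption as the Replace approach above: nobody types thousand separators.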
Parsing uses the settings of the CultureInfo.CurrentCulture, which reflects the language and the Number format selected by the user through Regional Settings. If your users type the decimals that correspond to the language they chose for their computers, you should have no problem using plain old double.Parse(). If a user sets his locale to Greek and types "8,5", double.Parse("8,5") will return 8.5. If he types "8.5" parse will return 85.
If a user sets his locale to one setting and then starts using the wrong decimal, you face a problem. There is no clean way to distinguish such wrong entries from entries that were really meant to contain the grouping character. What you can do is warn the user when a number is too short to include a grouping character, and use masked or numeric text boxes to prevent wrong entries.
Another, somewhat stricter option, is to disallow the grouping character for number entry in your application by cancelling it in the KeyDown event of your textboxes. You can get the numeric and grouping characters from CultureInfo.CurrentCulture.NumberFormat property.
Trying to replace the decimal and grouping characters is doomed to fail as it depends on knowing during compile time what kind of separator the user is going to use. If you knew that, you could just parse the number using the CultureInfo in the user's mind. Unfortunately, User.Brain.CultureInfo is not yet part of the .NET framework :P
I would do something like this
float ConvertToFloat(string value)
{
float result;
var converted = float.TryParse(value, out result);
if (converted) return result;
converted = float.TryParse(value.Replace(".", ","),
out result);
if (converted) return result;
return float.NaN;
}
In this case the following would return correct data
Console.WriteLine(ConvertToFloat("10.10").ToString());
Console.WriteLine(ConvertToFloat("11,0").ToString());
Console.WriteLine(ConvertToFloat("12").ToString());
Console.WriteLine(ConvertToFloat("1 . 10").ToString());
Returns
10,1
11
12
NaN
In this case if it is not possible to convert it, you will at least know that it is not a number. It's a safe way to convert.
You can also use the following overload
float.TryParse(value,
NumberStyles.Currency,
CultureInfo.CurrentCulture,
out result)
On this test-code:
Console.WriteLine(ConvertToFloat("10,10").ToString());
Console.WriteLine(ConvertToFloat("11,0").ToString());
Console.WriteLine(ConvertToFloat("12").ToString());
Console.WriteLine(ConvertToFloat("1 . 10").ToString());
Console.WriteLine(ConvertToFloat("100.000,1").ToString());
It returns the following
10,1
11
12
110
100000,1
So depending on how "nice" you want to be to the user, you can always add a last step: if the value is not a number so far, try converting it this way as well; otherwise it really isn't a number.
It would then look like this
float ConvertToFloat(string value)
{
float result;
var converted = float.TryParse(value,
out result);
if (converted) return result;
converted = float.TryParse(value.Replace(".", ","),
out result);
if (converted) return result;
converted = float.TryParse(value,
NumberStyles.Currency,
CultureInfo.CurrentCulture,
out result);
return converted ? result : float.NaN;
}
Where the following
Console.WriteLine(ConvertToFloat("10,10").ToString());
Console.WriteLine(ConvertToFloat("11,0").ToString());
Console.WriteLine(ConvertToFloat("12").ToString());
Console.WriteLine(ConvertToFloat("1 . 10").ToString());
Console.WriteLine(ConvertToFloat("100.000,1").ToString());
Console.WriteLine(ConvertToFloat("asdf").ToString());
Returns
10,1
11
12
110
100000,1
NaN

Parsing amount strings into numbers

I am working on a system that recognizes paper documents using OCR engines. These documents are invoices containing amounts such as total, VAT and net amounts. I need to parse these amount strings into numbers, but they come in many formats and flavors, with each invoice using different symbols for decimal and thousands separation. If I use the normal double.TryParse and double.Parse methods in .NET, they fail for some of the amounts.
These are some of the examples I receive as amount
"3.533,65" => 3533.65
"-133.696" => -133696
"-33.017" => -33017
"-166.713" => -166713
"-5088,8" => -5088.8
"0.423" => 0.423
"9,215,200" => 9215200
"1,443,840.00" => 1443840
I need some way to guess what the decimal separator and the thousand separator is in the number and then present the value to the user to decide if this is correct or not.
I am wondering how to solve this problem in an elegant way.
I'm not sure you'll be able to get an elegant way of figuring this out, because it's always going to be ambiguous if you can't tell it where the data is from.
For example, the numbers 1.234 and 1,234 are both valid numbers, but without establishing what the symbols mean you won't be able to tell which is which.
Personally, I would write a function which attempted to do a "best guess" based on some rules...
If the number contains , BEFORE ., then , must be for thousands and . must be for decimals
If the number contains . BEFORE ,, then . must be for thousands and , must be for decimals
If there are >1 , symbols, the thousand separator must be ,
If there are >1 . symbols, the thousand separator must be .
If there is only 1 , how many digits follow it? If it's NOT 3, then it must be the decimal separator (same rule for .)
If there are 3 numbers separating it (e.g. 1,234 and 1.234), perhaps you could put this number aside and parse other numbers on the same page to try and figure out if they use different separators, then come back to it?
Once you've figured out the decimal separator, remove any thousand separators (not needed for parsing the number) and ensure the decimal separator is . in the string you are parsing. Then you can pass this into Double.TryParse
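The rules above can be sketched roughly as follows. This is only one possible reading of them; the names SeparatorGuesser, TryGuessDecimalSeparator and TryParseGuess are hypothetical, and the ambiguous case (a single separator followed by exactly three digits) is simply reported as a failure rather than deferred to other numbers on the page.

```csharp
using System.Globalization;
using System.Linq;

public static class SeparatorGuesser
{
    // Guesses the decimal separator; returns false when the input is ambiguous.
    public static bool TryGuessDecimalSeparator(string number, out char decimalSeparator)
    {
        decimalSeparator = '.';
        int dots = number.Count(c => c == '.');
        int commas = number.Count(c => c == ',');

        if (dots == 0 && commas == 0) return true;   // plain digits, nothing to decide
        if (dots > 0 && commas > 0)
        {
            // Both present: whichever comes last is the decimal separator.
            decimalSeparator = number.LastIndexOf(',') > number.LastIndexOf('.') ? ',' : '.';
            return true;
        }
        if (dots > 1) { decimalSeparator = ','; return true; }    // repeated '.' must be grouping
        if (commas > 1) { decimalSeparator = '.'; return true; }  // repeated ',' must be grouping

        // Exactly one separator: decide by the number of characters after it.
        char only = dots == 1 ? '.' : ',';
        int digitsAfter = number.Length - number.IndexOf(only) - 1;
        if (digitsAfter == 3) return false;          // "1,234" vs "1.234": cannot tell
        decimalSeparator = only;
        return true;
    }

    public static bool TryParseGuess(string number, out double value)
    {
        value = 0;
        char decimalSeparator;
        if (!TryGuessDecimalSeparator(number, out decimalSeparator)) return false;

        // Drop the grouping character and normalize the decimal separator to '.'.
        char groupSeparator = decimalSeparator == '.' ? ',' : '.';
        string normalized = number
            .Replace(groupSeparator.ToString(), string.Empty)
            .Replace(decimalSeparator, '.');

        return double.TryParse(
            normalized,
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint,
            CultureInfo.InvariantCulture,
            out value);
    }
}
```

With the examples from the question, "3.533,65", "-5088,8", "9,215,200" and "1,443,840.00" parse as expected, while "-133.696" is reported as ambiguous. Note that "0.423" is also reported as ambiguous under these rules; a leading-zero rule would be needed to resolve it.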
I would probably set up a list of rules that are specified in order of preference, this way you can plug rules in by precedence. You can then parse the list based on regex matches returning the correct rule.
A quick prototype would be very easy to set up similar to:
public class FormatRule
{
public string Pattern { get; set; }
public CultureInfo Culture { get; set; }
public FormatRule(string pattern, CultureInfo culture)
{
Pattern = pattern;
Culture = culture;
}
}
Now a list of FormatRule is used to store your rules in order of precedence:
List<FormatRule> Rules = new List<FormatRule>()
{
/* Add rules in order of precedence, specifying a culture
* that can handle the pattern. I've chosen en-US, fr-FR and de-DE
* for this example, but equally any culture could be swapped
* in for various formats you may need to use */
new FormatRule(@"^0\.\d+$", CultureInfo.GetCultureInfo("en-US")),
new FormatRule(@"^0,\d+$", CultureInfo.GetCultureInfo("fr-FR")),
new FormatRule(@"^[1-9]+\.\d{4,}$", CultureInfo.GetCultureInfo("en-US")),
new FormatRule(@"^[1-9]+,\d{4,}$", CultureInfo.GetCultureInfo("fr-FR")),
new FormatRule(@"^-?[1-9]{1,3}(,\d{3,})*(\.\d*)?$", CultureInfo.GetCultureInfo("en-US")),
/* de-DE here, because fr-FR groups digits with a space rather than '.' */
new FormatRule(@"^-?[1-9]{1,3}(\.\d{3,})*(,\d*)?$", CultureInfo.GetCultureInfo("de-DE")),
/* The default rule */
new FormatRule(string.Empty, CultureInfo.CurrentCulture)
};
You should then be able to iterate your list looking for the correct rule to apply:
public CultureInfo FindProvider(string numberString)
{
foreach(FormatRule rule in Rules)
{
if (Regex.IsMatch(numberString, rule.Pattern))
return rule.Culture;
}
return Rules[Rules.Count - 1].Culture;
}
This setup allows you to easily manage rules and set precedence on when something should be handled one way or another. It also allows you to be able to specify different cultures to handle one format one way and a different format another.
public float ParseValue(string valueString)
{
float value = 0;
NumberStyles style = NumberStyles.Any;
IFormatProvider provider = FindProvider(valueString).NumberFormat;
if (float.TryParse(valueString, style, provider, out value))
return value;
else
throw new InvalidCastException(string.Format("Value '{0}' cannot be parsed with any of the providers in the rule set.", valueString));
}
Finally, call your ParseValue() method to convert the string value you have to a float:
string numberString = "-123,456.78"; //Or "23.457.234,87"
float value = ParseValue(numberString);
You may decide to use a dictionary to save on the extra FormatRule class; the concept is the same... I used a list in the example because it makes it easier to query using LINQ. Also, you could easily replace the float type I've used with double or decimal if needed.
You will have to create your own function to guess what is the decimal separator and the thousand separator. Then you will be able to double.Parse but with the corresponding CultureInfo.
I recommend doing something like this (just an example; this is not a production-tested function):
private CultureInfo GetNumbreCultureInfo(string number)
{
CultureInfo dotDecimalSeparator = new CultureInfo("En-Us");
CultureInfo commaDecimalSeparator = new CultureInfo("Es-Ar");
string[] splitByDot = number.Split('.');
if (splitByDot.Count() > 2) //has more than 1 . so the . is the thousand separator
return commaDecimalSeparator; //return a cultureInfo where the thousand separator is the .
//the same for the ,
string[] splitByComma = number.Split(',');
if (splitByComma.Count() > 2)
return dotDecimalSeparator;
//if there is no , or . return an invariant culture
if (splitByComma.Count() == 1 && splitByDot.Count() == 1)
return CultureInfo.InvariantCulture;
//if there is only one . or one , check which one comes last
if (splitByComma.Count() == 2)
if (splitByDot.Count() == 1)
if (splitByComma.Last().Length != 3) // , is the decimal separator
return commaDecimalSeparator;
else // ambiguous: in e.g. 100,001 the , could be the thousand or the decimal separator
return dotDecimalSeparator;
else //here you have something like 100.010,00 or 100.010,111 or 100,000.111
{
if (splitByDot.Last().Length > splitByComma.Last().Length) //, is the decimal separator
return commaDecimalSeparator;
else
return dotDecimalSeparator;
}
else
if (splitByDot.Last().Length != 3) // . is the decimal separator
return dotDecimalSeparator;
else
return commaDecimalSeparator; //ambiguous again, e.g. 100.101
}
you can do a quick test like this:
string[] numbers = { "100.101", "1.000.000,00", "100.100,10", "100,100.10", "100,100.100", "1,00" };
decimal n;
foreach (string number in numbers)
{
if (decimal.TryParse(number, NumberStyles.Any, GetNumbreCultureInfo(number), out n))
MessageBox.Show(n.ToString());//the decimal was parsed
else
MessageBox.Show("there was problems parsing");
}
Also look at the branches where you don't really know which one is the separator (like 100,010 or 100.001), where it can be a decimal or a thousand separator.
You can resolve this by looking in the document for a number with enough information to determine the culture of the document, saving that culture, and always using the same culture (if you can assume that the whole document uses the same culture...)
Hope this will help
You should be able to do that with Double.TryParse. Your biggest problem as I see it is that you have inconsistencies in the way you interpret the numbers.
For example, how can
"-133.696" => -133696
When
"-166.713" => -166.713
?
If the rules for converting the numbers aren't consistent then you won't be able to solve this in code. As klausbyskov pointed out, why does the period in "-133.696" have a different meaning than the one in "-166.713"? How would you know what to do with a number containing a decimal point given these 2 examples where one is using it as expected but the other is using it as a thousand separator?
You'll need to define the various cases you're likely to encounter, create some logic to match each incoming string to one of your cases, and then parse it specifying an appropriate FormatProvider. For example - if your string contains a decimal point BEFORE a comma, then you can assume that for this particular string, they're using the decimal point as the thousands separator and the comma as the decimal separator, so you can construct a format provider to cope with this scenario.
Try something along these lines:
public IFormatProvider GetParseFormatProvider(string s) {
var nfi = new CultureInfo("en-US", false).NumberFormat;
int period = s.IndexOf('.');
int comma = s.IndexOf(',');
if (period >= 0 && comma > period) {
// Period before comma: the period groups thousands, the comma marks decimals
nfi.NumberDecimalSeparator = ",";
nfi.NumberGroupSeparator = ".";
}
// else if (/* some other condition */) { /* construct some other format provider */ }
return nfi;
}
and then use Double.Parse(myString, GetParseFormatProvider(myString)) to perform the actual parsing.
"and then present the value to the user to decide if this is correct or not."
If there are multiple possibilities, why not show the user both of them?
You can have multiple methods calling TryParse with the different cultures you want to be able to handle, and collect the parse results for those methods that succeed in a list (removing duplicates).
You could even estimate the likelihood of the different possibilities being correct based on how frequently the various formats are used elsewhere in the document, and present the alternatives in a list sorted by likelihood of being correct. For example, if you have seen a lot of numbers like 3,456,231.4 already then you can guess that comma is probably the thousands separator when you see 4,675 later in the same document, and present "4675" first in the list, and "4.675" second.
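A minimal sketch of collecting the candidate interpretations, assuming only two formats are of interest (dot-decimal and comma-decimal). The names CandidateParser and ParseCandidates are hypothetical, and the likelihood-based ordering is not implemented here; results simply come back in the order the formats are tried.

```csharp
using System.Collections.Generic;
using System.Globalization;

public static class CandidateParser
{
    private static readonly NumberFormatInfo DotDecimal =
        new NumberFormatInfo { NumberDecimalSeparator = ".", NumberGroupSeparator = "," };

    private static readonly NumberFormatInfo CommaDecimal =
        new NumberFormatInfo { NumberDecimalSeparator = ",", NumberGroupSeparator = "." };

    // Returns every distinct value the text could stand for.
    public static List<decimal> ParseCandidates(string text)
    {
        var results = new List<decimal>();
        foreach (NumberFormatInfo format in new[] { DotDecimal, CommaDecimal })
        {
            decimal value;
            if (decimal.TryParse(text, NumberStyles.Number, format, out value)
                && !results.Contains(value))
            {
                results.Add(value);
            }
        }
        return results;
    }
}
```

For "4,675" this returns 4675 followed by 4.675, so both can be shown to the user; for "12" both formats agree and a single value comes back.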
If you have a dot or comma followed by no more than two digits, it's the decimal point. Otherwise, ignore it.
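A minimal sketch of this rule, assuming "followed by no more than two digits" means that only digits, and at most two of them, remain after the separator. The names TwoDigitRule and TryParse are hypothetical.

```csharp
using System.Globalization;
using System.Linq;
using System.Text;

public static class TwoDigitRule
{
    public static bool TryParse(string text, out decimal value)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c == '.' || c == ',')
            {
                // A separator is the decimal point only if at most two digits follow it.
                string tail = text.Substring(i + 1);
                bool isDecimalPoint = tail.Length <= 2 && tail.All(ch => char.IsDigit(ch));
                if (isDecimalPoint) sb.Append('.');
                // Otherwise it is treated as grouping and dropped.
            }
            else
            {
                sb.Append(c);
            }
        }

        return decimal.TryParse(
            sb.ToString(),
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint,
            CultureInfo.InvariantCulture,
            out value);
    }
}
```

This handles "3.533,65", "-133.696" and "1,443,840.00" from the question as expected, but it reads "0.423" as 423, so it only fits data where amounts never carry three decimal places.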
