Remove formatting from a string: "(123) 456-7890" => "1234567890"? - c#

I have a string that holds a telephone number as the user entered it. There is an input mask, so it always looks like "(123) 456-7890". I'd like to take the formatting out before saving it to the DB.
How can I do that?

One possibility using LINQ is:
string justDigits = new string(s.Where(c => char.IsDigit(c)).ToArray());
Adding the cleaner/shorter version thanks to craigmoliver:
string justDigits = new string(s.Where(char.IsDigit).ToArray());

You can use a regular expression to remove all non-digit characters:
string phoneNumber = "(123) 456-7890";
phoneNumber = Regex.Replace(phoneNumber, @"[^\d]", "");
Then further on - depending on your requirements - you can either store the number as a string or as an integer. To convert the number to an integer type you will have the following options:
// throws if phoneNumber is null or cannot be parsed
long number = Int64.Parse(phoneNumber, NumberStyles.Integer, CultureInfo.InvariantCulture);
// same as Int64.Parse, but returns 0 if phoneNumber is null
number = Convert.ToInt64(phoneNumber);
// does not throw, but returns true on success
if (Int64.TryParse(phoneNumber, NumberStyles.Integer,
CultureInfo.InvariantCulture, out number))
{
// parse was successful
}

Since nobody did a for loop.
long GetPhoneNumber(string PhoneNumberText)
{
    // Returns 0 on error
    StringBuilder TempPhoneNumber = new StringBuilder(PhoneNumberText.Length);
    for (int i = 0; i < PhoneNumberText.Length; i++)
    {
        if (!char.IsDigit(PhoneNumberText[i]))
            continue;
        TempPhoneNumber.Append(PhoneNumberText[i]);
    }
    PhoneNumberText = TempPhoneNumber.ToString();
    if (PhoneNumberText.Length == 0)
        return 0; // No point trying to parse nothing
    long PhoneNumber = 0;
    if (!long.TryParse(PhoneNumberText, out PhoneNumber))
        return 0; // Failed to parse string
    return PhoneNumber;
}
used like this:
long phoneNumber = GetPhoneNumber("(123) 456-7890");
Update
As pr commented, many countries have zeros at the beginning of the number. If you need to support that, you have to return a string, not a long. To change my code to do that, do the following:
1) Change the function return type from long to string.
2) Make the function return null instead of 0 on error.
3) On a successful parse, make it return PhoneNumberText.
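The three changes above could look roughly like the sketch below. The class and method names (PhoneText, GetPhoneDigits) are made up for illustration, and the long.TryParse step is dropped on the assumption that a string of digits no longer needs to fit in a numeric type.

```csharp
using System.Text;

public static class PhoneText
{
    // Returns only the digits of the input, or null when there are none.
    // Leading zeros survive because the result stays a string.
    public static string GetPhoneDigits(string phoneNumberText)
    {
        if (phoneNumberText == null)
            return null;

        var digits = new StringBuilder(phoneNumberText.Length);
        foreach (char c in phoneNumberText)
        {
            if (char.IsDigit(c))
                digits.Append(c);
        }

        // Nothing to store if the input held no digits at all
        return digits.Length == 0 ? null : digits.ToString();
    }
}
```

Calling PhoneText.GetPhoneDigits("(012) 345-6789") keeps the leading zero, which the long version would lose.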

You can make it work for that number with the addition of a simple regex replacement, but I'd look out for higher initial digits. For example, (876) 543-2019 will overflow an integer variable.

string digits = Regex.Replace(formatted, @"\D", String.Empty, RegexOptions.Compiled);

Aside from all of the other correct answers, storing phone numbers as integers or otherwise stripping out formatting might be a bad idea.
Here are a couple considerations:
Users may provide international phone numbers that don't fit your expectations (see these examples), so the usual groupings for standard US numbers wouldn't fit.
Users may NEED to provide an extension, e.g. (555) 555-5555 ext#343. The # key is actually on the dialer/phone, but can't be encoded in an integer. Users may also need to supply the * key.
Some devices allow you to insert pauses (usually with the character P), which may be necessary for extensions or menu systems, or dialing into certain phone systems (eg, overseas). These also can't be encoded as integers.
[EDIT]
It might be a good idea to store both an integer version and a string version in the database. Also, when storing strings, you could reduce all punctuation to whitespace using one of the methods noted above. A regular expression for this might be:
// (222) 222-2222 ext# 333 -> 222 222 2222 # 333
phoneString = Regex.Replace(phoneString, @"[^\d#*P]", " ");
// (222) 222-2222 ext# 333 -> 2222222222333 (information lost)
phoneNumber = Regex.Replace(phoneString, @"[^\d]", "");
// you could try to avoid losing "ext" strings as in (222) 222-2222 ext.333 thus:
phoneString = Regex.Replace(phoneString, @"ex\w+", "#");
phoneString = Regex.Replace(phoneString, @"[^\d#*P]", " ");

Try this:
string s = "(123) 456-7890";
UInt64 i = UInt64.Parse(
s.Replace("(","")
.Replace(")","")
.Replace(" ","")
.Replace("-",""));
You should be safe with this since the input is masked.

You could use a regular expression or you could loop over each character and use char.IsNumber function.
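One thing to watch with the character-loop approach: char.IsNumber is broader than it sounds. It also returns true for characters such as '½' and '²', and even char.IsDigit accepts decimal digits from other scripts (for example Devanagari '४'). If only the ASCII digits 0-9 should reach the database, an explicit range check is one option. The sketch below is an illustration, and the names DigitFilter and AsciiDigitsOnly are made up.

```csharp
using System.Linq;

public static class DigitFilter
{
    // Keeps only the ASCII digits '0'..'9' and drops everything else,
    // including non-ASCII digits and characters like '½'.
    public static string AsciiDigitsOnly(string input)
    {
        return new string(input.Where(c => c >= '0' && c <= '9').ToArray());
    }
}
```

For masked input like "(123) 456-7890" all three checks give the same result; the difference only matters if unexpected characters can get through.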

You would be better off using regular expressions. An int by definition is just a number, but you desire the formatting characters to make it a phone number, which is a string.
There are numerous posts about phone number validation, see A comprehensive regex for phone number validation for starters.

As many answers already mention, you need to strip out the non-digit characters first before trying to parse the number. You can do this using a regular expression.
Regex.Replace("(123) 456-7890", @"\D", String.Empty) // "1234567890"
However, note that the largest positive value int can hold is 2,147,483,647 so any number with an area code greater than 214 would cause an overflow. You're better off using long in this situation.
Leading zeros won't be a problem for North American numbers, as area codes cannot start with a zero or a one.
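To make the overflow point concrete, here is a small sketch that checks whether a digit string fits in int or long. The class and method names are made up for illustration; NumberStyles.None is used so that only plain digits are accepted.

```csharp
using System.Globalization;

public static class PhoneRange
{
    // True when the digit string fits in a 32-bit int (max 2,147,483,647).
    public static bool FitsInInt(string digits)
    {
        return int.TryParse(digits, NumberStyles.None, CultureInfo.InvariantCulture, out _);
    }

    // True when the digit string fits in a 64-bit long.
    public static bool FitsInLong(string digits)
    {
        return long.TryParse(digits, NumberStyles.None, CultureInfo.InvariantCulture, out _);
    }
}
```

With this, "2145550123" fits in an int, while "2155550123" and "8765432019" only fit in a long.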

Alternative using Linq:
string phoneNumber = "(403) 259-7898";
var phoneStr = new string(phoneNumber.Where(i=> i >= 48 && i <= 57).ToArray());

This is basically a special case of C#: Removing common invalid characters from a string: improve this algorithm, where your formatting characters, including whitespace, are treated as "bad characters".

'You can put this in a module, or inside the main form (VB.NET)
Public Function ClearFormat(ByVal Strinput As String) As String
    Dim hasil As String = ""
    For i = 0 To Strinput.Length - 1
        Dim Hrf As Char = Strinput(i)
        If IsNumeric(Hrf) Then
            hasil &= Hrf
        End If
    Next
    Return hasil 'return the collected digits, not the original input
End Function
'You can call this function like this:
' Phone = ClearFormat(Phone)

public static string DigitsOnly(this string phoneNumber)
{
return new string(
new[]
{
// phoneNumber[0], (
phoneNumber[1], // 6
phoneNumber[2], // 1
phoneNumber[3], // 7
// phoneNumber[4], )
// phoneNumber[5],
phoneNumber[6], // 8
phoneNumber[7], // 6
phoneNumber[8], // 7
// phoneNumber[9], -
phoneNumber[10], // 5
phoneNumber[11], // 3
phoneNumber[12], // 0
phoneNumber[13] // 9
});
}

Related

Formatting a double within an interpolation

I have an interpolation where I need to format the variable to 2 places. The variable here is 'difference'
double difference = moneyLeft - trip;
Console.WriteLine($"Not enough money! {difference} needed.") ;
I have tried putting {0:f2} but doesn't seem to work. Currently it gives me a number like 418.2, where I need it to be 418.20. How can I fix it?
You can use the following code
double res = moneyLeft - trip;
string difference = String.Format("{0:0.00}", res); // e.g. res = 418.2 gives "418.20"
Console.WriteLine($"Not enough money! {difference} needed."); // Output: Not enough money! 418.20 needed.
There are two parts of the token syntax ({ }), the "before the colon", and the "after the colon".
When you're inside an interpolated string, the "before the colon" part is treated as code. That means if you use a variable name, it evaluates the value stored in that variable. If you give it a numeric literal, such as 0, it uses the value 0.
var input = 3.21;
string a = $"{input}"; // 3.21
string b = $"{0}"; // 0
In this case, 0 doesn't mean the "first positional argument after the template", as it does in string.Format.
You already figured out that you should use f2 after the colon to get two decimal spots, but remember you can't use 0 before the colon, or else the value you'll be formatting is literally the number zero.
var input = 3.21267674;
// your first attempt
string a = $"{input}"; // 3.21267674
// your second attempt
string b = $"{0:f2}"; // 0.00
// the correct way
string c = $"{input:f2}"; // 3.21
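Applied to the message from the question, the same idea might look like the sketch below. The names MoneyMessage and NotEnough are made up, and FormattableString.Invariant is used here only so the decimal separator does not depend on the machine's regional settings; a plain interpolated string with {difference:F2} works the same way under the current culture.

```csharp
using System;

public static class MoneyMessage
{
    // Formats the shortfall with exactly two decimal places,
    // using the invariant culture ('.' as the decimal separator).
    public static string NotEnough(double difference)
    {
        return FormattableString.Invariant($"Not enough money! {difference:F2} needed.");
    }
}
```

MoneyMessage.NotEnough(418.2) gives "Not enough money! 418.20 needed."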

Read input with different datatypes and space separation

I'm trying to figure out how to write code to let the user input three values (string, int, int) in one line with space to separate the values.
I thought of doing it with String.Split Method but that only works if all the values have the same datatype.
How can I do it with different datatypes?
For example:
The user might want to input
Hello 23 54
I'm using console application C#
Well the first problem is that you need to decide whether the text the user enters itself can contain spaces. For example, is the following allowed?
Hello World, it's me 08 15
In that case, String.Split will not really be helpful.
What I'd try is using a regular expression. The following may serve as a starting point:
Match m = Regex.Match(input, @"^(?<text>.+) (?<num1>(\+|\-)?\d+) (?<num2>(\+|\-)?\d+)$");
if (m.Success)
{
string stringValue = m.Groups["text"].Value;
int num1 = Convert.ToInt32(m.Groups["num1"].Value);
int num2 = Convert.ToInt32(m.Groups["num2"].Value);
}
BTW: The following part of your question makes me frown:
I thought of doing it with String.Split Method but that only works if all the values have the same datatype.
A string is always just a string. Whether it contains a text, your email-address or your bank account balance. It is always just a series of characters. The notion that the string contains a number is just your interpretation!
So from a program's point of view, the string you gave is a series of characters. And for splitting that it doesn't matter at all what the real semantics of the content are.
That's why the splitting part is separate from the conversion part. You need to tell your application that the first part is a string, while the second and third parts are supposed to be numbers. That's what you need type conversions for.
You are confusing things. A string is either null, empty or contains a sequence of characters. It never contains other data types. However, it might contain parts that could be interpreted as numbers, dates, colors etc... (but they are still strings). "123" is not an int! It is a string containing a number.
In order to extract these pieces you need to do two things:
Split the string into several string parts.
Convert string parts that are supposed to represent whole numbers into the int type (=System.Int32).
string input = "Abc 123 456";
string[] parts = input.Split(); // Whitespace characters are assumed as separators by default.
if (parts.Length == 3) {
    Console.WriteLine("The text is \"{0}\"", parts[0]);
    int n1;
    if (Int32.TryParse(parts[1], out n1)) {
        Console.WriteLine("The 1st number is {0}", n1);
    } else {
        Console.WriteLine("The second part is supposed to be a whole number.");
    }
    int n2;
    if (Int32.TryParse(parts[2], out n2)) {
        Console.WriteLine("The 2nd number is {0}", n2);
    } else {
        Console.WriteLine("The third part is supposed to be a whole number.");
    }
} else {
    Console.WriteLine("You must enter three parts separated by a space.");
}
What you have to do is get "Hello 23 54" in a string variable. Split by " " and treat them.
string value = "Hello 23 54";
var listValues = value.Split(' ').ToList();
After that you have to parse each item from listValues to your related types.
Hope it helps. ;)

decimal.TryParse is happily accepting badly formatted number strings

Is there a way to make the C# TryParse() functions a little more... strict ?
Right now, if you pass in a string containing numbers, the correct decimal & thousand separator characters, it often just seems to accept them, even if the format doesn't make sense, eg: 123''345'678
I'm looking for a way to make TryParse not be successful if the number isn't in the right format.
So, I'm based in Zurich, and if I do this:
decimal exampleNumber = 1234567.89m;
Trace.WriteLine(string.Format("Value {0} gets formatted as: \"{1:N}\"", exampleNumber, exampleNumber));
...then, with my regional settings, I get this...
Value 1234567.89 gets formatted as: "1'234'567.89"
So you can see that, for my region, the decimal place character is a full-stop and the thousand-separator is an apostrophe.
Now, let's create a simple function to test whether a string can be parsed into a decimal:
private void ParseTest(string str)
{
decimal val = 0;
if (decimal.TryParse(str, out val))
Trace.WriteLine(string.Format("Parsed \"{0}\" as {1}", str, val));
else
Trace.WriteLine(string.Format("Couldn't parse: \"{0}\"", str));
}
Okay, let's call this function with a few strings.
Which of the following strings would you think would get successfully parsed by this function ?
Below are the results I got:
ParseTest("123345.67"); // 1. Parsed "123345.67" as 123345.67
ParseTest("123'345.67"); // 2. Parsed "123'345.67" as 123345.67
ParseTest("123'345'6.78"); // 3. Parsed "123'345'6.78" as 1233456.78
ParseTest("1''23'345'678"); // 4. Parsed "1''23'345'678" as 123345678
ParseTest("'1''23'345'678"); // 5. Couldn't parse: "'1''23'345'678"
ParseTest("123''345'678"); // 6. Parsed "123''345'678" as 123345678
ParseTest("123'4'5'6.7.89"); // 7. Couldn't parse: "123'4'5'6.7.89"
ParseTest("'12'3'45'678"); // 8. Couldn't parse: "'12'3'45'678"
I think you can see my point.
To me, only the first two strings should've parsed successfully. The others should've all failed, as they don't have 3-digits after a thousand separator, or have two apostrophes together.
Even if I change the ParseTest to be a bit more specific, the results are exactly the same. (For example, it happily accepts "123''345'678" as a valid decimal.)
private void ParseTest(string str)
{
decimal val = 0;
var styles = (NumberStyles.AllowDecimalPoint | NumberStyles.AllowThousands);
if (decimal.TryParse(str, styles, CultureInfo.CurrentCulture, out val))
Trace.WriteLine(string.Format("Parsed \"{0}\" as {1}", str, val));
else
Trace.WriteLine(string.Format("Couldn't parse: \"{0}\"", str));
}
So, is there a straightforward way to not allow badly formatted strings to be accepted by TryParse ?
Update
Thanks for all of the suggestions.
Perhaps I should clarify: what I'm looking for is for the first two of these strings to be valid, but the third one to be rejected.
ParseTest("123345.67");
ParseTest("123'456.67");
ParseTest("12'345'6.7");
Surely there must be a way to use "NumberStyles.AllowThousands" so it can optionally allow thousand-separators but make sure the number format does make sense ?
Right now, if I use this:
if (decimal.TryParse(str, styles, CultureInfo.CurrentCulture, out val))
I get these results:
Parsed "123345.67" as 123345.67
Parsed "123'456.67" as 123456.67
Parsed "12'345'6.7" as 123456.7
And if I use this:
if (decimal.TryParse(str, styles, CultureInfo.InvariantCulture, out val))
I get these results:
Parsed "123345.67" as 123345.67
Couldn't parse: "123'456.67"
Couldn't parse: "12'345'6.7"
This is my problem... regardless of CultureInfo settings, that third string should be rejected, and the first two accepted.
The easiest way to tell if it is correctly formatted based on the current culture would be to compare the resulting number after formatting with the original string.
//input = "123,456.56" -- true
//input = "123,4,56.56" -- false
//input = "123456.56" -- true
//input = "123,,456.56" -- false
string input = "123456.56";
decimal value;
if(!decimal.TryParse(input, out value))
{
return false;
}
return (value.ToString("N") == input || value.ToString() == input);
This will succeed for inputs that completely omit thousand separators and inputs that specify correct thousand separators.
If you need it to accept a range of decimal places then you would need to grab the number of characters after the decimal separator and append it to the "N" format string.
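A rough sketch of that variable-decimal-places idea follows. The names FormatCheck and IsWellFormatted are made up, and passing the culture explicitly is an addition for testability; the comparison itself is the same round-trip trick as above, with the precision taken from the input.

```csharp
using System;
using System.Globalization;

public static class FormatCheck
{
    // Returns true when the input parses AND re-formatting the parsed value
    // (with or without group separators) reproduces the input exactly.
    public static bool IsWellFormatted(string input, CultureInfo culture)
    {
        decimal value;
        if (!decimal.TryParse(input, NumberStyles.Number, culture, out value))
            return false;

        // Count the digits after the decimal separator in the input
        string separator = culture.NumberFormat.NumberDecimalSeparator;
        int index = input.IndexOf(separator, StringComparison.Ordinal);
        int decimals = index < 0 ? 0 : input.Length - index - separator.Length;

        string grouped = value.ToString("N" + decimals, culture); // e.g. 123,456.5
        string plain = value.ToString("F" + decimals, culture);   // e.g. 123456.5
        return input == grouped || input == plain;
    }
}
```

With CultureInfo.InvariantCulture, "123,456.5" and "123456.5" pass, while "123,4,56.56" is rejected because neither re-formatted string matches it.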
Putting together all the useful suggestions here, here's what I ended up using.
It's not perfect, but, for my corporate app, it does at least reject numeric-strings which "don't look right".
Before I present my code, here's the differences between what my TryParseExact function will accept, and what the regular decimal.TryParse would accept:
And here's my code.
I'm sure there's a more efficient way of doing some of this, using regex or something, but this is sufficient for my needs, and I hope it helps other developers:
public static bool TryParseExact(string str, out decimal result)
{
// The regular decimal.TryParse() is a bit rubbish. It'll happily accept strings which don't make sense, such as:
// 123'345'6.78
// 1''23'345'678
// 123''345'678
//
// This function does the same as TryParse(), but checks whether the number "makes sense", ie:
// - has exactly zero or one "decimal point" characters
// - if the string has thousand-separators, then there are exactly three digits in between them
//
// Assumptions: if we're using thousand-separators, then there'll be just one "NumberGroupSizes" value.
//
// Returns True if this is a valid number
// False if this isn't a valid number
//
result = 0;
if (str == null || string.IsNullOrWhiteSpace(str))
return false;
// First, let's see if TryParse itself falls over, trying to parse the string.
decimal val = 0;
if (!decimal.TryParse(str, out val))
{
// If the numeric string contains any letters, foreign characters, etc, the function will abort here.
return false;
}
// Note: we'll ONLY return TryParse's result *if* the rest of the validation succeeds.
CultureInfo culture = CultureInfo.CurrentCulture;
int[] expectedDigitLengths = culture.NumberFormat.NumberGroupSizes; // Usually a 1-element array: { 3 }
string decimalPoint = culture.NumberFormat.NumberDecimalSeparator; // Usually full-stop, but perhaps a comma in France.
string thousands = culture.NumberFormat.NumberGroupSeparator; // Usually a comma, but can be apostrophe in European locations.
int numberOfDecimalPoints = CountOccurrences(str, decimalPoint);
if (numberOfDecimalPoints != 0 && numberOfDecimalPoints != 1)
{
// You're only allowed either ONE or ZERO decimal point characters. No more!
return false;
}
int numberOfThousandDelimiters = CountOccurrences(str, thousands);
if (numberOfThousandDelimiters == 0)
{
result = val;
return true;
}
// Okay, so this numeric-string DOES contain 1 or more thousand-separator characters.
// Let's do some checks on the integer part of this numeric string (eg "12,345,67.890" -> "12,345,67")
if (numberOfDecimalPoints == 1)
{
int inx = str.IndexOf(decimalPoint);
str = str.Substring(0, inx);
}
// Split up our number-string into sections: "12,345,67" -> [ "12", "345", "67" ]
string[] parts = str.Split(new string[] { thousands }, StringSplitOptions.None);
if (parts.Length < 2)
{
// If we're using thousand-separators, then we must have at least two parts (eg "1,234" contains two parts: "1" and "234")
return false;
}
// Note: the first section is allowed to be up to 3 chars long (eg for "12,345,678", the "12" is perfectly valid)
if (parts[0].Length == 0 || parts[0].Length > expectedDigitLengths[0])
{
// This should catch errors like:
// ",234"
// "1234,567"
// "12345678,901"
return false;
}
// ... all subsequent sections MUST be 3-characters in length
foreach (string oneSection in parts.Skip(1))
{
if (oneSection.Length != expectedDigitLengths[0])
return false;
}
result = val;
return true;
}
public static int CountOccurrences(string str, string chr)
{
// How many times does a particular string appear in a string ?
//
int count = str.Length - str.Replace(chr, "").Length;
return count;
}
Btw, I created the table image above in Excel, and noticed that it's actually hard to paste values like this into Excel:
1'234567.89
Does Excel complain about this value, or try to store it as text? Nope, it also happily accepts this as a valid number, and pastes it as "1234567.89".
Anyway, job done.. thanks to everyone for their help & suggestions.
It's because parsing just skips the NumberFormatInfo.NumberGroupSeparator string and completely ignores the NumberFormatInfo.NumberGroupSizes property. However, you can implement such a validation:
static bool ValidateNumberGroups(string value, CultureInfo culture)
{
string[] parts = value.Split(new string[] { culture.NumberFormat.NumberGroupSeparator }, StringSplitOptions.None);
foreach (string part in parts)
{
int length = part.Length;
if (culture.NumberFormat.NumberGroupSizes.Contains(length) == false)
{
return false;
}
}
return true;
}
It's still not completely perfect, as the MSDN says:
The first element of the array defines the number of elements in the least significant group of digits immediately to the left of the NumberDecimalSeparator. Each subsequent element refers to the next significant group of digits to the left of the previous group. If the last element of the array is not 0, the remaining digits are grouped based on the last element of the array. If the last element is 0, the remaining digits are not grouped.
For example, if the array contains { 3, 4, 5 }, the digits are grouped similar to "55,55555,55555,55555,4444,333.00". If the array contains { 3, 4, 0 }, the digits are grouped similar to "55555555555555555,4444,333.00".
But you can see the point now.
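For what it's worth, the group-size rules quoted above could be checked along these lines. This is only a sketch: the names GroupSizeCheck and GroupsMatch are made up, it looks at the integer part only (strip the sign and the decimal part first), it assumes a non-empty sizes array, and it does not verify that the groups contain only digits.

```csharp
using System;

public static class GroupSizeCheck
{
    // Walks the groups from right (least significant) to left and compares
    // each length with the matching NumberGroupSizes entry. The last entry
    // repeats; a 0 entry means "the remaining digits are not grouped".
    public static bool GroupsMatch(string integerPart, string separator, int[] sizes)
    {
        string[] groups = integerPart.Split(new[] { separator }, StringSplitOptions.None);
        if (groups.Length == 1)
            return true; // no separators at all, nothing to check

        int sizeIndex = 0;
        for (int g = groups.Length - 1; g >= 0; g--)
        {
            int expected = sizes[Math.Min(sizeIndex, sizes.Length - 1)];
            int length = groups[g].Length;
            bool isLeftmost = g == 0;

            if (expected == 0)
                return isLeftmost && length > 0; // ungrouped remainder must be the last piece

            if (isLeftmost)
            {
                // the most significant group may be shorter, but not empty
                if (length < 1 || length > expected)
                    return false;
            }
            else if (length != expected)
            {
                return false;
            }
            sizeIndex++;
        }
        return true;
    }
}
```

With sizes { 3 }, "12,345,678" passes and "12,345,6" fails; with { 3, 4, 5 }, the MSDN example "55,55555,55555,55555,4444,333" passes.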

How do I trim the "0." after I do modulo 1 on a double variable

Hello everyone, as the title says, I want to trim the "0." after I do modulo 1 on a double variable.
Example:
double Number;
Number = Convert.ToDouble(Console.ReadLine()); //12.777
double test = Number % 1; //0.777
I want my output to be: 777
only using math with no
string trims and so...
Thank you all !!
and in c# please
That is just a formatting on the ToString. Take a look at all your options here
How about
.ToString(".###");
Without using any string functions!
while(Math.Round(Number-(int)Number,1)!=1)
{
Number=Number/0.1;
if(Number-(int)Number==0)break;//To cover edge case like 0.1 or 0.9
}
NOTE: Number should be of double type!
If I take your question literally, then you do not want the decimal point either, so .ToString(".###") will not get you what you want, unless you remove the first character (which is string manipulation, and you said you don't want that either).
If you want 777 in a numeric variable (not a string), then you can multiply your result by 1000, though I don't know if you'll always have exactly 3 digits after the decimal or not.
The easiest way really is just to use string manipulation. ToString the result without any formatting, then get the substring starting after the decimal. For example:
var x = (.777d).ToString();
var result = x.Substring(x.IndexOf('.') + 1);
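If the "math only" requirement is strict, the multiply-by-a-power-of-ten idea can be generalized so the number of decimal places does not have to be known in advance. The sketch below uses decimal rather than double, because a double cannot store 12.777 exactly and Number % 1 is then not exactly 0.777. The names FractionMath and FractionDigits are made up, and the cast assumes the fraction has few enough digits to fit in a long.

```csharp
using System;

public static class FractionMath
{
    // Returns the digits after the decimal point as a whole number,
    // e.g. 12.777 -> 777. Leading zeros are lost (12.05 -> 5).
    public static long FractionDigits(decimal number)
    {
        decimal fraction = Math.Abs(number % 1m);

        // Shift one digit to the left until nothing remains after the point
        while (fraction % 1m != 0m)
        {
            fraction *= 10m;
        }
        return (long)fraction;
    }
}
```

FractionMath.FractionDigits(12.777m) returns 777 without any string handling.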
You are certainly looking for this:-
.ToString(".###");
As correctly pointed by Marc in comments you should have everything to be in a string, because if you output that 0.777 as it really is stored internally, you'd get 8 random bytes.
Something like this:-
var num = (.777d).ToString();
var result = num.Substring(num.IndexOf('.') + 1);
The most generic way to do this would be:
using System.Globalization;
var provider = NumberFormatInfo.InvariantInfo;
var output = test.ToString(".###", provider)
.Replace(provider.NumberDecimalSeparator, String.Empty);
You can also set the NumberDecimalSeparator on a custom NumberFormatInfo, but if you set it to empty it will throw the exception "Decimal separator cannot be the empty string."

Convert Unicode string made up of culture-specific digits to integer value

I am developing a program in the Marathi language. In it, I want to add/validate numbers entered in Marathi Unicode by getting their actual integer value.
For example, in Marathi:
४५ = 45
९९ = 99
How do I convert this Marathi string "४५" to its actual integer value i.e. 45?
I googled a lot, but found nothing useful. I tried using System.Text.Encoding.Unicode.GetString() to get string and then tried to parse, but failed here also.
The correct way would be to use Char.GetNumericValue, which lets you convert individual characters to their numeric values, and then construct the complete value. E.g. Char.GetNumericValue('९') gives you 9.
Depending on your goal it may be easier to replace each national digit character with corresponding invariant digit and use regular parsing functions.
Int32.Parse("९९".Replace("९", "9"))
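Building the complete value from Char.GetNumericValue could be sketched as below. The names NativeDigits and ParseNativeInt are made up, and the sketch only handles non-negative whole numbers; signs and decimal points would need extra handling.

```csharp
using System;

public static class NativeDigits
{
    // Accumulates the value digit by digit, so "४५" becomes 45.
    // Works for any script whose digits satisfy char.IsDigit.
    public static int ParseNativeInt(string text)
    {
        int value = 0;
        foreach (char c in text)
        {
            if (!char.IsDigit(c))
                throw new FormatException("Not a digit: " + c);

            // checked: throw OverflowException instead of silently wrapping
            value = checked(value * 10 + (int)char.GetNumericValue(c));
        }
        return value;
    }
}
```

NativeDigits.ParseNativeInt("४५") returns 45, and ordinary Latin digits work as well.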
Quick hack of @Alexei's response.
public static double ParseValue(string value)
{
return double.Parse(string.Join("",
value.Select(c => "+-.".Contains(c)
? "" + c: "" + char.GetNumericValue(c)).ToArray()),
NumberFormatInfo.InvariantInfo);
}
calling ParseValue("१२३.३२१") yields 123.321 as result
I found my solution...
The following code will convert given Marathi number to its equivalent Latin number..
Thanks to @Alexei, I just changed some of your code and it's working fine.
string ToLatinDigits(string nativeDigits)
{
int n = nativeDigits.Length;
StringBuilder latinDigits = new StringBuilder(capacity: n);
for (int i = 0; i < n; ++i)
{
if (char.IsDigit(nativeDigits, i))
{
latinDigits.Append(char.GetNumericValue(nativeDigits, i));
}
else if (nativeDigits[i].Equals('.') || nativeDigits[i].Equals('+') || nativeDigits[i].Equals('-'))
{
latinDigits.Append(nativeDigits[i]);
}
else
{
throw new Exception("Invalid Argument");
}
}
return latinDigits.ToString();
}
This method is working for both + and - numbers.
Regards Guruprasad
Windows.Globalization.NumberFormatting.DecimalFormatter will parse different numeral systems in addition to Latin, including Devanagari (which is what is used by Marathi).
