Parsing measurement units - c#

I have the following string:
string value = "123.456L";
What is the best way to parse this string into a string and a double:
double number = 123.456;
string measure = "L";
Instead of L, the unit could also be something else, such as oz, m/s, liter, kilograms, etc.

Assuming that the unit of measure always sits at the end of the string and contains no digits itself, you can do this:
string value = "123.456L";
var pos = value.LastIndexOfAny("0123456789".ToCharArray());
double number = double.Parse(value.Substring(0, pos+1));
string measure = value.Substring(pos+1);

Based on the comment explaining the input, I'd use Regex.
Match match = Regex.Match(value, @"[\d.]+");
double number = double.Parse(match.Value);
string measure = value.Replace(match.Value, "");
The regex [\d.]+ matches any digit or ".", and the + means one or more of them in a row.
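To make the regex idea a bit more concrete, here is a minimal sketch that captures the number and the unit in one pass. The class name UnitParser and the method TryParse are made up for illustration. It assumes the unit starts with a letter and contains no whitespace, and it parses with the invariant culture so that "." is always the decimal separator.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class UnitParser
{
    // number: optional sign, digits, optional fractional part.
    // unit: starts with a letter and runs to the end (assumption, not a general rule).
    private static readonly Regex Pattern =
        new Regex(@"^\s*(?<number>[-+]?\d+(?:\.\d+)?)\s*(?<unit>[A-Za-z]\S*)\s*$");

    public static bool TryParse(string input, out double number, out string unit)
    {
        number = 0;
        unit = null;
        if (string.IsNullOrEmpty(input)) return false;

        Match m = Pattern.Match(input);
        if (!m.Success) return false;

        unit = m.Groups["unit"].Value;
        // Invariant culture: "." is the decimal separator regardless of machine settings.
        return double.TryParse(m.Groups["number"].Value,
            NumberStyles.Float, CultureInfo.InvariantCulture, out number);
    }

    public static void Main()
    {
        foreach (string s in new[] { "123.456L", "12 oz", "3.5m/s", "liter" })
        {
            if (TryParse(s, out double n, out string u))
                Console.WriteLine("{0} -> number={1}, unit={2}",
                    s, n.ToString(CultureInfo.InvariantCulture), u);
            else
                Console.WriteLine("{0} -> no match", s);
        }
    }
}
```

Anchoring the pattern with ^ and $ means input without a number (such as "liter") is rejected instead of being half-parsed.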

I'd do it like this:
public bool TryParseUnit ( string sValue, out double fValue, out string sUnit )
{
    fValue = 0;
    sUnit = null;
    if ( !String.IsNullOrEmpty ( sValue ) )
    {
        sUnit = GetUnit ( sValue );
        if ( sUnit != null )
        {
            // Parse everything in front of the unit suffix as the number.
            return ( Double.TryParse ( sValue.Substring ( 0, sValue.Length - sUnit.Length ),
                out fValue ) );
        }
    }
    return ( false );
}
private string GetUnit ( string sValue )
{
    // Only single-character units are recognized here; add more cases as needed.
    string sSuffix = sValue.Substring ( sValue.Length - 1 );
    switch ( sSuffix.ToLower () )
    {
        case "l":
            return ( "L" );
    }
    return ( null );
}
I know it's more complicated than the other answers but this way you can also validate the data during parsing and discard invalid input.

You could do it with a regex
using System.Text.RegularExpressions;
Regex reg = new Regex(@"([\d.]*)(\w*)");
string value = "123.4L";
MatchCollection matches = reg.Matches(value);
foreach (Match match in matches)
{
if (match.Success)
{
GroupCollection groups = match.Groups;
Console.WriteLine(groups[1].Value); // will be 123.4
Console.WriteLine(groups[2].Value); // will be L
}
}
So what this will do is look for 0 or more digits or "." and group them, then look for 0 or more word characters (letters, digits or underscore) and group those. You can then get the groups from each match and read their values. This still works if you change the type of measurement, and it also works if there is no decimal point.
Edit: It is important to note that you must use groups[1] for the first group and groups[2] for the second group. groups[0] holds the entire matched text.

You might want to take a look at Units.NET on GitHub and NuGet. It supports parsing abbreviations in different cultures, but it is still on my TODO list to add support for parsing combinations of numbers and units. I have already done this on a related project, so it should be straight-forward to add.
Update Apr 2015: You can now parse units and values by Length.Parse("5.3 m"); and similar for other units.

Simply put: look for all characters that are 0..9 or . and copy them to a new string, then put the remaining part in a second string. In a minute I can give code.
Edit: Yes, I meant digits 0-9, corrected it. But it is easier to get the index of the last digit and split the string there.

You can try this:
string ma = Regex.Match(name, @"((\d\s)|(\d+\s)|(\d+)|(\d+\.\d+\s))(g\s|kg\s|ml\s)").Value;
This will match the following (note that the unit must be followed by a whitespace character):
40 g , 40g , 12.5 g , 1 kg , 2kg , 150 ml ....

Related

Split string to digits and alphabets in c#

I have a string like this:
12,3 m and I need two substrings, one with the decimal value and one with the unit, like 12,3 and m
12.3 m will return 12.3 and m
123,4 c will return 123,4 and c
The decimal separator can be . or ,
So how can I get this in C# without iterating through every character, as below?
char c;
for (int i = 0; i < Word.Length; i++)
{
c = Word[i];
if (Char.IsDigit(c))
string1 += c;
else
string2 += c;
}
The input string is not necessarily formatted like this; it can be like A12,3 m or ABC3.45 or 4.5 DEF etc., so string.Split is not always reliable.
Looks like you are trying to split based on the whitespace character:
string input = "12.3 c";
string[] stringArray = input.Split(' ');
You can then do a float.Parse operation on the first element of the array. The decimal separator used by float.Parse depends on your culture, and if the wrong one is chosen you could get a FormatException.
You can also choose the decimal separator programmatically, as below:
CultureInfo culture = (CultureInfo)CultureInfo.CurrentCulture.Clone(); // CurrentCulture itself may be read-only
culture.NumberFormat.NumberDecimalSeparator = "."; // or ","
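As a rough sketch of how to sidestep the culture problem when both "." and "," may appear, you can normalize the separator and parse with the invariant culture. The helper name TryParseEitherSeparator is made up here, and the sketch assumes the input never contains thousands separators.

```csharp
using System;
using System.Globalization;

public static class DecimalSeparatorDemo
{
    // Hypothetical helper: accepts either '.' or ',' as the decimal separator.
    // Assumption: the input never contains thousands separators.
    public static bool TryParseEitherSeparator(string text, out decimal value)
    {
        string normalized = text.Trim().Replace(',', '.');
        return decimal.TryParse(
            normalized,
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint,
            CultureInfo.InvariantCulture,
            out value);
    }

    public static void Main()
    {
        foreach (string s in new[] { "12,3", "12.3", "abc" })
        {
            bool ok = TryParseEitherSeparator(s, out decimal d);
            Console.WriteLine("{0} -> {1}", s,
                ok ? d.ToString(CultureInfo.InvariantCulture) : "not a number");
        }
    }
}
```

Using TryParse instead of Parse avoids the FormatException mentioned above when the text is not a number.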
Checking your provided examples { "12,3 m", "A12,3 m", "ABC3.45", "4.5 DEF"}, it seems that not only can the position of the string change, but there can also be two strings and one decimal in your input strings.
This solution shows how to extract this data with a single regex and no manual string splitting. I will incorporate the CultureInfo idea from user1666620:
string[] inputStrings = new string[] { "12,3 m", "A12,3 m", "ABC3.45", "4.5 DEF"};
Regex splitterRx = new Regex("([a-zA-Z]*)\\s*([\\d\\.,]+)\\s*([a-zA-Z]*)");
List<Tuple<string, decimal, string>> results = new List<Tuple<string, decimal, string>>();
foreach (var str in inputStrings)
{
var splitterM = splitterRx.Match(str);
if (splitterM.Success)
{
results.Add(new Tuple<string, decimal, string>(
splitterM.Groups[1].Value,
decimal.Parse(
splitterM.Groups[2].Value.Replace(".", System.Globalization.NumberFormatInfo.CurrentInfo.NumberDecimalSeparator).Replace(
",", System.Globalization.NumberFormatInfo.CurrentInfo.NumberDecimalSeparator)
),
splitterM.Groups[3].Value
));
}
}
This will find all possible combinations of a present/not present string in the pre/post position, so be sure to check the individual strings or apply any combining logic to them.
It also doesn't just check for a single space between the decimal and the strings; it accepts any amount of whitespace. If you want to limit it to exactly 0 or 1 space, replace the Regex with this:
Regex splitterRx = new Regex("([a-zA-Z]*)[ ]{0,1}([\\d\\.,]+)[ ]{0,1}([a-zA-Z]*)");

get measurement value only from string

I have a string which gives the measurement followed by the units, in either cm, m or inches.
For example:
The number could be 112cm, 1.12m, 45inches or 45in.
I would like to extract only the number part of the string. Any idea how to use the units as delimiters to extract the number?
While I am at it, I would like to ignore the case of the units.
Thanks
You can try:
string numberMatch = Regex.Match(measurement, @"\d+\.?\d*").Value;
EDIT
Furthermore, converting this to a double is trivial:
double result;
if (double.TryParse(numberMatch, out result))
{
// Yeiiii I've got myself a double ...
}
Use String.Split http://msdn.microsoft.com/en-us/library/tabh47cf.aspx
Something like:
var units = new[] {"cm", "inches", "in", "m"};
var splitnumber = mynumberstring.Split(units, StringSplitOptions.RemoveEmptyEntries);
var number = double.Parse(splitnumber[0], CultureInfo.InvariantCulture); // double, so that 1.12m works too
Using Regex this can help you out:
(?i)(\d+(?:\.\d+)?)(?=c?m|in(?:ch(?:es)?)?)
Breakdown:
(?i) = ignores character case (specify it in C#; the live demo does not have it)
\d+(?:\.\d+)? = supports numbers like 2, 2.25 etc.
(?=c?m|in(?:ch(?:es)?)?) = positive lookahead; the number only matches if it is followed by one of the units m, cm, in, inch or inches
?: = specifies that the group will not capture
? = specifies that the preceding character or group is optional
Demo
EDIT
Sample code:
MatchCollection mcol = Regex.Matches(sampleStr, @"(?i)(\d+(?:\.\d+)?)(?=c?m|in(?:ch(?:es)?)?)");
foreach(Match m in mcol)
{
Debug.Print(m.ToString()); // see output window
}
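If you prefer to keep the pattern free of the inline (?i) flag, the same lookahead can be combined with RegexOptions.IgnoreCase. The following is only a sketch; the class MeasurementNumber and the method TryExtract are invented names, and the number is assumed to sit directly in front of the unit with no space in between.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class MeasurementNumber
{
    // Same pattern as above, with case-insensitivity passed as an option.
    private static readonly Regex Pattern = new Regex(
        @"(\d+(?:\.\d+)?)(?=c?m|in(?:ch(?:es)?)?)",
        RegexOptions.IgnoreCase);

    public static bool TryExtract(string measurement, out double number)
    {
        number = 0;
        Match m = Pattern.Match(measurement);
        return m.Success
            && double.TryParse(m.Groups[1].Value, NumberStyles.Float,
                               CultureInfo.InvariantCulture, out number);
    }

    public static void Main()
    {
        foreach (string s in new[] { "112cm", "1.12m", "45Inches", "45in", "45kg" })
        {
            if (TryExtract(s, out double n))
                Console.WriteLine("{0} -> {1}", s, n.ToString(CultureInfo.InvariantCulture));
            else
                Console.WriteLine("{0} -> unit not recognized", s);
        }
    }
}
```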
I guess I'd try to replace with "" every character that is not number or ".":
//s is the string you need to convert
string tmp=s;
foreach (char c in s.ToCharArray())
{
if (!(c >= '0' && c <= '9') && !(c =='.'))
tmp = tmp.Replace(c.ToString(), "");
}
s=tmp;
Try using the regular expression \d+ to find an integer number.
resultString = Regex.Match(measurementunit, @"\d+").Value;
Is it a requirement that you use the unit as the delimiter? If not, you could extract the number using regex (see Find and extract a number from a string).

C# Regexcollection between special characters

I am trying use regex to parse the values 903001,343001,343491 in the following input:
"contact_value":"903001" other random
"contact_value":"343001" random information
"contact_value":"343491" more random
I used the following in C# but it returns "contact_value":"903001"
MatchCollection numMatch = Regex.Matches(input, @"contact_value\"":\"".*"\""");
thanks in advance
The regex could be as simple as
@"\d+"
If you use @ with strings (e.g. @"string"), escape sequences are not processed. In those strings, you use "" instead of \" to represent a double quote. Try this regex:
var regex = @"contact_value"":""(\d+)""";
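For illustration, here is a small sketch that applies a verbatim-string pattern of this kind to the sample input and collects only the captured digits. The class ContactValues and the method Extract are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ContactValues
{
    public static List<string> Extract(string input)
    {
        var values = new List<string>();
        // In a verbatim string, "" stands for one literal double quote.
        foreach (Match m in Regex.Matches(input, @"""contact_value"":""(\d+)"""))
            values.Add(m.Groups[1].Value); // group 1 = the digits only
        return values;
    }

    public static void Main()
    {
        string input =
            "\"contact_value\":\"903001\" other random\n" +
            "\"contact_value\":\"343001\" random information\n" +
            "\"contact_value\":\"343491\" more random";
        Console.WriteLine(string.Join(",", Extract(input)));
        // prints 903001,343001,343491
    }
}
```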
Try something like:
string input = "\"contact_value\":\"1234567890\"" ;
Regex rx = new Regex( #"^\s*""contact_value""\s*:\s*""(?<value>\d+)""\s*$" ) ;
Match m = rx.Match( input ) ;
if ( !m.Success )
{
Console.WriteLine("Invalid");
}
else
{
string value = m.Groups["value"].Value ;
int n = int.Parse(value) ;
Console.WriteLine( "The contact_value is {0}",n) ;
}
[And read up on how to use regular expressions]

Extract digit in a string

I have a list of string
goal0=1234.4334abc12423423
goal1=-234234
asdfsdf
I want to extract the number part from string that start with goal,
in the above case is
1234.4334, -234234
(if two fragments of digit get the first one)
how should i do it easily?
Note that "goal0=" is part of the string, goal0 is not a variable.
Therefore I would like to have the first digit fragment that come after "=".
You can do the following:
string input = "goal0=1234.4334abc12423423";
input = input.Substring(input.IndexOf('=') + 1);
IEnumerable<char> stringQuery2 = input.TakeWhile(c => Char.IsDigit(c) || c=='.' || c=='-');
string result = string.Empty;
foreach (char c in stringQuery2)
result += c;
double dResult = double.Parse(result);
Try this
string s = "goal0=-1234.4334abc12423423";
string matches = Regex.Match(s, @"(?<=^goal\d+=)-?\d+(\.\d+)?").Value;
The regex says
(?<=^goal\d+=) - A positive lookbehind: look back and make sure goal(one or more digits)= is at the start of the string, but don't make it part of the match
-? - An optional minus sign (the ? means zero or one)
\d+ - One or more digits
(\.\d+)? - A decimal point followed by 1 or more digits, which is optional
This also works if your string contains multiple decimal points: it takes only the first set of digits after the first decimal point, if there is one.
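Here is a short sketch of how the lookbehind pattern could be applied to the whole list from the question, skipping lines that do not start with goal. The class GoalNumbers and its Extract method are hypothetical names, and the numbers are parsed with the invariant culture.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class GoalNumbers
{
    private static readonly Regex Pattern =
        new Regex(@"(?<=^goal\d+=)-?\d+(?:\.\d+)?");

    public static List<double> Extract(IEnumerable<string> lines)
    {
        var numbers = new List<double>();
        foreach (string line in lines)
        {
            Match m = Pattern.Match(line);
            if (m.Success) // lines such as "asdfsdf" are skipped
                numbers.Add(double.Parse(m.Value, CultureInfo.InvariantCulture));
        }
        return numbers;
    }

    public static void Main()
    {
        var lines = new[] { "goal0=1234.4334abc12423423", "goal1=-234234", "asdfsdf" };
        foreach (double d in Extract(lines))
            Console.WriteLine(d.ToString(CultureInfo.InvariantCulture));
    }
}
```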
Use a regex for extracting:
x = Regex.Match(input, @"\d+").Value;
Now convert the resulting string to the number by using:
finalNumber = Int32.Parse(x);
Please try this:
string sample = "goal0=1234.4334abc12423423goal1=-234234asdfsdf";
Regex test = new Regex(#"(?<=\=)\-?\d*(\.\d*)?", RegexOptions.Singleline);
MatchCollection matchlist = test.Matches(sample);
string[] result = new string[matchlist.Count];
if (matchlist.Count > 0)
{
for (int i = 0; i < matchlist.Count; i++)
result[i] = matchlist[i].Value;
}
Hope it helps.
I didn't get the question at first. Sorry, but it works now.
I think this simple expression should work:
Regex.Match(input, @"\d+")
You can use the old VB Val() function from C#. That will extract a number from the front of a string, and it's already available in the framework:
result0 = Microsoft.VisualBasic.Conversion.Val(goal0);
result1 = Microsoft.VisualBasic.Conversion.Val(goal1);
string s = "1234.4334abc12423423";
var result = System.Text.RegularExpressions.Regex.Match(s, @"-?\d+");
List<String> list = new List<String>();
list.Add("goal0=1234.4334abc12423423");
list.Add("goal1=-23423");
list.Add("asdfsdf");
Regex regex = new Regex(@"^goal\d+=(?<GoalNumber>-?\d+(\.\d+)?)");
foreach (string s in list)
{
    if (regex.IsMatch(s))
    {
        // Groups[...] returns a Group; use .Value to get the matched text.
        string numberPart = regex.Match(s).Groups["GoalNumber"].Value;
        // do something with numberPart
    }
}

Regular Expression to match numbers inside parenthesis inside square brackets with optional text

Firstly, I'm in C# here, so that's the flavor of RegEx I'm dealing with. And here are the things I need to be able to match:
[(1)]
or
[(34) Some Text - Some Other Text]
So basically I need to know if what is between the parentheses is numeric and ignore everything between the close parenthesis and close square bracket. Any RegEx gurus care to help?
This should work:
\[\(\d+\).*?\]
And if you need to catch the number, simply wrap \d+ in parentheses:
\[\((\d+)\).*?\]
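As a quick sketch of how the capturing version might be used from C#: the class BracketNumber and the method TryGetNumber are invented for this example, and the value is assumed to fit in an int.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketNumber
{
    private static readonly Regex Pattern = new Regex(@"\[\((\d+)\).*?\]");

    public static bool TryGetNumber(string text, out int number)
    {
        number = 0;
        Match m = Pattern.Match(text);
        // Group 1 holds the digits between the parentheses.
        return m.Success && int.TryParse(m.Groups[1].Value, out number);
    }

    public static void Main()
    {
        foreach (string s in new[] { "[(1)]", "[(34) Some Text - Some Other Text]", "[(abc) Text]" })
        {
            if (TryGetNumber(s, out int n))
                Console.WriteLine("{0} -> {1}", s, n);
            else
                Console.WriteLine("{0} -> no numeric value", s);
        }
    }
}
```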
Do you have to match the []? Can you do just ...
\((\d+)\)
(The numbers themselves will be in the groups).
For example ...
var mg = Regex.Match( "[(34) Some Text - Some Other Text]", #"\((\d+)\)");
if (mg.Success)
{
var num = mg.Groups[1].Value; // num == 34
}
else
{
// No match
}
Regex seems like overkill in this situation. Here is the solution I ended up using.
var src = test.IndexOf('(') + 1;
var dst = test.IndexOf(')');
var result = test.Substring(src, dst - src);
Something like:
\[\(\d+\)[^\]]*\]
Possibly with some more escaping required?
How about "^\[\((\d+)\)" (perl style, not familiar with C#). You can safely ignore the rest of the line, I think.
Depending on what you're trying to accomplish...
List<Boolean> rslt;
String searchIn;
Regex regxObj;
MatchCollection mtchObj;
Int32 mtchGrp;
searchIn = #"[(34) Some Text - Some Other Text] [(1)]";
regxObj = new Regex(#"\[\(([^\)]+)\)[^\]]*\]");
mtchObj = regxObj.Matches(searchIn);
if (mtchObj.Count > 0)
    rslt = new List<bool>(mtchObj.Count);
else
    rslt = new List<bool>();
foreach (Match crntMtch in mtchObj)
{
    // Groups[1] is the text between the parentheses; Value alone would be the whole match.
    if (Int32.TryParse(crntMtch.Groups[1].Value, out mtchGrp))
    {
        rslt.Add(true);
    }
}
How's this? Assuming you only need to determine if the string is a match, and need not extract the numeric value...
string test = "[(34) Some Text - Some Other Text]";
Regex regex = new Regex( "\\[\\(\\d+\\).*\\]" );
Match match = regex.Match( test );
Console.WriteLine( "{0}\t{1}", test, match.Success );
