I have a string which gives a measurement followed by the units, in either cm, m or inches.
For example:
The string could be 112cm, 1.12m, 45inches or 45in.
I would like to extract only the number part of the string. Any idea how to use the units as delimiters to extract the number?
While I am at it, I would like to ignore the case of the units.
Thanks
You can try:
string numberMatch = Regex.Match(measurement, @"\d+\.?\d*").Value;
EDIT
Furthermore, converting this to a double is trivial:
double result;
if (double.TryParse(numberMatch, out result))
{
// Yeiiii I've got myself a double ...
}
Use String.Split http://msdn.microsoft.com/en-us/library/tabh47cf.aspx
Something like:
var units = new[] {"cm", "inches", "in", "m"};
var splitnumber = mynumberstring.Split(units, StringSplitOptions.RemoveEmptyEntries);
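// Parse as double so that values such as 1.12 work; note that Split is case-sensitive.
var number = double.Parse(splitnumber[0], System.Globalization.CultureInfo.InvariantCulture);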
Using Regex this can help you out:
(?i)(\d+(?:\.\d+)?)(?=c?m|in(?:ch(?:es)?)?)
Break up:
(?i) = ignores character case // specify it in C#; the live demo does not support it
\d+(\.\d+)? = supports numbers like 2, 2.25 etc.
(?=c?m|in(ch(es)?)?) = positive lookahead: the number matches only if it is followed by one of the units
m, cm, in, inch or inches.
?: = specifies that the group will not capture
? = specifies the preceding character or group is optional
Demo
EDIT
Sample code:
MatchCollection mcol = Regex.Matches(sampleStr, @"(?i)(\d+(?:\.\d+)?)(?=c?m|in(?:ch(?:es)?)?)");
foreach(Match m in mcol)
{
Debug.Print(m.ToString()); // see output window
}
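For what it's worth, here is a minimal sketch that anchors the pattern to the whole string and returns the number together with the unit. The class and method names (MeasurementParser, TryParse) are made up for illustration, and it assumes the decimal separator is always "." regardless of the current culture.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class MeasurementParser
{
    // Number, optional whitespace, then one of: cm, m, in, inch, inches (any case).
    private static readonly Regex Pattern = new Regex(
        @"^\s*(?<number>[0-9]+(?:\.[0-9]+)?)\s*(?<unit>cm|m|in(?:ch(?:es)?)?)\s*$",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    public static bool TryParse(string text, out double number, out string unit)
    {
        number = 0;
        unit = string.Empty;
        if (text == null) return false;

        Match match = Pattern.Match(text);
        if (!match.Success) return false;

        // Normalise the unit to lower case so callers can compare it directly.
        unit = match.Groups["unit"].Value.ToLowerInvariant();

        // Assumption: "." is the decimal separator, so parse with the invariant culture.
        return double.TryParse(match.Groups["number"].Value, NumberStyles.Float,
                               CultureInfo.InvariantCulture, out number);
    }
}
```

Calling MeasurementParser.TryParse("1.12M", out n, out u) should give n = 1.12 and u = "m", while a string with an unknown unit such as "12ft" is rejected.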
I guess I'd try to replace with "" every character that is not number or ".":
//s is the string you need to convert
string tmp=s;
foreach (char c in s.ToCharArray())
{
if (!(c >= '0' && c <= '9') && !(c =='.'))
tmp = tmp.Replace(c.ToString(), "");
}
s=tmp;
Try using regular expression \d+ to find an integer number.
resultString = Regex.Match(measurementunit, @"\d+").Value;
Is it a requirement that you use the unit as the delimiter? If not, you could extract the number using regex (see Find and extract a number from a string).
Related
I have a kinda simple problem, but I want to solve it in the best way possible. Basically, I have a string in this kind of format: <some letters><some numbers>, i.e. q1 or qwe12. What I want to do is get two strings from that (then I can convert the number part to an integer, or not, whatever). The first one being the "string part" of the given string, so i.e. qwe and the second one would be the "number part", so 12. And there won't be a situation where the numbers and letters are being mixed up, like qw1e2.
Of course, I know, that I can use a StringBuilder and then go with a for loop and check every character if it is a digit or a letter. Easy. But I think it is not a really clear solution, so I am asking you is there a way, a built-in method or something like this, to do this in 1-3 lines? Or just without using a loop?
You can use a regular expression with named groups to identify the different parts of the string you are interested in.
For example:
string input = "qew123";
var match = Regex.Match(input, "(?<letters>[a-zA-Z]+)(?<numbers>[0-9]+)");
if (match.Success)
{
Console.WriteLine(match.Groups["letters"]);
Console.WriteLine(match.Groups["numbers"]);
}
You can try Linq as an alternative to regular expressions:
string source = "qwe12";
string letters = string.Concat(source.TakeWhile(c => c < '0' || c > '9'));
string digits = string.Concat(source.SkipWhile(c => c < '0' || c > '9'));
You can use the Where() extension method from System.Linq library (https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.where), to filter only chars that are digit (number), and convert the resulting IEnumerable that contains all the digits to an array of chars, that can be used to create a new string:
string source = "qwe12";
string stringPart = new string(source.Where(c => !Char.IsDigit(c)).ToArray());
string numberPart = new string(source.Where(Char.IsDigit).ToArray());
MessageBox.Show($"String part: '{stringPart}', Number part: '{numberPart}'");
Source:
https://stackoverflow.com/a/15669520/8133067
If possible, add a space between the letters and the numbers (q 3, zet 64 etc.) and use string.Split.
Otherwise, use the for loop; it isn't that hard.
You can test as part of an aggregation:
var z = "qwe12345";
var b = z.Aggregate(new []{"", ""}, (acc, s) => {
if (Char.IsDigit(s)) {
acc[1] += s;
} else {
acc[0] += s;
}
return acc;
});
Assert.Equal(new [] {"qwe", "12345"}, b);
I have a list of string
goal0=1234.4334abc12423423
goal1=-234234
asdfsdf
I want to extract the number part from the strings that start with goal;
in the above case that is
1234.4334, -234234
(if there are two runs of digits, take the first one).
How can I do this easily?
Note that "goal0=" is part of the string, goal0 is not a variable.
Therefore I would like to have the first digit fragment that come after "=".
You can do the following:
string input = "goal0=1234.4334abc12423423";
input = input.Substring(input.IndexOf('=') + 1);
IEnumerable<char> stringQuery2 = input.TakeWhile(c => Char.IsDigit(c) || c=='.' || c=='-');
string result = string.Empty;
foreach (char c in stringQuery2)
result += c;
double dResult = double.Parse(result);
Try this
string s = "goal0=-1234.4334abc12423423";
string matches = Regex.Match(s, @"(?<=^goal\d+=)-?\d+(\.\d+)?").Value;
The regex says
(?<=^goal\d+=) - A positive lookbehind, which means look back and make sure goal(1 or more digits)= is at the start of the string, but don't make it part of the match
-? - A minus sign which is optional (the ? means zero or one)
\d+ - One or more digits
(\.\d+)? - A decimal point followed by 1 or more digits which is optional
This also works if the string contains more than one decimal point, because the match takes only the first run of digits after the first decimal point, if there is one.
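As a rough sketch of how this could be applied to the whole list from the question, the following uses a named group instead of the lookbehind and converts each hit to a double. GoalExtractor and ExtractGoalValues are hypothetical names, and it assumes "." is the decimal separator.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class GoalExtractor
{
    // "goal" + digits + "=" at the start, then an optionally signed number.
    private static readonly Regex GoalPattern =
        new Regex(@"^goal[0-9]+=(?<value>-?[0-9]+(?:\.[0-9]+)?)");

    public static List<double> ExtractGoalValues(IEnumerable<string> lines)
    {
        var values = new List<double>();
        foreach (string line in lines)
        {
            Match match = GoalPattern.Match(line);
            if (match.Success)
            {
                // Only the first run of digits after "=" is captured.
                values.Add(double.Parse(match.Groups["value"].Value,
                                        CultureInfo.InvariantCulture));
            }
        }
        return values;
    }
}
```

Lines that do not start with goal<digits>= (such as asdfsdf) are simply skipped.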
Use a regex for extracting:
x = Regex.Match(input, @"\d+").Value;
Now convert the resulting string to the number by using:
finalNumber = Int32.Parse(x);
Please try this:
string sample = "goal0=1234.4334abc12423423goal1=-234234asdfsdf";
Regex test = new Regex(@"(?<=\=)\-?\d*(\.\d*)?", RegexOptions.Singleline);
MatchCollection matchlist = test.Matches(sample);
string[] result = new string[matchlist.Count];
if (matchlist.Count > 0)
{
for (int i = 0; i < matchlist.Count; i++)
result[i] = matchlist[i].Value;
}
Hope it helps.
I didn't get the question at first. Sorry, but it works now.
I think this simple expression should work:
Regex.Match(input, @"\d+")
You can use the old VB Val() function from C#. That will extract a number from the front of a string, and it's already available in the framework:
result0 = Microsoft.VisualBasic.Conversion.Val(goal0);
result1 = Microsoft.VisualBasic.Conversion.Val(goal1);
string s = "1234.4334abc12423423";
var result = System.Text.RegularExpressions.Regex.Match(s, @"-?\d+");
List<String> list = new List<String>();
list.Add("goal0=1234.4334abc12423423");
list.Add("goal1=-23423");
list.Add("asdfsdf");
Regex regex = new Regex(@"^goal\d+=(?<GoalNumber>-?\d+(?:\.\d+)?)");
foreach (string s in list)
{
if(regex.IsMatch(s))
{
string numberPart = regex.Match(s).Groups["GoalNumber"].Value;
// do something with numberPart
}
}
I have the following string:
string value = "123.456L";
What is the best way to parse this string into a string and a double:
double number = 123.456;
string measure = "L";
Instead of the L, we could also have something else, like oz, m/s, liter, kilograms, etc
Assuming that the units of measure are always expressed as a single character at the back of the string, you can do this:
string value = "123.456L";
var pos = value.LastIndexOfAny("0123456789".ToCharArray());
double number = double.Parse(value.Substring(0, pos+1));
string measure = value.Substring(pos+1);
Based on the comment explaining the input, I'd use Regex.
double number = double.Parse(Regex.Match(value, @"[\d.]+").Value);
string measure = value.Replace(number.ToString(), "");
The regex [\d.] will match any number or ., the + means it must be for 1 or more matches.
I'd do it like this:
public bool TryParseUnit ( string sValue, out double fValue, out string sUnit )
{
fValue = 0;
sUnit = null;
if ( !String.IsNullOrEmpty ( sValue ) )
{
sUnit = GetUnit ( sValue );
if ( sUnit != null )
{
return ( Double.TryParse ( sValue.Substring ( 0, sValue.Length - sUnit.Length ),
out fValue ) );
}
}
return ( false );
}
private string GetUnit ( string sValue )
{
string sSuffix = sValue.Substring ( sValue.Length - 1 );
switch ( sSuffix.ToLower () )
{
case "l":
return ( "L" );
}
return ( null );
}
I know it's more complicated than the other answers but this way you can also validate the data during parsing and discard invalid input.
You could do it with a regex
using System.Text.RegularExpressions;
Regex reg = new Regex(@"([\d.]*)(\w*)");
string value = "123.4L";
MatchCollection matches = reg.Matches(value);
foreach (Match match in matches)
{
if (match.Success)
{
GroupCollection groups = match.Groups;
Console.WriteLine(groups[1].Value); // will be 123.4
Console.WriteLine(groups[2].Value); // will be L
}
}
So what this will do is look for 0 or more digits or "." and group them, and then look for 0 or more word characters (letters, digits or underscore) as a second group. You can then get the groups from each match and read their values. This will work if you want to change the type of measurement and will work if you don't have a decimal point either.
Edit: It is important to note that you must use groups[1] for the first group and groups[2] for the second group. If you use groups[0] it will give you the entire matched text.
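Note that \w does not cover units such as m/s. A possible variation, sketched here with made-up names (QuantityParser, TryParse), anchors the pattern, treats everything after the number as the unit, and parses with the invariant culture. It assumes the number always comes first and that "." is the decimal separator.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class QuantityParser
{
    // A number, optional whitespace, then a unit that must not start with a digit or '.'.
    private static readonly Regex Pattern = new Regex(
        @"^\s*(?<number>-?[0-9]+(?:\.[0-9]+)?)\s*(?<unit>[^\s0-9.].*?)\s*$");

    public static bool TryParse(string text, out double number, out string unit)
    {
        number = 0;
        unit = string.Empty;

        Match match = Pattern.Match(text ?? string.Empty);
        if (!match.Success) return false;

        unit = match.Groups["unit"].Value;
        return double.TryParse(match.Groups["number"].Value, NumberStyles.Float,
                               CultureInfo.InvariantCulture, out number);
    }
}
```

With this, "123.456L" splits into 123.456 and "L", "9.81 m/s" into 9.81 and "m/s", and an input without any unit (for example "12.5") is rejected rather than guessed at.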
You might want to take a look at Units.NET on GitHub and NuGet. It supports parsing abbreviations in different cultures, but it is still on my TODO list to add support for parsing combinations of numbers and units. I have already done this on a related project, so it should be straight-forward to add.
Update Apr 2015: You can now parse units and values by Length.Parse("5.3 m"); and similar for other units.
Simply put: look for all characters that are 0..9 or . and trim them into a new string, then put the remaining part in a second string. I can give code in a minute.
Edit: Yes, I meant digits 0-9, corrected it. But easier is to get index of last number and ignore stuff before for the trimming.
You can try this:
string ma = Regex.Match(name, @"((\d\s)|(\d+\s)|(\d+)|(\d+\.\d+\s))(g\s|kg\s|ml\s)").Value;
this will match:
40 g , 40g , 12.5 g , 1 kg , 2kg , 150 ml ....
I need for text like "joe ($3,004.50)" to be filtered down to 3004.50 but am terrible at regex and can't find a suitable solution. So only numbers and periods should stay - everything else filtered. I use C# and VS.net 2008 framework 3.5
This should do it:
string s = "joe ($3,004.50)";
s = Regex.Replace(s, "[^0-9.]", "");
The regex is:
[^0-9.]
You can cache the regex:
Regex not_num_period = new Regex("[^0-9.]");
then use:
string result = not_num_period.Replace("joe ($3,004.50)", "");
However, you should keep in mind that some cultures have different conventions for writing monetary amounts, such as: 3.004,50.
You are dealing with a string - string is an IEnumerable<char>, so you can use LINQ:
var input = "joe ($3,004.50)";
var result = String.Join("", input.Where(c => Char.IsDigit(c) || c == '.'));
Console.WriteLine(result); // 3004.50
For the accepted answer, MatthewGunn raises a valid point in that all digits, commas, and periods in the entire string will be condensed together. This will avoid that:
string s = "joe.smith ($3,004.50)";
Regex r = new Regex(@"(?:^|[^\w.,])(\d[\d,.]+)(?=\W|$)");
Match m = r.Match(s);
string v = null;
if (m.Success) {
v = m.Groups[1].Value;
v = Regex.Replace(v, ",", "");
}
The approach of removing offending characters is potentially problematic. What if there's another . in the string somewhere? It won't be removed, though it should!
Removing non-digits or periods, the string joe.smith ($3,004.50) would transform into the unparseable .3004.50.
Imho, it is better to match a specific pattern, and extract it using a group. Something simple would be to find all contiguous commas, digits, and periods with regexp:
[\d,\.]+
Sample test run:
Pattern understood as:
[\d,\.]+
Enter string to check if matches pattern
> a2.3 fjdfadfj34 34j3424 2,300 adsfa
Group 0 match: "2.3"
Group 0 match: "34"
Group 0 match: "34"
Group 0 match: "3424"
Group 0 match: "2,300"
Then for each match, remove all commas and send that to the parser. To handle case of something like 12.323.344, you could do another check to see that a matching substring has at most one ..
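The steps above could be sketched roughly as follows. AmountExtractor and ExtractAmounts are names invented for this example, and the code assumes commas are thousands separators and "." is the decimal point. Rather than counting periods explicitly, it relies on decimal.TryParse rejecting any candidate with more than one decimal point.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class AmountExtractor
{
    // Runs of ASCII digits, commas and periods are the candidate amounts.
    private static readonly Regex Candidate = new Regex(@"[0-9,.]+");

    public static List<decimal> ExtractAmounts(string text)
    {
        var amounts = new List<decimal>();
        foreach (Match match in Candidate.Matches(text))
        {
            // Assumption: commas only ever group thousands, so they can be dropped.
            string cleaned = match.Value.Replace(",", "");

            // AllowDecimalPoint permits a single '.', so "12.323.344"
            // and a lone "." fail to parse and are skipped.
            decimal amount;
            if (decimal.TryParse(cleaned, NumberStyles.AllowDecimalPoint,
                                 CultureInfo.InvariantCulture, out amount))
            {
                amounts.Add(amount);
            }
        }
        return amounts;
    }
}
```

For joe.smith ($3,004.50) the stray period after joe is discarded and the only amount returned is 3004.50.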
Firstly, I'm in C# here so that's the flavor of RegEx I'm dealing with. And here are the things I need to be able to match:
[(1)]
or
[(34) Some Text - Some Other Text]
So basically I need to know if what is between the parentheses is numeric and ignore everything between the close parenthesis and close square bracket. Any RegEx gurus care to help?
This should work:
\[\(\d+\).*?\]
And if you need to catch the number, simply wrap \d+ in parentheses:
\[\((\d+)\).*?\]
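In C# the capturing version might be used along these lines. This is only a sketch: BracketParser and TryGetNumber are invented names, the pattern is anchored to the whole string, and it uses [^\]]* instead of .*? so the match cannot run past the first closing bracket.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class BracketParser
{
    // "[(" + digits + ")" + anything except "]" + "]"
    private static readonly Regex Pattern =
        new Regex(@"^\[\((?<number>[0-9]+)\)[^\]]*\]$");

    public static bool TryGetNumber(string text, out int number)
    {
        number = 0;
        Match match = Pattern.Match(text);

        // TryParse also guards against digit runs too long for an int.
        return match.Success
            && int.TryParse(match.Groups["number"].Value, out number);
    }
}
```

Both [(1)] and [(34) Some Text - Some Other Text] are accepted, while something like [(ab) x] is not.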
Do you have to match the []? Can you do just ...
\((\d+)\)
(The numbers themselves will be in the groups).
For example ...
var mg = Regex.Match( "[(34) Some Text - Some Other Text]", @"\((\d+)\)");
if (mg.Success)
{
var num = mg.Groups[1].Value; // num == 34
}
else
{
// No match
}
Regex seems like overkill in this situation. Here is the solution I ended up using.
var src = test.IndexOf('(') + 1;
var dst = test.IndexOf(')'); // index of ')' is the exclusive end of the number
var result = test.Substring(src, dst - src);
Something like:
\[\(\d+\)[^\]]*\]
Possibly with some more escaping required?
How about "^\[\((\d+)\)" (Perl style; I'm not familiar with C#)? You can safely ignore the rest of the line, I think.
Depending on what you're trying to accomplish...
List<Boolean> rslt;
String searchIn;
Regex regxObj;
MatchCollection mtchObj;
Int32 mtchGrp;
searchIn = @"[(34) Some Text - Some Other Text] [(1)]";
regxObj = new Regex(@"\[\(([^\)]+)\)[^\]]*\]");
mtchObj = regxObj.Matches(searchIn);
if (mtchObj.Count > 0)
rslt = new List<bool>(mtchObj.Count);
else
rslt = new List<bool>();
foreach (Match crntMtch in mtchObj)
{
if (Int32.TryParse(crntMtch.Groups[1].Value, out mtchGrp))
{
rslt.Add(true);
}
}
How's this? Assuming you only need to determine if the string is a match, and need not extract the numeric value...
string test = "[(34) Some Text - Some Other Text]";
Regex regex = new Regex( "\\[\\(\\d+\\).*\\]" );
Match match = regex.Match( test );
Console.WriteLine( "{0}\t{1}", test, match.Success );