I want to replace all the floating numbers from a mathematical expression with letters using regular expressions. This is what I've tried:
Regex rx = new Regex("[-]?([0-9]*[.])?[0-9]+");
string expression = "((-30+5.2)*(2+7))-((-3.1*2.5)-9.12)";
char letter = 'a';
while (rx.IsMatch(expression))
{
expression = rx.Replace(expression , letter.ToString(), 1);
letter++;
}
The problem is that an input such as (5-2)+3 becomes: (ab)+c
So the regex treats -2 as a negative number, which I don't want.
I am not experienced with Regex but I think I need something like this:
Check for '-'; if there is one, check whether a digit or a right parenthesis comes before it. If there is NOT, keep the '-' as part of the number.
After that check for digits + dot + digits
My regex above also accepts values like .2 .3 .4, but I don't need that; the leading digit should be explicit: 0.2 0.3 0.4
Following the suggested logic, you may consider
(?:(?<![)0-9])-)?[0-9]+(?:\.[0-9]+)?
See the regex demo.
Regex details
(?:(?<![)0-9])-)? - an optional non-capturing group matching 1 or 0 occurrences of
(?<![)0-9]) - a place in string that is not immediately preceded with a ) or digit
- - a minus
[0-9]+ - 1+ digits
(?:\.[0-9]+)? - an optional non-capturing group matching 1 or 0 occurrences of a . followed with 1+ digits.
In code, it is better to use a match evaluator (see the C# demo online):
Regex rx = new Regex(@"(?:(?<![)0-9])-)?[0-9]+(?:\.[0-9]+)?");
string expression = "((-30+5.2)*(2+7))-((-3.1*2.5)-9.12)";
char letter = (char)96; // char before a in ASCII table
string result = rx.Replace(expression, m =>
{
letter++; // char is incremented
return letter.ToString();
}
);
Console.WriteLine(result); // => ((a+b)*(c+d))-((e*f)-g)
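As a minimal sketch, the same match-evaluator idea can be wrapped in a reusable method. The class and method names here are hypothetical; only the pattern comes from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NumberLabeler
{
    // Pattern from the answer: an optional minus that is not preceded by ')'
    // or a digit, then digits, then an optional fractional part.
    private static readonly Regex NumberRx =
        new Regex(@"(?:(?<![)0-9])-)?[0-9]+(?:\.[0-9]+)?");

    // Hypothetical helper: replaces each number with the next letter, starting at 'a'.
    public static string ReplaceNumbersWithLetters(string expression)
    {
        char letter = 'a';
        // The evaluator runs once per match, left to right;
        // letter++ yields the current letter and then advances it.
        return NumberRx.Replace(expression, m => (letter++).ToString());
    }
}
```

With this sketch, ReplaceNumbersWithLetters("(5-2)+3") gives (a-b)+c, because the minus after 5 is preceded by a digit and is therefore left alone. Note that an expression with more than 26 numbers would run past 'z'.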
I need to find the matching result i.e a string using Regex. Let me demonstrate the scenario using sample inputs.
string input= "xb-cv_107_20190608_032214_006"; // <-1st case
string input = "yb-ha_107_20190608_032214_006__foobar"; // <-2nd case
string input= "fv_vgf_ka01mq3286__20190426_084135_039"; // <-3rd case
string input="fv_vgf_ka01mq3286__2090426_084135_039"; //<-4th case
For 1st case input, output required= "xb-cv_107_20190608_032214_006".
For 2nd case input, output required= "yb-ha_107_20190608_032214_006".
For 3rd case input, output required= "fv_vgf_ka01mq3286__20190426_084135_039".
For 4th case input, output required= null since the pattern does not match.
The procedure to get the output is:
Check with a regex whether the string ends with _ followed by 8 digits, followed by _, followed by 6 digits, followed by _, followed by 3 digits.
Or check whether that same sequence (_, 8 digits, _, 6 digits, _, 3 digits) is followed by __ and then by anything at all.
Till now, I have come up with this Regex expression:
string pattern = @".+[_][0-9]{8}[_][0-9]{6}[_][0-9]{3}([_]{2})?";
var result = Regex.Match(input, pattern)?.Groups[0].Value ;
You may use
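var result = Regex.Match(input, @"^(.+_[0-9]{8}_[0-9]{6}_[0-9]{3})(?:__|$)").Groups[1].Value;
Regex details:
^ - start of string
( - Group 1 start:
.+ - any 1+ chars other than LF, as many as possible
_[0-9]{8}_[0-9]{6}_[0-9]{3} - _, 8 digits, _, 6 digits, _, 3 digits
) - end of Group 1
(?:__|$) - either two underscores or the end of the string (cases 1 and 3 have no trailing __).
If there is a match, the result holds the value that resides in Group 1.
If there is no match, result is an empty string rather than null: Regex.Match never returns null, and Groups[1].Value of a failed match is "". Check Match.Success if you need null.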
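A minimal sketch that runs the four sample inputs, assuming the trailing __ should be optional (cases 1 and 3 have none), hence the (?:__|$) ending. The class and method names are hypothetical. Because Regex.Match returns a failed Match object rather than null, the sketch checks Success explicitly to produce the null the question asks for.

```csharp
using System.Text.RegularExpressions;

public static class TimestampedName
{
    // Group 1: everything up to and including _8digits_6digits_3digits,
    // which must be followed by "__" or by the end of the string.
    private static readonly Regex Rx =
        new Regex(@"^(.+_[0-9]{8}_[0-9]{6}_[0-9]{3})(?:__|$)");

    // Hypothetical helper: returns the matched prefix, or null when nothing matches.
    public static string Extract(string input)
    {
        Match m = Rx.Match(input);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Extract("yb-ha_107_20190608_032214_006__foobar") returns "yb-ha_107_20190608_032214_006", and Extract("fv_vgf_ka01mq3286__2090426_084135_039") returns null because the date part has only 7 digits.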
I need a regex to validate a user-entered string, which may be in any of the following formats.
For example: 1w 2d 1h, 2w 1h, or 1w 2d, and likewise other combinations of a number followed by w, d, or h.
I am looking for the regex to allow the combinations number and w or d or h.
Is it possible to have a regex like that way?
You could say you want
\d+ any number, 1 or more times
[wdh] one of w, d, h
(?: |$) space or end of string
Then put this in a group, loop it 1 or more times
(?: start of non-capture group
)+ end of non-capture group, 1 or more times
^ and $ start and end of string respectively
Result
^(?:\d+[wdh](?: |$))+$
Edit: I see you've added more requirements in the comments of your question, this regex will not fulfil your most recent requirements
We can try writing a rudimentary parser to check the input string.
string input = "1w 2d 1h";
string[] parts = Regex.Split(input, @"\s+");
bool success = true;
if (parts.Length > 3)
{
success = false;
}
else
{
Regex regex = new Regex(@"^\d+[wdh]$"); // anchored so that parts like "x1wy" are rejected
foreach (string part in parts)
{
Match match = regex.Match(part);
if (!match.Success)
{
success = false;
}
}
}
if (success)
{
Console.WriteLine("MATCH");
}
else
{
Console.WriteLine("NO MATCH");
}
This answer might carry its own weight if, in your C# application, you also needed to extract the numerical values of each component.
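If the numerical values are needed, one possible sketch uses named groups instead of splitting. It assumes the components are optional but must appear in the order w, d, h, and that an empty string is invalid. The class, method, and group names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class DurationParser
{
    // Each component is optional; the order w, d, h is fixed (an assumption).
    // [0-9] is used instead of \d so that only ASCII digits are accepted.
    private static readonly Regex Rx = new Regex(
        @"^(?:(?<w>[0-9]+)w)?\s*(?:(?<d>[0-9]+)d)?\s*(?:(?<h>[0-9]+)h)?$");

    public static bool TryParse(string input, out int weeks, out int days, out int hours)
    {
        weeks = days = hours = 0;
        Match m = Rx.Match(input);
        // The pattern alone would accept an empty or whitespace-only string, so reject it here.
        if (!m.Success || input.Trim().Length == 0)
            return false;
        return ReadComponent(m.Groups["w"], out weeks)
            && ReadComponent(m.Groups["d"], out days)
            && ReadComponent(m.Groups["h"], out hours);
    }

    // A missing component counts as 0; a value too large for int fails the parse.
    private static bool ReadComponent(Group g, out int value)
    {
        value = 0;
        return !g.Success || int.TryParse(g.Value, out value);
    }
}
```

For "2w 1h" this yields weeks = 2, days = 0, hours = 1, while "1h 2w" is rejected because of the fixed order. Because of \s*, input without spaces such as "1w2d" is also accepted; tighten that if it is not wanted.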
Try ^(?:\d+w ?)?(?:\d+d ?)?(?:\d+h)?$
Explanation:
^ - match beginning of a string
(?:...) - non-capturing group
\d+ - match one or more digits
w, d, h - match literally w, d, h respectively
? - match preceding pattern zero or one time
Finally, the following string seems to be working fine for me...
^(?:\d+w)?((?:\d+d)|(?: \d+d))?((?:\d+h)|(?: \d+h))?$
Thanks, everyone for helping.
This works for your requirement. Hope this helps! Ordering (w->d->h) guaranteed.
^(\d+w){0,1}\s*(\d+d){0,1}\s*(\d+h){0,1}$
Test cases:
10w 12d 11h
1w 2d
2w 1h
1d 1h
1d 1h
3w 2h
I am passing a correctly formatted string, but the match does not return true.
string dimensionsString= "13.5 inches high x 11.42 inches wide x 16.26 inches deep";
// or 10.1 x 12.5 x 30.9 inches
// or 10.1 x 12.5 x 30.9 inches ; 3.2 pounds
Regex rgxFormat = new Regex(@"^([0-9\.]+) ([a-z]+) x ([0-9\.]+) ([a-z]+) x ([0-9\.]+) ([a-z]+)( ; ([0-9\.]+) ([a-z]+))?$");
if (rgxFormat.IsMatch(dimensionsString))
{
//match
}
I can't understand what's wrong with the code.
Your pattern only accounts for single words after the numbers. Allow any number of symbols there (with .* or .*?) to fix the pattern:
^([0-9.]+) (.*?) x ([0-9.]+) (.*?) x ([0-9.]+) (.*?)( ; ([0-9.]+) (.*))?$
See the regex demo.
Note that the last .* is used with a greedy quantifier since it is the last unknown bit in the string (to match all the rest of the string). The .*? are non-greedy versions that match as few occurrences of any char but a newline as possible.
Replace regular spaces with \s to match any kind of whitespace if necessary.
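The pattern above still expects some text between each number and the following x, so the shorter formats mentioned in the question's comments (such as 10.1 x 12.5 x 30.9 inches) would not match it. The sketch below is one possible variant, not part of the original answer: it makes the text after each number optional and parses the three dimensions. The class and method names are hypothetical.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class DimensionParser
{
    // Variant of the pattern above in which the text after each number is optional.
    // Groups 1, 3 and 5 hold the three dimensions.
    private static readonly Regex Rx = new Regex(
        @"^([0-9.]+)(?: (.*?))? x ([0-9.]+)(?: (.*?))? x ([0-9.]+)(?: (.*?))?(?: ; ([0-9.]+) (.*))?$");

    public static bool TryParse(string s, out double first, out double second, out double third)
    {
        first = second = third = 0;
        Match m = Rx.Match(s);
        return m.Success
            && ParseInvariant(m.Groups[1].Value, out first)
            && ParseInvariant(m.Groups[3].Value, out second)
            && ParseInvariant(m.Groups[5].Value, out third);
    }

    // [0-9.]+ also admits strings such as "1.2.3", so the value is validated here.
    private static bool ParseInvariant(string text, out double value)
    {
        return double.TryParse(text, NumberStyles.AllowDecimalPoint,
                               CultureInfo.InvariantCulture, out value);
    }
}
```

Parsing with CultureInfo.InvariantCulture keeps the dot as the decimal separator regardless of the machine's regional settings.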
So I want the formats xxxxxx-xxxx AND xxxxxxxx-xxxx to be possible. I've managed to fix the first section before the dash, but the last four digits are troublesome. It does require to match at least 4 characters, but I also want the regex to return false if there's more than 4 characters. How do I do it?
This is how it looks so far:
var regex = new Regex(@"^\d{6,8}[-|(\s)]{0,1}\d{4}");
And this is the results:
var regex = new Regex(@"^\d{6,8}[-|(\s)]{0,1}\d{4}");
Match m = regex.Match("840204-2344");
Console.WriteLine(m.Success); // Outputs True
m = regex.Match("19840204-2344");
Console.WriteLine(m.Success); // Outputs True
m = regex.Match("19840204-23");
Console.WriteLine(m.Success); // Outputs False
m = regex.Match("19840204-2323423423");
Console.WriteLine(m.Success); // Outputs True, and this is what I don't want
The \d{6,8} pattern matches 6, 7 or 8 digits, so a 7-digit prefix is accepted, which already makes the pattern wrong. Besides, [-|(\s)]{0,1} matches 1 or 0 -, (, ), | or whitespace chars, so it will also match strings like 19840204|2323, 19840204(2323 and 19840204)2323. Finally, there is no $ anchor at the end, so anything may follow the last four digits.
You may use
^\d{6}(?:\d{2})?[-\s]?\d{4}$
See the regex demo.
Details
^ - start of string
\d{6} - 6 digits
(?:\d{2})? - optional 2 digits
[-\s]? - an optional - or whitespace char
\d{4} - 4 digits
$ - end of string.
To make \d only match ASCII digits, pass the RegexOptions.ECMAScript option. Example:
var res = Regex.IsMatch(s, @"^\d{6}(?:\d{2})?[-\s]?\d{4}$", RegexOptions.ECMAScript);
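As a quick check, the suggested pattern can be wrapped in a small validator and run against the inputs from the question. This is a sketch; the class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class IdNumberValidator
{
    // 6 digits, optionally 2 more, an optional '-' or whitespace, then exactly 4 digits.
    // RegexOptions.ECMAScript restricts \d to ASCII digits.
    private static readonly Regex Rx =
        new Regex(@"^\d{6}(?:\d{2})?[-\s]?\d{4}$", RegexOptions.ECMAScript);

    public static bool IsValid(string s)
    {
        return Rx.IsMatch(s);
    }
}
```

IsValid accepts "840204-2344" and "19840204-2344" and rejects "19840204-23" and "19840204-2323423423". A 7-digit prefix such as "1984020-2344" is rejected as well.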
You are forgetting the $ at the end:
var regex = new Regex(@"^(\d{6}|\d{8})-\d{4}$");
If you want to match the social security number anywhere in a string, you can also use \b to test for boundaries:
var regex = new Regex(@"\b(\d{6}|\d{8})-\d{4}\b");
Edit: I corrected the RegEx to fix the problems mentioned in the comments. The commenters are right, of course. In my earlier post I just wanted to explain why the RegEx matched the longer string.
Looking for a regex string that will let me find the rightmost (if any) group of digits embedded in a string. We only care about contiguous digits. We don't care about sign, commas, decimals, etc. Those, if found should simply be treated as non-digits just like a letter.
This is for replacement/incrementing purposes, so we also need to capture everything before and after the detected number in order to reconstruct the string after incrementing the value. In other words, we need a regex with capture groups for each part.
Here are examples of what we are looking for:
"abc123def456ghi" should identify the '456'
"abc123def456ghi789jkl" should identify the '789'
"abc123def" should identify the '123'
"123ghi" should identify the '123'
"abc123,456ghi" should identify the '456'
"abc-654def" should identify the '654'
"abcdef" shouldn't return any match
As an example of what we want, it would be something like starting with the name 'Item 4-1a', extracting out the '1' with everything before being the prefix and everything after being the suffix. Then using that, we can generate the values 'Item 4-2a', 'Item 4-3a' and 'Item 4-4a' in a code loop.
Now, if I were looking for the first set, this would be easy. I'd just find the first contiguous block of 0 or more non-digits for the prefix, then the block of 1 or more contiguous digits for the number, and everything else to the end would be the suffix.
The issue I'm having is how to define the prefix as including all (if any) numbers except the last set. Everything I try for the prefix keeps swallowing that last set, even when I've tried anchoring it to the end by basically reversing the above.
How about:
^(.*?)(\d+)(\D*)$
then increment the second group and concat all 3.
Explanation:
^ : Beginning of string
( : start of 1st capture group
.*? : any number of any char, non-greedy
) : end group
( : start of 2nd capture group
\d+ : one or more digits
) : end group
( : start of 3rd capture group
\D* : any number of non digit char
) : end group
$ : end of string
The first capture group will match all characters up to the first digit of the last group of digits before the end of the string.
Or, if you can use named groups:
^(?<prefix>.*?)(?<number>\d+)(?<suffix>\D*)$
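A minimal sketch of the increment step, using the 'Item 4-1a' example from the question. The class and method names are hypothetical. As an assumption, the sketch writes [0-9] and [^0-9] in place of \d and \D so that only ASCII digits reach long.Parse, and it assumes the number fits in a long.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class LastNumber
{
    // Lazy prefix, then the last run of digits, then a digit-free suffix up to the end.
    private static readonly Regex Rx = new Regex(
        @"^(?<prefix>.*?)(?<number>[0-9]+)(?<suffix>[^0-9]*)$");

    // Hypothetical helper: returns the name with its last number incremented,
    // or null when the name contains no digits.
    public static string Increment(string name)
    {
        Match m = Rx.Match(name);
        if (!m.Success)
            return null;
        long value = long.Parse(m.Groups["number"].Value, CultureInfo.InvariantCulture) + 1;
        return m.Groups["prefix"].Value
             + value.ToString(CultureInfo.InvariantCulture)
             + m.Groups["suffix"].Value;
    }
}
```

Increment("Item 4-1a") returns "Item 4-2a", and calling it again on the result gives "Item 4-3a". Leading zeros are not preserved: "x007" becomes "x8".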
Try the following regex:
(\d+)(?!.*\d)
Explanation:
(\d+) # One or more digits.
(?!.*\d) # (zero-width) Negative look-ahead: Don't find any characters followed with a digit.
EDIT (off-topic for the question): This answer is incorrect, but the question has already been answered in other posts, so rather than delete this one I will show another use of the same regex. In Perl, for example, it can be used like this to get the same result as in C# (incrementing the last number):
s/(\d+)(?!.*\d)/$1 + 1/e;
You can also try a slightly simpler version:
(\d+)[^\d]*$
This should do it:
Regex regexObj = new Regex(@"
# Grab last set of digits, prefix and suffix.
^ # Anchor to start of string.
(.*) # $1: Stuff before last set of digits.
(?<!\d) # Anchor start of last set of digits.
(\d+) # $2: Last set of one or more digits.
(\D*) # $3: Zero or more trailing non digits.
$ # Anchor to end of string.
", RegexOptions.IgnorePatternWhitespace);
What about not using Regex? Here's a code snippet (for a console application):
string[] myStringArray = new string[] { "abc123def456ghi", "abc123def456ghi789jkl", "abc123def", "123ghi", "abcdef", "abc-654def" };
char[] numberSet = new char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
foreach (string myString in myStringArray)
{
    Console.WriteLine("your string - {0}", myString);
    // Index of the last digit in the string, or -1 if there is none.
    int index1 = myString.LastIndexOfAny(numberSet);
    if (index1 == -1)
        Console.WriteLine("no number");
    else
    {
        // Walk left from the last digit to the start of its run of digits.
        // Any non-digit (letter, '-', ',', ...) ends the run.
        int index2 = index1;
        while (index2 > 0 && Array.IndexOf(numberSet, myString[index2 - 1]) >= 0)
            index2--;
        string prefix = myString.Substring(0, index2);
        string number = myString.Substring(index2, index1 - index2 + 1);
        string suffix = myString.Substring(index1 + 1);
        Console.WriteLine("prefix - {0}", prefix);
        Console.WriteLine("number - {0}", number);
        Console.WriteLine("suffix - {0}", suffix);
        Console.WriteLine("_________________");
    }
}
Console.Read();