How does MatchEvaluator in Regex.Replace work? - c#

The input string is 23x * y34x2. I want to insert " * " (a star surrounded by spaces) after every digit that is followed by a letter, and after every letter that is followed by a digit. So my output string would look like this: 23 * x * y * 34 * x * 2.
This is the regex that does the job: @"\d(?=[a-z])|[a-z](?=\d)". This is the function I wrote to insert the " * ":
Regex reg = new Regex(@"\d(?=[a-z])|[a-z](?=\d)");
MatchCollection matchC;
matchC = reg.Matches(input);
int ii = 1;
foreach (Match element in matchC) // use the index of each match in the original string
{
input = input.Insert(element.Index + ii, " * "); // ii starts at 1 so the insert lands just after the matched character
ii += 3; // each insert adds 3 characters, so later indexes shift by 3
}
return input; //return modified input
My question: how do I do the same job using the .NET MatchEvaluator? I am new to regex and don't really understand how replacing with a MatchEvaluator works. This is the code that I tried to write:
{
Regex reg = new Regex(@"\d(?=[a-z])|[a-z](?=\d)");
MatchEvaluator matchEval = new MatchEvaluator(ReplaceStar);
input = reg.Replace(input, matchEval);
return input;
}
public string ReplaceStar( Match match )
{
//return What??
}

A MatchEvaluator is a delegate that takes a Match object and returns the string that should replace that match. Inside the delegate you can also refer to the groups of the match. You can rewrite your code as follows:
string input = "23x * y34x2";
Regex reg = new Regex(@"\d(?=[a-z])|[a-z](?=\d)");
string result = reg.Replace(input, delegate(Match m) {
return m.Value + " * ";
});
To give an example of how this works: the first time the delegate is called, the Match parameter will be a match on the string "3". The delegate in this case is defined to return the matched text concatenated with " * ". So the first "3" is replaced with "3 * ".
The process continues in this way, with the delegate being called once for each match in the original string.
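For completeness, here is a minimal sketch of the same idea using a named method, which is the shape the question's ReplaceStar stub was aiming for. The class name StarInserter and the method InsertStars are illustrative assumptions, not part of the original code, and ReplaceStar is made static here for simplicity.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StarInserter
{
    // A digit followed by a letter, or a letter followed by a digit.
    // The lookaheads do not consume the following character.
    private static readonly Regex Boundary =
        new Regex(@"\d(?=[a-z])|[a-z](?=\d)");

    // Has the MatchEvaluator signature: called once per match,
    // and its return value replaces the matched text.
    private static string ReplaceStar(Match match)
    {
        return match.Value + " * ";
    }

    // Hypothetical wrapper name, used only for this example.
    public static string InsertStars(string input)
    {
        return Boundary.Replace(input, new MatchEvaluator(ReplaceStar));
    }

    public static void Main()
    {
        Console.WriteLine(InsertStars("23x * y34x2")); // 23 * x * y * 34 * x * 2
    }
}
```

Because every match is computed against the original string, there is no need to track a running offset as in the Insert-based loop.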

Related

Replace only 'n' occurences of a substring in a string in C#

I have a input string like -
abbdabab
How can I replace only the 2nd, 3rd, and subsequent occurrences of the substring "ab" with an arbitrary string like "x", one occurrence at a time, keeping the original string intact? Example in this case:
1st output - xbdabab
2nd output - abbdxab
3rd output - abbdabx
and so on...
I have tried using Regex like -
int occCount = Regex.Matches("abbdabab", "ab").Count;
if (occCount > 1)
{
for (int i = 1; i <= occCount; i++)
{
Regex regReplace = new Regex("ab");
string modifiedValue = regReplace.Replace("abbdabab", "x", i);
//decodedMessages.Add(modifiedValue);
}
}
Here I am able to get the 1st output when the counter i is 1, but not the subsequent results. Is there an overload of Replace that can achieve this? Or can anyone point out where I might have gone wrong?
You can try IndexOf instead of regular expressions:
string source = "abbdabab";
string toFind = "ab";
string toSet = "X";
for (int index = source.IndexOf(toFind);
index >= 0;
index = source.IndexOf(toFind, index + 1)) {
string result = source.Substring(0, index) +
toSet +
source.Substring(index + toFind.Length);
Console.WriteLine(result);
}
Outcome:
Xbdabab
abbdXab
abbdabX
You can use a StringBuilder:
string s = "abbdabab";
var matches = Regex.Matches(s, "ab");
StringBuilder sb = new StringBuilder(s);
var m = matches[0]; // 0 for first output, 1 for second output, and so on
sb.Remove(m.Index, m.Length);
sb.Insert(m.Index, "x");
var result = sb.ToString();
Console.WriteLine(result);
You may use a dynamically built regex to be used with regex.Replace directly:
var s = "abbdabab";
var idx = 1; // First = 1, Second = 2
var search = "ab";
var repl = "x";
var pat = new Regex($@"(?s)((?:{search}.*?){{{idx-1}}}.*?){search}"); // ((?:ab.*?){0}.*?)ab
Console.WriteLine(pat.Replace(s, $"${{1}}{repl}", 1));
The pattern will look like ((?:ab.*?){0}.*?)ab and will match
(?s) - RegexOptions.Singleline to make . also match newlines
((?:ab.*?){0}.*?) - Group 1 (later, this value will be put back into the result with ${1} backreference)
(?:ab.*?){0} - 0 occurrences of ab followed with any 0+ chars as few as possible
.*? - any 0+ chars as few as possible
ab - the search string/pattern.
The last argument to pat.Replace is 1, so that only the first occurrence could be replaced.
If search is a literal text, you need to use var search = Regex.Escape("a+b");.
If the repl can have $, add repl = repl.Replace("$", "$$");.
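Another option, sketched here under the assumption that the search text is a literal, is to let a MatchEvaluator count matches and replace only the one you want. The names NthReplacer and ReplaceNth are hypothetical, introduced only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NthReplacer
{
    // Hypothetical helper: replaces only the n-th (1-based) occurrence
    // of a literal substring and leaves every other occurrence as it is.
    public static string ReplaceNth(string source, string search, string replacement, int n)
    {
        int seen = 0;
        // Regex.Escape makes the search text match literally.
        return Regex.Replace(source, Regex.Escape(search), m =>
        {
            seen++;
            // The evaluator's return value is inserted as-is,
            // so "$" in the replacement needs no escaping here.
            return seen == n ? replacement : m.Value;
        });
    }

    public static void Main()
    {
        for (int n = 1; n <= 3; n++)
        {
            Console.WriteLine(ReplaceNth("abbdabab", "ab", "x", n));
        }
        // xbdabab
        // abbdxab
        // abbdabx
    }
}
```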

Why is this Regex.Match failing?

I have this little method to look for the 3-digit number in a string and increment it by one. The types of strings I am passing in are like CP1-P-CP2-004-D and MOT03-C-FP04-003.
char[] alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".ToCharArray();
foreach (char c in alphabet)
{
m = Regex.Match(s, @"\d{3}(?=[" + c + "-]|$)");
}
if (m.Success)
{
int i = Convert.ToInt32(m.Value); i += 1;
Console.WriteLine(s + " - " + i.ToString("D3"));
}
else { Console.WriteLine(s + " - No success"); }
EDIT: Initially I just had this; to test out my Regex.Match case:
Match m = Regex.Match(s, @"\d{3}(?=[A-]|$)");
And it worked with CP1PCP2001A no worries, but when I updated it, and tried CP1PCP2001C it returned "No Success", while CP1PCP2001 works no problem. Can anyone tell me why this is?
Have you tried
m = Regex.Match(s, @"\d{3}(?=[A-Z\-]|$)");
[A-Z] matches any capital letter between A and Z, which eliminates the need for char[] alphabet, and the \- escapes the hyphen so that it is treated as a literal '-' rather than as part of a range.
From the comments, we're looking for "the first 3 digit number (coming from the right)". Here's a literal implementation:
m = Regex.Match(s, @"\d{3}", RegexOptions.RightToLeft);
This is more permissive towards unexpected characters than the other answers. You can decide whether that's good or bad for your application.
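Your loop overwrites m on every iteration, so after the loop only the result for the last letter ('Z') is left. Rewrite the code this way, stopping at the first success: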
bool matched = false;
foreach (char c in alphabet)
{
m = Regex.Match(s, @"\d{3}(?=[" + c + "-]|$)");
if (m.Success)
{
int i = Convert.ToInt32(m.Value); i += 1;
Console.WriteLine(s + " - " + i.ToString("D3"));
matched=true;
break;
}
}
if(!matched)
Console.WriteLine(s + " - No success");
A better way would be not to loop at all and to specify the character range in the regex itself.
Example:
m = Regex.Match(s, @"\d{3}(?=[A-Z\-]|$)");
if (m.Success)
{
int i = Convert.ToInt32(m.Value); i += 1;
Console.WriteLine(s + " - " + i.ToString("D3"));
}
else
Console.WriteLine(s + " - No success");
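If the goal is to get the whole string back with the number incremented (an assumption; the code above only prints the new number), the same pattern can be combined with a MatchEvaluator. This is a sketch, and the names CounterIncrementer and Increment are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CounterIncrementer
{
    // Three digits followed by a capital letter, a hyphen, or the end of the string.
    private static readonly Regex ThreeDigits =
        new Regex(@"\d{3}(?=[A-Z\-]|$)");

    // Hypothetical helper: returns s with the first matching 3-digit number incremented.
    public static string Increment(string s)
    {
        // The last argument (1) limits the replacement to the first match.
        // Note that 999 becomes 1000, since "D3" only pads, it does not truncate.
        return ThreeDigits.Replace(
            s,
            m => (int.Parse(m.Value) + 1).ToString("D3"),
            1);
    }

    public static void Main()
    {
        Console.WriteLine(Increment("CP1-P-CP2-004-D"));  // CP1-P-CP2-005-D
        Console.WriteLine(Increment("MOT03-C-FP04-003")); // MOT03-C-FP04-004
    }
}
```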

Taking a piece of a REGEX and setting it to a String

Is there any way to take a part out of a regex? Let's say I have a match for this
\s*(string)\s*(.*\()\s*(\d*)\)\s*;?(.*)
and I want to change it like this
Regex.Replace(line, @"\s*(string)\s*(.*\()\s*(\d*)\)\s*;?(.*)", "$1 $2($3) // $4", RegexOptions.IgnoreCase);
Is there any way I can grab the $4 by itself and set it equal to some string variable?
Let's say the regex match is: string (55) ;comment
In this case I'd like to get the word comment only and set it to a string without going through the String.Split function. Ultimately, though, I'd just like to get the digits between the parentheses.
There's an overload for the Replace method which takes a MatchEvaluator delegate:
string pattern = "...";
string result = Regex.Replace(line, pattern, m =>
{
int digits = 0;
string comment = m.Groups[4].Value; // $4
int.TryParse(m.Groups[3].Value, out digits); // $3
return string.Format("{0} {1}({2}) // {3}",
m.Groups[1].Value, m.Groups[2].Value, digits, comment);
}, RegexOptions.IgnoreCase);
Hope this helps.
Yes, if I understand the question correctly:
var re = new Regex(@"\s*(string)\s*(.*\()\s*(\d*)\)\s*;?(.*)");
var match = re.Match(input);
if (match.Success)
{
int i = match.Groups[4].Index;
int n = match.Groups[4].Length;
input = input.Substring(0, i) + replacementString + input.Substring(i + n);
}
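If you only need to read the captured values rather than replace anything, Regex.Match with the Groups collection is enough. The following is a small sketch using the pattern from the question; GroupExtractor and TryExtract are hypothetical names chosen for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupExtractor
{
    // Same pattern as in the question; group 3 is the digits, group 4 the trailing text.
    private static readonly Regex LinePattern = new Regex(
        @"\s*(string)\s*(.*\()\s*(\d*)\)\s*;?(.*)",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: copies $3 and $4 into ordinary string variables.
    public static bool TryExtract(string line, out string digits, out string comment)
    {
        Match match = LinePattern.Match(line);
        digits = match.Success ? match.Groups[3].Value : "";
        comment = match.Success ? match.Groups[4].Value : "";
        return match.Success;
    }

    public static void Main()
    {
        if (TryExtract("string (55) ;comment", out string digits, out string comment))
        {
            Console.WriteLine(digits);  // 55
            Console.WriteLine(comment); // comment
        }
    }
}
```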

Regular expression to test for /CXXX/ needed

I am just starting to learn about regular expressions. What I need is to check for a slash followed by "C" followed by three uppercase characters / numbers and then another slash followed by anything.
var a = "/C001/dsafalkdsfjsadfj";
var b = "/CXXX/adsf";
Can someone tell me how I can do a check for this within an if test?
if ( regular expression ) {}
Try this. You wrote "and then another slash followed by anything", which is not what your example shows, so here are the variants.
According to the sentence:
\/C[A-Z0-9]{3}\/$
According to the example:
\/C[A-Z0-9]{3}\/[a-z]$
According to your response:
\/C[A-Z0-9]{3}\/
Regex regex = new Regex(@"\/C[A-Z0-9]{3}\/");
MatchCollection matches = regex.Matches(yourstring);
if (matches.Count > 0) { /* ... */ }
string input = "/C001/dsafalkdsfjsadfj";
var pattern = @"/C[A-Z0-9]{3}/.*";
var matches = Regex.Matches(input, pattern);
string result = "";
for (int i = 0; i < matches.Count; i++)
{
result += "match " + i + ",value:" + matches[i].Value + "\n";
}
Console.WriteLine("Result:\n"+result);
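For a plain yes/no check inside an if, Regex.IsMatch avoids building a match collection. The sketch below assumes the slash must be the first character of the string, which is why the pattern starts with ^; drop it if the sequence may appear anywhere. PathChecker and HasCodePrefix are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathChecker
{
    // Hypothetical helper: true when the string starts with "/C",
    // then three uppercase letters or digits, then another "/".
    public static bool HasCodePrefix(string value)
    {
        return Regex.IsMatch(value, @"^/C[A-Z0-9]{3}/");
    }

    public static void Main()
    {
        var a = "/C001/dsafalkdsfjsadfj";
        var b = "/CXXX/adsf";

        if (HasCodePrefix(a))
        {
            Console.WriteLine("a matches");
        }
        if (HasCodePrefix(b))
        {
            Console.WriteLine("b matches");
        }
    }
}
```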

C# RegEx string extraction

I have a string:
"ImageDimension=655x0;ThumbnailDimension=0x0".
I have to extract the first number (the string "655"), which comes between "ImageDimension=" and the first occurrence of "x",
and the second number (the string "0"), which comes after the first "x" following "ImageDimension=". The same goes for the third and fourth numbers.
Can this be done with a regex (something like "ImageDimension=? x ?;ThumbnailDimension=? x ?"), and how, instead of clumsy Substring and IndexOf calls? Thank you!
My solution, which is not nice:
String configuration = "ImageDimension=655x0;ThumbnailDimension=0x0";
String imageDim = configuration.Substring(0, configuration.IndexOf(";"));
int indexOfEq = imageDim.IndexOf("=");
int indexOfX = imageDim.IndexOf("x");
String width1 = imageDim.Substring(indexOfEq+1, indexOfX-indexOfEq-1);
String height1 = imageDim.Substring(imageDim.IndexOf("x") + 1);
String thumbDim = configuration.Substring(configuration.IndexOf(";") + 1);
indexOfEq = thumbDim.IndexOf("=");
indexOfX = thumbDim.IndexOf("x");
String width2 = thumbDim.Substring(indexOfEq + 1, indexOfX - indexOfEq-1); // read from thumbDim, not imageDim
String height2 = thumbDim.Substring(thumbDim.IndexOf("x") + 1);
This will get each of the values into separate ints for you:
string text = "ImageDimension=655x0;ThumbnailDimension=0x0";
Regex pattern = new Regex(@"ImageDimension=(?<imageWidth>\d+)x(?<imageHeight>\d+);ThumbnailDimension=(?<thumbWidth>\d+)x(?<thumbHeight>\d+)");
Match match = pattern.Match(text);
int imageWidth = int.Parse(match.Groups["imageWidth"].Value);
int imageHeight = int.Parse(match.Groups["imageHeight"].Value);
int thumbWidth = int.Parse(match.Groups["thumbWidth"].Value);
int thumbHeight = int.Parse(match.Groups["thumbHeight"].Value);
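The snippet above throws if the input does not match, because int.Parse is then given an empty string. A slightly more defensive variant could look like the sketch below; DimensionParser and ParseDimensions are hypothetical names, and returning null on failure is just one possible convention.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DimensionParser
{
    private static readonly Regex Pattern = new Regex(
        @"ImageDimension=(?<imageWidth>\d+)x(?<imageHeight>\d+);" +
        @"ThumbnailDimension=(?<thumbWidth>\d+)x(?<thumbHeight>\d+)");

    private static readonly string[] GroupNames =
        { "imageWidth", "imageHeight", "thumbWidth", "thumbHeight" };

    // Hypothetical helper: returns the four numbers in pattern order,
    // or null when the text does not match or a number does not fit in an int.
    public static int[] ParseDimensions(string text)
    {
        Match match = Pattern.Match(text);
        if (!match.Success)
        {
            return null;
        }

        var values = new int[GroupNames.Length];
        for (int i = 0; i < GroupNames.Length; i++)
        {
            if (!int.TryParse(match.Groups[GroupNames[i]].Value, out values[i]))
            {
                return null;
            }
        }
        return values;
    }

    public static void Main()
    {
        int[] dims = ParseDimensions("ImageDimension=655x0;ThumbnailDimension=0x0");
        Console.WriteLine(dims == null ? "no match" : string.Join(",", dims)); // 655,0,0,0
    }
}
```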
var groups = Regex.Match(input, @"ImageDimension=(\d+)x(\d+);ThumbnailDimension=(\d+)x(\d+)").Groups;
var x1= groups[1].Value;
var y1= groups[2].Value;
var x2= groups[3].Value;
var y2= groups[4].Value;
var m = Regex.Match(str, @"(\d+).(\d+).*?(\d+).(\d+)");
var first = m.Groups[1].Value; // 655 ....
(\d+)
Get the first set of one or more digits and store it as the first captured group (group 0 is the entire match).
.
Match any character.
(\d+)
Get the next set of one or more digits and store it as the second captured group.
.*?
Match any number of any characters in a non-greedy fashion.
(\d+)
Get the next set of one or more digits and store it as the third captured group.
.
Match any character.
(\d+)
Get the next set of one or more digits and store it as the fourth captured group.
Since a lot of people have already given you what you wanted, I will contribute something else. Regexes are hard to read and error-prone. This is maybe only a little less verbose than your implementation, but it is more straightforward and friendly than using a regex:
private static Dictionary<string, string> _extractDictionary(string str)
{
var query = from name_value in str.Split(';') // Split by ;
let arr = name_value.Split('=') // ... then by =
select new {Name = arr[0], Value = arr[1]};
return query.ToDictionary(x => x.Name, y => y.Value);
}
public static void Main()
{
var str = "ImageDimension=655x0;ThumbnailDimension=0x0";
var dic = _extractDictionary(str);
foreach (var key_value in dic)
{
var key = key_value.Key;
var value = key_value.Value;
Console.WriteLine("Value of {0} is {1}.", key, value.Substring(0, value.IndexOf("x")));
}
}
Sure, it's pretty easy. The regex pattern you're looking for is:
^ImageDimension=(\d+)x0;.+$
The first group in the match is the number you want.
