RegExp: X number of matches => X number of replacements? - c#

Using regular expressions, I'm trying to match a string that contains a substring made up of an unknown number of repeats (one or more), and then replace each repeat with a replacement string, so that the number of replacements equals the number of repeats.
If the regex is "(st)[a]+(ck)", then I want to get this kind of result:
"stack" => "stOck"
"staaack" => "stOOOck" //so three times "a" to be replaced with three times "O"
"staaaaack" => "stOOOOOck"
How do I do that?
Either C# or AS3 would do.

If you use .NET, you can do this:
find: (?<=\bsta*)a(?=a*ck\b)
replace: O
If you also want to change every sta+ck that is a substring of another word, just remove the \b anchors.
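The find/replace pair above can be wired into C# roughly as follows. This is a minimal sketch, not the answerer's own code: the class and method names (LookaroundReplace, ReplaceRun) are made up for illustration, and it relies on .NET allowing variable-length lookbehind.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundReplace
{
    // Replaces each 'a' that lies between "st" and "ck" inside a whole word.
    // The lookbehind and lookahead keep "st" and "ck" out of the match,
    // so every single 'a' is matched (and replaced) on its own.
    public static string ReplaceRun(string input)
    {
        return Regex.Replace(input, @"(?<=\bsta*)a(?=a*ck\b)", "O");
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceRun("stack staaack stackoverflow"));
        // prints: stOck stOOOck stackoverflow
    }
}
```

Because each 'a' is a separate match, the number of replacements equals the number of repeats without any callback.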

Since I am not familiar with either C# or AS3, I will write a solution in JavaScript, but the concept in the solution can be used for C# code or AS3 code.
var str = "stack stackoverflow staaaaaack stOackoverflow should not replace";
var replaced = str.replace(/st(a+)ck/g, function ($0, $1) {
var r = "";
for (var i = 0; i < $1.length; i++) {
r += "O";
}
return "st" + r + "ck";
});
Output:
"stOck stOckoverflow stOOOOOOck stOackoverflow should not replace"
In C#, you would use Regex.Replace(String, String, MatchEvaluator) (or one of the other Regex.Replace overloads that take a MatchEvaluator delegate) to achieve the same effect.
In AS3, you can pass a function as replacement, similar to how I did above in JavaScript. Check out the documentation of String.replace() method.
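A C# counterpart of the JavaScript callback might look like the sketch below. The names EvaluatorReplace and ReplaceRun are illustrative only; the lambda is converted to a MatchEvaluator by the Regex.Replace(String, String, MatchEvaluator) overload.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EvaluatorReplace
{
    public static string ReplaceRun(string input)
    {
        // Group 1 captures the run of 'a' characters; the evaluator
        // builds a run of 'O' with the same length.
        return Regex.Replace(input, "st(a+)ck",
            m => "st" + new string('O', m.Groups[1].Length) + "ck");
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceRun("stack stackoverflow staaack stOackoverflow"));
        // prints: stOck stOckoverflow stOOOck stOackoverflow
    }
}
```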

In AS3 you can pass a function to the replace method of the String object; the matched text and the captured groups are available in the arguments array. So you can build and return a new String with every 'a' replaced by 'O'.
For example:
// first way explicit loop
var s:String="staaaack";
trace("before", s);
var newStr:String = s.replace(/(st)(a+)(ck)/g, function():String{
var ret:String=arguments[1]; // here match 'st'
//arguments[2] match 'aaa..'
for (var i:int=0, len:int=arguments[2].length; i < len; i++)
ret += "O";
return ret + arguments[3]; // arguments[3] match 'ck'
});
trace("after", newStr); // output stOOOOck
// second way array and join
var s1:String="staaaack staaaaaaaaaaaaack stack paaaack"
trace("before", s1)
var after:String = s1.replace(/(st)(a+)(ck)/g, function():String{
return arguments[1]+(new Array(arguments[2].length+1)).join("O")+arguments[3]
})
trace("after", after)
A live example is on wonderfl: http://wonderfl.net/c/bOwE

Why not use the String Replace() method instead?
var str = "stack";
str = str.Replace("a", "O");

I would do it like this:
String s = "Staaack";
Console.WriteLine(s);
while (Regex.Match(s,"St[O]*([a]{1})[a]*ck").Success){
s = Regex.Replace(s,"(St[O]*)([a]{1})([a]*ck)", "$1O$3");
Console.WriteLine(s);
}
Console.WriteLine(s);
Console.ReadLine();
It replaces one a with each iteration, until no more a's can be found.

Related

Remove anything from string after any "a-zA-Z" char

I have this types of string:
"10a10", "10b5641", "5a1121", "438z2a5f"
and I need to remove anything after the FIRST a-zA-Z char in the string (the symbol itself should be removed as well). What could be a solution?
Examples of results I expect:
"10a10" returns "10"
"10b5641" returns "10"
"5a1121" returns "5"
"438z2a5f" returns "438"
You could use a regular expression with the Regex class, something like:
string str = "10a10";
str = Regex.Replace(str, @"[a-zA-Z].*", "");
Console.WriteLine(str);
will output:
10
Basically, it matches the first a-zA-Z character and everything after it (.* matches any character zero or more times) and removes that from the string.
An easy to understand approach would be to use the String.IndexOfAny Method to find the Index of the first a-zA-Z char, and then use the String.Substring Method to cut the string accordingly.
To do so you would create an array containing all a-zA-Z characters and use this as an argument to String.IndexOfAny. After that you use 0 and the result of String.IndexOfAny as arguments for String.Substring.
I am pretty sure there are more elegant ways to do this, but this seems the most basic approach to me, so it's worth mentioning.
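The IndexOfAny/Substring steps described above could be sketched like this. The names PrefixCutter and BeforeFirstLetter are hypothetical, and returning the whole input when no letter is found is an assumption the answer does not spell out.

```csharp
using System;
using System.Linq;

public static class PrefixCutter
{
    // All ASCII letters, built once: 'a'..'z' followed by 'A'..'Z'.
    private static readonly char[] Letters =
        Enumerable.Range('a', 26).Concat(Enumerable.Range('A', 26))
                  .Select(i => (char)i).ToArray();

    public static string BeforeFirstLetter(string input)
    {
        int index = input.IndexOfAny(Letters);
        // IndexOfAny returns -1 when no letter is present;
        // in that case the whole string is kept (an assumption).
        return index < 0 ? input : input.Substring(0, index);
    }

    public static void Main()
    {
        foreach (string s in new[] { "10a10", "10b5641", "5a1121", "438z2a5f" })
            Console.WriteLine(BeforeFirstLetter(s));
        // prints 10, 10, 5 and 438 on separate lines
    }
}
```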
You could do so using Linq as follows.
var result = new string(strInput.TakeWhile(x => !char.IsLetter(x)).ToArray());
var sList = new List<string> { "10a10", "10b5641", "5a1121", "438z2a5f" };
foreach (string s in sList.ToArray())
{
string number = new string(s.TakeWhile(c => !Char.IsLetter(c)).ToArray());
Console.WriteLine(number);
}
Either Linq:
var result = string.Concat(strInput
.TakeWhile(c => !((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))));
Or regular expression:
using System.Text.RegularExpressions;
...
var result = Regex.Match(strInput, "^[^A-Za-z]*").Value;
In both cases we start at the beginning of strInput and take characters until the first a-z or A-Z character occurs.
Demo:
string[] tests = new[] {
"10a10", "10b5641", "5a1121", "438z2a5f"
};
string demo = string.Join(Environment.NewLine, tests
.Select(test => $"{test,-10} returns \"{Regex.Match(test, "^[^A-Za-z]*").Value}\""));
Console.Write(demo);
Outcome:
10a10 returns "10"
10b5641 returns "10"
5a1121 returns "5"
438z2a5f returns "438"

Get a number and string from string

I have a kinda simple problem, but I want to solve it in the best way possible. Basically, I have a string in this kind of format: <some letters><some numbers>, e.g. q1 or qwe12. What I want to do is get two strings from that (then I can convert the number part to an integer, or not, whatever). The first one is the "string part" of the given string, e.g. qwe, and the second one is the "number part", so 12. And there won't be a situation where the numbers and letters are mixed up, like qw1e2.
Of course, I know, that I can use a StringBuilder and then go with a for loop and check every character if it is a digit or a letter. Easy. But I think it is not a really clear solution, so I am asking you is there a way, a built-in method or something like this, to do this in 1-3 lines? Or just without using a loop?
You can use a regular expression with named groups to identify the different parts of the string you are interested in.
For example:
string input = "qew123";
var match = Regex.Match(input, "(?<letters>[a-zA-Z]+)(?<numbers>[0-9]+)");
if (match.Success)
{
Console.WriteLine(match.Groups["letters"]);
Console.WriteLine(match.Groups["numbers"]);
}
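If the number part should end up as an integer, as the question mentions, the named groups can feed int.TryParse. The sketch below is one possible arrangement; LetterNumberSplitter and TrySplit are invented names, and anchoring the pattern with ^ and $ is an added assumption that the whole input has the letters-then-digits shape.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class LetterNumberSplitter
{
    // Anchored so the whole input must be letters followed by digits.
    private static readonly Regex Shape =
        new Regex("^(?<letters>[a-zA-Z]+)(?<numbers>[0-9]+)$");

    public static bool TrySplit(string input, out string letters, out int number)
    {
        letters = string.Empty;
        number = 0;
        Match match = Shape.Match(input);
        if (!match.Success) return false;
        letters = match.Groups["letters"].Value;
        // TryParse also rejects digit runs that overflow an int.
        return int.TryParse(match.Groups["numbers"].Value, NumberStyles.None,
                            CultureInfo.InvariantCulture, out number);
    }

    public static void Main()
    {
        if (TrySplit("qwe12", out string letters, out int number))
            Console.WriteLine($"{letters} / {number}"); // prints: qwe / 12
    }
}
```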
You can try Linq as an alternative to regular expressions:
string source = "qwe12";
string letters = string.Concat(source.TakeWhile(c => c < '0' || c > '9'));
string digits = string.Concat(source.SkipWhile(c => c < '0' || c > '9'));
You can use the Where() extension method from System.Linq library (https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.where), to filter only chars that are digit (number), and convert the resulting IEnumerable that contains all the digits to an array of chars, that can be used to create a new string:
string source = "qwe12";
string stringPart = new string(source.Where(c => !Char.IsDigit(c)).ToArray());
string numberPart = new string(source.Where(Char.IsDigit).ToArray());
MessageBox.Show($"String part: '{stringPart}', Number part: '{numberPart}'");
Source:
https://stackoverflow.com/a/15669520/8133067
If possible, add a space between the letters and the numbers (q 3, zet 64, etc.) and use string.Split.
Otherwise, use the for loop; it isn't that hard.
You can test as part of an aggregation:
var z = "qwe12345";
var b = z.Aggregate(new []{"", ""}, (acc, s) => {
if (Char.IsDigit(s)) {
acc[1] += s;
} else {
acc[0] += s;
}
return acc;
});
Assert.Equal(new [] {"qwe", "12345"}, b);

Replace text in camelCase inside the string

I have string like:
/api/agencies/{AgencyGuid}/contacts/{ContactGuid}
I need to change the text in { } to camelCase
/api/agencies/{agencyGuid}/contacts/{contactGuid}
How can I do that? What is the best way to do that? Please help
I have no experience with Regex. So, I have tried so far:
string str1 = "/api/agencies/{AgencyGuid}/contacts/{ContactGuid}";
string str3 = "";
int i = 0;
while(i < str1.Length)
{
if (str1[i] == '{')
{
str3 += "{" + char.ToLower(str1[i + 1]);
i = i + 2;
} else
{
str3 += str1[i];
i++;
}
}
You can do it with regex of course.
But you can do it also with LINQ like this:
var result = String.Join("/{",
str1.Split(new string[1] { "/{" }, StringSplitOptions.RemoveEmptyEntries)
.Select(k => k = !k.StartsWith("/") ? Char.ToLowerInvariant(k[0]) + k.Substring(1) : k));
What is done here is: splitting on "/{" into 3 parts:
"/api/agencies"
"AgencyGuid}/contacts"
"ContactGuid}"
After that, each element is mapped as follows: if it starts with "/", it is the first element and is returned untouched. Otherwise, take the first char (k[0]), change it to lowercase (Char.ToLowerInvariant()) and concatenate it with the rest.
At the end, Join puts those three strings (one unchanged and two changed) back together with "/{".
With Regex you can do it as:
var regex = new Regex(@"\/{(\w)");
var result = regex.Replace(str1, m => m.ToString().ToLower());
The pattern "/{\w" finds "/{" followed by one word character (\w); the parentheses capture that character in a group. The replacement callback then lower-cases the whole match with m.ToString().ToLower(), which in practice changes only the captured letter, because "/" and "{" have no case.
I probably wouldn't use regex, but since you asked
Regex.Replace(
"/api/agencies/{AgencyGuid}/contactpersons/{ContactPersonGuid}",
@"\{[^\}]+\}",
m =>
$"{{{m.Value[1].ToString().ToLower()}{m.Value.Substring(2, m.Value.Length-3)}}}",
RegexOptions.ExplicitCapture
)
This assumes string interpolation in c# 6, but you can do the same thing by concatenating.
Explanation:
\{[^\}]+\} - match an open mustache, then all characters that are not a close mustache, and then the close mustache
m => ... - A lambda to run on each match
$"{{{m.Value[1].ToString().ToLower()}{m.Value.Substring(2, m.Value.Length-3)}}}" - return a new string made of an open mustache, the first letter lowercased, the rest of the name, and a close mustache.

replacing characters in a single field of a comma-separated list

I have string in my c# code
a,b,c,d,"e,f",g,h
I want to replace "e,f" with "e f" i.e. ',' which is inside inverted comma should be replaced by space.
I tried using string.split but it is not working for me.
OK, I can't be bothered to think of a regex approach so I am going to offer an old fashioned loop approach which will work:
string DoReplace(string input)
{
bool isInner = false;//flag to detect if we are in the inner string or not
string result = "";//result to return
foreach(char c in input)//loop each character in the input string
{
if(isInner && c == ',')//if we are in an inner string and it is a comma, append space
result += " ";
else//otherwise append the character
result += c;
if(c == '"')//if we have hit an inner quote, toggle the flag
isInner = !isInner;
}
return result;
}
NOTE: This solution assumes that there can only be one level of inner quotes, for example you cannot have "a,b,c,"d,e,"f,g",h",i,j" - because that's just plain madness!
For the scenario where you only need to match one pair of letters, the following regex will work:
string source = "a,b,c,d,\"e,f\",g,h";
string pattern = "\"([\\w]),([\\w])\"";
string replace = "\"$1 $2\"";
string result = Regex.Replace(source, pattern, replace);
Console.WriteLine(result); // a,b,c,d,"e f",g,h
Breaking apart the pattern, it is matching any instance where there is a "X,X" sequence where X is any letter, and is replacing it with the very same sequence, with a space in between the letters instead of a comma.
You could easily extend this to match more than one letter, etc., as needed.
For the case where you can have multiple letters separated by commas within quotes that need to be replaced, the following can do it for you. Sample text is a,b,c,d,"e,f,a",g,h:
string source = "a,b,c,d,\"e,f,a\",g,h";
string pattern = "\"([ ,\\w]+),([ ,\\w]+)\"";
string replace = "\"$1 $2\"";
string result = source;
while (Regex.IsMatch(result, pattern)) {
result = Regex.Replace(result, pattern, replace);
}
Console.WriteLine(result); // a,b,c,d,"e f a",g,h
This does something similar compared to the first one, but just removes any comma that is sandwiched by letters surrounded by quotes, and repeats it until all cases are removed.
Here's a somewhat fragile but simple solution:
string.Join("\"", line.Split('"').Select((s, i) => i % 2 == 0 ? s : s.Replace(",", " ")))
It's fragile because it doesn't handle flavors of CSV that escape double-quotes inside double-quotes.
Use the following code:
string str = "a,b,c,d,\"e,f\",g,h";
string[] str2 = str.Split('\"');
var str3 = str2.Select(p => ((p.StartsWith(",") || p.EndsWith(",")) ? p : p.Replace(',', ' '))).ToList();
str = string.Join("\"", str3); // rejoin with the quotes that Split removed
Use Split() and Join():
string input = "a,b,c,d,\"e,f\",g,h";
string[] pieces = input.Split('"');
for ( int i = 1; i < pieces.Length; i += 2 )
{
pieces[i] = string.Join(" ", pieces[i].Split(','));
}
string output = string.Join("\"", pieces);
Console.WriteLine(output);
// output: a,b,c,d,"e f",g,h

Extract digit in a string

I have a list of string
goal0=1234.4334abc12423423
goal1=-234234
asdfsdf
I want to extract the number part from string that start with goal,
in the above case is
1234.4334, -234234
(if two fragments of digit get the first one)
how should i do it easily?
Note that "goal0=" is part of the string, goal0 is not a variable.
Therefore I would like to have the first digit fragment that come after "=".
You can do the following:
string input = "goal0=1234.4334abc12423423";
input = input.Substring(input.IndexOf('=') + 1);
IEnumerable<char> stringQuery2 = input.TakeWhile(c => Char.IsDigit(c) || c=='.' || c=='-');
string result = string.Empty;
foreach (char c in stringQuery2)
result += c;
double dResult = double.Parse(result);
Try this
string s = "goal0=-1234.4334abc12423423";
string matches = Regex.Match(s, @"(?<=^goal\d+=)-?\d+(\.\d+)?").Value;
The regex says:
(?<=^goal\d+=) - A positive lookbehind: it checks that goal, one or more digits, and = appear at the start of the string, without making them part of the match
-? - An optional minus sign (the ? means zero or one)
\d+ - One or more digits
(\.\d+)? - An optional decimal point followed by one or more digits
If the string contains more than one decimal point, the match still stops after the first fractional part, so only the first number is returned.
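Applied to the three sample lines from the question, the pattern above might be used as in this sketch. GoalParser and Parse are hypothetical names, and parsing with CultureInfo.InvariantCulture is an added assumption so that "." is always read as the decimal separator.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class GoalParser
{
    private static readonly Regex GoalNumber =
        new Regex(@"(?<=^goal\d+=)-?\d+(\.\d+)?");

    // Returns the first number after "goalN=", or null when the line does not match.
    public static double? Parse(string line)
    {
        Match m = GoalNumber.Match(line);
        if (!m.Success) return null;
        return double.Parse(m.Value, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        foreach (string line in new[] { "goal0=1234.4334abc12423423", "goal1=-234234", "asdfsdf" })
        {
            double? value = Parse(line);
            Console.WriteLine(value.HasValue
                ? value.Value.ToString(CultureInfo.InvariantCulture)
                : "(no match)");
        }
    }
}
```

Lines that do not start with goal followed by digits and = are reported as non-matching rather than throwing.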
Use a regex for extracting:
x = Regex.Match(input, @"\d+").Value;
Now convert the resulting string to a number by using:
finalNumber = Int32.Parse(x);
Please try this:
string sample = "goal0=1234.4334abc12423423goal1=-234234asdfsdf";
Regex test = new Regex(@"(?<=\=)\-?\d*(\.\d*)?", RegexOptions.Singleline);
MatchCollection matchlist = test.Matches(sample);
string[] result = new string[matchlist.Count];
if (matchlist.Count > 0)
{
for (int i = 0; i < matchlist.Count; i++)
result[i] = matchlist[i].Value;
}
Hope it helps.
I didn't get the question at first. Sorry, but it works now.
I think this simple expression should work:
Regex.Match(input, @"\d+")
You can use the old VB Val() function from C#. That will extract a number from the front of a string, and it's already available in the framework:
result0 = Microsoft.VisualBasic.Conversion.Val(goal0);
result1 = Microsoft.VisualBasic.Conversion.Val(goal1);
string s = "1234.4334abc12423423";
var result = System.Text.RegularExpressions.Regex.Match(s, @"-?\d+");
List<String> list = new List<String>();
list.Add("goal0=1234.4334abc12423423");
list.Add("goal1=-23423");
list.Add("asdfsdf");
Regex regex = new Regex(@"^goal\d+=(?<GoalNumber>-?\d+\.?\d+)");
foreach (string s in list)
{
if(regex.IsMatch(s))
{
string numberPart = regex.Match(s).Groups["GoalNumber"].Value;
// do something with numberPart
}
}
