How would I use regular expressions to remove a parenthesised substring? - c#

I have a regular expression (regex) question.
How would I use regular expressions in C# to remove the parenthesised part of a string like this:
"SOMETHING (#2)"
The part of the string I want to remove always appears within parentheses, and it is always # followed by some number. The rest of the string needs to be left alone.

Remove everything, including the parentheses:
var input = "SOMETHING (#2) ELSE";
var pattern = @"\(#\d+\)";
var replacement = "";
var replacedString = System.Text.RegularExpressions.Regex.Replace(input, pattern, replacement);
Remove only the contents within the parentheses:
var input = "SOMETHING (#2) ELSE";
var pattern = @"(.+?)\(#\d+\)(.+?)";
var replacement = "$1()$2";
var replacedString = System.Text.RegularExpressions.Regex.Replace(input, pattern, replacement);
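As a self-contained sketch of the first variant, the snippet below wraps the replacement in a helper. The name RemoveNumberTag is hypothetical, and the extra \s* (which also swallows the whitespace in front of the parenthesised part so no double space is left behind) is an assumption about what is wanted, not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: removes every "(#<digits>)" group together with
// any whitespace immediately before it.
static string RemoveNumberTag(string input)
{
    return Regex.Replace(input, @"\s*\(#\d+\)", "");
}

Console.WriteLine(RemoveNumberTag("SOMETHING (#2)"));      // SOMETHING
Console.WriteLine(RemoveNumberTag("SOMETHING (#2) ELSE")); // SOMETHING ELSE
```

Strings without a "(#n)" part are returned unchanged, because Regex.Replace leaves the input alone when nothing matches.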

Related

Trying to regex a string with backslashes and quotes

I am trying to regex a string in C#. I am expecting to pass a string with the following format:
<%=Application(\"DisplayName\")%>
and get back:
DisplayName
I am using the Regex class to accomplish this:
var text = "<%=Application(\"DisplayName\")%>";
Regex regex = new Regex(@"(<\%=Application[\>\(\)][\\][""](.*?)[\\][""][\>\(\)k]%\>)");
var v = regex.Match(text);
var s = v.Groups[1].ToString();
I am expecting s to contain the output string, but it is coming back as "". I tried building the regex string step by step, but I can't get the \ or " to process correctly. Any help would be greatly appreciated. Thanks!
var text = "<%=Application(\"DisplayName\")%>";
Regex regex = new Regex(@"(?:<%=Application[>()][""](.*?)[""][>()k]%>)");
var v = regex.Match(text);
var s = v.Groups[1].ToString();
Your pattern is very close. The backslashes are not actually part of the string; they only appear in the source code to escape the double quotes, so they need to be left out of the regex pattern. Notice that I removed the [\\] from before both of the double quotes [""].
Now, you expect DisplayName in Groups[1]. Since Regex puts the entire match in Groups[0], your outer capture group (the whole pattern in parentheses) became the first actual capture group, making DisplayName Groups[2]. As a matter of good practice, I changed the outer capture group to a non-capturing group by adding ?: after the opening parenthesis. That group is then no longer captured, and DisplayName becomes Groups[1]. Hope this helps.
Full test code:
var text = "<%=Application(\"DisplayName\")%>";
Regex regex = new Regex(@"(?:<\%=Application[\>\(\)][""](.*?)[""][\>\(\)k]%\>)");
var v = regex.Match(text);
var s = v.Groups[1].ToString();
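A possible variation, sketched here as an assumption rather than as part of the answer: a named group avoids counting group numbers altogether, and checking Success distinguishes "no match" from an empty capture. The group name "name" and the simplified pattern (literal parentheses instead of the character classes) are illustrative choices.

```csharp
using System;
using System.Text.RegularExpressions;

var text = "<%=Application(\"DisplayName\")%>";

// Verbatim string: "" is one literal double quote; \( and \) are literal parentheses.
var regex = new Regex(@"<%=Application\(""(?<name>.*?)""\)%>");
var match = regex.Match(text);

// Read the group only after checking Success; a failed match yields empty groups.
var name = match.Success ? match.Groups["name"].Value : "(no match)";
Console.WriteLine(name); // DisplayName
```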

Regex to skip all occurrences of ")" but the last one in a string

Is there a way to validate a string that contains parentheses so that the expression ignores all but the last one?
The regex expression is like this: (?<function>^(?!\/_).[A-Za-z_]*)\((?<args>[^\)]+\)), and the string has the following format:
web_convert_param("sEV_4_URL2",
"SourceString={sEV_4}",
"SourceEncoding=HTML",
"TargetEncoding=URL",
"veh_sym_sel=EXT%20CAB%20(8CYL%204x2)",
LAST);
If I run this in the Regex Tester, it stops at the next-to-last closing parenthesis. Is this possible in this context?
The C# code that runs this looks like this:
try
{
    var autoRemove = new ArrayList(ConfigurationManager.AppSettings["AutoRemoveFunctions"].Split(','));
    baseFileData = ScriptProperties.ScriptText;
    var matches = regEx_SBR.Matches(baseFileData);
    foreach (Match match in matches)
    {
        var functionName = match.Groups["function"].Value.Trim();
        if (autoRemove.Contains(functionName) || string.IsNullOrEmpty(functionName)) continue;
        // Strip quotes and line breaks before splitting the argument list.
        var args = match.Groups["args"].Value.Replace("\"", "").Replace("\n", "").Replace("\r", "");
        var arguments = args.Split(',');
        _scriptFunction = new BaseScriptFunction();
        ParseFunction(match.Groups["function"].Value.Trim(), arguments, match.Value.Trim());
        if (_scriptFunction.IsNamedTransaction)
        {
            _scriptFunction.TransactionName = string.Format("{0}{1}", transactionPrefix, _scriptFunction.TransactionName);
        }
        ScriptFunctions.Add(_scriptFunction);
    }
}
You could instead use [\s\S] and end the regex with \);. Something like this:
(?<function>^(?!\/_).[A-Za-z_]*)\((?<args>[\s\S]+?)\);
[\s\S] matches any character, including newlines. Alternatively, you can use . together with the (?s) flag.
Maybe you could try this for a better regex without the final semicolon:
(?<function>^(?!\/_).[A-Za-z_]*)\((?<args>(?:"[^"]+"|[^\)"]+)+)\)
This works if you don't have any " within the argument strings themselves (escaped or not). If you can have escaped ", then the regex will have to be longer...
(?<function>^(?!\/_).[A-Za-z_]*)\((?<args>(?:"(?:\\.|[^"\\]+)+"|[^\)"]+)+)\)
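To see the quote-aware pattern at work in C#, here is a minimal sketch that runs the second regex above against the sample call from the question. RegexOptions.Multiline is an assumption added here so that ^ can match at the start of any line of a longer script; the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// The sample call from the question, rebuilt line by line.
var script = string.Join("\n",
    "web_convert_param(\"sEV_4_URL2\",",
    "\"SourceString={sEV_4}\",",
    "\"SourceEncoding=HTML\",",
    "\"TargetEncoding=URL\",",
    "\"veh_sym_sel=EXT%20CAB%20(8CYL%204x2)\",",
    "LAST);");

// Quoted strings are consumed whole, so a ")" inside quotes cannot end the match.
var regex = new Regex(
    @"(?<function>^(?!\/_).[A-Za-z_]*)\((?<args>(?:""[^""]+""|[^\)""]+)+)\)",
    RegexOptions.Multiline);

var match = regex.Match(script);
var functionName = match.Groups["function"].Value;
var args = match.Groups["args"].Value;

Console.WriteLine(functionName);                  // web_convert_param
Console.WriteLine(args.Contains("(8CYL%204x2)")); // True
Console.WriteLine(args.EndsWith("LAST"));         // True
```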

Regular Expression For JSON

I have a string -
xyz":abc,"lmn
I want to extract abc. what will be the regular expression for this ?
I am trying this -
/xyz\":(.*?),\"lmn/
But it is not fetching any result.
In C# you could use:
var regex = new Regex(@"(?<=xyz\"":).*?(?=,\""lmn)");
var value = regex.Match(@"xyz"":abc,""lmn").Value;
Note that this makes use of the C# verbatim string prefix @, which means that \ is not treated as an escape character by the compiler. You need to write a doubled "" so that a single " is included in the string.
This regex makes use of prefix and suffix matching rules so that you can get the match without having to select the specific group from the result.
Alternatively, you can use group matching:
var regex = new Regex(@"xyz\"":(.*?),\""lmn");
var value = regex.Match(@"xyz"":abc,""lmn").Groups[1].Value;
You can check for the existence of a match by doing the following:
var match = regex.Match(@"xyz"":abc,""lmn");
var isMatch = match.Success;
and then follow up with either match.Value or match.Groups[1].Value depending on which regex you used.
EDIT
Actually, escaping the " is not needed in a C# regex, so you could use either of the following instead.
var regex = new Regex("(?<=xyz\":).*?(?=,\"lmn)");
var regex = new Regex("xyz\":(.*?),\"lmn");
These two do not use the verbatim string prefix, so the \" is translated into just " in the regex, giving a regex of (?<=xyz":).*?(?=,"lmn) or xyz":(.*?),"lmn.
Additionally, if this is an entire-string match rather than a substring match, you would want one of the following.
var regex = new Regex("(?<=^xyz\":).*?(?=,\"lmn$)");
var regex = new Regex("^xyz\":(.*?),\"lmn$");
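The variants above can be compared side by side. This is a small sketch with illustrative variable names; the "prefix " input is a made-up example to show the effect of the ^ and $ anchors.

```csharp
using System;
using System.Text.RegularExpressions;

var input = "xyz\":abc,\"lmn";

var lookaround = new Regex("(?<=xyz\":).*?(?=,\"lmn)");
var grouped = new Regex("xyz\":(.*?),\"lmn");
var anchored = new Regex("^xyz\":(.*?),\"lmn$");

// Lookarounds: the match itself is the wanted text.
var viaLookaround = lookaround.Match(input).Value;
// Capture group: the wanted text is in Groups[1].
var viaGroup = grouped.Match(input).Groups[1].Value;

Console.WriteLine(viaLookaround); // abc
Console.WriteLine(viaGroup);      // abc

// The anchored pattern only accepts the whole string, not a substring.
Console.WriteLine(anchored.IsMatch(input));             // True
Console.WriteLine(anchored.IsMatch("prefix " + input)); // False
Console.WriteLine(grouped.IsMatch("prefix " + input));  // True
```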

How do I replace all strings found with a regular expression with itself concatenated with another string?

For example, my regular expression found the string: some\file\path.xml and I want it to be changed to new_root\some\file\path.xml. Is there a way to do this using the regex replace method? If not, what is the preferred way to do this?
It appears that you can do what you are asking using Regex.Replace.
Check out Substitutions in Regular Expressions article on MSDN.
Example:
var path = @"C:\some\file\path.xml";
var result = Regex.Replace(path, @"(C:\\)(.*)", "$1new_root\\$2");
Result is C:\new_root\some\file\path.xml.
You don't need regex for that; just find the string you want with a built-in function and concatenate it with what you want.
For a more general search/replace you can do this:
string pattern = @"(?>\w+\\)+\w+\.xml";
string replacement = "new_root\\$0";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(input, replacement);
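When the prefix has to be computed rather than written as a fixed replacement string, Regex.Replace also accepts a MatchEvaluator delegate. The following is a sketch under assumptions: the sample input is made up, and the pattern is the one above with the dot before xml escaped.

```csharp
using System;
using System.Text.RegularExpressions;

// Made-up input containing two relative .xml paths.
var input = @"open some\file\path.xml and other\dir\data.xml";
var pattern = @"(?>\w+\\)+\w+\.xml";

// The lambda is called once per match; m.Value is the matched path itself.
var result = Regex.Replace(input, pattern, m => "new_root\\" + m.Value);

Console.WriteLine(result);
// open new_root\some\file\path.xml and new_root\other\dir\data.xml
```

For a constant prefix, the "$0" substitution shown above does the same job with less code; the delegate form is mainly useful when the replacement depends on the match.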

String Replace Problem

I have a string replace problem. I have a result like:
"Animal.Active = 1 And Animal.Gender = 2"
I want to replace something in this text.
Animal.Active part is returned from a database and sometimes it is returned with the Animal.Gender part.
When Animal.Gender part comes from the database I have to remove this And Animal.Gender part.
Also if the string has Animal.Active = 1, I have to remove Animal.Active = 1 And part. Note the And.
How can I do this?
Since you want to match a number, you will need to use a regular expression (regex) for this replacement.
string pattern = @"And Animal\.Gender\s*=\s*[0-9]+";
string replacement = "";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(input, replacement);
This replaces "And Animal.Gender = #" (where # is any number), allowing zero or more whitespace characters on either side of the = sign.
You can do a similar replacement for the second request, with Animal.Active.
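A minimal sketch of both replacements, assuming the input always has the shape shown in the question. The patterns also consume the whitespace around And so that no stray spaces are left over; that detail is an addition here, not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

var input = "Animal.Active = 1 And Animal.Gender = 2";

// Second request: drop "Animal.Active = 1 And ", keeping the Gender condition.
var withoutActive = Regex.Replace(input, @"Animal\.Active\s*=\s*1\s+And\s+", "");

// First request: drop " And Animal.Gender = <number>", keeping the Active condition.
var withoutGender = Regex.Replace(input, @"\s*And\s+Animal\.Gender\s*=\s*[0-9]+", "");

Console.WriteLine(withoutActive); // Animal.Gender = 2
Console.WriteLine(withoutGender); // Animal.Active = 1
```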
Granted this is a very specific solution that will undoubtedly become more complicated as you add more conditions, but here goes:
dbReturn =
dbReturn.Replace("And Animal.Gender","").Replace("Animal.Active = 1","");
