Regular Expressions: How to escape the "(" metacharacter in C#

I am scraping a year value from the innerhtml of a span and the value is in brackets like this:
<span class="year_type">(2009)</span><br>
I want to get the value of the year without the brackets but am getting some compiler errors when trying to escape the "(" char.
My pattern:
const string yearPattern = "<span class=\"year_type\">\((?<year>.*?)\)</span>";
Complete Code:
const string yearPattern = "<span class=\"year_type\">\((?<year>.*?)\)</span>";
var regex = new Regex(yearPattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);
Match match = regex.Match(data);
return match.Groups["year"].Value;
What is the best way to escape the ( and ) characters?
Thanks

Use two backslashes:
const string yearPattern = "<span class=\"year_type\">\\((?<year>.*?)\\)</span>";
or the @ verbatim string literal prefix:
const string yearPattern = @"<span class=""year_type"">\((?<year>.*?)\)</span>";
Note: in your original regex you were missing an open paren.

Prepare to get rocked for parsing HTML with a Regex...
That being said, you just need the @ in front of your pattern definition to make it a verbatim string (or double your backslashes: \\).
const string yearPattern = @"<span class=""year_type"">\((?<year>.*?)\)</span>";

I would consider using a character class for this, e.g. [(] and [)], but using a double-backslash, e.g. \\( and \\) (one \ is for C# and the other one for the regex) is equivalently heavy syntax. So it's a matter of taste.
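To compare the three spellings suggested in the answers above, here is a minimal sketch. The class name EscapeDemo and the helper ExtractYear are made up for illustration; the sample HTML is the one from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Sample input taken from the question.
    public const string Html = "<span class=\"year_type\">(2009)</span><br>";

    // 1. Regular string literal: each regex backslash is doubled for C#.
    public const string Doubled = "<span class=\"year_type\">\\((?<year>.*?)\\)</span>";

    // 2. Verbatim string literal: backslashes are literal, quotes are doubled.
    public const string Verbatim = @"<span class=""year_type"">\((?<year>.*?)\)</span>";

    // 3. Character classes: no backslashes needed at all.
    public const string CharClass = "<span class=\"year_type\">[(](?<year>.*?)[)]</span>";

    // Hypothetical helper: returns the named group "year", or "" if there is no match.
    public static string ExtractYear(string pattern, string html)
    {
        Match match = Regex.Match(html, pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);
        return match.Success ? match.Groups["year"].Value : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractYear(Doubled, Html));   // prints 2009
        Console.WriteLine(ExtractYear(Verbatim, Html));  // prints 2009
        Console.WriteLine(ExtractYear(CharClass, Html)); // prints 2009
    }
}
```

All three constants describe the same regex, so the choice only affects how the C# source reads.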

Related

replace variable name in formula with Regex.Replace

In C#, I want to use a regular expression to replace each variable #A with a number without replacing other similar variables like #AB
string input = "3*#A+3*#AB/#A";
string value = "5";
string pattern = "#A"; //<- this doesn't work
string result = Regex.Replace(input, pattern, value);
// expected result = "3*5+3*#AB/5"
Any good ideas?
Use a word boundary \b:
string pattern = @"#A\b";
See regex demo (Context tab)
Note the @ before the string literal: I am using a verbatim string literal to declare the regex pattern so that I do not have to escape the \. Otherwise, it would look like string pattern = "#A\\b";.
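As a quick check of the word-boundary approach, here is a small sketch using the input from the question. The class name WordBoundaryDemo and the method Substitute are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Replaces the variable #A with the given value, leaving longer names such as #AB alone.
    public static string Substitute(string input, string value)
    {
        // \b requires a word/non-word transition right after the A,
        // so "#AB" does not match (A is followed by another word character).
        return Regex.Replace(input, @"#A\b", value);
    }

    public static void Main()
    {
        Console.WriteLine(Substitute("3*#A+3*#AB/#A", "5")); // prints 3*5+3*#AB/5
    }
}
```

One caveat: the value is used as a replacement pattern, so a literal $ in it would have to be written as $$.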

How to replace two first characters before underscore with regex?

I have example this string:
HU_husnummer
HU_Adrs
How can I replace HU with MI?
So it will be MI_husnummer and MI_Adrs.
I am not very good at regex but I would like to solve it with regex.
EDIT:
The sample code I have now and that still does not work is:
string test = Regex.Replace("[HU_husnummer] int NOT NULL","^HU","MI");
Judging by your comments, you actually need
string test = Regex.Replace("[HU_husnummer] int NOT NULL",@"^\[HU","[MI");
Have a look at the demo
In case your input string really starts with HU, remove the \[ from the regex pattern.
The regex is @"^\[HU" (note the verbatim string literal notation used for the regex pattern):
^ - matches the start of string
\[ - matches a literal [ (since it is a special regex metacharacter denoting a beginning of a character class)
HU - matches HU literally.
String varString="HU_husnummer ";
varString=varString.Replace("HU_","MI_");
Links
https://msdn.microsoft.com/en-us/library/system.string.replace(v=vs.110).aspx
http://www.dotnetperls.com/replace
Using Substring:
var abc = "HU_husnummer";
var result = "MI" + abc.Substring(2);
Replace in Regex.
string result = Regex.Replace(abc, "^HU", "MI");
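The anchored patterns from the answers above can be folded into one, if both the bare and the bracketed form need to be handled. This is only a sketch: the class PrefixDemo, the method RenamePrefix and the combined pattern are assumptions made for illustration, not something from the answers.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PrefixDemo
{
    // Replaces a leading "HU_" (optionally preceded by "[") with "MI_".
    // ^      anchors the match at the start of the string
    // (\[?)  captures an optional literal "[" so it can be put back
    public static string RenamePrefix(string input)
    {
        return Regex.Replace(input, @"^(\[?)HU_", "${1}MI_");
    }

    public static void Main()
    {
        Console.WriteLine(RenamePrefix("HU_husnummer"));                // prints MI_husnummer
        Console.WriteLine(RenamePrefix("[HU_husnummer] int NOT NULL")); // prints [MI_husnummer] int NOT NULL
        Console.WriteLine(RenamePrefix("CHU_other"));                   // prints CHU_other
    }
}
```

Unlike String.Replace("HU_", "MI_"), the anchored regex only touches a prefix at the start of the string, which is why CHU_other is left alone.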

Regex to replace double nested quotes in C#

I am trying to remove double nested quotes from a string in C# using Regex, but have not been able to achieve it so far. Below is the sample text and the code I tried:
string html = "<img src=\"imagename=\"g1\"\" alt = \"\">";
string output = string.Empty;
Regex reg = new Regex(@"([^\^,\r\n])""""+(?=[^$,\r\n])", RegexOptions.Multiline);
output = reg.Replace(html, @"$1");
the above gives below output -
"<img src="imagename="g1 alt = >"
actual output i am looking for is -
"<img src="imagename=g1" alt = "">"
Please suggest how to correct the above code.
Pattern : \s*"\s*([^ "]+)"\s*(?=[">])|(?<=")("")(?=")
Replacement : $1
Here is a demo, tested at regexstorm.
String literal for use in programs:
@"\s*""\s*([^ ""]+)""\s*(?=["">])|(?<="")("""")(?="")"
To keep it simpler and more precise, focus directly on the src attribute value:
Pattern : (\bsrc="[^ =]+=)"([^ "]+")"
Replacement : $1$2
Here is an online demo, tested at regexstorm.
String literal for use in programs:
@"(\bsrc=""[^ =]+=)""([^ ""]+"")"""
Note: I assume attribute values don't contain any spaces.
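For reference, the src-focused pattern can be tried against the input from the question like this. The class name NestedQuoteDemo and the method FixSrc are hypothetical, and the same assumption applies: attribute values contain no spaces.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NestedQuoteDemo
{
    // Group 1: src="name=     (up to and including the inner "=")
    // Group 2: value"         (the inner value plus one closing quote)
    // The two quotes outside the groups are dropped by the replacement.
    private const string Pattern = @"(\bsrc=""[^ =]+=)""([^ ""]+"")""";

    public static string FixSrc(string html)
    {
        return Regex.Replace(html, Pattern, "$1$2");
    }

    public static void Main()
    {
        string html = "<img src=\"imagename=\"g1\"\" alt = \"\">";
        Console.WriteLine(FixSrc(html)); // prints <img src="imagename=g1" alt = "">
    }
}
```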

Need Regex to match [#URL^Url Description^#]

I need regex to find this text
[#URL^Url Description^#]
in a string and replace it with
Url Description
"Url Description" can be set of characters in any language.
Any Regex Experts out there to help me?
Thanks.
It might be a bit confusing, but you can use the following:
string str = @"[#URL^Url Description^#]";
var regex = new Regex(@"^[^^]+\^([^^]+)\^[^^]+$");
var result = regex.Replace(str, @"$1");
The first ^ means the beginning of the string;
The [^^]+ means anything not a caret character;
The \^ is a literal caret;
The $ is the end of the string.
Basically, it captures all the characters between the carets (^) and puts them between the <a> tags.
See ideone demo.
You can also replace the last line with this:
var result = regex.Replace(str, "<a href=\"" + link + "\">$1</a>");
Where link is the variable containing the link you want to replace in.
Why don't you use String.Replace()? A regex would work, but it looks like the format is well defined and regexes are harder to read.
string url = "[#URL^blah^#]";
string url_html = url.Replace("[#URL^", "<a href=\"http://www.somewhere.net\">")
.Replace("^#]", "</a>");

regex and string

Consider the following:
string keywords = "(load|save|close)";
Regex x = new Regex(@"\b"+keywords+"\b");
I get no matches. However, if I do this:
Regex x = new Regex(@"\b(load|save|close)\b");
I get matches. How come the former doesn't work, and how can I fix this? Basically, I want the keywords to be configurable so I placed them in a string.
The last \b in the first code snippet needs a verbatim string specifier (@) in front of it as well, because it is a separate string literal. Without the @, "\b" is the backspace character rather than a regex word boundary.
string keywords = "(load|save|close)";
Regex x = new Regex(@"\b"+keywords+@"\b");
You're missing another verbatim string specifier (@ prefixed to the last \b):
Regex x = new Regex(@"\b" + keywords + @"\b");
Regex x = new Regex(@"\b"+keywords+@"\b");
You forgot the additional @ before the second "\b".
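To see why the missing prefix matters and how a configurable keyword list might be assembled, here is a rough sketch. KeywordDemo and BuildKeywordRegex are made-up names, and running each keyword through Regex.Escape is an extra precaution that the answers do not mention.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeywordDemo
{
    // Builds \b(load|save|close)\b from a configurable keyword list.
    public static Regex BuildKeywordRegex(params string[] keywords)
    {
        // Escape each keyword so regex metacharacters in it are matched literally.
        string alternation = string.Join("|", keywords.Select(k => Regex.Escape(k)));
        // Both ends use verbatim literals, so \b reaches the regex engine unchanged.
        return new Regex(@"\b(" + alternation + @")\b");
    }

    public static void Main()
    {
        // In a regular literal, \b is a single backspace character (code 8).
        Console.WriteLine((int)"\b"[0]);  // prints 8
        // In a verbatim literal, it is two characters: backslash and b.
        Console.WriteLine(@"\b".Length);  // prints 2

        Regex regex = BuildKeywordRegex("load", "save", "close");
        Console.WriteLine(regex.IsMatch("please save the file")); // prints True
        Console.WriteLine(regex.IsMatch("unsaved changes"));      // prints False
    }
}
```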
