How can I use unnamed regex groups in C# inside my regex?

Hey, so my current regex is @"(into)(to)add\s[^\s]{1,}\1|\2[^\s]{1,}". I want the input to be something like "add word into/to category". The regex works fine in general, except for the \1|\2 part. I tried using groups and all sorts of solutions, but I just can't figure out how to make it so that the input can be either into or to.
Can anyone help me out? (this is in C# and using the Regex class)

If I have understood you correctly, you don't need backreferences to (unnamed) groups; you can use a simple alternation, like this:
@"add \w+ (into|to) \w+"
That will match either into or to in the search string.
Edit:
Let's get a little more 'advanced', using the optional quantifier '?':
@"add \w+ (in)?to \w+"
This will match 'in' zero or one time, followed by 'to', so it will match into as well as to, exactly as the first regex does.
Edit2:
I have a feeling you want to use a variable inside your regex. You can of course do that, like this:
string search = "into|to";
Regex regex = new Regex(@"add \w+ (" + search + @") \w+");

From your given example, I think you're looking for a regex like add\s\w+\s(into|to)\s\w+. Your current regex only matches strings containing "intoto" directly before "add", which is probably not what you want.
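To make the alternation approach from the answers above concrete, here is a minimal sketch. It assumes C# 9 or later (top-level statements). The ^ and $ anchors, the \s+ separators, the extra capture groups and the Describe helper are my additions for illustration and are not part of the original answers.

```csharp
using System;
using System.Text.RegularExpressions;

// Group 2 captures whichever of "into" / "to" was typed.
// Anchors and \s+ are assumptions added for this sketch.
var addCommand = new Regex(@"^add\s+(\w+)\s+(into|to)\s+(\w+)$");

// Hypothetical helper: summarizes what was matched, or reports a failure.
string Describe(string input)
{
    Match m = addCommand.Match(input);
    return m.Success
        ? $"{m.Groups[1].Value} -> {m.Groups[3].Value} (via '{m.Groups[2].Value}')"
        : "no match";
}

Console.WriteLine(Describe("add apple into fruit")); // apple -> fruit (via 'into')
Console.WriteLine(Describe("add apple to fruit"));   // apple -> fruit (via 'to')
Console.WriteLine(Describe("add apple onto fruit")); // no match
```

No backreference is needed here: \1 and \2 repeat text that a group has already matched, while the alternation simply accepts either word at that position.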

Related

Regex and taking care of possible whitespace

I need help with my regular expression. I am trying to match the keyword this, but only when it is in parentheses: (this). So far I have:
(\()\bthis\b(\))
But it looks like it also matches the parentheses wrapping the word, while I only need to grab the word itself. Another issue is that it won't work if there is whitespace inside the parentheses: ( this )
What about capturing the word in a group, with an expression like this:
\((this)\)
Also if you want to match when there are white spaces (spaces, tabs, etc.):
\(\s*(this)\s*\)
Try it out here : Regex101. All the details about each character I'm using in the regex are detailed on that site.
You can retrieve the this value matched in the group by code. Please, check out the documentation related to the language you're using for that.
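Since the answer leaves the group retrieval to the language documentation, here is a minimal sketch of how it might look in C# (an assumption on my part, as the question does not name a language). It assumes C# 9 or later; the ExtractKeyword helper is a hypothetical name.

```csharp
using System;
using System.Text.RegularExpressions;

// The parentheses and optional padding are matched but not captured;
// group 1 holds only the keyword itself.
var keywordInParens = new Regex(@"\(\s*(this)\s*\)");

// Hypothetical helper: returns the captured keyword, or an empty string if nothing matched.
string ExtractKeyword(string input)
{
    Match m = keywordInParens.Match(input);
    return m.Success ? m.Groups[1].Value : string.Empty;
}

Console.WriteLine(ExtractKeyword("call ( this ) now")); // this
```

Groups[0] is always the whole match, including the parentheses, so the word on its own is in Groups[1].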

Regex matching wrong pattern

I'm trying to pull text out of a Word document using the regex lookahead and lookbehind found in this answer:
Regular Expression to find a string included between two characters while EXCLUDING the delimiters
The delimiters I have to work with are
Start: RQ
End: END-RQ
I have added the following (powershell) code:
$regex = [regex] '(?<=RQ)(.*?)(?=END-RQ)'
$matches = $regex.Matches($concat)
The problem is the matching is grabbing the RQ from END-RQ as the beginning of the next pattern. Can anyone tell me how to eliminate that (e.g. force the regex to match exactly RQ and END-RQ)? Wrapping the matching patterns in quotes does not seem to work, even when the quotes are escaped.
Try this:
$regex = [regex] '(?<=(?<!END-)RQ)(.*?)(?=END-RQ)'
You should download this application:
http://www.sellsbrothers.com/posts/Details/12425
It is priceless when trying to debug a regex.
This might work (hard to say without knowing exactly what your data is):
$regex = [regex]'(?<=(?:^|[^-])RQ)(.*?)(?=END-RQ)'
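PowerShell's [regex] is the same .NET Regex class, so the negative-lookbehind fix can be checked from C# as well. The following is a minimal sketch, assuming C# 9 or later; the sample text, the Trim call, the Singleline option (so a section may span line breaks) and the ExtractSections name are my assumptions, not part of the answers.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// (?<!END-) rejects an "RQ" that is really the tail of "END-RQ".
var sections = new Regex(@"(?<=(?<!END-)RQ)(.*?)(?=END-RQ)", RegexOptions.Singleline);

// Hypothetical helper: returns the trimmed text between each RQ / END-RQ pair.
string[] ExtractSections(string text) =>
    sections.Matches(text).Cast<Match>().Select(m => m.Value.Trim()).ToArray();

string[] found = ExtractSections("RQ first END-RQ RQ second END-RQ");
Console.WriteLine(string.Join(" | ", found)); // first | second
```

With the original pattern, the second match would start right after the RQ inside the first END-RQ and come back as "RQ second".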

Regular Expression Pattern Matching

Hi I need to do like this.
Actually **ctu** is a good university but **ctu's** is not. There are many **,ctus,** present.
What I want to do is, I want to replace ctu in the string like this.
Actually **<s>ctu<e>** is a good university but **<s>ctu's<e>** is not. There are many **,<s>ctus<e>,** present.
But with the following pattern
**\\bctu*(?:['\\\\|""\\\\]*)\\w+\\b**
I'm getting the output as:
A**<s>ctu<e>**ally **<s>ctu<e>** is a good university but **<s>ctu's<e>** is not. There are many **,ctus,** present.
I don't want to replace ctu inside words such as Actually, and I also need to replace " ,ctus, " with " ,<s>ctus<e>, ".
How do I achieve this using regex? I need this in C#.
Thanks in advance.
The following regex matches all the cases listed in your example:
@"(\bctu(?:'\w+)?\w*\b)"
Then just replace the match with @"<s>$1<e>", where $1 refers to the group captured above (.NET replacement strings use $1 rather than \1).
Are you looking for @"\bctu\b" ("ctu" with word boundaries on both sides, so it matches ctu but not Actually or ,ctus,; note that it also matches the ctu in ctu's, because the apostrophe is not a word character) for the first search pattern and ",ctus," (exactly the string ,ctus,, regardless of where it might fall in a word) as the second search pattern? To search for both of these at once, you could use @"(\bctu\b|,ctus,)".
As a slight aside, in C# you can write regex literals much more easily by using the @"" notation (verbatim strings) instead of "". E.g. for the regex engine to see a word boundary, it must receive \b, which can be written as @"\b" or "\\b", and a literal \ is "\\\\" or @"\\". The verbatim form is easier to read, especially in more complex cases.
If this doesn't answer your question, please give a clear example of expected input/output.
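As a quick check of the first answer's pattern against the sample sentence from the question, here is a minimal sketch, assuming C# 9 or later. The TagCtu helper is a hypothetical name, and the bold markers from the question are left out of the input.

```csharp
using System;
using System.Text.RegularExpressions;

// \b before "ctu" keeps the match out of words like "Actually";
// (?:'\w+)? picks up a suffix such as 's, and \w* picks up a plain suffix such as s.
var ctuPattern = new Regex(@"(\bctu(?:'\w+)?\w*\b)");

// Hypothetical helper: wraps every match in <s>...<e>. $1 is the captured group.
string TagCtu(string text) => ctuPattern.Replace(text, "<s>$1<e>");

Console.WriteLine(TagCtu(
    "Actually ctu is a good university but ctu's is not. There are many ,ctus, present."));
// Actually <s>ctu<e> is a good university but <s>ctu's<e> is not. There are many ,<s>ctus<e>, present.
```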

Regex : replace a string

I'm currently facing a (little) blocking issue. I'd like to replace a substring by one another using regular expression. But here is the trick : I suck at regex.
Regex.Replace(contenu, "Request.ServerVariables("*"))",
"ServerVariables('test')");
Basically I'd like to replace whatever is between the " by "test". I tried ".{*}" as a pattern but it doesn't work.
Could you give me some tips, I'd appreciate it!
There are several issues you need to take care of.
1. You are using special characters in your regex (., parens, quotes) -- you need to escape these with a backslash. And you need to escape the backslashes with another backslash as well because we're in a C# string literal, unless you prefix the string with @, in which case the escaping rules are different.
2. The expression to match "any number of whatever characters" is .*. In this case, you would want to match any number of non-quote characters, which is [^"]*.
3. In contrast to (1) above, the replacement string is not a regular expression, so you don't want any backslashes there.
4. You need to store the return value of the replace somewhere.
The end result is
var result = Regex.Replace(contenu,
    @"Request\.ServerVariables\(""[^""]*""\)",
    "Request.ServerVariables('test')");
Based purely on my knowledge of regex (and not how they are done in C#), the pattern you want is probably:
"[^"]*"
ie - match a " then match everything that's not a " then match another "
You may need to escape the double-quotes to make your regex-parser actually match on them... that's what I don't know about C#
Try to avoid where you can the '.*' in regex, you can usually find what you want to get by avoiding other characters, for example [^"]+ not quoted, or ([^)]+) not in parenthesis. So you may just want "([^"]+)" which should give you the whole thing in [0], then in [1] you'll find 'test'.
You could also just replace '"' with '' I think.
Taryn East's regex includes the *. You should remove it if it is just a placeholder for any value:
"[^"]"
BTW: You can test this regex with this cool editor: http://rubular.com/r/1MMtJNF3kM

regex replace - but with a few exceptions

I have a string containing HTML and I need to replace some words to be links - I do this with the following code;
string lNewHTML = Regex.Replace(lOldHTML, @"(\bword1\b|\bword2|word3\b)", "$1", RegexOptions.IgnoreCase);
The code works, but I need to include some exceptions to the replace - e.g. I don't want to replace anything in an img-, li- or a-tag (including link text and attributes like href and title), but still allow replacements in p-, td- and div-tags.
Can anyone figure this one out?
OK, after some time spent trying to construct a fitting regex, here is my attempt. This might need additional work, but it should point you in the right direction.
I am matching the words "word1" and "word2" when they are not inside a "tag1" or "tag2" tag. You need to adjust this to your needs, of course. Enable RegexOptions.IgnorePatternWhitespace if you'd like to keep my formatting.
Unfortunately, I have not come up with a regex you can simply plug into Regex.Replace, since this regex matches the whole string from the end of the previous match onward, but the word you are concerned with is in the first group. This group contains the index and length of the word, so you can easily replace it using String.Substring...
(?:
  \G
  (?:
    (?>
      <tag1(?<N>)
      |<tag2(?<N>)
      |</tag1(?<-N>)
      |</tag2(?<-N>)
      |.)*?
    (?(N)(?!))
  )*
)
(word1|word2)
You need to use the Replace overload with the MatchEvaluator parameter so that you examine each match and decide whether to replace or not.
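The MatchEvaluator suggestion could look roughly like the sketch below. This is my own illustration, not the answerer's code. It assumes C# 9 or later, a and li elements that are not nested inside elements of the same kind, and full word boundaries on all three words. The /wiki/ link target and the Linkify name are hypothetical. Regex-based HTML handling is fragile, so treat this as a starting point only.

```csharp
using System;
using System.Text.RegularExpressions;

// Alternatives are tried left to right at each position:
//   1. a whole <a>...</a> or <li>...</li> element (consumed and left untouched),
//   2. any single tag such as <img ...> or <p> (so attribute values are never linked),
//   3. one of the words, captured in the named group "word".
var linker = new Regex(
    @"<(a|li)\b[^>]*>.*?</\1\s*>|<[^>]+>|(?<word>\bword1\b|\bword2\b|\bword3\b)",
    RegexOptions.IgnoreCase | RegexOptions.Singleline);

// Hypothetical helper: the evaluator only rewrites matches where the "word" group took part.
string Linkify(string html) =>
    linker.Replace(html, m =>
        m.Groups["word"].Success
            ? $"<a href=\"/wiki/{m.Value}\">{m.Value}</a>"
            : m.Value);

Console.WriteLine(Linkify("<p>word1 and <a href=\"x\">word2</a></p>"));
// <p><a href="/wiki/word1">word1</a> and <a href="x">word2</a></p>
```

The evaluator returns excluded elements and tags unchanged, so only words in ordinary text (for example inside p, td or div) get wrapped in a link.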
