Replacing text between single quotes - c#

I'm trying to replace some text in an Excel formula using FlexCel, C# and Regex, but single quotes appear to be tricky from the googling I've done.
The string is like this;
"=RIGHT(CELL(\"filename\",'C.Whittaker'!$A$1),LEN(CELL(\"filename\",'C.Whittaker'!$A$1))-FIND(\"]\",CELL(\"filename\",'C.Whittaker'!$A$1)))"
I want to replace C.Whittaker with another name. It's always going to be First Initial . Last Name and it's always going to be inside single quotes.
I've got this regex matching; (?:')[^\"]+(?:')
I thought the (?:') means that regex matches it, but then ignores it for a replace, yet this doesn't seem to be the case. Suspect the issue is to do with how strings are handled, but it's a bit beyond me.

(?:...) is a non-capturing group; it doesn't capture what it matches (that is, it doesn't make that part of the string available later by means of a backreference), but it does consume it. What you're thinking of is lookarounds, which don't consume what they match.
(?<=')[^']+(?=')
But do you really need them? You might find it simpler to consume the quotes and then add them to the replacement string. In other words, replace '[^']+' with 'X.Newname'.

I believe you need to escape your character you want Regex to take it literally...
ie: (?:')
becomes (?:\')

Do you really need to use regex? Regex is good for complex cases where speed is important, but it leads to difficult to maintain code.
Try this instead:
var formula2 =
String.Join("'", formula
.Split(new [] { '\'' })
.Select((x, n) => n % 2 == 1 ? "new name here" : x));

Related

RegEx to find non-existence of white space prefix but not include the character in the match?

So i have the following RegEx for the purpose of finding and adding whitespace:
(\S)(\()
So for a string like "SomeText(Somemoretext)" I want to update this to "SomeText (Somemoretext)" it matches "t(" and so my replace eliminates the "t" from the string which is not good. I also do not know what the character could be, I'm merely trying to find the non-existence of whitespace.
Is there a better expression to use or is there a way to exclude the found character from the match returned so that I can safely replace without catching characters i do not want to replace?
Thanks
I find lookarounds hard to read and would prefer using substitutions in the replacement string instead:
var s = Regex.Replace("test1() test2()", #"(\S)\(", "$1 (");
Debug.Assert(s == "test1 () test2 ()");
$1 inserts the first capture group from the regex into the replacement string which is the non-space character before the opening parenthesis (.
If you need to detect the absence of space before a specific character (such as bracket) after a word, how about the following?
\b(?=[^\s])\(
This will detect words ( [a-zA-z0-9_] that are followed by a bracket, without a space).
(if I got your problem correctly) you can replace the full match with ( and get exactly what you need.
In case you need to look for absence spaces before a symbol (like a bracket) in any kind of text (as in the text may be non-word, such as punctuation) you might want to use the following instead.
^(?:\S*)(\()(?:\S*)$
When using this, your result will be in group 1, instead of just full match (which now contains the whole line, if a line is matched).

Prevent Regex from devouring optional part of the match

I'v searched extensively but I can't find a simple answer to this and my Regex experience is limited. I'd appreciate a simple solution that is explained, please.
I have a very large string and I need to substitute certain words in it as follows:
Example: wherever you find the string "LINK-ABC" make it "LINK_ABC".
I wrote my Regex Match and Replace strings:
#"LINK-ABC", #"LINK_ABC" and it worked.
But there were a couple of things I had not recognized.
There COULD be words in the file like this:
LINK-ABC-DEF LINK-ABC-GHI-JKL ... and so on.
So I get "LINK_ABC-DEF" etc. (which is NOT what I want; this should have remained intact...)
Once I realized the problem it seemed that what I REALLY wanted was to recognize ONLY the word being matched and leave any cases where it was in combination with something else, unchanged. It seemed to me that if I checked for a space or period on the Match word, that should do it, so...
#"LINK-ABC[ |\\.]",#"LINK_ABC"
... and now I have stumbled.
Sample string:
link-xxx link-aaa-sss link-xxx-bbb link-xxx link-xxx.
Match/Replace string:
link-xxx[ |\\.],link_xxx
Result string:
link_xxxlink-aaa-sss link-xxx-bbb link_xxxlink_xxx
The replacements are correct, BUT the trailing comma or period has been "devoured" and so the result string is wrong.
Is there a way that I can match so that if it matches on space, the replacement will have a space and if it matches on a period, the replacement will have a period? I s'pose I could do 2 separate matches but I'd like to increase my understanding of Regex and do it more elegantly if it is possible.
You should be able to achieve the behavior you want with "capture groups"
var matchstring = #"link-xxx([ \.]|$)";
var fixstr = #"link_xxx$1";
The parenthesis around the last part of the matchstring will retain whatever matched inside it, and the $1 in the fixstr will substitute whatever was captured by that group.
I've also modified your punctuation section a little bit, presuming you want to replace a match if it happens to be the last word in the input (by adding the |$). A | inside a character class [] is a literal | character, so I removed it assuming you don't actually expect that in your input.

What is the correct RegEx to extract my substring

I have an input string like this
$(xx.xx.xx)abcde$(yyy.yyy.yyy)fghijk$(zzz.zz.zz.zzz)
I want to be able to pull out each subset of strings matching $(anything inside here), so for the example above I would like to get 3 substrings.
the characters in between the brackets do not necessarily always match the same pattern.
I have tried using the following regex
(\$\([a-z]+.*\))
but this matches whole string, due to the fact it starts with '$', anything in middle, and ends with ')'
Hopefully this makes sense.
I should also note that I have very limited experience using regex.
Thanks
(\$\([a-z]+.*?\))
Use ? to make your search non greedy.* is greedy and consumes the max it can.adding ? to * makes it non greedy and it will stop at the first instance of ).
See demo.
http://regex101.com/r/sU3fA2/28
try the below
\((.*?)\)\g
for the given string $(xx.xx.xx)abcde$(yyy.yyy.yyy)fghijk$(zzz.zz.zz.zzz) it returns the three substring..
MATCH 1
1. [2-10] `xx.xx.xx`
MATCH 2
1. [18-29] `yyy.yyy.yyy`
MATCH 3
1. [38-51] `zzz.zz.zz.zzz`
http://regex101.com/r/bX7qR2/1

Regex : replace a string

I'm currently facing a (little) blocking issue. I'd like to replace a substring by one another using regular expression. But here is the trick : I suck at regex.
Regex.Replace(contenu, "Request.ServerVariables("*"))",
"ServerVariables('test')");
Basically I'd like to replace whatever is between the " by "test". I tried ".{*}" as a pattern but it doesn't work.
Could you give me some tips, I'd appreciate it!
There are several issues you need to take care of.
You are using special characters in your regex (., parens, quotes) -- you need to escape these with a slash. And you need to escape the slashes with another slash as well because we 're in a C# string literal, unless you prefix the string with # in which case the escaping rules are different.
The expression to match "any number of whatever characters" is .*. In this case, you would want to match any number of non-quote characters, which is [^"]*.
In contrast to (1) above, the replacement string is not a regular expression so you don't want any slashes there.
You need to store the return value of the replace somewhere.
The end result is
var result = Regex.Replace(contenu,
#"Request\.ServerVariables\(""[^""]*""\)",
"Request.ServerVariables('test')");
Based purely on my knowledge of regex (and not how they are done in C#), the pattern you want is probably:
"[^"]*"
ie - match a " then match everything that's not a " then match another "
You may need to escape the double-quotes to make your regex-parser actually match on them... that's what I don't know about C#
Try to avoid where you can the '.*' in regex, you can usually find what you want to get by avoiding other characters, for example [^"]+ not quoted, or ([^)]+) not in parenthesis. So you may just want "([^"]+)" which should give you the whole thing in [0], then in [1] you'll find 'test'.
You could also just replace '"' with '' I think.
Taryn Easts regex includes the *. You should remove it, if it is just a placeholder for any value:
"[^"]"
BTW: You can test this regex with this cool editor: http://rubular.com/r/1MMtJNF3kM

How to Check if a String is a "string" or a RegEx?

How can I check if a String in an textbox is a plain String ore a RegEx?
I'm searching through a text file line by line.
Either by .Contains(Textbox.Text); or by Regex(Textbox.Text) Match(currentLine)
(I know, syntax isn't working like this, it's just for presentation)
Now my Program is supposed to autodetect if Textbox.Text is in form of a RegEx or if it is a normal String.
Any suggestions? Write my own little RexEx to detect if Textbox contains a RegEx?
Edit:
I failed to add thad my Strings
can be very simple like Foo ore 0005
I'm trying the suggested solutions
right away!
You can't detect regular expressions with a regular expression, as regular expressions themselves are not a regular language.
However, the easiest you probably could do is trying to compile a regex from your textbox contents and when it succeeds you know that it's a regex. If it fails, you know it's not.
But this would classify ordinary strings like "foo" as a regular expression too. Depending on what you need to do, this may or may not be a problem. If it's a search string, then the results are identical for this case. In the case of "foo.bar" they would differ, though since it's a valid regex but matches different things than the string itself.
My advice, also stated in another comment, would be that you simply always enable regex search since there is exactly no difference if you split code paths here. Aside from a dubious performance benefit (which is unlikely to make any difference if there is much of a benefit at all).
Many strings could be a regex, every regex could actually be a string.
Consider the string "thin." could either be a string ('.' is a dot) or a regex ('.' is any character).
I would just add a checkbox where the user indicates if he enters a regex, as usual in many applications.
One possible solution depending on your definition of string and regex would be to check if the string contains any regex typical characters.
You could do something like this:
string s = "I'm not a Regex";
if (s == Regex.Escape(s))
{
// no regex indeed
}
Try and use it in a regex and see if an exception is thrown.
This approach only checks if it is a valid regex, not whether it was intended to be one.
Another approach could be to check if it is surrounded by slashes (ie. ‘/foo/‘) Surrounding regexes with slashes is common practice (although you must remove the slashes before feeding it into the regex library)

Categories

Resources