Regex.Replace with groups duplicated output? [duplicate]


This question already has answers here:
String.replaceAll(regex) makes the same replacement twice
(2 answers)
Closed 4 years ago.
I have a weird problem with Regex.Replace.
I think my immediate window says it all:
pattern
"([^_]*)(.*)"
fileNameToReplicate
"{Productnr}_LEI1.JPG"
Regex.Replace(fileNameToReplicate, pattern, $"$1")
"{Productnr}"
Regex.Replace(fileNameToReplicate, pattern, $"$2")
"_LEI1.JPG"
Regex.Replace(fileNameToReplicate, pattern, $"sometext$2")
"sometext_LEI1.JPGsometext"
Thus, my pattern looks for the first underscore and captures everything until that underscore in group1.
Then it captures the rest of the text (starting with that underscore until the end of the string) and captures that as group 2.
The regex captures correctly, look here to review it.
Why is the prefixed text output twice, once before the group and once after it? I expected this output:
"sometext_LEI1.JPG"

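It does not matter how many starred groups occur in sequence:
(.*)(.*)(.*)(...
After the first match has consumed the whole input, the engine tries again at the end-of-string position, and every one of those groups can match the empty string there. That second, empty match is replaced too, which is where the extra "sometext" comes from. To get your expected result, change your pattern to:
^([^_]*)(.*)
The added caret anchors the match to the start of the input, so the engine cannot start a second match at the end of the string.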
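The effect is easy to reproduce. What follows is a minimal sketch, assuming the pattern and input from the question. The class and method names (ReplaceDemo, Unanchored, Anchored) are made up for illustration. It lists the matches the engine reports and then compares the two replacements:

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceDemo
{
    // Pattern from the question: nothing prevents a second, empty match at the end.
    public static string Unanchored(string input) =>
        Regex.Replace(input, "([^_]*)(.*)", "sometext$2");

    // Pattern from the answer: the caret only matches at the start of the input.
    public static string Anchored(string input) =>
        Regex.Replace(input, "^([^_]*)(.*)", "sometext$2");

    public static void Main()
    {
        const string fileName = "{Productnr}_LEI1.JPG";

        // Two matches are reported: the whole string, then an empty one at the end.
        foreach (Match m in Regex.Matches(fileName, "([^_]*)(.*)"))
            Console.WriteLine($"index={m.Index} length={m.Length} value=\"{m.Value}\"");

        Console.WriteLine(Unanchored(fileName)); // sometext_LEI1.JPGsometext
        Console.WriteLine(Anchored(fileName));   // sometext_LEI1.JPG
    }
}
```

The caret behaves this way only while RegexOptions.Multiline is off, which is the default.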

Related

Can regex match Interleaved matches? [duplicate]

This question already has answers here:
How to find overlapping matches with a regexp?
(4 answers)
Closed 5 years ago.
I have text with opening and closing tags,
e.g. /*tag1_START*/ some content /*tag1_END*/ other text /*tag2_START*/ some content /*tag2_END*/
and I use the regex \/\*([a-zA-Z0-9]+)_START\*\/(.*?)\/\*\1_END\*
You can see it at regex101.
However, there was a situation where the tags were interleaved by mistake:
e.g. /*tag3_START*/ some /*tag4_START*/ content /*tag3_END*/ other /*tag4_END*/ content
I can easily check for overlap between the matches, but the regex does not return both tags because it continues from the last character it matched.
Can I use a regex to find overlapping matches, or do I need to write my own code?

Lookarounds assert rather than consume characters, but capturing groups inside them still store what they matched. Just put the overlapping part inside a positive lookahead:
\/\*([a-zA-Z0-9]+)_START\*\/(?=(.*?)\/\*\1_END\*)
Live demo
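As a rough C# sketch of this approach, assuming the interleaved sample from the question and tags that sit on a single line (the OverlapDemo class and FindTags helper are hypothetical names): only the opening tag is consumed, so the next search resumes right after it and can still find a tag that starts inside the previous body.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OverlapDemo
{
    // Only the opening tag is consumed; the body and closing tag sit in a lookahead.
    private const string Pattern = @"/\*([a-zA-Z0-9]+)_START\*/(?=(.*?)/\*\1_END\*)";

    // Returns the names of all tags whose START has a matching END later in the text.
    public static List<string> FindTags(string text)
    {
        var tags = new List<string>();
        foreach (Match m in Regex.Matches(text, Pattern))
            tags.Add(m.Groups[1].Value);
        return tags;
    }

    public static void Main()
    {
        const string text =
            "/*tag3_START*/ some /*tag4_START*/ content /*tag3_END*/ other /*tag4_END*/ content";

        // Group 2 is captured inside the lookahead, so it is still available.
        foreach (Match m in Regex.Matches(text, Pattern))
            Console.WriteLine($"{m.Groups[1].Value}: \"{m.Groups[2].Value}\"");
    }
}
```

If the content between tags can span several lines, RegexOptions.Singleline would be needed so that the dot also matches newlines.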

(?=\*([a-zA-Z0-9]+)_START\*\/(.*?)\/\*(\1)_END\*)
You will have to use a lookahead so that the match does not consume anything. See the demo.
https://regex101.com/r/vsA3ZU/1

Best way to remove unknown characters and spaces using C#? [duplicate]

This question already has answers here:
How can I remove the spaces, tabs, new lines between characters using c#'s REGEX?
(2 answers)
Closed 6 years ago.
Unknown Characters:
|b9-12-2016,¢Xocoak¡LO2A35(2)(b)¡ÓocORe3ao-i|],¢Xa?u¡±o¡±i?¢X$3,597,669On 9-12-2016, the price adjusted to $3,597,669 dueto the reason allowed under section 35(2)(b) of theOrdinance
Good Result:
$3,597,669On 9-12-2016, the price adjusted to $3,597,669 due to the reason allowed under section 35 of the Ordinance

You should be able to use regular expressions to do this. You can use the Regex.Replace method to run regular expressions on your text. Regular expressions are patterns that a regular expression engine tries to match in input text. I recommend that you take a look at the MSDN article here. You can also take a look at the documentation for the Regex.Replace method here. For example, in order to remove the letter c you could use this snippet of code:
output = Regex.Replace(input, "c", "", RegexOptions.IgnoreCase);
This removes both lowercase and uppercase Cs because the IgnoreCase option is set.
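For completeness, here is that call wrapped in a small runnable sketch. The RemoveLetterDemo class, the RemoveC helper and the sample input are assumptions made for illustration only:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RemoveLetterDemo
{
    // Replaces every 'c' or 'C' with the empty string.
    public static string RemoveC(string input) =>
        Regex.Replace(input, "c", "", RegexOptions.IgnoreCase);

    public static void Main()
    {
        Console.WriteLine(RemoveC("Cocoa and Chocolate")); // ooa and hoolate
    }
}
```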

If the text always follows the pattern you've shown, use the following code. It takes everything from the first $ sign onward, which is where the good result starts.
string str = "|b9-12-2016,¢Xocoak¡LO2A35(2)(b)¡ÓocORe3ao-i|],¢Xa?u¡±o¡±i?¢X$3,597,669On 9-12-2016, the price adjusted to $3,597,669 dueto the reason allowed under section 35(2)(b) of theOrdinance";
// IndexOf returns -1 when there is no '$'; in that case the text is left unchanged.
int start = str.IndexOf('$');
var result = start >= 0 ? str.Substring(start) : str;

Regular expression for characters after '.' [duplicate]

This question already has answers here:
How do I match an entire string with a regex?
(8 answers)
Closed 6 years ago.
I need to validate the following format when I enter a serial number like
CK123456.789
I used Regex with pattern of
^(CV[0-9]{6}\.[0-9]{3}
to match but if I enter
CK123456.7890
it still proceeds without flagging an error. Is there a better regular expression that accepts exactly 3 trailing digits after the '.'?

Depending on how you use the regular expression matcher, you might need to enclose the pattern in ^...$, which forces it to match the whole string, i.e.
^CK[0-9]{6}\.[0-9]{3}$ (Note the CK prefix).
I've also removed your leading (mismatched) parenthesis.
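A short sketch of the difference, assuming Regex.IsMatch is how the pattern is applied. The SerialDemo class and its method names are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SerialDemo
{
    // Anchored at both ends: the whole input must be CK, 6 digits, a dot, 3 digits.
    public static bool IsValidSerial(string s) =>
        Regex.IsMatch(s, @"^CK[0-9]{6}\.[0-9]{3}$");

    // Anchored only at the start: anything may follow the third digit.
    public static bool MatchesPrefixOnly(string s) =>
        Regex.IsMatch(s, @"^CK[0-9]{6}\.[0-9]{3}");

    public static void Main()
    {
        Console.WriteLine(IsValidSerial("CK123456.789"));      // True
        Console.WriteLine(IsValidSerial("CK123456.7890"));     // False
        Console.WriteLine(MatchesPrefixOnly("CK123456.7890")); // True
    }
}
```

In .NET, $ also matches just before a final newline, so \z can be used in its place if a trailing newline must be rejected as well.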

What does regex expression match pattern "\\[.*\\]" mean? [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 6 years ago.
I am new to regex. What does regex expression match pattern "\[.*\]" mean?
If I have a text like "Hello [Here]", then the match succeeds, and the match contains [Here].
I read that:
. indicates Any except \n (newline),
* indicates 0 or more times
I don't understand the "\\". I believe it is just the escape sequence for "\".
So, is the expression "\[.*\]" trying to match a pattern like \[Any text\]?

Yes, you are right. It will match any characters enclosed in []. The .* means zero or more characters between the brackets.
Also you should try this link which is a very helpful regex tool. You can input the regex pattern and check for matches easily.
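To try it in code, here is a minimal sketch. The BracketDemo class and FirstBracketed helper are illustrative names, not part of any library. In a regular C# string literal each backslash has to be doubled, which is why the pattern is written "\\[.*\\]"; the verbatim form @"\[.*\]" is the same pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketDemo
{
    // Returns the first bracketed run in the text, or an empty string if there is none.
    public static string FirstBracketed(string text)
    {
        Match m = Regex.Match(text, "\\[.*\\]"); // identical to @"\[.*\]"
        return m.Success ? m.Value : "";
    }

    public static void Main()
    {
        Console.WriteLine(FirstBracketed("Hello [Here]")); // [Here]
        // .* is greedy, so the match runs up to the last closing bracket.
        Console.WriteLine(FirstBracketed("a [b] c [d]"));  // [b] c [d]
    }
}
```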

I have tried this on regexr, here is a screen shot:

Matching a comment (//Comment) in regex outside quotation marks [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Regex to strip line comments from C#
I'm completely stuck with this, and I'm not good at writing regexes.
Basically I want to match comments in pieces of text, for example this one:
//Comment outside quotations
string text = "//Comment inside quotations..";
//Another comment
I want only the top and bottom comments to match, but not the middle one inside the quotation marks.
What I have now for comments is:
//.*$
which matches a comment through to the end of the line.
I want to use this for syntax highlighting in a TextBox.
Is this possible?

Try this :
"^(?!\".*\")//.*$"
This will match
//Comment outside quotations
and will not match
string text = "//Comment inside quotations..";
Apply whatever escaping C# requires.

Try this regex:
([^"]|"[^"]*")*(?<COMMENT>//.*)
Parse each match for the named group "COMMENT" (or whatever you choose to name it). Quick disclaimer that I didn't test it out in C#, I just threw the regex together using an online tool.
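A variation on this idea can be sketched as follows. Note the changes, which are assumptions of this sketch rather than part of the answer: the pattern is anchored with ^ so a match cannot begin inside a string literal, the prefix is made lazy so the first // outside quotes wins, and the input is checked one line at a time. The CommentDemo class and TryGetComment helper are hypothetical names, and escaped quotes or verbatim strings are not handled.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentDemo
{
    // Skip plain characters and complete "..." literals until the first // outside quotes.
    private static readonly Regex CommentRegex =
        new Regex(@"^(?:[^""]|""[^""]*"")*?(?<COMMENT>//.*)$");

    // Tries to extract a line comment from a single line of source text.
    public static bool TryGetComment(string line, out string comment)
    {
        Match m = CommentRegex.Match(line);
        comment = m.Success ? m.Groups["COMMENT"].Value : "";
        return m.Success;
    }

    public static void Main()
    {
        string[] lines =
        {
            "//Comment outside quotations",
            "string text = \"//Comment inside quotations..\";",
            "//Another comment"
        };

        // Only the first and third lines report a comment.
        foreach (string line in lines)
        {
            if (TryGetComment(line, out string comment))
                Console.WriteLine(comment);
        }
    }
}
```

For highlighting, the Index and Length of the COMMENT group would give the span to color within each line.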
