Only replace pattern if the whole line matches regex - c#

I am sure there is a trivial solution to this question but I can't seem to get it right:
I want to replace a specific pattern in a line only if the whole line matches the regex.
So in my case three pipes | should be replaced by underscores _ only if the whole line is numbers and pipes:
|||10|||-80|||-120|||400 ---> replace
|||10|||asdf|||-120|||400 ---> don't replace
|||10|||-80|||400 ---> replace
|||10|||-80|||-120|||400|||test ---> don't replace
Expected result:
___10___-80___-120___400
|||10|||asdf|||-120|||400
___10___-80___400
|||10|||-80|||-120|||400|||test
My attempts:
\|\|\|(?=\-?\d+)
replaces the pipes if followed by numbers as expected but of course also in the "invalid" lines
^(\|\|\|\-?\d+){1,}$
matches the whole line and therefore I can't replace only the pipes
I understand why my patterns don't work and perhaps I have to simply do it with two passes but it feels like this should totally be possible.

Without more details, it seems you can use
(?<=^(?:\|{3}-?\d+)*)\|{3}(?=-?\d+(?:\|{3}-?\d+)*$)
Or, if you need to process lines in a larger string:
(?m)(?<=^(?:\|{3}-?\d+)*)\|{3}(?=-?\d+(?:\|{3}-?\d+)*\r?$)
See the regex demo.
Details:
(?<=^(?:\|{3}-?\d+)*) - a positive lookbehind that requires that, immediately to the left of the current location, there is:
^ - start of string anchor
(?:\|{3}-?\d+)* - zero or more sequences of 3 |s followed with an optional - (-?) and then 1 or more digits
\|{3} - 3 pipes
(?=-?\d+(?:\|{3}-?\d+)*$) - a positive lookahead that requires that, immediately to the right of the current location, there is
-?\d+ - an optional - and then 1+ digits
(?:\|{3}-?\d+)* - 0 or more sequences of 3 |s + an optional - and then 1+ digits
$ - end of string anchor.
C#:
var res = Regex.Replace(s, @"(?<=^(?:\|{3}-?\d+)*)\|{3}(?=-?\d+(?:\|{3}-?\d+)*$)", "___", RegexOptions.ECMAScript);
The RegexOptions.ECMAScript flag is used to make \d only match ASCII digits.
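For comparison, here is a minimal sketch of the same idea in JavaScript rather than C#, assuming an engine with lookbehind support (current Node.js, for example). The names singlePass and twoPass are only illustrative; twoPass is the plain two-step check the question mentions, included to confirm that both approaches agree on the sample lines.

```javascript
// Pattern from the answer; the m flag makes ^ and $ work per line.
// In JavaScript, \d already matches ASCII digits only.
const pipesInNumericLine =
  /(?<=^(?:\|{3}-?\d+)*)\|{3}(?=-?\d+(?:\|{3}-?\d+)*$)/gm;

// Single pass: replace only the pipes that sit in a fully numeric line.
function singlePass(text) {
  return text.replace(pipesInNumericLine, "___");
}

// Two passes (hypothetical helper): validate each line, then replace.
function twoPass(text) {
  const wholeLine = /^(?:\|{3}-?\d+)+$/;
  return text
    .split("\n")
    .map((line) => (wholeLine.test(line) ? line.replace(/\|{3}/g, "___") : line))
    .join("\n");
}

const sample = [
  "|||10|||-80|||-120|||400",
  "|||10|||asdf|||-120|||400",
  "|||10|||-80|||400",
  "|||10|||-80|||-120|||400|||test",
].join("\n");

console.log(singlePass(sample));
console.log(singlePass(sample) === twoPass(sample));
```

The single-pass version keeps everything in one Replace call; the two-pass version is longer but arguably easier to read and does not depend on lookbehind support.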

Related

Regular expression for duplicate pattern

I am trying to write a regex to handle these cases
contains only alphanumeric with minimum of 2 alpha characters(numbers are optional).
only special character allowed is hyphen.
cannot be all same letter ignoring hyphen.
cannot be all hyphens
cannot be all numeric
My regex: (?=[^A-Za-z]*[A-Za-z]){2}^[\w-]{6,40}$
The regex above works for most scenarios except 1) and 3).
Can anyone suggest how to fix this? I am stuck.
Regards,
Sajesh
Rule 1 makes rules 4 and 5 redundant: a string with at least two letters can consist neither of only hyphens nor of only digits.
/^(?=[a-z\d-]{6,40}$)[\d-]*([a-z]).*?(?!\1)[a-z].*$/i
(?=[a-z\d-]{6,40}$) is a lookahead that requires 6 to 40 of the allowed characters (letters, digits, hyphens) up to the end of the string
([a-z]).*?(?!\1)[a-z] requires two letters, the second of which differs from the first
See this demo at regex101
With the i flag, this pattern treats A and a as the "same" letter (caseless matching) and requires a second, different letter. For case-sensitive matching, here is another demo at regex101.
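A quick way to see the effect of the i flag is to run the pattern against a few inputs. This is a small JavaScript sketch; the sample strings and the name caselessPattern are made up for illustration.

```javascript
// Pattern from the answer above, with the i flag (caseless matching).
const caselessPattern = /^(?=[a-z\d-]{6,40}$)[\d-]*([a-z]).*?(?!\1)[a-z].*$/i;

const caselessChecks = [
  "ab-123",  // two different letters: accepted
  "aAaAaA",  // only the letter a, in two cases: rejected because of i
  "a-1-a-1", // the same letter twice: rejected
  "123456",  // no letters at all: rejected
  "ab-12",   // shorter than 6 characters: rejected
].map((input) => [input, caselessPattern.test(input)]);

console.log(caselessChecks);
```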
You can use
^(?!\d+$)(?!-+$)(?=(?:[\d-]*[A-Za-z]){2})(?![\d-]*([A-Za-z])(?:[\d-]*\1)+[\d-]*$)[A-Za-z\d-]{6,40}$
See the regex demo. If you use it in C# or PHP, consider replacing ^ with \A and $ with \z to make sure you match the entire string even in case there is a trailing newline.
Details:
^ - start of string
(?!\d+$) - fail the match if the string only consists of digits
(?!-+$) - fail the match if the string only consists of hyphens
(?=(?:[\d-]*[A-Za-z]){2}) - there must be at least two ASCII letters after any zero or more digits or hyphens
(?![\d-]*([A-Za-z])(?:[\d-]*\1)+[\d-]*$) - fail the match if all the letters in the string are the same letter (the + after (?:[\d-]*\1) requires the captured letter to repeat at least once, with only digits or hyphens in between and up to the end of the string)
[A-Za-z\d-]{6,40} - six to forty alphanumeric or hyphen chars
$ - end of string. (\z might be preferable.)
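The breakdown above can be checked against one input per rule. The following JavaScript sketch uses made-up sample strings and an illustrative name, strictPattern; note that this pattern is case-sensitive, so a and A count as different letters.

```javascript
// Pattern from the answer above (no i flag, so matching is case-sensitive).
const strictPattern =
  /^(?!\d+$)(?!-+$)(?=(?:[\d-]*[A-Za-z]){2})(?![\d-]*([A-Za-z])(?:[\d-]*\1)+[\d-]*$)[A-Za-z\d-]{6,40}$/;

const strictChecks = [
  "abc-123", // valid
  "aA1234",  // valid: a and A are different letters here
  "aaaaaa",  // all letters identical
  "a-a-a-a", // all letters identical, ignoring hyphens
  "a12345",  // fewer than two letters
  "123456",  // all numeric
  "------",  // all hyphens
  "ab123",   // shorter than 6 characters
  "ab_123",  // underscore is not allowed
].map((input) => [input, strictPattern.test(input)]);

console.log(strictChecks);
```

Only the first two inputs are accepted; every other line breaks exactly one of the rules listed in the question.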

Regular expression to match base path

I'm trying to come up with a regular expression to match a certain base path. The rule should be to match the base path itself plus a "/" or a "." and the rest of the path.
for example, given /api/ping, the following should match
/api/ping.json
/api/ping
/api/ping/xxx/sss.json
/api/ping.xml
and this should NOT match
/api/pingpong
/api/ping_pong
/api/ping-pong
I tried with the following regexp:
/api/ping[[\.|\/].*]?
But it doesn't seem to catch the /api/ping case
Here is a link to the Regex Storm tester
--
update: thanks to the answers, now I have this version that reflects better my reasoning:
\/api\/ping(?:$|[.\/?]\S*)
The expression either ends after ping (that's the $ part) or continues with a ., / or ? followed by any non-space characters
here's the regex
You can use this regex, which uses a lookahead with alternation to ensure the base path is followed by either a . or a / or the end of the line $
\/api\/ping(?=\.|\/|$)\S*
Explanation:
\/api\/ping - Matches /api/ping text literally
(?=\.|\/|$) - Look ahead ensuring what follows is either a literal dot . or a slash / or end of line $
\S* - Matches zero or more non-whitespace characters, i.e. the rest of the path
Demo
In your regex, /api/ping[[\.|\/].*]?, the character set [] is not used correctly: you don't need to escape a dot . inside a character set, alternation | has no special meaning there (it is just a literal |), and character sets cannot be nested the way the pattern suggests. I guess you wanted something like this:
\/api\/ping([.\/].*)?$
Demo with your corrected regex
Note that whatever you place inside [] matches exactly one character out of the set, so [.\/] allows either a dot . or a slash /. Also note that you need to escape / as \/
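Both patterns from this answer can be run against the paths listed in the question. This is a small JavaScript sketch; the variable names are illustrative only.

```javascript
// Lookahead version and the corrected version of the original attempt.
const withLookahead = /\/api\/ping(?=\.|\/|$)\S*/;
const correctedOriginal = /\/api\/ping([.\/].*)?$/;

const shouldMatch = [
  "/api/ping.json",
  "/api/ping",
  "/api/ping/xxx/sss.json",
  "/api/ping.xml",
];
const shouldNotMatch = ["/api/pingpong", "/api/ping_pong", "/api/ping-pong"];

// Print each path with the result of both patterns.
for (const path of [...shouldMatch, ...shouldNotMatch]) {
  console.log(path, withLookahead.test(path), correctedOriginal.test(path));
}
```

Both patterns accept the first four paths and reject the last three.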
Your pattern starts with a character class that matches any one of the listed characters; it could also be written as [[./|].
It does not match /api/ping because the character class has to match at least one time, as it is not optional.
You could use an alternation that either asserts the end of the string right after /api/ping, or matches zero or more path segments (a forward slash followed by one or more characters other than a forward slash or whitespace) and then a dot and the extension.
/api/ping(?:(?:/[^/\s]+)*\.\S+|$)
That will match
/api/ping Match literally
(?: Non capturing group
(?:/[^/\s]+)* Repeat a grouping structure 0+ times matching / then 1+ times not / or a whitespace character
\.\S+ Match a dot and 1+ times a non whitespace character
| Or
$ Assert the end of the string
) Close non capturing group
See the regex demo | C# demo
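The pattern can be tried out in JavaScript as follows (the forward slashes are escaped because of the regex literal syntax; the names are illustrative). One detail worth noticing is that, as written, the pattern expects a dot and an extension whenever anything follows /api/ping, so a path such as /api/ping/xxx is not matched.

```javascript
// Same pattern as above, with / escaped for a JavaScript regex literal.
const segmentsThenExtension = /\/api\/ping(?:(?:\/[^\/\s]+)*\.\S+|$)/;

const pathChecks = [
  "/api/ping.json",
  "/api/ping",
  "/api/ping/xxx/sss.json",
  "/api/ping.xml",
  "/api/pingpong",
  "/api/ping_pong",
  "/api/ping-pong",
  "/api/ping/xxx", // extra sample: sub-path without an extension
].map((path) => [path, segmentsThenExtension.test(path)]);

console.log(pathChecks);
```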

Regex pattern to match numbers in certain conditions

First time posting, please forgive the formatting. I'm not really a programmer; I work in C# with the Revit and AutoCAD APIs. This is important to note because the Revit API is a bit of a mess, so the same code may produce different results in a different API. I have three basic string patterns where I want to return certain numbers depending on their prefix and suffix. They could be surrounded by other text than what I show, and the actual numbers and their positions within the string may vary.
String 1: (12) #4x2'-0 # 6 EF
String 2: (12) #4 # 6 EF
String 3: STAGGER 2'-0, SPCG AT 6 AT 12 SLAB
The code I'm using:
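if (LengthAsString.IsMatch(remarkdata))
{
    // Placeholder: put the actual pattern in the verbatim string below.
    Regex remarklength = new Regex(@"insertRegexPatternHere");
    if (remarklength.IsMatch(remarkdata))
    {
        // ${0} is the whole match; \u0022 appends an inch mark (").
        remarkdata = remarklength.Replace(remarkdata, "${0}\u0022");
    }
}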
remarkdata holds the strings from above, and I'm adding inch marks " after each number match.
The patterns I've tested and what they return:
Pattern                      String 1    String 2    String 3
\d+(?!['-]|([(\d+)]))        0,6         4,6         0,6,12
(?<![#])\d+                  12,2,0,6    12,6        2,9,6,12
\d+(?= #)|(?<=# )\d+         0,6         6           no matches
Expected results             0,6         6           0,6,12
So I'm close, but no cigar. Thoughts?
Double Edit: looking for the numbers that aren't preceded by #, nor between (). Ignore # and x, they may or may not be there.
You seem to be looking for
(?<!#)\d+(?!.*(?:['-]|[#x]\d))
See the regex demo
Details
(?<!#) - a negative lookbehind that fails the match if there is a # immediately to the left of the current location
\d+ - 1 or more digits (or [0-9]+ to only match ASCII digits)
(?!.*(?:['-]|[#x]\d)) - a negative lookahead that fails the match once there are any 0+ chars other than newline (.*) followed with ', -, or #/x followed with a digit immediately to the right of the current location.
Note that in case your strings always have balanced non-nested parentheses, and you may have (123) substrings after # or x1, you may also want to add [^()]*\) into the lookahead
(?<!#)\d+(?!.*(?:['-]|[#x]\d)|[^()]*\))
to avoid matching digits inside the parentheses.
See another .NET demo.
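To check the first pattern against the three strings exactly as they are written in the question, here is a small JavaScript sketch (JavaScript rather than C#, so the replacement string uses $& for the whole match instead of ${0}; the variable names are illustrative).

```javascript
// First pattern from the answer; g finds every match in the string.
const lengthNumber = /(?<!#)\d+(?!.*(?:['-]|[#x]\d))/g;

const remarks = [
  "(12) #4x2'-0 # 6 EF",
  "(12) #4 # 6 EF",
  "STAGGER 2'-0, SPCG AT 6 AT 12 SLAB",
];

// Numbers the pattern picks out of each string.
const foundNumbers = remarks.map((remark) => remark.match(lengthNumber));

// Append an inch mark after every matched number.
const withInchMarks = remarks.map((remark) => remark.replace(lengthNumber, '$&"'));

console.log(foundNumbers);
console.log(withInchMarks);
```

For these inputs the matches are 0,6 then 6 then 0,6,12, which lines up with the expected results in the question.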

Regular expression to match following criterias [duplicate]

I am using the following regular expression without restricting any character length:
var test = /^(a-z|A-Z|0-9)*[^$%^&*;:,<>?()\""\']*$/ // Works fine
In the above when I am trying to restrict the characters length to 15 as below, it throws an error.
var test = /^(a-z|A-Z|0-9)*[^$%^&*;:,<>?()\""\']*${1,15}/ //**Uncaught SyntaxError: Invalid regular expression**
How can I make the above regular expression work with the characters limit to 15?
You cannot apply quantifiers to anchors. Instead, to restrict the length of the input string, use a lookahead anchored at the beginning:
// ECMAScript (JavaScript, C++)
^(?=.{1,15}$)[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*$
 ^^^^^^^^^^^^
// Or, in flavors other than ECMAScript and Python
\A(?=.{1,15}\z)[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*\z
  ^^^^^^^^^^^^^
// Or, in Python
\A(?=.{1,15}\Z)[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*\Z
  ^^^^^^^^^^^^^
Also, I assume you wanted to match 0 or more letters or digits with (a-z|A-Z|0-9)*. It should look like [a-zA-Z0-9]* (i.e. use a character class here).
Why not use a limiting quantifier, like {1,15}, at the end?
Quantifiers are only applied to the subpattern to the left, be it a group or a character class, or a literal symbol. Thus, ^[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']{1,15}$ will effectively restrict the length of the second character class [^$%^&*;:,<>?()\"'] to 1 to 15 characters. The ^(?:[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*){1,15}$ will "restrict" the sequence of 2 subpatterns of unlimited length (as the * (and +, too) can match unlimited number of characters) to 1 to 15 times, and we still do not restrict the length of the whole input string.
How does the lookahead restriction work?
The (?=.{1,15}$) / (?=.{1,15}\z) / (?=.{1,15}\Z) positive lookahead appears right after the ^/\A start-of-string anchor (note that in Ruby, \A is the only anchor that matches only at the start of the whole string). It is a zero-width assertion that only returns true or false after checking whether its subpattern matches the subsequent characters. So, this lookahead tries to match any 1 to 15 characters (due to the limiting quantifier {1,15}) other than a newline, right up to the end of the string (due to the $/\z/\Z anchor). If we remove the $ / \z / \Z anchor from the lookahead, the lookahead will only require the string to start with at least one such character, and the total string length can be anything.
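The difference between the anchored lookahead and a quantifier placed at the end can be seen in a short JavaScript sketch. The names lengthLimited and trailingQuantifier and the test inputs are made up for illustration.

```javascript
// Lookahead version: limits the length of the whole string to 1..15.
const lengthLimited = /^(?=.{1,15}$)[a-zA-Z0-9]*[^$%^&*;:,<>?()"']*$/;

// Quantifier at the end: limits only the second character class.
const trailingQuantifier = /^[a-zA-Z0-9]*[^$%^&*;:,<>?()"']{1,15}$/;

const fifteen = "a".repeat(15);
const forty = "a".repeat(40);

console.log(lengthLimited.test(fifteen));    // true
console.log(lengthLimited.test(forty));      // false: longer than 15
console.log(lengthLimited.test(""));         // false: at least 1 char required
console.log(trailingQuantifier.test(forty)); // true: the length is not limited
```

The last call succeeds because the first class can consume 39 characters and leave a single one for the second class, which is exactly why the quantifier at the end does not restrict the total length.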
If the input string can contain a newline sequence, you should use the portable [\s\S] any-character construct (it will work in JS and other common regex flavors):
// ECMAScript (JavaScript, C++)
^(?=[\s\S]{1,15}$)[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*$
 ^^^^^^^^^^^^^^^^^
// Or, in flavors other than ECMAScript and Python
\A(?=[\s\S]{1,15}\z)[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*\z
  ^^^^^^^^^^^^^^^^^^
// Or, in Python
\A(?=[\s\S]{1,15}\Z)[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*\Z
  ^^^^^^^^^^^^^^^^^^
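A short JavaScript sketch of the newline case, with made-up inputs and illustrative names: the dot-based lookahead rejects a string that contains a line break, while the [\s\S] version counts the line break as one of the 15 characters.

```javascript
// Dot-based lookahead: . does not match a newline.
const dotVersion = /^(?=.{1,15}$)[a-zA-Z0-9]*[^$%^&*;:,<>?()"']*$/;

// [\s\S] lookahead: matches any character, including a newline.
const anyCharVersion = /^(?=[\s\S]{1,15}$)[a-zA-Z0-9]*[^$%^&*;:,<>?()"']*$/;

const shortWithNewline = "ab\ncd";               // 5 characters in total
const longWithNewline = "a\n" + "b".repeat(20);  // 22 characters in total

console.log(dotVersion.test(shortWithNewline));     // false
console.log(anyCharVersion.test(shortWithNewline)); // true
console.log(anyCharVersion.test(longWithNewline));  // false: longer than 15
```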

Regex for special case

I need to create a regex expression for the following scenario.
It can have only numbers and only one dot or comma.
First part can have one to three digits.
The second part can be a dot or a comma.
The third part can have one to two digits.
The valid scenarios are
123,12
123.12
123,1
123
12,12
12.12
1,12
1.12
1,1
1.1
1
I came up so far with this expression
\d{1,3}(?:[.,]\d{1,2})?
but it doesn't work well. For example, the input 11:11 is marked as valid.
You need to put anchors around your expression:
^\d{1,3}(?:[.,]\d{1,2})?$
^ will match the start of the string
$ will match the end of the string
If those anchors are missing, the pattern can match only part of your string. Since the last part is optional, on "11:11" it matches the digits before the colon, and a second match is found on the digits after the colon.
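The partial matching described above is easy to reproduce. This JavaScript sketch uses illustrative names and takes the valid inputs from the question; the invalid inputs are made up.

```javascript
// Without anchors the pattern finds matches inside a larger string.
const unanchored = /\d{1,3}(?:[.,]\d{1,2})?/g;
// With anchors the whole string has to fit the pattern.
const anchored = /^\d{1,3}(?:[.,]\d{1,2})?$/;

const partialMatches = "11:11".match(unanchored);
console.log(partialMatches);         // [ '11', '11' ]
console.log(anchored.test("11:11")); // false

const validInputs = [
  "123,12", "123.12", "123,1", "123", "12,12", "12.12",
  "1,12", "1.12", "1,1", "1.1", "1",
];
const invalidInputs = ["11:11", "1234", "123,123", "12,"];

console.log(validInputs.every((s) => anchored.test(s)));   // true
console.log(invalidInputs.some((s) => anchored.test(s)));  // false
```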
Try to use ^ and $:
^\d{1,3}(?:[.,]\d{1,2})?$
^ The match must start at the beginning of the string or line.
$ The match must occur at the end of the string or before \n at the end of the line or string.
