I need to find all words that strictly begin with "$" and otherwise contain only digits. So I wrote
[$]\d+
which gave me 4 matches for
$10 $10 $20a a$20
so I thought of adding a word boundary with \b:
[$]\d+\b
But it still matched
a$20.
I tried
\b[$]\d+\b
but I failed.
What I want to say is: accept the word only if it starts with $ and is followed by digits. How do I express "starts with $"? I suspect \b is the problem, since a word boundary is defined relative to alphanumeric characters and $ is not one of them.
What is the solution?
Not the best solution, but this should work (it does with your test case):
(?<=\s+|^)\$\d+\b
Have you tried
\B\$\d+\b
You were close, you just need to escape the $:
\B\$\d+\b
See the example matches here: http://regexhero.net/tester/?id=79d0ac3b-dd2c-4872-abb4-6a9780c91fc1
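To check the \B approach against the test string from the question, here is a small sketch in Python (the question does not name a language, so that choice is an assumption). The second pattern, which demands whitespace or the string edge on both sides, is also an assumption about what "strictly" might mean; it does not come from the answers above.

```python
import re

text = "$10 $10 $20a a$20"

# \B succeeds only where there is NO word boundary, so the "$" must not
# directly follow a letter, digit or underscore.
loose = re.findall(r"\B\$\d+\b", text)
print(loose)  # ['$10', '$10']

# Assumed stricter variant: whitespace (or the string edge) on both sides.
strict_pattern = re.compile(r"(?<!\S)\$\d+(?!\S)")
strict = strict_pattern.findall(text)
print(strict)  # ['$10', '$10']

# The two variants differ once punctuation touches the token.
other = "-$5 $6."
loose_other = re.findall(r"\B\$\d+\b", other)
strict_other = strict_pattern.findall(other)
print(loose_other)   # ['$5', '$6']
print(strict_other)  # []
```

Which of the two is right depends on whether something like "-$5" should count as a word that starts with $.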
Try with ^\$\d+
where ^ denotes the beginning of the string.
What seemed to be a simple task ended up not working as expected...
I'm trying to match \$\w+\b, unless it's preceded by an odd number of backslashes.
Examples (only $result should be in the match):
This $result should be matched
This \$result should not be matched
This \\$result should be matched
This \\\$result should not be matched
etc...
The following pattern works:
(?<!\\)(\\\\)*\$\w+\b
However, the even runs of backslashes are included in the match, which is unwanted, so I'm trying to achieve this purely with a variable-length lookbehind, but nothing I have tried so far seems to work.
Can any regex virtuoso here lend a hand?
You may use the following pattern:
(?<!(?:^|[^\\])\\(?:\\\\)*)\$\w+\b
Breakdown of the lookbehind, i.e. not preceded by:
(?:^|[^\\]) - Beginning of string/line or any character other than a backslash.
\\ - Then, one backslash character.
(?:\\\\)* - Then, any even number of backslash characters (including zero).
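The lookbehind above needs an engine with variable-length lookbehind. Python's built-in re module only accepts fixed-width lookbehind, so as a rough sketch the same condition (not preceded by an odd number of backslashes) can be checked by counting by hand. This is a swapped-in technique, not the pattern from this answer, and the function name unescaped_variables is made up for illustration.

```python
import re

def unescaped_variables(text):
    """Return $names preceded by an even number (including zero) of backslashes."""
    found = []
    for match in re.finditer(r"\$\w+\b", text):
        # Count the run of backslashes immediately before the "$".
        pos = match.start()
        backslashes = 0
        while pos - backslashes > 0 and text[pos - backslashes - 1] == "\\":
            backslashes += 1
        if backslashes % 2 == 0:
            found.append(match.group())
    return found

samples = [
    r"This $result should be matched",
    r"This \$result should not be matched",
    r"This \\$result should be matched",
    r"This \\\$result should not be matched",
]
for line in samples:
    print(line, "->", unescaped_variables(line))
```

Only the first and third sample yield $result, and the backslashes never end up in the returned text.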
Looks like asking the question helped me answer my own question.
The part I don't want to be matched has to be wrapped with a positive lookbehind.
(?<=(?<!\\)(\\\\)*)\$\w+\b
Also works if the $result is at the start of the line.
If anyone has more optimal solutions, shoot!
This regular expression gets the wanted text in the third capture group:
(^| )(\\\\)*(\$\w+\b)
Explanation:
(^| ) Either beginning of line or a space
(\\\\)* An even number of backslash characters, including none
( Start of capture group 3
\$\w+\b The wanted text
) End of capture group 3
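The capture-group idea can be sketched in Python as follows. Note that this is an assumption-laden variant rather than the exact pattern above: it keeps the (?<!\\) lookbehind from the question instead of (^| ), and it makes the backslash group non-capturing so that the wanted text is group 1 rather than group 3.

```python
import re

# Fixed-width lookbehind from the question, pairs of backslashes in a
# non-capturing group, and the wanted text in the only capturing group.
pattern = re.compile(r"(?<!\\)(?:\\\\)*(\$\w+\b)")

def wanted(text):
    # group(1) holds the $name without any leading backslashes.
    return [m.group(1) for m in pattern.finditer(text)]

samples = [
    r"This $result should be matched",
    r"This \$result should not be matched",
    r"This \\$result should be matched",
    r"This \\\$result should not be matched",
]
for line in samples:
    print(line, "->", wanted(line))
```

The overall match still includes the even run of backslashes; only the captured group is clean, which is the trade-off of this approach compared with a pure lookbehind.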
I am trying to create a regex to validate a string. The string could be of the following formats (to give an idea of what I am trying to do here):
145/1/3 or
748/57676/6765/454/345 or
45/234 45/235 45/236
So basically the string can contain numbers, spaces and forward slashes, and it must end with a number. I am new to regex and have gone through many of the questions on this website, but you have to admit that this is really confusing and difficult to master. If someone could recommend an author or a web link that teaches regex, that would be really helpful. Thanks in advance, mates!
I came up with this
^[0-9]( |[0-9]|\/)*[0-9]$
You can see it matches anything that begins (^) with a number, has zero or more (*) of either a space, a number or (|) a forward-slash (/) and ends ($) with a number.
Now that I am aware that a space and a / cannot appear together, and that consecutive spaces or slashes are not allowed either, this regex is a better fit for you:
^[0-9]+([ \/][0-9]+)*$
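As a quick check of that stricter pattern, here is a minimal sketch in Python (an assumed language, since the question does not name one). It uses fullmatch instead of explicit ^ and $ anchors; the helper name is_valid is made up for illustration.

```python
import re

# fullmatch() anchors at both ends, so ^ and $ are not needed here.
VALID = re.compile(r"[0-9]+(?:[ /][0-9]+)*")

def is_valid(s):
    return VALID.fullmatch(s) is not None

candidates = [
    "145/1/3",
    "748/57676/6765/454/345",
    "45/234 45/235 45/236",
    "145//3",   # doubled slash
    "145/ 3",   # slash next to a space
    "145/",     # does not end with a number
    "/145",     # does not start with a number
]
for s in candidates:
    print(repr(s), is_valid(s))
```

The first three strings are accepted and the last four are rejected.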
This should work: ^[/\d\s]*\d$.
It looks for the beginning of the string (^), then zero or more slashes, digits or whitespace characters ([/\d\s]*), followed by a digit (\d), then the end of the string ($).
You should use following regular expression:
(\d+(/\d+)*\s*)+
This means: some digits (\d+), followed by an optional repeating pattern of a / and some digits ((/\d+)*), followed by an optional run of whitespace (\s*), all repeated at least once.
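One thing to watch with this pattern is that it has no anchors, so whether it validates the whole string depends on how it is applied. A small Python sketch (Python is an assumption here) shows the difference between searching and matching the full string:

```python
import re

pattern = re.compile(r"(\d+(/\d+)*\s*)+")

# search() is satisfied by any matching substring.
hit = pattern.search("abc 12/3 xyz")
print(repr(hit.group()))  # '12/3 '

# fullmatch() requires the whole string to fit the pattern.
rejected = pattern.fullmatch("abc 12/3 xyz")
accepted = pattern.fullmatch("45/234 45/235 45/236")
print(rejected is None, accepted is not None)

# The trailing \s* still lets the string end in whitespace.
trailing = pattern.fullmatch("12 ")
print(trailing is not None)
```

So for validation the pattern has to be anchored (or applied with a full-match call), and even then it tolerates trailing whitespace, which the question rules out.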
Try this:
^\d(\d|\s|\/)*\d$
\d = digit character (you can also use [0-9]).
\s = whitespace character
The parentheses followed by a star mean: repeat a \d, \s or / zero or more times.
The final \d$ means the ending must match a digit.
I am looking for a way to get words out of a sentence. I am pretty far with the following expression:
\b([a-zA-Z]+?)\b
but there are some occurrences where it counts a word when I don't want it to, e.g. a word followed by more than one period, like "text..". So in my regex I want the period to appear at the end of a word zero or one time. Inserting \.? did not do the trick, and variations on this have not yielded anything fruitful either.
Hope someone can help!
A single dot means any character. You must escape it as
\.?
Maybe you want an expression like this:
\w+\.?
or
\p{L}+\.?
You need to add \.? (and not .?) because the period has special meaning in regexes.
To avoid a match on your example "text..", you need not only \.? to check whether the first character after the word is a dot, but also a look one character further to check the second character after the word.
I ended up with something like this:
\w{2,}\.?[^.]
You should also consider that a sentence does not always end with a . but can also end with ! or ? and the like.
I usually use rubulator.com to quickly test a regexp
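Looking one character further can also be done with a negative lookahead, which checks the next character without consuming it. The following is only a sketch in Python (an assumed language), and the pattern is a variation of my own rather than one taken from the answers above:

```python
import re

# A whole word, optionally followed by one period, but never by a second one.
# The inner \b stops the engine from backtracking into a partial word.
word = re.compile(r"\b[a-zA-Z]+\b\.?(?!\.)")

sentence = "Hello world. More text.. end"
words = word.findall(sentence)
print(words)  # ['Hello', 'world.', 'More', 'end']
```

Because the lookahead consumes nothing, a word at the very end of the string is still found and the character after each word is not swallowed into the match.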
I am trying to use Regex to find out if a string matches *abc - in other words, it starts with anything but finishes with "abc"?
What is the regex expression for this?
I tried *abc but "Regex.Matches" returns true for xxabcd, which is not what I want.
abc$
You need the $ to match the end of the string.
.*abc$
should do.
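For illustration, the effect of the $ anchor can be sketched in Python (the question uses .NET, so treat this as an analogy rather than the exact API; the helper name ends_with_abc is made up):

```python
import re

def ends_with_abc(s):
    # search() scans the whole string; $ pins the match to the end.
    # (In Python, $ also matches just before a single trailing newline.)
    return re.search(r"abc$", s) is not None

# Without the anchor, "abc" is found anywhere in the string.
unanchored = re.search(r"abc", "xxabcd") is not None

print(ends_with_abc("xxabc"))   # True
print(ends_with_abc("xxabcd"))  # False
print(unanchored)               # True
```

The leading .* is optional when searching, since a search may start anywhere; it only matters if the engine is asked to match from the start of the string.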
So you have a few "fish" here, but here's how to fish.
An online expression library and .NET-based tester: RegEx Library
An online Ruby-based tester (faster than the .NET one): Rubular
A Windows app for testing expressions (most fully-featured, but no zero-width look-aheads or look-behinds): RegEx Coach
Try this instead:
.*abc$
The $ matches the end of the line.
^.*abc$
Will capture any line ending in abc.
It depends on what exactly you're looking for. If you're trying to match whole lines, like:
a line with words and spacesabc
you could do:
^.*abc$
Where ^ matches the beginning of a line and $ the end.
But if you're matching words in a line, e.g.
trying to match thisabc and thisabc but not thisabcd
You will have to do something like:
\w*abc(?!\w)
This means: match any number of consecutive word characters, followed by abc, and then require that the next character is not a word character (e.g. whitespace or the end of the line).
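Applied to the sample line above, the word-level pattern behaves like this in Python (an assumed language for the sketch; the lookahead syntax is the same in .NET):

```python
import re

text = "trying to match thisabc and thisabc but not thisabcd"

# (?!\w) asserts that no word character follows "abc", without consuming anything.
matches = re.findall(r"\w*abc(?!\w)", text)
print(matches)  # ['thisabc', 'thisabc']
```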
If you want a string of exactly 4 characters ending in abc, use /^.abc$/
I am updating some code that I didn't write and part of it is a regex as follows:
\[url(?:\s*)\]www\.(.*?)\[/url(?:\s*)\]
I understand that .*? does a non-greedy match of everything in the second register.
What does ?:\s* in the first and third registers do?
Update: As requested, language is C# on .NET 3.5
The syntax (?:) is a way of putting parentheses around a subexpression without separately extracting that part of the string.
The author wanted to match the (.*?) part in the middle, and didn't want the spaces at the beginning or the end to get in the way. Now you can use \1 or $1 (or whatever the appropriate method is in your particular language) to refer to the domain name, instead of the first chunk of spaces at the beginning of the string.
?: makes the parentheses non-capturing. In that regex, you'll only pull out one piece of information, $1, which contains the middle (.*?) expression.
What does ?:\s* in the first and third registers do?
It's matching zero or more whitespace characters, without capturing them.
The regex author intends to allow trailing whitespace in the square-bracket-tags, matching all DNS labels following the "www." like so:
[url]www.foo.com[/url] # foo.com
[url ]www.foo.com[/url ] # same
[url ]www.foo.com[/url] # same
[url]www.foo.com[/url ] # same
Note that the regex also matches:
[url]www.[/url] # empty string!
and fails to match
[url]stackoverflow.com[/url] # no match, bummer
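The cases above can be reproduced with a short script. The question is about C#, so this Python version is only a sketch under the assumption that the pattern behaves the same way in both engines for these inputs; the helper name domain is made up.

```python
import re

pattern = re.compile(r"\[url(?:\s*)\]www\.(.*?)\[/url(?:\s*)\]")

# Only the middle (.*?) is a capturing group; the (?:\s*) parts are not counted.
print(pattern.groups)  # 1

def domain(s):
    m = pattern.search(s)
    return m.group(1) if m else None

print(domain("[url]www.foo.com[/url]"))        # foo.com
print(domain("[url ]www.foo.com[/url ]"))      # foo.com
print(repr(domain("[url]www.[/url]")))         # ''
print(domain("[url]stackoverflow.com[/url]"))  # None
```

Had the author written (\s*) instead of (?:\s*), the domain would have ended up in group 2 rather than group 1.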
You may find this Regular Expressions Cheat Sheet very helpful (hopefully). I spent ages trying to learn Regex with no luck. And once I read this cheat-sheet - I immediately understood what I previously failed to learn.
http://krijnhoetmer.nl/stuff/regex/cheat-sheet/