I have no knowledge of regular expressions and find the documentation very hard to understand.
Currently I use this expression:
@"\d+(\.\d{0,2})?"
It only allows decimals, which is what I want, but it does not allow negative numbers.
I found this question about the same subject:
How do I include negative decimal numbers in this regular expression?
but I just cannot see what I need to change in my expression to get it working.
I would appreciate some help with this.
If there is some documentation on the subject that is clear to read and understand that would also be nice.
Use ^-?\d+(?:\.\d{0,2})?$. Note that your regex also accepts numbers like 20. (with a trailing dot), so I suggest changing it to at least ^-?\d+(?:\.\d{1,2})?$.
Also, don't forget the ^ and the $. You can use www.regex101.com/ to try out a regex and read good reference documentation alongside it.
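A minimal sketch of how the second pattern might be used from C#, assuming the goal is to validate one whole input value. The IsDecimal helper and the sample inputs are made up for illustration and are not from the question:

```csharp
using System;
using System.Text.RegularExpressions;

// Optional minus sign, one or more digits, then optionally a dot followed by 1-2 digits.
var decimalPattern = new Regex(@"^-?\d+(?:\.\d{1,2})?$");

// Hypothetical helper: true when the whole input is a decimal with at most two places.
bool IsDecimal(string input) => decimalPattern.IsMatch(input);

foreach (var sample in new[] { "12", "-12.5", "-0.75", "20.", "1.234", "abc" })
{
    Console.WriteLine($"{sample}: {IsDecimal(sample)}");
}
```

With the ^ and $ anchors in place, "20." and "1.234" are rejected as a whole instead of being partially matched.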
You can allow a leading sign like this:
@"[+-]?\d+(\.\d{0,2})?"
Check this simple cheat sheet for C# regular expression metacharacters, operators, quantifiers, etc.
And for sure, https://regex101.com is the best online regex tester.
I am extracting all numbers used in an XML file. The numbers are written in the following two patterns:
<Environment Id="11" StringId="8407" DescriptionId="5014" RemoteControlAppStringId="8119; 8118" EnvironmentType="BlueToothBridge" AlternateId="1" XML_NAME_ID="BTBSpeechPlusM" FactoryGainType="LIN18">
<Offsets />
</Environment>
I am using the regexes "\"\d*;\"" and "\"\d*\"" to extract all numbers.
When I run the regex "\"\d*\"" on the above using
Regex.Match(myString, "\"\\d*\"")
it returns 8407, 11 and 5014, but it does not return 8119 and 8118.
Your regex fails to match 8119; 8118 because your pattern only finds a number that fills the whole quoted value.
Try with
\b\d+\b
\b specifies that \d+ matches only between word boundaries, so the 18 in LIN18 will not match.
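As a rough sketch, this is how the word-boundary pattern could be applied with Regex.Matches to collect every number. The input is a shortened version of the element from the question, and the variable names are only illustrative:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Shortened, illustrative version of the element from the question.
var xml = "<Environment Id=\"11\" RemoteControlAppStringId=\"8119; 8118\" FactoryGainType=\"LIN18\">";

// \b\d+\b matches runs of digits that are not glued to letters or other digits.
var numbers = Regex.Matches(xml, @"\b\d+\b")
                   .Cast<Match>()
                   .Select(m => m.Value)
                   .ToList();

Console.WriteLine(string.Join(", ", numbers)); // 11, 8119, 8118
```

Note that Regex.Match returns only the first match, so Regex.Matches is used here to get all of them.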
Depending on whether you can assume that the provided input is valid XML, you could use the following regular expression:
Regex.Match(myString, "(?<=\")\\d+(?=\")|(?<=\")\\d+(?=; ?\\d+\")|(?<=\"\\d+; ?)\\d+(?=\")" )
The main idea behind this is that it takes the three possible situations into account:
"[number]"
"[number]; [other_number]" (With or without a space before [other_number])
"[other_number]; [number]" (With or without a space before [number])
There are two new concepts I included in the regular expression:
Positive lookahead: (?=[regex])
Positive lookbehind: (?<=[regex])
These concepts allow the regular expression to check if something specific is before or after it, without putting it in the match.
This regular expression could easily be optimised, but this is meant as an example of a basic approach.
One good tip for developing a regular expression like this is to use a tool (online or offline) to test your regular expression. The tool I used was .NET Regex Tester.
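To make the three lookaround alternatives concrete, here is a small sketch that runs the same regular expression with Regex.Matches. The input is a shortened version of the question's element, and the variable names are assumptions made for the example:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Shortened, illustrative input; in a verbatim string a double quote is written as "".
var xml = @"<Environment Id=""11"" RemoteControlAppStringId=""8119; 8118"" FactoryGainType=""LIN18"">";

// Three alternatives: "n", the first number of "n; m", and the second number of "n; m".
var pattern = @"(?<="")\d+(?="")|(?<="")\d+(?=; ?\d+"")|(?<=""\d+; ?)\d+(?="")";

var quotedNumbers = Regex.Matches(xml, pattern)
                         .Cast<Match>()
                         .Select(m => m.Value)
                         .ToList();

Console.WriteLine(string.Join(", ", quotedNumbers)); // 11, 8119, 8118
```

The quotes and the semicolon are checked by the lookarounds but are not part of the matched values, and the 18 in LIN18 is skipped because no lookbehind fits in front of it.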
As @poke stated in the comment, it's because your regex doesn't match the string. Change your regex to capture specific matches and account for the possibility of the ';'.
Something like below should probably do the trick.
EDIT: (\b\d+\b)|(\b\d+[;*]\d+\b)
I'm horrible at regex so please bear with me here:
I need a match where the first character can be anything and the next two have to be RS.
so...
XRS123445 - Match
I suggest you start reading this. Matching any character at a position is basically the simplest thing you can do with regular expressions. There are many different things you can use, too:
Any word character, i.e. letter, digit or underscore (\w)
Any character whatsoever (.)
A range of characters ([A-Z])
Any character in a certain Unicode range ([\uxxxx-\uxxxx])
and more. You should also be careful, as regex flavors differ in certain nuances and certain flags have to be set to get the same result. I won't go into more detail here to avoid confusion.
This is the regex you're looking for:
^.RS.*
This would match on any of these:
XRS123445
4RSabc
YRS
.RS.*
This should match, since . means any character and it is followed by RS as per your requirements.
Use this pattern
var pattern = "^.RS";
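A quick sketch of that pattern in use, assuming you only need a yes/no answer per string. The helper name and the two non-matching samples are invented for the example:

```csharp
using System;
using System.Text.RegularExpressions;

// ^ anchors at the start, . matches any single character (except a newline), then literal RS.
var pattern = "^.RS";

// Hypothetical helper wrapping Regex.IsMatch.
bool HasRsAfterFirstChar(string input) => Regex.IsMatch(input, pattern);

foreach (var sample in new[] { "XRS123445", "4RSabc", "YRS", "RS123", "XXRS1" })
{
    Console.WriteLine($"{sample}: {HasRsAfterFirstChar(sample)}");
}
```

The first three samples match. "RS123" and "XXRS1" do not, because RS is not in the second and third positions.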
I want a regular expression that accepts all digits and letters, and only the hyphen (-) from the special characters.
I am trying this expression: ^\d+$/[-]/[a-z] but it does not work. I want to accept expressions like this one:
Emp-IN-0000001
Can someone help me with this?
If it's always this format (Emp-IN-0000001), then use this regexp:
^[a-zA-Z]+-[a-zA-Z][a-zA-Z]-[0-9]+$
or, if you have extended regexps:
^[a-zA-Z]+-[a-zA-Z]{2}-\d+$
when there are always seven digits, use this:
^[a-zA-Z]+-[a-zA-Z]{2}-\d{7}$
You can even say:
^Emp-IN-\d{7}$
if it's exactly "Emp-IN-" + digits.
Btw, this is not C# specific, you can use these regular expressions with any language, as long as they support regexps at all.
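For reference, a small C# sketch of the seven-digit variant, assuming the format is always letters, two letters, then exactly seven digits. The IsEmployeeId helper and the rejected samples are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

// Letters, a hyphen, exactly two letters, a hyphen, exactly seven digits.
var employeeId = new Regex(@"^[a-zA-Z]+-[a-zA-Z]{2}-\d{7}$");

// Hypothetical helper: true when the whole string follows the Emp-IN-0000001 shape.
bool IsEmployeeId(string input) => employeeId.IsMatch(input);

foreach (var sample in new[] { "Emp-IN-0000001", "Emp-IN-001", "Emp-India-0000001", "Emp_IN_0000001" })
{
    Console.WriteLine($"{sample}: {IsEmployeeId(sample)}");
}
```

Only the first sample is accepted. The others fail on the digit count, the two-letter middle part, and the separator respectively.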
If you strictly want to follow the format Emp-IN-0000001, then you might need to use this regular expression:
^[a-zA-Z]+-[a-zA-Z]+-\d+$
I don't really get what you tried with your regular expression, but it is actually as simple as this:
^[a-zA-Z\d-]+$
Or if you want to allow empty strings:
^[a-zA-Z\d-]*$
If you use the case-insensitive modifier with your regular expression, you can leave out either the a-z or A-Z from both variants.
I recommend you read up on some regex basics in this great tutorial.
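Here is a short sketch of the character-class approach with the case-insensitive modifier, assuming RegexOptions.IgnoreCase is how you would set it in C#. The IsAllowed helper and the sample strings are only illustrative:

```csharp
using System;
using System.Text.RegularExpressions;

// With IgnoreCase, a-z also covers A-Z; the trailing - inside [] is a literal hyphen.
var allowed = new Regex(@"^[a-z\d-]+$", RegexOptions.IgnoreCase);

// Hypothetical helper: true when the input contains only letters, digits and hyphens.
bool IsAllowed(string input) => allowed.IsMatch(input);

foreach (var sample in new[] { "Emp-IN-0000001", "abc123", "Emp_IN", "abc 123", "" })
{
    Console.WriteLine($"'{sample}': {IsAllowed(sample)}");
}
```

The empty string is rejected here because of the + quantifier; switching to * would accept it.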
I'm trying to develop a regex that will be used in a C# program.
My initial regex was:
(?<=\()\w+(?=\))
Which successfully matches "(foo)" - matching but excluding from output the open and close parens, to produce simply "foo".
However, if I modify the regex to:
\[(?<=\()\w+(?=\))\]
and I try to match against "[(foo)]" it fails to match. This is surprising. I'm simply prepending and appending the literal open and close square brackets around my previous expression. I'm stumped. I use Expresso to develop and test my expressions.
Thanks in advance for your kind help.
Rob Cecil
Your look-behinds are the problem. Here's how the string is being processed:
We see [ in the string, and it matches the regex.
Look-behind in regex asks us to see if the previous character was a '('. This fails, because it was a '['.
At least that's what I would guess is causing the problem.
Try this regex instead:
(?<=\[\()\w+(?=\)\])
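A small sketch comparing the two patterns on the input from the question may make the difference easier to see. The variable names are chosen for the example only:

```csharp
using System;
using System.Text.RegularExpressions;

var input = "[(foo)]";

// Question's pattern: after consuming [, the look-behind demands that the previous character is (.
var original = @"\[(?<=\()\w+(?=\))\]";

// Suggested pattern: both brackets are moved inside the lookarounds.
var corrected = @"(?<=\[\()\w+(?=\)\])";

var originalMatches = Regex.IsMatch(input, original);
var correctedValue = Regex.Match(input, corrected).Value;

Console.WriteLine(originalMatches); // False
Console.WriteLine(correctedValue);  // foo
```

The original can never match, because the character just before the look-behind is always the [ that the pattern itself consumed.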
Out of context, it is hard to judge, but the look-behind here is probably overkill. They are useful to exclude strings (as in strager's example) and in some other special circumstances where simple REs fail, but I often see them used where simpler expressions are easier to write, work in more RE flavors and are probably faster.
In your case, you could probably write (\b\w+\b) for example, or even (\w+) using natural bounds, or if you want to distinguish (foo) from -foo- (for example), using \((\w+)\).
Now, perhaps the context dictates this convoluted use (or perhaps you were just experimenting with look-behind), but it is good to know alternatives.
Now, if you are just curious why the second expression doesn't work: these are known as "zero-width assertions". They check that what follows or precedes conforms to what is expected, but they don't consume any of the string, so whatever comes after them in the pattern (or before them, for a look-behind) must match the asserted text too. E.g. if you put something after a positive lookahead that doesn't match what is asserted, you can be sure the RE will fail.
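As an illustration of the capture-group alternative, here is a sketch using \((\w+)\) and reading group 1 instead of relying on lookarounds. The input string is taken from the question, and the variable names are assumptions:

```csharp
using System;
using System.Text.RegularExpressions;

// The parentheses are matched literally; the inner (\w+) captures the word as group 1.
var match = Regex.Match("[(foo)]", @"\((\w+)\)");

var whole = match.Value;           // the full match, including the parentheses
var word = match.Groups[1].Value;  // only the captured word

Console.WriteLine(whole); // (foo)
Console.WriteLine(word);  // foo
```

The full match includes the parentheses, but the group gives you the bare word, which is usually all the lookaround version was trying to achieve.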
I want to match a closing HTML tag like </EM> only if there is no opening <EM> tag somewhere before it, that is, before any preceding tags or text. My sample string is
ddd d<STRONG>dfdsdsd dsdsddd<EM>ss</EM>r and</EM>and strong</STRONG>
In this string the output should be only the second </EM>, because it lacks an opening <EM>. I have tried
(?!=<EM>.*)</EM>
but it doesn't seem to work. Please help, thanks.
I am not sure regex is best suited for this kind of task, since tags can always be nested.
Anyhow, a C# regex like:
(?<!<EM>[^<]+)</EM>
would only return the second </EM> tag.
Note that:
?! is a negative lookahead, which explains why both </EM> tags are found.
So (?!=<EM>.*)xxx actually means "match xxx if it is not followed by =<EM>.*". I am not sure you wanted to include an = in there.
?<! is a negative lookbehind, better suited to what you wanted to do, but it would not work with the Java regex engine, since this look-behind does not have an obvious maximum length.
However, with a .Net regex engine, as tested on RETester, it does work.
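A rough sketch that runs both patterns against the sample string from the question in .NET, assuming Regex.Matches is used to collect the results. The variable names are only for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

var html = "ddd d<STRONG>dfdsdsd dsdsddd<EM>ss</EM>r and</EM>and strong</STRONG>";

// Question's attempt: a negative lookahead for the literal text "=<EM>...", which never occurs here.
var lookaheadMatches = Regex.Matches(html, @"(?!=<EM>.*)</EM>");

// Negative lookbehind: reject </EM> when "<EM>" plus some non-tag text sits directly before it.
var lookbehindMatches = Regex.Matches(html, @"(?<!<EM>[^<]+)</EM>");

Console.WriteLine(lookaheadMatches.Count);  // 2
Console.WriteLine(lookbehindMatches.Count); // 1
```

The single lookbehind match is the second </EM>. Keep in mind that [^<]+ only looks back over plain text, so an <EM> with nested tags before its </EM> would not be recognised.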
You need a pushdown automaton here. Regular expressions aren't powerful enough to capture this concept, since they are equivalent to finite-state automata, so a regex solution is strictly speaking a no-go.
That said, .NET regular expressions do have a pushdown automaton behind them so they can theoretically cope with such cases. If you really feel you need to do this with regular expressions rather than a formal HTML parser, take a glimpse here.
You should see the top answer to this other Stack Overflow question, because it gives the perfect answer. In short, don't use regular expressions to try to parse HTML - it's a really bad idea.