I'm new to regex and I want to highlight hexadecimal numbers in Assembly style. Like this:
$00
$FF
$1234
($00)
($00,x)
and even hexadecimal numbers that begin with #.
So far I wrote "$[A-Fa-f0-9]+" to see if it highlights numbers beginning with $ but it doesn't. Why? And can someone help me with what I'm doing? Thanks.
Put a backslash before the $ and your regex will work, like so:
\$[A-Fa-f0-9]+
$ is a regex metacharacter that matches the end of the string. So if your pattern is meant to contain a literal dollar sign, you need to escape it. See a regex reference for details.
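A minimal sketch of the difference, assuming a made-up line of assembly as input (the `input` string is only an illustration, not from the question):

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample line; any text containing $-prefixed hex numbers would do.
string input = "LDA $00 STA $FF JMP $1234";

// Unescaped: $ is the end-of-string anchor, so the pattern asks for
// hex digits AFTER the end of the input and can never match here.
bool unescapedMatches = Regex.IsMatch(input, "$[A-Fa-f0-9]+");

// Escaped: \$ matches a literal dollar sign.
MatchCollection hits = Regex.Matches(input, @"\$[A-Fa-f0-9]+");

Console.WriteLine(unescapedMatches);   // False
foreach (Match hit in hits)
    Console.WriteLine(hit.Value);      // $00, then $FF, then $1234
```

To accept numbers that begin with # as well, the leading `\$` could be swapped for the character class `[$#]`.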
This should cover all those cases, including the cases in which you get a # instead of a $
public Regex MyRegex = new Regex(
"^(\\()?[\\$#][0-9a-fA-F]+(,x)?(?(1)\\))[\\s]*$",
RegexOptions.Singleline
| RegexOptions.Compiled
);
The unescaped sequence for the single line: ^(\()?[\$#][0-9a-fA-F]+(,x)?(?(1)\))[\s]*$
That should validate on a per-line match.
By the way, I made this regex pretty quickly using Expresso
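Here is a rough sketch of how that pattern behaves when each line is tested on its own. The sample operands are assumptions chosen to exercise the conditional `(?(1)\))`, which only demands a closing parenthesis when the opening one was captured:

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern as above, written as a verbatim string.
var operand = new Regex(@"^(\()?[\$#][0-9a-fA-F]+(,x)?(?(1)\))[\s]*$");

// Hypothetical test operands, one per "line".
string[] lines = { "$00", "#FF", "($00)", "($00,x)", "($00", "$00)", "$GG" };

foreach (string line in lines)
{
    // Unbalanced parentheses and non-hex digits are rejected.
    Console.WriteLine($"{line,-8} {operand.IsMatch(line)}");
}
```

Note that this validates a whole line rather than highlighting a number inside a longer line; for highlighting you would probably drop the ^ and $ anchors.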
I'm currently using this pattern: [HelloWorld]{1,}.
If my input is Hello, it matches.
But if my input is WorldHello, it still matches, which is not what I want.
How can I make the input string match exactly the value inside the pattern?
Just get rid of the square brackets, and the comma and you're good to go!
HelloWorld{1}
In regex what's between square brackets is a character set.
So [HelloWorld] matches 1 character that's in the set [edlorHW].
And .{1,} or .+ both match 1 or more characters.
What you probably want is the literal word.
So the regex would simply be "HelloWorld".
That would match HelloWorld in the string "blaHelloWorldbla".
What if you want it to match only as a whole word, and not as part of a longer word?
Then you could use word boundaries (\b), which mark the transition between a word character (\w, roughly [A-Za-z0-9_]) and a non-word character (\W, roughly [^A-Za-z0-9_]), or between a word character and the beginning or end of the input.
For example, @"\bHelloWorld\b" gets a match from "bla HelloWorld bla" but not from "blaHelloWorldbla".
Note that the regex string this time was preceded by @.
That is because in a verbatim string the backslashes don't have to be escaped.
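The three variants discussed above can be compared side by side. This is only a sketch using the sample strings from the answer:

```csharp
using System;
using System.Text.RegularExpressions;

// Character class: any run of the letters H, e, l, o, W, r, d, in any order.
bool classMatchesScrambled = Regex.IsMatch("WorldHello", "[HelloWorld]{1,}");

// Literal word: matches anywhere, even inside a longer word.
bool literalInsideWord = Regex.IsMatch("blaHelloWorldbla", "HelloWorld");

// Word boundaries: matches only when HelloWorld stands alone.
bool boundedStandalone = Regex.IsMatch("bla HelloWorld bla", @"\bHelloWorld\b");
bool boundedInsideWord = Regex.IsMatch("blaHelloWorldbla", @"\bHelloWorld\b");

Console.WriteLine(classMatchesScrambled);  // True
Console.WriteLine(literalInsideWord);      // True
Console.WriteLine(boundedStandalone);      // True
Console.WriteLine(boundedInsideWord);      // False
```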
It seems you need an online regex tester to check your pattern; for example, you could find one of them here, and you could also study the C# regex reference here.
Try this pattern:
[a-zA-Z]{1,}
You can test it online
I'm trying to extract only numbers from a string/text. Below is the regex pattern I'm using.
Regex regex = new Regex(@"[\d+]\S+");
string extract_from = " 12 abcd 1-2-3a a123z 1.2.3.4 xyz";
From the string "extract_from" above, the regex is extracting the numbers
12
1-2-3a
123z
1.2.3.4
The regex extracts them correctly except for the second and third ones, "1-2-3a" and "123z", which shouldn't be extracted because they contain letters. What can I add to the regex so that it skips numbers that also contain letters?
Dashes and dots are OK, just not letters.
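Change the \S in your regex to \s; note the case.
\S matches any character except whitespace, \s matches whitespace. Keep in mind that [\d+] is a character class, so it matches a single digit or a literal +, not a run of digits.
Regex regex = new Regex(@"[\d+]\s+");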
Try this one:
[0-9\-.]+\s+
That will allow expressions with more than one decimal point, and with dashes inside them rather than just at the beginning.
You can use regexhero.net or www.regexplanet.com to test your regular expressions; they're very powerful tools.
Output from your given input would be the following matches:
12
1.2.3.4
Edit, based on comment from OP
This regex shouldn't require a space at the beginning. If you need to match a number at the end of the line, it's probably simplest to just add a special case for it:
[0-9\-.]+\s|[0-9\-.]+$
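As a quick check of that last pattern against the string from the question, here is a sketch. The `Trim()` call is my own addition, because the first alternative includes the trailing whitespace character in the match:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string extractFrom = " 12 abcd 1-2-3a a123z 1.2.3.4 xyz";

// Digits, dashes and dots followed by whitespace, or running to the end of the input.
var numberToken = new Regex(@"[0-9\-.]+\s|[0-9\-.]+$");

string[] numbers = numberToken.Matches(extractFrom)
    .Cast<Match>()
    .Select(m => m.Value.Trim())   // drop the whitespace captured by \s
    .ToArray();

Console.WriteLine(string.Join(", ", numbers));  // 12, 1.2.3.4
```

One caveat: the pattern has no check on what comes before the digits, so a token such as "abc12" followed by a space would still yield "12".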
Use this pattern to match tokens that contain no letters:
(?!\S*[a-zA-Z])\b([^a-zA-Z\s]+)\b
Demo
I am using the regex below to strip all non-ASCII characters from a string.
String pattern = @"[^\u0000-\u007F]";
Regex rx = new Regex(pattern, RegexOptions.Compiled);
data = rx.Replace(data, " ");
However, I want to allow currency symbols (such as the pound sign) and trademark symbols.
I have modified the regex as shown below and it works for me. Can anyone confirm whether the regex is valid?
String pattern = @"[^\u0000-\u007F \p{Sc}]";
Basically, I want to allow all currency symbols too.
Yes, your regex is correct.
What your code does is replace every character matched by your regular expression with a space.
Now, what characters does your regular expression match?
Anything except:
The range you specified: 0000-007F
Currency symbol characters: \p{Sc}. See http://regular-expressions.info/unicode.html#prop
If you just want to keep allowing some other characters, yes, you can add them too (exactly like you did with \p{Sc}).
Edit:
Be careful when doing it in the future. The regex would really be [^\u0000-\u007F\p{Sc}] (no space), although in this case it doesn't matter since the space character was already in the ASCII range.
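A small sketch of the corrected pattern in action. The sample text is made up, and the second regex, which also keeps the trademark sign (U+2122), is an assumption about how the question's other requirement might be met:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: "Price: £5 or €6 <en dash> Acme<trademark sign>", written with escapes.
string data = "Price: \u00A35 or \u20AC6 \u2013 Acme\u2122";

// Replace everything that is neither ASCII nor a currency symbol.
var keepCurrency = new Regex(@"[^\u0000-\u007F\p{Sc}]");
string cleaned = keepCurrency.Replace(data, " ");

// Assumed extension: additionally allow the trademark sign.
var keepCurrencyAndTm = new Regex(@"[^\u0000-\u007F\p{Sc}\u2122]");
string cleanedWithTm = keepCurrencyAndTm.Replace(data, " ");

Console.WriteLine(cleaned);        // pound and euro signs survive; dash and trademark become spaces
Console.WriteLine(cleanedWithTm);  // trademark sign survives as well
```

Remember that Replace returns a new string; the original `data` is left unchanged.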
I'm using C# and want to use the following regular expression in my code:
sDatabaseServer\s*=\s*"([^"]*)"
I have placed it in my code as:
Regex databaseServer = new Regex(@"sDatabaseServer\s*=\s*"([^"]*)"", RegexOptions.Compiled | RegexOptions.IgnorePatternWhitespace);
I know you have to escape all parentheses and quotes inside the string literal, but for some reason it still does not work.
Working Version:
Regex databaseServer = new Regex(@"sDatabaseServer\s*=\s*""([^""]*)""", RegexOptions.Compiled | RegexOptions.IgnorePatternWhitespace);
Any ideas how to get C# to see my regex as just a string? I know, I know... easy question. Sorry, I'm still somewhat of an amateur at C#.
SOLVED: Thanks guys!
You went one step too far when you escaped the parentheses. If you want them to be regex meta-characters (i.e. a capturing group), then you must not escape them. Otherwise they will match literal parentheses.
So this is probably what you are looking for:
@"sDatabaseServer\s*=\s*""([^""]*)"""
string regex = "sDatabaseServer\\s*=\\s*\"([^\"]*)\"";
In your first try, you forgot to escape your quotes. But since it's a verbatim string literal, escaping with a \ doesn't work; quotes have to be doubled instead.
In your second try, you escaped the quotes, but you didn't escape the \ that's needed for your whitespace token \s.
Use \x22 instead of quotes:
string pattern = @"sDatabaseServer\s*=\s*\x22([^\x22]*)\x22";
But RegexOptions.IgnorePatternWhitespace allows comments in the regex pattern (introduced by the # sign) and lets the pattern be split over multiple lines. You have neither, so remove it.
A better pattern for what you seek is
string pattern = @"(?:sDatabaseServer\s*=\s*\x22)([^\x22]+)(?:\x22)";
(?: ) is a non-capturing group: it matches but does not capture, and here it acts like an anchor for the parser. The pattern also assumes there will be at least one character inside the quotes, hence + instead of *.
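To see that the different spellings suggested above end up as the same regex, here is a sketch. The `config` line and the server name in it are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

// Verbatim string: backslashes are literal, quotes are doubled.
string verbatim = @"sDatabaseServer\s*=\s*""([^""]*)""";
// Regular string: backslashes and quotes are both escaped with \.
string regular = "sDatabaseServer\\s*=\\s*\"([^\"]*)\"";
// Verbatim string that sidesteps quoting by using the hex escape for ".
string hexQuote = @"sDatabaseServer\s*=\s*\x22([^\x22]*)\x22";

// The first two literals produce the very same pattern text.
bool samePattern = verbatim == regular;

// Hypothetical config line to match against.
string config = "sDatabaseServer = \"db01.example.local\"";
string fromVerbatim = Regex.Match(config, verbatim).Groups[1].Value;
string fromHexQuote = Regex.Match(config, hexQuote).Groups[1].Value;

Console.WriteLine(samePattern);    // True
Console.WriteLine(fromVerbatim);   // db01.example.local
Console.WriteLine(fromHexQuote);   // db01.example.local
```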
I am trying to create a regular expression pattern in C#. The pattern can only allow for:
letters
numbers
underscores
So far I am having little luck (I'm not good at regex). Here is what I have tried thus far:
// Create the regular expression
string pattern = @"\w+_";
Regex regex = new Regex(pattern);
// Compare a string against the regular expression
return regex.IsMatch(stringToTest);
EDIT :
@"^[a-zA-Z0-9\_]+$"
or
@"^\w+$"
@"^\w+$"
\w matches any "word character", defined as digits, letters, and underscores. It's Unicode-aware so it'll match letters with umlauts and such (better than trying to roll your own character class like [A-Za-z0-9_] which would only match English letters).
The ^ at the beginning means "match the beginning of the string here", and the $ at the end means "match the end of the string here". Without those, e.g. if you just had @"\w+", then "##Foo##" would match, because it contains one or more word characters. With the ^ and $, then "##Foo##" would not match (which sounds like what you're looking for), because you don't have beginning-of-string followed by one-or-more-word-characters followed by end-of-string.
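A short sketch of the points above. The sample inputs other than "##Foo##" are my own, and the word with an umlaut is written with \u escapes:

```csharp
using System;
using System.Text.RegularExpressions;

var wordOnly = new Regex(@"^\w+$");
var asciiOnly = new Regex(@"^[a-zA-Z0-9_]+$");

bool plainWord = wordOnly.IsMatch("Foo_123");          // letters, digits, underscore
bool anchored = wordOnly.IsMatch("##Foo##");           // rejected: # is not a word character
bool unanchored = Regex.IsMatch("##Foo##", @"\w+");    // accepted: "Foo" is found inside

// "Gr\u00F6\u00DFe" is the German word for "size", with o-umlaut and sharp s.
bool unicodeWord = wordOnly.IsMatch("Gr\u00F6\u00DFe");
bool unicodeAscii = asciiOnly.IsMatch("Gr\u00F6\u00DFe");

Console.WriteLine(plainWord);     // True
Console.WriteLine(anchored);      // False
Console.WriteLine(unanchored);    // True
Console.WriteLine(unicodeWord);   // True
Console.WriteLine(unicodeAscii);  // False
```

One detail worth knowing: in .NET, $ also matches just before a final newline, so "Foo\n" passes @"^\w+$". If that matters, \z can be used in place of $.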
Try experimenting with something like http://www.weitz.de/regex-coach/ which lets you develop regex interactively.
It's designed for Perl, but helped me understand how a regex works in practice.
Regex packedasciiRegex = new Regex(@"^[!#$%&'()*+,-./:;?#[\]^_]*$");