Regex Expressions for all non alphanumeric symbols - c#

I am trying to make a regular expression for a string that has at least 1 non alphanumeric symbol in it
The code I am trying to use is
Regex symbolPattern = new Regex("?[!##$%^&*()_-+=[{]};:<>|./?.]");
I'm trying to match only one of !##$%^&*()_-+=[{]};:<>|./?. but it doesn't seem to be working.

If you want to match non-alphanumeric symbols then just use \W|_.
Regex pattern = new Regex(#"\W|_");
This will match anything except 0-9 and a-z. Information on the \W character class and others available here (c# Regex Cheet Sheet).
https://www.mikesdotnetting.com/article/46/c-regular-expressions-cheat-sheet

You could also avoid regular expressions if you want:
return s.Any(c => !char.IsLetterOrDigit(c))

Can you check for the opposite condition?
Match match = Regex.Match(#"^([a-zA-Z0-9]+)$");
if (!match.Success) {
// it's alphanumeric
} else {
// it has one of those characters in it.
}

I didn't get your entire question, but this regex will match those strings that contains at least one non alphanumeric character. That includes whitespace (couldn't see that in your list though)
[^\w]+

Your regex just needs little tweaking. The hyphen is used to form ranges like A-Z, so if you want to match a literal hyphen, you either have to escape it with a backslash or move it to the end of the list. You also need to escape the square brackets because they're the delimiters for character class. Then get rid of that question mark at the beginning and you're in business.
Regex symbolPattern = new Regex(#"[!##$%^&*()_+=\[{\]};:<>|./?,-]");
If you only want to match ASCII punctuation characters, this is probably the simplest way. \W matches whitespace and control characters in addition to punctuation, and it matches them from the entire Unicode range, not just ASCII.
You seem to be missing a few characters, though: the backslash, apostrophe and quotation mark. Adding those gives you:
#"[!##$%^&*()_+=\[{\]};:<>|./?,\\'""-]"
Finally, it's a good idea to always use C#'s verbatim string literals (#"...") for regexes; it saves you a lot of hassle with backslashes. Quotation marks are escaped by doubling them.

Related

Why is giving me true the regular expression [^%()*+-\/=?#[\\]ªº´`¿'.]* with the comma (,)?

I have a problem with that regular expression [^%()*+-\/=?#[\\]ªº´¿'.]*` .
I want to avoid the characters inside. the regular expression it is working but when I set something like DAVID, SC I can save the form because it has a comma but this character it is not inside the regular expression.
Could you help me please?
You are not accounting for the special meaning of - inside a character class [.....].
You must either place the dash at the very end, or else escape it with a backslash:
[^%()*+\/=?#\[\]ªº´¿'.-]*
In your original regex, +-\/ disallows any characters between + and / in the ASCII table; these are the comma, dot and dash. Your example input contains a comma so the regex did not match all of the input at once.
I have also fixed the escaping for the [] characters from [\\] to \[\], which I presume was a mistake.
Because you're using * in [^%()*+\/=?#[\\]ªº´¿'.-]* with line start/end anchors. * means match 0 or more of preceding group/pattern in character class and your regex can even match an empty string.
Use this regex:
^[^%()*+\/=?#[\\-]ªº´¿'.]+$
PS: Hyphen - should be either or first OR at last position in character class to avoid escaping.
Rubular Demo

Strip non ascii chars but allow currency symbols

I am using below regex to strip all non-ascii characters from a string.
String pattern = #"[^\u0000-\u007F]";
Regex rx = new Regex(pattern, RegexOptions.Compiled);
rx.Replace(data," ");
However, i want to allow use of curreny (pound symbol) and trademark symbols.
I have modified above regex as shown below & it works for me. Can anyone just confirm if the regex is valid ?
String pattern = #"[^\u0000-\u007F \p{Sc}]";
Basically, I want to allow all currency symbols too.
Yes, your regex is correct.
What you are doing with your code is replacing the characters matched by your regular expressions by an empty character.
Now, what characters does your regular expression match?
Anything except:
The range you specified: 0000-007F
Currency symbol characters: \p{Sc}. See http://regular-expressions.info/unicode.html#prop
If you just want to keep allowing some other characters, yes, you can add them too (exactly like you did with \p{Sc}.
Edit:
Be careful when doing it in the future. The regex would really be [^\u0000-\u007F\p{Sc}] (no space), although in this case it doesn't matter since the space character was already in the ASCII range.

.NET RegEx for letters and spaces

I am trying to create a regular expression in C# that allows only alphanumeric characters and spaces. Currently, I am trying the following:
string pattern = #"^\w+$";
Regex regex = new Regex(pattern);
if (regex.IsMatch(value) == false)
{
// Display error
}
What am I doing wrong?
If you just need English, try this regex:
"^[A-Za-z ]+$"
The brackets specify a set of characters
A-Z: All capital letters
a-z: All lowercase letters
' ': Spaces
If you need unicode / internationalization, you can try this regex:
#"$[\\p{L}\\s]+$"
See https://learn.microsoft.com/en-us/dotnet/standard/base-types/character-classes-in-regular-expressions#word-character-w
This regex will match all unicode letters and spaces, which may be more than you need, so if you just need English / basic Roman letters, the first regex will be simpler and faster to execute.
Note that for both regex I have included the ^ and $ operator which mean match at start and end. If you need to pull this out of a string and it doesn't need to be the entire string, you can remove those two operators.
try this for all letter with space :
#"[\p{L} ]+$"
The character class \w does not match spaces. Try replacing it with [\w ] (there's a space after the \w to match word characters and spaces. You could also replace the space with \s if you want to match any whitespace.
If, other then 0-9, a-z and A-Z, you also need to cover any accented letters like ï, é, æ, Ć or Ş then you should better use the Unicode properties \p{...} for matching, i.e. (note the space):
string pattern = #"^[\p{IsLetter}\p{IsDigit} ]+$";
This regex works great for me.
Regex rgx = new Regex("[^a-zA-Z0-9_ ]+");
if (rgx.IsMatch(yourstring))
{
var err = "Special charactes are not allowed in Tags";
}

C# Regular Expression to match letters, numbers and underscore

I am trying to create a regular expression pattern in C#. The pattern can only allow for:
letters
numbers
underscores
So far I am having little luck (i'm not good at RegEx). Here is what I have tried thus far:
// Create the regular expression
string pattern = #"\w+_";
Regex regex = new Regex(pattern);
// Compare a string against the regular expression
return regex.IsMatch(stringToTest);
EDIT :
#"^[a-zA-Z0-9\_]+$"
or
#"^\w+$"
#"^\w+$"
\w matches any "word character", defined as digits, letters, and underscores. It's Unicode-aware so it'll match letters with umlauts and such (better than trying to roll your own character class like [A-Za-z0-9_] which would only match English letters).
The ^ at the beginning means "match the beginning of the string here", and the $ at the end means "match the end of the string here". Without those, e.g. if you just had #"\w+", then "##Foo##" would match, because it contains one or more word characters. With the ^ and $, then "##Foo##" would not match (which sounds like what you're looking for), because you don't have beginning-of-string followed by one-or-more-word-characters followed by end-of-string.
Try experimenting with something like http://www.weitz.de/regex-coach/ which lets you develop regex interactively.
It's designed for Perl, but helped me understand how a regex works in practice.
Regex
packedasciiRegex = new Regex(#"^[!#$%&'()*+,-./:;?#[\]^_]*$");

Regular Expression for alphanumeric and space

What is the regular exp for a text that can't contain any special characters except space?
Because Prajeesh only wants to match spaces, \s will not suffice as it matches all whitespace characters including line breaks and tabs.
A character set that should universally work across all RegEx parsers is:
[a-zA-Z0-9 ]
Further control depends on your needs. Word boundaries, multi-line support, etc... I would recommend visiting Regex Library which also has some links to various tutorials on how Regular Expression Parsing works.
[\w\s]*
\w will match [A-Za-z0-9_] and the \s will match whitespaces.
[\w ]* should match what you want.
Assuming "special characters" means anything that's not a letter or digit, and "space" means the space character (ASCII 32):
^[A-Za-z0-9 ]+$
You need #"^[A-Za-z0-9 ]+$". The \s character class matches things other than space (such as tab) and you since you want to match sure that no part of the string has other characters you should anchor it with ^ and $.
If you just want alphabets and spaces then you can use: #"[A-Za-z\s]+" to match at least one character or space. You could also use #"[A-Za-z ]+" instead without explicitly denoting the space.
Otherwise please clarify.
In C#, I'd believe it's ^(\w|\s)*$

Categories

Resources