.NET RegEx for letters and spaces - c#

I am trying to create a regular expression in C# that allows only alphanumeric characters and spaces. Currently, I am trying the following:
string pattern = @"^\w+$";
Regex regex = new Regex(pattern);
if (regex.IsMatch(value) == false)
{
// Display error
}
What am I doing wrong?

If you just need English, try this regex:
"^[A-Za-z ]+$"
The brackets specify a set of characters
A-Z: All capital letters
a-z: All lowercase letters
' ': Spaces
If you need Unicode / internationalization, you can try this regex:
@"^[\p{L}\s]+$"
See https://learn.microsoft.com/en-us/dotnet/standard/base-types/character-classes-in-regular-expressions#word-character-w
This regex will match all unicode letters and spaces, which may be more than you need, so if you just need English / basic Roman letters, the first regex will be simpler and faster to execute.
Note that both regexes include the ^ and $ anchors, which match at the start and end of the string. If you need to pull the match out of a longer string and it doesn't have to be the entire string, you can remove those two anchors.
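As a minimal sketch of the two patterns above in use (the class and method names here are made up for illustration, and the Unicode pattern is written as a verbatim string starting with ^), the validation could look like this:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class LettersAndSpaces
{
    // English letters and the space character only.
    private static readonly Regex AsciiOnly = new Regex(@"^[A-Za-z ]+$");

    // Any Unicode letter plus any whitespace character.
    private static readonly Regex UnicodeLetters = new Regex(@"^[\p{L}\s]+$");

    public static bool IsAsciiLettersAndSpaces(string value) => AsciiOnly.IsMatch(value);

    public static bool IsUnicodeLettersAndSpaces(string value) => UnicodeLetters.IsMatch(value);
}
```

With these, "Hello World" passes both checks, while "Héllo wörld" passes only the Unicode one.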

Try this for all letters and spaces:
@"^[\p{L} ]+$"

The character class \w does not match spaces. Try replacing it with [\w ] (there's a space after the \w) to match word characters and spaces. You could also replace the space with \s if you want to match any whitespace.

If, other than 0-9, a-z and A-Z, you also need to cover accented letters like ï, é, æ, Ć or Ş, then you are better off using the Unicode properties \p{...} for matching, i.e. (note the space):
string pattern = @"^[\p{L}\p{Nd} ]+$";
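A small sketch comparing the Unicode-category approach with the simpler [\w ] class may help. It assumes the general-category names \p{L} (letters) and \p{Nd} (decimal digits), and the class and method names are hypothetical:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class AlphanumericAndSpaces
{
    // Unicode letters, decimal digits and the space character.
    private static readonly Regex LettersDigitsSpaces = new Regex(@"^[\p{L}\p{Nd} ]+$");

    // Word characters (letters, digits, underscore) and the space character.
    private static readonly Regex WordCharsSpaces = new Regex(@"^[\w ]+$");

    public static bool IsLettersDigitsSpaces(string value) => LettersDigitsSpaces.IsMatch(value);

    public static bool IsWordCharsSpaces(string value) => WordCharsSpaces.IsMatch(value);
}
```

The practical difference is the underscore: "foo_bar" is accepted by the [\w ] version but rejected by the \p{L}\p{Nd} version, while "Ćafé 42" is accepted by both.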

This regex works great for me.
Regex rgx = new Regex("[^a-zA-Z0-9_ ]+");
if (rgx.IsMatch(yourstring))
{
var err = "Special characters are not allowed in Tags";
}

Related

How to split Alphanumeric with Symbol in C#

I want to split an alphanumeric string into two parts, a numeric part and an alphabetic part, when they are separated by a special character such as -
string mystring = "1- Any Thing";
I want to store the parts like this:
numberPart = 1
alphaPart = Any Thing
For this I am using Regex:
Regex re = new Regex(@"([a-zA-Z]+)(\d+)");
Match result = re.Match("1- Any Thing");
string alphaPart = result.Groups[1].Value;
string numberPart = result.Groups[2].Value;
If there is no space or symbol between the two parts it works fine, but with a space and a symbol in between, both alphaPart and numberPart come back empty. What am I doing wrong? The regex is probably wrong for this kind of input; please suggest a fix.
Try this:
(\d+)(?:[^\w]+)?([a-zA-Z\s]+)
Explanation:
(\d+) - captures one or more digits
[^\w]+ - matches one or more non-word characters (anything other than letters, digits and underscore)
? - makes that separator optional, so the pattern also matches when nothing stands between the number and the text
[a-zA-Z\s]+ - matches letters and whitespace, so the text part may contain spaces
Start of string is matched with ^.
Digits are matched with \d+.
Any non-alphanumeric characters are matched with [\W_] or \W.
Anything is matched with .*.
Use
(?s)^(\d+)\W*(.*)
See proof
(?s) makes . match linebreaks. So, it literally matches everything.
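To make the last pattern concrete, here is a minimal sketch that applies (?s)^(\d+)\W*(.*) to the string from the question. The class and method names are invented for this example:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class NumberAndText
{
    // (?s) lets '.' match newlines; \W* skips separators such as "- ".
    private static readonly Regex Splitter = new Regex(@"(?s)^(\d+)\W*(.*)");

    public static bool TrySplit(string input, out string numberPart, out string alphaPart)
    {
        Match m = Splitter.Match(input);
        // Group 1 holds the leading digits, group 2 everything after the separator.
        numberPart = m.Success ? m.Groups[1].Value : string.Empty;
        alphaPart = m.Success ? m.Groups[2].Value : string.Empty;
        return m.Success;
    }
}
```

Calling TrySplit("1- Any Thing", out var number, out var text) yields number = "1" and text = "Any Thing". Input that does not start with a digit makes the method return false.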

Regex replace special character

I need help in my regex.
I need to remove the special character found in the start of text
for example I have a text like this
.just a $#text this should not be incl#uded
The output should be like this
just a text this should not be incl#uded
I've been testing my regex here but i can't make it work
([\!-\/\;-\@]+)[\w\d]+
How do I limit the regex to check only text that starts with special characters?
Thank you
Use \B[!-/;-@]+\s*\b:
var result = Regex.Replace(s, @"\B[!-/;-@]+\s*\b", "");
See the regex demo
Details
\B - the position other than a word boundary (there must be start of string or a non-word char immediately to the left of the current position)
[!-/;-@]+ - 1 or more ASCII punctuation
\s* - 0+ whitespace chars
\b - a word boundary, there must be a letter/digit/underscore immediately to the right of the current location.
If you plan to remove all punctuation and symbols, use
var result = Regex.Replace(s, @"\B[\p{P}\p{S}]+\s*\b", "");
See another regex demo.
Note that \p{P} matches any punctuation symbols and \p{S} matches any symbols.
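A short sketch of both replacements, run against the sample text from the question, is shown below. It assumes the ASCII ranges are !-/ and ;-@, and the class and method names are hypothetical:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class LeadingSymbols
{
    // Removes ASCII punctuation that sits directly in front of a word.
    public static string StripAscii(string s) =>
        Regex.Replace(s, @"\B[!-/;-@]+\s*\b", "");

    // Same idea, but for any Unicode punctuation (\p{P}) or symbol (\p{S}).
    public static string StripUnicode(string s) =>
        Regex.Replace(s, @"\B[\p{P}\p{S}]+\s*\b", "");
}
```

For ".just a $#text this should not be incl#uded", both methods return "just a text this should not be incl#uded". The # inside incl#uded survives because a word character precedes it, so \B does not match there.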
Use a lookbehind:
(^[.$#]+|(?<= )[.$#]+)
The ^[.$#]+ part matches the special characters at the start of a line.
The (?<= )[.$#]+ part matches the special characters at the start of a word inside the sentence.
Add your special characters in the character group [] as you need.
Following are two possible options from your question details. Hope it will help you.
string input = ".just a $#text this should not be incl#uded";
// Remove all special characters from the whole string
string output1 = Regex.Replace(input, @"[^0-9a-zA-Z\ ]+", "");
// Remove leading special characters from each word in the string; other special characters are kept
var split = input.Split();
string output2 = string.Join(" ", split.Select(s => Regex.Replace(s, @"^[^0-9a-zA-Z]+", "")).ToArray());
A negative lookahead works fine here:
(?![\.\$#].*)[\S]+
https://regex101.com/r/i0aacp/11/
[\S]+ matches one or more non-whitespace characters
(?![\.\$#].*) is a negative lookahead: the match [\S]+ must not start with any of . $ #

Remove punctuation from string with Regex

I'm really bad with Regex but I want to remove all these .,;:'"$#@!?/*&^-+ out of a string
string x = "This is a test string, with lots of: punctuations; in it?!.";
How can I do that ?
First, please read here for information on regular expressions. It's worth learning.
You can use this:
Regex.Replace("This is a test string, with lots of: punctuations; in it?!.", @"[^\w\s]", "");
Which means:
[ #Character block start.
^ #Not these characters (letters, numbers).
\w #Word characters.
\s #Space characters.
] #Character block end.
In the end it reads "replace any character that is not a word character or a space character with nothing."
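As a quick sketch of that replacement wrapped in a reusable method (the class and method names are made up for this example):

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class PunctuationRemover
{
    // Drops every character that is neither a word character nor whitespace.
    public static string RemovePunctuation(string text) =>
        Regex.Replace(text, @"[^\w\s]", "");
}
```

For the string from the question this returns "This is a test string with lots of punctuations in it". Because \w includes the underscore, underscores are kept; add _ to the pattern as [^\w\s]|_ if they should go too.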
This code shows the full RegEx replace process and gives a sample Regex that only keeps letters, numbers, and spaces in a string - replacing ALL other characters with an empty string:
//Regex to remove all non-alphanumeric characters
System.Text.RegularExpressions.Regex TitleRegex = new
System.Text.RegularExpressions.Regex("[^a-z0-9 ]+",
System.Text.RegularExpressions.RegexOptions.IgnoreCase);
string ParsedString = TitleRegex.Replace(stringToParse, String.Empty);
return ParsedString;
And I've also stored the code here for future use:
http://code.justingengo.com/post/Use%20a%20Regular%20Expression%20to%20Remove%20all%20Punctuation%20from%20a%20String
Sincerely,
S. Justin Gengo
http://www.justingengo.com

Regex Expressions for all non alphanumeric symbols

I am trying to make a regular expression for a string that has at least 1 non alphanumeric symbol in it
The code I am trying to use is
Regex symbolPattern = new Regex("?[!@#$%^&*()_-+=[{]};:<>|./?.]");
I'm trying to match only one of !@#$%^&*()_-+=[{]};:<>|./?. but it doesn't seem to be working.
If you want to match non-alphanumeric symbols then just use \W|_.
Regex pattern = new Regex(@"\W|_");
This will match any character that is not a letter or a digit: \W covers every non-word character, and _ is added because the underscore counts as a word character. Information on the \W character class and others is available here (C# Regex Cheat Sheet).
https://www.mikesdotnetting.com/article/46/c-regular-expressions-cheat-sheet
You could also avoid regular expressions if you want:
return s.Any(c => !char.IsLetterOrDigit(c));
Can you check for the opposite condition?
// input is the string being tested
Match match = Regex.Match(input, @"^([a-zA-Z0-9]+)$");
if (match.Success) {
// it's alphanumeric
} else {
// it has one of those characters in it.
}
I didn't get your entire question, but this regex will match strings that contain at least one non-alphanumeric character. That includes whitespace (I couldn't see that in your list, though).
[^\w]+
Your regex just needs little tweaking. The hyphen is used to form ranges like A-Z, so if you want to match a literal hyphen, you either have to escape it with a backslash or move it to the end of the list. You also need to escape the square brackets because they're the delimiters for character class. Then get rid of that question mark at the beginning and you're in business.
Regex symbolPattern = new Regex(@"[!@#$%^&*()_+=\[{\]};:<>|./?,-]");
If you only want to match ASCII punctuation characters, this is probably the simplest way. \W matches whitespace and control characters in addition to punctuation, and it matches them from the entire Unicode range, not just ASCII.
You seem to be missing a few characters, though: the backslash, apostrophe and quotation mark. Adding those gives you:
@"[!@#$%^&*()_+=\[{\]};:<>|./?,\\'""-]"
Finally, it's a good idea to always use C#'s verbatim string literals (@"...") for regexes; it saves you a lot of hassle with backslashes. Quotation marks are escaped by doubling them.
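The following sketch puts the extended character class next to the regex-free check suggested earlier, so the difference is visible. It assumes the list starts with !@#$%, and the class and method names are hypothetical:

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class SymbolCheck
{
    // Explicit ASCII list; the hyphen sits last so it is taken literally.
    private static readonly Regex AsciiSymbols =
        new Regex(@"[!@#$%^&*()_+=\[{\]};:<>|./?,\\'""-]");

    // True if the string contains at least one symbol from the list above.
    public static bool HasListedSymbol(string s) => AsciiSymbols.IsMatch(s);

    // True if any character is not a letter or digit (spaces count too).
    public static bool HasNonAlphanumeric(string s) => s.Any(c => !char.IsLetterOrDigit(c));
}
```

The two checks disagree on whitespace: "pass word" has no listed symbol, but it does contain a non-alphanumeric character, the space.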

C# Regular Expression to match letters, numbers and underscore

I am trying to create a regular expression pattern in C#. The pattern can only allow for:
letters
numbers
underscores
So far I am having little luck (i'm not good at RegEx). Here is what I have tried thus far:
// Create the regular expression
string pattern = @"\w+_";
Regex regex = new Regex(pattern);
// Compare a string against the regular expression
return regex.IsMatch(stringToTest);
EDIT :
@"^[a-zA-Z0-9\_]+$"
or
@"^\w+$"
@"^\w+$"
\w matches any "word character", defined as digits, letters, and underscores. It's Unicode-aware so it'll match letters with umlauts and such (better than trying to roll your own character class like [A-Za-z0-9_] which would only match English letters).
The ^ at the beginning means "match the beginning of the string here", and the $ at the end means "match the end of the string here". Without those, e.g. if you just had @"\w+", then "##Foo##" would match, because it contains one or more word characters. With the ^ and $, then "##Foo##" would not match (which sounds like what you're looking for), because you don't have beginning-of-string followed by one-or-more-word-characters followed by end-of-string.
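The effect of the anchors can be checked with a few lines. This is only an illustrative sketch, and the class and method names are made up:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class AnchorDemo
{
    // Unanchored: succeeds if word characters appear anywhere in the input.
    public static bool ContainsWordChars(string s) => Regex.IsMatch(s, @"\w+");

    // Anchored: succeeds only if the whole input consists of word characters.
    public static bool IsOnlyWordChars(string s) => Regex.IsMatch(s, @"^\w+$");
}
```

ContainsWordChars("##Foo##") is true, whereas IsOnlyWordChars("##Foo##") is false and IsOnlyWordChars("Foo_1") is true.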
Try experimenting with something like http://www.weitz.de/regex-coach/ which lets you develop regex interactively.
It's designed for Perl, but helped me understand how a regex works in practice.
Regex packedasciiRegex = new Regex(@"^[!#$%&'()*+,-./:;?@[\]^_]*$");
