How do I use, for example, this line:
Regex.Matches(str, @"[a-zA-Z]");
so that instead of str I have a char?
It seems that you want to test if a character is a letter. You do not need a regular expression to do that. Instead you can use the following:
var isLetter = Char.IsLetter(ch);
However, this returns true for all Unicode letters, not only A-Z, e.g. accented letters like É and other letters like Æ and 你.
If you want to only test for A-Z (upper and lower case) you can use this simple test:
var upperCaseCh = Char.ToUpperInvariant(ch);
var isLetter = 'A' <= upperCaseCh && upperCaseCh <= 'Z';
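To make the difference between the two checks concrete, here is a minimal sketch. The class and method names (LetterChecks, IsAsciiLetter) are made up for illustration, and it uses plain range comparisons rather than a case conversion.

```csharp
using System;

public static class LetterChecks
{
    // Hypothetical helper: true only for 'A'-'Z' and 'a'-'z'.
    public static bool IsAsciiLetter(char ch)
    {
        return (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
    }

    // Thin wrapper around the built-in check, which accepts any Unicode letter.
    public static bool IsAnyLetter(char ch)
    {
        return Char.IsLetter(ch);
    }
}
```

With these helpers, LetterChecks.IsAnyLetter('É') is true while LetterChecks.IsAsciiLetter('É') is false; both return true for 'a' and false for '1'.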
I'd rather use the static functions of the char class or use the comparison operators, i.e.
var test = 'a' <= c && c <= 'z';
The static methods can give you the character class, e.g. letter, digit or whitespace.
You can call ToString on the character and then use the result like this:
char ch = 'c';
Regex.Matches(ch.ToString(), @"[a-zA-Z]");
Related
Can somebody please help me write a regular expression that checks whether a string contains more than one occurrence of an uppercase (or lowercase, it doesn't matter) letter, but not necessarily in a row?
I need at least two (or, even better, n) occurrences in a string. If n=2, valid inputs would be PassWord, PAssword or PASSWord.
When I tried /(?=([A-Z]{2,3}))/g, it matched PassWOrd but not PassWord.
What is strange to me is that it also matched PaSSWOrd. I thought the 3 in {2,3} means that no more than 3 uppercase characters will be matched. Why is SSWO matched?
I tried similar variations but none of them worked for me (not surprising, as I'm not very familiar with regular expressions).
Can this be done using a regular expression?
The (?=([A-Z]{2,3})) regex matches 2 to 3 consecutive uppercase ASCII letters anywhere inside a string. You want to match a string that only contains 2 to 3 uppercase ASCII letters, not necessarily consecutively.
To match a string that only contains two uppercase ASCII letters (no more no less), use the following expression:
^(?:[^A-Z]*[A-Z]){2}[^A-Z]*$
Or, if you only allow ASCII letters in the whole string:
^(?:[a-z]*[A-Z]){2}[a-z]*$
Pattern details
^ - start of string
(?:[^A-Z]*[A-Z]){2} - exactly 2 consecutive occurrences of
[^A-Z]* - zero or more chars other than ASCII uppercase letters
[A-Z] - one ASCII uppercase letter
[^A-Z]* - zero or more chars other than ASCII uppercase letters
$ - end of string.
In C#, use
var strs = new List<string> { "PassWord", "PAssword", "PASSWord"};
var n = 2;
var pat = $@"^(?:[^A-Z]*[A-Z]){{{n}}}[^A-Z]*$";
foreach (var s in strs) {
Console.WriteLine("{0}: {1}", s, Regex.IsMatch(s, pat));
}
Result:
PassWord: True
PAssword: True
PASSWord: False
Note that in case you need to require 2 uppercase ASCII letters in a string where other chars can be any chars, you do not need a regex, use LINQ:
var strs = new List<string> { "PassWord", "PAssword", "PASSWord"};
var n = 2;
foreach (var s in strs) {
var res = s.Count(c => (c >= 65 && c <= 90));
Console.WriteLine("{0}: {1}", s, res == 2);
}
The .Count(c => (c >= 65 && c <= 90)) part counts the uppercase ASCII letters anywhere in the string, and res == 2 yields a boolean that tells whether that count equals 2. It can easily be adjusted for a numeric range check.
If you need Unicode compatibility, replace .Count(c => (c >= 65 && c <= 90)) with .Count(Char.IsUpper).
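The numeric range check mentioned above could look roughly like this. It is only a sketch: UpperCaseCounter, CountAsciiUpper and HasUpperCountInRange are hypothetical names, and the inclusive min/max bounds are an assumption about what a "range check" should mean here.

```csharp
using System;
using System.Linq;

public static class UpperCaseCounter
{
    // Counts ASCII uppercase letters ('A'-'Z') anywhere in the string.
    public static int CountAsciiUpper(string s)
    {
        return s.Count(c => c >= 'A' && c <= 'Z');
    }

    // True when the count lies within [min, max], both bounds inclusive.
    public static bool HasUpperCountInRange(string s, int min, int max)
    {
        int count = CountAsciiUpper(s);
        return count >= min && count <= max;
    }
}
```

For example, UpperCaseCounter.HasUpperCountInRange("PassWord", 2, 3) is true (two uppercase letters), while "PASSWord" (five) and "password" (none) both fall outside the range.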
How can I check whether a TextBox contains only numbers, letters, and special letters like "õ, ä, ö, ü"?
I use this code to check for numbers and letters:
Regex.IsMatch(Value, "^[a-z0-9]+$", RegexOptions.IgnoreCase)
How can I check if textbox contains numbers and letters only,
bool isValid = textBox.Text.All(char.IsLetterOrDigit);
Consider the following example:
string str = "Something123õäö";
bool isValid = str.All(char.IsLetterOrDigit);
You will get true for the above case.
Does "How can you strip non-ASCII characters from a string? (in C#)" contain any pointers?
You could include Unicode characters using the \uXXXX syntax within the regex for any additional letters you specifically want to test for.
Regex.IsMatch(Value, "^[a-z0-9\u00c0-\u00f6]+$", RegexOptions.IgnoreCase)
Just check every char with char.GetUnicodeCategory for letters and digits, and compare it against a list of any other chars you want to allow:
var allowed = new[] { 'ö', 'ä' };
var isOK = textBox1.Text.All(c =>
char.GetUnicodeCategory(c) == UnicodeCategory.LowercaseLetter ||
char.GetUnicodeCategory(c) == UnicodeCategory.UppercaseLetter ||
char.GetUnicodeCategory(c) == UnicodeCategory.DecimalDigitNumber ||
allowed.Contains(c));
UnicodeCategory.LowercaseLetter and UnicodeCategory.UppercaseLetter cover all Unicode lowercase and uppercase letters (not just 'a'..'z' and 'A'..'Z', so 'ö' and 'ä' already pass without the allowed array), and UnicodeCategory.DecimalDigitNumber covers decimal digits. Together with a customized allowed array for any other characters, this should take care of everything you want to accept.
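As a rough, reusable version of the category check, the sketch below wraps it in a method that takes the extra allowed characters as parameters. TextValidator and IsLettersDigitsOr are illustrative names, not part of any library.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class TextValidator
{
    // True when every character is a cased Unicode letter, a decimal digit,
    // or one of the explicitly allowed extra characters.
    public static bool IsLettersDigitsOr(string text, params char[] extraAllowed)
    {
        return text.All(c =>
        {
            UnicodeCategory cat = char.GetUnicodeCategory(c);
            return cat == UnicodeCategory.LowercaseLetter
                || cat == UnicodeCategory.UppercaseLetter
                || cat == UnicodeCategory.DecimalDigitNumber
                || extraAllowed.Contains(c);
        });
    }
}
```

TextValidator.IsLettersDigitsOr("Something123õäö") returns true without any extra characters, because 'õ', 'ä' and 'ö' are in the LowercaseLetter category. An underscore is rejected unless it is passed explicitly, as in TextValidator.IsLettersDigitsOr("a_b", '_').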
If you want to validate all "word characters", just use \w. To check that a whole string consists only of word characters, use the regex ^\w+$; \w already covers digits (and the underscore), so the alternation with \d is not needed.
I want to check if a string only contains correct letters.
I used Char.IsLetter for this.
My problem is that chars like é or á are also reported as correct letters, which they shouldn't be.
Is there a way to check whether a char is a correct letter A-Z or a-z, without special letters like á?
bool IsEnglishLetter(char c)
{
return (c>='A' && c<='Z') || (c>='a' && c<='z');
}
You can make this an extension method:
static bool IsEnglishLetter(this char c) ...
You can use Char.IsLetter(c) && c < 128 . Or just c < 128 by itself, that seems to match your problem the closest.
But you are solving an Encoding issue by filtering chars. Do investigate what that other application understands exactly.
It could be that you should just be writing with Encoding.GetEncoding(someCodePage).
You can use the regular expression [a-zA-Z] for this (\w is too broad here: it also matches digits, the underscore and non-English letters such as á):
// Create the regular expression
string pattern = @"^[a-zA-Z]+$";
Regex regex = new Regex(pattern);
// Compare a string against the regular expression
return regex.IsMatch(stringToTest);
In C# 9.0 you can use pattern matching enhancements.
public static bool IsLetter(this char c) =>
c is >= 'a' and <= 'z' or >= 'A' and <= 'Z';
As of .NET 7 there is a Char.IsAsciiLetter method, which meets the requirements exactly:
https://learn.microsoft.com/en-za/dotnet/api/system.char.isasciiletter?view=net-7.0
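A whole-string check built on that method might look like the sketch below. It assumes a .NET 7 or later target; the wrapper names (AsciiLetterText, IsAllAsciiLetters) and the decision to reject empty strings are illustrative choices, not something the question specifies.

```csharp
using System;
using System.Linq;

public static class AsciiLetterText
{
    // Requires .NET 7 or later, where Char.IsAsciiLetter was introduced.
    // Assumption: an empty string does not count as "only letters".
    public static bool IsAllAsciiLetters(string text)
    {
        return text.Length > 0 && text.All(char.IsAsciiLetter);
    }
}
```

AsciiLetterText.IsAllAsciiLetters("Hello") is true, while "Héllo" and "abc1" are rejected.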
Use LINQ for easy access:
if (yourString.All(char.IsLetter))
{
// only letters are accepted (note: char.IsLetter accepts any Unicode letter, not just A-Z)
}
I want to take a string and check the first character for being a letter, upper or lower doesn't matter, but it shouldn't be special, a space, a line break, anything. How can I achieve this in C#?
Try the following
string str = ...;
bool isLetter = !String.IsNullOrEmpty(str) && Char.IsLetter(str[0]);
Try the following
bool isValid = char.IsLetter(name.FirstOrDefault());
return (myString[0] >= 'A' && myString[0] <= 'Z') || (myString[0] >= 'a' && myString[0] <= 'z');
You should look up the ASCII table, a table which systematically maps characters to integer values. All lower-case characters are sequential (97-122), as are all upper-case characters (65-90). Knowing this, you do not even have to cast to the int values, just check if the first char of the string is within one of those two ranges (inclusive).
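Putting the range comparison together with a guard for null or empty input gives something like the following sketch. FirstCharCheck and StartsWithAsciiLetter are hypothetical names, and returning false for null or empty strings is an assumption about the desired behaviour.

```csharp
using System;

public static class FirstCharCheck
{
    // True when the string starts with 'A'-'Z' or 'a'-'z'.
    // Null and empty strings return false instead of throwing on s[0].
    public static bool StartsWithAsciiLetter(string s)
    {
        if (string.IsNullOrEmpty(s))
        {
            return false;
        }

        char first = s[0];
        return (first >= 'A' && first <= 'Z') || (first >= 'a' && first <= 'z');
    }
}
```

FirstCharCheck.StartsWithAsciiLetter("hello") is true; " hello", "1abc", "énorme" and the empty string all return false.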
I read a string from the console. How do I make sure it only contains English characters and digits?
Assuming that by "English characters" you are simply referring to the 26-letter Latin alphabet, this would be an area where I would use regular expressions: ^[a-zA-Z0-9]*$
For example:
if( Regex.IsMatch(Console.ReadLine(), "^[a-zA-Z0-9]*$") )
{ /* your code */ }
The benefit of regular expressions in this case is that all you really care about is whether or not a string matches a pattern, which is where regular expressions work wonderfully. The pattern clearly captures your intent, and it's easy to extend if your definition of "English characters" expands beyond just the 26 alphabetic ones.
There's a decent series of articles here that teach more about regular expressions.
Jørn Schou-Rode's answer provides a great explanation of how the regular expression presented here works to match your input.
You could match it against this regular expression: ^[a-zA-Z0-9]*$
^ matches the start of the string (i.e. no characters are allowed before this point)
[a-zA-Z0-9] matches any letter from a-z in lower or upper case, as well as digits 0-9
* lets the previous match repeat zero or more times
$ matches the end of the string (i.e. no characters are allowed after this point)
To use the expression in a C# program, you will need to import System.Text.RegularExpressions and do something like this in your code:
bool match = Regex.IsMatch(input, "^[a-zA-Z0-9]*$");
If you are going to test a lot of lines against the pattern, you might want to compile the expression:
Regex pattern = new Regex("^[a-zA-Z0-9]*$", RegexOptions.Compiled);
for (int i = 0; i < 1000; i++)
{
string input = Console.ReadLine();
pattern.IsMatch(input);
}
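The accepted answer does not allow white space or punctuation. The code below was tested with this input:
Hello: 1. - a; b/c \ _(5)??
(Is English)
// Note: inside the character class, " -_" is a range from space (U+0020) to '_' (U+005F); that range is what admits the punctuation.
Regex regex = new Regex("^[a-zA-Z0-9. -_?]*$");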
string text1 = "سلام";
bool fls = regex.IsMatch(text1); //false
string text2 = "123 abc! ?? -_)(/\\;:";
bool tru = regex.IsMatch(text2); //true
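Another way is to reject any character for which IsLower, IsUpper, IsDigit and IsWhiteSpace all return false. Note that IsLower and IsUpper are also true for accented letters such as é, so this filters out caseless scripts such as Arabic rather than everything outside A-Z.
Something like: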
private bool IsAllCharEnglish(string Input)
{
foreach (var item in Input.ToCharArray())
{
if (!char.IsLower(item) && !char.IsUpper(item) && !char.IsDigit(item) && !char.IsWhiteSpace(item))
{
return false;
}
}
return true;
}
To use it:
string str = "فارسی abc";
IsAllCharEnglish(str); // return false
str = "These are english 123";
IsAllCharEnglish(str); // return true
Do not use Regex or LINQ for this; they are slower than a plain loop over the characters of the string.
Performance test
My solution:
private static bool is_only_eng_letters_and_digits(string str)
{
foreach (char ch in str)
{
if (!(ch >= 'A' && ch <= 'Z') && !(ch >= 'a' && ch <= 'z') && !(ch >= '0' && ch <= '9'))
{
return false;
}
}
return true;
}
Do you have web access? I assume that cannot be guaranteed, but Google has a language API that will detect the language of the text you pass to it.
bool onlyEnglishCharacters = !EnglishText.Any(a => a > '~');
Seems cheap, but it worked for me, legit easy answer.
Hope it helps anyone.
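bool AllAscii(string str)
{
// c < 128 restricts the check to ASCII; Char.IsLetterOrDigit alone also accepts non-ASCII letters such as 'é'.
return str.All(c => c < 128 && Char.IsLetterOrDigit(c));
}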
Something like this (if you want to control input):
static string ReadLettersAndDigits() {
StringBuilder sb = new StringBuilder();
ConsoleKeyInfo keyInfo;
while ((keyInfo = Console.ReadKey(true)).Key != ConsoleKey.Enter) {
char c = char.ToLower(keyInfo.KeyChar);
if (('a' <= c && c <= 'z') || char.IsDigit(c)) {
sb.Append(keyInfo.KeyChar);
Console.Write(keyInfo.KeyChar); // echo the same character that was stored
}
}
return sb.ToString();
}
If you don't want to use Regex, an alternative is to check the character code of each character: if it lies within the ranges below, it is either an English letter or a digit (this might not be the best solution):
foreach (char ch in str)
{
int x = (int)ch;
if ((x >= 65 && x <= 90) || (x >= 97 && x <= 122))
{
// this is an English letter, i.e. A-Z or a-z
}
else if (x >= 48 && x <= 57)
{
// this is a digit
}
else
{
// this is something different
}
}
http://en.wikipedia.org/wiki/ASCII for full ASCII table.
But I still think, RegEx is the best solution.
I agree with the regular expression answers. However, you could simplify the pattern to just "^\w+$". \w is any "word character"; in .NET it matches Unicode letters and digits by default and only translates to [a-zA-Z_0-9] when RegexOptions.ECMAScript is set. I don't know if you want underscores as well.
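The effect of the regex options on \w can be seen with a small sketch like this one. WordCharDemo and its two method names are invented for the example; RegexOptions.ECMAScript is the standard .NET option that restricts \w to [a-zA-Z_0-9].

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordCharDemo
{
    // Default .NET behaviour: \w also matches non-ASCII letters and digits.
    public static bool IsWordCharsUnicode(string s)
    {
        return Regex.IsMatch(s, @"^\w+$");
    }

    // ECMAScript-compliant behaviour: \w is equivalent to [a-zA-Z_0-9].
    public static bool IsWordCharsAscii(string s)
    {
        return Regex.IsMatch(s, @"^\w+$", RegexOptions.ECMAScript);
    }
}
```

Both methods accept "abc_123", but only IsWordCharsUnicode accepts "õäö".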
More on regexes in .net here: http://msdn.microsoft.com/en-us/library/ms972966.aspx#regexnet_topic8
As many pointed out, the accepted answer works only if the string is a single word. Since no answer covers multiple words or whole sentences, here is code that handles them. The expression is true when the string contains at least one letter outside the ASCII range, i.e. when it is NOT English-only:
stringToCheck.Any(x=> char.IsLetter(x) && !((int)x >= 63 && (int)x <= 126));
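A stricter variant for whole sentences is to require every character to be printable ASCII. This is only a sketch under that assumption (space through '~', so tabs and line breaks are rejected as well); SentenceCheck and IsPrintableAscii are illustrative names.

```csharp
using System;
using System.Linq;

public static class SentenceCheck
{
    // True when every character is printable ASCII:
    // space (U+0020) through '~' (U+007E), inclusive.
    public static bool IsPrintableAscii(string text)
    {
        return text.All(c => c >= ' ' && c <= '~');
    }
}
```

SentenceCheck.IsPrintableAscii("These are english 123") is true, while a string containing Persian or accented letters, such as "فارسی abc", is rejected.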
<?php
$string="हिन्दी";
$string="Manvendra Rajpurohit";
echo strlen($string); echo '<br>';
echo mb_strlen($string, 'utf-8');
echo '<br>';
if(strlen($string) != mb_strlen($string, 'utf-8'))
{
echo "Please enter English words only:(";
}
else {
echo "OK, English Detected!";
}
?>