Regex - Match a string only when it contains any alphabetic characters - c#

example strings
785*()&!~`a
##$%$~2343
455frt&*&*
I want to capture the first and the third, but not the second, since it doesn't contain any alphabetic characters. Please help.

In fact, I think [a-zA-Z] might suffice to match your strings.
To capture the whole thing, try: ^.*[a-zA-Z].*$

Here is one possible way:
.*[a-zA-Z]+

You should maybe clarify a bit what you mean by 'capturing': do you want the whole string or just the ASCII letters?
Also, you don't say whether it should match only the plain Roman alphabet (A to Z) or also Unicode letters, so that strings in other languages match too.
If you just need to test your string, in C# you would do:
bool matching = Regex.IsMatch(myString, "[a-zA-Z]");
You wouldn't need anything else, since just one letter anywhere in the myString string will match (according to your definition).
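As a quick check of that claim, here is a minimal sketch that runs the three example strings from the question through Regex.IsMatch. The class and method names (LetterCheck, HasAsciiLetter) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LetterCheck
{
    // True when the input contains at least one ASCII letter anywhere.
    public static bool HasAsciiLetter(string input) =>
        Regex.IsMatch(input, "[a-zA-Z]");

    public static void Main()
    {
        // The three example strings from the question.
        string[] samples = { "785*()&!~`a", "##$%$~2343", "455frt&*&*" };
        foreach (string s in samples)
        {
            Console.WriteLine($"{s} -> {HasAsciiLetter(s)}");
        }
    }
}
```

The first and third strings should report True and the second False, because IsMatch succeeds as soon as the pattern matches anywhere in the input.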

This is my favorite RegEx testing site: Javascript Regexp Tester and Cheat Sheet

If you want to match all letters (including non-ASCII ones), use \p{L} instead of [a-zA-Z]. See Unicode categories.
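To see the difference between the two character classes, a small sketch (the helper name HasAnyLetter is hypothetical; the non-ASCII letter is written as a \u escape so the example does not depend on the source file's encoding):

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeLetterCheck
{
    // True when the input contains at least one letter in any script.
    public static bool HasAnyLetter(string input) =>
        Regex.IsMatch(input, @"\p{L}");

    public static void Main()
    {
        string sample = "123 \u00E5"; // digits, a space, and the letter å

        Console.WriteLine(HasAnyLetter(sample));               // Unicode-aware check
        Console.WriteLine(Regex.IsMatch(sample, "[a-zA-Z]"));  // ASCII-only check
    }
}
```

Here only the \p{L} version should report a match, since å is a letter but lies outside a-z and A-Z.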

Related

Regular Expression for Alphanumeric characters with one special char in between

Can anyone please help me figure out a Regex attribute for a string field?
I want my string to be in the format FirstName#LastName, that's it. I require exactly one special char in between, and the rest should be alphabetic characters only.
You can use the expression [A-Za-z]+#[A-Za-z]+ to test against a nonempty string of alphabetical characters, followed by a # sign and again followed by a nonempty string of alphabetical characters.
If you want to accept any non-alphanumeric characters in the middle, like $,#,_,- etc, you can use the following,
[a-zA-Z]+[^a-zA-Z\d\s][a-zA-Z]+
it will match all these among others,
FirstName#LastName
FirstName-LastName
FirstName_LastName
FirstName$LastName
FirstName:LastName
If you want to match whitespace in between as well then simply remove \s from above expression.
Hope it helps.
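A minimal sketch of both expressions used for validation with Regex.IsMatch. One assumption here: the ^ and $ anchors are added so that the whole string has to fit the format, which the patterns as quoted in the answers do not enforce on their own. The class and method names are made up.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameFormat
{
    // Letters, exactly one '#', letters. Anchors are an addition for whole-string validation.
    private static readonly Regex HashOnly =
        new Regex(@"^[A-Za-z]+#[A-Za-z]+$");

    // Letters, exactly one character that is not a letter, digit or whitespace, letters.
    private static readonly Regex AnySeparator =
        new Regex(@"^[a-zA-Z]+[^a-zA-Z\d\s][a-zA-Z]+$");

    public static bool IsHashSeparated(string s) => HashOnly.IsMatch(s);

    public static bool IsSeparated(string s) => AnySeparator.IsMatch(s);

    public static void Main()
    {
        string[] samples = { "FirstName#LastName", "FirstName-LastName", "First##Last", "First Last" };
        foreach (string s in samples)
        {
            Console.WriteLine($"{s}: hash={IsHashSeparated(s)}, any={IsSeparated(s)}");
        }
    }
}
```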

Pattern matching for swedish character

I need help with a regular expression.
I have to match string like this:
âãa34dc
The pattern that I have used:
\s*[a-zA-Z]+[a-zA-Z_0-9]*\s
but this pattern is not good enough to identify this kind of string, e.g. âãa34dc
P.S. âã are Swedish characters.
Please help me find the correct pattern for this kind of string.
Do you actually want to restrict it to Swedish characters? In other words, should a German character not match? If so, then you'll probably have to enumerate the whole alphabet, and include that.
If what you really want is to match every alphabetic character, use the regular expression terms for matching all letters.
\w matches any word character, but that includes numbers & some punctuation. That's close, but not exactly what you want for your second term.
For the first term, where you don't want to include numbers, specifying that the character should be a Unicode 'letter' class will work. \p{L} specifies all Unicode characters that are a letter. This includes [a-zA-Z], and all the Swedish characters, and German, and Russian, etc.
Therefore, I think this regular expression is what you want:
\s*[\p{L}][\p{L}_0-9]*\s
If you want to include digits from other character sets, and some other punctuation, then you can use [\w]* for the second term.
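A minimal sketch of the \p{L}-based pattern applied to the string from the question. Two assumptions that are not in the answer: the pattern is anchored with ^ and $ so the whole input is checked, and the trailing whitespace is made optional. The class name WordPattern is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordPattern
{
    // One Unicode letter, then any mix of letters, underscores and ASCII digits,
    // with optional surrounding whitespace.
    private static readonly Regex Pattern =
        new Regex(@"^\s*\p{L}[\p{L}_0-9]*\s*$");

    public static bool IsMatch(string s) => Pattern.IsMatch(s);

    public static void Main()
    {
        // "\u00E2\u00E3a34dc" is the string from the question, written with
        // escapes so the result does not depend on the source file's encoding.
        Console.WriteLine(IsMatch("\u00E2\u00E3a34dc"));
        Console.WriteLine(IsMatch("34dc")); // starts with a digit
    }
}
```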
Please give a set of rules.
Going by your question:
[X-Ya-zA-Z]{3}[0-9]{2}[a-zA-Z]{2}
Replace X with the first Swedish letter.
Replace Y with the last Swedish letter.
John Machin provides a great answer for this. Adapting his pattern, what you need is probably something similar to: \s*[^\W\d_]\w*\s*
P.S. I removed the + quantifier from your first part. Any subsequent letters would be matched by the subsequent quantified \w.
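The [^\W\d_] trick can also be used to pull letter-led tokens out of a longer text. The sketch below is an illustration only: the leading \b and the sample text are assumptions added here, and the class name is made up.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LetterLedTokens
{
    // [^\W\d_] reads as "a word character that is neither a digit nor an underscore".
    // The leading \b keeps a match from starting in the middle of a word.
    private static readonly Regex Token = new Regex(@"\b[^\W\d_]\w*");

    public static string[] Extract(string text) =>
        Token.Matches(text).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        // \u00E2\u00E3 are the two non-ASCII characters from the question.
        foreach (string token in Extract("x1 \u00E2\u00E3a34dc 9z _y"))
        {
            Console.WriteLine(token);
        }
    }
}
```

With this input, only x1 and âãa34dc should come back: 9z starts with a digit and _y starts with an underscore.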

Regex setting word characters and matching exact word

I need my C# regex to only match full words, and I need to make sure that +-/*() delimit words as well (I'm not sure if the last part is already set that way.) I find regexes very confusing and would like some help on the matter.
Currently, my regex is:
public Regex codeFunctions = new Regex("draw_line|draw_rectangle|draw_circle");
Thank you! :)
Try
public Regex codeFunctions = new Regex(@"\b(draw_line|draw_rectangle|draw_circle)\b");
The \b means match a word boundary, i.e. a transition from a non-word character to a word character (or vice versa).
Word characters include alphabet characters, digits, and the underscore symbol. Non-word characters include everything else, including +-/*(), so it should work fine for you.
See the Regex Class documentation for more details.
The @ at the start of the string makes the string a verbatim string, otherwise you have to type two backslashes to make one backslash.
Do you want to match any words, or just the words listed above? To match an arbitrary word, substitute this for the bit that creates the Regex object:
new Regex(@"\b(\w+)\b");
In the future, if you want more characters to be treated as whitespace (for example, underscores), I would recommend String.Replace-ing them to a space character. There may be a clever way to get the same effect with regular expressions, but personally I think it would be too clever. The String.Replace version is obvious.
Also, I can't help but recommend that you read up on regular expressions. Yes, they look like line noise until you get used to them, but once you do they're convenient and there are plenty of good resources out there to help you.
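A small sketch of the \b-wrapped alternation from the first answer, showing that operators and parentheses delimit the function names while letters, digits and underscores do not. The Find helper and the sample inputs are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CodeFunctions
{
    // Verbatim string (@"...") so that \b reaches the regex engine as written.
    public static readonly Regex Pattern =
        new Regex(@"\b(draw_line|draw_rectangle|draw_circle)\b");

    // Returns every whole-word occurrence of one of the three function names.
    public static string[] Find(string code) =>
        Pattern.Matches(code).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        // '+', '(' and '*' are non-word characters, so they act as delimiters.
        Console.WriteLine(string.Join(", ", Find("x+draw_line(1)*draw_circle(2)")));

        // Letters, digits and underscores are word characters, so none of these count.
        Console.WriteLine(Find("mydraw_line draw_line2 draw_line_x").Length);
    }
}
```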

How Can I Check If a C# Regular Expression Is Trying to Match 1-(and-only-1)-Character Strings?

Maybe this is a very rare (or even dumb) question, but I do need it in my app.
How can I check if a C# regular expression is trying to match 1-character strings?
That means I only allow the users to search for 1-character strings. If the user tries to search for multi-character strings, an error message will be displayed.
Did I make myself clear?
Thanks.
Peter
P.S.: I saw an answer about calculating the final matched strings' length, but for some unknown reason, the answer is gone.
I thought about it for a while, and I think calculating the final matched strings' length is okay, though it's going to be kind of slow.
Yet, the original question is very rare and tedious.
A regexp would be .{1}
This will allow any char, though. If you only want alphanumeric characters, you can use [a-z0-9]{1} or the shorthand \w{1} (which also accepts the underscore).
Another option is to limit the number of chars a user can type in an input field: set a maxlength on it.
Yet another option is to save the form's input field to a char and not a string, although you may need some handling around this to prevent errors.
Why not use maxlength and save to a char?
You can look for unescaped *, +, {}, ? etc. and count the number of characters (don't forget to flatten the [] as one character).
Basically you have to parse your regex.
Instead of validating the regular expression, which could be complicated, you could apply it only on single characters instead of the whole string.
If this is not possible, you may want to limit the possibilities of regular expression to some certain features. For instance the user can only enter characters to match or characters to exclude. Then you build up the regex in your code.
eg:
ABC matches [ABC]
^ABC matches [^ABC]
A-Z matches [A-Z]
# matches [0-9]
\w matches \w
AB#x-z matches [AB]|[0-9]|[x-z]|\w
which cases do you need to support?
This would be somewhat easy to parse and validate.
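The first suggestion in this answer, applying the user's pattern to single characters rather than to the whole string, could be sketched roughly as follows. This is only one reading of that suggestion: the pattern is wrapped in \A(?:...)\z so it has to cover exactly the one character being tested, and the class and method names are hypothetical. It treats each UTF-16 char separately, so surrogate pairs are not handled.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SingleCharSearch
{
    // Returns the characters of text that the user's pattern matches
    // when each character is tested on its own.
    public static char[] MatchingChars(string userPattern, string text)
    {
        // \A and \z force the pattern to span the whole one-character input.
        var whole = new Regex(@"\A(?:" + userPattern + @")\z");
        return text.Where(c => whole.IsMatch(c.ToString())).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(new string(MatchingChars("[a-c]", "abcd")));

        // A pattern that needs two characters can never match a single one.
        Console.WriteLine(MatchingChars("ab", "abab").Length);
    }
}
```

With this approach a multi-character pattern is not rejected up front; it simply never matches, so an empty result could be used as the trigger for the error message.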

C# Regex - How to parse string for Swedish letters åäöÅÄÖ?

I'm trying to parse an HTML file for strings in this format:
MyUsername O22</td>
I want to retrieve "305157", "MyUsername" and the first letter of "O22" (which can be either T, K or O).
I'm using this regex: \w* \w\d\d and it works fine, as long as there aren't any åäöÅÄÖ's where the "\w" are.
What should I do?
You can use a character class which specifically includes those things:
[\wåäöÅÄÖ]*
Or you can use the Unicode character class for letters:
\p{L}
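or a named Unicode block (in .NET the prefix is Is rather than In; note that the Basic Latin block covers only ASCII, while åäöÅÄÖ sit in the Latin-1 Supplement block):
\p{IsBasicLatin}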
You can use \p{L} to match any 'letter', which will support all letters in all languages, as suggested in this SO question.
Or, you can simply replace \w* with [^<]*, to match all characters that are not the opening of an HTML tag.
But as said by others, parsing HTML using regex is a first step towards insanity...
Firstly: DON'T USE REGULAR EXPRESSIONS TO PARSE HTML. USE AN HTML PARSER.
Secondly: if you really want to do this (and you don't) then instead of \w you could match any character apart from '<':
[^<]* \w\d\d
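If a regex is used anyway, the "anything but a tag character" idea could look roughly like the sketch below. Several things here are assumptions: the sample input (the original markup is not shown in full), the use of [^<>] instead of [^<] so that a preceding tag is not swallowed into the name, the [TKO] class taken from the question's description, and the names CellParser and TryParse. As far as I know, \w in .NET already matches letters such as å unless RegexOptions.ECMAScript is set, so it may also be worth ruling out an encoding problem when the file is read.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CellParser
{
    // name: any run of characters that are not tag delimiters
    // kind: the single letter T, K or O that precedes the two digits
    private static readonly Regex Cell =
        new Regex(@"(?<name>[^<>]*) (?<kind>[TKO])\d\d</td>");

    public static bool TryParse(string html, out string name, out char kind)
    {
        Match m = Cell.Match(html);
        name = m.Success ? m.Groups["name"].Value : null;
        kind = m.Success ? m.Groups["kind"].Value[0] : '\0';
        return m.Success;
    }

    public static void Main()
    {
        // Hypothetical cell content; \u00F6 is ö.
        if (TryParse("<td>Bj\u00F6rn O22</td>", out string name, out char kind))
        {
            Console.WriteLine($"{name} / {kind}");
        }
    }
}
```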
