I want a Regular Expression for a word.otherword form. I tried \b[a-z]\.[a-z]\b, but it gives me an error at the \. part, saying Unrecognized escape sequence. Any idea what's wrong? I'm working under .NET C#. Thanks!
Later edit:
john.Smith or JoHn.SmItH or JOHN.SMITH should work.
John Smith or john!Smith or john.Smith.Smith shouldn't work.
Try this :
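foundMatch = Regex.IsMatch(SubjectString, @"\b[a-z]\.[a-z]\b");
You were probably not using the @ (verbatim string) prefix. Without it, the C# compiler treats \. as a string escape sequence, which is what produces the error.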
Your regex matches a.a, that is, a single character on each side of the dot. Since you want it to match complete words, you need a quantifier, e.g.
\b[a-z]+\.[a-z]+\b
Finally you may want to use the case insensitive match to allow for words with capital letters to be matched too :
foundMatch = Regex.IsMatch(SubjectString, @"\b[a-z]+\.[a-z]+\b", RegexOptions.IgnoreCase);
This will match all words.words with at least one character for each word regardless of capitalization.
This will match word.otherword only if it is preceded by whitespace or the start of the string, and followed by whitespace or the end of the string.
foundMatch = Regex.IsMatch(SubjectString, @"(?<=\s|^)\b[a-z]+\.[a-z]+\b(?=\s|$)", RegexOptions.IgnoreCase);
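To see how the two patterns treat the examples from the question, here is a minimal sketch. It assumes C# 9 or later (top-level statements); the variable names `strict`, `loose` and `samples` are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// word.otherword delimited by whitespace or the ends of the string.
var strict = new Regex(@"(?<=\s|^)\b[a-z]+\.[a-z]+\b(?=\s|$)", RegexOptions.IgnoreCase);
// word.otherword delimited only by word boundaries.
var loose = new Regex(@"\b[a-z]+\.[a-z]+\b", RegexOptions.IgnoreCase);

string[] samples =
{
    "john.Smith", "JoHn.SmItH", "JOHN.SMITH",      // should match
    "John Smith", "john!Smith", "john.Smith.Smith" // should not match
};

foreach (string s in samples)
{
    Console.WriteLine($"{s,-18} strict={strict.IsMatch(s)} loose={loose.IsMatch(s)}");
}
```

The two patterns agree on every sample except john.Smith.Smith: the word-boundary version still finds john.Smith inside it, while the lookaround version rejects it because a dot, not whitespace, follows the second word.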
Try this regex for word.word format:
@"\b([a-z]+)\.\1"
For word.otherword use this:
@"\b[a-z]+\.[a-z]+\b"
Related
Currently I am using this pattern: [HelloWorld]{1,}.
So if my input is Hello, it matches.
But if my input is WorldHello, it still matches, which is not right.
So how do I make the input string match exactly the value inside the pattern?
Just get rid of the square brackets, and the comma and you're good to go!
HelloWorld{1}
In regex what's between square brackets is a character set.
So [HelloWorld] matches 1 character that's in the set [edlorHW].
And {1,} means the same as +, so .{1,} and .+ both match 1 or more characters.
What you probably want is the literal word.
So the regex would simply be "HelloWorld".
That would match HelloWorld in the string "blaHelloWorldbla".
What if you want the word to be a word on its own, and not part of a longer word?
Then you could use word boundaries \b, which indicate the transition between a word character (\w, roughly [A-Za-z0-9_]; in .NET it also covers Unicode letters and digits by default) and a non-word character (\W), or between a word character and the beginning or end of the string.
For example @"\bHelloWorld\b" to get a match from "bla HelloWorld bla" but not from "blaHelloWorldbla".
Note that the regex string this time was preceded by @.
That makes it a verbatim string, so the backslashes don't have to be doubled.
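The points above (character class versus literal, word boundaries, verbatim strings) can be checked with a small sketch like the following. It assumes C# 9 or later for top-level statements, and all variable names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

// A verbatim string and a regular string with doubled backslashes hold the same pattern text.
string verbatim = @"\bHelloWorld\b";
string escaped = "\\bHelloWorld\\b";
bool samePattern = verbatim == escaped;

// Character class: one or more characters drawn from the set, in any order.
bool classAcceptsScrambled = Regex.IsMatch("WorldHello", "[HelloWorld]{1,}");
// Literal: only the exact sequence of characters.
bool literalAcceptsScrambled = Regex.IsMatch("WorldHello", "HelloWorld");

// Word boundaries reject the word when it is embedded in a longer word.
bool standalone = Regex.IsMatch("bla HelloWorld bla", verbatim);
bool embedded = Regex.IsMatch("blaHelloWorldbla", verbatim);

Console.WriteLine($"{samePattern} {classAcceptsScrambled} {literalAcceptsScrambled} {standalone} {embedded}");
// prints: True True False True False
```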
It seems you need an online regex tester to check your pattern. For example, you could find one of them here, and you could also study the C# regex reference here.
Try this pattern:
[a-zA-Z]{1,}
You can test it online
This is something simple but I cannot figure this out. I want to find a substring with this regex. It will match "M4N 3M5", but doesn't match the text below:
const string text = "asdf M4N 3M5 adsf";
Regex regex = new Regex(@"^[ABCEGHJKLMNPRSTVXY]{1}\d{1}[A-Z]{1} *\d{1}[A-Z]{1}\d{1}$", RegexOptions.None);
Match match = regex.Match(text);
string value = match.Value;
Try removing ^ and $:
Regex regex = new Regex(@"[ABCEGHJKLMNPRSTVXY]{1}\d{1}[A-Z]{1} *\d{1}[A-Z]{1}\d{1}", RegexOptions.None);
^ : The match must start at the beginning of the string or line.
$ : The match must occur at the end of the string or before \n at the end of the line or string.
If you want to match only in word boundaries you can use \b as suggested by Mike Strobel:
Regex regex = new Regex(@"\b[ABCEGHJKLMNPRSTVXY]{1}\d{1}[A-Z]{1} *\d{1}[A-Z]{1}\d{1}\b", RegexOptions.None);
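A short sketch of the difference between the anchored and the word-boundary version, using the text from the question. It assumes C# 9 or later (top-level statements); the names `core`, `anchored` and `bounded` are made up for the example, and the {1} quantifiers are left out since a token matches exactly once by default.

```csharp
using System;
using System.Text.RegularExpressions;

const string text = "asdf M4N 3M5 adsf";

// Same character classes as in the question, without the redundant {1}.
const string core = @"[ABCEGHJKLMNPRSTVXY]\d[A-Z] *\d[A-Z]\d";

// ^...$ : the whole input must be a postal code.
Match anchored = Regex.Match(text, "^" + core + "$");
// \b...\b : a postal code anywhere in the input, delimited by word boundaries.
Match bounded = Regex.Match(text, @"\b" + core + @"\b");

Console.WriteLine(anchored.Success); // False: the input has text around the code
Console.WriteLine(bounded.Value);    // M4N 3M5
```

The anchored form is still the right choice when validating a field that should contain nothing but the postal code.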
I know this question has been answered, but I have noticed two things in your pattern that I want to highlight:
There is no need to specify a single instance of a token explicitly.
For example (notice the missing {1}):
\d{1} = \d
[A-Z]{1} = [A-Z]
Also, I wouldn't recommend typing a literal <space> in your pattern; use '\s' instead (note that \s matches any whitespace character, not just a space). If the space is accidentally deleted, the mistake is hard to spot and the code will stop working.
Personally, for this case I would recommend \b, since it is the best fit here.
The regex below captures the -aaa and -ccc but not the -eee.
How can I do that?
keywords = "-aaa bbb -ccc -eee";
MatchCollection wordColNegEnd = Regex.Matches(keywords, @"-(.*?) ");
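Use a "word boundary" \b instead of a space; it matches at the end of the string as well as at a word/non-word transition. The quantifier needs to be +? rather than *?, because there is already a word boundary right after the dash and a lazy * would stop there with an empty capture:
Regex.Matches(keywords, @"-(.+?)\b");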
or, depending on what characters may be in the strings, just use "word characters" /\w/ to match the pattern:
Regex.Matches(keywords, @"-(\w+)");
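As a quick check, the following sketch compares the original pattern with the \w+ version on the string from the question. It assumes C# 9 or later (top-level statements); `Captures` is a hypothetical helper, not part of any library.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string keywords = "-aaa bbb -ccc -eee";

// Hypothetical helper: the text of capture group 1 for every match of the pattern.
string[] Captures(string pattern) =>
    Regex.Matches(keywords, pattern).Cast<Match>().Select(m => m.Groups[1].Value).ToArray();

string[] withSpace = Captures(@"-(.*?) ");    // original: requires a trailing space
string[] withWordChars = Captures(@"-(\w+)"); // word characters only, no trailing space needed

Console.WriteLine(string.Join(",", withSpace));     // aaa,ccc
Console.WriteLine(string.Join(",", withWordChars)); // aaa,ccc,eee
```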
MatchCollection wordColNegEnd = Regex.Matches(keywords, @"-(.+?)\b");
Word boundary is better than space, please give someone else upvotes though, since I brain farted the purpose of it.
I kept the ? from your original: it makes the quantifier lazy, so the match stops at the first word boundary instead of the last one. I changed * to + because * also allows zero characters, and the word boundary right after the dash would then end the match immediately.
Use
MatchCollection wordColNegEnd = Regex.Matches(keywords, @"-(.+?)\b");
Currently, your regex requires a trailing space after the capturing group. The strings "aaa" and "ccc" have this, but "eee" does not.
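The + in that pattern matters. A lazy .*? is allowed to match zero characters, and there is already a word boundary between the dash and the first letter, so the capture would come back empty. A minimal sketch of the difference, assuming C# 9 or later (top-level statements) and illustrative variable names:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string keywords = "-aaa bbb -ccc -eee";

// Lazy star: stops at the boundary right after the dash, capturing nothing.
string[] lazyStar = Regex.Matches(keywords, @"-(.*?)\b")
    .Cast<Match>().Select(m => m.Groups[1].Value).ToArray();

// Lazy plus: must consume at least one character, so it runs to the end of the word.
string[] lazyPlus = Regex.Matches(keywords, @"-(.+?)\b")
    .Cast<Match>().Select(m => m.Groups[1].Value).ToArray();

Console.WriteLine(lazyStar.All(s => s.Length == 0)); // True: every capture is empty
Console.WriteLine(string.Join(",", lazyPlus));       // aaa,ccc,eee
```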
Instead of matching any characters occurring after a dash, try matching non-space characters (greedily, since a lazy quantifier with nothing after it would match as little as possible):
@"-(\S+)"
keywords = "-aaa bbb -ccc -eee";
MatchCollection wordColNegEnd = Regex.Matches(keywords, @"-\w+");
You haven't specified what exactly you are trying to match here.
But if I understood it right, you want to match any alpha string that starts with -
Use this RegEx: -[a-z]+
How can I use a regular expression to get words that start with ! ? For example !Test.
I tried this but it doesn't give any matches:
@"\B\!\d+\b"
Although it did work when I replaced the ! with $.
I'd say that your regex was quite OK already, you just need to use \w (word character: letter, digit or underscore) instead of \d (digit):
@"\B!\w+\b"
will match any word that is immediately preceded by a ! unless that ! is itself directly preceded by a word character (that's what the \B asserts). Using a ^ instead will limit the matches to words that start at the beginning of the string (or of a line, with RegexOptions.Multiline), which might not be what you want.
So this will match all the words including exactly one preceding ! in this line:
!hello !this ...!will !!!be !matched!
but none of the words in this line:
this! won't!be matched!!!
You could also drop the \B altogether if you don't mind matching !that in this!that.
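The two sample lines above can be run through the pattern with a sketch like this one. It assumes C# 9 or later (top-level statements); `Find` is a hypothetical helper introduced only for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var bang = new Regex(@"\B!\w+\b");

// Hypothetical helper: all matched substrings in the input.
string[] Find(string input) =>
    bang.Matches(input).Cast<Match>().Select(m => m.Value).ToArray();

string[] hits = Find("!hello !this ...!will !!!be !matched!");
string[] misses = Find("this! won't!be matched!!!");

Console.WriteLine(string.Join(" ", hits)); // !hello !this !will !be !matched
Console.WriteLine(misses.Length);          // 0

// \d only matches digits, so the pattern from the question finds nothing in "!Test".
Console.WriteLine(Regex.IsMatch("!Test", @"\B!\d+\b")); // False
```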
This should work if the word is at the very start of the input: ^!\w+
MatchCollection matches = Regex.Matches(inputText, @"^!\w+");
foreach (Match match in matches)
{
    Console.WriteLine(match.Value);
}
What would a regex look like when I search for this string:
before CAN be many comment lines --------"Encrypted" after must come a newline.
This does not seem to work:
Regex pattern = new Regex(@"^[-]*$[Encrypted][\n]");
What am I doing wrong?
The pattern you're searching for is not entirely clear to me, nor are the rest of the contents you're searching in, but if you're really just looking for "Encrypted" directly followed by only a newline then this is all you need to do:
Regex r = new Regex(@"Encrypted\n");
EDIT
Ok, comments seem to suggest that you're looking for zero or more occurrences of "-", followed by "Encrypted", followed by a newline. In that case the following will work.
Regex r = new Regex(@"-*Encrypted\n");
If there should be at least one "-" before "Encrypted", it will be
Regex r = new Regex(@"-+Encrypted\n");
Try it without the square brackets and move the dollar $ to the end of the pattern, i.e.:
Regex pattern = new Regex(@"^-*Encrypted$");
Square brackets work like an "or": [Encrypted] is the same as saying 'E' or 'n' or 'c' or 'r' ... or 'd'.
The dollar symbol matches the end of the string. If other lines come before the dashes, add RegexOptions.Multiline so that ^ and $ match at the start and end of each line.
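To compare the patterns discussed so far, here is a minimal sketch. It assumes C# 9 or later (top-level statements), and the input string is a made-up example with one comment line before the marker; all variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: a comment line, then the marker line, then more text.
string text = "# some comment\n--------Encrypted\npayload";

// Pattern from the question: $ in the middle and a character class instead of the word.
bool original = Regex.IsMatch(text, @"^[-]*$[Encrypted][\n]");
// Unanchored: dashes, the literal word, then a newline.
bool unanchored = Regex.IsMatch(text, @"-*Encrypted\n");
// Anchored without Multiline: ^ and $ refer to the whole string.
bool anchoredDefault = Regex.IsMatch(text, @"^-*Encrypted$");
// Anchored with Multiline: ^ and $ refer to each line.
bool anchoredMultiline = Regex.IsMatch(text, @"^-*Encrypted$", RegexOptions.Multiline);

Console.WriteLine($"{original} {unanchored} {anchoredDefault} {anchoredMultiline}");
// prints: False True False True
```

If the input may use Windows line endings, a pattern such as -*Encrypted\r?\n would be one way to allow for the carriage return; that variant is an assumption about the data, not something stated in the question.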
I don't know C# regex specifically, but I think you must not use ^ because the dashes are not at the beginning of the input. And the $ has no business in the middle of the pattern.
So in PCRE I would write:
/-+"Encrypted" *\n/
That matches one or more -, followed by "Encrypted" (including the quotes), followed by zero or more spaces, followed by a newline.
@Sanju
Does this work? :P
Regex pattern = new Regex(@"-*Encrypted\n");
It did for me! You have to remove the "^" character.