regex between characters including end - c#

The regex below captures -aaa and -ccc but not -eee.
How can I capture that one as well?
keywords = "-aaa bbb -ccc -eee";
MatchCollection wordColNegEnd = Regex.Matches(keywords, @"-(.*?) ");

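Use a "word boundary" /\b/ instead of a space; it matches at the end of the string as well as at any word/non-word boundary. Make the capture at least one character long (.+? rather than .*?), because there is also a word boundary right after the -, where a lazy .*? would stop with an empty capture:
Regex.Matches(keywords, @"-(.+?)\b");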
or, depending on what characters may be in the strings, just use "word characters" /\w/ to match the pattern:
Regex.Matches(keywords, @"-(\w+)");
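As a quick check of the \w+ variant, here is a minimal sketch. It is written with top-level statements (so it assumes C# 9 or later), and the variable name negated is made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string keywords = "-aaa bbb -ccc -eee";

// -(\w+) : a literal dash followed by one or more word characters.
// No trailing space is required, so the last keyword is found too.
// "negated" is a hypothetical name, not from the original question.
string[] negated = Regex.Matches(keywords, @"-(\w+)")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

Console.WriteLine(string.Join(",", negated)); // aaa,ccc,eee
```

Group 1 holds the keyword without the dash; use m.Value instead if the dash should be kept.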

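MatchCollection wordColNegEnd = Regex.Matches(keywords, @"-(.+?)\b");
A word boundary is better than a space. Please give someone else the upvotes though, since I blanked on what it was for.
Also, I wasn't sure why you had a ? in your original, so I left it in. It turns out to matter: the ? makes the quantifier lazy, and without it a greedy .* would run on to the last word boundary in the string. I used + rather than * because there is already a word boundary right after the -, where a lazy .*? would stop with an empty capture.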

Use
MatchCollection wordColNegEnd = Regex.Matches(keywords, @"-(.+?)\b");

Currently, your regex requires a trailing space after the capturing group. The strings "aaa" and "ccc" have this, but "eee" does not.
Instead of matching any characters occurring after a dash, try matching non-space characters:
@"-(\S+)"

keywords = "-aaa bbb -ccc -eee";
MatchCollection wordColNegEnd = Regex.Matches(keywords, @"-\w+");

You haven't specified what exactly you are trying to match here.
But if I understood it right, you want to match any alpha string that starts with -
Use this RegEx: -[a-z]+

Related

Regex.Match() won't match a substring

This is something simple, but I cannot figure it out. I want to find a substring with this regex. It will match "M4N 3M5" on its own, but doesn't match in the text below:
const string text = "asdf M4N 3M5 adsf";
Regex regex = new Regex(@"^[ABCEGHJKLMNPRSTVXY]{1}\d{1}[A-Z]{1} *\d{1}[A-Z]{1}\d{1}$", RegexOptions.None);
Match match = regex.Match(text);
string value = match.Value;
Try removing ^ and $:
Regex regex = new Regex(@"[ABCEGHJKLMNPRSTVXY]{1}\d{1}[A-Z]{1} *\d{1}[A-Z]{1}\d{1}", RegexOptions.None);
^ : The match must start at the beginning of the string or line.
$ : The match must occur at the end of the string or before \n at the end of the line or string.
If you want to match only in word boundaries you can use \b as suggested by Mike Strobel:
Regex regex = new Regex(@"\b[ABCEGHJKLMNPRSTVXY]{1}\d{1}[A-Z]{1} *\d{1}[A-Z]{1}\d{1}\b", RegexOptions.None);
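To see the difference between the anchored and the word-boundary version, here is a small sketch, assuming C# 9 or later for the top-level statements. The names core, anchored and bounded are made up for illustration, and the redundant {1} quantifiers are dropped, which does not change what the pattern matches.

```csharp
using System;
using System.Text.RegularExpressions;

const string text = "asdf M4N 3M5 adsf";

// Hypothetical helper constant: the postal-code pattern without anchors.
const string core = @"[ABCEGHJKLMNPRSTVXY]\d[A-Z] *\d[A-Z]\d";

// Anchored: the whole input would have to be a postal code.
bool anchored = Regex.IsMatch(text, "^" + core + "$");

// Word boundaries: the code may sit anywhere, but not inside a longer word.
Match bounded = Regex.Match(text, @"\b" + core + @"\b");

Console.WriteLine(anchored);      // False
Console.WriteLine(bounded.Value); // M4N 3M5
```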
I know this question has been answered, but I noticed two things in your pattern that I want to highlight:
There is no need to specify a single instance of a token.
For example (notice the missing {1}):
\d{1} = \d
[A-Z]{1} = [A-Z]
Also, I wouldn't recommend putting a literal <space> in your pattern; use '\s' instead. If the space is deleted by mistake, the error is hard to spot and the code will stop working.
Personally, for this case I would recommend \b, since it is the best fit here.

Replace with wildcards

I need some advice. Suppose I have the following string: Read Variable
I want to find all pieces of text like this in a string and turn each of them into the following: Variable = MessageBox.Show. Some additional examples:
"Read Dog" --> "Dog = MessageBox.Show"
"Read Cat" --> "Cat = MessageBox.Show"
Can you help me? I need some quick advice on using RegEx in C#. I think it is a job involving wildcards, but I do not know how to use them very well... Also, I need this for a school project tomorrow... Thanks!
Edit: This is what I have done so far and it does not work: Regex.Replace(String, "Read ", " = Messagebox.Show").
You can do this
string ns = Regex.Replace(yourString, @"Read\s+(.*?)(?:\s|$)", "$1 = MessageBox.Show");
\s+ matches one or more whitespace characters
(.*?)(?:\s|$) matches zero or more characters up to the first whitespace character (i.e. \s) or the end of the string (i.e. $)
$1 represents the first captured group, i.e. (.*?)
You might want to clarify your question... but here goes:
If you want to match the next word after "Read " in regex, use Read (\w*) where \w is the word character class and * is the greedy match operator.
If you want to match everything after "Read " in regex, use Read (.*)$ where . will match all characters and $ means end of line.
With either regex, you can use a replace of $1 = MessageBox.Show as $1 will reference the first matched group (which was denoted by the parenthesis).
Complete code:
replacedString = Regex.Replace(inStr, @"Read (.*)$", "$1 = MessageBox.Show");
The problem with your attempt is that it cannot know that the replacement string should be inserted after your variable. Let's assume that valid variable names contain letters, digits and underscores (which can be conveniently matched with \w). That means any other character ends the variable name. Then you could match the variable name, capture it (using parentheses) and put it in the replacement string with $1:
output = Regex.Replace(input, @"Read\s+(\w+)", "$1 = MessageBox.Show");
Note that \s+ matches one or more arbitrary whitespace characters. \w+ matches one or more letters, digits and underscores. If you want to restrict variable names to letters only, this is the place to change it:
output = Regex.Replace(input, @"Read\s+([a-zA-Z]+)", "$1 = MessageBox.Show");
Here is a good tutorial.
Finally, note that in C# it is advisable to write regular expressions as verbatim strings (@"..."). Otherwise, you will have to double escape everything, so that the backslashes get through to the regex engine, and that really lessens the readability of the regex.
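Putting the capture-and-replace idea together, a minimal sketch might look like this. It assumes C# 9 or later for the top-level statements, and the two-line sample input is invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input made up for this sketch: one "Read X" statement per line.
string input = "Read Dog\nRead Cat";

// Read\s+(\w+) captures the variable name in group 1;
// $1 in the replacement puts it back in front of the new text.
string output = Regex.Replace(input, @"Read\s+(\w+)", "$1 = MessageBox.Show");

Console.WriteLine(output);
// Dog = MessageBox.Show
// Cat = MessageBox.Show
```

Regex.Replace rewrites every match in the input, so no loop is needed.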

Regex for word.otherword

I want a Regular Expression for a word.otherword form. I tried \b[a-z]\.[a-z]\b, but it gives me an error at the \. part, saying Unrecognized escape sequence. Any idea what's wrong? I'm working under .NET C#. Thanks!
LE:
john.Smith or JoHn.SmItH or JOHN.SMITH should work.
John Smith or john!Smith or john.Smith.Smith shouldn't work.
Try this :
foundMatch = Regex.IsMatch(SubjectString, @"\b[a-z]\.[a-z]\b");
Probably you were not using @?
Your regex tries to match a.a, that is, a single character on each side of the dot. Since you want it to match complete words, you need a quantifier, e.g.
\b[a-z]+\.[a-z]+\b
Finally you may want to use the case insensitive match to allow for words with capital letters to be matched too :
foundMatch = Regex.IsMatch(SubjectString, @"\b[a-z]+\.[a-z]+\b", RegexOptions.IgnoreCase);
This will match all words.words with at least one character for each word regardless of capitalization.
This will match word.otherword only if there is whitespace before the first word or it is at the start of the string, and only if there is whitespace after the second word or it is at the end of the string.
foundMatch = Regex.IsMatch(SubjectString, @"(?<=\s|^)\b[a-z]+\.[a-z]+\b(?=\s|$)", RegexOptions.IgnoreCase);
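The two patterns behave differently on the question's examples. Here is a hedged sketch, assuming C# 9 or later; the names loose and strict are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Test inputs taken from the question.
string[] samples = { "john.Smith", "JOHN.SMITH", "John Smith", "john!Smith", "john.Smith.Smith" };

// "loose": word boundaries only. "strict": also requires whitespace
// or the string start/end around the match. Both names are hypothetical.
var loose = new Regex(@"\b[a-z]+\.[a-z]+\b", RegexOptions.IgnoreCase);
var strict = new Regex(@"(?<=\s|^)\b[a-z]+\.[a-z]+\b(?=\s|$)", RegexOptions.IgnoreCase);

foreach (string s in samples)
{
    Console.WriteLine($"{s}: loose={loose.IsMatch(s)}, strict={strict.IsMatch(s)}");
}
```

Only the lookaround version rejects john.Smith.Smith; the plain \b version still finds john.Smith inside it.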
Try this regex for word.word format:
@"\b([a-z]+)\.\1"
For word.otherword use this:
@"\b[a-z]+\.[a-z]+\b"

C# Regex getting words that start with?

How can I use a regular expression to get words that start with ! ? For example !Test.
I tried this but it doesn't give any matches:
@"\B\!\d+\b"
Although it did work when I replaced the ! with $.
I'd say that your regex was quite OK already; you just need to use \w (word character: letter, digit or underscore) instead of \d (digit):
@"\B!\w+\b"
will match any word that is immediately preceded by a !, unless that ! is itself preceded by a word character (that's what the \B asserts). Using a ^ instead will limit the matches to words at the very beginning of the string (or of a line, with RegexOptions.Multiline), which might not be what you want.
So this will match all the words including exactly one preceding ! in this line:
!hello !this ...!will !!!be !matched!
but none of the words in this line:
this! won't!be matched!!!
You could also drop the \B altogether if you don't mind matching !that in this!that.
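The two sample lines above can be checked with a short sketch, assuming C# 9 or later. The names bangWord, hits and misses are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical name for the pattern discussed above.
var bangWord = new Regex(@"\B!\w+\b");

// First sample line: every word should be found, with exactly one leading '!'.
string hits = string.Join(" ", bangWord.Matches("!hello !this ...!will !!!be !matched!")
    .Cast<Match>()
    .Select(m => m.Value));

// Second sample line: each '!' either follows a word character or has no word after it.
int misses = bangWord.Matches("this! won't!be matched!!!").Count;

Console.WriteLine(hits);   // !hello !this !will !be !matched
Console.WriteLine(misses); // 0
```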
This should work: ^!\w+
MatchCollection matches = Regex.Matches(inputText, @"^!\w+");
foreach (Match match in matches)
{
    Console.WriteLine(match.Value);
}

Regex with can be numerous ----- and must newline after the string

What would a regex look like for this string?
Before it there CAN be many comment-line dashes (--------"Encrypted"), and after it there must be a newline.
This does not seem to work:
Regex pattern = new Regex(@"^[-]*$[Encrypted][\n]");
What am I doing wrong?
The pattern you're searching for is not entirely clear to me, nor are the rest of the contents you're searching in, but if you're really just looking for "Encrypted" directly followed by only a newline then this is all you need to do:
Regex r = new Regex(@"Encrypted\n");
EDIT
OK, the comments seem to suggest that you're looking for zero or more occurrences of "-", followed by "Encrypted", followed by a newline. In that case the following will work.
Regex r = new Regex(@"-*Encrypted\n");
If there should be at least one "-" before "Encrypted", it will be
Regex r = new Regex(@"-+Encrypted\n");
Try it without the square brackets and move the dollar sign $ to the end of the pattern, i.e.:
Regex pattern = new Regex(@"^-*Encrypted$");
Square brackets define a character class, which works like an "or" over single characters. So [Encrypted] means: 'E' or 'n' or 'c' or 'r' ... or 'd'.
The dollar symbol matches the end of the string.
I don't know C# regex specifically, but I think you should not use the ^, because the -'s are not at the beginning of the line. And what is the $ doing in the middle?
So in PCRE I would write:
/-+"Encrypted" *\n/
That matches one or more -, followed by "Encrypted", followed by zero or more spaces, followed by a newline.
@Sanju
Does this work? :P
Regex pattern = new Regex(@"-*Encrypted\n");
For me it did! You have to remove the "^" character.
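For comparison, here is a minimal sketch that runs the original pattern and the -*Encrypted\n pattern against the same input. It assumes C# 9 or later, and the sample text and the names dashed and charClass are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input made up for this sketch.
string text = "some header\n--------Encrypted\npayload";

// -* : zero or more dashes; \n : a newline must follow directly.
var dashed = new Regex(@"-*Encrypted\n");

// The pattern from the question. [Encrypted] is a character class that
// matches ONE of the listed letters, and $ sits in the middle of the pattern.
var charClass = new Regex(@"^[-]*$[Encrypted][\n]");

Console.WriteLine(dashed.IsMatch(text));    // True
Console.WriteLine(charClass.IsMatch(text)); // False
```

If the input may use Windows line endings, \r?\n in place of \n would accept both forms.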
