Regular Expression Not working in .net - c#

I'm using the following expression.
\W[A-C]{3}
The objective is to match 3 characters of anything between A and C that don't have any characters before them. So with input "ABC" it matches but "DABC" does not.
When I try this expression using various online regex tools (e.g. http://gskinner.com/RegExr/), it works perfectly. When I try to use it in an ASP.NET RegularExpressionValidator or with the Regex class, it never matches anything.
I've tried various different methods of not allowing a character before the match, e.g.
[^\w] and [^a-zA-Z0-9]
All of them work in the online tools, but not in .NET.
This test fails, but I'm not sure why:
[Test]
public void RegExWorks()
{
var regex = new Regex("\\W[A-C]{3}");
Match match = regex.Match("ABC");
Assert.IsTrue(match.Success);
}

How about something like this:
^[A-C]{3}
It is simple, but seems to fit what you are asking, and I tested it in rubular.com and .NET

The problem is that you require there to be a \W character. Use alternation to fix that, or a lookbehind to make sure there are no invalid characters.
Alternation:
(?:\W|^)[A-C]{3}
But I'd prefer a negative lookbehind:
(?<!\w)[A-C]{3}
\b (as in gymbrall's answer) is short for (?<!\w)(?=\w)|(?<=\w)(?!\w), which in this case would just mean (?<!\w), thus being equivalent.
Also, in C# you can use @ quoting (a verbatim string) so you don't have to double-escape things, e.g.:
var regex = new Regex(@"(?<!\w)[A-C]{3}");
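To see the difference between the three patterns side by side, here is a minimal sketch. The class name LeadingCharDemo and the sample inputs are illustrative assumptions, not part of the original question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; names are for illustration only.
public static class LeadingCharDemo
{
    // Original pattern: demands one non-word character before the letters.
    public static readonly Regex RequiresNonWord = new Regex(@"\W[A-C]{3}");
    // Alternation: a non-word character OR the start of the input.
    public static readonly Regex Alternation = new Regex(@"(?:\W|^)[A-C]{3}");
    // Negative lookbehind: asserts that no word character precedes; consumes nothing.
    public static readonly Regex Lookbehind = new Regex(@"(?<!\w)[A-C]{3}");

    public static void Main()
    {
        foreach (string input in new[] { "ABC", " ABC", "DABC" })
        {
            Console.WriteLine(
                "[{0}] \\W: {1}, alternation: {2}, lookbehind: {3}",
                input,
                RequiresNonWord.IsMatch(input),
                Alternation.IsMatch(input),
                Lookbehind.IsMatch(input));
        }
    }
}
```

One practical difference: for " ABC" the alternation version includes the leading space in the matched value, while the lookbehind version matches only "ABC".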

You should consider trying:
[Test]
public void RegExWorks()
{
var regex = new Regex("\\b[A-C]{3}");
Match match = regex.Match("ABC");
Assert.IsTrue(match.Success);
}
The \\b matches a word boundary, which means it will match "ABC" as well as " ABC" and "$ABC". Using \\W requires there to be a non-word character, which doesn't sound like it is what you want.
Let me know if I'm missing something.
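As a quick check of the word-boundary behaviour described above, a small sketch follows. The helper name StartsAtWordBoundary and the extra input "DABC" are assumptions added for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; names are for illustration only.
public static class WordBoundaryDemo
{
    // True when three letters from A-C begin at a word boundary somewhere in the input.
    public static bool StartsAtWordBoundary(string input)
    {
        return Regex.IsMatch(input, @"\b[A-C]{3}");
    }

    public static void Main()
    {
        foreach (string input in new[] { "ABC", " ABC", "$ABC", "DABC" })
        {
            Console.WriteLine("[{0}] -> {1}", input, StartsAtWordBoundary(input));
        }
    }
}
```

Note that this pattern only constrains what comes before the letters, so "ABCD" would still match. If trailing characters matter as well, appending \b or (?!\w) is one option.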

It is as simple as this: "[A-C]{3}"

OK so you can try following Expression
"[A-C][A-C]{2}"

Related

how can I use unnamed Regex groups in C# inside my regex?

Hey, my current regex is @"(into)(to)add\s[^\s]{1,}\1|\2[^\s]{1,}". I want the input to be something like "add word into/to category". The regex works fine in general, except for the \1|\2 part. I tried using groups and all sorts of solutions, but I just can't figure out how to make the input accept either "into" or "to".
Can anyone help me out? (this is in C# and using the Regex class)
If I have understood you correctly, then you don't need back references to (unnamed) groups; you can use a simple alternation, like this:
@"add \w+ (into|to) \w+"
That will select either into or to in the search string.
Edit:
Let's get a little more 'advanced', using the optional sign '?':
@"add \w+ (in)?to \w+"
This will match 'in' zero or one time, followed by 'to', so it will match into as well as to, exactly as the original RegEx.
Edit2:
I have a feeling, you want to use a variable inside your RegEx, you can of course do that like this:
string search = "into|to";
Regex regex = new Regex(@"add \w+ (" + search + @") \w+");
From your given example I think you're looking for a regex like add\s\w+\s(into|to)\s\w+. Your current regex matches only strings starting with "intoto", which is probably not what you want.
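The alternation approach can be sketched as follows. The named groups word and category, the anchors, and the sample sentences are assumptions added for illustration; the original answers only show the bare patterns.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; names are for illustration only.
public static class AddCommandDemo
{
    // (?:into|to) accepts either keyword without creating a numbered group;
    // the named groups capture the word and the category.
    private static readonly Regex Command =
        new Regex(@"^add\s+(?<word>\S+)\s+(?:into|to)\s+(?<category>\S+)$");

    // Returns "word -> category" for a valid command, or null when the input does not match.
    public static string Parse(string input)
    {
        Match m = Command.Match(input);
        return m.Success
            ? m.Groups["word"].Value + " -> " + m.Groups["category"].Value
            : null;
    }

    public static void Main()
    {
        Console.WriteLine(Parse("add apple into fruit"));
        Console.WriteLine(Parse("add apple to fruit"));
        Console.WriteLine(Parse("add apple onto fruit") ?? "(no match)");
    }
}
```

If the keyword list comes from a variable, as in Edit2 above, passing each keyword through Regex.Escape before joining them with "|" avoids surprises when a keyword contains regex metacharacters.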

Patterns with special characters in Microsoft.VisualBasic.CompilerServices.LikeOperator.LikeString do not work

I have tried to use the LikeOperator.LikeString functionality for pattern matching as shown below:
// Usage: bool matchValue = LikeOperator.LikeString(string, pattern, CompareMethod);
bool match = LikeOperator.LikeString("*test*/fe_quet", "(*)test(*)/*", Microsoft.VisualBasic.CompareMethod.Text);
The above should return true as per the documentation, but it simply returns false. I tried to escape the (*) with the brackets, but it does not seem to work in that way. Could anyone please help me to define the pattern string with the special characters?
Thanks
From Like Operator (which you provided):
To match the special characters left bracket ([), question mark (?), number sign (#), and asterisk (*), enclose them in brackets.
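Therefore you need to wrap the asterisks you want to match literally in [] instead of (). The trailing * stays unbracketed so that it still acts as a wildcard for "fe_quet":
bool match = LikeOperator.LikeString("*test*/fe_quet", "[*]test[*]/*", Microsoft.VisualBasic.CompareMethod.Text);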
You'd probably be better off using Regex instead of the VB namespace.
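For comparison, a rough Regex equivalent might look like the sketch below. It assumes the intent is "literal *test*/ followed by anything", and that CompareMethod.Text can be approximated with RegexOptions.IgnoreCase; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; names are for illustration only.
public static class LikeToRegexDemo
{
    // Regex.Escape turns the literal prefix "*test*/" into \*test\*/ so the
    // asterisks lose their special meaning; ".*" then plays the role of the Like wildcard "*".
    private static readonly string Pattern = "^" + Regex.Escape("*test*/") + ".*$";

    public static bool IsMatch(string input)
    {
        // IgnoreCase is an approximation of CompareMethod.Text (assumption).
        return Regex.IsMatch(input, Pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }

    public static void Main()
    {
        Console.WriteLine(IsMatch("*test*/fe_quet"));
        Console.WriteLine(IsMatch("xtestx/fe_quet"));
    }
}
```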

RegEx match string between known strings and after a known text with line breaks

So, I have this text:
testing
<strong>a known text</strong>
<p>testing2</p>
<p>this paragraphs are dynamically</p>
...
testing again
testing and again
I want to get all the hrefs that are under the a known text
I use this regex to get all the matches: (?<=<a\ href=")/find/.*?(?=")
But I also get the result: /find/1 which is a result that I don't want.
I've tried this: a known tex[\w\W](?<=<a\ href=")/find/*?(?=") but it's not working. I have no idea how to get this done correctly. Basically I want to get only /find/2/ and /find/3
PS: I am not really using C# but a software that is made in C# and uses the C# regex.
I have this regex, which is a bit different from Marcin's, but I'm not used to having variable-length regexes in lookbehinds:
var regex = new Regex(@"(?:a known text|(?<!^)\G)[\w\W]+?((?<=<a\ href="")/find/.*?(?=""))");
I believe this should make the regex a little more efficient.
\G is an anchor that matches where the previous match ended, so that after finding the first /find/, the regex tries matching again from there. I had to add the negative lookbehind (?<!^) to prevent \G from also matching at the very start of the input.
a known tex[\w\W](?<=<a\ href=")/find/*?(?=")
Concerning your regex, the small mistakes were forgetting the quantifier for [\w\W] and the dot before *? after /find/. Using a known tex[\w\W]+(?<=<a\ href=")(/find/.*?)(?=") would have got you only /find/2/, which is already better than nothing!
EDIT: As AlanMoore rightly pointed out, you can simplify the regex:
var regex = new Regex(@"(?:a known text|(?<!^)\G)[\w\W]+?<a href=""(/find/.*?)""");
And to make the . match newlines, we can use (?s) and remove the [\w\W] part:
var regex = new Regex(@"(?s)(?:a known text|(?<!^)\G).*?<a href=""(/find/.*?)""");
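Here is a minimal sketch of how the final pattern could be used with Regex.Matches. The sample markup is an assumption: the links in the question's text were not preserved, so the hrefs below are reconstructed only to match the /find/1, /find/2/ and /find/3 values mentioned there.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; names and sample data are for illustration only.
public static class LinksAfterMarkerDemo
{
    // Assumed sample input (not the asker's real markup).
    public const string Sample = @"testing
<a href=""/find/1"">one</a>
<strong>a known text</strong>
<p><a href=""/find/2/"">two</a></p>
<p><a href=""/find/3"">three</a></p>
testing again";

    public static List<string> LinksAfterMarker(string html)
    {
        // The first match must start at "a known text"; every later match must
        // start exactly where the previous one ended (\G), so links that appear
        // before the marker are never picked up.
        var regex = new Regex(@"(?s)(?:a known text|(?<!^)\G).*?<a href=""(/find/.*?)""");
        var links = new List<string>();
        foreach (Match m in regex.Matches(html))
        {
            links.Add(m.Groups[1].Value);
        }
        return links;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", LinksAfterMarker(Sample)));
    }
}
```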

Retrieve a Digit from a String using Regex

What I am trying to do is fairly simple, although I am running into difficulty. I have a string that is a URL; it will have the format http://www.somedomain.com?id=someid, and what I want to retrieve is the someid part. I figure I can use a regular expression, but I'm not very good with them. This is what I tried:
Match match = Regex.Match(theString, @"*.?id=(/d.)");
I get a regex exception saying there was an error parsing the regex. The way I am reading this is "any number of characters", then the literal "?id=", followed by "any number of digits". I put the digits in a group so I could pull them out. I'm not sure what is wrong with this. If anyone could tell me what I'm doing wrong I would appreciate it, thanks!
No need for Regex. Just use built-in utilities.
string query = new Uri("http://www.somedomain.com?id=someid").Query;
var dict = HttpUtility.ParseQueryString(query);
var value = dict["id"];
You've got a couple of errors in your regex. Try this:
Match match = Regex.Match(theString, @".*\?id=(\d+)");
Specifically, I:
changed *. to .* (dot matches all non-newline chars and * means zero or more of the preceding)
added an escape sequence before the ? because the question mark is a special character in regular expressions. It means zero or one of the preceding.
changed /d. to \d+ (you had the slash going the wrong way, and you used dot, explained above, where a quantifier was needed; + means one or more of the preceding)
Try
var match = Regex.Match(theString, @".*\?id=(\d+)");
The error is probably due to preceding *. The * character in regex matches zero or more occurrences of previous character; so it cannot be the first character.
Probably a typo, but shortcut for digit is \d, not /d
. matches any character, you need to match one or more digits - so use a +
? is a special character, so it needs to be escaped.
So it becomes:
Match match = Regex.Match(theString, @".*\?id=(\d+)");
That being said, regex is not the best tool for this; use a proper query string parser or things will eventually become difficult to manage.
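To compare the two approaches from these answers, here is a small sketch. The URL with a numeric id is a made-up example, the pattern [?&]id=(\d+) is a slight variation (an assumption) that also accepts id as a later parameter, and HttpUtility requires a reference to System.Web.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Web;

// Hypothetical demo class; names and URLs are for illustration only.
public static class QueryIdDemo
{
    // Regex approach: captures the digits that follow "?id=" or "&id=".
    public static string IdViaRegex(string url)
    {
        Match m = Regex.Match(url, @"[?&]id=(\d+)");
        return m.Success ? m.Groups[1].Value : null;
    }

    // Parser approach: lets the framework split the query string.
    public static string IdViaParser(string url)
    {
        string query = new Uri(url).Query;
        return HttpUtility.ParseQueryString(query)["id"];
    }

    public static void Main()
    {
        const string url = "http://www.somedomain.com?id=12345"; // assumed sample URL
        Console.WriteLine(IdViaRegex(url));
        Console.WriteLine(IdViaParser(url));
    }
}
```

The parser version also handles non-numeric ids such as "someid", which the \d+ pattern rejects by design.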

Any ideas why this does not work? C#

public class MyExample
{
public static void Main(String[] args)
{
string input = "The Venture Bros</p></li>";
// Call Regex.Match
Match m = Regex.Match(input, "/show_name=(.*?)&show_name_exact=true\">(.*?)</i");
// Check Match instance
if (m.Success)
{
// Get Group value
string key = m.Groups[1].Value;
Console.WriteLine(key);
// alternate-1
}
}
}
I want "The Venture Bros" as output (in this example).
Try this:
string input = "The Venture Bros</p></li>";
// Call Regex.Match
Match m = Regex.Match(input, "show_name=(.*?)&show_name_exact=true\">(.*?)</a");
// Check Match instance
if (m.Success)
{
// Get Group value
string key = m.Groups[2].Value;
Console.WriteLine(key);
// alternate-1
}
I think it's because you're trying to do the perl-style slashes on the front and the end. A couple of other answerers have been confused by this already. The way he's written it, he's trying to do case-insensitive by starting and ending with / and putting an i on the end, the way you'd do it in perl.
But I'm pretty sure that .NET regexes don't work that way, and that's what's causing the problem.
Edit: to be more specific, look into RegexOptions, an example I pulled from MSDN is like this:
Dim rx As New Regex("\b(?<word>\w+)\s+(\k<word>)\b", RegexOptions.Compiled Or RegexOptions.IgnoreCase)
The key there is the "RegexOptions.IgnoreCase", that'll cause the effect that you were trying for with /pattern/i.
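In C#, the same idea might look like the sketch below. The input "Show_Name" and the class name are illustrative assumptions; the point is only that .NET treats the slashes and the trailing i as literal pattern characters.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; names are for illustration only.
public static class IgnoreCaseDemo
{
    // Perl-style delimiters are just literal characters in a .NET pattern.
    public static bool PerlStyle(string input)
    {
        return Regex.IsMatch(input, "/show_name/i");
    }

    // Case-insensitivity is requested through RegexOptions instead.
    public static bool WithOptions(string input)
    {
        return Regex.IsMatch(input, "show_name", RegexOptions.IgnoreCase);
    }

    // The inline option (?i) is an alternative to RegexOptions.IgnoreCase.
    public static bool WithInlineOption(string input)
    {
        return Regex.IsMatch(input, "(?i)show_name");
    }

    public static void Main()
    {
        Console.WriteLine(PerlStyle("Show_Name"));
        Console.WriteLine(WithOptions("Show_Name"));
        Console.WriteLine(WithInlineOption("Show_Name"));
    }
}
```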
The correct regex in your case would be
^.*&show_name_exact=true\"\>(.*)</a></p></li>$
regexp is tricky, but at http://www.regular-expressions.info/ you can find a great tutorial
/?show_name=(.)&show_name_exact=true\">(.)
would work as you expect I believe. But another thing I notice, is that you're trying to get the value of group[1], but I believe that you want the value of group[2], because there will be 3 groups, the first is the match, and the second is the first group...
Gl ;)
Because of the question mark before show_name. It is in input but not in pattern, thus no match.
Also, you try to match </i but the input doesn't contain this (it contains </li>).
First, the regex starts with "/show_name", but the target string has "/?show_name", so the pattern never reaches the first group at the expected position.
This will cause the whole regex to fail.
Ok, let's break this down.
Test Data: "The Venture Bros</p></li>"
Original Regex: "/show_name=(.*?)&show_name_exact=true\">(.*?)</i"
Working Regex: "/\?show_name=(.*)&show_name_exact=true\">(.*)</a"
We'll start at the left and work our way to the right, through the regex.
"?" became "\?" this is because a "?" means that the preceding character or group is optional. When we put a backslash before it, it now matches a literal question mark.
"(.*?)" became "(.*)" the parentheses denote a group. Note that a "?" directly after "*" does not mean "optional"; it makes the quantifier lazy (match as few characters as possible), whereas "(.*)" is greedy. This change is not what fixes the match, so the lazy form could have been kept.
"</i" became "</a" this change was made to match your actual text which terminates the anchor with a "</a>" tag.
Suggested Regex: "[\\W]show_name=([^><\"]*)&show_name_exact=true\">([^<]*)<"
(The extra \'s were added to provide proper c# string escaping.)
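Putting the fixes together, a sketch could look like this. The sample input is an assumption: the anchor markup in the question was not preserved, so the href below is reconstructed only to fit the pattern being discussed. The lazy quantifiers from the original are kept.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; names and sample data are for illustration only.
public static class ShowNameDemo
{
    // Assumed sample input (not the asker's real markup).
    public const string Sample =
        @"<li><p><a href=""/?show_name=The+Venture+Bros&show_name_exact=true"">The Venture Bros</a></p></li>";

    public static string ExtractShowName(string input)
    {
        // \? matches the literal question mark; </a matches the closing anchor tag.
        Match m = Regex.Match(input, @"/\?show_name=(.*?)&show_name_exact=true"">(.*?)</a");
        // Groups[0] is the whole match, Groups[1] the query value, Groups[2] the link text.
        return m.Success ? m.Groups[2].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractShowName(Sample));
    }
}
```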
A good tool for testing regular expressions in c#, is the regex-freetool at code.google.com
