Question: What's the simplest way to test whether a given Regex matches the whole string?
An example:
Given Regex re = new Regex("."); I want to test whether a given input string consists of exactly one character using this Regex re. How do I do that?
In other words: I'm looking for a method of class Regex that works like the matches() method of class Matcher in Java ("Attempts to match the entire region against the pattern.").
Edit: This question is not about getting the length of a string. The question is how to match whole strings with regular expressions. The example used here is only for demonstration purposes (normally everybody would check the Length property to recognise one-character strings).
If you are allowed to change the regular expression you should surround it by ^( ... )$. You can do this at runtime as follows:
Regex newRe = new Regex("^(" + re.ToString() + ")$");
The parentheses here are necessary to prevent creating a regular expression like ^a|ab$ which will not do what you want. This regular expression matches any string starting with a or any string ending in ab.
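As a minimal sketch of the wrapping approach described above (the class and method names are hypothetical, and a non-capturing group is used here instead of plain parentheses so that group numbers in the original pattern stay the same):

```csharp
using System.Text.RegularExpressions;

public static class WholeStringMatch
{
    // Wraps the original pattern in ^(?: ... )$ so it has to span the input.
    // The group keeps alternations such as a|ab together.
    public static Regex Anchor(Regex re)
    {
        return new Regex("^(?:" + re.ToString() + ")$", re.Options);
    }

    // True if the anchored version of the pattern matches the input.
    public static bool IsFullMatch(Regex re, string input)
    {
        return Anchor(re).IsMatch(input);
    }
}
```

Note that in .NET `$` also matches just before a final newline, and with RegexOptions.Multiline both `^` and `$` match at line boundaries. If that matters for your input, `\A` and `\z` are the stricter anchors.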
If you don't want to change the regular expression you can check Match.Value.Length == input.Length. This is the method that is used in ASP.NET regular expression validators. See my answer here for a fuller explanation.
Note this method can cause some curious issues that you should be aware of. The regular expression "a|ab" will match the string 'ab' but the value of the match will only be "a". So even though this regular expression could have matched the whole string, it did not. There is a warning about this in the documentation.
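The length check and its pitfall can be sketched as follows. The helper name is hypothetical, and the extra Index check is an addition of this sketch rather than part of the answer above:

```csharp
using System.Text.RegularExpressions;

public static class LengthCheck
{
    // True only if the first match the engine finds covers the entire input.
    public static bool FirstMatchCoversInput(Regex re, string input)
    {
        Match m = re.Match(input);
        return m.Success && m.Index == 0 && m.Length == input.Length;
    }
}
```

With the pattern "a|ab" and the input "ab" this returns false, because the engine takes the first alternative and stops at "a". Reordering the alternatives to "ab|a" makes the same check return true.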
Use an anchored pattern:
Regex re = new Regex("^.$");
For testing string length I'd check the .Length property though (str.Length == 1).
"b".Length == 1
is a much better candidate than
Regex.IsMatch("b", "^.$")
Add "start-of-string" and "end-of-string" anchors:
^.$
Related
I have a string and a regular expression that I am running against it. But instead of the first match, I am interested in the last match of the Regular Expression.
Is there a quick/easy way to do this?
I am currently using Regex.Matches, which returns a MatchCollection, but it doesn't accept any parameters that will help me, so I have to go through the collection and grab the last one. But it seems there should be an easier way to do this. Is there?
The .NET regex flavor allows you to search for matches from right to left instead of left to right. It's the only flavor I know of that offers such a feature. It's cleaner and more efficient than the traditional methods, such as prefixing the regex with .*, or finding all matches so you can take the last one.
To use it, pass this option when you call Match() (or other regex method):
RegexOptions.RightToLeft
More information can be found here.
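A short sketch of that option in use (the wrapper class and method name are made up for illustration):

```csharp
using System.Text.RegularExpressions;

public static class LastMatchFinder
{
    // Scans from the end of the input, so the first match found is the
    // right-most one. Check Success on the result before using Value.
    public static Match Last(string input, string pattern)
    {
        return Regex.Match(input, pattern, RegexOptions.RightToLeft);
    }
}
```

For example, LastMatchFinder.Last("a1 b22 c333", @"\d+").Value is "333". For patterns whose matches could overlap, the right-to-left result is not guaranteed to be the same as the last item of a left-to-right Regex.Matches call, so it is worth testing with your own pattern.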
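Regex regex = new Regex("REGEX");
// Match has no Count property; collect all matches and take the last one.
MatchCollection v = regex.Matches("YOUR TEXT");
string s = v[v.Count - 1].ToString(); // throws if there are no matches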
You could use Linq LastOrDefault or Last to get the last match from MatchCollection.
var lastMatch = Regex.Matches(input,"hello")
.OfType<Match>()
.LastOrDefault();
Check this Demo
Or, as @jdweng mentioned in the comments, you could simply access the last match by index.
Match lastMatch = matches[matches.Count - 1];
I need a regular expression to replace all instances of:
Session["ANYWORD"] ==
with
Session["ANYWORD"].ToString() ==
I have Session\["\w+"]\s==, which correctly finds the right matches, but I don't know how to insert .ToString() into the match.
Is there a regular expression that does what I need?
You will need to put the value that is between the square brackets into a capture group, and substitute that in your replacement.
In short, this will do it:
Regex.Replace(input, @"Session\[(""\w+"")]\s==", @"Session[$1].ToString() ==");
where $1 will insert the contents of your first capture group (determined by the parentheses in the pattern -> ()).
You can also use named groups if you like, then it becomes:
Regex.Replace(input, @"Session\[(?<anyword>""\w+"")]\s==", @"Session[${anyword}].ToString() ==");
Here is the MSDN doc for that particular overload of Regex.Replace.
For more information about capture group substitution in .NET, look here.
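To try the numbered-group version on a concrete line, a small wrapper like the following can be used (the class name, method name and sample input are illustrative only):

```csharp
using System.Text.RegularExpressions;

public static class SessionRewriter
{
    // Captures the quoted key, including its quotes, as group 1 and puts it
    // back into the replacement in front of .ToString().
    public static string AddToString(string input)
    {
        return Regex.Replace(
            input,
            @"Session\[(""\w+"")]\s==",
            "Session[$1].ToString() ==");
    }
}
```

Because the pattern requires whitespace directly after the closing bracket, a line that already contains .ToString() is left unchanged, so running the replacement twice should be harmless.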
I wonder why the static method Regex.Match takes two mandatory parameters (plus optional ones), and why the second parameter doesn't accept an actual Regex.
The method specification in Microsoft MSDN is:
public string Replace(
string input,
string replacement
)
In Visual Studio it is different:
The second parameter seems to be made to "support" Regex, since its description says "The regular expression pattern to match".
Then, the following code is invalid:
string str = "my {value}";
Regex pattern = new Regex(@"\{[a-zA-Z_][a-zA-Z0-9_]*\}");
int matches = Regex.Match(str, pattern);
But when:
string pattern = @"\{[a-zA-Z_][a-zA-Z0-9_]*\}";
It is valid.
Am I going crazy, or is this really an issue?
I know it says that it takes a "string", but wouldn't it be correct to also support the type Regex?
In your example, pattern is a Regex.
You probably want:
var matches = pattern.Match(str);
The static version exists as a quick way to match a regular expression in a one-off, disposable way.
Since a Regex's internal state machine can take time to build, the instance version lets you create an instance once and reuse it. If you are running it many times within a loop, for example, you could see a considerable performance improvement.
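A rough sketch of the two styles side by side, using the placeholder pattern from the question (the class and method names are assumptions made for this example):

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PlaceholderCounter
{
    // Constructed once and reused for every line that is processed.
    private static readonly Regex Placeholder =
        new Regex(@"\{[a-zA-Z_][a-zA-Z0-9_]*\}");

    // Instance style: the same Regex object handles every line in the loop.
    public static int Count(IEnumerable<string> lines)
    {
        int total = 0;
        foreach (string line in lines)
        {
            total += Placeholder.Matches(line).Count;
        }
        return total;
    }

    // Static style: the pattern is passed as a string, which is convenient
    // for a one-off check. Regex.ToString() returns the pattern text.
    public static bool HasPlaceholder(string text)
    {
        return Regex.IsMatch(text, Placeholder.ToString());
    }
}
```

How much the instance version actually saves depends on the pattern and the workload, so it is worth measuring before relying on it.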
I don't think it is entirely clear what you are asking, but I will have a go at answering part of your question anyway.
The second parameter seems to be made to "support" Regex, since its description says "The regular expression pattern to match".
If I understand you correctly, you are asking why you cannot pass a Regex when the description of the parameter says that it takes a regular expression.
Regex vs. regular expressions
It is important here to distinguish between the .NET type Regex and the concept of a regular expression: Regex is a .NET type just like Color, String, StringBuilder etc. It is designed to represent a regular expression and it has convenient methods for working with regular expressions. However, it isn't in itself a regular expression.
On the other hand, the string of characters "abc.*" is a regular expression, not just in C# but in a general sense. Not all strings are valid regular expressions, but some strings are regular expressions and can be used to describe a universe of matching strings.
Method signatures
What the documentation above states is that the method takes a parameter, pattern, of type string. The comments associated with the parameter state that the pattern must represent a valid regular expression. Thus, it must conform to the rules describing regular expressions in .NET and cannot, say, contain "?[---]<>", because that string does not represent a regular expression (I assume, I haven't tested it).
To sum up: you cannot pass an instance of the type Regex for the pattern parameter, because the method is not asking for an instance of Regex, it is very explicitly asking for a regular expression expressed as a string.
Note: The documentation matching the method overload you are looking at can be found here.
The MSDN documentation for Regex.Match says:
pattern
Type: System.String
The regular expression pattern to match.
"The regular expression pattern to match." means "String representation of the regular expression to match, with possible extensions to the usual regular expression syntax supported by the .NET runtime". It looks like you read the sentence as "... also accepting a Regex variable named pattern".
Note that "regular expression" is a standard computer science / language theory term and does not have to correspond to any particular class in any framework.
While the description could say "String with the regular expression", that would add little, because the type is shown immediately before the sentence.
I want to find an exact URL match in a list of URLs using a regular expression.
string url = @"http://web/P02/Draw/V/Service.svc";
string myword = @"http://web/P02/Draw/V/Service.svc http://web/P02/Draw/V/Service.svc?wsdl";
string pattern = @"(^|\s)" + url + @"(\s|$)";
Match match = Regex.Match(pattern, myword);
if (match.Success)
{
myword = Regex.Replace(myword, pattern, "pattern");
}
But the pattern returns no result.
What do you think is the problem ?
Strange formatting aside, here is a pattern to match each individual URL in your list.
string pattern = @"http://([a-zA-Z]|/|[0-9])*\.svc";
Frankly, I don't think you're having issues with syntax or implementation. If you want to tweak the expression I wrote above, this is the place to do it: Online RegEx Tool
You're passing the arguments to Regex.Match in the wrong order. You need to swap them like this:
Match match = Regex.Match(myword,pattern);
Why not use LINQ on the string collection (after splitting on spaces)?
myword.Split(' ').Where(x => x.Equals(url)).Single().Replace(url, "pattern");
You've got your arguments the wrong way around, as has been pointed out
The . character is special in a regular expression pattern, so you need to escape url when you use it to build pattern. You can use Regex.Escape(url) for that.
You don't need to check the match is a success before performing the replacement, unless you have other logic that depends on whether the match was a success.
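Putting these points together, one possible version looks like this. It is only a sketch: the helper name is invented, and it swaps the question's (^|\s) and (\s|$) groups for lookarounds so that the surrounding whitespace is not consumed by the replacement:

```csharp
using System.Text.RegularExpressions;

public static class ExactUrlReplacer
{
    public static string ReplaceExact(string list, string url, string replacement)
    {
        // Regex.Escape makes '.' and other metacharacters in the URL literal.
        // (?<!\S) and (?!\S) require whitespace or a string boundary on each
        // side of the URL without including it in the match.
        string pattern = @"(?<!\S)" + Regex.Escape(url) + @"(?!\S)";

        // The input comes first, the pattern second. The MatchEvaluator
        // inserts the replacement text verbatim, so a '$' in it is not
        // interpreted as a substitution.
        return Regex.Replace(list, pattern, m => replacement);
    }
}
```

With the url and myword values from the question and "pattern" as the replacement, only the first URL is replaced; the one ending in ?wsdl is left alone.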
I am trying to use regular expressions to parse a method in the following format from a text:
mvAddSell[value, type1, reference(Moving, 60)]
so using the regular expressions, I am doing the following
tokensizedStrs = Regex.Split(target, "([A-Za-z ]+[\\[ ][A-Za-z0-9 ]+[ ,][A-Za-z0-9 ]+[ ,][A-Za-z0-9 ]+[\\( ][A-Za-z0-9 ]+[, ].+[\\) ][\\] ])");
It is working, but it always gives me an empty entry at the beginning of the array if the string starts with a method in the given format, and the same happens at the end if a method comes last. Also, if two methods appear in the string, it catches only the first one! Why is that?
I think what stops the parser from catching two methods is the ".+" in my pattern. I wanted to say that a number or a date will appear in that location, so I told it to expect a sequence of any characters. Is that wrong?
It worked for me =D ... I replaced ".+" with ".+?", which means as few characters as possible ;)
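Your goal is quite unclear to me. What do you want as a result? If you split on that method pattern, you get the parts before and after each match in an array; the method itself is included only because your whole pattern is wrapped in a capture group. An empty string appears at the start or end of the array when the input begins or ends with a match.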
Answer to your question
To answer your concrete question: your .+ is greedy, that means it will match anything till the last )] (in the same line, . does not match newline characters by default).
You can change this behaviour by adding a ? after the quantifier to make it lazy, then it matches only till the first )].
tokensizedStrs = Regex.Split(target, "([A-Za-z ]+[\\[ ][A-Za-z0-9 ]+[ ,][A-Za-z0-9 ]+[ ,][A-Za-z0-9 ]+[\\( ][A-Za-z0-9 ]+[, ].+?[\\) ][\\] ])");
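The difference between the greedy and the lazy quantifier is easier to see on a deliberately simplified pattern. The pattern and input below are made up for illustration and are much shorter than the one in the question:

```csharp
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Greedy: .+ runs to the end of the line and backtracks to the LAST ].
    public static int CountGreedy(string input)
    {
        return Regex.Matches(input, @"\w+\[.+\]").Count;
    }

    // Lazy: .+? stops at the FIRST ] that lets the rest of the pattern match.
    public static int CountLazy(string input)
    {
        return Regex.Matches(input, @"\w+\[.+?\]").Count;
    }
}
```

For the input "f[a(b)] g[c(d)]" the greedy version finds one match spanning both calls, while the lazy version finds two separate matches.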
Problems in your regex
There are several other problems in your regex.
I think you misunderstood character classes. When you write e.g. [\\[ ], this construct will match either a [ or a space. If you want to allow optional space after the [ (would be logical to me), do it this way: \\[\\s*
Use a verbatim string (with a leading @) to define your regex to avoid excessive escaping.
tokensizedStrs = Regex.Split(target, @"([A-Za-z ]+\[\s*[A-Za-z0-9 ]+\s*,\s*[A-Za-z0-9 ]+\s*,\s*[A-Za-z0-9 ]+\(\s*[A-Za-z0-9 ]+\s*,\s*.+?\)\s*\]\s*)");
You can simplify your regex, by avoiding repeating parts
tokensizedStrs = Regex.Split(target, @"([A-Za-z ]+\[\s*[A-Za-z0-9 ]+(?:\s*,\s*[A-Za-z0-9 ]+){2}\(\s*[A-Za-z0-9 ]+\s*,\s*.+?\)\s*\]\s*)");
Here (?:\s*,\s*[A-Za-z0-9 ]+){2} is a non-capturing group that is repeated two times.
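If the goal is to get the method calls themselves rather than the text around them, Regex.Matches may be a better fit than Regex.Split. The following is a sketch under that assumption; the class name, method name and sample input are hypothetical, and the pattern is the simplified one from above without the outer capture group:

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MethodCallExtractor
{
    private static readonly Regex MethodCall = new Regex(
        @"[A-Za-z ]+\[\s*[A-Za-z0-9 ]+(?:\s*,\s*[A-Za-z0-9 ]+){2}\(\s*[A-Za-z0-9 ]+\s*,\s*.+?\)\s*\]\s*");

    // Returns every method call found in the text, in order of appearance.
    public static List<string> Extract(string target)
    {
        var calls = new List<string>();
        foreach (Match m in MethodCall.Matches(target))
        {
            // [A-Za-z ]+ and the trailing \s* can pick up neighbouring
            // spaces, so trim each match.
            calls.Add(m.Value.Trim());
        }
        return calls;
    }
}
```

Because the leading [A-Za-z ]+ also accepts spaces and letters, any plain words directly in front of a method name would become part of the match, so the pattern may need tightening for free-form text.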