Regex.IsMatch does not return expected result - c#

var managementCount = from tbdocheader in context.tblDocumentHeaders
join tbDocRevision in context.tblDocumentRevisions
on tbdocheader.DocumentHeaderID equals tbDocRevision.DocumentHeaderID
select new { tbdocheader, tbDocRevision };
var query =(from obj in managementCount.AsEnumerable()
where Regex.IsMatch(obj.tbDocRevision.Revision, @"[A-Za-z]%")
select obj).Count();
I'm trying to get the count of records where Revision starts with a letter. The "managementCount" query returns records with "Revision=A", but my query does not return any matching records.
Is something wrong with my regular expression?

Try the pattern "^[A-Za-z]*$"
Here
^ indicates start of an expression,
$ indicates end of an expression,
[A-Za-z] matches any single letter, and
[A-Za-z]* matches any number of letters (including none), so the whole pattern accepts only strings made up entirely of letters.
In C# code you would write:
@"^[A-Za-z]*$"
Here, the @ symbol makes the string a verbatim literal, so backslashes are read literally instead of being interpreted as escape sequences.
I hope this helps!

Try the pattern "^[A-Za-z]"...
var query =(from obj in managementCount.AsEnumerable()
where Regex.IsMatch(obj.tbDocRevision.Revision, @"^[A-Za-z]")
select obj).Count();

I think you're looking for the pattern "^[a-z]" with extra parameter RegexOptions.IgnoreCase.
It looks to me like you're used to SQL LIKE syntax. Regular expressions are different: they use different wildcard characters, have many more matching abilities, can match anywhere in a string by default, and are also a lot harder to get right. SQL LIKE patterns are always implicitly anchored at both ends, and regexes are not.
So the pattern above means: start at the beginning of the string (^), then require a letter. There is no need to add a trailing wildcard, because regexes are not anchored by default and whatever follows the letter is simply ignored.
I encourage you to go do some reading and study. Try regular-expressions.info.
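To make the difference concrete, here is a minimal sketch that counts matches over an in-memory array. The sample values are hypothetical stand-ins for the Revision column, and the code assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical sample data standing in for the Revision column.
string[] revisions = { "A", "B2", "1", "10-C", "", "c" };

// LIKE-style pattern from the question: in a regex, '%' is a literal percent sign,
// so this only matches a letter that is immediately followed by '%'.
int likeStyleCount = revisions.Count(r => Regex.IsMatch(r, @"[A-Za-z]%"));

// Anchored pattern: the first character of the string must be a letter.
int anchoredCount = revisions.Count(r => Regex.IsMatch(r, @"^[a-z]", RegexOptions.IgnoreCase));

Console.WriteLine($"LIKE-style: {likeStyleCount}, anchored: {anchoredCount}");
// prints "LIKE-style: 0, anchored: 3"
```

Only "A", "B2" and "c" begin with a letter, which is why the anchored count is 3.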

Related

What is a regular expression for inserting a string into a Find and Replace match?

I need a regular expression to replace all instances of:
Session["ANYWORD"] ==
with
Session["ANYWORD"].ToString() ==
I have Session\["\w+"]\s==, which correctly finds the right matches, but I don't know how to insert .ToString() into the match.
What regular expression, if there is one, will do what I need?
You will need to put the value that is between the square brackets into a capture group, and substitute that in your replacement.
In short, this will do it:
Regex.Replace(input, @"Session\[(""\w+"")]\s==", @"Session[$1].ToString() ==");
where $1 inserts the contents of your first capture group (the part of the pattern wrapped in parentheses).
You can also use named groups if you like, then it becomes:
Regex.Replace(input, @"Session\[(?<anyword>""\w+"")]\s==", @"Session[${anyword}].ToString() ==");
See the MSDN documentation for that particular overload of Regex.Replace.
For more information about capture group substitution in .NET, see the documentation on substitutions in regular expressions.
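As a quick check of the numbered-group version, the sketch below runs the replacement on one line of text. The input string is a hypothetical example, and the code assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical line of source code to be rewritten.
string input = "if (Session[\"UserName\"] == \"bob\") { }";

// Group 1 captures the quoted key, including its double quotes.
string pattern = @"Session\[(""\w+"")]\s==";

// $1 in the replacement string is substituted with the text of group 1.
string result = Regex.Replace(input, pattern, "Session[$1].ToString() ==");

Console.WriteLine(result);
// prints: if (Session["UserName"].ToString() == "bob") { }
```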

Weird Regex behavior in C#

I am trying to extract some alphanumeric expressions out of a longer word in C# using regular expressions. For example, I have the word "FooNo12Bee". I use the following regular expression code, which returns two matches, "No12" and "No", as results:
alfaNumericWord = "FooNo12Bee";
Match m = Regex.Match(alfaNumericWord, @"(No|Num)\d{1,3}");
If I use the following expression, without parentheses and without any alternative for "No", it works the way I expect and returns only "No12":
alfaNumericWord = "FooNo12Bee";
Match m = Regex.Match(alfaNumericWord, @"No\d{1,3}");
What is the difference between these two expressions? Why does using parentheses produce a redundant result for "No"?
Parentheses in a regex create capture groups, meaning that whatever is matched between the parentheses is captured and stored as a group alongside the overall match.
If you don't want a capture group but still need a group for the alternation, use a non-capture group instead; by putting ?: after the first paren:
Match m = Regex.Match(alfaNumericWord, @"(?:No|Num)\d{1,3}");
Usually, if you don't want to change the regex for some reason, you can simply retrieve the group 0 from the match to get only the whole match (and thus ignore any capture groups); in your case, using m.Groups[0].Value.
Last, you can improve the efficiency of the regex by a notch using:
Match m = Regex.Match(alfaNumericWord, @"N(?:o|um)\d{1,3}");
I can't say what the official name for it is, but it happens because putting parentheses around part of the pattern creates a new group. It is well explained here:
Besides grouping part of a regular expression together, parentheses
also create a numbered capturing group. It stores the part of the
string matched by the part of the regular expression inside the
parentheses.
The regex Set(Value)? matches Set or SetValue. In the first case, the
first (and only) capturing group remains empty. In the second case,
the first capturing group matches Value.
It is because the parentheses are creating a group. You can make the group non-capturing with ?: like so:
Regex.Match(alfaNumericWord, @"(?:No|Num)\d{1,3}");
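The effect of the parentheses can be seen by inspecting the Groups collection of each Match. This is a minimal sketch using the word from the question; it assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

string word = "FooNo12Bee";

Match capturing = Regex.Match(word, @"(No|Num)\d{1,3}");
Match nonCapturing = Regex.Match(word, @"(?:No|Num)\d{1,3}");

// Groups[0] is always the whole match. Groups[1] is present only
// when the pattern contains a capturing group.
int capturingGroups = capturing.Groups.Count;        // 2
int nonCapturingGroups = nonCapturing.Groups.Count;  // 1

Console.WriteLine($"{capturing.Groups[0].Value} / {capturing.Groups[1].Value}");
// prints "No12 / No"
Console.WriteLine(nonCapturing.Value);
// prints "No12"
```

So there is only one match in both cases; the extra "No" is a group inside that match, not a second match.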

Simple regex doesn't work

I want to match the strings "F1" to "F12". I only need the number. I'm out of training - my first try:
var r = new Regex(@"^(?:[F])[\d]{1,2}$");
It matches, but returns "F1", whereas I expect to get "1".
What have I done wrong?
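Maybe you want to use lookbehind. Note that the ^ anchor has to go inside the lookbehind, because the match itself starts after the F, not at the start of the string:
var r = new Regex(@"(?<=^F)\d\d?$");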
Even though you are using a non-capturing group for the "F", the overall match for your Regex will return the entire string it matched. Groups are used to mark sub-expressions within your regular expression whose values you want to be able to extract. Non-capturing groups are used if you want to specify a sub-expression without having it stored in a group. They allow you to apply quantifiers to your sub-expression, but do not allow you to extract the resulting value after running the regex against a string. They are typically used for performance gains, since capturing groups add extra overhead.
If you want to get just the number, you need to put the number portion in a capturing group and look at the Groups property of the resulting Match (assuming you are calling the r.Match function).
The updated Regex would be:
var r = new Regex(@"^(?:[F])([\d]{1,2})$");
Since our number is inside the first set of capturing parentheses, it will be group 1. You could also name your group to avoid confusion or possible errors if the regex gets updated at a later date.
Alternately, you can just use look-behind as M42 has suggested.
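Both approaches can be compared side by side. The sketch below is an illustration rather than either answerer's exact code: the group name "number" is a hypothetical choice, and the lookbehind is written with the ^ inside it so that it can match. Like the original pattern, both versions accept any one or two digits, not only 1 to 12. It assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Capturing-group version: the digits are stored in a named group.
var capture = new Regex(@"^F(?<number>\d{1,2})$");

// Lookbehind version: the F is checked but is not part of the match.
var lookbehind = new Regex(@"(?<=^F)\d{1,2}$");

string viaGroup = capture.Match("F12").Groups["number"].Value;
string viaLookbehind = lookbehind.Match("F1").Value;
bool tooManyDigits = capture.IsMatch("F123") || lookbehind.IsMatch("F123");

Console.WriteLine(viaGroup);       // prints "12"
Console.WriteLine(viaLookbehind);  // prints "1"
Console.WriteLine(tooManyDigits);  // prints "False"
```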

Retrieve a Digit from a String using Regex

What I am trying to do is fairly simple, although I am running into difficulty. I have a string that is a URL in the format http://www.somedomain.com?id=someid, and what I want to retrieve is the someid part. I figure I can use a regular expression, but I'm not very good with them. This is what I tried:
Match match = Regex.Match(theString, @"*.?id=(/d.)");
I get a regex exception saying there was an error parsing the regex. The way I am reading this is "any number of characters", then the literal "?id=", followed by "any number of digits". I put the digits in a group so I could pull them out. I'm not sure what is wrong with this. If anyone could tell me what I'm doing wrong I would appreciate it, thanks!
No need for Regex. Just use built-in utilities.
string query = new Uri("http://www.somedomain.com?id=someid").Query;
var dict = HttpUtility.ParseQueryString(query);
var value = dict["id"];
You've got a couple of errors in your regex. Try this:
Match match = Regex.Match(theString, @".*\?id=(\d+)");
Specifically, I:
changed *. to .* (dot matches all non-newline chars and * means zero or more of the preceding)
added an escape before the ? because the question mark is a special character in regular expressions. It means zero or one of the preceding.
changed /d. to \d+ (you had the slash going the wrong way, and you used dot, which was explained above, where you needed +, which means one or more of the preceding)
Try
var match = Regex.Match(theString, @".*\?id=(\d+)");
The error is probably due to preceding *. The * character in regex matches zero or more occurrences of previous character; so it cannot be the first character.
Probably a typo, but shortcut for digit is \d, not /d
. matches any character, you need to match one or more digits - so use a +
? is a special character, so it needs to be escaped.
So it becomes:
Match match = Regex.Match(theString, @".*\?id=(\d+)");
That being said, regex is not the best tool for this; use a proper query string parser or things will eventually become difficult to manage.
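For completeness, here is a minimal sketch of pulling the captured digits out of the Match. The URL with a numeric id is a hypothetical example, and the leading .* is dropped because Regex.Match already searches the whole string. It assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical URL; the question only gives the placeholder "someid".
string url = "http://www.somedomain.com?id=12345";

// \? matches a literal question mark; (\d+) captures one or more digits.
Match match = Regex.Match(url, @"\?id=(\d+)");

// Groups[1] holds the digits; fall back to an empty string when nothing matched.
string id = match.Success ? match.Groups[1].Value : "";

Console.WriteLine(id);
// prints "12345"
```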

Extending [^,]+, Regular Expression in C#

Duplicate
Regex for variable declaration and initialization in c#
I was looking for a Regular Expression to parse CSV values, and I came across this Regular Expression
[^,]+
Which does the job by splitting the words on every occurrence of a ",". What I want to know is: say I have the string
value_name v1,v2,v3,v4,...
Now I want a regular expression to find me the words v1,v2,v3,v4..
I tried ->
^value_name\s+([^,]+)*
But it didn't work for me. Can you tell me what I am doing wrong? I remember working on regular expressions and their state machine implementation. Doesn't it work in the same way?
If a string starts with value_name followed by one or more whitespace characters, go to the next state. In that state, read a word until a "," comes. Then do it again! And each word will be grouped!
Am I wrong in understanding it?
You could use a Regex similar to those proposed:
(?:^value_name\s+)?([^,]+)(?:\s*,\s*)?
The first group is non-capturing and would match the start of the line and the value_name.
To ensure that the Regex is still valid over all matches, we make that group optional by using the '?' modifier (meaning match at most once).
The second group is capturing and would match your vXX data.
The third group is non-capturing and would match the comma and any whitespace before and after it.
Again, we make it optional by using the '?' modifier, otherwise the last 'vXX' group would not match unless we ended the string with a final ','.
In your trials, the Regex wouldn't match multiple times. Remember that if you want a Regex to match multiple occurrences in a string, the whole Regex needs to match at every single occurrence. So you have to build your Regex not only to match the start of the string, 'value_name', but also to match every occurrence of 'vXX' in it.
In C#, you could list all matches and groups using code like this:
Regex r = new Regex(@"(?:^value_name\s+)?([^,]+)(?:\s*,\s*)?");
Match m = r.Match(subjectString);
while (m.Success) {
    // Group 0 is the whole match, so start at 1 to visit only the capture groups.
    for (int i = 1; i < m.Groups.Count; i++) {
        Group g = m.Groups[i];
        if (g.Success) {
            // matched text: g.Value
            // match start: g.Index
            // match length: g.Length
        }
    }
    m = m.NextMatch();
}
I would expect it only to get v1 in the group, because the first comma is "blocking" it from grabbing the rest of the fields. How you handle this is going to depend on the methods you use on the regular expression, but it may make sense to make two passes: first grab all the fields separated by commas, and then break things up on spaces. Perhaps ^value_name\s+(?:([^,]+),?)* instead.
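In .NET specifically, a repeated capturing group keeps every repetition in its Captures collection, so the single-match pattern suggested above can return all the values. This is a minimal sketch with a hypothetical input string, assuming C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical input in the format described in the question.
string subject = "value_name v1,v2,v3,v4";

Match m = Regex.Match(subject, @"^value_name\s+(?:([^,]+),?)*");

// Groups[1].Value holds only the last repetition of the group,
// while Groups[1].Captures records every repetition in order.
string lastValue = m.Groups[1].Value;
string[] values = m.Groups[1].Captures.Cast<Capture>().Select(c => c.Value).ToArray();

Console.WriteLine(lastValue);
// prints "v4"
Console.WriteLine(string.Join("|", values));
// prints "v1|v2|v3|v4"
```

Most other regex engines keep only the last repetition, which is why the JavaScript suggestions in this thread loop over matches or split the captured text instead.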
Oh yeah, lists....
/(?:^value_name\s+|,\s*)([^,]+)/g will theoretically grab them, but you will have to use RegExp.exec() in a loop to get the capture, rather than the whole match.
I wish pre-matches worked in JS :(.
Otherwise, go with Logan's idea: /^value_name\s+([^,]+(?:,\s*[^,]+)*)$/ followed by .split(/,\s*/);
