Match.Regex syntax - c#

I have a string that can be either
"MyName (ctid 5555)"
or
"OtherName (id 555-5555-5555-555)"
I tried to write a regex to fetch ctid or id, like so:
"(?<=ctid=).+(?=))"
Checking here gave 0 results.
What's wrong with my syntax?

Try this pattern: (?<=\((?:ctid|id)\s).+?(?=\))
It uses a look-behind to check for "ctid" or "id" followed by whitespace, then it matches any content up till the closing parenthesis.
string[] inputs = { "MyName (ctid 5555)", "OtherName (id 555-5555-5555-555)" };
string pattern = @"(?<=\((?:ctid|id)\s).+?(?=\))";
foreach (var input in inputs)
{
    var result = Regex.Match(input, pattern).Value;
    Console.WriteLine(result);
}
If you clarify your question, a better solution might exist. If you care to know whether the value was a "ctid" or an "id", then named capture groups could be used.

Based on your example, I am assuming you require a regex to explicitly match each format:
try
{
    // textToTest is the input string to check.
    // Verbatim strings (@"...") keep the backslashes intact for the regex engine.
    var idRegEx = @"^.*?\s\(id\s(\d{3}-\d{4}-\d{4}-\d{3})\)$";
    var ctIdRegex = @"^.*?\s\(ctid\s(\d{4})\)$";
    // Regex.Match (not Regex.Replace) returns a Match whose Groups can be read.
    var idMatch = Regex.Match(textToTest, idRegEx, RegexOptions.IgnoreCase).Groups[1].Value;
    var ctIdMatch = Regex.Match(textToTest, ctIdRegex, RegexOptions.IgnoreCase).Groups[1].Value;
    // A failed match does not throw; Groups[1].Value is simply an empty string.
}
catch (ArgumentException)
{
    // Regex is wrong
}

Assuming that a ctid is always 4 digits, and an id is always 3-4-4-3 digits, and that either way it is enclosed in round brackets, I would do:
\((?:ctid (?<ctid>\d{4})|id (?<id>\d{3}-\d{4}-\d{4}-\d{3}))\)
This adds named groups and does validity checking at the same time. For example, you can use match.Groups["ctid"].Value to get the ctid value, or match.Groups["id"].Value to get the id value. Because there is validity checking, you'll never get (what I am assuming is) an invalid id like "(id 123)" (since it doesn't have the 3-4-4-3 pattern).
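A minimal sketch of how the named groups might be read in C#, using the pattern above. The class name IdParser, the method name Describe, and the "ctid:"/"id:" prefixes are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class IdParser
{
    // Pattern from the answer above: one alternative per id kind, each with its own named group.
    private static readonly Regex Pattern = new Regex(
        @"\((?:ctid (?<ctid>\d{4})|id (?<id>\d{3}-\d{4}-\d{4}-\d{3}))\)");

    // Hypothetical helper: returns "ctid:<value>", "id:<value>", or null when nothing valid is found.
    public static string Describe(string input)
    {
        Match match = Pattern.Match(input);
        if (!match.Success) return null;
        // Only the group belonging to the alternative that matched reports Success.
        if (match.Groups["ctid"].Success) return "ctid:" + match.Groups["ctid"].Value;
        return "id:" + match.Groups["id"].Value;
    }

    public static void Main()
    {
        Console.WriteLine(Describe("MyName (ctid 5555)"));               // ctid:5555
        Console.WriteLine(Describe("OtherName (id 555-5555-5555-555)")); // id:555-5555-5555-555
        Console.WriteLine(Describe("Bad (id 123)") ?? "no match");       // no match
    }
}
```

Checking Groups["ctid"].Success is what tells you which kind of value was found, since the group for the other alternative stays unmatched.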

Not sure what you want exactly
(?:(ct)?id)\s(.+?)\)
But this regex worked for me at
http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx
you just need to grab the 2nd group though...
If you don't really want the look around regex, then
\((ct)?id\s(.+?)\)
might do it as well (and is more readable for regex beginners)

Well, you're looking for 'ctid=' and the string has 'ctid '. You'll also need to escape the parenthesis in the lookahead (change ')' to '\)').
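As a rough illustration of those two fixes applied to the question's own pattern (and nothing else changed), here is a small sketch. The class and method names are invented for the example, and it only handles the "ctid" form:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundFix
{
    // The question's pattern with the two fixes applied:
    // "ctid=" becomes "ctid " and the ")" in the lookahead is escaped.
    public const string Pattern = @"(?<=ctid ).+(?=\))";

    // Hypothetical helper: returns the matched text, or an empty string when there is no match.
    public static string ExtractCtid(string input)
    {
        return Regex.Match(input, Pattern).Value;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractCtid("MyName (ctid 5555)")); // 5555
    }
}
```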

Related

c# regex clarification [duplicate]

What is the regular expression (in JavaScript if it matters) to only match if the text is an exact match? That is, there should be no extra characters at either end of the string.
For example, if I'm trying to match for abc, then 1abc1, 1abc, and abc1 would not match.
Use the start and end delimiters: ^abc$
It depends. You could
string.match(/^abc$/)
But that would not match the following string: 'the first 3 letters of the alphabet are abc. not abc123'
I think you would want to use \b (word boundaries):
var str = 'the first 3 letters of the alphabet are abc. not abc123';
var pat = /\b(abc)\b/g;
console.log(str.match(pat));
Live example: http://jsfiddle.net/uu5VJ/
If the former solution works for you, I would advise against using it.
That means you may have something like the following:
var strs = ['abc', 'abc1', 'abc2']
for (var i = 0; i < strs.length; i++) {
if (strs[i] == 'abc') {
//do something
}
else {
//do something else
}
}
While you could use
if (strs[i].match(/^abc$/)) {
//do something
}
It would be considerably more resource-intensive. My general rule of thumb: for a simple string comparison, use a conditional expression; for a more dynamic pattern, use a regular expression.
More on JavaScript regexes: https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions
Use "^" for the beginning of the line and "$" for the end of it. E.g.:
var re = /^abc$/;
Would match "abc" but not "1abc" or "abc1". You can learn more at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions

Get only integer value from a string which contains bracket { in C#

I have a simple, very simple regex pattern like:
private static string FORMAT_REGEX = @"\{(\d)\}";
I have a string like "I have {323} dollars" and I want to get only 323.
When I used:
Regex regex = new Regex(FORMAT_REGEX);
Match match = regex.Match(format);
if (match.Success)
{
return match.Groups[0].Value; // here comes {323} instead of 323
}
I'm sure that my pattern is wrong. What is the correct pattern ?
Only a small mistake.
You need a + sign after \d like this: \d+ to capture all digits.
And you need to get the first group: match.Groups[1].Value
Edit:
Here is a .NETFiddle
Groups[0] will always return the whole capture. You need to get the value of Groups[1].
Also, you need to capture multiple digits:
@"\{(\d+)\}";
// not
@"\{(\d)\}";
See the example at MSDN: Match.Groups Property for an example of just this, where you can capture multiple groups as well as the whole string. In that example they use \d{n} to capture exactly n digits.
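To make the difference between Groups[0] and Groups[1] concrete, here is a short sketch using the corrected pattern and the sample sentence from the question. The class name BraceNumber and the Extract helper are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class BraceNumber
{
    // Corrected pattern: \d+ captures one or more digits between the braces.
    private static readonly Regex FormatRegex = new Regex(@"\{(\d+)\}");

    // Hypothetical helper: returns the digits inside the braces, or null if there are none.
    public static string Extract(string format)
    {
        Match match = FormatRegex.Match(format);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        Match m = FormatRegex.Match("I have {323} dollars");
        Console.WriteLine(m.Groups[0].Value); // {323}  (the whole match)
        Console.WriteLine(m.Groups[1].Value); // 323    (the first capture group)
    }
}
```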

Pattern Matching c#

Let's say I have a text file containing the line below. I want to take both values within the quotation marks by matching between (" and "), so that I retrieve ABC and DEF and put them in a string list or something similar. What's the best way of doing this?
If EXAMPLEA("ABC") AND EXAMPLEB("DEF")
Assuming the value between the double quotes cannot contain escaped double quotes, something like this might work:
var text = "If EXAMPLEA(\"ABC\") AND EXAMPLEB(\"DEF\")";
Regex pattern = new Regex("\"[^\"]*\"");
foreach (Match match in pattern.Matches(text))
{
Console.WriteLine(match.Value.Trim('"'));
}
But this is only one of the many ways you could do it and maybe not the smartest way out there. Try something yourself!
Best way...
List<string> matches=Regex.Matches(File.ReadAllText(yourPath),@"(?<="")[^""]*(?="")")
.Cast<Match>()
.Select(x=>x.Value)
.ToList();
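One caveat worth checking with the lookaround version: the lookarounds do not consume the quotation marks, so the text between a closing quote and the next opening quote can match as well. The sketch below compares it with a capture-group pattern on the sample line from the question; the class and method names are invented for the example:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuoteExtraction
{
    // Lookbehind/lookahead version: quotes are checked but not consumed.
    public static string[] WithLookarounds(string text) =>
        Regex.Matches(text, @"(?<="")[^""]*(?="")")
             .Cast<Match>().Select(m => m.Value).ToArray();

    // Capture-group version: both quotes are consumed, group 1 holds the content.
    public static string[] WithCapturedGroup(string text) =>
        Regex.Matches(text, @"""([^""]*)""")
             .Cast<Match>().Select(m => m.Groups[1].Value).ToArray();

    public static void Main()
    {
        string line = "If EXAMPLEA(\"ABC\") AND EXAMPLEB(\"DEF\")";
        Console.WriteLine(string.Join(" | ", WithLookarounds(line)));   // ABC | ) AND EXAMPLEB( | DEF
        Console.WriteLine(string.Join(" | ", WithCapturedGroup(line))); // ABC | DEF
    }
}
```

If only ABC and DEF are wanted, consuming the quotes and reading group 1 is the safer choice.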
This pattern should do the trick:
\"([^"]*)\"
string str = "If EXAMPLEA(\"ABC\") AND EXAMPLEB(\"DEF\")";
MatchCollection matched = Regex.Matches(str, @"\""([^\""]*)\""");
foreach (Match match in matched)
{
Console.WriteLine(match.Groups[1].Value);
}
Note that the quotation marks are doubled in the actual code in order to escape them. And the code refers to group [1] to get just the part inside the parentheses.
IEnumerable<string> matches =
from Match match
in Regex.Matches(File.ReadAllText(filepath), @"\""([^\""]*)\""")
select match.Groups[1].Value;
Others have already posted some answers, but mine takes into account that you want just ABC and DEF in your example, without quotation marks, and saves them in an IEnumerable<string>.

Multiple pattern matching using RegEx

I'm trying to use RegEx to split a string into several objects. Each record is separated by a :, and each field is separated by a ~.
So sample data would look like:
:1~Name1:2~Name2:3~Name3
The RegEx I have so far is
:(?<id>\d+)~(?<name>.+)
This however will only match the first record, when really I would expect 3. How do I get the RegEx to return all matches rather than just the first?
Your last .+ is greedy, so it gobbles up the Name1 as well as the rest of the string.
Try
:(?<id>\d+)~(?<name>[^:]+)
This means that the Name can't have a : in it (which is probably OK for your data), and makes sure the name doesn't grab into the next field.
(And also use the Regex.Matches method which grabs all matches, not just the first).
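A quick way to see the effect of the greedy .+ is to count the matches each pattern produces on the sample data. This is only a sketch; RecordMatching and CountMatches are names chosen for the example:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RecordMatching
{
    // Hypothetical helper: number of matches Regex.Matches finds for a pattern.
    public static int CountMatches(string input, string pattern) =>
        Regex.Matches(input, pattern).Count;

    public static void Main()
    {
        string input = ":1~Name1:2~Name2:3~Name3";

        // Greedy .+ swallows everything after the first "~", so only one match remains.
        Console.WriteLine(CountMatches(input, @":(?<id>\d+)~(?<name>.+)"));    // 1
        Console.WriteLine(Regex.Match(input, @":(?<id>\d+)~(?<name>.+)").Groups["name"].Value);
        // Name1:2~Name2:3~Name3

        // [^:]+ stops at the next record separator, giving one match per record.
        Console.WriteLine(CountMatches(input, @":(?<id>\d+)~(?<name>[^:]+)")); // 3
    }
}
```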
Use:
var result = Regex.Matches(input, @":(?<id>\d+)~(?<name>[^:]+)").Cast<Match>()
.Select(m => new
{
Id = m.Groups["id"].Value,
Name = m.Groups["name"].Value
});
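You'd be better off using the string Split() method:
// RemoveEmptyEntries drops the empty entry produced by the leading ':'.
string[] records = myString.Split(new[] { ':' }, StringSplitOptions.RemoveEmptyEntries);
foreach (string rec in records)
{
    string[] fields = rec.Split('~');
    // use fields: fields[0] is the id, fields[1] is the name
}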

Regex accepting all strings, wrongly

I am trying to extract substrings if they have a certain format. The substring format is [CENAOD(xyz)]. I have written the following code, but when I run it in a loop it reports that every string matches, which is wrong. What have I done wrong?
string strRegex = @"(\[CENAOD\((\S|\W)*\)\])*";
string strCenaOd = sReader["intro"].ToString()
if (Regex.IsMatch(strCenaOd, strRegex, RegexOptions.IgnoreCase))
{
string = (want to read content of ( ) = xyz in example)
}
Remove the outer ( ... )*.
That says no match is a good match too.
Or use + instead of *.
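To see why the outer ( ... )* makes every string match, compare IsMatch with and without it. This is a small sketch; the class name, constant names, and sample strings are invented for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmptyMatchDemo
{
    // The question's pattern: the outer (...)* allows zero repetitions, i.e. an empty match.
    public const string Original = @"(\[CENAOD\((\S|\W)*\)\])*";
    // Same pattern with the outer group and star removed.
    public const string WithoutOuterStar = @"\[CENAOD\((\S|\W)*\)\]";

    public static bool Matches(string pattern, string input) =>
        Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);

    public static void Main()
    {
        Console.WriteLine(Matches(Original, "no tag here"));                // True (empty match)
        Console.WriteLine(Matches(WithoutOuterStar, "no tag here"));        // False
        Console.WriteLine(Matches(WithoutOuterStar, "x [CENAOD(xyz)] y"));  // True
    }
}
```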
Adding to @Kent's and @leppie's answers, the code surrounding the regex needs work, too. I think this is what you were trying for:
string strRegex = @"\[CENAOD\(([^)]*)\)\]";
string strCenaOd = sReader["intro"].ToString();
Match m = Regex.Match(strCenaOd, strRegex, RegexOptions.IgnoreCase);
if (m.Success)
{
    // Groups[1] is a Group object; .Value yields the captured text.
    string content = m.Groups[1].Value;
    // ...
}
IsMatch() is a simple yes-or-no check; it doesn't provide any way to retrieve the matched text.
I especially want to comment on (\S|\W)*, from your regex. First, \S|\W is a very inefficient way to match any character. . is usually all you need, but as Kent pointed out, [^)] (i.e., any character except )) is more appropriate in this case. Also, by placing the * outside the round brackets, you'll only ever capture the last character. ([^)]*) captures all of them. For more details, read this.
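The point about the position of the * can be checked directly by reading group 1 from both variants. A minimal sketch, with CaptureDemo and FirstGroup as made-up names:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Hypothetical helper: value of capture group 1 for the first match.
    public static string FirstGroup(string pattern, string input) =>
        Regex.Match(input, pattern, RegexOptions.IgnoreCase).Groups[1].Value;

    public static void Main()
    {
        string input = "[CENAOD(xyz)]";

        // Star outside the group: the group is overwritten on each repetition,
        // so only the last character is left in Groups[1].
        Console.WriteLine(FirstGroup(@"\[CENAOD\((\S|\W)*\)\]", input)); // z

        // Star inside the group: the whole run of characters is captured.
        Console.WriteLine(FirstGroup(@"\[CENAOD\(([^)]*)\)\]", input));  // xyz
    }
}
```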
if you said "all strings", how about:
\[CENAOD\([^\)]*\)\]
