Regex Match.NextMatch() for a string that is not consistent


I have this input string:
AT+CMGL=4\r\r\n+CMGL: 1,1,,155\r\nDFGDF312GF4J5457JG8J0JGKFJ345G67JHGFGHJ06FD45HJG86J958F4FHSGSDGFH23FJ24HGJH58G4D7D465HDK31HFDJCHGH8V7GD45231DFGF314J567V6GGK4GFJCHGKVGDJX765GHFCJX2X4537CCGHGK9VHJ3C2FJXJCGH\r\n+CMGL: 2,1,,126\r\nDFGDF312GF4J5457JG8J0JGKFJ345G67JHGFGHJ06FD45HJG86J958F4FHSGSDGFH23FJ24HGJH58G4D7D465HDK31HFDJCHGH8V7GD45231DFGF314J567V6GGK4GFJCHGKVGDJX765GHFCJX2X4537CCGHGK9VHJ3C2FJXJCGH\r\n+CMGL: 3,1,,148\r\nDFGDF312GF4J5457JG8J0JGKFJ345G67JHGFGHJ06FD45HJG86J958F4FHSGSDGFH23FJ24HGJH58G4D7D465HDK31HFDJCHGH8V7GD45231DFGF314J567V6GGK4GFJCHGKVGDJX765GHFCJX2X4537CCGHGK9VHJ3C2FJXJCGH\r\n\r\nOK\r\n
I would like to do a regex match on this one extracting two capture groups, and iterate through each match with the NextMatch() method.
I can achieve a partial match excluding the start (AT+CMGL=4\r\r\n) and end (\r\nOK\r\n) of this string which would be different for the first and last iteration.
This is the regex I use for the partial match I'm able to achieve:
\+CMGL: \d+,\d+,,(\d+)\\r\\n(.*?)\\r\\n
What would the correct regex syntax look like to get a complete match?
EDIT: I would like to capture the pdu length (155) and the pdu itself (nDFGDF312GF4J5457JG8J0JGKFJ345G67JHGFGHJ06FD45HJG86J958F4FHSGSDGFH23FJ24HGJH58G4D7D465HDK31HFDJCHGH8V7GD45231DFGF314J567V6GGK4GFJCHGKVGDJX765GHFCJX2X4537CCGHGK9VHJ3C2FJXJCGH) for each NextMatch().

Your regex is correct; you just need to use Singleline mode with it:
Regex myRegex = new Regex(yourRegex, RegexOptions.IgnoreCase | RegexOptions.Singleline);
foreach (Match m in myRegex.Matches(yourText))
{
    string pduLength = m.Groups[1].Value; // PDU length
    string pduBuffer = m.Groups[2].Value; // PDU buffer
}
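As a rough, self-contained sketch of the same idea: the class name CmglParser and the method Parse are made up for illustration, and the code assumes the modem response contains real CR/LF characters (so the pattern uses \r\n rather than the doubled backslashes from the question). It walks the matches with NextMatch() and collects the PDU length and PDU of each entry.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CmglParser
{
    // Assumption: the response holds real CR/LF characters, not literal "\r\n" text.
    private static readonly Regex Entry = new Regex(
        @"\+CMGL: \d+,\d+,,(\d+)\r\n(.*?)\r\n",
        RegexOptions.Singleline);

    // Returns one (PDU length, PDU) pair per +CMGL entry found in the response.
    public static List<KeyValuePair<int, string>> Parse(string response)
    {
        var result = new List<KeyValuePair<int, string>>();
        for (Match m = Entry.Match(response); m.Success; m = m.NextMatch())
        {
            int pduLength = int.Parse(m.Groups[1].Value);
            string pdu = m.Groups[2].Value;
            result.Add(new KeyValuePair<int, string>(pduLength, pdu));
        }
        return result;
    }
}
```

The leading AT+CMGL=4 echo and the trailing OK are skipped automatically, because the pattern is not anchored and only the +CMGL entries match it.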

Related

Regex for a URL with illegal characters \\

From the following string:
google.com/local/reviews?placeid\\u003dChIJ070npYRaeEgRZNoxwuYYrew\\u0026q\\u003d
I want to extract u003dChIJ070npYRaeEgRZNoxwuYYrew, although this value will change every time.
I have tried
Regex r = new Regex(@"("(?<=\placeid\\\s+)\p{L}+");
Which does not work.
I am guilty of neglecting my knowledge of regex, so I apologise if this is painfully easy.

There are no whitespace characters in the string, so \s+ (one or more) cannot match, and placeid is followed by two backslashes, not one.
Also, \p{L}+ matches only letters, while the value you want contains digits as well.
(?<=placeid\\\\\s*)[\p{L}\p{N}]+
For example
string pattern = @"(?<=placeid\\\\\s*)[\p{L}\p{N}]+";
string input = @"google.com/local/reviews?placeid\\u003dChIJ070npYRaeEgRZNoxwuYYrew\\u0026q\\u003d";
Match m = Regex.Match(input, pattern);
Console.WriteLine(m.Value);
Output
u003dChIJ070npYRaeEgRZNoxwuYYrew

Regular Expression to just complete word C#

I would like to know how to extract complete words using a Regex in C#
For example, my String input:
This$#23 is-kt jkdls
I want to get Regex Match as
This$#23
is-kt
jkdls
I need to extract non space words [which can have numbers or special characters]
by specifying Regex Match pattern
Regex myrex = new Regex("pattern");

MatchCollection matches = Regex.Matches("This$#23 is-kt jkdls", @"\S+");
foreach (Match match in matches)
    Console.WriteLine(match.Value);
Use \S+ to match words.

var words = input.Split(' '); // input is the string to split

regex pattern for tags needed

Howzit,
I need help with the following please.
I need to find tags in a string. These tags start with {{ and end with }}, there will be multiple tags in the string I receive.
So far I have this, but it doesn't find any matches, what am I missing here?
List<string> list = new List<string>();
string pattern = "{{*}}";
Regex r = new Regex(pattern, RegexOptions.IgnoreCase);
Match m = r.Match(text);
while (m.Success)
{
list.Add(m.Groups[0].Value);
m = m.NextMatch();
}
return list;
even tried string pattern = "{{[A-Za-z0-9]}}";
thanx
PS. I know close to nothing about regex.

Not only do you want to use {{.+?}} as your regex, you also need to pass RegexOptions.Singleline. That treats your entire string as a single line, so the . also matches \n (which it normally will not do).

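Try {{.+}}. The .+ means there has to be at least one character as part of the tag. Note that .+ is greedy, so when the string contains several tags it matches from the first {{ to the last }}.
EDIT:
To capture the string containing your tags you can do {{(.+)}} and then tokenize your match with the Tokenize or Scanner class?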

I would recommend trying something like the following:
List<string> list = new List<string>();
string pattern = "{{(.*?)}}";
Regex r = new Regex(pattern, RegexOptions.IgnoreCase);
Match m = r.Match(text);
while (m.Success)
{
list.Add(m.Groups[1].Value);
m = m.NextMatch();
}
return list;
The regex specifies:
{{    # match {{ literally
(     # begin capturing into group #1
.*?   # match any characters, from zero to infinite, but be lazy*
)     # end capturing group
}}    # match }} literally
*"Lazy" means the engine first tries to match the }} that follows, and only goes back to .*? to add one more character to the capturing group when }} does not match at the current position. The group therefore ends at the first }}.
I changed your code by modifying the regex and by extracting the first capturing group from the match object (m.Groups[1].Value) instead of the entire match.
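A compact sketch of the lazy-quantifier approach, under a few assumptions: the class name TagFinder and the method FindTags are hypothetical, the braces are escaped for readability (the .NET engine also treats the unescaped {{ as literals), and Singleline is added so a tag may span lines.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagFinder
{
    // Lazy .*? stops at the first }} so each tag is matched on its own.
    private static readonly Regex Tag = new Regex(
        @"\{\{(.*?)\}\}",
        RegexOptions.Singleline);

    // Returns the text between {{ and }} for every tag in the input.
    public static List<string> FindTags(string text)
    {
        var tags = new List<string>();
        foreach (Match m in Tag.Matches(text))
        {
            tags.Add(m.Groups[1].Value);
        }
        return tags;
    }
}
```

For an input such as "Hello {{name}}, order {{id}}" this yields the two entries name and id, whereas a greedy .* would return a single entry running from the first {{ to the last }}.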

{{.*?}} or
{{.+?}}
. - matches any character
? - makes the quantifier lazy (it does not consume the next part of the pattern)

C# Regular Expressions

I have a string that has multiple regular expression groups, and some parts of the string that aren't in the groups. I need to replace a character, in this case ^ only within the groups, but not in the parts of the string that aren't in a regex group.
Here's the input string:
STARTDONTREPLACEME^ENDDONTREPLACEME~STARTREPLACEME^ENDREPLACEME~STARTREPLACEME^BLAH^ENDREPLACEME~STARTDONTREPLACEME^BLAH^ENDDONTREPLACEME~
Here's what the output string should look like:
STARTDONTREPLACEME^ENDDONTREPLACEME~STARTREPLACEMEENDREPLACEME~STARTREPLACEMEBLAHENDREPLACEME~STARTDONTREPLACEME^BLAH^ENDDONTREPLACEME~
I need to do it using C# and can use regular expressions.
I can match the string into groups of those that should and shouldn't be replaced, but am struggling on how to return the final output string.

I'm not sure I get exactly what you're having trouble with, but it didn't take long to come up with this result:
string strRegex = @"STARTREPLACEME(.+)ENDREPLACEME";
RegexOptions myRegexOptions = RegexOptions.None;
Regex myRegex = new Regex(strRegex, myRegexOptions);
string strTargetString = @"STARTDONTREPLACEME^ENDDONTREPLACEME~STARTREPLACEME^ENDREPLACEME~STARTREPLACEME^BLAH^ENDREPLACEME~STARTDONTREPLACEME^BLAH^ENDDONTREPLACEME~";
string strReplace = "STARTREPLACEMEENDREPLACEME";
return myRegex.Replace(strTargetString, strReplace);
By using my favorite online Regex tool: http://regexhero.net/tester/
Is that helpful?

Regex rgx = new Regex(
    @"\^(?=(?>(?:(?!(?:START|END)(?:DONT)?REPLACEME).)*)ENDREPLACEME)");
string s1 = rgx.Replace(s0, String.Empty);
Explanation: Each time a ^ is found, the lookahead scans ahead for an ending delimiter (ENDREPLACEME). If it finds one without seeing any of the other delimiters first, the match must have occurred inside a REPLACEME group. If the lookahead reports failure, it indicates that the ^ was found either between groups or within a DONTREPLACEME group.
Because lookaheads are zero-width assertions, only the ^ will actually be consumed in the event of a successful match.
Be aware that this will only work if delimiters are always properly balanced and groups are never nested within other groups.

If you are able to separate into groups that should be replaced and those that shouldn't, then instead of providing a single replacement string, you should be able to use a MatchEvaluator (a delegate that takes a Match and returns a string) to make the decision of which case it is currently dealing with and return the replacement string for that group alone.
You may also use an additional regex inside the MatchEvaluator. This solution produces the expected output:
// Lazy .+? so that each REPLACEME group is matched separately
Regex outer = new Regex(@"STARTREPLACEME.+?ENDREPLACEME", RegexOptions.Compiled);
Regex inner = new Regex(@"\^", RegexOptions.Compiled);
// start holds the input string
string replaced = outer.Replace(start, m =>
{
    return inner.Replace(m.Value, String.Empty);
});
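For reference, a minimal self-contained version of the MatchEvaluator idea. The class name CaretStripper and the method Strip are invented for this sketch, and it assumes groups are never nested. It uses a lazy quantifier so that a DONTREPLACEME group sitting between two REPLACEME groups would not be swallowed by one long match.

```csharp
using System.Text.RegularExpressions;

public static class CaretStripper
{
    // Matches one STARTREPLACEME...ENDREPLACEME group at a time (lazy quantifier).
    private static readonly Regex ReplaceGroup = new Regex(
        @"STARTREPLACEME.*?ENDREPLACEME",
        RegexOptions.Singleline);

    // Removes every ^ that sits inside a REPLACEME group; text outside is untouched.
    public static string Strip(string input)
    {
        return ReplaceGroup.Replace(input, m => m.Value.Replace("^", string.Empty));
    }
}
```

Because STARTDONTREPLACEME does not contain the substring STARTREPLACEME, the DONTREPLACEME groups never match the outer pattern and keep their ^ characters.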

RegEx to extract characters in a string

I need to extract a set of characters from a string. I plan on using the Regex.Match method (C#) but I am unclear about the regex pattern to use. I want to extract a pattern that starts with // and ends with ...
The length of the matched string is variable, but the start and end characters will always be the same. In DOS, I would have done something like the following:
//*...
but I know this is not the correct syntax for RegEx.

Try with pattern
"//.*?\.\.\."
or
"//.*?\.{3}"
Sample code
string data = @"some codes //to double check...
another codes //done...
//to do...";
MatchCollection matches = Regex.Matches(data, @"//(.*?)\.\.\.");
foreach (Match m in matches) {
    Console.WriteLine(m.Groups[1].Value);
}
Output
to double check
done
to do
