Regular expression to extract complete words in C#

I would like to know how to extract complete words using a Regex in C#.
For example, given the input string:
This$#23 is-kt jkdls
I want to get the Regex matches:
This$#23
is-kt
jkdls
I need to extract the non-space words (which can contain numbers or special characters)
by specifying a Regex match pattern:
Regex myrex = new Regex("pattern");

MatchCollection matches = Regex.Matches("This$#23 is-kt jkdls", @"\S+");
foreach (Match match in matches)
    Console.WriteLine(match.Value);
Use \S+ to match words: it matches one or more consecutive non-whitespace characters.

var words = "This$#23 is-kt jkdls".Split(' ');
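As a rough sketch, the two approaches above can be compared side by side. The sample input here is an assumption (it adds a double space and a tab to the original string) to show why \S+ or a split that drops empty entries may be preferable to a plain Split(' ').

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical input: the original words, separated by a double space and a tab.
string input = "This$#23  is-kt\tjkdls";

// \S+ matches each maximal run of non-whitespace characters.
string[] viaRegex = Regex.Matches(input, @"\S+")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

// Split on space and tab; RemoveEmptyEntries drops the empty strings
// that consecutive separators would otherwise produce.
string[] viaSplit = input.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);

Console.WriteLine(string.Join("|", viaRegex));
Console.WriteLine(string.Join("|", viaSplit));
```

Both calls should yield the same three words; the regex version also handles other whitespace such as newlines without listing each separator.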

Related

Find all occurrences of regex pattern in string

I have the following code:
var varFormula = "IIF(ABCCF012HIZ3000=0,0,ABCCF012HCZ3000/ABCCF012HIZ3000)";
MatchCollection match = Regex.Matches(varFormula, @"^AB([CC|DD|EE]+[F|G])[0-9]{3}(HI|IC|HC)Z[A-Z0-9]{4}$", RegexOptions.IgnoreCase);
From above, I want to extract the following from varFormula but I'm not able to get any matches/groups:
ABCCF012HCZ3000
ABCCF012HIZ3000
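Your regex uses ^ and $, which anchor the match to the start and end of the input (or of a line when RegexOptions.Multiline is set), so it can never match codes embedded inside varFormula. Also note that [CC|DD|EE] and [F|G] are character classes, not alternations: they match any single listed character, including a literal |. Try the following:
var varFormula = @"IIF(ABCCF012HIZ3000=0,0,ABCCF012HCZ3000/ABCCF012HIZ3000)";
MatchCollection match = Regex.Matches(varFormula, @"AB(?:CC|DD|EE)[FG][0-9]{3}(?:HI|IC|HC)Z[A-Z0-9]{4}", RegexOptions.IgnoreCase);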
Should give you three matches:
Match 1 ABCCF012HIZ3000
Match 2 ABCCF012HCZ3000
Match 3 ABCCF012HIZ3000
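The effect of the anchors can be checked with a small sketch. It assumes the intent of [CC|DD|EE]+[F|G] was "one of CC, DD or EE, followed by F or G", and it adds a Distinct() step (not part of the answer above) because the question lists only two distinct codes.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string formula = "IIF(ABCCF012HIZ3000=0,0,ABCCF012HCZ3000/ABCCF012HIZ3000)";

// Assumed intent of the original pattern, written with real alternations.
string core = @"AB(?:CC|DD|EE)[FG][0-9]{3}(?:HI|IC|HC)Z[A-Z0-9]{4}";

// Anchored: the whole input would have to be a single code, so nothing matches.
int anchoredCount = Regex.Matches(formula, "^" + core + "$", RegexOptions.IgnoreCase).Count;

// Unanchored: every occurrence inside the formula is found.
string[] codes = Regex.Matches(formula, core, RegexOptions.IgnoreCase)
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

// Distinct() keeps the first occurrence of each code.
string[] unique = codes.Distinct().ToArray();

Console.WriteLine(anchoredCount);
Console.WriteLine(string.Join(", ", codes));
Console.WriteLine(string.Join(", ", unique));
```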

Regex for a URL with illegal characters \\

From the following string:
google.com/local/reviews?placeid\\u003dChIJ070npYRaeEgRZNoxwuYYrew\\u0026q\\u003d
I want to extract u003dChIJ070npYRaeEgRZNoxwuYYrew, although this value will change every time.
I have tried
Regex r = new Regex(@"(?<=\placeid\\\s+)\p{L}+");
Which does not work.
I am guilty of neglecting my knowledge of regex, so I apologise if this is painfully easy.
The string contains no whitespace characters for \s+ to match, and there are two backslashes after placeid, not one.
\p{L}+ matches only letters, while the value you want also contains digits.
(?<=placeid\\\\\s*)[\p{L}\p{N}]+
For example
string pattern = @"(?<=placeid\\\\\s*)[\p{L}\p{N}]+";
string input = @"google.com/local/reviews?placeid\\u003dChIJ070npYRaeEgRZNoxwuYYrew\\u0026q\\u003d";
Match m = Regex.Match(input, pattern);
Console.WriteLine(m.Value);
Output
u003dChIJ070npYRaeEgRZNoxwuYYrew

Regex exclude ":" and a whitespace if they exist

So I have a regex here:
var text = new Regex(@"(?<=Paybacks).*", RegexOptions.IgnoreCase);
This looks for the line that starts with Paybacks. It currently prints ": blah".
The prefix can be "Paybacks", "Paybacks:", "Paybacks " or even "Paybacks" followed by thousands of whitespace characters. How can I modify this regex so that, after "Paybacks", it ignores a colon and any whitespace that may or may not exist?
I've been playing with it in regex101 and this seems to be working, but is there a better way?
(?<=Volatility(:\s)).*
In these situations, you'd better use a regex with a capturing group:
var pattern = new Regex(@"Paybacks[\s:]*(.*)", RegexOptions.IgnoreCase);
Then, you can use
var output = pattern.Match(text).Groups[1].Value;
For example:
var texts = new List<string> { "Paybacks: blah", "Paybacks:blah", "Paybacks blah" };
var pattern = new Regex(@"Paybacks[\s:]*(.*)", RegexOptions.IgnoreCase);
texts.ForEach(text => Console.WriteLine(pattern.Match(text).Groups[1].Value));
This prints blah three times.
You might also match optional colons and whitespace chars in the lookbehind, and start the match at the first char that is neither whitespace nor a colon:
(?<=Paybacks[:\s]*)[^\s:].*
The pattern matches:
(?<= Positive lookbehind, assert what is on the left is
Paybacks Match literally
[:\s]* Optionally match either : or a whitespace char using a character class
) Close lookbehind
[^\s:].* Match a single non whitespace char other than : and the rest of the line
var regex = new Regex(@"(?<=Paybacks[:\s]*)[^\s:].*", RegexOptions.IgnoreCase);
string[] strings = { "Paybacks: blah", "Paybacks blah", "Paybacks blah" };
foreach (string s in strings)
{
    Console.WriteLine(regex.Match(s).Value);
}
Output
blah
blah
blah
If the order should be a single optional colon and optional whitespace chars, you can make the colon optional and the quantifier for the whitespace chars 0 or more using :?\s*
(?<=Paybacks:?\s*)[^\s:].*
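The capturing-group and lookbehind variants can be compared in a short sketch. The sample lines are assumptions, including a bare "Paybacks" line added to show the one case where the two differ: the group variant still matches (with an empty group), while the lookbehind variant finds no match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sample lines; the last one has nothing after the keyword.
string[] lines = { "Paybacks: blah", "Paybacks:blah", "Paybacks     blah", "Paybacks" };

var withGroup = new Regex(@"Paybacks[\s:]*(.*)", RegexOptions.IgnoreCase);
var withLookbehind = new Regex(@"(?<=Paybacks:?\s*)[^\s:].*", RegexOptions.IgnoreCase);

var groupValues = new List<string>();
var lookbehindHits = new List<bool>();

foreach (string line in lines)
{
    // Match never returns null; a failed match has Success == false and empty values.
    string captured = withGroup.Match(line).Groups[1].Value;
    Match m = withLookbehind.Match(line);

    groupValues.Add(captured);
    lookbehindHits.Add(m.Success);
    Console.WriteLine($"group=[{captured}] lookbehind=[{m.Value}] success={m.Success}");
}
```

Because Regex.Match returns an unsuccessful Match rather than null, checking Success is what distinguishes "no match" from "matched an empty value".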

Regex Match.NextMatch() for a string that is not consistent

I have this input string:
AT+CMGL=4\r\r\n+CMGL: 1,1,,155\r\nDFGDF312GF4J5457JG8J0JGKFJ345G67JHGFGHJ06FD45HJG86J958F4FHSGSDGFH23FJ24HGJH58G4D7D465HDK31HFDJCHGH8V7GD45231DFGF314J567V6GGK4GFJCHGKVGDJX765GHFCJX2X4537CCGHGK9VHJ3C2FJXJCGH\r\n+CMGL: 2,1,,126\r\nDFGDF312GF4J5457JG8J0JGKFJ345G67JHGFGHJ06FD45HJG86J958F4FHSGSDGFH23FJ24HGJH58G4D7D465HDK31HFDJCHGH8V7GD45231DFGF314J567V6GGK4GFJCHGKVGDJX765GHFCJX2X4537CCGHGK9VHJ3C2FJXJCGH\r\n+CMGL: 3,1,,148\r\nDFGDF312GF4J5457JG8J0JGKFJ345G67JHGFGHJ06FD45HJG86J958F4FHSGSDGFH23FJ24HGJH58G4D7D465HDK31HFDJCHGH8V7GD45231DFGF314J567V6GGK4GFJCHGKVGDJX765GHFCJX2X4537CCGHGK9VHJ3C2FJXJCGH\r\n\r\nOK\r\n
I would like to do a regex match on this one extracting two capture groups, and iterate through each match with the NextMatch() method.
I can achieve a partial match excluding the start (AT+CMGL=4\r\r\n) and end (\r\nOK\r\n) of this string which would be different for the first and last iteration.
This is the regex I use for the partial match I'm able to achieve:
\+CMGL: \d+,\d+,,(\d+)\\r\\n(.*?)\\r\\n
What would the correct syntax of the regex look like to get a complete match?
EDIT: I would like to capture the pdu length (155) and the pdu itself (DFGDF312GF4J5457JG8J0JGKFJ345G67JHGFGHJ06FD45HJG86J958F4FHSGSDGFH23FJ24HGJH58G4D7D465HDK31HFDJCHGH8V7GD45231DFGF314J567V6GGK4GFJCHGKVGDJX765GHFCJX2X4537CCGHGK9VHJ3C2FJXJCGH) for each NextMatch().
Your regex is correct; you just need to enable single-line mode (RegexOptions.Singleline), so that . also matches newline characters:
Regex myRegex = new Regex(yourRegex, RegexOptions.IgnoreCase | RegexOptions.Singleline);
foreach (Match m in myRegex.Matches(yourText))
{
    string pduLength = m.Groups[1].Value; // pdu length
    string pduBuffer = m.Groups[2].Value; // pdu buffer
}
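Since the question asks about NextMatch() specifically, here is a minimal sketch of that iteration style. It rests on two assumptions not confirmed by the question: the response contains real CR/LF characters (not the two-character sequences \r and \n), and the PDUs are shortened placeholder values. Under those assumptions the pattern uses [^\r\n]+ for the PDU line, which is a swapped-in alternative to (.*?) with Singleline.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Shortened, hypothetical modem response; real CR/LF characters are assumed.
string response =
    "AT+CMGL=4\r\r\n+CMGL: 1,1,,155\r\nABC123\r\n+CMGL: 2,1,,126\r\nDEF456\r\n\r\nOK\r\n";

// Group 1: PDU length. Group 2: the PDU line that follows the header.
var regex = new Regex(@"\+CMGL: \d+,\d+,,(\d+)\r\n([^\r\n]+)\r\n");

var found = new List<string>();

// Match() returns the first match; NextMatch() resumes after the previous one.
for (Match m = regex.Match(response); m.Success; m = m.NextMatch())
{
    found.Add(m.Groups[1].Value + ":" + m.Groups[2].Value);
    Console.WriteLine($"length={m.Groups[1].Value} pdu={m.Groups[2].Value}");
}
```

The leading AT+CMGL=4 echo and the trailing OK are simply skipped, because the pattern only starts matching at a +CMGL: header.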

string manipulation in regex

I have a problem with string manipulation.
Here is the code:
string str = "LDAP://company.com/OU=MyOU1 Control,DC=MyCompany,DC=com";
Regex regex = new Regex("OU=\\w+");
var result = regex.Matches(str);
var strList = new List<string>();
foreach (var item in result)
{
strList.Add(item.ToString().Remove(0,3));
}
Console.WriteLine(string.Join("/",strList));
The result I am getting is "MyOU1" instead of "MyOU1 Control".
Please help, thanks.
If you want the space character to be matched as well, you need to include it in your regex. \w only matches word characters, which do not include spaces.
Regex regex = new Regex(@"OU=[\w\s]+");
This matches word characters (\w) and whitespace characters (\s).
(The @ in front of the string is just for convenience: if you use it, you don't need to escape backslashes.)
Either add space to the allowed list (\w doesn't allow space) or use the knowledge that comma can be used as a separator.
Regex regex = new Regex("OU=(\\w|\\s)+");
OR
Regex regex = new Regex("OU=[^,]+");
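A capturing group can also replace the Remove(0,3) step from the question. The sketch below uses the comma-separator pattern; the second OU in the sample string is a hypothetical addition to show the join.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical path: the original string with an extra OU=Sales component.
string str = "LDAP://company.com/OU=MyOU1 Control,OU=Sales,DC=MyCompany,DC=com";

// Group 1 captures everything after "OU=" up to the next comma.
string[] names = Regex.Matches(str, @"OU=([^,]+)")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

string joined = string.Join("/", names);
Console.WriteLine(joined);
```

Capturing the name directly avoids depending on the length of the "OU=" prefix when trimming the match.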
