C# Replace everything except two cases

How can I do something like this?
new Regex("([^my]|[^test])").Replace("Thats my working test", "");
I want to get this:
my test
But I get an empty string, because every character is replaced with nothing.
Thank you in advance!

You can use this lookahead based regex:
new Regex(@"(?!\b(?:my|test)\b)\b(\w+)\s*").Replace("Thats my working test", "");
//=> my test
Your use of negation in a character class is incorrect here: ([^my]|[^test])
Inside a character class each character is checked individually, not as a string, so [^my] matches any single character other than m or y, and the alternation as a whole matches every character.

Use this regex replacement:
new Regex(@"\b(?!my|test)\w+\s?").Replace("Thats my working test", "");
\b Asserts the position before our word to check.
(?! Negative lookahead - asserts that our match is NOT:
my|test The character sequences "my" or "test".
)
\w+ Then match the word because it's what we want.
\s? And scrap the whitespace after it if it's there, too.
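The lookahead approach above can be generalized to an arbitrary list of words. The following is only a sketch: the helper name KeepOnlyWords is hypothetical, and it assumes the words to keep should match as whole words (so "mystery" is removed even though it starts with "my"), which is slightly stricter than the pattern shown above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeepWordsDemo
{
    // Hypothetical helper: removes every word that is not in `keep`.
    public static string KeepOnlyWords(string input, params string[] keep)
    {
        // Escape each word so regex metacharacters in it are treated literally.
        string alternation = string.Join("|", keep.Select(Regex.Escape));

        // \b(?!(?:my|test)\b) : at a word start, fail if the whole word is in the keep list.
        // \w+\s*              : otherwise consume the word and any whitespace after it.
        string pattern = @"\b(?!(?:" + alternation + @")\b)\w+\s*";

        return Regex.Replace(input, pattern, "");
    }

    public static void Main()
    {
        Console.WriteLine(KeepOnlyWords("Thats my working test", "my", "test")); // my test
    }
}
```

Note the verbatim string prefix (@): without it, C# would interpret \b as a backspace character and reject \w as an unknown escape sequence.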

I suggest the following regex:
var res = new Regex(@"my(?:$|[\s\.;\?\!,])|test(?:$|[\s\.;\?\!,])").Replace("Thats my working test", "");
Upd: Or even simpler:
var res = new Regex(@"my($|[\s])|test($|[\s])").Replace("Thats my working test", "");
Upd2: If you don't know in advance which words you'll use, you can make it more flexible:
private string ExeptWords(string input, string[] exept){
string tmpl = @"{0}|[\s]";
var regexp = string.Join("|", exept.Select(s => string.Format(tmpl, s)));
return new Regex(regexp).Replace(input, "");
}

Related

How to capture hyphen, space or none and ignore case?

I am trying to write a Regex in C# to capture all these potential strings:
Test Pre-Requisite
Test PreRequisite
Test Pre Requisite
Of course the user could also enter any possible case. So it would be great to be able to ignore case. The best I can do is:
Regex TestPreReqRegex = new Regex("Test Pre[- rR]");
if (TestPreReqRegex.IsMatch(StringToCompare)){
// Do Stuff
}
But this doesn't capture "Test PreRequisite" and also doesn't capture lower case. How can I fix this? Any help is much appreciated.
If you're trying to match the entire string, use:
Regex TestPreReqRegex = new Regex("^Test Pre[- ]?Requisite$", RegexOptions.IgnoreCase);
If you're looking for partial matches, then change the pattern to:
\bTest Pre[- ]?Requisite
Or:
\bTest Pre[- ]?R
Pattern details:
^ - Beginning of string.
\b - Word boundary.
[- ]? - Match a hyphen or a space character between zero and one times.
$ - End of string.
C# Demo:
var inputs = new[]
{ "Test Pre-Requisite", "Test PreRequisite", "Test Pre Requisite" };
Regex TestPreReqRegex = new Regex("^Test Pre[- ]?Requisite$",
RegexOptions.IgnoreCase);
foreach (string s in inputs)
{
Console.WriteLine("'{0}' is {1}.", s,
TestPreReqRegex.IsMatch(s) ? "a match" : "not a match");
}
Output:
'Test Pre-Requisite' is a match.
'Test PreRequisite' is a match.
'Test Pre Requisite' is a match.

Regex without taking care of escape codes

I want to validate a string like this (netsh cmd output):
"\r\nR‚servations d'URLÿ:\r\n--------------------\r\n\r\n URL r‚serv‚e : https://+:443/SomeWebSite/ \r\n Utilisateurÿ: AUTORITE NT\\SERVICE R\u0090SEAU\r\n \u0090couterÿ: Yes\r\n D‚l‚guerÿ: Yes\r\n SDDLÿ: D:(A;;GA;;;NS) \r\n\r\n\r\n"
with this pattern:
"URL .+https:\/\/\+:443\/SomeWebSite\/.+Yes.+Yes.+SDDL.+"
So, I intend to detect this kind of strings (xxxxx is something(+)):
xxxxxURLxxxxxhttps://+:443/SomeWebSite/xxxxxYesxxxxxYesxxxxxSDDLxxxx
I wrote this code in C# to do it but my expression still doesn't work:
string output = "\r\nR‚servations d'URLÿ:\r\n--------------------\r\n\r\n URL r‚serv‚e : https://+:443/SomeWebSite/ \r\n Utilisateurÿ: AUTORITE NT\\SERVICE R\u0090SEAU\r\n \u0090couterÿ: Yes\r\n D‚l‚guerÿ: Yes\r\n SDDLÿ: D:(A;;GA;;;NS) \r\n\r\n\r\n";
output = output.Replace(Environment.NewLine, ""); //==> output2=="R‚servations d'URLÿ:-----------
Regex testUrlOpened = new Regex(output, RegexOptions.Singleline);
MessageBox.Show(testUrlOpened.IsMatch(@"URL").ToString()); // ==> False
MessageBox.Show(testUrlOpened.IsMatch(@".+URL.+").ToString()); // ==> False
MessageBox.Show(testUrlOpened.IsMatch(@"URL .+https:\/\/\+:443\/SomeWebSite\/.+Yes.+Yes.+SDDL.+").ToString()); // ==> False
So I suppose that I've another issue with regex in c#...
May be encoding issue?
Start by removing the escape codes expected in the string. Depending on your use case, it might be better to remove them all (C# escape codes).
output = output.Replace("\n", "").Replace("\r", "").Replace("\t", "");
Now you have a single line string, you can do the regex matching
.+URL.+https:\/\/.+:443\/SomeWebSite\/.+Yes.+Yes.+SDDL.+
Notice the following:
1- ^ and $ anchor the match to the exact beginning and end of the string. If the target string sits inside a longer line, using them will cause the match to fail.
2- You need to escape the characters that have a special meaning in a regex.
3- To match "any character except newline, one or more times", use .+
I hope this helps.
You can use Regex.Unescape to unescape the string, and then do your regex match :
var output = @"\r\nR‚servations d'URLÿ:\r\n--------------------\r\n\r\n URL r‚serv‚e : https://+:443/SomeWebSite/ \r\n Utilisateurÿ: AUTORITE NT\\SERVICE R\u0090SEAU\r\n \u0090couterÿ: Yes\r\n D‚l‚guerÿ: Yes\r\n SDDLÿ: D:(A;;GA;;;NS) \r\n\r\n\r\n";
output = Regex.Unescape(output).Dump(); // Dump() is a LINQPad extension; drop it outside LINQPad
var foundUrl = Regex.IsMatch(output, @"URL .+ https://\+:443/SomeWebSite/.+Yes.+Yes.+SDDL.+", RegexOptions.Singleline);
+ indicates 1 or more of the previously stated pattern, if we put the pattern (.|\n), which matches anything, in front of those +'s, you'll be all set, without having to remove or account for escape codes.
^(.|\n)+URL(.|\n)+https://(.|\n)+:443/SomeWebSite/(.|\n)+Yes(.|\n)+Yes(.|\n)+SDDL(.|\n)+$
EDIT: The risk of doing something like this instead of sanitizing your string first is that you may get false positives because there could be any character separating the matches, all this regex does is ensure that somewhere in the string, in order, are the strings
"URL", "https://", ":443/SomeWebSite/", "Yes", "Yes", "SDDL"
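As an alternative to writing (.|\n) by hand, RegexOptions.Singleline (already used in the question's code) makes . match newlines as well. The sketch below illustrates the difference; the Sample string is a made-up, shortened stand-in for the netsh output, not the real data.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Made-up, shortened stand-in for the multi-line netsh output in the question.
    public const string Sample =
        "URL reserved : https://+:443/SomeWebSite/\r\n Listen: Yes\r\n Delegate: Yes\r\n SDDL: D:(A;;GA;;;NS)\r\n";

    public const string Pattern = @"URL .+https://\+:443/SomeWebSite/.+Yes.+Yes.+SDDL.+";

    // The pattern goes into the Regex constructor, the text to test goes into IsMatch.
    public static bool Matches(RegexOptions options)
    {
        return new Regex(Pattern, options).IsMatch(Sample);
    }

    public static void Main()
    {
        // Without Singleline, '.' stops at '\n', so the pattern cannot span lines.
        Console.WriteLine(Matches(RegexOptions.None));       // False
        // With Singleline, '.' also matches '\n'.
        Console.WriteLine(Matches(RegexOptions.Singleline)); // True
    }
}
```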
It was that simple. The last issue was that I had swapped the two arguments: the regular expression belongs in the Regex constructor and the input string in the IsMatch method... :(
So final code is:
string output = "\r\nR‚servations d'URLÿ:\r\n--------------------\r\n\r\n URL r‚serv‚e : https://+:443/SomeWebSite/ \r\n Utilisateurÿ: AUTORITE NT\\SERVICE R\u0090SEAU\r\n \u0090couterÿ: Yes\r\n D‚l‚guerÿ: Yes\r\n SDDLÿ: D:(A;;GA;;;NS) \r\n\r\n\r\n";
output = output.Replace(Environment.NewLine, ""); //==> output2=="R‚servations d'URLÿ:-----------
Regex testUrlOpened = new Regex(@"URL .+https:\/\/\+:443\/SomeWebSite\/.+Yes.+Yes.+SDDL.+", RegexOptions.Singleline);
MessageBox.Show(testUrlOpened.IsMatch(output).ToString()); // ==> True!!!
Regex taking decimal number only without using escape character.
^[0-9]+([.][0-9]+)?$

Remove spaces before non-word character with RegEx

I have the following C# code:
var sentence = "As a result , he failed the test .";
var pattern = new Regex();
var outcome = pattern.Replace(sentence, String.Empty);
What should I do to the RegEx to obtain the following output:
As a result, he failed the test.
If you want to white-list punctuation marks that generally don't appear in English after spaces, you can use:
\s+(?=[.,?!])
\s+ - all white space characters. You may want [ ]+ instead.
(?=[.,?!]) - lookahead. The next character should be ., ,, ?, or ! .
Working example: https://regex101.com/r/iJ5vM8/1
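Plugged into the code from the question, the lookahead pattern could be used roughly as follows. This is a sketch; the class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PunctuationDemo
{
    // Hypothetical helper: removes whitespace that directly precedes . , ? or !
    public static string RemoveSpacesBeforePunctuation(string sentence)
    {
        // The lookahead checks the punctuation mark without consuming it,
        // so replacing with an empty string removes only the whitespace.
        return Regex.Replace(sentence, @"\s+(?=[.,?!])", String.Empty);
    }

    public static void Main()
    {
        Console.WriteLine(RemoveSpacesBeforePunctuation("As a result , he failed the test ."));
        // As a result, he failed the test.
    }
}
```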
You need to add a pattern to your code that will match spaces before punctuation:
var sentence = "As a result , he failed the test .";
var pattern = new Regex(@"\s+(\p{P})");
var outcome = pattern.Replace(sentence, "$1");
Output:
As a result, he failed the test.

Regexp skip pattern

Problem
I need to replace all asterisk symbols('*') with percent symbol('%'). The asterisk symbols in square brackets should be ignored.
Example
[Test]
public void Replace_all_asterisks_outside_the_square_brackets()
{
var input = "Hel[*o], w*rld!";
var output = Regex.Replace(input, "What_pattern_should_be_there?", "%");
Assert.AreEqual("Hel[*o], w%rld!", output);
}
Try using a look ahead:
\*(?![^\[\]]*\])
Here's a bit stronger solution, which takes care of [] blocks better, and even escaped \[ characters:
string text = @"h*H\[el[*o], w*rl\]d!";
string pattern = @"
\\. # Match an escaped character. (to skip over it)
|
\[ # Match a character class
(?:\\.|[^\]])* # which may also contain escaped characters (to skip over it)
\]
|
(?<Asterisk>\*) # Match `*` and add it to a group.
";
text = Regex.Replace(text, pattern,
match => match.Groups["Asterisk"].Success ? "%" : match.Value,
RegexOptions.IgnorePatternWhitespace);
If you don't care about escaped characters you can simplify it to:
\[ # Skip a character class
[^\]]* # until the first ']'
\]
|
(?<Asterisk>\*)
Which can be written without comments as: @"\[[^\]]*\]|(?<Asterisk>\*)".
To understand why it works we need to understand how Regex.Replace works: for every position in the string it tries to match the regex. If it fails, it moves one character. If it succeeds, it moves over the whole match.
Here, we have dummy matches for the [...] blocks so we may skip over the asterisks we don't want to replace, and match only the lonely ones. That decision is made in a callback function that checks if Asterisk was matched or not.
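The simplified pattern and the callback can be put together as in the sketch below. The class and method names are hypothetical; the pattern and the Asterisk group are the ones described above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AsteriskDemo
{
    // Hypothetical helper: replaces '*' with '%' except inside [...] blocks.
    public static string ReplaceOutsideBrackets(string input)
    {
        return Regex.Replace(
            input,
            @"\[[^\]]*\]|(?<Asterisk>\*)",
            // A [...] block is a dummy match and is returned unchanged;
            // only a match that filled the Asterisk group becomes '%'.
            match => match.Groups["Asterisk"].Success ? "%" : match.Value);
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceOutsideBrackets("Hel[*o], w*rld!")); // Hel[*o], w%rld!
    }
}
```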
I couldn't come up with a pure RegEx solution. Therefore I am providing you with a pragmatic solution. I tested it and it works:
[Test]
public void Replace_all_asterisks_outside_the_square_brackets()
{
var input = "H*]e*l[*o], w*rl[*d*o] [o*] [o*o].";
var actual = ReplaceAsterisksNotInSquareBrackets(input);
var expected = "H%]e%l[*o], w%rl[*d*o] [o*] [o*o].";
Assert.AreEqual(expected, actual);
}
private static string ReplaceAsterisksNotInSquareBrackets(string s)
{
Regex rx = new Regex(@"(?<=\[[^\[\]]*)(?<asterisk>\*)(?=[^\[\]]*\])");
var matches = rx.Matches(s);
s = s.Replace('*', '%');
foreach (Match match in matches)
{
s = s.Remove(match.Groups["asterisk"].Index, 1);
s = s.Insert(match.Groups["asterisk"].Index, "*");
}
return s;
}
EDITED
Okay here is my final attempt ;)
Using negative lookbehind (?<!) and negative lookahead (?!).
var output = Regex.Replace(input, @"(?<!\[)\*(?!\])", "%");
This also passes the test in the comment to another answer "Hel*o], w*rld!"

C# Regex.Split - Subpattern returns empty strings

Hey, first time poster on this awesome community.
I have a regular expression in my C# application to parse an assignment of a variable:
NewVar = 40
which is entered in a Textbox. I want my regular expression to return (using Regex.Split) the name of the variable and the value, pretty straightforward. This is the Regex I have so far:
var r = new Regex(@"^(\w+)=(\d+)$", RegexOptions.IgnorePatternWhitespace);
var mc = r.Split(command);
My goal was to do the trimming of whitespace in the Regex and not use the Trim() method of the returned values. Currently, it works but it returns an empty string at the beginning of the MatchCollection and an empty string at the end.
Using the above input example, this is what's returned from Regex.Split:
mc[0] = ""
mc[1] = "NewVar"
mc[2] = "40"
mc[3] = ""
So my question is: why does it return an empty string at the beginning and the end?
Thanks.
The reason Regex.Split is returning four values is that you have exactly one match, so Regex.Split is returning:
All the text before your match, which is ""
All () groups within your match, which are "NewVar" and "40"
All the text after your match, which is ""
Regex.Split's primary purpose is to extract any text between the matched regex, for example you could use Regex.Split with a pattern of "[,;]" to split text on either commas or semicolons. In .NET Framework 1.0 and 1.1, Regex.Split only returned the split values, in this case "" and "", but in .NET Framework 2.0 it was modified to also include values matched by () within the Regex, which is why you are seeing "NewVar" and "40" at all.
What you were looking for is Regex.Match, not Regex.Split. It will do exactly what you want:
var r = new Regex(@"^(\w+)=(\d+)$");
var match = r.Match(command);
// Groups[0] is the entire match; the capture groups start at index 1.
var varName = match.Groups[1].Value;
var valueText = match.Groups[2].Value;
Note that RegexOptions.IgnorePatternWhitespace means you can include extra spaces in your pattern - it has nothing to do with the matched text. Since you have no extra whitespace in your pattern it is unnecessary.
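A small sketch contrasting the two calls follows. The helper names are hypothetical, and the \s* parts in ParseAssignment are an addition not present in the pattern above: they are one way to tolerate the spaces in an input like "NewVar = 40" without calling Trim().

```csharp
using System;
using System.Text.RegularExpressions;

public static class AssignmentDemo
{
    // Reproduces the behaviour from the question: one match covering the whole
    // input yields an empty string before it, the two groups, and an empty string after it.
    public static string[] SplitWithGroups(string command)
    {
        return Regex.Split(command, @"^(\w+)=(\d+)$");
    }

    // Hypothetical helper: returns { name, value }, or null if the input is not an assignment.
    public static string[] ParseAssignment(string command)
    {
        // \s* absorbs optional whitespace around the name, '=' and the value.
        Match match = Regex.Match(command, @"^\s*(\w+)\s*=\s*(\d+)\s*$");
        return match.Success
            ? new[] { match.Groups[1].Value, match.Groups[2].Value }
            : null;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("|", SplitWithGroups("NewVar=40")));   // |NewVar|40|
        Console.WriteLine(string.Join(",", ParseAssignment("NewVar = 40"))); // NewVar,40
    }
}
```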
From the docs, Regex.Split() uses the regular expression as the delimiter to split on. It does not split the captured groups out of the input string. Also, IgnorePatternWhitespace ignores unescaped whitespace in your pattern, not in the input.
Instead, try the following:
var r = new Regex(@"\s*=\s*");
var mc = r.Split(command);
Note that the whitespace is actually consumed as a part of the delimiter.
