Regex n 0 and then a 1

I am looking for a regex to match n zeros and then a 1.
E.g:
0000100 -> matches
00200 -> does not match
I thought it was something like this:
var regex = new Regex(@"[0]*[2-9]+");

^[0]+[1] is what you want:
^ start of line
[0] match 0
+ at least once
[1] match 1
You could also add a $ at the end, if you want it to match a complete line.
Note: if you want to be able to match n=0 (i.e. just a 1), you need:
^[0]*[1]
Note: the brackets [] are optional as they only contain one character, but I think they make it easier to read. So you could have ^0+1 if you prefer, for example.
See also http://regexstorm.net/reference (for example) for a complete C# regex reference
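To make the difference between the variants above concrete, here is a minimal sketch that runs ^0+1, ^0*1 and ^0+1$ against a few inputs. The class and method names (ZerosThenOneDemo, OneOrMoreZeros, and so on) are made up for this illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class ZerosThenOneDemo
{
    // At least one 0, then a 1, anchored at the start of the input.
    public static bool OneOrMoreZeros(string input) => Regex.IsMatch(input, @"^0+1");

    // Zero or more 0s, then a 1, so a bare "1" also passes.
    public static bool ZeroOrMoreZeros(string input) => Regex.IsMatch(input, @"^0*1");

    // Same as OneOrMoreZeros, but nothing may follow the 1.
    public static bool WholeLine(string input) => Regex.IsMatch(input, @"^0+1$");

    public static void Main()
    {
        foreach (var sample in new[] { "0000100", "00200", "1", "0001" })
        {
            Console.WriteLine(
                $"{sample}: ^0+1 -> {OneOrMoreZeros(sample)}, " +
                $"^0*1 -> {ZeroOrMoreZeros(sample)}, " +
                $"^0+1$ -> {WholeLine(sample)}");
        }
    }
}
```

With these inputs, 0000100 passes the two prefix checks but not the whole-line check, 00200 fails all three, and a bare 1 passes only ^0*1.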

Linq solution (and no regular expression):
string source = "0001";
bool isValid = source
.SkipWhile(c => c == '0')
.FirstOrDefault() == '1';
If you insist on regular expression:
bool isValid = Regex.IsMatch(source, "^0*1");
Both versions check for zero or more 0s followed by a 1.

There is no need for a regex here: trim the leading zeros from the string and, if the result is not empty, check its first character.
var s = "00001";
var trimmed = s.TrimStart('0'); // drop the leading zeros once
if (trimmed.Length > 0 && trimmed[0] == '1')
{
    /* Valid, otherwise, not */
}
This will work if you have the digits at the beginning of the string and just need a boolean result.

Related

Replace Nth regex match occurrence in string

I know there are quite a few of these questions on SO, but I can't find one that breaks down how the pattern that returns the Nth match was implemented. All the answers I looked at just give the code to the OP with minimal explanation.
What I know is that you need to put {X} in the pattern, where X is the number of the occurrence you want to return.
So I am trying to match a string between two chars and I seemed to have been able to get that working.
The string to be tested looks something like this,
"=StringOne&=StringTwo&=StringThree&=StringFour&"
"[^/=]+(?=&)"
Again, after reading as much as I could, this pattern will also return all matches,
[^/=]+(?=&){1}
Due to {1} being the default and therefore redundant in the above pattern.
But I can't do this,
[^/=]+(?=&){2}
As it will not return the 3rd match as I was expecting it to.
So could someone please shove me in the right direction and explain how to get the pattern needed to find the occurrence of the match that will be needed?

A pure regex way is possible, but is not really very efficient if your pattern is complex.
var s = "=StringOne&=StringTwo&=StringThree&=StringFour&";
var idx = 2; // Replace this occurrence
var result = Regex.Replace(s, $@"^(=(?:[^=&]+&=){{{idx-1}}})[^=&]+", "${1}REPLACED");
Console.WriteLine(result); // => =StringOne&=REPLACED&=StringThree&=StringFour&
See this C# demo and the regex demo.
Regex details
^ - start of string
(=(?:[^=&]+&=){1}) - Group 1 capturing:
= - a = symbol
(?:[^=&]+&=){1} - 1 occurrence (this number is generated dynamically) of
[^=&]+ - 1 or more chars other than = and & (NOTE that in case the string may contain = and &, it is safer to replace it with .*? and pass RegexOptions.Singleline option to the regex compiler)
&= - a &= substring.
[^=&]+ - 1 or more chars other than = and &
The ${1} in the replacement pattern inserts the contents of Group 1 back into the resulting string.
As an alternative, I can suggest introducing a counter, incrementing it upon each match, and only replacing the match when the counter is equal to the occurrence you specify.
Use
var s = "=StringOne&=StringTwo&=StringThree&=StringFour&";
var idx_to_replace = 2; // Replace this occurrence
var cnt = 0; // Counter
var result = Regex.Replace(s, "[^=]+(?=&)", m => { // Match evaluator
cnt++; return cnt == idx_to_replace ? "REPLACED" : m.Value; });
Console.WriteLine(result);
// => =StringOne&=REPLACED&=StringThree&=StringFour&
See the C# demo.
The cnt is incremented inside the match evaluator inside Regex.Replace and m is assigned the current Match object. When cnt is equal to idx_to_replace the replacement occurs, else, the whole match is pasted back (with m.Value).
Another approach is to iterate through the matches, and once the Nth match is found, replace it by splitting the string into parts before the match and after the match breaking out of the loop once the replacement is done:
var s = "=StringOne&=StringTwo&=StringThree&=StringFour&";
var idx_to_replace = 2; // Replace this occurrence
var cnt = 0; // Counter
var result = string.Empty; // Final result variable
var rx = "[^=]+(?=&)"; // Pattern
for (var m=Regex.Match(s, rx); m.Success; m = m.NextMatch())
{
cnt++;
if (cnt == idx_to_replace) {
result = $"{s.Substring(0, m.Index)}REPLACED{s.Substring(m.Index+m.Length)}";
break;
}
}
Console.WriteLine(result); // => =StringOne&=REPLACED&=StringThree&=StringFour&
See another C# demo.
This might be quicker since the engine does not have to find all matches.

Replace one character but not two in a string

I want to replace single occurrences of a character but not two in a string using C#.
For example, I want to replace & with an empty string, but not when the occurrence is &&. For instance, a&b&&c would become ab&&c after the replacement.
If I use a regex like &[^&], it will also match the character after the & and I don't want to replace it.
Another solution I found is to iterate over the string characters.
Do you know a cleaner solution to do that?

To only match one & (not preceded or followed by &), use look-arounds (?<!&) and (?!&):
(?<!&)&(?!&)
See regex demo
You tried to use a negated character class that still matches a character, and you need to use a look-ahead/look-behind to just check for some character absence/presence, without consuming it.
See regular-expressions.info:
Negative lookahead is indispensable if you want to match something not followed by something else. When explaining character classes, this tutorial explained why you cannot use a negated character class to match a q not followed by a u. Negative lookahead provides the solution: q(?!u).
Lookbehind has the same effect, but works backwards. It tells the regex engine to temporarily step backwards in the string, to check if the text inside the lookbehind can be matched there. (?<!a)b matches a "b" that is not preceded by an "a", using negative lookbehind. It doesn't match cab, but matches the b (and only the b) in bed or debt.
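The difference between consuming a character and merely checking for it can be seen by running both patterns on the sample from the question. This is a small sketch; the class and method names are invented for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class SingleAmpersandDemo
{
    // Negated character class: the match includes the character after the &,
    // so that character is removed as well.
    public static string WithNegatedClass(string input) =>
        Regex.Replace(input, @"&[^&]", "");

    // Lookarounds: only assert what surrounds the &, without consuming it.
    public static string WithLookarounds(string input) =>
        Regex.Replace(input, @"(?<!&)&(?!&)", "");

    public static void Main()
    {
        Console.WriteLine(WithNegatedClass("a&b&&c")); // a&
        Console.WriteLine(WithLookarounds("a&b&&c"));  // ab&&c
    }
}
```

The negated class removes "&b" and then "&c" (the second & of the pair is followed by c), which is why it destroys the double ampersand, while the lookaround version touches only the lone &.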

You can match both & and && (or any number of repetition) and only replace the single one with an empty string:
str = Regex.Replace(str, "&+", m => m.Value.Length == 1 ? "" : m.Value);

You can use this regex: @"(?<!&)&(?!&)"
var str = Regex.Replace("a&b&&c", @"(?<!&)&(?!&)", "");
Console.WriteLine(str); // ab&&c

You can go with this:
public static string replacement(string oldString, char charToRemove)
{
string newString = "";
bool found = false;
foreach (char c in oldString)
{
if (c == charToRemove && !found)
{
found = true;
continue;
}
newString += c;
}
return newString;
}
Which is as generic as possible

I would use something like this, which IMO should be better than using Regex:
public static class StringExtensions
{
public static string ReplaceFirst(this string source, char oldChar, char newChar)
{
if (string.IsNullOrEmpty(source)) return source;
int index = source.IndexOf(oldChar);
if (index < 0) return source;
var chars = source.ToCharArray();
chars[index] = newChar;
return new string(chars);
}
}

I'll contribute to this statement from the comments:
in this case, only the substring with odd number of '&' will be replaced by all the "&" except the last "&" . "&&&" would be "&&" and "&&&&" would be "&&&&"
This is a pretty neat solution using balancing groups (though I wouldn't call it particularly clean nor easy to read).
Code:
string str = "11&222&&333&&&44444&&&&55&&&&&";
str = Regex.Replace(str, "&((?:(?<2>&)(?<-2>&)?)*)", "$1$2");
Output:
11222&&333&&44444&&&&55&&&&
It always matches the first & (not captured).
If it's followed by an even number of &, they're matched and stored in $1. The second group is captured by the first of the pair, but then it's subtracted by the second.
However, if there's an odd number of &, the optional group (?<-2>&)? does not match, and the group is not subtracted. Then, $2 will capture an extra &.
For example, matching the subject "&&&&", the first char is consumed and it isn't captured (1). The second and third chars are matched, but $2 is subtracted (2). For the last char, $2 is captured (3). The last 3 chars were stored in $1, and there's an extra & in $2.
Then, the substitution "$1$2" == "&&&&".

How to remove only certain substrings from a string?

Using C#, I have a string that is a SQL script containing multiple queries. I want to remove sections of the string that are enclosed in single quotes. I can do this using Regex.Replace, in this manner:
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
test = Regex.Replace(test, "'[^']*'", string.Empty);
Results in: "Only can we turn him to the of the Force"
What I want to do is remove the substrings between quotes EXCEPT for substrings containing a specific substring. For example, using the string above, I want to remove the quoted substrings except for those that contain "dark," such that the resulting string is:
Results in: "Only can we turn him to the 'dark side' of the Force"
How can this be accomplished using Regex.Replace, or perhaps by some other technique? I'm currently trying a solution that involves using Substring(), IndexOf(), and Contains().
Note: I don't care if the single quotes around "dark side" are removed or not, so the result could also be: "Only can we turn him to the dark side of the Force." I say this because a solution using Split() would remove all the single quotes.
Edit: I don't have a solution yet using Substring(), IndexOf(), etc. By "working on," I mean I'm thinking in my head how this can be done. I have no code, which is why I haven't posted any yet. Thanks.
Edit: VKS's solution below works. I wasn't escaping the \b the first attempt which is why it failed. Also, it didn't work unless I included the single quotes around the whole string as well.
test = Regex.Replace(test, "'(?![^']*\\bdark\\b)[^']*'", string.Empty);

'(?![^']*\bdark\b)[^']*'
Try this. See the demo. Replace with an empty string. You can use a lookahead here to check whether the quoted text contains the word dark.
https://www.regex101.com/r/rG7gX4/12

While vks's solution works, I'd like to demonstrate a different approach:
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
test = Regex.Replace(test, @"'[^']*'", match => {
if (match.Value.Contains("dark"))
return match.Value;
// You can add more cases here
return string.Empty;
});
Or, if your condition is simple enough:
test = Regex.Replace(test, @"'[^']*'", match => match.Value.Contains("dark")
? match.Value
: string.Empty
);
That is, use a lambda to provide a callback for the replacement. This way, you can run arbitrary logic to replace the string.

Something like this would work. You can add all the strings you want to keep to the excludedString array.
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
var excludedString = new string[] { "dark side" };
int startIndex = 0;
while ((startIndex = test.IndexOf('\'', startIndex)) >= 0)
{
var endIndex = test.IndexOf('\'', startIndex + 1);
var subString = test.Substring(startIndex, (endIndex - startIndex) + 1);
if (!excludedString.Contains(subString.Replace("'", "")))
{
test = test.Remove(startIndex, (endIndex - startIndex) + 1);
}
else
{
startIndex = endIndex + 1;
}
}

Another method uses the regex alternation operator |.
@"('[^']*\bdark\b[^']*')|'[^']*'"
Then replace each match with $1.
string str = "Only 'together' can we turn him to the 'dark side' of the Force";
string result = Regex.Replace(str, @"('[^']*\bdark\b[^']*')|'[^']*'", "$1");
Console.WriteLine(result);
Explanation:
(...) is a capturing group.
'[^']*\bdark\b[^']*' matches every single-quoted string that contains the word dark. [^']* matches any character other than ', zero or more times.
('[^']*\bdark\b[^']*'): because this pattern is inside a capturing group, the matched characters are stored in group 1.
| is the regex alternation operator.
'[^']*' matches all the remaining single-quoted strings (those that do not contain dark). It never matches a quoted string containing dark, because those strings were already matched by the pattern before the | operator.
Finally, replacing every match with the contents of group 1 gives the desired output: group 1 is empty for the strings matched by the second alternative, so they are removed, while the strings containing dark are put back unchanged.

I made this attempt, which I think is what you had in mind (a solution using Split, Contains, ... without regex):
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
string[] separated = test.Split('\'');
string result = "";
for (int i = 0; i < separated.Length; i++)
{
    string str = separated[i];
    str = str.Trim(); // trim the leading and trailing spaces
    if (i % 2 == 0 || str.Contains("dark")) // you can expand your condition
    {
        result += str + " "; // add a space after each added string
    }
}
result = result.Trim(); // trim the trailing space again

Regex to match a hyphen in a string 0 or 1 times

I am trying to build a regex that will check to see if a string has a hyphen 0 or 1 times.
So it would return the following strings as ok.
1-5
1,3-5
1,3
The following would be wrong.
1-3-5
I have tried the following, but it is fine with 1-3-5:
([^-]?-){0,1}[^-]

This works:
^[^-]*-?[^-]*$
^ match the beginning of the string
[^-]* match zero or more non-hyphen characters
-? match zero or one hyphen
[^-]* match zero or more non-hyphen characters
$ match the end of the string
In this case, you need to specify matching the beginning (^) and end ($) of the input strings, so that you don't get multiple matches for a string like 1-3-5.
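As a quick check of the anchored pattern against the strings from the question, here is a minimal sketch. The HyphenCheck class and its IsOk method are names invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class HyphenCheck
{
    // Anchored at both ends, so the whole input may contain at most one hyphen.
    private static readonly Regex AtMostOneHyphen = new Regex(@"^[^-]*-?[^-]*$");

    public static bool IsOk(string input) => AtMostOneHyphen.IsMatch(input);

    public static void Main()
    {
        foreach (var sample in new[] { "1-5", "1,3-5", "1,3", "1-3-5" })
        {
            Console.WriteLine($"{sample}: {IsOk(sample)}");
        }
    }
}
```

The first three samples are accepted and 1-3-5 is rejected, because after the optional hyphen only non-hyphen characters may follow up to the end of the string.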

Perhaps something simpler:
var hyphens = input.Count(cc => cc == '-');
Your regular expression matches 1-3-5 because it finds the first instance of a hyphen, which meets your criteria. You could use the following regular expression, but it would not be ideal:
^[^-]*-?[^-]*$

If you have your strings in a collection, you could do this in one line of LINQ. It returns a list of the strings that contain fewer than two hyphens.
var okStrings = allStrings.Where(s => s.Count(h => h == '-') < 2).ToList();
Judging by the way you've formatted the list of strings, I assume you can't split on the comma because it's not a consistent delimiter. If you can, then you can just use the String.Split method to get each string and replace the allStrings variable above with that array.
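If splitting on the comma is acceptable, the two steps can be combined roughly as follows. This is only a sketch under that assumption; the HyphenFilter class, the OkStrings method and the sample input are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class HyphenFilter
{
    // Splits on commas (assumed here to be a reliable delimiter) and keeps
    // only the pieces that contain fewer than two hyphens.
    public static List<string> OkStrings(string commaSeparated) =>
        commaSeparated.Split(',')
                      .Where(s => s.Count(h => h == '-') < 2)
                      .ToList();

    public static void Main()
    {
        var ok = OkStrings("1-5,1-3-5,7");
        Console.WriteLine(string.Join(" | ", ok)); // 1-5 | 7
    }
}
```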

You could approach it this way:
string StringToSearch = "1-3-5";
MatchCollection matches = Regex.Matches(StringToSearch, "-"); // input first, then pattern
if(matches.Count == 0 || matches.Count == 1)
{
//...
}

I just tested your expression and it appears to give the result you want. It breaks 1-3-5 into {1-3} and {-5}.
http://regexpal.com/

How can I find a Regex match at a specific location of the string in C#?

I want to find out whether a Regex matches at a specific location of a string.
Example:
Regex r = new Regex("d");
string s = "abcdefg";
I want the match function to find a match only if it is at the exact given location so that using the example above, matching at the locations 1, 3, and 5 should give no match, match, and no match, respectively. Unfortunately the C# Regex.Match method gives:
r.Match(s, 1); // => match ("d")
r.Match(s, 3); // => match ("d")
r.Match(s, 5); // => no match
I understand this is because the Regex.Match method searches forward for the first match, but how do I prevent this behavior without having to make substrings?

Add \G to the beginning of your regex:
Regex r = new Regex(@"\Gd");
string s = "abcdefg";
Console.WriteLine(r.Match(s, 1).Success); // False
Console.WriteLine(r.Match(s, 3).Success); // True
Console.WriteLine(r.Match(s, 5).Success); // False
\G anchors the match to the position where the previous match ended, or to the beginning of the string if there was no previous match. With the second argument to Match, you're effectively telling it there was a previous match, which ended at that location.

Use substring and the start-of-string anchor ^:
Regex r = new Regex("^d"); // Use the start of string anchor
string s = "abcdefg";
r.IsMatch(s.Substring(3)); // Match at exactly fourth character (0-based index 3)
Alternatively, to avoid copying the string in memory, use quantified .:
Regex r = new Regex("^.{3}d");
r.IsMatch("abcdefg");
The pattern ^.{3}d says
Start at the beginning of the string
Match exactly three characters of anything
Then match the letter 'd'

Well, if you're always looking for the same index, you can extend your regex a little by adding wildcards at the beginning to "pad" the result, i.e.:
Regex r = new Regex("^.{3}d");
r.IsMatch("abcdefg"); // true
r.IsMatch("adcffed"); // false
r.IsMatch("abcddef"); // true
On the other hand, if you want to use the same regex with different indexes, you can just use the ^ character to match the beginning of the string only:
Regex r = new Regex("^d");
r.IsMatch("abcdefg".Substring(3)); // true
r.IsMatch("adcffed".Substring(3)); // false
r.IsMatch("abcddef".Substring(1)); // false
NB: if you're just looking for a simple string and not a pattern, you should simply use string.IndexOf
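For the plain-string case, one possible way to use string.IndexOf at a fixed position is to restrict the search window to the length of the value, so the only place it can be found is the requested index. This is a sketch, not the answerer's code; the PositionCheck class and the IsAt method are hypothetical names.

```csharp
using System;

// Hypothetical helper class, for illustration only.
public static class PositionCheck
{
    // Returns true only if value occurs in s exactly at the given index.
    public static bool IsAt(string s, string value, int index)
    {
        // Guard against windows that would run past the end of the string,
        // because IndexOf throws for an out-of-range start or count.
        if (index < 0 || index + value.Length > s.Length) return false;

        // Search only value.Length characters starting at index.
        return s.IndexOf(value, index, value.Length, StringComparison.Ordinal) == index;
    }

    public static void Main()
    {
        Console.WriteLine(IsAt("abcdefg", "d", 1)); // False
        Console.WriteLine(IsAt("abcdefg", "d", 3)); // True
        Console.WriteLine(IsAt("abcdefg", "d", 5)); // False
    }
}
```

No substring is allocated here, which matches the constraint stated in the question.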
