Regex: Combination of String Length and Starts With patterns - c#

I want to check if the string contains exactly 11 characters, not more or less, and also if it starts with the numbers '09', so my pattern is:
Regex rg = new Regex(@"^(?=09)(?={11})");
Console.WriteLine(rg.IsMatch("09123456789"));
^(?=09) works correctly, but when I add the second part, (?={11}), an exception is thrown. What's the right pattern?

You may achieve that without a regex:
if (s.Length == 11 && s.StartsWith("09") && s.All(Char.IsDigit))
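I am not sure you need digits only; if not, remove s.All(Char.IsDigit). (All needs using System.Linq.)
Note that ^(?=09)(?={11}) matches the start-of-string position (with ^), checks that the string starts with the substring 09, and then reaches {11}, a quantifier with nothing in front of it to repeat. .NET rejects that when the pattern is parsed, hence the exception. Regex flavors that accept it treat {11} as a literal character sequence, which could never match at a position where the text is 09.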
If you need a regex you may use
\A09[0-9]{9}\z
or, to match not only digits:
\A09.{9}\z
where
\A - asserts the start of a string
09 - matches a literal char sequence 09
.{9} - matches 9 chars other than LF
\z - the very end of the string.
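A minimal sketch of the two patterns above, assuming a console program with top-level statements (C# 9 or later). The helper names IsDigitsOnly09 and IsAny09 are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helpers wrapping the two patterns from this answer.
bool IsDigitsOnly09(string s) => Regex.IsMatch(s, @"\A09[0-9]{9}\z");
bool IsAny09(string s) => Regex.IsMatch(s, @"\A09.{9}\z");

Console.WriteLine(IsDigitsOnly09("09123456789"));   // True: 11 digits starting with 09
Console.WriteLine(IsDigitsOnly09("0912345678"));    // False: only 10 characters
Console.WriteLine(IsDigitsOnly09("09123456789\n")); // False: \z does not tolerate a trailing newline
Console.WriteLine(IsAny09("09abcdefghi"));          // True: any 9 non-LF characters after 09
```

The third call shows why \z is used here rather than $, which would also match just before a final newline.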

new Regex(@"^09[0-9]{9}$");
This pattern matches a valid string.

You can use
Regex rg = new Regex(@"^09.{9}$", RegexOptions.Compiled);
bool CheckStringRegex(string str)
{
    return rg.IsMatch(str);
}
but I suggest doing it more simply, without a regex:
bool CheckString(string str)
{
    return str.Length == 11 && str.StartsWith("09");
}

How about ^(?=09)\w{11}$ ?

Related

Regex first digits occurrence

My task is to extract the first digits in the following string:
GLB=VSCA|34|speed|1|
My pattern is the following:
(?x:VSCA(\|){1}(\d.))
Basically I need to extract "34", the first run of digits after "VSCA". With my pattern I obtain a group, but would it be possible to get only the number? This is my C# snippet:
string regex = @"(?x:VSCA(\|){1}(\d.))";
Regex rx = new Regex(regex);
string s = "GLB=VSCA|34|speed|1|";
if (rx.Match(s).Success)
{
    var test = rx.Match(s).Groups[1].ToString();
}
You could match 34 (the first digits after VSCA) using a positive lookbehind (?<=VSCA\D*) to assert that what is on the left side is VSCA followed by zero or more non-digit characters \D*, and then match one or more digits \d+:
(?<=VSCA\D*)\d+
If you need the pipe to come directly after VSCA, then you could include that in the lookbehind:
(?<=VSCA\|)\d+
This regex pattern: (?<=VSCA\|)\d+?(?=\|) will match only the number. (If your number can be negative / have decimal places you may want to use (?<=VSCA\|).+?(?=\|) instead)
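As a rough sketch of how the lookbehind patterns above could be used from C# (top-level statements assumed; the variable names are illustrative), compared with an equivalent capture-group form:

```csharp
using System;
using System.Text.RegularExpressions;

string s = "GLB=VSCA|34|speed|1|";

// Lookbehind: the match itself is only the digits, so no group lookup is needed.
Match viaLookbehind = Regex.Match(s, @"(?<=VSCA\|)\d+");

// Equivalent capture-group form, shown for comparison.
Match viaGroup = Regex.Match(s, @"VSCA\|(\d+)");

if (viaLookbehind.Success)
    Console.WriteLine(viaLookbehind.Value);      // 34
if (viaGroup.Success)
    Console.WriteLine(viaGroup.Groups[1].Value); // 34
```

Calling Match once and reusing the result also avoids running the regex twice, as the snippet in the question does.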
You don't need Regex for this, you can simply split on the '|' character:
string s = "GLB=VSCA|34|speed|1|";
string[] parts = s.Split('|');
if (parts.Length >= 2)
{
    Console.WriteLine(parts[1]); // prints 34
}
The benefit here is that you can access all parts of the original string based on the index:
[0] - "GLB=VSCA"
[1] - "34"
[2] - "speed"
[3] - "1"
While the other answers work really well, if you really must use a regular expression, or are interested in how to get at the number directly, you can use a named group for it. Consider the following code:
// \d+ captures the whole run of digits (the original \d.? could also swallow a following non-digit)
string regex = @"(?x:VSCA(\|){1}(?<number>\d+))";
Regex rx = new Regex(regex);
string s = "GLB=VSCA|34|speed|1|";
var match = rx.Match(s);
if (match.Success) Console.WriteLine(match.Groups["number"]);
How about (?<=VSCA\|)[0-9]+ ?

Replace one character but not two in a string

I want to replace single occurrences of a character but not two in a string using C#.
For example, I want to replace & by an empty string, but not when the occurrence is &&. So a&b&&c would become ab&&c after the replacement.
If I use a regex like &[^&], it will also match the character after the & and I don't want to replace it.
Another solution I found is to iterate over the string characters.
Do you know a cleaner solution to do that?
To only match one & (not preceded or followed by &), use look-arounds (?<!&) and (?!&):
(?<!&)&(?!&)
You tried to use a negated character class that still matches a character, and you need to use a look-ahead/look-behind to just check for some character absence/presence, without consuming it.
See regular-expressions.info:
Negative lookahead is indispensable if you want to match something not followed by something else. When explaining character classes, this tutorial explained why you cannot use a negated character class to match a q not followed by a u. Negative lookahead provides the solution: q(?!u).
Lookbehind has the same effect, but works backwards. It tells the regex engine to temporarily step backwards in the string, to check if the text inside the lookbehind can be matched there. (?<!a)b matches a "b" that is not preceded by an "a", using negative lookbehind. It doesn't match cab, but matches the b (and only the b) in bed or debt.
You can match both & and && (or any number of repetition) and only replace the single one with an empty string:
str = Regex.Replace(str, "&+", m => m.Value.Length == 1 ? "" : m.Value);
You can use this regex: @"(?<!&)&(?!&)"
var str = Regex.Replace("a&b&&c", @"(?<!&)&(?!&)", "");
Console.WriteLine(str); // ab&&c
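The two regex approaches above (look-arounds, and matching &+ with a match evaluator) can be compared side by side. This is only a sketch assuming top-level statements; the helper names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper names; both remove an "&" only when it stands alone.
string RemoveSingleWithLookarounds(string text) =>
    Regex.Replace(text, @"(?<!&)&(?!&)", "");

string RemoveSingleWithEvaluator(string text) =>
    Regex.Replace(text, "&+", m => m.Value.Length == 1 ? "" : m.Value);

foreach (string sample in new[] { "a&b&&c", "&x&&&y&" })
{
    // Both lines print the same result: ab&&c, then x&&&y
    Console.WriteLine(RemoveSingleWithLookarounds(sample));
    Console.WriteLine(RemoveSingleWithEvaluator(sample));
}
```

Both versions leave any run of two or more & untouched, so they should agree on every input.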
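You can go with this:
public static string replacement(string oldString, char charToRemove)
{
    string newString = "";
    bool found = false;
    foreach (char c in oldString)
    {
        // Skip the first occurrence of charToRemove only
        if (c == charToRemove && !found)
        {
            found = true;
            continue;
        }
        newString += c;
    }
    return newString;
}
This is generic in the character to remove, but note that it drops only the first occurrence, whether or not that occurrence is doubled.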
I would use something like this, which IMO should be better than using Regex:
public static class StringExtensions
{
    public static string ReplaceFirst(this string source, char oldChar, char newChar)
    {
        if (string.IsNullOrEmpty(source)) return source;
        int index = source.IndexOf(oldChar);
        if (index < 0) return source;
        var chars = source.ToCharArray();
        chars[index] = newChar;
        return new string(chars);
    }
}
I'll contribute to this statement from the comments:
in this case, only the substring with odd number of '&' will be replaced by all the "&" except the last "&" . "&&&" would be "&&" and "&&&&" would be "&&&&"
This is a pretty neat solution using balancing groups (though I wouldn't call it particularly clean nor easy to read).
Code:
string str = "11&222&&333&&&44444&&&&55&&&&&";
str = Regex.Replace(str, "&((?:(?<2>&)(?<-2>&)?)*)", "$1$2");
Output:
11222&&333&&44444&&&&55&&&&
It always matches the first & (not captured).
If it's followed by an even number of &, they're matched and stored in $1. The second group is captured by the first of each pair, but then it's subtracted by the second.
However, if there's an odd number of &, the optional group (?<-2>&)? does not match, and the group is not subtracted. Then, $2 will capture an extra &.
For example, matching the subject "&&&&", the first char is consumed and it isn't captured (1). The second and third chars are matched, but $2 is subtracted (2). For the last char, $2 is captured (3). The last 3 chars were stored in $1, and there's an extra & in $2.
Then, the substitution "$1$2" == "&&&&".

Regex to match a hyphen in a string 0 or 1 times

I am trying to build a regex that will check to see if a string has a hyphen 0 or 1 times.
So it would return the following strings as ok.
1-5
1,3-5
1,3
The following would be wrong.
1-3-5
I have tried the following, but it is fine with 1-3-5:
([^-]?-){0,1}[^-]
This works:
^[^-]*-?[^-]*$
^^    ^ ^    ^
||    | |    |
||    | |    |-- Match the end of string
||    | |------- Match zero or more non-hyphen characters
||    |--------- Match zero or one hyphens
||-------------- Match zero or more non-hyphen characters
|--------------- Match the beginning of string
In this case, you need to specify matching the beginning (^) and end ($) of the input strings, so that you don't get multiple matches for a string like 1-3-5.
Perhaps something simpler:
var hyphens = input.Count(cc => cc == '-');
Your regular expression accepts 1-3-5 because it is not anchored: it finds the first instance of a hyphen, which meets your criteria, and stops there. You could use the following anchored regular expression, but it would not be ideal:
^[^-]*-?[^-]*$
If you have your strings in a collection, you could do this in one line of LINQ. It'll return a list of the strings that have fewer than two hyphens in them.
var okStrings = allStrings.Where(s => s.Count(h => h == '-') < 2).ToList();
Judging by the way you've formatted the list of strings, I assume you can't split on the comma because it's not a consistent delimiter. If you can, then you can just use the String.Split method to get each string and replace the allStrings variable above with that array.
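The anchored regex and the counting approach above can be checked against the sample strings from the question. A minimal sketch, assuming top-level statements; the helper names are invented for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helpers: both accept a string containing at most one hyphen.
bool AtMostOneHyphenRegex(string text) => Regex.IsMatch(text, @"^[^-]*-?[^-]*$");
bool AtMostOneHyphenCount(string text) => text.Count(c => c == '-') <= 1;

foreach (string sample in new[] { "1-5", "1,3-5", "1,3", "1-3-5" })
{
    // Only "1-3-5" is rejected, and it is rejected by both checks.
    Console.WriteLine($"{sample}: {AtMostOneHyphenRegex(sample)} {AtMostOneHyphenCount(sample)}");
}
```

The counting version states the rule directly, which is probably why several answers prefer it over the regex.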
You could approach it this way:
string StringToSearch = "1-3-5";
// Regex.Matches takes the input first, then the pattern
MatchCollection matches = Regex.Matches(StringToSearch, "-");
if (matches.Count == 0 || matches.Count == 1)
{
    //...
}
I just tested your expression and it appears to give the result you want. It breaks 1-3-5 into {1-3} and {-5}.
http://regexpal.com/

Replace all characters and first 0's (zeroes)

I am trying to use a regular expression to replace all characters except the number, and the number should not start with 0.
How can I achieve this using a regular expression?
I have tried multiple things like @"^([1-9]+)(0+)(\d*)" and "(?<=[1-9])0+", but those do not work.
Some examples of the text could be hej:\\\\0.0.0.22, hej:22, hej:\\\\?022 and hej:\\\\?22, and the result should in all places be 22
Rather than replace, try and match against [1-9][0-9]*$ on your string. Grab the matched text.
Note that \d in .NET regexes also matches Unicode digit characters, so the regex here uses the simple character class [0-9] to restrict the match to ASCII digits.
(note: regex assumes matches at end of line only)
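A small sketch of the match-instead-of-replace idea above, run against the four sample strings from the question. It assumes top-level statements, and TrailingNumber is a hypothetical helper name.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the trailing number without leading zeros, or "" if there is none.
string TrailingNumber(string text)
{
    Match m = Regex.Match(text, "[1-9][0-9]*$");
    return m.Success ? m.Value : "";
}

string[] samples =
{
    @"hej:\\\\0.0.0.22",
    "hej:22",
    @"hej:\\\\?022",
    @"hej:\\\\?22",
};

foreach (string sample in samples)
    Console.WriteLine(TrailingNumber(sample)); // 22 for each sample
```

As the note above says, this only works when the number sits at the end of the input.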
According to one of your comments hej:\\\\0.011.0.022 should yield 110022. First select the relevant string part from the first non zero digit up to the last number not being zero:
([1-9].*[1-9]\d*)|[1-9]
[1-9] is the first non zero digit
.* are any number of any characters
[1-9]\d* are numbers, starting at the first non-zero digit
|[1-9] includes cases consisting of only one single non zero digit
Then remove all non digits (\D)
Match match = Regex.Match(input, @"([1-9].*[1-9]\d*)|[1-9]");
if (match.Success) {
    // Verbatim string, so that \D reaches the regex engine unescaped
    result = Regex.Replace(match.Value, @"\D", "");
} else {
    result = "";
}
Use the following:
[1-9][0-9]*$
You don't need to do any recursion, just match that.
Here is something you can try, The87Boy; you can play around with the pattern or add to it as you like.
string strTargetString = @"hej:\\\\*?0222\";
string pattern = "[\\\\hej:0.?*]";
string replacement = " ";
Regex regEx = new Regex(pattern);
string newRegStr = Regex.Replace(regEx.Replace(strTargetString, replacement), @"\s+", " ");
Result from the above example: 222, with a single space on either side.

Regex 11 digit string capturing

String pattern = @"^(\d{11})$";
String input = "You number is:11126564312 and 12234322121 \n\n23211212345";
Match match = Regex.Match(input,pattern);
From the above code I am planning to capture the 11-digit strings present in the above text, but match.Success always returns false. Any ideas?
This is because you have used ^ and $.
Explanation: Your regular expression means "match a string that consists of exactly 11 digits from start to end". The string You number is:11126564312 and 12234322121 \n\n23211212345 is not such a string; 01234567890 is.
What you need: a regular expression that matches 11 digits anywhere in the string. ^ and $ anchor the match to the start and end, so leave them out:
String pattern = @"(\d{11})";
As the sub-pattern to capture is the whole regex, you don't need the () at all. Just the regex will do:
String pattern = @"\d{11}";
String pattern = @"^(\d{11})$";
String input = "11126564312";
Match match = Regex.Match(input,pattern);
will pass.
Your regex specifies that the string has to be 11 digits ONLY:
^ = starts with
$ = ends with
If you want to check whether it contains 11 digits, change the regex to
String pattern = @"\d{11}";
Your Regex matches a string that has exactly 11 digits, but no text before, between or after. That is why you don't get any matches here.
To match 11 digits anywhere in the string, simply use:
string pattern = @"\d{11}";
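Since the question asks for all of the 11-digit strings, one way to collect them with the unanchored pattern is Regex.Matches. This is a sketch assuming top-level statements, with illustrative variable names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "You number is:11126564312 and 12234322121 \n\n23211212345";

// Regex.Matches returns every non-overlapping match, not just the first one.
string[] numbers = Regex.Matches(input, @"\d{11}")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(numbers.Length);            // 3
Console.WriteLine(string.Join(",", numbers)); // 11126564312,12234322121,23211212345
```

Note that \d{11} would also match the first 11 digits of a longer run of digits; whether that matters depends on the input.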
