C# regex for contains neither this and that


I am trying to create a regex in C# that allows only strings with more than 3 characters, BUT if the string starts with 'sch' it should have a minimum length of 6, and if it starts with 'st' or 'ch' it should have a minimum length of 5.
The second part is pretty easy, but the first part (all other strings need a length of 3) is more complicated:
"(^(SCH).{3})|(^(ST).{3})|(^(CH).{3})|^(!SCH).{3}"
Thanks for your help!

It seems you want something like this:
@"^SCH.{3,}|^(?:ST|CH).{3,}|^(?!S?CH|ST).{3,}"
The {3,} in .{3,} repeats the previous token, which is . (it matches any character), 3 or more times.
^(?!S?CH|ST).{3,} handles the remaining case: if the string doesn't start with SCH, ST, or CH, it is matched only if it has at least three characters.
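To sanity-check the pattern above, here is a minimal sketch written as a C# top-level program. The helper name MatchesRule and the sample words are made up for illustration, and RegexOptions.IgnoreCase is an assumption added because the question writes the prefixes in lower case while the pattern uses upper case.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the pattern is the one from the answer above.
// IgnoreCase is an assumption, so that "school" and "SCHOOL" behave alike.
bool MatchesRule(string word) =>
    Regex.IsMatch(word, @"^SCH.{3,}|^(?:ST|CH).{3,}|^(?!S?CH|ST).{3,}", RegexOptions.IgnoreCase);

// Made-up sample words covering each branch of the alternation.
foreach (var word in new[] { "school", "schab", "stone", "chat", "abc", "ab" })
    Console.WriteLine($"{word}: {MatchesRule(word)}");
```

Note that the pattern accepts other strings from 3 characters upwards (.{3,}), whereas the question's wording "more than 3" would correspond to .{4,}.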

Personally I wouldn't use a regex for that. Just use standard string operations.
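bool IsValid(string str)
{
    // "st" and "ch" prefixes need at least 5 characters
    if (str.StartsWith("st") || str.StartsWith("ch"))
        return str.Length >= 5;
    // "sch" prefix needs at least 6 characters
    if (str.StartsWith("sch"))
        return str.Length >= 6;
    // everything else needs more than 3 characters
    return str.Length > 3;
}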

Related

Regex n 0 and then a 1

I am looking for a regex to match n zeros and then a 1.
E.g:
0000100 -> matches
00200 -> does not match
I thought it would be something like this:
var regex = new Regex(@"[0]*[2-9]+");

^[0]+[1] is what you want:
^ start of line
[0] match 0
+ at least once
[1] match 1
You could also add a $ at the end, if you want it to match a complete line.
Note: if you want to be able to match n=0 (i.e. just a 1), you need:
^[0]*[1]
Note: the brackets [] are optional as they only contain one character, but I think they make it easier to read. So you could have ^0+1 if you prefer, for example.
See also http://regexstorm.net/reference (for example) for a complete C# regex reference
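As a quick illustration of the difference between the two patterns above, here is a minimal sketch as a C# top-level program. The helper names are hypothetical; the inputs are the two examples from the question plus a bare "1" for the n=0 case.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper names, for illustration only.
bool ZerosThenOne(string s) => Regex.IsMatch(s, "^0+1");         // one or more zeros, then a 1
bool OptionalZerosThenOne(string s) => Regex.IsMatch(s, "^0*1"); // zero or more zeros, then a 1

foreach (var s in new[] { "0000100", "00200", "1" })
    Console.WriteLine($"{s}: ^0+1 = {ZerosThenOne(s)}, ^0*1 = {OptionalZerosThenOne(s)}");
```

Only the bare "1" separates the two: ^0+1 rejects it because at least one leading zero is required, while ^0*1 accepts it.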

Linq solution (and no regular expression):
string source = "0001";
bool isValid = source
.SkipWhile(c => c == '0')
.FirstOrDefault() == '1';
If you insist on regular expression:
bool isValid = Regex.IsMatch(source, "^0*1");
Both versions accept zero or more 0s followed by a 1.

There is no need of regex here: left-trim the string from 0, and if it is not null or empty, check the first char.
var s = "00001";
if (!string.IsNullOrEmpty(s.TrimStart('0')) &&
s.TrimStart('0').Substring(0, 1) == "1")
{
/* Valid, otherwise, not */
}
This will work if you have the digits at the beginning of the string and just need a boolean result.

Replace one character but not two in a string

I want to replace single occurrences of a character but not two in a string using C#.
For example, I want to replace & with an empty string, but not when the occurrence is &&. As another example, a&b&&c would become ab&&c after the replacement.
If I use a regex like &[^&], it will also match the character after the & and I don't want to replace it.
Another solution I found is to iterate over the string characters.
Do you know a cleaner solution to do that?

To only match one & (not preceded or followed by &), use look-arounds (?<!&) and (?!&):
(?<!&)&(?!&)
You tried to use a negated character class that still matches a character, and you need to use a look-ahead/look-behind to just check for some character absence/presence, without consuming it.
See regular-expressions.info:
Negative lookahead is indispensable if you want to match something not followed by something else. When explaining character classes, this tutorial explained why you cannot use a negated character class to match a q not followed by a u. Negative lookahead provides the solution: q(?!u).
Lookbehind has the same effect, but works backwards. It tells the regex engine to temporarily step backwards in the string, to check if the text inside the lookbehind can be matched there. (?<!a)b matches a "b" that is not preceded by an "a", using negative lookbehind. It doesn't match cab, but matches the b (and only the b) in bed or debt.

You can match both & and && (or any number of repetition) and only replace the single one with an empty string:
str = Regex.Replace(str, "&+", m => m.Value.Length == 1 ? "" : m.Value);

You can use this regex: @"(?<!&)&(?!&)"
var str = Regex.Replace("a&b&&c", @"(?<!&)&(?!&)", "");
Console.WriteLine(str); // ab&&c

You can go with this:
public static string replacement(string oldString, char charToRemove)
{
string newString = "";
bool found = false;
foreach (char c in oldString)
{
if (c == charToRemove && !found)
{
found = true;
continue;
}
newString += c;
}
return newString;
}
Which is as generic as possible

I would use something like this, which IMO should be better than using Regex:
public static class StringExtensions
{
public static string ReplaceFirst(this string source, char oldChar, char newChar)
{
if (string.IsNullOrEmpty(source)) return source;
int index = source.IndexOf(oldChar);
if (index < 0) return source;
var chars = source.ToCharArray();
chars[index] = newChar;
return new string(chars);
}
}

I'll contribute to this statement from the comments:
in this case, only the substring with odd number of '&' will be replaced by all the "&" except the last "&" . "&&&" would be "&&" and "&&&&" would be "&&&&"
This is a pretty neat solution using balancing groups (though I wouldn't call it particularly clean nor easy to read).
Code:
string str = "11&222&&333&&&44444&&&&55&&&&&";
str = Regex.Replace(str, "&((?:(?<2>&)(?<-2>&)?)*)", "$1$2");
Output:
11222&&333&&44444&&&&55&&&&
It always matches the first & (not captured).
If it's followed by an even number of &, they're matched and stored in $1. The second group is captured by the first of each pair, but then it's subtracted by the second.
However, if there's an odd number of &, the optional group (?<-2>&)? does not match, and the group is not subtracted. Then, $2 will capture an extra &.
For example, matching the subject "&&&&", the first char is consumed and it isn't captured (1). The second and third chars are matched, but $2 is subtracted (2). For the last char, $2 is captured (3). The last 3 chars were stored in $1, and there's an extra & in $2.
Then, the substitution "$1$2" == "&&&&".

finding middle character in string using regex only

How can I find the middle character of a string using regex only?
For example, this shows the expected output:
Hello -> l
world -> r
merged -> rg (two characters for an even number of characters)
hi -> hi
I -> I
I tried
(?<=\w+).(?=\w+)

Regular expressions cannot count in the way that you are looking for. This looks like something regular expressions cannot accomplish. I suggest writing code to solve this.

String str = "Hello";
String mid;
int len = str.length();
if (len % 2 == 1)
    // odd length: the single middle character
    mid = Character.toString(str.charAt(len / 2));
else
    // even length: the two middle characters, in order
    mid = Character.toString(str.charAt(len / 2 - 1)) + Character.toString(str.charAt(len / 2));
This should probably work.

public static void main(String[] args) {
    String s = "jogijogi";
    int size = s.length() / 2;
    String temp = "";
    if (s.length() % 2 == 0) {
        temp = s.substring(size - 1, (s.length() - size) + 1);
    } else if (s.length() % 2 != 0) {
        temp = s.substring(size, (s.length() - size));
    } else {
        temp = s.substring(1);
    }
    System.out.println(temp);
}

Related: How to match the middle character in a string with regex?
The following regex is based on @jaytea's approach and works well with e.g. PCRE, Java or C#.
^(?:.(?=.+?(.\1?$)))*?(^..?$|..?(?=\1$))
Here is the demo at regex101 and a .NET demo at RegexPlanet (click the green ".NET" button)
The middle character(s) will be found in the second capturing group. The goal is to capture two middle characters if the string has an even number of characters, otherwise one. It works by growing a capture towards the end (first group) while lazily stepping through the string, until the string ends with the captured substring, which grows with each repetition. ^..?$ is used to match strings that are one or two characters long.
This "growing" works with capturing inside a repeated lookahead by placing an optional reference to the same group together with a freshly captured character into that group (further reading here).
A PCRE-variant with \K to reset and full matches: ^(?:.(?=.+?(.\1?$)))+?\K..?(?=\1$)|^..?
Curious about the "easy solution using balancing groups" that @Qtax mentions in his question.
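For C# specifically, reading the second capturing group of the first pattern above could look like the following minimal sketch (a top-level program). The helper name Middle is hypothetical, and the inputs are the examples from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the middle character(s), taken from capturing group 2.
string Middle(string s)
{
    var m = Regex.Match(s, @"^(?:.(?=.+?(.\1?$)))*?(^..?$|..?(?=\1$))");
    return m.Success ? m.Groups[2].Value : "";
}

foreach (var s in new[] { "Hello", "world", "merged", "hi", "I" })
    Console.WriteLine($"{s} -> {Middle(s)}");
```

The whole match also includes the characters consumed before the middle, which is why the sketch reads Groups[2] rather than the match value.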

check if char isletter

I want to check whether a string contains only correct letters.
I used Char.IsLetter for this.
My problem is that chars like é or á are also reported as correct letters, which they shouldn't be.
Is there a way to check that a char is a letter A-Z or a-z, without special letters like á?

bool IsEnglishLetter(char c)
{
return (c>='A' && c<='Z') || (c>='a' && c<='z');
}
You can make this an extension method:
static bool IsEnglishLetter(this char c) ...

You can use Char.IsLetter(c) && c < 128 . Or just c < 128 by itself, that seems to match your problem the closest.
But you are solving an Encoding issue by filtering chars. Do investigate what that other application understands exactly.
It could be that you should just be writing with Encoding.GetEncoding(someCodePage).

You can use the regular expression [a-zA-Z] for this. (\w is broader: in .NET it also matches digits, the underscore, and non-ASCII letters such as é.)

// Create the regular expression
string pattern = @"^[a-zA-Z]+$";
Regex regex = new Regex(pattern);
// Compare a string against the regular expression
return regex.IsMatch(stringToTest);

In C# 9.0 you can use pattern matching enhancements.
public static bool IsLetter(this char c) =>
c is >= 'a' and <= 'z' or >= 'A' and <= 'Z';

As of .NET 7 there is a Char.IsAsciiLetter method, which meets the requirements exactly:
https://learn.microsoft.com/en-za/dotnet/api/system.char.isasciiletter?view=net-7.0

Use Linq for easy access:
if (yourString.All(char.IsLetter))
{
//just letters are accepted.
}
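Note that char.IsLetter also accepts letters such as é, which the question wants to reject. A minimal sketch that combines All with an explicit A-Z/a-z range check might look like this (a top-level program; the helper name IsEnglishLettersOnly is hypothetical, and treating the empty string as invalid is an assumption).

```csharp
using System;
using System.Linq;

// Hypothetical helper: true only if the string is non-empty and every char is A-Z or a-z.
bool IsEnglishLettersOnly(string s) =>
    s.Length > 0 && s.All(c => (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'));

Console.WriteLine(IsEnglishLettersOnly("Cafe"));       // plain ASCII letters
Console.WriteLine(IsEnglishLettersOnly("Caf\u00E9"));  // ends with é (U+00E9)
Console.WriteLine("Caf\u00E9".All(char.IsLetter));     // char.IsLetter alone, for comparison
```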

How to check first character of a string if a letter, any letter in C#

I want to take a string and check the first character for being a letter, upper or lower doesn't matter, but it shouldn't be special, a space, a line break, anything. How can I achieve this in C#?

Try the following
string str = ...;
bool isLetter = !String.IsNullOrEmpty(str) && Char.IsLetter(str[0]);

Try the following
bool isValid = char.IsLetter(name.FirstOrDefault());

return (myString[0] >= 'A' && myString[0] <= 'Z') || (myString[0] >= 'a' && myString[0] <= 'z');

You should look up the ASCII table, a table which systematically maps characters to integer values. All lower-case characters are sequential (97-122), as are all upper-case characters (65-90). Knowing this, you do not even have to cast to the int values, just check if the first char of the string is within one of those two ranges (inclusive).
