Using RegEx to ignore empty strings and spaces - c#

I have the following simple test, which doesn't return true for some reason.
string[] test = new string[] { "A", " ", " ", "D", "" };
Regex reg = new Regex(@"^[A-Z]\s$");
bool ok = test.All(x => reg.IsMatch(x));
I've also tried putting the \s inside the square brackets, but that doesn't work either.
I want to make sure that all elements in the array that are not empty or blank spaces match A-Z.
I realise I could do a Where(x => !String.IsNullOrEmpty(x) && x != " ") before the All, but I thought Regex could handle this scenario.

I think you want:
Regex reg = new Regex(@"^[A-Z\s]*$");
That basically says "the string consists entirely of whitespace or A-Z".
If you want to force it to be a single character or empty, just change it to:
Regex reg = new Regex(@"^[A-Z\s]?$");

Enumerable.All<TSource> method: determines whether all elements of a sequence satisfy a condition.

The regular expression ^[A-Z]\s$ says: two-character string, whose first character is A-Z and the second is a white space. What you actually want is ^[A-Z\s]*$.
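As a minimal sketch of the two suggested patterns applied to the array from the question (the class and method names here are hypothetical, chosen only for illustration):

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class UppercaseOrBlankCheck
{
    // Accepts strings made up only of A-Z and whitespace, including the empty string.
    private static readonly Regex UpperOrWhitespace = new Regex(@"^[A-Z\s]*$");

    // Stricter variant: at most one character, which must be A-Z or whitespace.
    private static readonly Regex SingleUpperOrWhitespace = new Regex(@"^[A-Z\s]?$");

    public static bool AllValid(string[] items) =>
        items.All(x => UpperOrWhitespace.IsMatch(x));

    public static bool AllSingleValid(string[] items) =>
        items.All(x => SingleUpperOrWhitespace.IsMatch(x));
}
```

With the array from the question, `UppercaseOrBlankCheck.AllValid(new[] { "A", " ", " ", "D", "" })` should return true, while an element such as "1" makes it return false.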

Related

C# Regex split() without removing the split condition character

I am splitting a string with regex using its Split() method.
var splitRegex = new Regex(@"[\s|{]");
string input = "/Tests/ShowMessage { 'Text': 'foo' }";
//second version of the input:
//string input = "/Tests/ShowMessage{ 'Text': 'foo' }";
string[] splittedText = splitRegex.Split(input, 2);
The string is just a sample pattern of the input. There are two different structures of input, once with a space before the { or without the space. I want to split the input on the { bracket in order to get the following result:
/Tests/ShowMessage
{ 'Text': 'foo' }
If there is a space, the string gets split there (the space gets removed) and I get my desired result. But if there isn't a space, the string is split on the {, so the { gets removed, which I don't want. How can I use Regex.Split() without removing the split condition character?
The square brackets create a character set, which matches exactly one of the characters inside it (note that | is a literal character there, not alternation). For what you want, start by removing them.
To match any number of whitespace characters, add *, which gives \s*.
\s is a whitespace character
* means zero-or-more
To keep the split condition character, you can use a lookahead assertion (?=...).
(?=...) is a positive lookahead assertion and (?!...) is a negative one
The combined Regex looks like this: \s*(?={)
This is a really good and detailed documentation of all the different Regex parts; you might have a look at it. Furthermore, you can test your Regex easily and for free here.
In order to not include the curly brace in the match, you can put it into a lookahead:
\s*(?={)
That will match any number of whitespace characters up to the position before an open curly brace.
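A small sketch of the lookahead approach, assuming both input shapes from the question. The class and method names are made up for illustration, and the brace is escaped as \{ here purely to be explicit:

```csharp
using System.Text.RegularExpressions;

public static class BraceSplitter
{
    // \s* consumes optional whitespace; (?=\{) only asserts that a '{' follows,
    // so the brace itself is not part of the match and survives the split.
    private static readonly Regex BeforeBrace = new Regex(@"\s*(?=\{)");

    // Splits into at most two parts: the path and the brace-delimited payload.
    public static string[] SplitBeforeBrace(string input) =>
        BeforeBrace.Split(input, 2);
}
```

Both "/Tests/ShowMessage { 'Text': 'foo' }" and "/Tests/ShowMessage{ 'Text': 'foo' }" should come back as "/Tests/ShowMessage" and "{ 'Text': 'foo' }".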
You can use regular string split, on "{" and trim the spaces off:
var bits = "/Tests/ShowMessage { 'Text': 'foo' }".Split("{", StringSplitOptions.RemoveEmptyEntries);
bits[0] = bits[0].TrimEnd();
bits[1] = "{" + bits[1];
If you want to use the RegEx route, you can add the { back if you change the regex a bit:
var splitRegex = new Regex(@"\s*{");
string input = "/Tests/ShowMessage { 'Text': 'foo' }";
//second version of the input:
//string input = "/Tests/ShowMessage{ 'Text': 'foo' }";
string[] splittedText = splitRegex.Split(input, 2);
splittedText[1] = "{" + splittedText[1];
It means "split at an occurrence of (zero or more whitespace characters followed by {)", so the split operation removes your spaces (which you want) and your { (which you don't want), but you can put the { back, knowing that the result will be what you want.
var splitedList = srt.Text.Replace(".", ".#").Replace("?", "?#").Replace("!", "!#").Split(new[] { "#"}, StringSplitOptions.RemoveEmptyEntries).ToList();
This splits the text on ., ! and ? without removing those characters. For a better result, replace # with some unique character that cannot occur in the text, for example '®'. That is all. No Regex.Split, which can be slow and awkward when there are many different split criteria.
Input: "Hello. I'am dev!"
Result (the split condition characters are kept; note the leading space on the second item, which Trim() can remove):
"Hello."
" I'am dev!"

To check whether there are words matching in a sentence with c# regular expression

I have a function that accepts 2 parameters.
Parameter 1: SearchTerm,
Parameter 2: ProductName
How do I check whether the words in SearchTerm exist in ProductName, regardless of whether they occur at the beginning, middle or end of ProductName?
It has to be a word-by-word match: for example, with SearchTerm = "cano" and ProductName = "canon", it should return false (no match).
If you want to match only complete words you need word boundaries \b, to add before and after your search term.
\b is a zero width assertion that matches on a change from a word to a non-word character or from a non-word to a word character.
String term = "Foo";
String[] text = { "This contains Foo bar.", "Foo.", "Foobar", "BarFoo", "foo" };
Regex reg = new Regex(@"\b" + Regex.Escape(term) + @"\b");
foreach (var item in text) {
    Match word = reg.Match(item);
    if (word.Success) {
        Console.WriteLine(item + " => valid");
    }
    else {
        Console.WriteLine(item + " => invalid");
    }
}
Output:
This contains Foo bar. => valid
Foo. => valid
Foobar => invalid
BarFoo => invalid
foo => invalid
Because you want to be able to specify that it's a separate word and not a sub-word, you'll need to use regexes.
Your regex will probably look like this, if the word you're searching for is stored in the variable "lol":
Regex regex1 = new Regex(Regex.Escape(lol) + @"[^a-zA-Z]"); // any non-letter may follow, so punctuation as in "can." still matches
Essentially, you want to try to match just that word, and make sure that there's a character after it that isn't another letter. That way, you know it's not part of a longer word.
Edit: Try this beauty instead. Learned something myself.
string sPattern = @"\b" + Regex.Escape(lol) + @"\b";
Here's some example usage.
Edit2: Looks like stema got it first. Here's the page I used, for reference.
You don't need regex for a simple substring search (note that this also matches partial words, such as "cano" inside "canon"):
ProductName.Contains(searchTerm);
http://msdn.microsoft.com/en-us/library/dy85x1sa.aspx

How to match a string, ignoring ending newline?

I want to be able to match an entire string (hence the ^ and $ anchors) against a pattern "ABC" ("ABC" is just used for convenience; I don't want to check for equality with a fixed string), so newlines are significant to me. However, it appears that a single "\n" at the end of a string is ignored. Is there something wrong with my pattern?
Regex r = new Regex(@"^ABC$");
string[] strings =
{
"ABC",//True
"ABC\n",//True: But, I want it to say false.
"ABC\n\n",//False
"\nABC",//False
"ABC\r",//False
"ABC\r\n",//False
"ABC\n\r"//False
};
foreach(string s in strings)
{
Console.WriteLine(r.IsMatch(s));
}
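Try this (not tested):
Regex r = new Regex(@"\AABC\z");
\A = Anchor for the beginning of the string
\z = Anchor for the very end of the string
^ = Anchor for the beginning of the string (or of each line with RegexOptions.Multiline)
$ = Anchor for the end of the string or just before a final \n (or the end of each line with RegexOptions.Multiline)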
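To illustrate the difference, here is a short sketch comparing the question's pattern with the \A...\z version. The helper names are hypothetical; the behaviour described is that $ tolerates one trailing "\n" while \z does not:

```csharp
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // $ matches at the end of the string or just before a single trailing "\n".
    public static bool MatchesWithDollar(string s) => Regex.IsMatch(s, @"^ABC$");

    // \z matches only at the very end of the string, so a trailing "\n" is rejected.
    public static bool MatchesWithZ(string s) => Regex.IsMatch(s, @"\AABC\z");
}
```

For "ABC\n", `MatchesWithDollar` should return true and `MatchesWithZ` should return false; for "ABC", both return true.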

How do I use a regular expression to match strings that don't start with an empty space?

I want a regular expression that checks that a string doesn't start with an empty space.
I want to do something like this:
Is the following ValidationExpression right for it?
string ValidationExpression = @"/^[^ ]/";
if (!String.IsNullOrEmpty(GroupName) && !Regex.IsMatch(GroupName, ValidationExpression))
{
}
How about "^\S"
This will make sure that the first character is not a whitespace character.
You can also use:
bool startsWithSpace = GroupName.StartsWith(" "); // true when GroupName begins with a space
Regex rx = new Regex(@"^\s+");
You can check with
Match m1 = rx.Match(" "); //m1.Success should be true
Match m2 = rx.Match("qwerty "); //m2.Success should be false
Something like this, maybe :
/^[^ ]/
And, for a couple of notes about that :
The first ^ means "string starts with"
The [^ ] means "one character that is not a space"
And the // are regex delimiters. They are not used in C#, where the pattern is passed as a plain string, so there they would be treated as literal slashes.
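Putting the ^\S suggestion into a validation helper could look like the sketch below. The class and method names are hypothetical, and treating null or empty input as invalid is an assumption made here, not something the question specifies:

```csharp
using System.Text.RegularExpressions;

public static class GroupNameValidator
{
    // ^\S requires the first character to be something other than whitespace.
    // No / delimiters: in C# the pattern is just a string.
    private static readonly Regex StartsWithNonWhitespace = new Regex(@"^\S");

    // Assumption: null and empty names count as invalid.
    public static bool IsValid(string groupName) =>
        !string.IsNullOrEmpty(groupName) && StartsWithNonWhitespace.IsMatch(groupName);
}
```

Note that \S rejects tabs and other whitespace as well as spaces; use [^ ] instead if only the space character should be disallowed.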

C# Regex.Split - Subpattern returns empty strings

Hey, first time poster on this awesome community.
I have a regular expression in my C# application to parse an assignment of a variable:
NewVar = 40
which is entered in a Textbox. I want my regular expression to return (using Regex.Split) the name of the variable and the value, pretty straightforward. This is the Regex I have so far:
var r = new Regex(@"^(\w+)=(\d+)$", RegexOptions.IgnorePatternWhitespace);
var mc = r.Split(command);
My goal was to do the trimming of whitespace in the Regex and not use the Trim() method on the returned values. Currently, it works, but it returns an empty string at the beginning of the returned array and an empty string at the end.
Using the above input example, this is what's returned from Regex.Split:
mc[0] = ""
mc[1] = "NewVar"
mc[2] = "40"
mc[3] = ""
So my question is: why does it return an empty string at the beginning and the end?
Thanks.
The reason Regex.Split is returning four values is that you have exactly one match, so Regex.Split is returning:
All the text before your match, which is ""
All () groups within your match, which are "NewVar" and "40"
All the text after your match, which is ""
Regex.Split's primary purpose is to extract the text between matches of the regex; for example, you could use Regex.Split with a pattern of "[,;]" to split text on either commas or semicolons. In .NET Framework 1.0 and 1.1, Regex.Split only returned the split values, in this case "" and "", but in .NET Framework 2.0 it was modified to also include values matched by () within the Regex, which is why you are seeing "NewVar" and "40" at all.
What you were looking for is Regex.Match, not Regex.Split. It will do exactly what you want:
var r = new Regex(@"^(\w+)=(\d+)$");
var match = r.Match(command);
var varName = match.Groups[1].Value;   // Groups[0] is the whole match; captures start at 1
var valueText = match.Groups[2].Value;
Note that RegexOptions.IgnorePatternWhitespace means you can include extra spaces in your pattern; it has nothing to do with the matched text. Since you have no extra whitespace in your pattern, it is unnecessary.
From the docs, Regex.Split() uses the regular expression as the delimiter to split on. It does not split the captured groups out of the input string. Also, IgnorePatternWhitespace ignores unescaped whitespace in your pattern, not in the input.
Instead, try the following:
var r = new Regex(@"\s*=\s*");
var mc = r.Split(command);
Note that the whitespace is actually consumed as a part of the delimiter.
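If the goal is to absorb the whitespace in the pattern itself and still use capture groups, one possible sketch is shown below. The class name, the TryParse shape and the \s* placement are assumptions for illustration, not the asker's code:

```csharp
using System.Text.RegularExpressions;

public static class AssignmentParser
{
    // \s* around the name, the '=' and the value absorbs optional whitespace
    // in the input, so no Trim() calls are needed afterwards.
    private static readonly Regex Assignment =
        new Regex(@"^\s*(\w+)\s*=\s*(\d+)\s*$");

    public static bool TryParse(string command, out string name, out int value)
    {
        name = null;
        value = 0;

        Match m = Assignment.Match(command);
        if (!m.Success) return false;

        name = m.Groups[1].Value;                      // first capture: variable name
        return int.TryParse(m.Groups[2].Value, out value); // second capture: digits
    }
}
```

For the input "NewVar = 40" this should yield the name "NewVar" and the value 40, while input without a value, such as "NewVar =", is rejected.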
