C# Regular expression hard drive issue

I'm trying to write a regular expression that detects whether the user chose a hard drive (e.g. the "C:\" drive).
I've tried:
Match reg = Regex.Match(location, @"/[A-Z][:][\\]/");
And:
Match reg = Regex.Match(location, "/[A-Z][:][\\]/");
The first line doesn't detect anything; the second line ends with an exception: System.ArgumentException

Presumably, you want to check that the string is exactly something like C:\, but not something like ABC:\\ and my dog. You need the anchors ^ and $:
^[A-Z]:\\$
In code:
foundMatch = Regex.IsMatch(yourstring, @"^[A-Z]:\\$");
Note that I have removed the brackets you had in [:] and [\\] (not necessary since in each of these cases we are matching a single literal character, not one character from a class of several possible characters).

Remove the leading and trailing / characters from the pattern; they're not part of the .NET regex syntax.

It's much simpler than you've got. All you need is this:
Match reg = Regex.Match(location, @"^[A-Z]:\\$");
The @"..." syntax is a verbatim string, which simplifies regexes (and paths).
^ will force the match to succeed only if it's at the start of the string
[A-Z] is as you had, matching the drive letter.
:\\ are the literal characters : and \, with the backslash doubled up so the regex doesn't try to treat it specially.
$ will force the match to succeed only if it's at the end of the string
The ^ and $ thus force it to match the whole input string, rather than potentially matching a string in the middle.
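
To see the anchored pattern in action, here is a minimal sketch. The class and method names (DriveRootCheck, IsDriveRoot) are made up for illustration and are not from the question; only the pattern comes from the answers above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DriveRootCheck
{
    // Anchored pattern from the answers: one uppercase letter, a colon,
    // one literal backslash, and nothing before or after.
    private static readonly Regex DriveRoot = new Regex(@"^[A-Z]:\\$");

    // Hypothetical helper: true only for inputs such as C:\
    public static bool IsDriveRoot(string location)
    {
        return DriveRoot.IsMatch(location);
    }
}
```

With this, DriveRootCheck.IsDriveRoot(@"C:\") should be true, while @"C:\Temp", @"ABC:\" and a lowercase @"c:\" should all be rejected. If lowercase drive letters need to pass, RegexOptions.IgnoreCase is one way to allow them.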

Related

C# - Removing single word in string after certain character

I have string that I would like to remove any word following a "\", whether in the middle or at the end, such as:
testing a\determiner checking test one\pronoun
desired result:
testing a checking test one
I have tried a simple regex that removes anything between the backslash and whitespace, but it gives the following result:
string input = @"testing a\determiner checking test one\pronoun";
Regex regex = new Regex(@"\\.*\s");
string output = regex.Replace(input, " ");
Result:
testing a one\pronoun
It looks like this regex matches from the backslash until the last whitespace in the string. I cannot seem to figure out how to match from the backslash to the next whitespace. Also, I am not guaranteed a whitespace at the end, so I would need to handle that. I could continue processing the string and remove any text after the backslash, but I was hoping I could handle both cases in one step.
Any advice would be appreciated.

Change .*, which matches any characters, to \w*, which only matches word characters.
Regex regex = new Regex(@"\\\w*");
string output = regex.Replace(input, "");

".*" matches zero or more characters of any kind. Consider using "\w+" instead, which matches one or more "word" characters (not including whitespace).
Using "+" instead of "*" would allow a backslash followed by a non-"word" character to remain unmatched. For example, no matches would be found in the sentence "Sometimes I experience \ an uncontrollable compulsion \ to intersperse backslash \ characters throughout my sentences!"

With your current pattern, .* is "greedy": it takes as much of the string as possible, so the match runs to the last space rather than the first. Adding a ? right after that * makes it lazy instead, so the match is as small as possible and stops as soon as it hits the first space.
Next, you want to end at not just a space, but at either a space or the end of the string. The $ symbol matches the end of the string, and | means or. Group those together using parentheses and the group tells the parser to stop at either a space or the end of the string. Your code will look like this:
string input = @"testing a\determiner checking test one\pronoun";
Regex regex = new Regex(@"\\.*?(\s|$)");
string output = regex.Replace(input, " ");
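
For comparison, here is a small sketch that runs both suggestions (the \w* pattern and the lazy pattern) on the sample input. The class and method names are hypothetical; only the two patterns come from the answers.

```csharp
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Removes a backslash plus the word characters that follow it.
    // The surrounding whitespace is left untouched.
    public static string StripWordTags(string input)
    {
        return Regex.Replace(input, @"\\\w*", "");
    }

    // Lazy variant: backslash, as few characters as possible, then
    // either one whitespace character or the end of the string.
    // Each match, including the one at the end, is replaced by a space.
    public static string StripLazy(string input)
    {
        return Regex.Replace(input, @"\\.*?(\s|$)", " ");
    }
}
```

On the question's input, StripWordTags should give "testing a checking test one". StripLazy gives the same words but with a trailing space, because the final match at the end of the string is also replaced by " "; a Trim() afterwards would take care of that.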

Try this regex (\\[^\s]*)
(\\[^\s]*)
1st Capturing group (\\[^\s]*)
\\ matches the character \ literally
[^\s]* match a single character not present in the list below
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
\s match any white space character [\r\n\t\f ].

regex capture multi character delimiter

I'm trying to learn regex, but still have no clue. I have this line of code, which successfully separates the placeholder 'FirstWord' by the '{' delimiter from all following text:
var regexp = new Regex(@"(?<FirstWord>.*?)\{(?<TextBetweenCurlyBrackets>.*?)\}");
Which reads this string with no problem:
Greetings{Hello World}
What I want to do is to replace the '{' with a character chain like for instance '/>>'
so I tried this:
var regexp = new Regex(@"(?<FirstWord>.*?)\/>>(?<OtherText>.*?)\");
I removed the last bracket and replaced the first one with '/>>', but it throws an ArgumentException. What would the correct character combination look like?

/ does not need to be escaped unless you use it as the pattern delimiter, and .NET patterns have no delimiters:
@"(?<FirstWord>.*?)/>>(?<OtherText>.*?)\"
Also, in a verbatim string the closing " still ends the string, so your pattern ends with a lone \. That is an incomplete escape sequence in the regex, which is what throws the ArgumentException. Remove it:
@"(?<FirstWord>.*?)/>>(?<OtherText>.*?)"
And since you most likely want to fetch everything up to the END of the string (.*? will fetch as few characters as required to satisfy the expression), you should put $ at the end or use some other delimiter (whitespace, line break, etc.):
@"(?<FirstWord>.*?)/>>(?<OtherText>.*?)$"
Example:
(.*?)/>>(.*?)$
Removing the trailing $ will fetch the empty string for the second match group, because "" is the shortest string possible satisfying the expression .*?
(.*?)/>>(.*?)$ on This/>>Test One will match This and Test One
(.*?)/>>(.*?)\s on This/>>Test One will match This and Test
(.*?)/>>(.*?) on This/>>Test One will match This and ""
Note: I'm saying "" is the shortest string possible satisfying the expression .*? on purpose! A frequent mistake is to interpret .*?a as "everything until a":
Quantifiers are greedy by default; the trailing ? makes .*? lazy, but lazy does not mean "stop at the first a".
Searching for the expression (.*?)a$ on "caba" will NOT fail to match. It will return cab, because cab followed by a satisfies the expression AND cab is the shortest string possible for a match starting at that position.
One might also expect b to be matched, but the regex engine works from left to right and stops once it has found cab, even though b would be shorter.
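
The cases above can be checked with a short sketch like the following. The class and method names are invented for illustration; the patterns are the ones discussed in this answer.

```csharp
using System.Text.RegularExpressions;

public static class MarkerSplit
{
    // Returns { FirstWord, OtherText }, or null when "/>>" is absent.
    public static string[] SplitOnMarker(string input)
    {
        Match m = Regex.Match(input, @"(?<FirstWord>.*?)/>>(?<OtherText>.*?)$");
        if (!m.Success)
        {
            return null;
        }
        return new[] { m.Groups["FirstWord"].Value, m.Groups["OtherText"].Value };
    }

    // Group 1 of the lazy pattern (.*?)a$ for the given input.
    public static string LazyPrefix(string input)
    {
        return Regex.Match(input, @"(.*?)a$").Groups[1].Value;
    }
}
```

SplitOnMarker("This/>>Test One") should return "This" and "Test One", and LazyPrefix("caba") should return "cab", which matches the left-to-right behaviour described above.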

Regex-like construction to match %([text]) where [text] can contain escaped parens

I'm trying to resolve tokens in a string.
What I would like is given input like this:
string input = @"asdf %(text) %(123) %(a\)a) asdf";
That I could run that through regex.Replace() and have it replace on "%(text)", "%(123)" and "%(a\)a)".
That is, that it would match everything between a starting "%(" and a closing ")" unless the closing ")" was escaped. (But of course, then you could escape the slash with another slash, which would prevent it from escaping the end paren...)
I'm pretty sure standard regular expressions can't do this, but I'm wondering if any of the various fancy expanded capabilities of the C# regular expression library could, rather than just iterating across the string totally manually? Or some other method that could do this? I feel like it's a common enough problem that there has to be some way to solve it without implementing the solution from scratch, given the immensity of the .NET Framework? If I do have to implement iterating through the string and replacing with string.Replace(), I will, but it just seems so inelegant.

How about
var regex = new Regex(@"%\(.*?(?<!\\)(?:\\\\)*\)");
var result = regex.Replace(source,"");
%\( match literal %(
.*? match anything non-greedy
(?<!\\) preceding character to next match must not be \
(?:\\\\)* match zero or more literal \\ (i.e. match escaped \)
\) match literal )
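
A quick way to check what this pattern picks up is to list the matches instead of replacing them. This is only a sketch; TokenFinder and FindTokens are hypothetical names, and the pattern is the one from this answer.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenFinder
{
    // %( ... ) where the closing paren is not escaped by a backslash;
    // an even run of backslashes before it is treated as escaped backslashes.
    private static readonly Regex Token = new Regex(@"%\(.*?(?<!\\)(?:\\\\)*\)");

    // Returns every token found in the source, in order.
    public static List<string> FindTokens(string source)
    {
        var tokens = new List<string>();
        foreach (Match m in Token.Matches(source))
        {
            tokens.Add(m.Value);
        }
        return tokens;
    }
}
```

For the question's input, this should find %(text), %(123) and %(a\)a). With a doubled backslash, as in %(a\\) b), the token should end at the first paren, since \\ is an escaped backslash rather than an escape for the paren.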

This is working for me :
String something = "\"asdf %(text) %(123) %(a\\)a) asdf\";";
String change = something.replaceAll("%\\(.*\\)", "");
System.out.println(change);
The output
"asdf asdf";

Regex Expression Only Numbers and Characters

I created the following regular expression for my C# file. Basically I want the user's input to contain only regular characters (A-Z, lower or upper case) and numbers (no spaces or symbols).
[a-zA-Z0-9]
For some reason it only fails when the input is a symbol on its own. If there are characters mixed in with it, the expression passes.
I can show you the code where I implement it, but I think the problem is my expression.
Thanks!

The problem is that it can match anywhere. You need anchors:
^[a-zA-Z0-9]+\z
^ matches the start of a string, and \z matches the end of a string.
(Note: in .NET regex, $ matches the end of a string with an optional newline.)
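
To make the difference between the unanchored pattern, $ and \z concrete, here is a small sketch. The class and method names are made up for this example; the patterns are the ones from the question and the answers.

```csharp
using System.Text.RegularExpressions;

public static class AlnumCheck
{
    // The question's pattern: succeeds if ANY character is alphanumeric.
    public static bool Unanchored(string s)
    {
        return Regex.IsMatch(s, "[a-zA-Z0-9]");
    }

    // Anchored with $: the whole string, but a final newline is tolerated.
    public static bool WholeStringDollar(string s)
    {
        return Regex.IsMatch(s, @"^[a-zA-Z0-9]+$");
    }

    // Anchored with \z: the whole string, with nothing allowed after it.
    public static bool WholeStringStrict(string s)
    {
        return Regex.IsMatch(s, @"^[a-zA-Z0-9]+\z");
    }
}
```

"abc!" passes Unanchored but fails both anchored checks, which is the behaviour described in the question. "abc\n" passes WholeStringDollar but fails WholeStringStrict, which is why \z is the safer choice for validating user input.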

This is because it will match any single alphanumeric character anywhere in the string. You need the following,
which forces it to match the entire string, not just part of it:
^[0-9a-zA-Z]*$

That regex will match every single alphanumeric character in the string as separate matches.
If you want to make sure the whole string the user entered only has alphanumeric characters you need to do something like:
^[a-zA-Z0-9]+$

Are you making sure to check the whole string? That is are you using an expression like
^[a-zA-Z0-9]*$
where ^ means the start of the string and $ means the end of the string?

Regex Expressions for all non alphanumeric symbols

I am trying to make a regular expression for a string that has at least 1 non alphanumeric symbol in it
The code I am trying to use is
Regex symbolPattern = new Regex("?[!@#$%^&*()_-+=[{]};:<>|./?.]");
I'm trying to match only one of !@#$%^&*()_-+=[{]};:<>|./?. but it doesn't seem to be working.

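If you want to match non-alphanumeric symbols then just use \W|_.
Regex pattern = new Regex(@"\W|_");
This will match any character that is not a letter or a digit: \W matches everything except word characters (such as letters, digits and the underscore), and |_ adds the underscore back. Information on the \W character class and others is available here (C# Regex Cheat Sheet).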
https://www.mikesdotnetting.com/article/46/c-regular-expressions-cheat-sheet

You could also avoid regular expressions if you want:
return s.Any(c => !char.IsLetterOrDigit(c));

Can you check for the opposite condition?
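// input is the string to check (name assumed; the original omitted the argument)
Match match = Regex.Match(input, @"^([a-zA-Z0-9]+)$");
if (match.Success) {
    // it's alphanumeric
} else {
    // it has one of those characters in it (or it is empty).
}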

I didn't get your entire question, but this regex will match strings that contain at least one non-alphanumeric character. That includes whitespace (I couldn't see that in your list, though). Note that \w also covers the underscore, so _ on its own is not matched.
[^\w]+

Your regex just needs a little tweaking. The hyphen is used to form ranges like A-Z, so if you want to match a literal hyphen, you either have to escape it with a backslash or move it to the end of the list. You also need to escape the square brackets because they're the delimiters for a character class. Then get rid of that question mark at the beginning and you're in business.
Regex symbolPattern = new Regex(@"[!@#$%^&*()_+=\[{\]};:<>|./?,-]");
If you only want to match ASCII punctuation characters, this is probably the simplest way. \W matches whitespace and control characters in addition to punctuation, and it matches them from the entire Unicode range, not just ASCII.
You seem to be missing a few characters, though: the backslash, apostrophe and quotation mark. Adding those gives you:
@"[!@#$%^&*()_+=\[{\]};:<>|./?,\\'""-]"
Finally, it's a good idea to always use C#'s verbatim string literals (@"...") for regexes; it saves you a lot of hassle with backslashes. Quotation marks are escaped by doubling them.
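
Here is a minimal sketch of how the extended character class could be used to test for at least one symbol. SymbolCheck and HasSymbol are hypothetical names; the pattern is the one given above, with the hyphen last and the brackets escaped.

```csharp
using System.Text.RegularExpressions;

public static class SymbolCheck
{
    // ASCII punctuation class from the answer; "" is a doubled quote
    // inside the verbatim string and stands for a single " character.
    private static readonly Regex Symbol =
        new Regex(@"[!@#$%^&*()_+=\[{\]};:<>|./?,\\'""-]");

    // True if the string contains at least one character from the class.
    public static bool HasSymbol(string s)
    {
        return Symbol.IsMatch(s);
    }
}
```

HasSymbol("pass-word") and HasSymbol("it's") should be true, while "abc123" and "a b" should be false, since whitespace is deliberately not part of this class (unlike with \W).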
