Regex to exclude all chars except letters - c#

I'm a real regex n00b, so I'm asking for your help:
I need a regex which matches only letters and numbers and excludes punctuation, non-ASCII characters and spaces.
"ilikestackoverflow2012" would be a valid string.
"f### you °§è" not valid.
"hello world" not valid
"hello-world" and "*hello_world*" not valid
and so on.
I need it to make a possibly complex business name url friendly.
Thanks in advance!

You don't need regex for this.
string s = "......";
var isValid = s.All(Char.IsLetterOrDigit);
-
I need it to make a possibly complex business name url friendly
You can also use HttpUtility.UrlEncode
var urlFriendlyString = HttpUtility.UrlEncode(yourString);
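One caveat with the LINQ approach above: Char.IsLetterOrDigit is Unicode-aware, so it also returns true for letters such as è, which the question wants to reject. A minimal sketch of an ASCII-only variant follows. The class and method names (AsciiCheck, IsAsciiAlphanumeric) are made up for illustration, and rejecting the empty string is an assumption chosen to mirror the + quantifier used in the regex answers.

```csharp
using System;
using System.Linq;

public static class AsciiCheck
{
    // True only for a non-empty string made entirely of ASCII letters and digits.
    // Assumption: the empty string is treated as invalid.
    public static bool IsAsciiAlphanumeric(string s)
    {
        return !string.IsNullOrEmpty(s) && s.All(IsAsciiLetterOrDigit);
    }

    // Explicit range checks, so accented or non-Latin letters are rejected.
    private static bool IsAsciiLetterOrDigit(char c)
    {
        return (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9');
    }
}
```

With this, AsciiCheck.IsAsciiAlphanumeric("ilikestackoverflow2012") is true, while "hello world" and "f### you °§è" both give false.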

To validate a string you can use the following regular expression with Regex.IsMatch:
"^[0-9A-Za-z]+$"
Explanation:
^ is a start of string anchor.
[...] is a character class.
+ means one or more.
$ is an end of string anchor.
I need it to make a possibly complex business name url friendly
Then you want to replace the characters that don't match. Use Regex.Replace with the following regular expression:
"[^0-9A-Za-z]+"
Explanation:
[^...] is a negated character class.
Code:
string result = Regex.Replace(input, "[^0-9A-Za-z]+" , "");
See it working online: ideone
Note that different business names could give the same resulting string. For example, businesses whose names contain only Chinese characters will all give the empty string.
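To make the collision problem concrete, here is a small sketch built around the same Regex.Replace call. The Slug class, the Strip and StripOrFallback names, and the idea of falling back to a caller-supplied identifier are all assumptions for illustration, not something the original answer prescribes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Slug
{
    // Same negated character class as in the answer: anything that is not
    // an ASCII letter or digit.
    private static readonly Regex NonAlnum = new Regex("[^0-9A-Za-z]+");

    // Removes every run of characters outside 0-9, A-Z, a-z.
    public static string Strip(string input)
    {
        return NonAlnum.Replace(input, "");
    }

    // Hypothetical helper: if nothing survives the stripping, use a fallback
    // value (for example a database id) instead of an empty URL segment.
    public static string StripOrFallback(string input, string fallback)
    {
        string stripped = Strip(input);
        return stripped.Length > 0 ? stripped : fallback;
    }
}
```

Slug.Strip("Smith & Sons") and Slug.Strip("Smith-Sons") both return "SmithSons", so the stripped name alone cannot serve as a unique key; a name made only of Chinese characters returns "".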

You can use below regex.
^[a-zA-Z0-9]+$

^[0-9a-zA-Z]+$
Matches one or more alphanumeric characters with no spaces or non-alpha characters.

Try this:
var regex = new Regex(@"^[a-zA-Z0-9]+$");
var test = new[] {"ilikestack", "hello world", "hello-world", "###"};
foreach (var s in test)
Console.WriteLine("{0}: {1}", s, regex.IsMatch(s));
EDIT: If you want something like @Andre_Miller said, you should use the negated version of the regex (without the anchors) with Regex.Replace():
Regex.Replace(s, @"[^a-zA-Z0-9]+", "")
OR
var regex = new Regex(@"[^a-zA-Z0-9]+");
regex.Replace("input-string-##$##", "");

Try
^[a-zA-Z0-9]+$
www.regexr.com is a GREAT resource.

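What's wrong with [:alnum:]? It's a POSIX standard. In POSIX syntax it has to sit inside a bracket expression, so your whole regex would be: ^[[:alnum:]]+$. Note, however, that the .NET regex engine does not support POSIX character classes, so in C# you would write ^[a-zA-Z0-9]+$ instead.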
The wikipedia article for regular expressions includes lots of examples and details.

Related

Regex to remove certain repeating characters but ignore others [duplicate]

I'm trying to find a regexp that only matches strings if they don't contain a dot, e.g. it matches stackoverflow, 42abc47 or a-bc-31_4 but doesn't match: .swp, stack.overflow or test..
^[^.]*$
or
^[^.]+$
Depending on whether you want to match empty string. Some applications may implicitly supply the ^ and $, in which case they'd be unnecessary. For example: the HTML5 input element's pattern attribute.
You can find a lot more great information on the regular-expressions.info site.
Use a regex that doesn't have any dots:
^[^.]*$
That is zero or more characters that are not dots in the whole string. Some regex libraries I have used in the past had ways of getting an exact match. In that case you don't need the ^ and $. Having a language in your question would help.
By the way, you don't have to use a regex. In Java you could say:
!someString.contains(".");
Validation requirement: the first character must be a letter, and a dot '.' is not allowed anywhere in the target string.
// The input string we are using
string input = "1A_aaA";
// The regular expression we use to match
Regex r1 = new Regex("^[A-Za-z][^.]*$"); //[\t\0x0020] tab and spaces.
// Match the input and write results
Match match = r1.Match(input);
if (match.Success)
{
Console.WriteLine("Valid: {0}", match.Value);
}
else
{
Console.WriteLine("Not Match");
}

Regex to test for int pattern

Fairly simple I assume. I need to build a regex pattern to match and pull out the pattern int,int,int from a string. I would like it to include negative ints too though (I then want to substitute a computed value into this string);
e.g.
1,2,3
-1,3,5
100,-2,-3
etc
Regex regex = new Regex(@"\d,\d,\d");
However, I don't think this takes negatives into account?
An example string might be:
value={2,3,4},value2=test,value3={-13,0,0},anothervalue=234,nextvalue={0,0,2}
According to the information you have provided:
To include negative numbers, change your regex as below:
Regex regex = new Regex(@"\-?\d,\-?\d,\-?\d");
To allow numbers with more than one digit:
Regex regex = new Regex(@"\-?\d+,\-?\d+,\-?\d+");
Here is another option in addition to Waqar's pattern.
-?\d[0-9]*,-?\d[0-9]*,-?\d[0-9]*
\d only matches a single numeric character
Regex regex = new Regex(@"-?\d+,-?\d+,-?\d+");
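Since the question also wants to pull the numbers out and substitute a computed value, here is a hedged sketch of how that might look with capture groups added to the same pattern. The IntTriples class and its method names are hypothetical, and summing the three numbers is only a stand-in for whatever the real computed value is.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class IntTriples
{
    // -?\d+ is an optional minus sign followed by one or more digits.
    // Note: in .NET \d also matches non-ASCII digits; use [0-9] if that matters.
    private static readonly Regex Triple = new Regex(@"(-?\d+),(-?\d+),(-?\d+)");

    private static int Group(Match m, int index)
    {
        return int.Parse(m.Groups[index].Value, CultureInfo.InvariantCulture);
    }

    // Returns every int,int,int triple found in the input, in order.
    public static List<int[]> Extract(string input)
    {
        var result = new List<int[]>();
        foreach (Match m in Triple.Matches(input))
        {
            result.Add(new[] { Group(m, 1), Group(m, 2), Group(m, 3) });
        }
        return result;
    }

    // Replaces each triple with the sum of its parts (placeholder computation).
    public static string ReplaceWithSum(string input)
    {
        return Triple.Replace(input, m =>
            (Group(m, 1) + Group(m, 2) + Group(m, 3))
                .ToString(CultureInfo.InvariantCulture));
    }
}
```

On the example string from the question, Extract finds the three triples 2,3,4 and -13,0,0 and 0,0,2, and ReplaceWithSum returns value={9},value2=test,value3={-13},anothervalue=234,nextvalue={2}.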

Regular expression with a limited length, spaces and other limitations

I'm looking for the following regex:
The match can be empty.
If it is not empty, it must contain at least 2 characters which are English letters or digits.
The regex must allow spaces between words.
This is what I came up with:
^[a-zA-Z0-9]{2,}$
It works fine, but it does not accept spaces between words.
Here, you can use this regex to make sure we match all kind of spaces (even a hard space), and make sure we allow an empty string match:
(?i)^(?:[a-z0-9]{2}[a-z0-9\p{Zs}]*|)$
C#:
var rg11x = new Regex(@"(?i)^(?:[a-z0-9]{2}[a-z0-9\p{Zs}]*|)$");
var tst = rg11x.IsMatch(""); // true
var tst1 = rg11x.Match("Mc Donalds").Value; // Mc Donalds
You can use ^[a-zA-Z\d]{2}[a-zA-Z\d\s]*?$
Here is also a useful site for learning and testing regex patterns.
http://regex101.com/
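For anyone who wants to check what the first pattern accepts and rejects, a minimal sketch follows. The NameCheck wrapper is hypothetical, and RegexOptions.IgnoreCase is used in place of the inline (?i), which should behave the same here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameCheck
{
    // Either the empty string, or two letters/digits followed by any mix of
    // letters, digits and Unicode space separators (\p{Zs}).
    private static readonly Regex Pattern = new Regex(
        @"^(?:[a-z0-9]{2}[a-z0-9\p{Zs}]*|)$",
        RegexOptions.IgnoreCase);

    public static bool IsValid(string s)
    {
        return Pattern.IsMatch(s);
    }
}
```

NameCheck.IsValid returns true for "", "ab" and "Mc Donalds", and false for "a", " ab" and "hello-world". Note that the pattern requires the first two characters to be letters or digits, so "a b" is rejected as well.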

Regex to match a word beginning with a period and ending with an underscore?

I'm quite the Regex novice, but I have a series of strings similar to this "[$myVar.myVar_STATE]" I need to replace the 2nd myVar that begins with a period and ends with an underscore. I need it to match it exactly, as sometimes I'll have "[$myVar.myVar_moreMyVar_STATE]" and in that case I wouldn't want to replace anything.
I've tried things like "\b.myVar_\b", "\.\bmyVar_\b" and several more, all to no luck.
How about this:
\[\$myVar\.([^_]+)_STATE\]
Matches:
[$myVar.myVar_STATE] // matches and captures 'myVar'
[$myVar.myVar_moreMyVar_STATE] // no match
Working regex example:
http://regex101.com/r/yM9jQ3
Or if _STATE was variable, you could use this: (as long as the text in the STATE part does not have underscores in it.)
\[\$myVar\.([^_]+)_[^_]+\]
Working regex example:
http://regex101.com/r/kW8oE1
Edit: Per the OP's comments below, this should be what he's going for:
(\[\$myVar\.)([^_]+)(_[^_]+\])
Regex replace example:
http://regex101.com/r/pU6yL8
C#
var pattern = @"(\[\$myVar\.)([^_]+)(_[^_]+\])";
var replaced = Regex.Replace(input, pattern, "$1" + newVar + "$3");
What about something like:
.*\.(myVar_).*
This looks for anything then a . and "myVar_" followed by anything.
It matches:
"[$myVar.myVar_STATE]"
And only the first myVar_ here:
"[$myVar.myVar_moremyVar_STATE]"
See it in action.
This should do it:
\[\$myVar\.(.*?)_STATE\]
You can use this little trick to pick out the groups, and build the replacement at the end, like so:
var replacement = "something";
var input = @"[$myVar.myVar_STATE]";
var pattern = @"(\[\$myVar\.)(.*?)_(.*?)]";
// $1 is the prefix, $2 the old variable name, $3 the STATE part
var replaced = Regex.Replace(input, pattern, "$1" + replacement + "_$3]");
C# already has built-in methods to do this
string text = ".asda_";
Response.Write((text.StartsWith(".") && text.EndsWith("_")));
Is Regex really required?
string input = "[$myVar.myVar_STATE]";
string oldVar = "myVar";
string newVar = "myNewVar";
string result = input.Replace("." + oldVar + "_STATE]", "." + newVar + "_STATE]");
In case "STATE" is a variable part, then we'll need to use Regex. The easiest way is to use this Regex pattern which matches a position between a prefix and a suffix. Prefix and suffix are used for searching but are not included in the resulting match:
(?<=prefix)find(?=suffix)
result =
Regex.Replace(input, @"(?<=\.)" + Regex.Escape(oldVar) + "(?=_[A-Z]+])", newVar);
Explanation:
The prefix part is \., which stand for ".".
The find part is the escaped old variable to be replaced. Regex escaping makes sure that characters with a special meaning in Regex are escaped.
The suffix part is _[A-Z]+], an underscore followed by at least one letter followed by "]". Note: the second ] does not need to be escaped. An opening bracket [ would have to be escaped like this: \[. We cannot use \w for word characters for the STATE part as \w includes underscores. You might have to adapt the [A-Z] part to exactly match all possible states (e.g. if the state has digits, use [A-Z0-9]).
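Putting the lookbehind/lookahead pieces together, a self-contained sketch might look like this. The VarReplacer class and ReplaceVar method are hypothetical names. The closing bracket is escaped here and a MatchEvaluator is used so that a $ in newVar is inserted literally; both are precautions rather than requirements.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VarReplacer
{
    // Replaces oldVar only where it directly follows "." and is directly
    // followed by "_", one or more uppercase letters, and "]".
    public static string ReplaceVar(string input, string oldVar, string newVar)
    {
        string pattern = @"(?<=\.)" + Regex.Escape(oldVar) + @"(?=_[A-Z]+\])";
        // The lambda returns newVar as-is, so "$1"-style sequences in it
        // are not interpreted as group references.
        return Regex.Replace(input, pattern, m => newVar);
    }
}
```

VarReplacer.ReplaceVar("[$myVar.myVar_STATE]", "myVar", "myNewVar") returns "[$myVar.myNewVar_STATE]", while "[$myVar.myVar_moreMyVar_STATE]" comes back unchanged because the lookahead fails on the lowercase "m" after the underscore.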

Regular Expression to replace "ABC42" by "Douglas" but leave ABC421 or ABC4244 intact

Well the title says it all ...
I have a file containing different flavors of the same string followed by different integers. Let's say I have ABC42 a couple of times, a few ABC422 and one ABC4244.
I want to replace "ABC42" by "Douglas" and keep the ABC422 and ABC4244 intact in the text.
I'm using the .Net Regular Expression parser.
Thanks in advance
You can use word boundaries (the \b metacharacter) to match the intended text exactly. Your pattern would be: \bABC42\b
string input = " Let's say I have ABC42 a couple of times, a few ABC422 and one ABC4244.";
string pattern = @"\bABC42\b";
string result = Regex.Replace(input, pattern, "Douglas");
EDIT: in response to the comment asking whether this would work for "zzABC42_"...
It won't work in that case since the entire point of using \b is to match a word boundary. Since the pattern surrounds "ABC42" with \b, it matches the whole word. To match "zzABC42_" we can't use word boundaries anymore.
Instead, we need to partially match it and come up with a new criterion. Let's assume the criterion is to partially match "ABC42" as long as no other digit follows "42". I can drop the \b and use a negative look-ahead to prevent extra digits from being matched. This would resemble the following pattern: ABC42(?!\d)
string input = "Hello zzABC42_, ABC422 and ABC4244.";
string pattern = @"ABC42(?!\d)";
string result = Regex.Replace(input, pattern, "Douglas");
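To see the two approaches side by side on the same input, here is a small comparison sketch; the Abc42Demo class and its method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Abc42Demo
{
    // Whole-word match: ABC42 must have a word boundary on both sides.
    public static string WithWordBoundaries(string input)
    {
        return Regex.Replace(input, @"\bABC42\b", "Douglas");
    }

    // Partial match: ABC42 anywhere, as long as no digit follows it.
    public static string WithLookahead(string input)
    {
        return Regex.Replace(input, @"ABC42(?!\d)", "Douglas");
    }
}
```

For the input "ABC42 zzABC42_ ABC422 ABC4244", WithWordBoundaries gives "Douglas zzABC42_ ABC422 ABC4244", whereas WithLookahead gives "Douglas zzDouglas_ ABC422 ABC4244".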
You can use the following code provided that ABC42 is a word on its own (regex below matches based on word boundaries).
String input = "ABC42 a couple of times, a few ABC422 and one ABC4244 ABC42.";
String pattern = @"\bABC42\b";
String output = Regex.Replace(input, pattern, "Douglas");
