Regex to test for int pattern - c#

Fairly simple, I assume. I need to build a regex pattern to match and pull out the pattern int,int,int from a string. It should include negative ints too (I then want to substitute a computed value into the string).
For example:
1,2,3
-1,3,5
100,-2,-3
etc
Regex regex = new Regex(@"\d,\d,\d");
However, I don't think this takes negatives into account?
An example string might be:
value={2,3,4},value2=test,value3={-13,0,0},anothervalue=234,nextvalue={0,0,2}

Based on the information you have provided, change your regex as below to include negative numbers:
Regex regex = new Regex(@"\-?\d,\-?\d,\-?\d");
To allow numbers with more than one digit:
Regex regex = new Regex(@"\-?\d+,\-?\d+,\-?\d+");

Here is another option in addition to Waqar's pattern.
-?\d[0-9]*,-?\d[0-9]*,-?\d[0-9]*

\d matches only a single digit; add + to match one or more:
Regex regex = new Regex(@"-?\d+,-?\d+,-?\d+");
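To tie the answers above back to the question's goal of substituting a computed value, here is a minimal sketch. The named groups, the TripleDemo class and the choice of "sum of the three numbers" as the computed value are assumptions made for illustration; they are not part of the original question.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class TripleDemo
{
    // -?\d+ is an optional minus sign followed by one or more digits.
    // The named groups a, b, c are an addition for this sketch.
    static readonly Regex Triple =
        new Regex(@"(?<a>-?\d+),(?<b>-?\d+),(?<c>-?\d+)");

    // Replaces every int,int,int triple with the sum of its numbers
    // (a hypothetical stand-in for the real computed value).
    public static string ReplaceWithSum(string input)
    {
        return Triple.Replace(input, m =>
        {
            int sum = int.Parse(m.Groups["a"].Value, CultureInfo.InvariantCulture)
                    + int.Parse(m.Groups["b"].Value, CultureInfo.InvariantCulture)
                    + int.Parse(m.Groups["c"].Value, CultureInfo.InvariantCulture);
            return sum.ToString(CultureInfo.InvariantCulture);
        });
    }

    static void Main()
    {
        string input = "value={2,3,4},value2=test,value3={-13,0,0},anothervalue=234,nextvalue={0,0,2}";

        // List the triples the pattern finds in the example string.
        foreach (Match m in Triple.Matches(input))
            Console.WriteLine(m.Value);

        Console.WriteLine(ReplaceWithSum(input));
    }
}
```

Note that "234" and the digits in "value2" are left alone, because they are not followed by two further comma-separated integers.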

Related

Regular Expression to replace "ABC42" by "Douglas" but leave ABC421 or ABC4244 intact

Well the title says it all ...
I have a file containing different flavors of the same string followed by different integers. Let's say I have ABC42 a couple of times, a few ABC422 and one ABC4244.
I want to replace "ABC42" by "Douglas" and keep the ABC422 and ABC4244 intact in the text.
I'm using the .Net Regular Expression parser.
Thanks in advance
You can use word boundaries (the \b metacharacter) to match the intended text exactly. Your pattern would be: \bABC42\b
string input = " Let's say I have ABC42 a couple of times, a few ABC422 and one ABC4244.";
string pattern = @"\bABC42\b";
string result = Regex.Replace(input, pattern, "Douglas");
EDIT: in response to the comment asking whether this would work for "zzABC42_"...
It won't work in that case since the entire point of using \b is to match a word boundary. Since the pattern surrounds "ABC42" with \b, it matches the whole word. To match "zzABC42_" we can't use word boundaries anymore.
Instead, we need to match it partially and come up with a new criterion. Let's assume the criterion is to match "ABC42" as long as no other digits follow "42". I can drop the \b and use a negative look-ahead to prevent extra digits from being matched. This would resemble the following pattern: ABC42(?!\d)
string input = "Hello zzABC42_, ABC422 and ABC4244.";
string pattern = @"ABC42(?!\d)";
string result = Regex.Replace(input, pattern, "Douglas");
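For comparison, here is a small sketch that runs both patterns from this answer over the same input. The BoundaryDemo class and its method names are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

static class BoundaryDemo
{
    // Matches ABC42 only when it stands alone as a whole word.
    public static string WithWordBoundaries(string input)
    {
        return Regex.Replace(input, @"\bABC42\b", "Douglas");
    }

    // Matches ABC42 anywhere, as long as no further digit follows it.
    public static string WithLookahead(string input)
    {
        return Regex.Replace(input, @"ABC42(?!\d)", "Douglas");
    }

    static void Main()
    {
        string input = "ABC42, zzABC42_, ABC422 and ABC4244.";
        Console.WriteLine(WithWordBoundaries(input)); // Douglas, zzABC42_, ABC422 and ABC4244.
        Console.WriteLine(WithLookahead(input));      // Douglas, zzDouglas_, ABC422 and ABC4244.
    }
}
```

The word-boundary version leaves "zzABC42_" alone because letters, digits and underscores are all word characters, so there is no boundary on either side of "ABC42" there.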
You can use the following code provided that ABC42 is a word on its own (regex below matches based on word boundaries).
String input = "ABC42 a couple of times, a few ABC422 and one ABC4244 ABC42.";
String pattern = @"\bABC42\b";
String output = Regex.Replace(input, pattern, "Douglas");

Regex to exclude all chars except letters

I'm a real regex n00b so I ask your help:
I need a regex which matches only letters and numbers and excludes punctuation, non-ASCII characters and spaces.
"ilikestackoverflow2012" would be a valid string.
"f### you °§è" not valid.
"hello world" not valid
"hello-world" and "*hello_world*" not valid
and so on.
I need it to make a possibly complex business name url friendly.
Thanks in advance!
You don't need regex for this.
string s = "......";
var isValid = s.All(Char.IsLetterOrDigit); // requires using System.Linq;
-
I need it to make a possibly complex business name url friendly
You can also use HttpUtility.UrlEncode
var urlFriendlyString = HttpUtility.UrlEncode(yourString);
To validate a string you can use the following regular expression with Regex.IsMatch:
"^[0-9A-Za-z]+$"
Explanation:
^ is a start of string anchor.
[...] is a character class.
+ means one or more.
$ is an end of string anchor.
I need it to make a possibly complex business name url friendly
Then you want to replace the characters that don't match. Use Regex.Replace with the following regular expression:
"[^0-9A-Za-z]+"
Explanation:
[^...] is a negated character class.
Code:
string result = Regex.Replace(input, "[^0-9A-Za-z]+" , "");
See it working online: ideone
Note that different business names could give the same resulting string. For example, businesses whose names contain only Chinese characters will all give the empty string.
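As a rough sketch of the two approaches above (validating versus stripping), with the test strings taken from the question: the SlugDemo class and its method names are hypothetical. The last method is included only to show that Char.IsLetterOrDigit also accepts non-ASCII letters, which matters if those must be rejected.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class SlugDemo
{
    // True only if the whole string consists of ASCII letters and digits.
    public static bool IsAsciiAlnum(string s)
    {
        return Regex.IsMatch(s, "^[0-9A-Za-z]+$");
    }

    // Removes every run of characters that are not ASCII letters or digits.
    public static string Strip(string s)
    {
        return Regex.Replace(s, "[^0-9A-Za-z]+", "");
    }

    // Unicode-aware check: accepts letters such as U+00E8 as well.
    public static bool IsUnicodeAlnum(string s)
    {
        return s.Length > 0 && s.All(char.IsLetterOrDigit);
    }

    static void Main()
    {
        foreach (string s in new[] { "ilikestackoverflow2012", "hello world", "hello-world", "*hello_world*" })
            Console.WriteLine("{0}: valid={1}, stripped={2}", s, IsAsciiAlnum(s), Strip(s));

        // "\u00e8" is a single non-ASCII letter.
        Console.WriteLine(IsAsciiAlnum("\u00e8"));   // False
        Console.WriteLine(IsUnicodeAlnum("\u00e8")); // True
    }
}
```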
You can use either of the regexes below.
^[a-zA-Z0-9]+$
^[0-9a-zA-Z]+$
Matches one or more alphanumeric characters with no spaces or non-alpha characters.
Try this:
var regex = new Regex(@"^[a-zA-Z0-9]+$");
var test = new[] {"ilikestack", "hello world", "hello-world", "###"};
foreach (var s in test)
Console.WriteLine("{0}: {1}", s, regex.IsMatch(s));
EDIT: If you want something like what @Andre_Miller said, use the negated form of the same character class with Regex.Replace():
Regex.Replace(s, @"[^a-zA-Z0-9]+", "")
OR
var regex = new Regex(@"[^a-zA-Z0-9]+");
regex.Replace("input-string-##$##", "");
Try
^[a-zA-Z0-9]+$
www.regexr.com is a GREAT resource.
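What's wrong with [:alnum:]? It's a POSIX standard. In an engine that supports POSIX classes, the whole regex would be ^[[:alnum:]]+$ (the class name goes inside a bracket expression). Note that the .NET regex engine does not support POSIX classes, so in C# use ^[a-zA-Z0-9]+$ instead.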
The wikipedia article for regular expressions includes lots of examples and details.

Need some C# Regular Expression Help

I'm trying to come up with a regular expression that will stop at the first occurrence of </ol>. My current regex sort of works, but only if </ol> has spaces on either side. For instance, instead of stopping at the first instance in the line below, it stops at the second:
some random text and HTML</ol></b> bla </ol>
Here's the pattern I'm currently using: string pattern = @"some random text(.|\r|\n)*</ol>";
What am I doing wrong?
string pattern = @"some random text(.|\r|\n)*?</ol>";
Note the question mark after the star -- that tells it to be non greedy, which basically means that it will capture as little as possible, rather than the greedy as much as possible.
Make your wild-card "ungreedy" by adding a ?. e.g.
some random text(.|\r|\n)*?</ol>
                          ^- Addition
This will make regex match as few characters as possible, instead of matching as many (standard behavior).
Oh, and regex shouldn't parse [X]HTML
While not a Regex, why not simply use the Substring functions, like:
string returnString = someRandomText.Substring(0, someRandomText.IndexOf("</ol>"));
That would seem to be a lot easier than coming up with a Regex to cover all the possible varieties of characters, spaces, etc.
This regex matches everything from the beginning of the string up to the first </ol>. It uses Friedl's "unrolling-the-loop" technique, so is quite efficient:
Regex pattern = new Regex(
@"^[^<]*(?:(?!</ol\b)<[^<]*)*(?=</ol\b)",
RegexOptions.IgnoreCase);
resultString = pattern.Match(text).Value;
Others have already explained the missing ? that makes the quantifier non-greedy. I want to suggest another change as well.
I don't like your (.|\r|\n) part. If an alternation contains only single characters, a character class is usually simpler and easier to read (and possibly more efficient, though I haven't checked). Be careful here, though: inside a character class the dot is a literal, so [.\r\n] would match only a dot, \r or \n. A class such as [\s\S] is the usual way to match any character including newlines.
But in your special case, where the only alternatives to the . are newline characters, a character class is not the best approach either. Do this instead:
Regex A = new Regex(@"some random text.*?</ol>", RegexOptions.Singleline);
The Singleline option makes the . match newline characters as well.
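A short sketch contrasting the greedy and non-greedy quantifiers on the question's own input, using the Singleline option suggested above. The LazyDemo class and its method names are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

static class LazyDemo
{
    // Lazy quantifier: stops at the first </ol>.
    public static string UpToFirst(string html)
    {
        return Regex.Match(html, @"some random text.*?</ol>", RegexOptions.Singleline).Value;
    }

    // Greedy quantifier: runs on to the last </ol>.
    public static string UpToLast(string html)
    {
        return Regex.Match(html, @"some random text.*</ol>", RegexOptions.Singleline).Value;
    }

    static void Main()
    {
        string html = "some random text and HTML</ol></b> bla </ol>";
        Console.WriteLine(UpToFirst(html)); // some random text and HTML</ol>
        Console.WriteLine(UpToLast(html));  // some random text and HTML</ol></b> bla </ol>
    }
}
```

Match.Value is an empty string when nothing matches, so callers that need to tell "no match" apart from an empty match should check Match.Success instead.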

How can I group multiple e-mail addresses and user names using a regular expression

I have the following text that I am trying to parse:
"user1@emailaddy1.com" <user1@emailaddy1.com>, "Jane Doe" <jane.doe@addyB.org>,
"joe@company.net" <joe@company.net>
I am using the following code to try and split up the string:
Dim groups As GroupCollection
Dim matches As MatchCollection
Dim regexp1 As New Regex("""(.*)"" <(.*)>")
matches = regexp1.Matches(toNode.InnerText)
For Each match As Match In matches
groups = match.Groups
message.CompanyName = groups(1).Value
message.CompanyEmail = groups(2).Value
Next
But this regular expression is greedy and is grabbing the entire string up to the last quote after "joe@company.net". I'm having a hard time putting together an expression that will group this string into the two groups I'm looking for: Name (in the quotes) and E-Mail (in the angle brackets). Does anybody have any advice or suggestions for altering the regexp to get what I need?
Rather than rolling your own regular expression, I would do this:
// Requires: using System.Net.Mail;
string[] addresses = toNode.InnerText.Split(',');
foreach (string textAddress in addresses)
{
    // MailAddress parses the "Display Name" <address> format.
    MailAddress address = new MailAddress(textAddress.Trim());
    message.CompanyName = address.DisplayName;
    message.CompanyEmail = address.Address;
}
While your regular expression may work for the few test cases that you have shown, using the MailAddress class will probably be much more reliable in the long run.
How about """([^""]*)"" <([^>]*)>" for the regex? That is, make it explicit that the matched parts won't include a quote or a closing angle bracket. You may also want to use a more restrictive character range instead.
Not sure what regexp engine ASP.net is running but try the non-greedy variant by adding a ? in the regex.
Example regex
""(.*?)"" <(.*?)>
You need to specify that you want the minimal matched expression.
You can also replace the (.*) pattern with more precise ones. For example, you could exclude the comma and the space.
It is usually better to avoid .* in a regular expression, because the backtracking it causes can hurt performance.
For the e-mail address, for example, you can use a pattern like [\w-]+@([\w-]+\.)+[\w-]+ or a more complex one.
You can find some good patterns at http://regexlib.com/
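Here is a minimal C# sketch of the negated-character-class suggestion from the answers above, applied to the question's sample text. The AddressDemo class, the Parse method and the use of KeyValuePair for the name/address pairs are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class AddressDemo
{
    // "([^"]*)" captures the quoted name without crossing another quote;
    // <([^>]*)> captures the address without crossing a closing bracket.
    static readonly Regex Pair = new Regex(@"""([^""]*)"" <([^>]*)>");

    // Returns (name, address) pairs in the order they appear in the text.
    public static List<KeyValuePair<string, string>> Parse(string text)
    {
        var result = new List<KeyValuePair<string, string>>();
        foreach (Match m in Pair.Matches(text))
            result.Add(new KeyValuePair<string, string>(m.Groups[1].Value, m.Groups[2].Value));
        return result;
    }

    static void Main()
    {
        string text = "\"user1@emailaddy1.com\" <user1@emailaddy1.com>, "
                    + "\"Jane Doe\" <jane.doe@addyB.org>,\n"
                    + "\"joe@company.net\" <joe@company.net>";

        foreach (var pair in Parse(text))
            Console.WriteLine(pair.Key + " -> " + pair.Value);
    }
}
```

Because neither group can run past its own delimiter, each match covers exactly one name/address pair, and no lazy quantifier is needed.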

C# Regular Expression to match letters, numbers and underscore

I am trying to create a regular expression pattern in C#. The pattern can only allow for:
letters
numbers
underscores
So far I am having little luck (I'm not good at regex). Here is what I have tried thus far:
// Create the regular expression
string pattern = @"\w+_";
Regex regex = new Regex(pattern);
// Compare a string against the regular expression
return regex.IsMatch(stringToTest);
EDIT:
@"^[a-zA-Z0-9\_]+$"
or
@"^\w+$"
@"^\w+$"
\w matches any "word character", defined as digits, letters, and underscores. It's Unicode-aware so it'll match letters with umlauts and such (better than trying to roll your own character class like [A-Za-z0-9_] which would only match English letters).
The ^ at the beginning means "match the beginning of the string here", and the $ at the end means "match the end of the string here". Without those, e.g. if you just had @"\w+", then "##Foo##" would match, because it contains one or more word characters. With the ^ and $, then "##Foo##" would not match (which sounds like what you're looking for), because you don't have beginning-of-string followed by one-or-more-word-characters followed by end-of-string.
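The effect of the anchors, and of \w being Unicode-aware, can be checked with a small sketch like this one. The WordDemo class and its method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class WordDemo
{
    // Unanchored: true if the string contains word characters anywhere.
    public static bool ContainsWordChars(string s)
    {
        return Regex.IsMatch(s, @"\w+");
    }

    // Anchored: true only if the whole string is word characters.
    public static bool IsOnlyWordChars(string s)
    {
        return Regex.IsMatch(s, @"^\w+$");
    }

    // ASCII-only variant, for cases where letters with umlauts should be rejected.
    public static bool IsOnlyAsciiWordChars(string s)
    {
        return Regex.IsMatch(s, @"^[A-Za-z0-9_]+$");
    }

    static void Main()
    {
        Console.WriteLine(ContainsWordChars("##Foo##"));              // True
        Console.WriteLine(IsOnlyWordChars("##Foo##"));                // False
        Console.WriteLine(IsOnlyWordChars("Foo_1"));                  // True
        Console.WriteLine(IsOnlyWordChars("Gr\u00fc\u00dfe"));        // True
        Console.WriteLine(IsOnlyAsciiWordChars("Gr\u00fc\u00dfe"));   // False
    }
}
```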
Try experimenting with something like http://www.weitz.de/regex-coach/ which lets you develop regex interactively.
It's designed for Perl, but helped me understand how a regex works in practice.
Regex packedasciiRegex = new Regex(@"^[!#$%&'()*+,-./:;?#[\]^_]*$");
