How do I make a regex match all or nothing - C#

I want a RegEx to match the string that composes a valid IP, colon, and port. If the string contains a valid IP and invalid port # or vice-versa, I want it to match nothing at all. I'm implementing this in a C# app.
To do this, I'm trying to integrate the following from How to Find or Validate an IP Address
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
with the following from regex for port number
((6553[0-5])|(655[0-2][0-9])|(65[0-4][0-9]{2})|(6[0-4][0-9]{3})|([1-5][0-9]{4})|([0-5]{0,5})|([0-9]{1,4}))
Each of these works independently to match an IP address and port number just fine.
I combined them
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\:((6553[0-5])|(655[0-2][0-9])|(65[0-4][0-9]{2})|(6[0-4][0-9]{3})|([1-5][0-9]{4})|([0-5]{0,5})|([0-9]{1,4}))
and the result is, for example:
256.250.139.193:1234 // bad IP, good port. The RegEx matches "56.250.139.193:1234". Fail. I want it to match nothing
1.1.1.1:65535 // good IP, good port #. The RegEx matches "1.1.1.1:65535". Pass. This is what I want it to do
1.1.1.1:65536 // good IP, bad port, matches "1.1.1.1:". Fail. I want it to match nothing
I can't figure out how to combine them to match all or nothing. I tried using repetition and grouping and it either didn't change what is matched or broke the RegEx entirely

Put word boundaries around the pattern.
Also, you had an error in your pattern for the port number. [0-5]{0,5} should be [0-5]{1,5}, otherwise it matches an empty port number.
\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?):((6553[0-5])|(655[0-2][0-9])|(65[0-4][0-9]{2})|(6[0-4][0-9]{3})|([1-5][0-9]{4})|([0-5]{1,5})|([0-9]{1,4}))\b
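If the input is expected to contain nothing but the address and port, anchoring the pattern with ^ and $ is a stricter alternative to word boundaries. Below is a minimal C# sketch under that assumption. The names EndpointCheck and IsValidEndpoint are illustrative and not part of the original answer; the octet and port sub-patterns are the ones from the pattern above, rewritten with non-capturing groups.

```csharp
using System;
using System.Text.RegularExpressions;

static class EndpointCheck
{
    // Octet (0-255) and port (0-65535) sub-patterns, copied from the answer above.
    const string Octet = @"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";
    const string Port =
        @"(?:6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|6[0-4][0-9]{3}|[1-5][0-9]{4}|[0-5]{1,5}|[0-9]{1,4})";

    // ^ and $ force the whole input to be "octet.octet.octet.octet:port".
    static readonly Regex Pattern = new Regex(
        "^" + Octet + @"(?:\." + Octet + "){3}:" + Port + "$");

    public static bool IsValidEndpoint(string s) => Pattern.IsMatch(s);

    static void Main()
    {
        // The three inputs from the question.
        foreach (var s in new[] { "256.250.139.193:1234", "1.1.1.1:65535", "1.1.1.1:65536" })
            Console.WriteLine($"{s} -> {IsValidEndpoint(s)}");
    }
}
```

With anchors, only the second input from the question is accepted. Word boundaries remain the better choice when the address has to be found inside longer text.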

One reliable and readable way, using the Perl module Regexp::Common, which ships a ready-made IPv4 pattern:
perl -MRegexp::Common -lne '
my ($ip, $port) = /^($RE{net}{IPv4}):(\d+)$/;
print "$ip:$port" if defined $ip and defined $port and $port < 65536
' file

Related

Regular Expression to Match IP Subnet

I need a C# regular expression that will match an IP Subnet, like "127.65.231", but not match an IP Address on the subnet, like "127.65.231.111". I had found this Regex for an IP Address:
@"\b\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}\b"
and was thinking I could just delete the part that checks the last octet, like this:
@"\b\d{1,3}.\d{1,3}.\d{1,3}\b"
but this matches both the IP Address and the Subnet. Can anyone help with this?
You might try using a lookahead. Also, please escape the . characters—otherwise it would match any character:
@"\b\d{1,3}\.\d{1,3}\.\d{1,3}(?=\.\d{1,3})\b"
This will match any string like 127.65.231 as long as it's followed by a string like .111.
@"^\d{1,3}\.\d{1,3}\.\d{1,3}$"
Use line anchors. Add ^ at the beginning of your regex and $ at the end, so the pattern has to cover the input from start to end.
This will match 127.65.231 but not 127.65.231.111
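A short C# sketch of the anchored pattern, assuming each input string holds a single candidate value and nothing else. SubnetCheck and IsSubnet are illustrative names, not from the original answers.

```csharp
using System;
using System.Text.RegularExpressions;

static class SubnetCheck
{
    // Exactly three dot-separated groups of 1-3 digits, with nothing before or after.
    static readonly Regex Subnet = new Regex(@"^\d{1,3}\.\d{1,3}\.\d{1,3}$");

    public static bool IsSubnet(string s) => Subnet.IsMatch(s);

    static void Main()
    {
        Console.WriteLine(IsSubnet("127.65.231"));      // True
        Console.WriteLine(IsSubnet("127.65.231.111"));  // False
    }
}
```

The anchors only help when the value is the entire input; for finding subnets inside longer text, a lookaround-based pattern would be needed instead.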

simple regex for a series of numbers and dots. N{3}.N{3}.N{3}.N{3}

I have an ASP.NET 4.0 MVC app in C# and I need to create a regex that will match N{3}.N{3}.N{3}.N{3} where N{3} is any 1, 2, or 3 digits (0-9), e.g.
1.1.1.1
111.111.111.111
1.111.111.1
I have tried
@"^[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}$"
but this matches things I don't want it to like
111.1.1
1111.1.1
What am I doing wrong?
A . in a regular expression means "any character." Therefore if you want to match a literal . you need to escape it, as shown below:
@"^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$"
If you're trying to match an IP address, there are some great regular expressions here:
Regular expression to match DNS hostname or IP Address?
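To see why the unescaped dot lets the unwanted inputs through, the two patterns can be compared side by side. This is a small sketch; DotEscapeDemo and its field names are made up for illustration, and the inputs are the ones from the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class DotEscapeDemo
{
    // "." matches any character, so a digit can stand in for a separator.
    public static readonly Regex Unescaped =
        new Regex(@"^[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}$");

    // "\." matches only a literal period.
    public static readonly Regex Escaped =
        new Regex(@"^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$");

    static void Main()
    {
        foreach (var s in new[] { "1.1.1.1", "111.1.1", "1111.1.1" })
            Console.WriteLine($"{s}: unescaped={Unescaped.IsMatch(s)}, escaped={Escaped.IsMatch(s)}");
    }
}
```

In "111.1.1", for example, the unescaped pattern reads the second "1" as the first separator, which is how a three-part string passes as four parts.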

check to see if a string contains an ip address

I have a file loaded into a StreamReader. The file contains IP addresses scattered about. There is a "\" before and after each IP address, if this helps. Also, the first "\" on each line ALWAYS comes before the IP address; there are no other "\" before this first one.
I already know that I should use a while loop to cycle through each line, but I don't know the rest of the procedure :<
For example:
Powerd by Stormix.de\93.190.64.150\7777\False
Cupserver\85.236.100.100\8178\False
Euro Server\217.163.26.20\7778\False
in the first example i would need "93.190.64.150"
in the second example i would need "85.236.100.100"
in the third example i would need "217.163.26.20"
I really struggle with parsing/splicing/dicing :s
thanks in advance
*** I require to keep the IP in a string a bool return is not sufficient for what i want to do.
using System.Text.RegularExpressions;
…
var sourceString = "put your string here";
var match = Regex.Match(sourceString, @"\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b");
if(match.Success) Console.WriteLine(match.Captures[0]);
This will match any IP address, but also 999.999.999.999. If you need more exactness, see details here: http://www.regular-expressions.info/examples.html
The site has lots of great info on regular expressions, which is a domain-specific language used within most popular programming languages for text pattern matching. Actually, I think the site was put together by the author of Mastering Regular Expressions.
update
I modified the code above to capture the IP address, as you requested (by adding parentheses around the IP address pattern). Now we check to make sure there was a match using the Success property, and then you can get the IP address using Captures[0]. Note that Captures[0] on a Match is the text of the whole match, which here equals the parenthesized group because \b consumes no characters; the group itself is available as match.Groups[1].Value.
EDIT: Edited to take account of the "slash at beginning and end" part.
Try matching each line against the following regex (a single pattern, split across two lines for readability):
\\(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\
Full sample:
using System;
using System.Text.RegularExpressions;

class Program
{
    private static readonly Regex Pattern = new Regex
        (@"\\(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}" +
         @"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\");

    static void Main(string[] args)
    {
        Console.WriteLine(ContainsAddress("Bad IP \\400.100.100.100\\ xyz"));
        Console.WriteLine(ContainsAddress("Good IP \\200.255.123.100\\ xyz"));
        Console.WriteLine(ContainsAddress("No IP \\but slashes\\ xyz"));
        Console.WriteLine(ContainsAddress("Long IP \\123.100.100.100.100\\ x"));
    }

    static bool ContainsAddress(string line)
    {
        return Pattern.IsMatch(line);
    }
}
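Since the question asks for the address as a string rather than a bool, the same idea can be adapted with a capturing group. The sketch below is one possible variation, not the answerer's code: AddressExtractor and ExtractAddress are hypothetical names, and the pattern is an equivalent rearrangement of the one above with parentheses around the address.

```csharp
using System;
using System.Text.RegularExpressions;

static class AddressExtractor
{
    const string Octet = @"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";

    // Backslash, four octets captured as group 1, backslash.
    static readonly Regex Pattern = new Regex(
        @"\\(" + Octet + @"(?:\." + Octet + @"){3})\\");

    // Returns the address found between backslashes, or null when there is none.
    public static string ExtractAddress(string line)
    {
        Match m = Pattern.Match(line);
        return m.Success ? m.Groups[1].Value : null;
    }

    static void Main()
    {
        Console.WriteLine(ExtractAddress(@"Cupserver\85.236.100.100\8178\False"));
        Console.WriteLine(ExtractAddress(@"Bad IP \400.100.100.100\ xyz") ?? "(none)");
    }
}
```

Returning null for "no address" is a design choice made for brevity; a TryExtract-style method with an out parameter would work equally well.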
Looks like, for each line, you're looking for "^.*?\\(?<address>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\\.*$"
To break this down:
^ - matches the beginning of a line, helping ensure you'll start matching at the right point.
.*? - matches any character zero or more times, but as few times as possible.
\\ - matches the backslash character. Coupled with the two prior terms, this will get us to the first backslash of a line so we can capture the next term.
(?<address>...) - specifies a named group of characters that can be referred to from within matches. The text of the full match will be the entire line the way this is written, but this named group will be only what you're looking for out of the match.
[0-9]{1,3} - matches a sequence of between 1 and 3 digit characters. The [0-9] is equivalent to \d but I find that when a regex has fewer backslashes and more characters you'd normally see in the string, it's more understandable.
\. - matches a period.
.* - matches any character zero or more times. Used to skip to the end.
$ - matches the end of a line.
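The breakdown above can be put together with the while loop mentioned in the question. This is a minimal sketch, assuming the input arrives through any TextReader (a StreamReader over the file would work the same way); LineScanner and ReadAddresses are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

static class LineScanner
{
    // The per-line pattern from the answer, written as a verbatim string.
    static readonly Regex LinePattern = new Regex(
        @"^.*?\\(?<address>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\\.*$");

    // Reads line by line and collects the text of the named group "address".
    public static List<string> ReadAddresses(TextReader reader)
    {
        var found = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            Match m = LinePattern.Match(line);
            if (m.Success) found.Add(m.Groups["address"].Value);
        }
        return found;
    }

    static void Main()
    {
        var text = "Cupserver\\85.236.100.100\\8178\\False\n" +
                   "Euro Server\\217.163.26.20\\7778\\False";
        foreach (var ip in ReadAddresses(new StringReader(text)))
            Console.WriteLine(ip);
    }
}
```

Like the pattern it uses, this accepts out-of-range values such as 999.999.999.999; it only checks the shape of the address.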

Regular Expression to match IP address + wildcard

I'm trying to use a RegularExpressionValidator to match an IP address (with possible wildcards) for an IP filtering system.
I'm using the following Regex:
"([0-9]{1,3}\\.|\\*\\.){3}([0-9]{1,3}|\\*){1}"
Which works fine when running it in LINQPad with Regex.Matches, but doesn't seem to work when I'm using the validator.
Does anyone have a suggestion as to either a better Regex or why it would work in test but not in situ?
Cheers, Ed
This: \\.|\\*\\. looks like the dodgy bit. Do this instead:
@"^(([0-9]{1,3}|\*)\.){3}([0-9]{1,3}|\*)$"
And to only accept 0-255 (thanks, apoorv020):
^((([0-9]{1,2})|(1[0-9]{2,2})|(2[0-4][0-9])|(25[0-5])|\*)\.){3}(([0-9]{1,2})|(1[0-9]{2,2})|(2[0-4][0-9])|(25[0-5])|\*)$
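A quick way to check the range-limited wildcard pattern outside the validator is a small console test. The sketch below is an assumption-laden illustration: WildcardIp and IsValid are invented names, and the pattern is the one above with the repeated part factored into a constant and the groups made non-capturing.

```csharp
using System;
using System.Text.RegularExpressions;

static class WildcardIp
{
    // One position: a number 0-255 or a single "*".
    const string Part = @"(?:[0-9]{1,2}|1[0-9]{2}|2[0-4][0-9]|25[0-5]|\*)";

    // Four positions separated by literal dots, anchored at both ends.
    static readonly Regex Pattern = new Regex("^(?:" + Part + @"\.){3}" + Part + "$");

    public static bool IsValid(string s) => Pattern.IsMatch(s);

    static void Main()
    {
        foreach (var s in new[] { "192.168.*.*", "10.0.0.1", "256.1.1.1", "1.2.3" })
            Console.WriteLine($"{s} -> {IsValid(s)}");
    }
}
```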
asp:RegularExpressionValidator does not require you to double-escape backslashes. You should try:
([0-9]{1,3}\.|\*\.){3}([0-9]{1,3}|\*){1}
[0-9]{1,3} would allow IP addresses of the form 999.999.999.999 . Your IP address range should allow only 0-255.
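Replace all occurrences of [0-9]{1,3} with the following (the outer parentheses keep the alternation from spilling into the rest of the pattern):
(([0-9]{1,2})|(1[0-9]{2,2})|(2[0-4][0-9])|(25[0-5]))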
This does seem very complicated to me, and probably there are better ways of doing this, but it seems correct at first glance.
How about putting start- and end-of-string anchors on the expression:
^([0-9]{1,3}\\.|\\*\\.){3}([0-9]{1,3}|\\*){1}$
My answer is general for .NET, not RegularExpressionValidator-specific.
Regex string for IP matching (use ExplicitCapture to avoid useless capturing and keep RE concise):
"\\b0*(2(5[0-5]|[0-4]\\d)|1?\\d{1,2})(\\.0*(2(5[0-5]|[0-4]\\d)|1?\\d{1,2})){3}\\b"
Depending on particular use case you may want to add appropriate anchors, i.e. \A or ^ at the beginning and \Z or $ at the end. Then you can remove word-boundaries requirement: \b.
(Remember to double each \ when the pattern is written as a regular, non-verbatim string literal.)
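For reference, this is roughly how that pattern could be used with RegexOptions.ExplicitCapture to find an address inside longer text. It is a sketch only: IpFinder and FindFirst are hypothetical names, and the pattern is written as a verbatim string so the backslashes are not doubled.

```csharp
using System;
using System.Text.RegularExpressions;

static class IpFinder
{
    // ExplicitCapture makes the unnamed parentheses group without capturing.
    static readonly Regex Pattern = new Regex(
        @"\b0*(2(5[0-5]|[0-4]\d)|1?\d{1,2})(\.0*(2(5[0-5]|[0-4]\d)|1?\d{1,2})){3}\b",
        RegexOptions.ExplicitCapture);

    // Returns the first address-like substring, or null when none is found.
    public static string FindFirst(string text)
    {
        Match m = Pattern.Match(text);
        return m.Success ? m.Value : null;
    }

    static void Main()
    {
        Console.WriteLine(FindFirst("addr 192.168.0.1 ok"));
        Console.WriteLine(FindFirst("addr 999.1.1.1 ok") ?? "(none)");
    }
}
```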

Email entry regex validation

I am using the following regex to validate an email address:
"^[-a-zA-Z0-9][-.a-zA-Z0-9]*@[-.a-zA-Z0-9]+(\.[-.a-zA-Z0-9]+)*\.(com|edu|info|gov|int|mil|net|org|biz|name|museum|coop|aero|pro|[a-zA-Z]{2})$"
Unfortunately, this does not allow email addresses with hyphens (edit: underscores). Ex.:
first_last@abc.com
How can I modify this to allow underscores?
_ is not hyphen, it is underscore. Hyphen is -
If it is okay to start an email address with an underscore, add _ to both of the character classes that appear before @
^[-a-zA-Z0-9_][-.a-zA-Z0-9_]*@...
If the email id cannot start with an _, add it only to the second character class:
^[-a-zA-Z0-9][-.a-zA-Z0-9_]*@...
That said, your regex has a couple of issues:
It accepts email addresses starting with a hyphen; is this intended? If not, remove the - from the first character class to make it [a-zA-Z0-9]
It accepts consecutive periods after the first character, thereby making 3...@example.com a valid id. Is this by design?
RFC specification for email address is quite complicated. See these threads for more information. Also don't forget to check the one and only perfect and the official regex for validating email addresses (be warned that you might find it a little longer than what sanity would suggest)
"^[-_a-zA-Z0-9][-_.a-zA-Z0-9]*@[-_.a-zA-Z0-9]+(\.[-_.a-zA-Z0-9]+)*\.(com|edu|info|gov|int|mil|net|org|biz|name|museum|coop|aero|pro|[a-zA-Z]{2})$"
Possibly?
^[-a-zA-Z0-9_][-.a-zA-Z0-9_]*@[-.a-zA-Z0-9]+(\.[-.a-zA-Z0-9]+)*\.(com|edu|info|gov|int|mil|net|org|biz|name|museum|coop|aero|pro|[a-zA-Z]{2})$
I added "_" to your two character classes.
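The effect of the change can be checked by running the original and modified patterns against the example address. This is a small sketch; UnderscoreEmailDemo and its field names are illustrative, and the shared part of the two patterns is factored into a constant purely to keep the listing short.

```csharp
using System;
using System.Text.RegularExpressions;

static class UnderscoreEmailDemo
{
    // Everything from "@" onward is identical in both patterns.
    const string Tail =
        @"@[-.a-zA-Z0-9]+(\.[-.a-zA-Z0-9]+)*\.(com|edu|info|gov|int|mil|net|org|biz|name|museum|coop|aero|pro|[a-zA-Z]{2})$";

    // The pattern from the question: no underscore in the local part.
    public static readonly Regex Original =
        new Regex(@"^[-a-zA-Z0-9][-.a-zA-Z0-9]*" + Tail);

    // The modified pattern: "_" added to both leading character classes.
    public static readonly Regex WithUnderscore =
        new Regex(@"^[-a-zA-Z0-9_][-.a-zA-Z0-9_]*" + Tail);

    static void Main()
    {
        const string sample = "first_last@abc.com";
        Console.WriteLine($"original: {Original.IsMatch(sample)}");
        Console.WriteLine($"modified: {WithUnderscore.IsMatch(sample)}");
    }
}
```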
Regular-expressions.info has a very good discussion of e-mail address validation by regex, including his preferred regex for "99% of all e-mail addresses in use today", and another to match e-mail addresses as defined by RFC-2822.
I won't do the author a disservice by copying his work here. But I do think it's worthy of a read since it's directly related to your question.
There is also an interesting blog post about email validation on Larry Osterman's website.
This is a followup post to the original post in which he attempts to generate a regular expression to validate an email address. His RegExp is:
string strRegex = @"^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}" +
    @"\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\" +
    @".)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$";
His notes:
The key thing to note in this grammar is that the local-part is almost free-form when it comes to the local part. And there are characters allowed in the local part like !, *, $, etc that are totally legal according to RFC2822 that aren't allowed.
and ...
Adi Oltean pointed out that V2 of the .Net framework contains the System.Net.MailAddress class which contains a built-in validator.
It looks like the System.Net.Mail.MailAddress constructor validates email addresses and you can catch a FormatException to ensure that the email is valid.
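A minimal sketch of that approach, assuming the goal is a simple yes/no check. EmailCheck and IsValidEmail are illustrative names; the only library behaviour relied on is that the MailAddress(string) constructor throws FormatException for input it cannot parse.

```csharp
using System;
using System.Net.Mail;

static class EmailCheck
{
    public static bool IsValidEmail(string candidate)
    {
        // The constructor rejects null and empty input with other exception
        // types, so filter those cases out up front.
        if (string.IsNullOrWhiteSpace(candidate)) return false;

        try
        {
            // Construction succeeds only if the string parses as an address.
            new MailAddress(candidate);
            return true;
        }
        catch (FormatException)
        {
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(IsValidEmail("first_last@abc.com"));
        Console.WriteLine(IsValidEmail("not-an-email"));
    }
}
```

Note that MailAddress parses more than bare addresses (for example, forms that include a display name), so its notion of "valid" may be broader than the regexes above.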
