Regex matching numbers without letters in front of it - c#

I want to match numbers like "100", "1.1", "5.404", IF they do not include a letter in front like this: "V102".
Here is my current regular expression:
(?<![A-Za-z])[0-9.]+
This is supposed to match one or more characters from 0-9 and ".", provided they are not preceded by a letter (A-Za-z).
But what it does is match "02" in V102: it just chips away the V and one more digit, and then the rest fits, while it shouldn't match that case at all. How can I make it grab the whole number first and then check that the prefix is absent?

Add digits and decimal point to your negative lookbehind:
(?<![A-Za-z0-9.])[0-9.]+
This forces every match to be preceded by something that is not a letter, digit, or dot (i.e., a space or other separator, or the start of the string). That way the tail end of a number will not be a valid match either.
Demo: http://www.rubular.com/r/EDuI2D9jnW
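To see the difference between the two lookbehinds in C#, here is a minimal sketch. The sample input string and the variable names are assumptions made for illustration; they are not part of the original question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Sample input is an assumption for illustration.
string input = "100 1.1 5.404 V102";

// Original pattern: only letters are rejected before the match,
// so the engine can start in the middle of "102" and return "02".
string[] before = Regex.Matches(input, @"(?<![A-Za-z])[0-9.]+")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

// Fixed pattern: digits and dots are rejected as well,
// so a match can no longer start partway through a number.
string[] after = Regex.Matches(input, @"(?<![A-Za-z0-9.])[0-9.]+")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(", ", before)); // 100, 1.1, 5.404, 02
Console.WriteLine(string.Join(", ", after));  // 100, 1.1, 5.404
```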

Could you possibly use word boundaries?
\b[0-9\.]+\b

Try the regex:
(?<![A-Za-z0-9])[0-9.]+

If you don't want letters or spaces anywhere in your string, then this should work:
^[0-9.]+$

A Non-Regex solution.
If you have the following string, then you can use double.TryParse to see if the string is a double. Try:
string str = "100 1.1 V100 d333 ABC 1.1";
double temp;
string[] result = str.Split().Where(r => (double.TryParse(r, out temp))).ToArray();
Or if you need a double array in return then:
double[] numberArray = str.Split()
.Where(r => double.TryParse(r, out temp))
.Select(r => double.Parse(r))
.ToArray();

Try using the caret ^ anchor. It indicates that the pattern must start at the beginning of the input. For example, ^[0-9.]+ will match inputs that begin with one or more digits or dots.
Note that this pattern does not match only numbers: it also matches inputs with more than one dot, for example 2.00.2, which is not a valid number.
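If inputs such as 2.00.2 should be rejected, one option is to spell out the shape of a number instead of relying on a character class. The pattern below is an assumption about what counts as a valid number here (digits, optionally followed by a single dot and more digits), and the sample input is made up for illustration. Treat it as a sketch rather than a definitive answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Assumed definition of a number: digits, then optionally one dot and more digits.
// The lookarounds reject a letter, digit, or dot directly before or after the match.
const string NumberPattern =
    @"(?<![A-Za-z0-9.])[0-9]+(?:\.[0-9]+)?(?![A-Za-z0-9.])";

// Sample input is an assumption for illustration.
string input = "100 1.1 V102 2.00.2 5.404";

string[] numbers = Regex.Matches(input, NumberPattern)
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(", ", numbers)); // 100, 1.1, 5.404
```

One consequence of the trailing lookahead is that a number directly followed by a period (for example at the end of a sentence) is skipped too, so it may need relaxing depending on the input.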

Related

Regex Match all characters until reach character, but also include last match

I'm trying to find all Color Hex codes using Regex.
I have this string value for example - #FF0000FF#0038FFFF#51FF00FF#F400FFFF and I use this:
#.+?(?=#)
pattern to match all characters until it reaches #, but it stops at the last character, which should be the last match.
I'm kind of new to this Regex stuff. How could I also get the last match?
Your regex does not match the last value because your regex (with the positive lookahead (?=#)) requires a # to appear after an already consumed value, and there is no # at the end of the string.
You may use
#[^#]+
See the regex demo
The [^#] negated character class matches any char but # (+ means 1 or more occurrences) and does not require a # to appear immediately to the right of the currently matched value.
In C#, you may collect all matches using
var result = Regex.Matches(s, @"#[^#]+")
.Cast<Match>()
.Select(x => x.Value)
.ToList();
A more precise pattern you may use is #[A-Fa-f0-9]{8}, it matches a # and then any 8 hex chars, digits or letters from a to f and A to F.
Don't rely upon any characters after the #; match hex characters and it will work every time.
(?i)#[a-f0-9]+
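The effect of the lookahead described above can be checked with a short C# sketch that runs the original pattern and the negated-class pattern against the string from the question. The variable names are assumptions for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string s = "#FF0000FF#0038FFFF#51FF00FF#F400FFFF";

// Lookahead version: every match must be followed by another '#',
// so the final colour code (which ends the string) is never matched.
string[] withLookahead = Regex.Matches(s, @"#.+?(?=#)")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

// Negated-class version: a match stops before the next '#' or at the end of the string.
string[] withNegatedClass = Regex.Matches(s, @"#[^#]+")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(withLookahead.Length);    // 3 (the last code is missed)
Console.WriteLine(withNegatedClass.Length); // 4
```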

Split string by char, but skip certain char combinations

Say I have a string in a form similar to this:
"First/Second//Third/Fourth" (notice the double slash between Second and Third)
I want to be able to split this string into the following substrings "First", "Second//Third", "Fourth". Basically, what I want is to split the string by a char (in this case /), but not by double of that char (in this case //). I thought of this in a number of ways, but couldn't get it working.
I can use a solution in C# and/or JavaScript.
Thanks!
Edit: I would like a simple solution. I have already thought of parsing the string char by char, but that is too complicated for my real-life usage.
Try with this C# solution, it uses positive lookbehind and positive lookahead:
string s = @"First/Second//Third/Fourth";
var values = Regex.Split(s, @"(?<=[^/])/(?=[^/])", RegexOptions.None);
It says: delimiter is / which is preceded by any character except / and followed by any character except /.
Here is another, shorter, version that uses negative lookbehind and lookahead:
var values = Regex.Split(s, @"(?<!/)/(?!/)", RegexOptions.None);
This says: delimiter is / which is not preceded by / and not followed by /
You can find out more about 'lookarounds' here.
In .NET Regex you can do it with negative assertions. (?<!/)/(?!/) will work. Use the Regex.Split method.
One thing you can do is split the string on /. The array you get back will contain empty entries at all the places where // was used. Loop through the array and, wherever entry i is empty, concatenate entries i-1 and i+1 (with // in between).
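The split-then-merge idea described above could be sketched in C# as follows. This is only an illustration under stated assumptions: it handles exactly doubled slashes between two non-empty pieces, and does not try to deal with runs of three or more slashes or with leading or trailing slashes. The variable names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

string s = "First/Second//Third/Fourth";

// Splitting on '/' yields an empty entry wherever "//" occurred:
// ["First", "Second", "", "Third", "Fourth"]
string[] parts = s.Split('/');

var result = new List<string>();
for (int i = 0; i < parts.Length; i++)
{
    // An empty entry with neighbours on both sides marks a "//" in the input.
    if (parts[i].Length == 0 && result.Count > 0 && i + 1 < parts.Length)
    {
        // Glue the next piece back onto the previous one, restoring the "//".
        result[result.Count - 1] += "//" + parts[i + 1];
        i++; // skip the piece that was just merged
    }
    else
    {
        result.Add(parts[i]);
    }
}

Console.WriteLine(string.Join(" | ", result)); // First | Second//Third | Fourth
```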
How about this:
// Use global regexes so that every "//" is protected, not only the first one.
var array = "First/Second//Third/Fourth".replace(/\/\//g, "%%").split("/");
array.forEach(function(element, index) {
array[index] = element.replace(/%%/g, "//");
});

Regex to isolate a specific substring

I have this string I have retrieved from a File.ReadAllText:
6 11 rows processed
As you can see there is always an integer specifying the line number in this document. What I am interested in is the integer that comes after it and the words "rows processed". So in this case I am only interested in the substring "11 rows processed".
So, knowing that each line will start with an integer and then some white space, I need to be able to isolate the integer that follows it and the words "rows processed" and return that to a string by itself.
I have been told this is easy to do with Regex, but so far I haven't the faintest clue how to build it.
You don't need regular expressions for this. Just split on the whitespace:
var fields = s.Split(new char[0], StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(String.Join(" ", fields.Skip(1)));
Here, I am using the fact that if you pass an empty array as the char [] parameter to String.Split, it splits on all whitespace.
This should work for what you need:
\d+(.*)
This searches for 1 or more digits (\d+) and then it puts everything afterwards in a group:
. = any character
* = repeater (zero or more of the preceding value, which here is any character)
() = grouping
However, Jason is correct in that you only need to use a split function
If you need to use a Regex it would be like this:
string result = null;
Match match = Regex.Match(row, @"^\s*\d+\s*(.*)");
if (match.Success)
result = match.Groups[1].Value;
The regex matches from the start of the row: first spaces, if any, then digits, and then more spaces. Finally, it captures the rest of the line, which is returned as the result.
This is done easily with Regex.Replace() using the following regular expression...
^\d+\s+
So it'd be something like this:
return Regex.Replace(text, @"^\d+\s+", "");
Basically you're just trimming the first number \d and the whitespace \s that follows.
Example in PHP (C# regex should be compatible):
$line = "6 11 rows processed";
$resp = preg_match("/[0-9]+\s+(.*)/",$line,$out);
echo $out[1];
I hope I caught your point.

c# Regex 0-9 and dash

I have a credit card number that I need to check if it has digits from 0 - 9 and also any dashes.
I have the following:
Match match = Regex.Match(CardNumber, "[0-9-]");
if (match.Success)
{
}
It works but wondering if I missed anything that may not make it work.
Thanks
Right now it only checks if there is at least one digit or dash inside the string CardNumber, so it would return True for the string "hello0!".
If you want to validate the string so it only consists of digits and dashes, you need to use
Match match = Regex.Match(CardNumber, @"^[0-9-]*$");
As a small note to what @Tim wrote, his regex will match -12--34-. Probably what you want is:
^([0-9]+-)*[0-9]+$
This will require at least a digit. If you want the empty string to match, use
^([0-9]+-)*[0-9]*$
(0 or more "groups" of one or more digits plus a - and a final "group" of digits)
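To compare how the patterns discussed above behave, here is a small C# sketch using Regex.IsMatch. The sample inputs and constant names are assumptions chosen for illustration; only the patterns come from the answers.

```csharp
using System;
using System.Text.RegularExpressions;

const string AnyDigitOrDash = "[0-9-]";               // the question's pattern
const string OnlyDigitsAndDashes = @"^[0-9-]*$";      // whole string is digits/dashes
const string DashSeparatedGroups = @"^([0-9]+-)*[0-9]+$"; // digit groups joined by single dashes

// Hypothetical sample inputs.
string[] samples = { "1234-5678-9012-3456", "-12--34-", "hello0!", "" };

foreach (string card in samples)
{
    bool a = Regex.IsMatch(card, AnyDigitOrDash);
    bool b = Regex.IsMatch(card, OnlyDigitsAndDashes);
    bool c = Regex.IsMatch(card, DashSeparatedGroups);
    Console.WriteLine($"'{card}': {a} {b} {c}");
}
// '1234-5678-9012-3456': True True True
// '-12--34-': True True False
// 'hello0!': True False False
// '': False True False
```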

How to split a string on numbers and it substrings?

How to split a string on numbers and substrings?
Input: 34AG34A
Expected output: {"34","AG","34","A"}
I have tried with Regex.Split() function, but I can not figure out what pattern would work.
Any ideas?
The regular expression (\d+|[A-Za-z]+) will return the groups you require.
I think you have to look for two patterns:
a sequence of digits
a sequence of letters
Hence, I'd use ([a-z]+)|([0-9]+).
For instance, System.Text.RegularExpressions.Regex.Matches("asdf1234be56qq78", "([a-z]+)|([0-9]+)") returns 6 matches, containing "asdf", "1234", "be", "56", "qq", "78".
First, you ask for "numbers" but don't specify what you mean by that.
If you mean "digits in 0-9" then you need the character class [0-9]. There is also the character class \d which in addition to 0-9 matches some other characters.
\d matches any decimal digit. It is equivalent to the \p{Nd} regular expression pattern, which includes the standard decimal digits 0-9 as well as the decimal digits of a number of other character sets.
I assume that you are not interested in negative numbers, numbers containing a decimal point, foreign numerals such as 五, etc.
Split is not the right solution here. What you appear to want to do is tokenize the string, not split it. You can do this by using Matches instead of Split:
string[] output = Regex.Matches(s, "[0-9]+|[^0-9]+")
.Cast<Match>()
.Select(match => match.Value)
.ToArray();
Don't use Regex.Split, use Regex.Match:
var m = Regex.Match("34AG34A", "([0-9]+|[A-Z]+)");
while (m.Success) {
Console.WriteLine(m);
m = m.NextMatch();
}
Converting this to an array is left as an exercise to the reader. :-)
