I have a string and want to use regex to match all the chars, but no spaces.
I tried to replace all the spaces with nothing, using:
// rating
var betyg = Regex.Replace(seller, @"[A-z](.+)", m => m.Groups[1].Value);
I expect the output of
"Iris-presenter | 5"
but, the output is
"Iris-presenter"
as seen in this demo.
The string is:
<spaces>Iris-presenter
<spaces>|
<spaces>5
Great question! I'm not sure whether this is what you're looking for, but this expression matches your input string:
^((?!\s|\n).)*
Edit
Based on revo's advice, the expression can be much simplified, because
^((?!\s|\n).)* is equal to ^((?!\s).)* and both are equal to ^\S*.
I used (\s(.*?)) to get it to work. This removes all spaces and newlines, as seen here.
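As a minimal sketch of the two possible readings of the question: the input below is an assumption (three space-indented lines, as described above), and `\s+` is a plain substitute for the patterns discussed, not the original poster's exact code. The first call strips all whitespace; the second collapses each whitespace run to one space, which gives the expected "Iris-presenter | 5".

```csharp
using System;
using System.Text.RegularExpressions;

// Assumed input: three lines, each indented with spaces (as in the question).
string seller = "    Iris-presenter\n    |\n    5";

// Reading 1: remove every whitespace character, including newlines.
string stripped = Regex.Replace(seller, @"\s+", "");

// Reading 2: trim the ends, then collapse each whitespace run to one space.
string collapsed = Regex.Replace(seller.Trim(), @"\s+", " ");

Console.WriteLine(stripped);   // Iris-presenter|5
Console.WriteLine(collapsed);  // Iris-presenter | 5
```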
I've been struggling with this for quite a while (not being a regex ninja), searching Stack Overflow and working through trial and error. I think I'm close, but there are still a few hiccups that I need help sorting out.
The requirement is that a given equation, which includes variables, exponents, etc., is split by the regex pattern into variables, constants, values, operators, etc. What I have so far:
Regex re = new Regex(@"(\,|\(|\)|(-?\d*\.?\d+e[+-]?\d+)|\+|\-|\*|\^)");
var tokens = re.Split(equation);
So an equation such as
2.75423E-19* (var1-5)^(1.17)* (var2)^(1.86)* (var3)^(3.56)
should parse to
[2.75423E-19 ,*, (, var1,-,5, ), ^,(,1.17,),*....,3.56,)]
However, the exponent portion is getting split as well, which I think is due to this part of the regex: |\+|\-.
Other renditions I've tried are:
Regex re1 = new Regex(@"([\,\+\-\*\(\)\^\/\ ])"); and
Regex re = new Regex(@"(-?\d*\.?\d+e[+-]?\d+)|([\,\+\-\*\(\)\^\/\ ])");
which both have their flaws. Any help would be appreciated.
For the equations like the one posted in the original question, you can use
[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?|[-^+*/()]|\w+
See regex demo
The regex matches:
[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)? - a float number
| - or...
[-^+*/()] - any of the arithmetic and logical operators present in the equation posted
| - or...
\w+ - 1 or more word characters (letters, digits or underscore).
For more complex tokenization, consider using NCalc, as suggested in Lucas Trzesniewski's comment.
C# sample code:
var line = "2.75423E-19* (var1-5)^(1.17)* (var2)^(1.86)* (var3)^(3.56)";
var matches = Regex.Matches(line, @"[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?|[-^+*/()]|\w+");
foreach (Match m in matches)
    Console.WriteLine(m.Value);
And updated code for you to show that Regex.Split is not necessary here:
var result = Regex.Matches(line, @"\d+(?:[,.]\d+)*(?:e[-+]?\d+)?|[-^+*/()]|\w+", RegexOptions.IgnoreCase)
    .Cast<Match>()
    .Select(p => p.Value)
    .ToList();
Also, to match formatted numbers, you can use \d+(?:[,.]\d+)* rather than [0-9]*\.?[0-9]+ or \d+(,\d+)*.
So I think I've got a solution: @stribizhev's answer led me to this regex:
Regex re = new Regex(@"(\d+(,\d+)*(?:.\d+)?(?:[eE][-+]?[0-9]+)?|[-^+/()]|\w+)");
tokenList = re.Split(InfixExpression).Select(t => t.Trim()).Where(t => t != "").ToList();
Splitting with it gives me the desired array.
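The reason a split can work here is that Regex.Split also returns the text captured by any capturing group in the pattern, so the tokens themselves come back along with the (mostly empty) text between them. A minimal sketch of that behaviour follows. The pattern is an illustrative variant with a single outer capturing group, not the exact one above, and the shortened equation is an assumption.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string equation = "2.75423E-19* (var1-5)^(1.17)";

// One outer capturing group; the inner group is non-capturing so that
// Regex.Split echoes back whole tokens only (not the exponent on its own).
var re = new Regex(@"(\d*\.?\d+(?:[eE][-+]?\d+)?|[-^+*/()]|\w+)");

// Split returns the gaps between matches plus the captured tokens;
// trimming and dropping empty entries leaves just the tokens.
var tokens = re.Split(equation)
    .Select(t => t.Trim())
    .Where(t => t != "")
    .ToList();

Console.WriteLine(string.Join(" ", tokens));
// 2.75423E-19 * ( var1 - 5 ) ^ ( 1.17 )
```

Regex.Matches avoids the empty entries altogether, which is why the answer above prefers it.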
I suppose this is an old question; however, I didn't find a suitable solution in the forums after several hours of searching.
I'm using C# and I know the Regex.Split and String.Split methods can be used to achieve the expected results. For some reason, I need to use a regular expression to match the required fields by specifying an arbitrary delimiter. For example, here is the string:
#DIV#This#DIV#is#DIV#"A "#DIV#string#DIV#
Here, #DIV# is the delimiter and is going to be split as:
This
is
"A "
string
How can I use a regular expression to match these values?
By the way, the leading and trailing #DIV# could also be omitted; for example, the source strings below should give the same result as above:
#DIV#This#DIV#is#DIV#"A "#DIV#string
This#DIV#is#DIV#"A "#DIV#string#DIV#
This#DIV#is#DIV#"A "#DIV#string
UPDATE:
I think I found a way (mind it is not efficient!) to get rid of empty values with a regex.
var splits = Regex.Matches(strIn, @"(?<=#DIV#|^)(?:(?!#DIV#).)+?(?=$|#DIV#)");
See demo on regexstorm (mind the \r? is only to demo in Multiline mode, you do not need it when using in real life)
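A small sketch of how that pattern behaves on the four input variants from the question. The helper name `Fields` is hypothetical and only there to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var pattern = new Regex(@"(?<=#DIV#|^)(?:(?!#DIV#).)+?(?=$|#DIV#)");

// Hypothetical helper: return every non-empty field between #DIV# delimiters.
List<string> Fields(string input) =>
    pattern.Matches(input).Cast<Match>().Select(m => m.Value).ToList();

string[] inputs =
{
    "#DIV#This#DIV#is#DIV#\"A \"#DIV#string#DIV#",
    "#DIV#This#DIV#is#DIV#\"A \"#DIV#string",
    "This#DIV#is#DIV#\"A \"#DIV#string#DIV#",
    "This#DIV#is#DIV#\"A \"#DIV#string",
};

foreach (var input in inputs)
    Console.WriteLine(string.Join("|", Fields(input)));  // This|is|"A "|string
```

Because the body requires at least one character (`+?`), the empty matches at the start and end never occur, so no LINQ filtering is needed.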
ORIGINAL ANSWER
Here is another approach using a regular Split:
var strIn = "#DIV#This#DIV#is#DIV#\"A # \"#DIV#string#DIV#";
var splitText = strIn.Split(new[] {"#DIV#"}, StringSplitOptions.RemoveEmptyEntries);
Or else, you can use a regex to match the fields you need and then remove empty items with LINQ:
var spltsTxt2 = Regex.Matches(strIn, @"(?<=#DIV#|^).*?(?=#DIV#|$)").Cast<Match>().Where(p => !string.IsNullOrEmpty(p.Value)).Select(p => p.Value).ToList();
Output:
This
is
"A # "
string
#DIV#|(.+?)(?=#DIV#|$)
Try this. Grab the captures or groups. See demo.
https://www.regex101.com/r/fJ6cR4/21
You can use the following to match:
/#?DIV#?/g
And replace with ' ' (space)
But this will sometimes give leading and trailing spaces, which can be removed by using String.Trim().
Edit1: If you want to match the field values you can use the following:
(?<=(#?DIV#?)|^)[^#]*?(?=(#?DIV#?)|$)
See DEMO
Edit2: More generalized regex for matching # in fields:
(?m)(?<=(^(?!#?DIV#)|(#?DIV#)))(.*?)(?=($|(#DIV#?)))
I'm looking for the following regex:
The match can be empty.
If it is not empty, it must contain at least 2 characters which are English letters or digits.
The regex must allow spaces between words.
This is what I came up with:
^[a-zA-Z0-9]{2,}$
It works fine, but it does not accept spaces between words.
Here, you can use this regex to make sure we match all kind of spaces (even a hard space), and make sure we allow an empty string match:
(?i)^(?:[a-z0-9]{2}[a-z0-9\p{Zs}]*|)$
C#:
var rg11x = new Regex(@"(?i)^(?:[a-z0-9]{2}[a-z0-9\p{Zs}]*|)$");
var tst = rg11x.IsMatch(""); // true
var tst1 = rg11x.Match("Mc Donalds").Value; // Mc Donalds
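To make the edge cases visible, here is a short sketch that runs the same pattern against a few extra inputs. The inputs are my own picks, not from the question. Note that `\p{Zs}` covers the ordinary space and the no-break ("hard") space but not a tab.

```csharp
using System;
using System.Text.RegularExpressions;

var rx = new Regex(@"(?i)^(?:[a-z0-9]{2}[a-z0-9\p{Zs}]*|)$");

Console.WriteLine(rx.IsMatch(""));                 // True  (empty is allowed)
Console.WriteLine(rx.IsMatch("a"));                // False (fewer than 2 characters)
Console.WriteLine(rx.IsMatch("Mc Donalds"));       // True
Console.WriteLine(rx.IsMatch("Mc\u00A0Donalds"));  // True  (no-break space is in \p{Zs})
Console.WriteLine(rx.IsMatch("Mc\tDonalds"));      // False (tab is not in \p{Zs})
Console.WriteLine(rx.IsMatch("a b"));              // False (must start with two letters/digits)
```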
You can use ^[a-zA-Z\d]{2}[a-zA-Z\d\s]*?$
Here is also a useful site for learning and testing regex patterns:
http://regex101.com/
I am trying to separate the following string into separate lines with a regular expression:
[property1=text1][property2=text2]
and the desired result should be
property1=text1
property2=text2
Here is my code:
string[] attrs = Regex.Split(attr_str, @"\[(.+)\]");
The result is incorrect; I am probably doing something wrong.
UPDATE: after applying the suggested answers, it now returns spaces and empty strings.
.+ is a greedy match, so it grabs as much as possible.
Use either
\[([^]]+)\]
or
\[(.+?)\]
In the first case, matching ] is not allowed, so "as much as possible" becomes shorter. The second uses a non-greedy match.
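A minimal sketch comparing the three patterns on the string from the question. The helper name `Captures` is hypothetical. It uses Regex.Matches and reads group 1, because Regex.Split would also return the empty strings between the bracketed parts.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string s = "[property1=text1][property2=text2]";

// Hypothetical helper: collect group 1 of every match of the given pattern.
string[] Captures(string pattern) =>
    Regex.Matches(s, pattern).Cast<Match>().Select(m => m.Groups[1].Value).ToArray();

// Greedy: .+ runs to the last ] in the string, so there is a single match.
Console.WriteLine(string.Join(" | ", Captures(@"\[(.+)\]")));
// property1=text1][property2=text2

// Negated class: the capture cannot contain ], so it stops at the first one.
Console.WriteLine(string.Join(" | ", Captures(@"\[([^]]+)\]")));
// property1=text1 | property2=text2

// Lazy: .+? expands only until the next ] is found.
Console.WriteLine(string.Join(" | ", Captures(@"\[(.+?)\]")));
// property1=text1 | property2=text2
```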
Your dot is grabbing the brackets as well. You need to exclude them:
\[([^]]+)\]
The [^]] matches any character except a closing bracket.
Try adding the 'lazy' specifier:
Regex.Split(attr_str, @"\[(.+?)\]");
Try:
var s = "[property1=text1][property2=text2]";
var matches = Regex.Matches(s, @"\[(.+?)\]")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value);
I have a text file which is in the format:
key1:val1,
key2:val2,
key3:val3
and I am trying to parse the key/value pairs out with a regex. Here is the regex code I am using with the same example:
string input = @"key1:val1,
key2:val2,
key3:val3";
var r = new Regex(@"^(?<name>\w+):(?<value>\w+),?$", RegexOptions.Multiline | RegexOptions.ExplicitCapture);
foreach (Match m in r.Matches(input))
{
    Console.WriteLine(m.Groups["name"].Value);
    Console.WriteLine(m.Groups["value"].Value);
}
When I loop through r.Matches, sometimes certain key/value pairs don't appear, and it seems to be the ones with the comma at the end of the line - but I should be taking that into account with the ,?. What am I missing here?
This might be a good situation for String.Split rather than a regex:
foreach (string pair in input.Split(new char[] { ',' }))
{
    // Trim removes the line break left at the start of each pair after the first.
    string[] items = pair.Trim().Split(new char[] { ':' });
    Console.WriteLine(items[0]);
    Console.WriteLine(items[1]);
}
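The problem is that your regular expression does not account for the line break at the end of the first two lines: with RegexOptions.Multiline, $ matches only before \n, so if the text uses \r\n line endings, the \r left after the comma prevents a match.
Try changing it to
@"^(?<name>\w+):(?<value>\w+),?(\n|\r|\r\n)?$"
and it should work.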
By the way, I love regular expressions, but given the problem you are trying to solve, go for the string.Split solution. It will be much easier to read...
EDIT: after reading your comment, where you say that this is a simplified version of your problem, maybe you could simplify the expression by adding some "tolerance" for spaces and newlines at the end of the match with
@"^(?<name>\w+):(?<value>\w+),?\s*$"
Also, when you play with regular expressions, test them with a tool like Expresso, it saves a lot of time.
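A minimal sketch of the difference, assuming the input uses Windows (\r\n) line endings. That is an assumption about the original file, written out explicitly in the string literal below.

```csharp
using System;
using System.Text.RegularExpressions;

// Assumption: the text uses \r\n line endings.
string input = "key1:val1,\r\nkey2:val2,\r\nkey3:val3";
var options = RegexOptions.Multiline | RegexOptions.ExplicitCapture;

// Original pattern: $ cannot match before \r, so only the last line matches.
var strict = new Regex(@"^(?<name>\w+):(?<value>\w+),?$", options);

// Tolerant pattern: \s* absorbs the \r, letting $ match before the \n.
var tolerant = new Regex(@"^(?<name>\w+):(?<value>\w+),?\s*$", options);

Console.WriteLine(strict.Matches(input).Count);    // 1
Console.WriteLine(tolerant.Matches(input).Count);  // 3

foreach (Match m in tolerant.Matches(input))
    Console.WriteLine(m.Groups["name"].Value + " = " + m.Groups["value"].Value);
```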
Get rid of the RegexOptions.Multiline option.