I have a string like
[123,234,321,....]
Can i use a regular expression to extract only the numbers ?
this is the code i am using now
public static string[] Split()
{
string s = "[123,234,345,456,567,678,789,890,100]";
var temp = s.Replace("[", "").Replace("]", "");
char[] separator = { ',' };
return temp.Split(separator);
}
You can use string.Split for this - no need for a regular expression, though your code can be simplified:
var nums = s.Split('[', ']', ',');
Thought you may want to exclude empty entries in the returned array:
var nums = s.Split(new[] { '[', ']', ',' },
StringSplitOptions.RemoveEmptyEntries);
There's an overload to Trim() that takes a character.
You could do this.
string s = "[123,234,345,456,567,678,789,890,100]";
var nums = s.Trim('[').Trim(']').Split(',');
If you want to use a regular expression, try:
string s = "[123,234,345,456,567,678,789,890,100]";
var matches = Regex.Matches(s, #"[0-9]+", RegexOptions.Compiled);
However, regular expressions tend to make your code less readable, so you might stick with your original approach.
Try with using string.Split method;
string s = "[123,234,345,456,567,678,789,890,100]";
var numbers = s.Split('[',']', ',');
foreach(var i in numbers )
Console.WriteLine(i);
Here is a DEMO.
EDIT: As Oded mentioned, you may want to use StringSplitOptions.RemoveEmptyEntries also.
string s = "[123,234,345,456,567,678,789,890,100]";
MatchCollection matches = Regex.Matches(s, #"(\d+)[,\]]");
string[] result = matches.OfType<Match>().Select(m => m.Groups[1].Value).ToArray();
Here the # is used to signify a verbatim string literal and allows the escape character '\' to be used directly in Regular expression notation without escaping itself "\".
\d is a digit, \d+ mean 1 or more digits. The parenthesis signify a group so (\d+) means I want a group of digits. (*See group used a little later)
[,\]] square brackets, in brief, mean choose any one of my element so it will choose either the comma , or a square bracket ] which I had to escape.
So the regular expression will find the expressions of sequential digits followed by a , or ]. The Matches will return the set of matches (which we use because there are multiple set) then we go through each match - with some LINQ - and grab the index 1 group which is the second group, "But we only made one group?" We only specified one group, the first group (index 0) is the entire regular expression match, which in our case, will include the , or ] which we don't want.
while you can and probably should use string.Split as other answers indicate, the question specifically asks if you can do it with regex, and yes, you can :-
var r = new Regex(#"\d+", RegexOptions.Compiled );
var matches = r.Matches("[123,234,345,456,567,678,789,890,100]");
I need to separate strings with semicolon (;) as delimiter. The semicolon inside parenthesis should be ignored.
Example:
string inputString = "(Apple;Mango);(Tiger;Horse);Ant;Frog;";
The output list of strings should be:
(Apple;Mango)
(Tiger;Horse)
Ant
Frog
The other valid input strings can be :
string inputString = "The fruits are (mango;apple), and they are good"
The above string should split to a single string
"The fruits are (mango;apple), and they are good"
string inputString = "The animals in (African (Lion;Elephant) and Asian(Panda; Tiger)) are endangered species; Some plants are endangered too."
The above string should split to two strings as shown below:
"The animals in (African (Lion;Elephant) and Asian(Panda; Tiger)) are endangered species"
"Some plants are endangered too."
I searched a lot but could not find the answer to the above scenario.
Does anybody know how to achieve this without reinventing the wheel?
Use a regular expression that matches what you want to keep, not the separators:
string inputString = "(Apple;Mango);(Tiger;Horse);Ant;Frog;";
MatchCollection m = Regex.Matches(inputString, #"\([^;)]*(;[^;)]*)*\)|[^;]+");
foreach (Match x in m){
Console.WriteLine(x.Value);
}
Output:
(Apple;Mango)
(Tiger;Horse)
Ant
Frog
Expression comments:
\( opening parenthesis
[^;)]* characters before semicolon
(;[^;)]*)* optional semicolon and characters after it
\) closing parenthesis
| or
[^;]+ text with no semicolon
Note: The expression above also accepts values in parentheses without a semicolon, e.g. (Lark) and mulitple semicolons, e.g. (Lark;Pine;Birch). It will also skips empty values, e.g. ";;Pine;;;;Birch;;;" will be two items, not ten.
Handle the paranthesized case separately from the "normal" case, to ensure that semicolons are omitted in the former.
A regex to achieve this (matching a single element in your input) may look like the following (not tested):
"\([A-Za-z;]+\)|[A-Za-z]+"
Say I have a string such as
abc123def456
What's the best way to split the string into an array such as
["abc", "123", "def", "456"]
string input = "abc123def456";
Regex re = new Regex(#"\D+|\d+");
string[] result = re.Matches(input).OfType<Match>()
.Select(m => m.Value).ToArray();
string[] result = Regex.Split("abc123def456", "([0-9]+)");
The above will use any sequence of numbers as the delimiter, though wrapping it in () says that we still would like to keep our delimiter in our returned array.
Note: In the example snippet we will get an empty element as the last entry of our array.
The boundary you look for can be described as "A position where a digit follows a non-digit, or where a non-digit follows a digit."
So:
string[] result = Regex.Split("abc123def456", #"(?<=\D)(?=\d)|(?<=\d)(?=\D)");
Use [0-9] and [^0-9], respectively, if \d and \D are not specific enough.
Add space around digitals, then split it. So there is the solution.
Regex.Replace("abc123def456", #"(\d+)", #" \1 ").Split(' ');
I hope it works.
You could convert the string to a char array and then loop through the characters. As long as the characters are of the same type (letter or number) keep adding them to a string. When the next character no longer is of the same type (or you've reached the end of the string), add the temporary string to the array and reset the temporary string to null.
Is there an elegant way to split up string at first (and only first) occurence of two or more whitespaces ?
Or at least finding the index of this two or more whitespaced string.
Thank you very much.
You can construct an instance instead of using the static method and utilize the overload which restricts the number of splits performed:
Regex regex = new Regex(#"\s{2,}");
string[] result = regex.Split(input, 2); // only 1 split, two parts
Check this out:
String.Split only on first separator in C#?
Or:
http://msdn.microsoft.com/en-us/library/c1bs0eda.aspx
String.Split(separator, number of strings to return)
Use regex splitting as shown here
I guess you'd end up with something like this:
RegexOptions options = RegexOptions.None;
Regex regex = new Regex(#"[ ]{2,}", options);
string[] operands = Regex.Split(operation, regex);
So I'm trying to match up a regex and I'm fairly new at this. I used a validator and it works when I paste the code but not when it's placed in the codebehind of a .NET2.0 C# page.
The offending code is supposed to be able to split on a single semi-colon but not on a double semi-colon. However, when I used the string
"entry;entry2;entry3;entry4;"
I get a nonsense array that contains empty values, the last letter of the previous entry, and the semi-colons themselves. The online javascript validator splits it correctly. Please help!
My regex:
((;;|[^;])+)
Split on the following regular expression:
(?<!;);(?!;)
It means match semicolons that are neither preceded nor succeeded by another semicolon.
For example, this code
var input = "entry;entry2;entry3;entry4;";
foreach (var s in Regex.Split(input, #"(?<!;);(?!;)"))
Console.WriteLine("[{0}]", s);
produces the following output:
[entry]
[entry2]
[entry3]
[entry4]
[]
The final empty field is a result of the semicolon on the end of the input.
If the semicolon is a terminator at the end of each field rather than a separator between consecutive fields, then use Regex.Matches instead
foreach (Match m in Regex.Matches(input, #"(.+?)(?<!;);(?!;)"))
Console.WriteLine("[{0}]", m.Groups[1].Value);
to get
[entry]
[entry2]
[entry3]
[entry4]
Why not use String.Split on the semicolon?
string sInput = "Entry1;entry2;entry3;entry4";
string[] sEntries = sInput.Split(';');
// Do what you have to do with the entries in the array...
Hope this helps,
Best regards,
Tom.
As tommieb75 wrote, you can use String.Split with StringSplitOptions Enumeration so you can control your output of newly created splitting array
string input = "entry1;;entry2;;;entry3;entry4;;";
char[] charSeparators = new char[] {';'};
// Split a string delimited by characters and return all non-empty elements.
result = input.Split(charSeparators, StringSplitOptions.RemoveEmptyEntries);
The result would contain only 4 elements like this:
<entry1><entry2><entry3><entry4>