Regex + Convert line of numbers separated by white space into array - c#

I'm trying to convert a string that contains multiple numbers, where each number is separated by white space, into a double array.
For example, the original string looks like:
originalString = "50 12.2 30 48.1"
I've been using Regex.Split(originalString, @"\s*"), but it's returning an array that looks like:
[50
""
12
"."
2
""
...]
Any help is much appreciated.

Use this instead:
originalString.Split(new char[]{'\t', '\n', ' ', '\r'}, StringSplitOptions.RemoveEmptyEntries);
No need to reach for a regex every time :)

What about string[] myArray = originalString.Split(' ');
I don't see the need for a RegEx here..
If you really want to use a RegEx, use the pattern \s+ instead of \s*.
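The * means zero or more, so \s* can also match the empty string between characters, which is why the input gets split in unexpected places. You want to split on one or more whitespace characters.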
Working example with a RegEx:
string originalString = "50 12.2 30 48.1";
string[] arr = Regex.Split(originalString, @"\s+");
foreach (string s in arr)
    Console.WriteLine(s);

Regex.Split(originalString, @"\s+").Where(s => !string.IsNullOrWhiteSpace(s))
The Where returns an IEnumerable with the null/whitespace entries filtered out. If you still want it as an array, just add .ToArray() to that chain of calls.
The + character is necessary because a separator must consist of a MINIMUM of one whitespace character to be a correct match.

I would stick with String.Split, supplying all whitespace characters that you are expecting.
In regular expressions, \s is equivalent to [ \t\r\n] (plus some other characters specific to the flavour in use); we can represent these through a char[]:
string[] nums = originalString.Split(
new char[] { ' ', '\t', '\r', '\n' },
StringSplitOptions.RemoveEmptyEntries);

The default behaviour if you pass null as a separator to String.Split is to split on whitespace. That includes anything that matches the Unicode IsWhiteSpace test. Within the ASCII range that means tab, line feed, vertical tab, form feed, carriage return and space.
Also you can avoid empty fields by passing the RemoveEmptyEntries option.
originalString = "50 12.2 30 48.1";
string[] fields = originalString.Split(null as char[], StringSplitOptions.RemoveEmptyEntries);
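None of the snippets above performs the final step the question asks for, turning the fields into a double array. A minimal sketch of that step follows. The class and method names are hypothetical, and using CultureInfo.InvariantCulture is an assumption that the input always uses '.' as the decimal separator.

```csharp
using System;
using System.Globalization;

static class NumberLineParser
{
    // Hypothetical helper: splits on any whitespace and parses each field.
    public static double[] ParseDoubles(string line)
    {
        // A null separator array makes Split use whitespace as the delimiter.
        string[] fields = line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        // Assumption: '.' is the decimal separator, so parse with the
        // invariant culture rather than the machine's regional settings.
        return Array.ConvertAll(fields, f => double.Parse(f, CultureInfo.InvariantCulture));
    }

    static void Main()
    {
        double[] values = ParseDoubles("50 12.2 30 48.1");
        foreach (double value in values)
            Console.WriteLine(value);
    }
}
```

double.Parse throws a FormatException on a field that is not a number; if the input may be malformed, double.TryParse with the same culture is the safer choice.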

Related

Split string with specific requirements

Let's say I have the string
string Song = "The-Sun - Is Red";
I need to split it from the '-' char, but only if the char before and after is a space.
I don't want it to split at "The-Sun"'s dash, but rather at "Sun - Is"'s dash.
The code I was using to split was
string[] SongTokens = Song.Split('-');
But that obviously splits at the first dash, I believe. I only need to split if there is a space before and after the '-'.
Thanks
I need to split it from the '-' char, but only if the char before and after is a space.
You can use a non-regex solution like this:
string[] SongTokens = Song.Split(new[] {" - "}, StringSplitOptions.RemoveEmptyEntries);
Result: "The-Sun" and "Is Red".
See more details about the String.Split Method (String[], StringSplitOptions) at MSDN. The first argument, separator, is a string array that delimits the substrings in this string, an empty array that contains no delimiters, or null.
The StringSplitOptions.RemoveEmptyEntries removes all empty elements from the resulting array. You may use StringSplitOptions.None to keep the empty elements.
Yet there can be a problem if the dash may be surrounded by a hard (non-breaking) space instead of a regular space on either end. In that case, you'd rather choose a regex solution like this:
string[] SongTokens = Regex.Split(Song, @"\p{Zs}+-\p{Zs}+")
    .Where(x => !String.IsNullOrWhiteSpace(x))
    .ToArray();
The \p{Zs}+ pattern matches any Unicode "horizontal" whitespace, 1 or more occurrences.
string[] SongTokens = Song.Split(new string[] {" - "}, StringSplitOptions.None);
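To make the difference between the literal " - " separator and the \p{Zs} pattern concrete, here is a small sketch that runs both on an input whose dash is surrounded by non-breaking spaces (U+00A0). The class and method names and the second test string are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class DashSplitter
{
    // Splits on a dash surrounded by one or more Unicode space separators.
    public static string[] SplitOnSpacedDash(string song)
    {
        return Regex.Split(song, @"\p{Zs}+-\p{Zs}+")
                    .Where(part => !string.IsNullOrWhiteSpace(part))
                    .ToArray();
    }

    static void Main()
    {
        // Hypothetical input: the dash is surrounded by non-breaking spaces.
        string hardSpaced = "The-Sun\u00A0-\u00A0Is Red";

        // The literal separator " - " does not occur here, so nothing is split.
        string[] literal = hardSpaced.Split(new[] { " - " }, StringSplitOptions.RemoveEmptyEntries);
        Console.WriteLine(literal.Length);

        // The regex treats U+00A0 as a space separator and splits as intended.
        string[] viaRegex = SplitOnSpacedDash(hardSpaced);
        Console.WriteLine(viaRegex.Length);
    }
}
```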

Is it possible to split a string into an array of strings and remove sections not between delimiters using String.Split or regex?

I want to split a string with a known delimiter between its parts into an array of strings using a method (e.g. MethodToSplitIntoArray(String toSplit)), as in the example below. The values are strings that can contain any character except '{', '}', or ',', so I am unable to delimit on any other character. The string can also contain undesired white space at the start and end, as the file can be generated from multiple different sources. The desired information is always in between "{" and "}" and separated by commas.
String valueCombined = " {value},{value1},{value2} ";
String[] values = MethodToSplitIntoArray(valueCombined);
foreach(String value in values)
{
//Do something with array
Label.Text += "\r\nString: " + value;
}
Where the label would show:
String: value
String: value1
String: value2
My current implementation of the splitting method is below. It splits the values, but the result also includes the spaces before the first curly brace and everything between the brace pairs.
private String[] MethodToSplitIntoArray(String toSplit)
{
    return toSplit.Split(new string[] { "{", "}" }, StringSplitOptions.RemoveEmptyEntries);
}
I thought this would separate out the strings between the curly braces and remove the rest of the string, but my output is:
String:
String: value
String: ,
String: value1
String: ,
String: value2
String:
What am I doing wrong in my split that I'm still getting the string values outside of the curly braces? Ideally I would like to use regex or String.Split if it's possible.
For those with similar problems check out DotNet Perls on splitting
Making the assumption that commas are not permitted inside a curly brace pair, and that outside a curly brace pair only commas or whitespace will appear, it seems to me that the most straightforward, easy-to-read way to approach this is to first split on commas, then trim the results of that (to remove whitespace), and then finally to remove the first and last characters (which at that point should only be the curly braces):
valueCombined.Split(',').Select(s => s.Trim()).Select(s => s.Substring(1, s.Length - 2)).ToArray();
I believe that including the curly braces in the initial split operation just makes everything harder, and is more likely to break in hard-to-identify ways (i.e. bad data will result in weirder results than if you use something like the above).
Add ',' to the delimiters:
return toSplit.Split(new char[] { '{', '}', ',' }, StringSplitOptions.RemoveEmptyEntries);
I'm not sure whether you expect those spaces at the start and end, so I added some trimming to prevent whitespace-only results for those.
private String[] MethodToSplitIntoArray(String toSplit)
{
return toSplit.Trim().Split(new char[] { '{', '}', ',' }, StringSplitOptions.RemoveEmptyEntries);
}
This might be one way to get all the values you are looking for:
String valueCombined = " {value},{value1},{value2} ";
String[] values = valueCombined.Trim().Split(new string[] { "},{" }, StringSplitOptions.RemoveEmptyEntries);
int lastVal = values.Length - 1;
values[0] = values[0].Replace("{", "");
values[lastVal] = values[lastVal].Replace("}", "");
What I did here is split the string on "},{" (after trimming the surrounding white space), and then remove { from the first array item and } from the last array item.
Try regex and LINQ.
return Regex.Split(toSplit, "[{},]").Where(x => !string.IsNullOrWhiteSpace(x)).ToArray();
Very late, but you could try this:
Regex.Split(" { value},{ value1},{ value2};", @"\s*},{\s*|{\s*|},?;?").Where(s => string.IsNullOrWhiteSpace(s) == false).ToArray()
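A different technique from the split-based answers above is to match the brace pairs directly and capture what is inside them, so anything outside the braces is ignored. The following is only a sketch under the question's stated assumption that values never contain '{', '}' or ','. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class BraceExtractor
{
    // Returns the text found inside each {...} pair, ignoring everything else.
    public static string[] ExtractBracedValues(string toSplit)
    {
        // Group 1 captures any run of characters that are not braces.
        return Regex.Matches(toSplit, @"\{([^{}]*)\}")
                    .Cast<Match>()
                    .Select(m => m.Groups[1].Value)
                    .ToArray();
    }

    static void Main()
    {
        string valueCombined = " {value},{value1},{value2} ";
        foreach (string value in ExtractBracedValues(valueCombined))
            Console.WriteLine("String: " + value);
    }
}
```

Because it matches rather than splits, this variant needs no trimming and no RemoveEmptyEntries, and an empty pair "{}" is preserved as an empty string instead of being dropped.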

Is there a way to do a string.Split on whitespace

I have a string "mystring theEnd", and I want to do a string.Split on white space, not just on a single space, so that I get a string[] containing "mystring" and "theEnd". Between "mystring" and "theEnd" there is an unknown amount of white space, which is why I need to split on whitespace. Is there a way to do this?
How about:
string[] bits = text.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);
(Or text.Split specifying the exact whitespace characters you want to split on, or using null as Henk suggested.)
Or you could use a regex to handle all whitespace characters:
Regex regex = new Regex(@"\s+");
string[] bits = regex.Split(text);
Simplest is to do:
a.Split(new [] {' ', '\t'},StringSplitOptions.RemoveEmptyEntries)
Thanks Jon :)

Quick way of splitting a mixed alphanum string into text and numeric parts?

Say I have a string such as
abc123def456
What's the best way to split the string into an array such as
["abc", "123", "def", "456"]
string input = "abc123def456";
Regex re = new Regex(@"\D+|\d+");
string[] result = re.Matches(input).OfType<Match>()
.Select(m => m.Value).ToArray();
string[] result = Regex.Split("abc123def456", "([0-9]+)");
The above will use any sequence of numbers as the delimiter, though wrapping it in () says that we still would like to keep our delimiter in our returned array.
Note: In the example snippet we will get an empty element as the last entry of our array.
The boundary you look for can be described as "A position where a digit follows a non-digit, or where a non-digit follows a digit."
So:
string[] result = Regex.Split("abc123def456", @"(?<=\D)(?=\d)|(?<=\d)(?=\D)");
Use [0-9] and [^0-9], respectively, if \d and \D are not specific enough.
Add a space around each run of digits, then split on spaces:
Regex.Replace("abc123def456", @"(\d+)", " $1 ").Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
I hope it works.
You could convert the string to a char array and then loop through the characters. As long as the characters are of the same type (letter or number) keep adding them to a string. When the next character no longer is of the same type (or you've reached the end of the string), add the temporary string to the array and reset the temporary string to null.
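The loop described above could be sketched roughly as follows. This is only one possible reading of it: the names are hypothetical, a StringBuilder stands in for the temporary string, and every non-digit character is treated as "text".

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class AlphaNumSplitter
{
    // Splits the input into alternating runs of digits and non-digits.
    public static string[] SplitAlphaNum(string input)
    {
        var parts = new List<string>();
        var current = new StringBuilder();

        foreach (char c in input)
        {
            // A change between digit and non-digit closes the current run.
            if (current.Length > 0 &&
                char.IsDigit(c) != char.IsDigit(current[current.Length - 1]))
            {
                parts.Add(current.ToString());
                current.Clear();
            }
            current.Append(c);
        }

        // Flush the last run once the end of the string is reached.
        if (current.Length > 0)
            parts.Add(current.ToString());

        return parts.ToArray();
    }

    static void Main()
    {
        foreach (string part in SplitAlphaNum("abc123def456"))
            Console.WriteLine(part);
    }
}
```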

Split a string by word using one of any or all delimiters?

I may have just hit the point where I'm overthinking it, but I'm wondering: is there a way to designate a list of special characters that should all be considered delimiters, and then split a string using that list? Example:
"battlestar.galactica-season 1"
should be returned as
battlestar galactica season 1
I'm thinking regex, but I'm kinda flustered at the moment; I've been staring at it for too long.
EDIT:
Thanks guys for confirming my suspicion that I was overthinking it, lol. Here is what I ended up with:
//remove the delimiter
string[] tempString = fileTitle.Split(@"\/.-<>".ToCharArray());
fileTitle = "";
foreach (string part in tempString)
{
fileTitle += part + " ";
}
return fileTitle;
I suppose i could just replace delimiters with " " spaces as well... i will select an answer as soon as the timer is up!
The built-in String.Split method can take a collection of characters as delimiters.
string s = "battlestar.galactica-season 1";
string[] words = s.Split('.', '-');
The standard split method does that for you. It takes an array of characters:
public string[] Split(
params char[] separator
)
You can just call an overload of split:
myString.Split(new char[] { '.', '-', ' ' }, StringSplitOptions.RemoveEmptyEntries);
The char array is a list of delimiters to split on.
"battlestar.galactica-season 1".Split(new string[] { ".", "-" }, StringSplitOptions.RemoveEmptyEntries);
This may not be complete but something like this.
string value = "battlestar.galactica-season 1";
char[] delimiters = new char[] { '\r', '\n', '.', '-' };
string[] parts = value.Split(delimiters,
StringSplitOptions.RemoveEmptyEntries);
for (int i = 0; i < parts.Length; i++)
{
Console.WriteLine(parts[i]);
}
Are you trying to split the string (make multiple strings) or do you just want to replace the special characters with a space as your example might also suggest (make 1 altered string).
For the first option just see the other answers :)
If you want to replace you could use
string title = "battlestar.galactica-season 1".Replace('.', ' ').Replace('-', ' ');
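If the delimiter list is longer than two characters, chaining Replace calls gets unwieldy. One alternative, sketched here with a hypothetical helper name and an assumed delimiter set, is to split on all delimiters at once and join the pieces back together with single spaces.

```csharp
using System;

static class TitleCleaner
{
    // Replaces every run of delimiter characters with a single space.
    public static string ReplaceDelimiters(string title)
    {
        // Assumed delimiter set; extend it with whatever characters apply.
        char[] delimiters = { '.', '-', ' ' };
        string[] words = title.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", words);
    }

    static void Main()
    {
        // Prints: battlestar galactica season 1
        Console.WriteLine(ReplaceDelimiters("battlestar.galactica-season 1"));
    }
}
```

Unlike the foreach loop in the question's edit, string.Join does not leave a trailing space at the end of the result.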
For more information on Split, with easy examples, see "C# Split Function explained". It also covers splitting on words (multiple chars).
