C# String splitting - breaking string up at second comma - c#

I have a string like so:
mystring = "test1, 1, anotherstring, 5, yetanother, 400";
myarray can be of varying length. What I would like to do is split the string up like so:
{"test1, 1"}
{"anotherstring, 5"}
{"yetanother, 400"}
Is this possible? I tried string[] newArray = mystring.Split(',') but that splits it at every comma, and not the second comma which is what I'd like to do.
Thanks for your help
Zaps

You can use a regular expression to match two items in the string:
string[] parts =
Regex.Matches(myarray[0], "([^,]*,[^,]*)(?:, |$)")
.Cast<Match>()
.Select(m => m.Groups[1].Value)
.ToArray();
This gets the items from the first string in the array. I don't know why you have the string in an array; if you have more than one string, you have to loop through them and get the items from each string.

There is no direct way to make String.Split do this.
If performance is not a concern, you can use LINQ:
var input = "test1, 1, anotherstring, 5, yetanother, 400";
string[] result = input.Split(',');
result = result.Where((s, i) => i % 2 == 0)
.Zip(result.Where((s, i) => i % 2 == 1), (a, b) => a + ", " + b)
.ToArray();
Otherwise you'll probably have to split the string manually using String.IndexOf, or using a regular expression.
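The manual String.IndexOf approach mentioned above could look roughly like the sketch below. The class and method names (CommaPairSplitter, SplitAtEverySecondComma) are made up for illustration, and the code assumes that a trailing odd item, if any, should be kept as its own piece.

```csharp
using System;
using System.Collections.Generic;

public static class CommaPairSplitter
{
    // Hypothetical helper: cuts the input at every second comma using String.IndexOf.
    public static List<string> SplitAtEverySecondComma(string input)
    {
        var parts = new List<string>();
        int start = 0;       // index where the current piece begins
        int commasSeen = 0;  // commas found since the last cut
        int pos = input.IndexOf(',');

        while (pos >= 0)
        {
            commasSeen++;
            if (commasSeen == 2)
            {
                // Cut here: everything from 'start' up to this comma is one piece.
                parts.Add(input.Substring(start, pos - start).Trim());
                start = pos + 1;
                commasSeen = 0;
            }
            pos = input.IndexOf(',', pos + 1);
        }

        // Whatever follows the last cut is the final piece (assumption: keep it).
        if (start < input.Length)
        {
            parts.Add(input.Substring(start).Trim());
        }
        return parts;
    }
}
```

For the string from the question this yields three pieces: "test1, 1", "anotherstring, 5" and "yetanother, 400". It walks the string once and does not allocate the intermediate array that Split creates.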

Another LINQ-based solution here. (Perhaps not the most efficient, but it allows for concise code and works for grouping into arbitrary group sizes).
1) Define a new query operator, InGroupsOf:
public static IEnumerable<T[]> InGroupsOf<T>(this IEnumerable<T> parts,
int groupSize)
{
IEnumerable<T> partsLeft = parts;
while (partsLeft.Count() >= groupSize)
{
yield return partsLeft.Take(groupSize).ToArray<T>();
partsLeft = partsLeft.Skip(groupSize);
}
}
2) Second, apply it to your input:
// define your input string:
string input = "test1, 1, anotherstring, 5, yetanother, 400";
// split it, remove excessive whitespace from all parts, and group them together:
IEnumerable<string[]> pairedInput = input
.Split(',')
.Select(part => part.Trim())
.InGroupsOf(2); // <-- used here!
// see if it worked:
foreach (string[] pair in pairedInput)
{
Console.WriteLine(string.Join(", ", pair));
}

Not with Split alone, but it can certainly be achieved.
I take it that myarray is actually a string, and not an array...?
In that case, you could perhaps do something like this:
string myarray = "test1, 1, anotherstring, 5, yetanother, 400";
string[] sourceArray = myarray.Split(',');
string[] newArray = sourceArray.Select((s,i) =>
(i % 2 == 0) ? "" : string.Concat(sourceArray[i-1], ",", s).Trim()
).Where(s => !String.IsNullOrEmpty(s)).ToArray();

You could probably use a regular expression on the original string to replace every other comma with a different token, e.g. ';'.
Then just call String.Split on the new token instead.
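A minimal sketch of that token idea, assuming ';' never occurs in the data. The names TokenSplitter and SplitPairs are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TokenSplitter
{
    // Hypothetical helper: turns every second comma into ';' and splits on ';'.
    public static string[] SplitPairs(string input)
    {
        // "item,item," is matched as a unit; group 1 keeps the first comma,
        // and the comma that follows the pair is rewritten to ';'.
        string marked = Regex.Replace(input, @"([^,]*,[^,]*),", "$1;");

        string[] pieces = marked.Split(';');
        for (int i = 0; i < pieces.Length; i++)
        {
            pieces[i] = pieces[i].Trim(); // drop the space left after each token
        }
        return pieces;
    }
}
```

With the input from the question, the intermediate string is "test1, 1; anotherstring, 5; yetanother, 400", which then splits into the three wanted pairs.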

Interesting question... I'd do it like this:
string input = "test1, 1, anotherstring, 5, yetanother, 400";
string pattern = @"([^,]*,[^,]*),";
string[] substrings = Regex.Split(input, pattern).Where(s => s!="").Select(s => s.Trim()).ToArray();
You get exactly what you want. Only it's dirty... =P =)

using IEnumerator..
var myarray = "test1, 1, anotherstring, 5, yetanother, 400";
System.Collections.IEnumerator iEN = myarray.Split(',').GetEnumerator();
var strList = new List<string>();
while (iEN.MoveNext())
{
var first = iEN.Current;
iEN.MoveNext();
strList.Add((string)first + "," + (string)iEN.Current);
}
:)

Related

extracting strings between 2 chars - all occurrences

I would like to do something like this:
My string example: "something;123:somethingelse;156:somethingelse2;589:somethingelse3"
I would like to get an array with the values extracted from the example string. These values lie between ";" and ":": 123, 156, 589
I have tried this, but I do not know how to iterate to get all occurrences:
string str = stringExample.Split(';', ':')[1];
string[i] arr = str;
Thank you for helping me.
LINQ is your friend here, something like this would do:
str.Split(';').Select(s => s.Split(':')[0]).Skip(1)
I would work with named groups:
string stringExample = "something;123:somethingelse;156:somethingelse2;589:somethingelse3";
Regex r = new Regex(";(?<digit>[0-9]+):");
foreach (Match item in r.Matches(stringExample))
{
var digit = item.Groups["digit"].Value;
}
You can use a regular expression like this:
Regex r = new Regex(@";(\d+):");
string s = "something;123:somethingelse;156:somethingelse2;589:somethingelse3";
foreach(Match m in r.Matches(s))
Console.WriteLine(m.Groups[1]);
;(\d+): matches one or more digits standing between ; and :, and Groups[1] selects the content inside the parentheses, i.e. the digits.
Output:
123
156
589
To get these strings into an array use:
string[] numberStrings = r.Matches(s).OfType<Match>()
.Select(m => m.Groups[1].Value)
.ToArray();
So you want to extract all 3 numbers, you could use this approach:
string stringExample = "something;123:somethingelse;156:somethingelse2;589:somethingelse3";
string[] allTokens = stringExample.Split(';', ':'); // remove [1] since you want the whole array
string[] allNumbers = allTokens.Where(str => str.All(Char.IsDigit)).ToArray();
Result is:
allNumbers {string[3]} string[]
[0] "123" string
[1] "156" string
[2] "589" string
This sounds like a perfect case for a regular expression.
var sample = "something;123:somethingelse;156:somethingelse2;589:somethingelse3";
var regex = new Regex(@"(?<=;)(\d+)(?=:)");
var matches = regex.Matches(sample);
var array = matches.Cast<Match>().Select(m => m.Value).ToArray();

Extract integer from the end of a string

I have multiple IDs in a List<string>()
List<string> IDList = new List<string>() {
"ID101", //101
"I2D102", //102
"103", //103
"I124D104", //104
"ID-105", //105
"-1006" }; //1006
Rule: The string always ends with the id which has length 1 to n and is int only
I need to extract them to int values. But my solution doesn't work
List<int> intList = IDList.Select(x => int.Parse(Regex.Match(x, @".*\d*").Value)).ToList();
If ID is always at the end, you could use LINQ solution instead of Regex:
var query = IDList.Select(id =>
int.Parse(new string(id.Reverse()
.TakeWhile(x => char.IsNumber(x))
.Reverse().ToArray())));
The idea is to take characters from the end of the string until a non-digit is found, then convert what you collected to an int. The good thing about this solution is that it directly expresses the rule you specified.
Well, according to
Rule: The string always ends with the id which has length 1 to n and
is int only
the pattern is nothing but
[0-9]{1,n}$
[0-9] - ints only
{1,n} - from 1 to n (both 1 and n are included)
$ - string always ends with
and possible implementation could be something like this
int n = 5; //TODO: put actual value
String pattern = "[0-9]{1," + n.ToString() + "}$";
List<int> intList = IDList
.Select(line => int.Parse(Regex.Match(line, pattern).Value))
.ToList();
In case there're some broken lines, say "abc" (and you want to filter them out):
List<int> intList = IDList
.Select(line => Regex.Match(line, pattern))
.Where(match => match.Success)
.Select(match => int.Parse(match.Value))
.ToList();
Here's another LINQ approach which works if the number is always at the end and negative values aren't possible. Skips invalid strings:
List<int> intList = IDList
.Select(s => s.Reverse().TakeWhile(Char.IsDigit))
.Where(digits => digits.Any())
.Select(digits => int.Parse(String.Concat(digits.Reverse())))
.ToList();
( Edit: similar to Ian's approach )
The code below extracts the trailing ID as an integer from each item in the collection and ignores items that do not end with an integer:
List<int> intList = IDList.Where(a => Regex.IsMatch(a, @"\d+$") == true)
.Select(x => int.Parse(Regex.Match(x, @"\d+$").Value)).ToList();
I assume you want the trailing numbers:
var res = IDList.Select(x => int.Parse(Regex.Match(x, @"\d+$").Value)).ToList();
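The answers above either use a regex or reverse the string with LINQ. As a rough alternative sketch, the same rule (the ID is the run of digits at the end) can be written as a backwards scan combined with int.TryParse, so that strings without trailing digits, or with too many digits for an int, are reported instead of throwing. TrailingId and TryGetTrailingInt are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class TrailingId
{
    // Hypothetical helper: reads the run of ASCII digits at the end of 's'.
    // Returns false if there are no trailing digits or the number overflows int.
    public static bool TryGetTrailingInt(string s, out int value)
    {
        int i = s.Length;
        // Walk backwards while the preceding character is a digit.
        while (i > 0 && s[i - 1] >= '0' && s[i - 1] <= '9')
        {
            i--;
        }

        // NumberStyles.None accepts digits only: no sign, no white space.
        return int.TryParse(s.Substring(i), NumberStyles.None,
                            CultureInfo.InvariantCulture, out value);
    }
}
```

For "ID-105" this gives 105 and for "-1006" it gives 1006, matching the comments in the question, while a string such as "abc" makes the method return false.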

How convert values in line c#

input (string) : 1 2 3 4 5
I want it to be:
string line = "1 2 3 4 5";
List<int> list = new List<int>();
list.Add(1);
list.Add(2);
list.Add(3);
list.Add(4);
list.Add(5);
You may use a Regular Expression to identify the occurrences of integer numbers within the text. Here's a working example.
This may prove to be much more reliable depending on your scenario, e.g. you could type arbitrary words/text in there, and it would still find all numbers.
The C# code to do this would be as follows:
List<int> FindIntegers(string input)
{
Regex regex = new Regex(@"(\d+)");
List<int> result = new List<int>();
foreach (Match match in regex.Matches(input))
{
result.Add(int.Parse(match.Value));
}
return result;
}
You can use the Split method with the StringSplitOptions overload:
string line = "1 2 3 4 5";
char[] delim = {' '};
var list = line.Split(delim, StringSplitOptions.RemoveEmptyEntries)
.Select(i => Convert.ToInt32(i)).ToList();
RemoveEmptyEntries will skip over the blank entries and your output will be:
List<Int32> (5 items)
1
2
3
4
5
See String.Split Method on MSDN for more info.
You can convert multiple spaces to single ones and then split on ' ':
var result = Regex.Replace(line.Trim(), @"\s+", " ").Split(' ').Select(x => int.Parse(x)).ToList();
Or directly select all numbers by one of these Regexes:
\d{1} --> If the numbers will always consist of a single digit.
\d+ --> If the numbers might contain more than one digit.
Like this:
var result = new List<int>();
foreach (Match i in Regex.Matches(line, @"\d{1}")) // Or the other Regex...
{
result.Add(int.Parse(i.Value));
}
You can split the line on spaces removing empty matches with StringSplitOptions.RemoveEmptyEntries.
string line = "1 2 3 4 5";
var list = line.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
.Select(n => Convert.ToInt32(n)).ToList();
So the result will have only the numbers then you can convert to int.
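If the line may contain tabs or tokens that are not numbers, a slightly more defensive variant of the split-and-convert approach can be sketched as follows. LineParser and ParseInts are invented names, and silently skipping unparseable tokens is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;

public static class LineParser
{
    // Hypothetical helper: converts all white-space separated integers in a line.
    public static List<int> ParseInts(string line)
    {
        var numbers = new List<int>();

        // A null separator array makes Split treat any white-space
        // character (space, tab, ...) as a delimiter.
        string[] tokens = line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        foreach (string token in tokens)
        {
            int value;
            // Assumption: tokens that are not valid integers are skipped.
            if (int.TryParse(token, out value))
            {
                numbers.Add(value);
            }
        }
        return numbers;
    }
}
```

Unlike Convert.ToInt32, int.TryParse does not throw on bad input, so one stray word in the line does not abort the whole conversion.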
Try this, using a lambda expression and String.Split:
string line = "1 2 3 4 5";
List<int> list = line.Split(' ').Where(x => x.Trim().Length > 0).Select(x => Convert.ToInt32(x)).ToList();

Split a string into an array

I want to split a string into an array of substrings. The string is delimited by spaces, but a space may also appear inside a substring. The resulting substrings must all be of the same length.
Example:
"a b aab bb aaa" -> "a b", "aab", "bb ", "aaa"
I have the following code:
var T = Regex.Split(S, @"(?<=\G.{4})").Select(x => x.Substring(0, 3));
But I need to parameterize this code to split by various lengths (3, 4, 5 or n), and I don't know how to do this. Please help.
If it is impossible to parameterize the regex, a pure LINQ version is fine.
You can use the same regex, but "parameterize" it by inserting the desired number into the string.
In C# 6.0, you can do it like this:
var n = 5;
var T = Regex.Split(S, $@"(?<=\G.{{{n}}})").Select(x => x.Substring(0, n-1));
Prior to that you could use string.Format:
var n = 5;
var regex = string.Format(@"(?<=\G.{{{0}}})", n);
var T = Regex.Split(S, regex).Select(x => x.Substring(0, n-1));
It seems rather easy with LINQ:
var source = "a b aab bb aaa";
var results =
Enumerable
.Range(0, source.Length / 4 + 1)
.Select(n => source.Substring(n * 4, 3))
.ToList();
Or use the Interactive Extensions from Microsoft's Reactive Extensions team (NuGet "Ix-Main") and do this:
var results =
source
.Buffer(3, 4)
.Select(x => new string(x.ToArray()))
.ToList();
Both give you the output you require.
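Since the question asks for a parameterized version, here is a rough sketch of the same Substring idea with the piece width and the step between pieces passed in as arguments. FixedWidthSplitter and its parameters are hypothetical names. The sketch also clamps the last piece so that a short tail does not make Substring throw.

```csharp
using System;
using System.Collections.Generic;

public static class FixedWidthSplitter
{
    // Hypothetical helper.
    // width:  number of characters kept per piece
    // stride: distance between piece starts (width + 1 if one delimiter
    //         character separates the pieces)
    public static List<string> Split(string source, int width, int stride)
    {
        if (width <= 0 || stride <= 0)
        {
            throw new ArgumentOutOfRangeException();
        }

        var pieces = new List<string>();
        for (int start = 0; start < source.Length; start += stride)
        {
            // Clamp so the final piece may be shorter than 'width'.
            int length = Math.Min(width, source.Length - start);
            pieces.Add(source.Substring(start, length));
        }
        return pieces;
    }
}
```

Calling it with width 3 and stride 4 corresponds to the hard-coded 3 and 4 in the LINQ version above; other lengths only change the two arguments.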
A lookbehind (?<=pattern) matches a zero-length string. To split using spaces as delimiters, the match has to actually consume the space " " (the space has to be in the main pattern, outside the lookbehind).
Regex for length = 3: @"(?<=\G.{3}) " (note the trailing space)
Code for length n:
var n = 3;
var S = "a b aab bb aaa";
var regex = @"(?<=\G.{" + n + @"}) ";
var T = Regex.Split(S, regex);

Losing characters in strings after performing a split with RegEx

I want to split a string into 2 strings,
my string looks like:
HAMAN*3 whitespaces*409991
I want to have two strings the first with 'HAMAN' and the second should contain '409991'
PROBLEM: My second string has '09991' instead of '409991' after implementing this code:
string str = "HAMAN   409991 ";
string[] result = Regex.Split(str, @"\s4");
foreach (var item in result)
{
MessageBox.Show(item.ToString());
}
CURRENT SOLUTION with PROBLEM:
Split my original string when I find whitespace followed by the number 4. The character '4' is missing in the second string.
PREFERRED SOLUTION:
To split when I find 3 whitespaces followed by the digit 4. And have '4' mentioned in my second string.
Try this
string[] result = Regex.Split(str, @"\s{3,}(?=4)");
Positive lookahead is your friend:
Regex.Split(str, @"\s+(?=4)");
Or you could not use Regex:
var str = "HAMAN 409991 ";
var result = str.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);
Alternative if you need it to start with SPACESPACESPACE4:
var str = new[] {
"HAMAN 409991 ",
"HAMAN 409991",
"HAMAN 509991"
};
foreach (var s in str)
{
var result = s.Trim()
.Split(new[] {" "}, StringSplitOptions.RemoveEmptyEntries)
.Select(a => a.Trim())
.ToList();
if (result.Count != 2 || !result[1].StartsWith("4"))
continue;
Console.WriteLine("{0}, {1}", result[0], result[1]);
}
That's because you're splitting including the 4. If you want to split by three-consecutive-spaces then you should specify exactly that:
string[] result = Regex.Split(str, @"\s{3}");
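To make the difference concrete, the sketch below contrasts a pattern that consumes the 4 with one that only looks ahead for it. SplitDemo and its two methods are made-up names, and trimming the input first is an assumption so that trailing white space does not matter.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // The '4' is part of the delimiter, so Regex.Split removes it.
    public static string[] ConsumingSplit(string s)
    {
        return Regex.Split(s.Trim(), @"\s{3}4");
    }

    // The lookahead only checks that a '4' follows; it is not consumed,
    // so the '4' stays at the start of the second piece.
    public static string[] LookaheadSplit(string s)
    {
        return Regex.Split(s.Trim(), @"\s{3}(?=4)");
    }
}
```

For an input with three spaces between HAMAN and 409991, ConsumingSplit returns "HAMAN" and "09991", which is the behavior described in the question, while LookaheadSplit returns "HAMAN" and "409991".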
