Manual string split in C#

In my code, I am attempting to manipulate a string:
Some text - 04.09.1996 - 40-18
I'd like to split this into three substrings: Some text, 04.09.1996, and 40-18.
When I use the Split method with a hyphen as a separator, the return value is an array of four strings: Some text, 04.09.1996, 40, and 18. How can I make this code work as described above?

You should just split with spaces around -:
.Split(new[] {" - "}, StringSplitOptions.RemoveEmptyEntries);
Demo:
var res = "Some text - 04.09.1996 - 40-18".Split(new[] {" - "}, StringSplitOptions.RemoveEmptyEntries);
foreach (var s in res)
Console.WriteLine(s);
Result:
Some text
04.09.1996
40-18

Use this overload of string split to only get 3 parts:
var s = "Some text - 04.09.1996 - 40-18";
var parts = s.Split(new[] { '-' }, 3);
I'm assuming you also want to trim the spaces too:
var parts = s.Split(new[] { '-' }, 3)
.Select(p => p.Trim());

I would be wary of "-" or " - " appearing in "Some text", since I assume it is a placeholder for arbitrary text. If you are certain that "Some text" will never contain "-", then the other answers here are good, simple and readable. Otherwise we need to rely on something that is constant about the string. It looks to me like the constant is the last three hyphens. So I would split on "-" and put the pieces back together like this:
string input = "Some text - 04.09.1996 - 40-18";
// Split on every hyphen; the last three pieces are always the date, "40" and "18".
string[] foo = input.Split('-');
int length = foo.Length;
string[] bar = new string[3];
// Put "Some text" back together, restoring any hyphens it contained.
bar[0] = string.Join("-", foo, 0, length - 3).Trim();
bar[1] = foo[length - 3].Trim();
// Rejoin the last pair ("40" and "18") with the hyphen that was split away.
bar[2] = foo[length - 2].Trim() + "-" + foo[length - 1].Trim();
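The same idea of anchoring on the end of the string can be sketched by searching for the separators from the right. This is only an illustration: SplitFromRight is a hypothetical helper name, it assumes the date and the trailing field never contain " - ", and it is written as a C# 9 top-level program.

```csharp
using System;

string[] parts = SplitFromRight("Some-text - with - hyphens - 04.09.1996 - 40-18");
Console.WriteLine(string.Join(" | ", parts));

// Hypothetical helper (not from the original answers): finds the last two
// " - " separators, so hyphens inside the leading text are left untouched.
static string[] SplitFromRight(string input)
{
    const string sep = " - ";

    // Position of the separator in front of the final field ("40-18").
    int last = input.LastIndexOf(sep, StringComparison.Ordinal);
    if (last < 0) throw new FormatException("Expected two \" - \" separators.");

    // Everything before that separator: leading text plus the date.
    string head = input.Substring(0, last);
    int prev = head.LastIndexOf(sep, StringComparison.Ordinal);
    if (prev < 0) throw new FormatException("Expected two \" - \" separators.");

    return new[]
    {
        head.Substring(0, prev),            // leading text, may contain hyphens
        head.Substring(prev + sep.Length),  // date
        input.Substring(last + sep.Length)  // trailing field
    };
}
```

Searching from the right means the leading text never has to be glued back together, at the cost of assuming that the last two fields are well formed.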

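In this particular case you can split on the hyphen together with its surrounding spaces:
input.Split(" - ")
This overload takes a single string and is only available from .NET Core 2.0 onward; on .NET Framework, pass new[] { " - " } together with a StringSplitOptions value instead.
In terms of "good practice" I can't recommend this solution, since it depends on every separator having exactly one space on each side.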

I replaced the character sequence "--------------------" in your string with a special character, "&", and then split on that character:
string str = "Hello, my- name -------------------- is Philip J. Fry -------------------- and i like cartoons".Replace("--------------------", "&");
string[] ss = str.Split('&');
// Trim each piece so the spaces around the removed separator do not end up before the commas.
string result = ss[0].Trim() + ", " + ss[1].Trim() + ", " + ss[2].Trim();
The output string then looks like "Hello, my- name, is Philip J. Fry, and i like cartoons".

Related

The "Why" Behind String.Empty at the End of a String

Background
I am working with a delimited string and was using String.Split to put each substring into an array when I noticed that the last spot in the array was "". It was throwing off my results since I was looking for a specific substring at the last index in the array and I eventually came across this post explaining all strings end with string.Empty.
Example
The following shows this behavior in action. When I split my sentence and write each substring to the console, we can see the last element is the empty string:
public class Program
{
static void Main(string[] args)
{
const string mySentence = "Hello,this,is,my,string!";
var wordArray = mySentence.Split(new[] {",", "!"}, StringSplitOptions.None);
foreach (var word in wordArray)
{
var message = word;
if (word == string.Empty) message = "Empty string";
Console.WriteLine(message);
}
Console.ReadKey();
}
}
Question & "Fix"
I get conceptually that there are empty strings between every character, but why does String behave like this even at the end of a string? It seems confusing that "ABC" is equivalent to "ABC" + "" or "ABC" + "" + "" + "", so why not treat the string literally as only "ABC"?
There is a "fix" around it to get the "true" substrings I wanted:
public class Program
{
static void Main(string[] args)
{
const string mySentence = "Hello,this,is,my,string!";
var wordArray = mySentence.Split(new[] {",", "!"}, StringSplitOptions.None);
var wordList = new List<string>();
wordList.AddRange(wordArray);
wordList.RemoveAt(wordList.LastIndexOf(string.Empty));
foreach (var word in wordList)
{
var message = word;
if (word == string.Empty) message = "Empty string";
Console.WriteLine(message);
}
Console.ReadKey();
}
}
But I still don't understand why the end of the string gets treated with the same behavior since there is not another character following it where an empty string would be needed. Does it serve some purpose for the compiler?
Empty strings are the 0 of strings. There are literally infinitely many of them everywhere.
It's only natural that "ABC" is equivalent to "ABC" + "" or "ABC" + "" + "" + "", just as 3 is equivalent to 3 + 0 or 3 + 0 + 0 + 0.
And the fact that you get an empty string at the end of "Hello,this,is,my,string!".Split('!') does mean something: it means that your string ended with a "!".
This is happening because you are using StringSplitOptions.None while one of your delimiter values occurs at the end of the string. The entire purpose of that option is to create the behavior you are observing: it splits a string containing N delimiters into exactly N + 1 pieces.
To see the behavior you want, use StringSplitOptions.RemoveEmptyEntries:
var wordArray = mySentence.Split(new[] {",", "!"}, StringSplitOptions.RemoveEmptyEntries);
As for why you are seeing what you're seeing: with StringSplitOptions.None, Split finds every place where a delimiter occurs in the input string and returns an array of the pieces before and after each delimiter. This is useful, for example, if you're parsing a string that you know has exactly N parts, some of which may be blank. For example, splitting each of the following on a comma yields exactly 3 parts:
a,b,c
a,b,
a,,c
a,,
,b,c
,b,
,,c
,,
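The difference between the two options can be checked with a short sketch. CountFields is a hypothetical helper introduced only for this illustration, and the program assumes C# 9 top-level statements.

```csharp
using System;

string[] samples = { "a,b,c", "a,b,", "a,,c", ",," };
foreach (string sample in samples)
{
    int all = CountFields(sample, StringSplitOptions.None);
    int nonEmpty = CountFields(sample, StringSplitOptions.RemoveEmptyEntries);
    Console.WriteLine($"{sample,-6} None={all} RemoveEmptyEntries={nonEmpty}");
}

// Hypothetical helper: number of pieces a comma split produces under the given option.
static int CountFields(string line, StringSplitOptions options)
{
    return line.Split(new[] { ',' }, options).Length;
}
```

With StringSplitOptions.None every sample reports 3 pieces (two delimiters, so three fields), while RemoveEmptyEntries reports only the fields that actually contain text.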
If you want to allow empty values between delimiters, but not at the beginning or end, you can strip off delimiters at the beginning or end of the string before splitting:
var wordArray = Regex
.Replace(mySentence, "^[,!]|[,!]$", "")
.Split(new[] {",", "!"}, StringSplitOptions.None);
"" is the gap between each pair of letters in Hello,this,is,my,string! So when the string is split by , and ! the result is Hello, this, is, my, string, "". The final "" is the empty string between the ! and the end of the string.
If you replaced "" with a visible character (say #) your string would look like this #H#e#l#l#o#,#t#h#i#s#,#i#s#,#m#y#,#s#t#r#i#n#g#!#.

Split and add specific characters to a string in C#

I am working on a keywording module in which every keyword must end with a comma and the spaces around the commas must be removed. For example,
if the string is
one man sitting with girl , baby girl , Smiling girl
then result should be
,one man sitting with girl,baby girl,Smiling girl,
I am trying this
string[] strArray = str15.Split(new char[] { ',', ';' });
if (strArray.Length >= 1)
{
foreach (string str3 in strArray)
{
if (str3 != string.Empty)
{
strr1 = strr1 + str3 + ",";
}
}
}
But I am not able to remove the spaces from the string.
The first thing you want to do is tokenise the string, using Split().
string input = "some words , not split , in a sensible , way";
string[] sections = input.Split(',');
This gives you an array of strings split by the comma delimiter, which would look something like this:
"some words "
" not split "
" in a sensible "
" way"
Now you want to trim those spaces off. The string class has a great little function called Trim() which removes all whitespace characters (spaces, tabs, etc) from the start and end of strings.
for (int i = 0; i < sections.Length; i++)
sections[i] = sections[i].Trim();
Now you have an array with strings like this:
"some words"
"not split"
"in a sensible"
"way"
Next, you want to join them back together with a comma delimiter.
string result = string.Join(",", sections);
This gives you something along the lines of this:
"some words,not split,in a sensible,way"
And finally, you can add the commas at the start and end:
result = "," + result + ",";
Of course, this isn't the cleanest way of doing it. It's just an easy way of describing the individual steps. You can combine all of this together using LINQ extensions:
string result = "," + string.Join(",", input.Split(',').Select(s => s.Trim())) + ",";
This takes the input, splits it on the comma delimiter, then for each item in the list executes a lambda expression s => s.Trim(), which selects the trimmed version of the string for each element. The resulting enumerable is then passed to string.Join(), and then we add the two commas at the start and end. It does the same thing as the steps above, but in one line.
Try this:
var input = "one man sitting with girl , baby girl , Smiling girl";
var output = string.Join(",", input.Split(',').Select(x => x.Trim()));
output = string.Concat(",", output, ",");
This should work:
string test = "one man sitting with girl , baby girl , Smiling girl";
Regex regex = new Regex(@"\s+,\s+");
string result = regex.Replace(test, ",");
Split your string and join it again by removing the white-spaces:
var input = "one man sitting with girl , baby girl , Smiling girl";
var output = string.Join(",", input.Split(',').Select(x => x.Trim()));
// If you wanna enclose it with commas
output = string.Format(",{0},",output);

Take all elements of .Split (' ') and take last OR exclude last item

I have an inconsistent address field which I am reading from Excel. I need to split this field and enter the two values into two list properties.
The content could be something like this:
At the gates 42
I need to split the number from the rest and add "At the gates" to the property "Street" and "42" to the property "number"
I have solved the problem with taking the number with this:
Number = Convert.ToString(ws.Cells[j + 3, i + 2].Value).Split(' ').LastOrDefault()
How can I split this string with Linq and just exclude the last number to get the street?
Kind regards
Use regex so you can also have it flexible to your needs.
Here's a sample:
static void Main(string[] args)
{
Regex regex = new Regex(@"(?<words>[A-Za-z ]+)(?<digits>[0-9]+)");
string input = "At the gates 42";
var match = regex.Match(input);
// The "words" group also captures the space before the number, so trim it.
var a = match.Groups["words"].Value.Trim(); //At the gates
var b = match.Groups["digits"].Value; //42
}
Splitting the string is a bit too much here. You just need to find the last occurrence of " " and split at this position:
var address = "At the gates 42";
var idx = address.LastIndexOf(' ');
var street = address.Substring(0, idx);
var number = address.Substring(idx + 1);
Also consider using regular expressions if you need more flexibility.
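Because the field is described as inconsistent, a slightly more defensive variant of the LastIndexOf approach may be useful. The sketch below is an assumption-laden illustration, not part of the original answer: SplitAddress is a hypothetical helper, and it treats the last token as a house number only when it starts with a digit. It uses C# 9 top-level statements.

```csharp
using System;

var (street, number) = SplitAddress("At the gates 42");
Console.WriteLine($"Street: '{street}', Number: '{number}'");

// Hypothetical helper: splits at the last space, but only if the final
// token looks like a house number (assumption: it starts with a digit).
static (string Street, string Number) SplitAddress(string address)
{
    string trimmed = address.Trim();
    int idx = trimmed.LastIndexOf(' ');

    // No space at all, or the last token is not numeric: keep everything as the street.
    if (idx < 0 || !char.IsDigit(trimmed[idx + 1]))
    {
        return (trimmed, string.Empty);
    }

    return (trimmed.Substring(0, idx).TrimEnd(), trimmed.Substring(idx + 1));
}
```

Returning an empty number instead of throwing keeps rows such as "Broadway" usable; whether that is the right policy depends on the data.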
You could just do this:
String[] sa = Convert.ToString(ws.Cells[j + 3, i + 2].Value).Split(' ');
Int32 number = Convert.ToInt32(sa[sa.Length - 1]);
String street = String.Join(" ", sa.Take(sa.Length - 1));
I would use LastIndexOf for this (if you can safely assume that your house numbers don't contain spaces):
var original = Convert.ToString(ws.Cells[j + 3, i + 2].Value);
var street = original.Substring(0, original.LastIndexOf(' '));
var number = original.Substring(original.LastIndexOf(' ') + 1);
Note that it might be easier to get Excel to do the work for you though, you could use a formula in your sheet to perform the same function, so...
This gives you the street name (assuming your original value is in cell A1, just replace all occurrences of A1 otherwise):
=MID(A1,1,LOOKUP(9.9999999999E+307,FIND(" ",A1,ROW($1:$1024)))-1)
and this gives you the number:
=MID(A1,LOOKUP(9.9999999999E+307,FIND(" ",A1,ROW($1:$1024)))+1,LEN(A1))

Insert space between characters

I have a string,
string aString = "a,aaa,aaaa,aaaaa,,,,,";
which I want to insert into a List. But when I do so using the following method,
List<string> aList = new List<string>();
aList.AddRange(aString.Split(new string[] { "," }, StringSplitOptions.RemoveEmptyEntries));
MessageBox.Show(aList.Count.ToString());
I get a count of only 4, but there are actually 8 elements, even though the final entries between the commas are blank.
But if i pass the string as shown below,
string aString = "a,aaa,aaaa,aaaaa, , , , ,";
it shows 8 elements. Please help me with this; by default the program retrieves the string like so:
a,aaa,aaaa,aaaaa,,,,,
Please help with this one. It would be great if I could add spaces to the empty areas, or find any other way to add everything between the commas to the list, even the blank areas. Thank you :)
Don't use StringSplitOptions.RemoveEmptyEntries
string aString = "a,aaa,aaaa,aaaaa,,,,,";
var newStr = String.Join(", ", aString.Split(','));
I think you must remove StringSplitOptions.RemoveEmptyEntries:
aList.AddRange(aString.Split(new string[] { "," }, StringSplitOptions.None));
You can just remove the spaces before splitting (note that RemoveEmptyEntries then drops the blank fields):
aList.AddRange(aString.Replace(" ", "").Split(new string[] { "," }, StringSplitOptions.RemoveEmptyEntries));
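To keep the blank entries and, as the question asks, store a space in place of each one, something along these lines may work. SplitKeepingBlanks and its placeholder parameter are hypothetical names used for illustration; the sketch assumes C# 9 top-level statements. Note that the sample string contains eight commas, so splitting without RemoveEmptyEntries yields nine fields, the last being the empty field after the trailing comma.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string aString = "a,aaa,aaaa,aaaaa,,,,,";
List<string> aList = SplitKeepingBlanks(aString, " ");
Console.WriteLine(aList.Count); // 9: eight commas delimit nine fields

// Hypothetical helper: split on commas, keep empty fields,
// and substitute the given placeholder for each empty one.
static List<string> SplitKeepingBlanks(string csv, string placeholder)
{
    return csv.Split(',')
              .Select(field => field.Length == 0 ? placeholder : field)
              .ToList();
}
```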

String concatenation using C#

I have an input string:
"risk management, portfolio management, investment planning"
How do I convert this string into:
"risk management" + "portfolio management" + "investment planning"
Thanks.
Split and Trim
// include linq library like this:
// using System.Linq;
// then
"test1, test2".Split(',').Select(o => o.Trim());
or
"test1, test2".Split(',').Select(o => o.Trim()).ToArray(); // returns array
and
"test1, test2".Split(',').Select(o => "\"" + o.Trim() + "\"")
.Aggregate((s1, s2) => s1 + " + " + s2);
// returns a string: "test1" + "test2"
Use the Split() method:
string[] phrases = s.Split(',');
Now you have a string array of each comma separated value.
To remove the spaces, use the Trim() method on each string (thanks John Feminella)
Calling String.Split(',') on its own isn't enough in your case, because each comma is followed by a space. Your strings would look like { "risk management", " portfolio management", " investment planning" }. Instead, use Regex.Split:
string[] investmentServices = Regex.Split(inputString, ", ");
var results = from s in "risk management, portfolio management, investment planning".Split(new char[] { ',' })
select s.Trim();
Your question is not clear on whether you want to replace the ',' for '+' or just a simple split.
Here are the 2 possibilities:
string s = "risk management, portfolio management, investment planning";
string transformedString = s.Replace(", ", "\" + \"");
string[] parts = s.Split(new [] {", "}, StringSplitOptions.None);
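If the goal really is the quoted, plus-separated form shown in the question, the split and the replace can be combined. The following is a sketch under that assumption; QuoteAndJoin is a hypothetical helper name, and the program uses C# 9 top-level statements.

```csharp
using System;
using System.Linq;

string s = "risk management, portfolio management, investment planning";
Console.WriteLine(QuoteAndJoin(s));
// "risk management" + "portfolio management" + "investment planning"

// Hypothetical helper: split on commas, trim each part,
// wrap it in double quotes and join the parts with " + ".
static string QuoteAndJoin(string csv)
{
    var quoted = csv.Split(',').Select(part => "\"" + part.Trim() + "\"");
    return string.Join(" + ", quoted);
}
```

Unlike a plain Replace, this also quotes the first and last items and tolerates any amount of whitespace around the commas.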
If you want to split the input, you can use string.Split with a comma as the delimiter or, even better, ", " to take the space after the comma into account:
string[] array = inputString.Split(", ");
However, if what you want is to replace each comma inside the string with a plus sign, this is how you could achieve that:
inputString = inputString.Replace(", ", "\" + \"");
It actually looks like you're trying to perform a split, rather than concatenation.
If you're looking to take that input string and convert it into three strings containing "risk management", "portfolio management", and "investment planning", then use inputString.Split(','), and trim each string from the resulting array when you use it.
It is not very clear what you mean. If you need to access the CSV values then this will output each value separately...
string input = "risk management, portfolio management, investment planning";
string[] words = input.Split(new Char[] {','});
foreach(string word in words)
{
Console.WriteLine(word.Trim());
}
//risk management
//portfolio management
//investment planning
Reply to Jhonny D. Cano
(Sorry, don't have 50 rep for a comment.)
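Your first recommendation
string[] array = inputString.Split(", ");
doesn't compile on .NET Framework, which has no Split overload that takes a single string (one was added in .NET Core 2.0). Passing ", ".ToCharArray() does compile, but it splits on every comma and every space rather than on the two-character sequence. To split on the delimiter as a whole, pass it in a string array:
string[] array = inputString.Split(new[] { ", " }, StringSplitOptions.None);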
