I have a String example:
#5/r/n#12/r/n#23/r/n#43/r/n#54/r/n#23/r/n#77/r/n
I need to pass these values to a list and get the values between # and /r/n
So far I have the following code:
List<string> result = Regex.Split(String, @"/r/n").ToList();
This separates the values but leaves the # on each one. How can I remove the # from every value in the list?
You can do this in one line using LINQ:
List<string> result = Regex.Split(String, @"/r/n").Select(s => s.Replace("#", "")).ToList();
You can use the trim function to remove special characters from the front and end of your strings.
myString.Trim( new Char[] { '#', ' '} )
Use String.IsNullOrEmpty to filter out any empty strings as well:
List<string> result = Regex.Split(myString, @"/r/n").Select(a => a.Trim(new Char[] { '#', ' ' })).Where(b => !String.IsNullOrEmpty(b)).ToList();
You can use trim
char[] charsToTrim = { '#' };
List<string> result = Regex.Split(String, @"/r/n")
.Select(x => x.Trim(charsToTrim))
.ToList();
You could also split on the # and trim the other separator, whichever makes more sense.
I believe that Trim will be faster than Replace -- but I did not test it.
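As a rough sketch of the split-then-trim approach from the answers above: this assumes the separator really is the literal text /r/n as posted (if the real data uses carriage return and line feed, substitute "\r\n"), and it uses String.Split with a string separator rather than Regex.Split, since no pattern is needed. The variable names are illustrative, and the top-level statement form assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample input from the question; "/r/n" is treated as a literal separator (assumption).
string input = "#5/r/n#12/r/n#23/r/n#43/r/n#54/r/n#23/r/n#77/r/n";

// Split on the separator, drop the empty entry after the last separator,
// then trim '#' (and stray spaces) from both ends of each piece.
List<string> values = input
    .Split(new[] { "/r/n" }, StringSplitOptions.RemoveEmptyEntries)
    .Select(s => s.Trim('#', ' '))
    .ToList();

Console.WriteLine(string.Join(",", values)); // 5,12,23,43,54,23,77
```

Trim only touches the ends of each string, so a # in the middle of a value would survive, whereas Replace removes every occurrence.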
Related
I want to split a string with a known delimiter between its parts into an array of strings, using a method such as MethodToSplitIntoArray(String toSplit) in the example below. The values are strings that can contain any character except '{', '}', or ',', so I cannot delimit on any other character. The string can also have unwanted whitespace at the start and end, because the file can be generated from several different sources. The desired values sit between "{" and "}" and are separated by commas.
String valueCombined = " {value},{value1},{value2} ";
String[] values = MethodToSplitIntoArray(valueCombined);
foreach(String value in values)
{
//Do something with array
Label.Text += "\r\nString: " + value;
}
Where the label would show:
String: value
String: value1
String: value2
My current implementation of splitting method is below. It splits the values but includes any spaces before the first parenthesis and anything between them.
private String[] MethodToSplitIntoArray(String toSplit)
{
return filesPassed.Split(new string[] { "{", "}" }, StringSplitOptions.RemoveEmptyEntries);
}
I thought this would separate out the strings between the curly braces and remove the rest of the string, but my output is:
String:
String: value
String: ,
String: value1
String: ,
String: value2
String:
What am I doing wrong in my split that I'm still getting the string values outside of the curly braces? Ideally I would like to use regex or String.Split if it's possible.
For those with similar problems check out DotNet Perls on splitting
Making the assumption that commas are not permitted inside a curly brace pair, and that outside a curly brace pair only commas or whitespace will appear, it seems to me that the most straightforward, easy-to-read way to approach this is to first split on commas, then trim the results of that (to remove whitespace), and then finally to remove the first and last characters (which at that point should only be the curly braces):
valueCombined.Split(',').Select(s => s.Trim()).Select(s => s.Substring(1, s.Length - 2)).ToArray();
I believe that including the curly braces in the initial split operation just makes everything harder, and is more likely to break in hard-to-identify ways (i.e. bad data will result in weirder results than if you use something like the above).
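The split, trim, and strip sequence described above can be sketched as follows. This is only an illustration under the stated assumptions (no commas inside a brace pair, only commas or whitespace outside); the guard that skips malformed pieces is an added choice, not something the question asked for. Top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;

string valueCombined = " {value},{value1},{value2} ";

string[] values = valueCombined
    .Split(',')
    // Remove whitespace first so the length below refers to the trimmed piece.
    .Select(part => part.Trim())
    // Skip anything that is not wrapped in a single pair of braces (illustrative guard).
    .Where(part => part.Length >= 2 && part[0] == '{' && part[part.Length - 1] == '}')
    // Drop the first and last characters, which are now known to be '{' and '}'.
    .Select(part => part.Substring(1, part.Length - 2))
    .ToArray();

foreach (string value in values)
{
    Console.WriteLine("String: " + value);
}
```

Trimming before measuring the length matters: taking the length from the untrimmed piece would leave a trailing brace on any value that had surrounding whitespace.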
Add , to the delimiters:
return filesPassed.Split(new char[] { '{', '}', ',' }, StringSplitOptions.RemoveEmptyEntries);
Not sure if you are expecting those spaces in the front and end so added some trimming to prevent empty results for those.
private String[] MethodToSplitIntoArray(String toSplit)
{
return toSplit.Trim().Split(new char[] { '{', '}', ',' }, StringSplitOptions.RemoveEmptyEntries);
}
This might be one way to get all the values you are looking for:
String valueCombined = " {value},{value1},{value2} ";
String[] values = valueCombined.Split(new string[] { "},{" }, StringSplitOptions.RemoveEmptyEntries);
int lastVal = values.Count() - 1;
values[0] = values[0].Replace("{", "");
values[lastVal] = values[lastVal].Replace("}", "");
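What I did here is split the string on "},{" and then remove the { from the first array item and the } from the last array item. Note that the leading and trailing spaces stay in the first and last items unless you also call Trim() on them.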
Try regex and linq.
return Regex.Split(toSplit, "[{},]").Where(x => !string.IsNullOrWhiteSpace(x)).ToArray();
This is very late, but you could try this:
Regex.Split(" { value},{ value1},{ value2};", @"\s*},{\s*|{\s*|},?;?").Where(s => string.IsNullOrWhiteSpace(s) == false).ToArray()
I need to split a string in C# using a set of delimiter characters. This set should include the default whitespace characters (i.e. what you effectively get when you call String.Split(null, StringSplitOptions.RemoveEmptyEntries)) plus some additional characters that I specify, such as '.', ',', ';', etc. So if I have a char array of those additional characters, how do I add all the default whitespace characters to it, so that I can feed the expanded array to String.Split? Or is there a better way of splitting on my custom delimiter set plus whitespace? Thanks.
Just use the appropriate overload of string.Split if you're at least on .NET 2.0:
char[] separator = new[] { ' ', '.', ',', ';' };
string[] parts = text.Split(separator, StringSplitOptions.RemoveEmptyEntries);
I guess I was downvoted because the answer was incomplete. The OP asked for a way to split on all whitespace characters (there are 25 on my PC) as well as on other delimiters:
public static class StringExtensions
{
static StringExtensions()
{
var whiteSpaceList = new List<char>();
for (int i = char.MinValue; i <= char.MaxValue; i++)
{
char c = Convert.ToChar(i);
if (char.IsWhiteSpace(c))
{
whiteSpaceList.Add(c);
}
}
WhiteSpaces = whiteSpaceList.ToArray();
}
public static readonly char[] WhiteSpaces;
public static string[] SplitWhiteSpacesAndMore(this string str, IEnumerable<char> otherDelimiters, StringSplitOptions options = StringSplitOptions.None)
{
var separatorList = new List<char>(WhiteSpaces);
separatorList.AddRange(otherDelimiters);
return str.Split(separatorList.ToArray(), options);
}
}
Now you can use this extension method in this way:
string str = "word1 word2\tword3.word4,word5;word6";
char[] separator = { '.', ',', ';' };
string[] split = str.SplitWhiteSpacesAndMore(separator, StringSplitOptions.RemoveEmptyEntries);
The answers above do not use all whitespace characters as delimiters, as you state in your request, only the ones specified by the program. In the solution examples above, this is only SPACE, but not TAB, CR, LF, and all the other Unicode-defined whitespace chars.
I have not found a way to retrieve the default whitespace chars from String. However, they are defined in Regex, and you can use that instead of String. In your case, adding period and comma to the Regex whitespace set:
Regex regex = new Regex(@"[\s\.,]+"); // The "+" collapses runs of delimiters, so no blank entries appear between tokens
string input = @"1.2 3, 4";
string[] tokens = regex.Split(input);
will produce
tokens[0] "1"
tokens[1] "2"
tokens[2] "3"
tokens[3] "4"
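One caveat with the regex approach above: the "+" collapses consecutive delimiters, but a delimiter at the very start or end of the input still produces an empty first or last entry. A small sketch of filtering those out follows; the sample input and the extra ';' delimiter are chosen only for illustration, and the top-level statement form assumes C# 9 or later.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// \s is the regex whitespace class; '.', ',' and ';' are the extra delimiters.
Regex delimiters = new Regex(@"[\s.,;]+");

// Leading space and trailing ';' would each yield an empty entry from Split.
string input = " 1.2 3, 4;";

string[] tokens = delimiters
    .Split(input)
    .Where(token => token.Length > 0) // drop the empty first and last entries
    .ToArray();

Console.WriteLine(string.Join("|", tokens)); // 1|2|3|4
```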
str.Split(" .,;".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
I use something like the following to ensure I'm always splitting on Split's default whitespace characters:
public static string[] SplitOnWhitespaceAnd(this string value,
char[] separator, StringSplitOptions options = StringSplitOptions.RemoveEmptyEntries)
=> value.Split().SelectMany(s => s.Split(separator, options)).ToArray();
Note that to be consistent with Microsoft's naming conventions, you'd want to use WhiteSpace rather than Whitespace.
Refer to Microsoft's Char.IsWhiteSpace documentation to see the whitespace characters split on by default.
string[] splitSentence(string sentence)
{
return sentence
.Replace(",", " , ")
.Replace(".", " . ")
.Split(' ', StringSplitOptions.RemoveEmptyEntries);
}
or
string[] result = test.Split(new string[] {"\n", "\r\n"},
StringSplitOptions.RemoveEmptyEntries);
I am trying to split a string into a string[] made up of the words the string originally held, using the following code.
private string[] ConvertWordsFromFile(String NewFileText)
{
char[] delimiterChars = { ' ', ',', '.', ':', '/', '|', '<' , '>','/','@','#','$','%','^','&','*','"','(',')',';'};
string[] words = NewFileText.Split(delimiterChars);
return words;
}
I then use this to add the words to a dictionary that tracks each word as a key and its frequency as the value. Duplicated words are not added as new keys; only their value is incremented. However, the last word is counted as a different word and is therefore made into a new key. How can I fix this?
This is the code I have for adding words to the dictionary :
public void AddWord(String newWord)
{
newWord = newWord.ToLower();
try
{
MyWords.Add(newWord, 1);
}
catch (ArgumentException)
{
MyWords[newWord]++;
}
}
To clarify, the problem I am having is that even if the word at the end of a string is a duplicate, it is still treated as a new word and therefore gets a new key.
A guess: a delimiter at the end of the text produces an empty string that you don't expect. If so, pass the appropriate option to Split:
var words = NewFileText.Split(delimiterChars,
StringSplitOptions.RemoveEmptyEntries);
Split is not the best choice for what you want to do, because you end up with this kind of problem and you also have to specify all the delimiters yourself.
A much better option is to use a regular expression instead of your ConvertWordsFromFile method, as follows:
Regex.Split(theTextToBeSplit, @"\W+")
This line returns an array containing all the 'words'. The next step is to build your dictionary. If you can use LINQ in your code, the easiest and cleanest way to do that is:
var theTextToBeSplit = "#Hi, this is a 'little' test: <I hope it is useful>";
var myDictionary = Regex.Split(theTextToBeSplit, @"\W+")
.GroupBy(x => x)
.ToDictionary(x => x.Key, x => x.Count());
That's all you need.
Good luck!
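For the counting step itself, here is a hedged sketch that avoids using exceptions for control flow: it combines StringSplitOptions.RemoveEmptyEntries with Dictionary.TryGetValue. The delimiter list is a shortened, illustrative one that also includes line-break characters (an assumption, since a trailing newline in file text is one plausible reason the last word looks different), and the sample text is made up. Top-level statements assume C# 9 or later.

```csharp
using System;
using System.Collections.Generic;

// Illustrative delimiter set; '\r', '\n' and '\t' are included so a trailing
// line break does not stay attached to the last word.
char[] delimiterChars = { ' ', ',', '.', ':', ';', '\r', '\n', '\t' };
string text = "The cat saw the dog. The dog saw the cat.\r\n";

var counts = new Dictionary<string, int>();
foreach (string word in text.ToLowerInvariant()
             .Split(delimiterChars, StringSplitOptions.RemoveEmptyEntries))
{
    // TryGetValue leaves current at 0 when the word has not been seen yet.
    counts.TryGetValue(word, out int current);
    counts[word] = current + 1;
}

foreach (var pair in counts)
{
    Console.WriteLine(pair.Key + ": " + pair.Value);
}
```

With this input the counts come out as the: 4, cat: 2, saw: 2, dog: 2, and no empty-string key is created.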
I have a string with the following text:
:0c4b7fcdffc38322555a9e35c22c9469:Nick:194176015020283762507:
How do I parse the final number? i.e.:
194176015020283762507
You should first use String.Split() to separate the string by the colon (':') separators. Then access the correct element.
var input = ":0c4b7fcdffc38322555a9e35c22c9469:Nick:194176015020283762507:";
var split = input.Split(':');
var final = split[3];
Note that by default, Split() keeps empty entries. You will have one at the beginning and end, because of the initial and ending colons. You could also use:
var split = input.Split(new[] {':'}, StringSplitOptions.RemoveEmptyEntries);
var final = split[2];
which, as the option implies, removes empty entries from the array. So your number would be at index 2 instead of 3.
string str = ":0c4b7fcdffc38322555a9e35c22c9469:Nick:194176015020283762507:";
string num = str.Split(':')[3];
var finalNumber = input.Split(new char[] { ':' }, StringSplitOptions.RemoveEmptyEntries)
.Last();
This code will split your input string into strings, separated by : (empty strings are removed from start and end of sequence). And last string is returned, which is your finalNumber.
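If the goal is a numeric value rather than the digits as text, note that this particular number has 21 digits, which is too large for long or ulong. A minimal sketch using System.Numerics.BigInteger follows; whether a numeric type is needed at all is an assumption, since the question may only want the substring. Top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;
using System.Numerics;

string input = ":0c4b7fcdffc38322555a9e35c22c9469:Nick:194176015020283762507:";

// Take the last non-empty field between the colons.
string lastField = input
    .Split(new[] { ':' }, StringSplitOptions.RemoveEmptyEntries)
    .Last();

// The value exceeds ulong.MaxValue, so BigInteger is used instead of long/ulong.
bool parsed = BigInteger.TryParse(lastField, out BigInteger number);

Console.WriteLine(parsed ? number.ToString() : "not a number"); // 194176015020283762507
```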
I may have just hit the point where I'm overthinking it, but I'm wondering: is there a way to designate a list of special characters that should all be considered delimiters, and then split a string using that list? Example:
"battlestar.galactica-season 1"
should be returned as
battlestar galactica season 1
I'm thinking regex, but I'm kind of flustered at the moment; I've been staring at it for too long.
EDIT:
Thanks, guys, for confirming my suspicion that I was overthinking it. Here is what I ended up with:
//remove the delimiter
string[] tempString = fileTitle.Split(@"\/.-<>".ToCharArray());
fileTitle = "";
foreach (string part in tempString)
{
fileTitle += part + " ";
}
return fileTitle;
I suppose I could just replace the delimiters with spaces as well. I will select an answer as soon as the timer is up!
The built-in String.Split method can take a collection of characters as delimiters.
string s = "battlestar.galactica-season 1";
string[] words = s.Split('.', '-');
The standard split method does that for you. It takes an array of characters:
public string[] Split(
params char[] separator
)
You can just call an overload of split:
myString.Split(new char[] { '.', '-', ' ' }, StringSplitOptions.RemoveEmptyEntries);
The char array is a list of delimiters to split on.
"battlestar.galactica-season 1".Split(new string[] { ".", "-" }, StringSplitOptions.RemoveEmptyEntries);
This may not be complete but something like this.
string value = "battlestar.galactica-season 1";
char[] delimiters = new char[] { '\r', '\n', '.', '-' };
string[] parts = value.Split(delimiters,
StringSplitOptions.RemoveEmptyEntries);
for (int i = 0; i < parts.Length; i++)
{
Console.WriteLine(parts[i]);
}
Are you trying to split the string (make multiple strings) or do you just want to replace the special characters with a space as your example might also suggest (make 1 altered string).
For the first option just see the other answers :)
If you want to replace you could use
string title = "battlestar.galactica-season 1".Replace('.', ' ').Replace('-', ' ');
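When there are more than a couple of delimiters, chaining Replace calls gets unwieldy. One alternative, sketched here with the delimiter set taken from the asker's edit, is to split on all of them and rejoin the parts with string.Join. The variable names are illustrative, and the top-level statement form assumes C# 9 or later.

```csharp
using System;

string fileTitle = "battlestar.galactica-season 1";

// Delimiters from the question's edit: \ / . - < >
char[] delimiters = { '\\', '/', '.', '-', '<', '>' };

// Split on every delimiter, skip empty parts, and rejoin with single spaces.
string cleaned = string.Join(" ",
    fileTitle.Split(delimiters, StringSplitOptions.RemoveEmptyEntries));

Console.WriteLine(cleaned); // battlestar galactica season 1
```

Unlike the loop that appends part + " ", this leaves no trailing space, and RemoveEmptyEntries prevents doubled spaces when two delimiters are adjacent.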
For more information on Split, with easy examples, see the following URL:
It also covers splitting on words (multiple characters).
C# Split Function explained