Concatenating a string array with semicolons - C#

This pattern occurs quite often in one part of our framework.
Given an array of strings, we have to concatenate all of them, separated by semicolons.
I'd like to know how this can be done elegantly.
I've seen some variations across our codebase, and whenever I have to do this, I have to think it through again.
My current pattern is this:
String[] values = new String[] {"a","b","c","d"};
String concat = String.Empty;
foreach(String s in values)
{
    // Add the separator before every item except the first
    if(String.IsNullOrEmpty(concat) == false)
        concat += ", ";
    concat += s;
}
What nags me is the if statement. I could insert the first item before the loop and then use a for loop starting at index 1, but this doesn't improve readability.
What are your suggestions?

You can use string.Join():
String[] values = new String[] {"a","b","c","d"};
var concat = string.Join(", ", values);
This will result in something looking like this:
a, b, c, d

Try this, which also skips null or empty entries:
var result = string.Join(",", values.Where(s => !string.IsNullOrEmpty(s)));
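The two answers above can be combined into a small helper. This is only a sketch: the class name JoinExample and the method name JoinNonEmpty are made up for illustration, and it assumes .NET 4 or later, where string.Join accepts an IEnumerable<string>.

```csharp
using System;
using System.Linq;

public static class JoinExample
{
    // Hypothetical helper (not from the original answers):
    // joins all non-null, non-empty entries with the given separator.
    public static string JoinNonEmpty(string separator, params string[] values)
    {
        return string.Join(separator, values.Where(s => !string.IsNullOrEmpty(s)));
    }
}
```

For example, JoinExample.JoinNonEmpty(";", "a", null, "b", "", "c") gives a;b;c, with no special case for the first or last item.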

Related

C# Concat multiline string

I have two strings, A and B.
string A = @"Hello_
Hello_
Hello_";
string B = @"World
World
World";
I want to combine these two strings line by line with a function call that could look like this:
string AB = ConcatMultilineString(A, B);
The function should return:
@"Hello_World
Hello_World
Hello_World"
The best way I found was to split both strings into arrays of lines, concatenate the lines pairwise, join the results with "\r\n", and return that. But that seems like bad practice to me, since line breaks are not always "\r\n".
Is there a way to do this which is more reliable?
For a one line solution:
var output = string.Join(System.Environment.NewLine, A.Split('\n')
.Zip(B.Split('\n'), (a,b) => string.Join("", a, b)));
We split on \n because both \r\n and plain \n line endings contain \n. With \r\n input, a \r is left over at the end of each piece, so add a call to TrimEnd('\r') (or Trim) on a and b to be safe.
Environment.NewLine is a platform-agnostic alternative to using "\r\n".
Environment.NewLine might help with the issue that multiple lines are not always separated by "\r\n":
https://msdn.microsoft.com/de-de/library/system.environment.newline(v=vs.110).aspx
Edit:
If you don't know whether the lines are separated by "\n" or "\r\n", this might help:
input.Split(new string[] {"\n", "\r\n"}, StringSplitOptions.RemoveEmptyEntries);
Empty lines are removed. If you don't want that, use StringSplitOptions.None instead.
See also here: How to split strings on carriage return with C#?
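As a rough sketch of the split suggested above, here is a helper that accepts all three common line endings. The names LineSplitExample and SplitLines are made up for illustration. "\r\n" is listed first so that a Windows line break is treated as one separator rather than two.

```csharp
using System;

public static class LineSplitExample
{
    // Hypothetical helper: splits text on "\r\n", "\n" or "\r".
    // "\r\n" comes first so it wins when several separators match at the same position.
    public static string[] SplitLines(string text)
    {
        return text.Split(new[] { "\r\n", "\n", "\r" }, StringSplitOptions.None);
    }
}
```

StringSplitOptions.None keeps empty lines, so two inputs with the same number of lines still line up when they are zipped together afterwards.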
This does as you asked:
static string ConcatMultilineString(string a, string b)
{
string splitOn = "\r\n|\r|\n";
string[] p = Regex.Split(a, splitOn);
string[] q = Regex.Split(b, splitOn);
return string.Join("\r\n", p.Zip(q, (u, v) => u + v));
}
static void Main(string[] args)
{
string A = "Hello_\rHello_\r\nHello_";
string B = "World\r\nWorld\nWorld";
Console.WriteLine(ConcatMultilineString(A, B));
Console.ReadLine();
}
Outputs:
Hello_World
Hello_World
Hello_World
I think a fully generic approach is impossible if you load strings that were created on different platforms (Linux, Mac, Windows) or even strings that contain HTML line breaks or the like.
I think you will have to define the line break yourself.
string a = getA();
string b = getB();
// Normalize every line break to "\n"; "\r\n" is handled first so it does not turn into two breaks
string[] lineBreakers = {"\r\n", "\r"};
string replaceWith = "\n";
foreach(string lineBreaker in lineBreakers)
{
    // Strings are immutable, so the result of Replace has to be assigned back
    a = a.Replace(lineBreaker, replaceWith);
    b = b.Replace(lineBreaker, replaceWith);
}
string[] aLines = a.Split('\n');
string[] bLines = b.Split('\n');
string newLine = Environment.NewLine;
if(aLines.Length == bLines.Length)
{
    return string.Concat(aLines.Zip(bLines, (p, q) => $"{p}{q}{newLine}"));
}

Shortcut for splitting only once in C#?

Okay, let's say I have a string:
string text = "one|two|three";
If I do string[] texts = text.Split('|'); I will end up with a string array of three elements. However, this isn't what I want. What I actually want is to split the string only once, so the two strings I would get are:
one
two|three
Additionally, is there a way to do a single split with the last occurrence in a string? So I get:
one|two
three
As well, is there a way to split by a string instead of a character? So I could do Split("||")?
The Split method has an overload that takes a count as a parameter. Pass 2 there, which says that you want at most 2 elements, and you'll get the expected result.
For the second question: there is no built-in way AFAIK. You may need to implement it yourself by splitting everything and joining all but the last element back together.
C#'s String.Split() can take a second argument that can define the number of elements to return:
string[] texts = text.Split(new char[] { '|' }, 2);
For your first scenario, you can pass a parameter of how many strings to split into.
var text = "one|two|three";
var result = text.Split(new char[] { '|' }, 2);
Your second scenario requires a little more magic.
var text = "one|two|three";
var list = text.Split('|');
var result = new string[] { string.Join("|", list, 0, list.Length - 1), list[list.Length - 1] };
This code has not been tested, so check the results before using it.
Well, I took it as a challenge to do your second one in one line. The result is... not pretty, mostly because it's surprisingly difficult to reverse a string and keep it as a string.
string text = "one|two|three";
var result = new String(text.Reverse().ToArray()).Split(new char[] {'|'}, 2).Reverse().Select(c => new String(c.Reverse().ToArray()));
Basically, you reverse it, then follow the same procedure as the first one, then reverse each individual one, as well as the resulting array.
You can simply do it like this as well...
// To split at the first occurrence of '|'
if(text.Contains('|')){
    beginning = text.Substring(0, text.IndexOf('|'));
    ending = text.Substring(text.IndexOf('|') + 1); // + 1 skips the separator itself
}
// To split at the last occurrence of '|'
if(text.Contains('|')){
    beginning = text.Substring(0, text.LastIndexOf('|'));
    ending = text.Substring(text.LastIndexOf('|') + 1); // + 1 skips the separator itself
}
The second question was fun. I solved it this way:
string text = "one|two|three";
var result =
new []
{
    // Everything before the last '|' (strictly less than, so the separator is not included)
    string.Concat(text.ToCharArray().TakeWhile((c, i) => i < text.LastIndexOf("|"))),
    // Everything after the last '|'
    string.Concat(text.ToCharArray().SkipWhile((c, i) => i <= text.LastIndexOf("|")))
};
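For comparison, here is a sketch that covers all three parts of the question (first occurrence, last occurrence, and a string separator) with IndexOf and LastIndexOf. The class and method names are made up for illustration, and the code assumes the separator is a non-empty string.

```csharp
using System;

public static class SplitOnceExample
{
    // Hypothetical helper: splits at the first occurrence of separator.
    // Returns a single-element array if the separator does not occur.
    public static string[] SplitAtFirst(string text, string separator)
    {
        int i = text.IndexOf(separator, StringComparison.Ordinal);
        return i < 0
            ? new[] { text }
            : new[] { text.Substring(0, i), text.Substring(i + separator.Length) };
    }

    // Hypothetical helper: splits at the last occurrence of separator.
    public static string[] SplitAtLast(string text, string separator)
    {
        int i = text.LastIndexOf(separator, StringComparison.Ordinal);
        return i < 0
            ? new[] { text }
            : new[] { text.Substring(0, i), text.Substring(i + separator.Length) };
    }
}
```

With text = "one|two|three", SplitAtFirst(text, "|") gives one and two|three, and SplitAtLast(text, "|") gives one|two and three. Because the separator is a string, "||" works the same way.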

Regular expression: split a string and put what's in brackets [ ] into an array

I am trying to use regex to split the string into 2 arrays to turn out like this.
String str1 = "First Second [insideFirst] Third Forth [insideSecond] Fifth";
How do I split str1 to break off into 2 arrays that look like this:
ary1 = ['First Second','Third Forth','Fifth'];
ary2 = ['insideFirst','insideSecond'];
Here is my solution:
string str = "First Second [insideFirst] Third Forth [insideSecond] Fifth";
MatchCollection matches = Regex.Matches(str,@"\[.*?\]");
string[] arr = matches.Cast<Match>()
.Select(m => m.Groups[0].Value.Trim(new char[]{'[',']'}))
.ToArray();
foreach (string s in arr)
{
Console.WriteLine(s);
}
string[] arr1 = Regex.Split(str,@"\[.*?\]")
.Select(x => x.Trim())
.ToArray();
foreach (string s in arr1)
{
Console.WriteLine(s);
}
Output
insideFirst
insideSecond
First Second
Third Forth
Fifth
Please try the code below. It works fine for me.
String str1 = "First Second [insideFirst] Third Forth [insideSecond] Fifth";
var output = String.Join(";", Regex.Matches(str1, @"\[(.+?)\]")
.Cast<Match>()
.Select(m => m.Groups[1].Value));
string[] strInsideBreacket = output.Split(';');
for (int i = 0; i < strInsideBreacket.Count(); i++)
{
str1 = str1.Replace("[", ";");
str1 = str1.Replace("]", "");
str1 = str1.Replace(strInsideBreacket[i], "");
}
string[] strRemaining = str1.Split(';');
Here, strInsideBreacket is the array of bracketed values (insideFirst and insideSecond), and strRemaining is the array of the remaining text (First Second, Third Forth and Fifth).
Try this solution:
String str1 = "First Second [insideFirst] Third Forth [insideSecond] Fifth";
var allWords = str1.Split(new char[] { '[', ']' }, StringSplitOptions.RemoveEmptyEntries);
var result = allWords.GroupBy(x => x.Contains("inside")).ToArray();
The idea is to first split the string into all its parts and then group them.
It seems to me that "user2828970" asked a question with an example, not with literal text he wanted to parse. In my mind, he could very well have asked this question:
I am trying to use regex to split a string like so.
var exampleSentence = "I had 185 birds but 20 of them flew away";
var regexSplit = Regex.Split(exampleSentence, @"\d+");
The result of regexSplit is: I had, birds but, of them flew away.
However, I also want to know the value which resulted in the second string splitting away from its preceding text, and the value which resulted in the third string splitting away from its preceding text. i.e.: I want to know about 185 and 20.
The string could be anything, and the pattern to split by could be anything. The answer should not have hard-coded values.
Well, this simple function will perform that task. The code can be optimized to compile the regex, or re-organized to return multiple collections or different objects. But this is (nearly) the way I use it in production code.
public static List<Tuple<string, string>> RegexSplitDetail(this string text, string pattern)
{
var splitAreas = new List<Tuple<string, string>>();
var regexResult = Regex.Matches(text, pattern);
var regexSplit = Regex.Split(text, pattern);
for (var i = 0; i < regexSplit.Length; i++)
splitAreas.Add(new Tuple<string, string>(i == 0 ? null : regexResult[i - 1].Value, regexSplit[i]));
return splitAreas;
}
...
var result = exampleSentence.RegexSplitDetail(@"\d+");
This would return a single collection which looks like this:
{ null, "I had "}, // First value, had no value splitting it from a predecessor
{"185", " birds but "}, // Second value, split from the preceding string by "185"
{ "20", " of them flew away"} // Third value, split from the preceding string by "20"
Since this is a .NET question, and apart from the approach I prefer in my other answer, you can also capture the split values in another very simple way. You then just need to write a function that uses the results as you see fit.
var exampleSentence = "I had 185 birds but 20 of them flew away";
var regexSplit = Regex.Split(exampleSentence, @"(\d+)");
The result of regexSplit is: I had, 185, birds but, 20, of them flew away. As you can see, the split values appear within the split results.
Note the subtle difference compared to my other answer: in this regex split, I put a capture group around the entire pattern, (\d+). You might not expect that to work, but it does.
Using a capture group in a split inserts the captured values into the result, between the pieces they separated. This can get messy, so I don't suggest doing it. It also forces anybody using your function(s) to know that they have to wrap their regexes in a capture group.
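One way around that last concern is to add the capture group inside the helper, so callers pass a plain pattern. A minimal sketch, with made-up names (CaptureSplitExample, SplitKeepingDelimiters), assuming the caller's pattern contains no capture groups of its own:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureSplitExample
{
    // Hypothetical helper: wraps the pattern in a capture group so that
    // Regex.Split returns the matched delimiters between the split pieces.
    // Assumes the pattern itself contains no capture groups.
    public static string[] SplitKeepingDelimiters(string text, string pattern)
    {
        return Regex.Split(text, "(" + pattern + ")");
    }
}
```

For the example sentence and the pattern \d+, the result has five elements: the text pieces at the even indexes and the delimiters 185 and 20 at indexes 1 and 3.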

What's the best way to merge strings?

Let's say I have a foreach-loop with strings like this:
String newStr="";
String str="a b c d e";
foreach(String strChar in str.Split(' ')) {
newStr+=strChar+',';
}
The result would be something like a,b,c,d,e, (with a trailing comma), but what I want is a,b,c,d,e without the last comma. I normally strip the last comma afterwards, but this seems ugly and heavyweight. Is there a lightweight way to do this?
In addition: is there an easy way to add an "and" so that the result is something like a, b, c, d and e for user output?
P.S.: I know that I can use the Replace method in this example, but that is not what I'm looking for, because in most cases you can't use it (for example when you build a SQL string).
I would use string.Join:
string newStr = string.Join(",", str.Split(' '));
Alternatively, you could add the separator at the start of the body of the loop, but not on the first time round.
I'd suggest using StringBuilder if you want to keep doing this by hand though. In fact, with a StringBuilder you could just unconditionally append the separator, and then decrement the length at the end to trim that end.
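A minimal sketch of that StringBuilder idea, with made-up names (BuilderJoinExample, JoinWithBuilder): append the separator unconditionally, then shorten the builder once at the end.

```csharp
using System.Text;

public static class BuilderJoinExample
{
    // Hypothetical helper: appends every part followed by the separator,
    // then trims the trailing separator by reducing the builder's Length.
    public static string JoinWithBuilder(string[] parts, string separator)
    {
        var sb = new StringBuilder();
        foreach (string part in parts)
        {
            sb.Append(part).Append(separator);
        }
        // Only trim if at least one separator was appended
        if (parts.Length > 0)
        {
            sb.Length -= separator.Length;
        }
        return sb.ToString();
    }
}
```

In everyday code string.Join does the same job. The sketch is only meant to show the Length trick for cases where the loop has to do other work as well.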
You also wrote:
for example when you build a sql string
It's very rarely a good idea to build a SQL string like this. In particular, you should absolutely not use strings from user input here - use parameterized SQL instead. Building SQL is typically the domain of ORM code... in which case it's usually better to use an existing ORM than to roll your own :)
You're characterizing the problem as appending a comma after every string except the last. Consider characterizing it as prepending a comma before every string but the first. It's an easier problem.
As for your harder version there are several dozen solutions on my blog and in this question.
Eric Lippert's challenge "comma-quibbling", best answer?
string.Join may be your friend:
String str="a b c d e";
var newStr = string.Join(",", str.Split(' '));
Here's how you can do it where you have "and" before the last value.
var vals = str.Split(' ');
var ans = vals.Length == 1 ?
    str :
    string.Join(", ", vals.Take(vals.Length - 1)) + ", and " + vals.Last();
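If you want exactly the format from the question (a, b, c, d and e, with no comma before "and"), a variation could look like the sketch below. The names AndJoinExample and JoinWithAnd are made up for illustration.

```csharp
using System;
using System.Linq;

public static class AndJoinExample
{
    // Hypothetical helper: joins items with ", " and puts " and " before the last one.
    public static string JoinWithAnd(string[] items)
    {
        if (items.Length == 0) return string.Empty;
        if (items.Length == 1) return items[0];
        // Join everything except the last item, then attach the last item with " and "
        return string.Join(", ", items.Take(items.Length - 1)) + " and " + items[items.Length - 1];
    }
}
```

The empty and single-item cases are handled up front, so the last line can assume there are at least two items.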
newStr = String.Join(",", str.Split(' '));
You can use Regex and replace whitespaces with commas
string newst = Regex.Replace(input, " ", ",");
First, you should be using a StringBuilder for string manipulations of this sort. Second, it's just an if conditional on the insert.
System.Text.StringBuilder newStr = new System.Text.StringBuilder("");
string oldStr = "a b c d e";
foreach(string c in oldStr.Split(' ')) {
if (newStr.Length > 0) newStr.Append(",");
newStr.Append(c);
}

string replace using a List<string>

I have a List of words I want to ignore like this one :
public List<String> ignoreList = new List<String>()
{
"North",
"South",
"East",
"West"
};
For a given string, say "14th Avenue North" I want to be able to remove the "North" part, so basically a function that would return "14th Avenue " when called.
I feel like there is something I should be able to do with a mix of LINQ, regex and replace, but I just can't figure it out.
The bigger picture is, I'm trying to write an address matching algorithm. I want to filter out words like "Street", "North", "Boulevard", etc. before I use the Levenshtein algorithm to evaluate the similarity.
How about this:
string.Join(" ", text.Split().Where(w => !ignoreList.Contains(w)));
or for .NET 3.5:
string.Join(" ", text.Split().Where(w => !ignoreList.Contains(w)).ToArray());
Note that this method splits the string up into individual words so it only removes whole words. That way it will work properly with addresses like Northampton Way #123 that string.Replace can't handle.
Regex r = new Regex(string.Join("|", ignoreList.Select(s => Regex.Escape(s)).ToArray()));
string s = "14th Avenue North";
s = r.Replace(s, string.Empty);
Something like this should work:
string FilterAllValuesFromIgnoreList(string someStringToFilter)
{
return ignoreList.Aggregate(someStringToFilter, (str, filter)=>str.Replace(filter, ""));
}
What's wrong with a simple for loop?
string street = "14th Avenue North";
foreach (string word in ignoreList)
{
street = street.Replace(word, string.Empty);
}
If you know that the list of words contains only characters that do not need escaping inside a regular expression then you can do this:
string s = "14th Avenue North";
Regex regex = new Regex(string.Format(@"\b({0})\b",
string.Join("|", ignoreList.ToArray())));
s = regex.Replace(s, "");
Result:
14th Avenue
If there are special characters you will need to fix two things:
Use Regex.Escape on each element of ignore list.
The word-boundary \b will not match a whitespace followed by a symbol or vice versa. You may need to check for whitespace (or other separating characters such as punctuation) using lookaround assertions instead.
Here's how to fix these two problems:
Regex regex = new Regex(string.Format(@"(?<= |^)({0})(?= |$)",
string.Join("|", ignoreList.Select(x => Regex.Escape(x)).ToArray())));
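Put together as a helper, the escaped, lookaround-based version could look like the sketch below. The names IgnoreWordsExample and RemoveIgnoredWords are made up, the lookarounds use \s instead of a literal space, and the whitespace cleanup at the end is an addition for the address-matching use case, not part of the original answer. It assumes the ignore list is not empty.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class IgnoreWordsExample
{
    // Hypothetical helper: removes whole words from text that appear in ignoreList.
    // Assumes ignoreList contains at least one entry.
    public static string RemoveIgnoredWords(string text, IEnumerable<string> ignoreList)
    {
        // Escape each word so special characters are matched literally
        string alternatives = string.Join("|", ignoreList.Select(w => Regex.Escape(w)));
        // Match a word only when it is surrounded by whitespace or the string boundaries
        string pattern = @"(?<=\s|^)(" + alternatives + @")(?=\s|$)";
        string withoutWords = Regex.Replace(text, pattern, string.Empty);
        // Collapse the whitespace left behind and trim the ends
        return Regex.Replace(withoutWords, @"\s+", " ").Trim();
    }
}
```

With the ignore list from the question, "14th Avenue North" becomes "14th Avenue", while "Northampton Way" is left alone because North is not a whole word there.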
If it's a short string as in your example, you can just loop through the strings and replace one at a time. If you want to get fancy you can use the LINQ Aggregate method to do it:
address = ignoreList.Aggregate(address, (a, s) => a.Replace(s, String.Empty));
If it's a large string, that would be slow. Instead you can replace all strings in a single run through the string, which is much faster. I made a method for that in this answer.
LINQ makes this easy and readable. This requires normalized data though, particularly in that it is case-sensitive.
List<string> ignoreList = new List<string>()
{
"North",
"South",
"East",
"West"
};
string s = "123 West 5th St"
    .Split(' ') // Separate the words into an array
    .ToList() // Convert the array to a List<string>
    .Except(ignoreList) // Remove ignored keywords (Except is a set operation, so duplicate words are dropped too)
    .Aggregate((s1, s2) => s1 + " " + s2); // Reconstruct the string
Why not just keep it simple?
public static string Trim(string text)
{
    var rv = text.Trim();
    foreach (var ignore in ignoreList) {
        if (rv.EndsWith(ignore)) {
            rv = rv.Replace(ignore, string.Empty);
        }
    }
    return rv;
}
You can do this using an expression if you like, but it's easier to turn it around than to use Aggregate. I would do something like this:
string s = "14th Avenue North";
ignoreList.ForEach(i => s = s.Replace(i, ""));
//result is "14th Avenue "
public static string Trim(string text)
{
var rv = text;
foreach (var ignore in ignoreList)
rv = rv.Replace(ignore, "");
return rv;
}
Updated For Gabe
public static string Trim(string text)
{
    var rv = "";
    var words = text.Split(' ');
    foreach (var word in words)
    {
        var present = false;
        foreach (var ignore in ignoreList)
            if (word == ignore)
                present = true;
        if (!present)
            rv += (rv.Length > 0 ? " " : "") + word; // keep the remaining words separated by spaces
    }
    return rv;
}
If you have a list, I think you're going to have to touch all the items. You could create a massive RegEx with all your ignore keywords and replace to String.Empty.
Here's a start:
(^|\s+)(North|South|East|West){1,2}(ern)?(\s+|$)
If you have a single RegEx for ignore words, you can do a single replace for each phrase you want to pass to the algorithm.
