Split string into substrings based on the separator from the left - c#

In C#, is there an elegant way to split a string like "a.b.c" into a, a.b, a.b.c?
The number of separators is not fixed, so the input could be "a.b", which outputs {a, a.b}, or "a.b.c.d", which outputs {a, a.b, a.b.c, a.b.c.d}.
The only approach I can think of is to split the string into individual components and then concatenate them again.
This is what I have so far:
var fieldNames = new List<string>();
var fieldSeparator = '.';
var myString = "a.b.c.d";
var individualFields = myString.Split(fieldSeparator);
string name = "";
foreach(var fieldName in individualFields)
{
name = string.IsNullOrEmpty(name) ? fieldName : $"{name}{fieldSeparator}{fieldName}";
fieldNames.Add(name);
}

Maybe this extension?
public static string[] SplitCombineFirst(this string str, params string[] delimiter)
{
string[] tokens = str.Split(delimiter, StringSplitOptions.RemoveEmptyEntries);
var allCombinations = new List<string>(tokens.Length);
for(int take = 1; take <= tokens.Length; take++)
{
string combination = string.Join(delimiter[0], tokens.Take(take));
allCombinations.Add(combination);
}
return allCombinations.ToArray();
}
Call:
string[] result = "a.b.c".SplitCombineFirst(".");

This looks like a classic case for recursion.
List<string> splitCombine(string source, string delimiter, int startIndex)
{
List<string> result = new List<string>();
var indx = source.IndexOf(delimiter, startIndex);
if (indx >= 0)
{
if (indx > 0)
{
result.Add(source.Substring(0, indx));
}
// Continue after the whole delimiter so multi-character delimiters also work.
result.AddRange(splitCombine(source, delimiter, indx + delimiter.Length));
}
else
{
result.Add(source);
}
return result;
}
Call:
var result = splitCombine("a.b.c.d.e", ".", 0);
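As a non-recursive alternative, the prefixes can also be collected by scanning for each separator with IndexOf and taking the substring up to it. The following is a minimal sketch under that assumption; the names PrefixSplitter and Prefixes are made up for illustration and do not come from the answers above.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixSplitter
{
    // Hypothetical helper: returns every prefix of source that ends just
    // before a separator, followed by the full string itself.
    public static List<string> Prefixes(string source, string separator)
    {
        var result = new List<string>();
        if (string.IsNullOrEmpty(source)) return result;
        if (string.IsNullOrEmpty(separator))
        {
            result.Add(source);
            return result;
        }

        int index = source.IndexOf(separator, StringComparison.Ordinal);
        while (index >= 0)
        {
            // Skip an empty prefix when the string starts with the separator.
            if (index > 0) result.Add(source.Substring(0, index));
            index = source.IndexOf(separator, index + separator.Length, StringComparison.Ordinal);
        }

        result.Add(source); // the whole string is always the last entry
        return result;
    }
}
```

Usage: PrefixSplitter.Prefixes("a.b.c", ".") yields a, a.b, a.b.c. Unlike the Split-and-Join extension, this keeps empty segments such as the one in "a..b" exactly as they appear in the input.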

Related

Extract index numbers from list of string

I have a list of strings, some of which contain an index, and I need to extract those indexes and put them in a List<int>.
Here is an example list:
List<string> values = new List<string>();
values.Add("cohabitantGender");
values.Add("additionalDriver0LastName");
values.Add("additionalDriver0AgeWhenLicensed");
values.Add("vehicle0City");
values.Add("vehicle1City");
values.Add("vehicle2City");
values.Add("vehicle3City");
From this list I need to extract the indexes from the values of the form vehicleXCity.
I have this code right now:
public static List<int> FormObjectIndexExtractor(List<string> values, string prefix, string suffix)
{
var selectedMatches = values.Where(v => v.StartsWith(prefix) && v.EndsWith(suffix)).ToList();
var indexes = new List<int>();
foreach (var v in selectedMatches) indexes.Add(int.Parse(Regex.Match(v, @"\d+").Value));
return indexes;
}
And I'm using it like this:
List<int> indexes = FormObjectIndexExtractor(values, "vehicle", "City");
But a value like vehicle4AnotherCity also passes the prefix and suffix check, so its index is extracted even though it should not be.
Does anyone have an alternative to this code that may help?
Below is a helper class (a List<string> subclass) in case you have a larger list and need multiple exclusion options:
public class INumberList : List<string>
{
public List<int> GetNumberList()
{
List<int> numberList = new List<int>();
for (int i = 0; i < this.Count; i++)
{
numberList.Add(GetIntFromString(this[i]));
}
return numberList;
}
public INumberList ExcludeIndex(string prefix, string suffix)
{
// Iterate backwards so that removing an item does not skip the one after it.
for (int i = this.Count - 1; i >= 0; i--)
{
if (this[i].StartsWith(prefix) && this[i].EndsWith(suffix))
{
// Remove entries that are not needed.
this.RemoveAt(i);
}
}
return this;
}
public static int GetIntFromString(String input)
{
// Remove everything that is not a digit.
String inputCleaned = Regex.Replace(input, "[^0-9]", "");
int value = 0;
// Tries to parse the int, returns false on failure.
if (int.TryParse(inputCleaned, out value))
{
// The result from parsing can be safely returned.
return value;
}
return 0; // Or any other default value.
}
}
Then use like this:
INumberList values = new INumberList();
values.Add("cohabitantGender");
values.Add("additionalDriver0LastName");
values.Add("additionalDriver0AgeWhenLicensed");
values.Add("vehicle0City");
values.Add("vehicle1City");
values.Add("vehicle2City");
values.Add("vehicle3City");
//Get filtered index list with multiple exclusion option
List<int> indexList = values.ExcludeIndex("cohabitantGender","")
.ExcludeIndex("additionalDriver","AgeWhenLicensed")
.GetNumberList();
//will return [0,0,1,2,3]
Try this:
public static List<int> FormObjectIndexExtractor(List<string> values, string prefix, string suffix)
{
var s = "^" + prefix + @"(\d+)\.*?" + suffix + "$";
return values
.Where(v => Regex.Match(v, s).Success)
.Select(v=> int.Parse(Regex.Match(v, s).Groups[1].Value))
.ToList();
}
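A variation on the anchored-regex idea, sketched under a few assumptions: the prefix and suffix are escaped so they are taken literally, each value is matched only once, and only ASCII digits are accepted. The names IndexExtractor and Extract are hypothetical and used only for this illustration.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class IndexExtractor
{
    // Hypothetical helper: returns the number found between prefix and suffix
    // for every value that consists of exactly prefix + digits + suffix.
    public static List<int> Extract(IEnumerable<string> values, string prefix, string suffix)
    {
        // Escape prefix and suffix so characters such as '.' or '(' are literal.
        var regex = new Regex("^" + Regex.Escape(prefix) + "([0-9]+)" + Regex.Escape(suffix) + "$");
        var indexes = new List<int>();
        foreach (string value in values)
        {
            Match match = regex.Match(value); // one match per value
            // TryParse guards against digit runs too long to fit in an int.
            if (match.Success &&
                int.TryParse(match.Groups[1].Value, NumberStyles.None,
                             CultureInfo.InvariantCulture, out int index))
            {
                indexes.Add(index);
            }
        }
        return indexes;
    }
}
```

With the list from the question plus vehicle4AnotherCity, Extract(values, "vehicle", "City") returns 0, 1, 2, 3, because the anchors leave no room for extra text between the digits and the suffix.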
Here is a solution (using modern C# features) that does not use Regex:
public static List<int> FormObjectIndexExtractor(IEnumerable<string> values, string prefix, string suffix)
{
int? TryParseItem(string val)
{
if (val.Length <= prefix.Length + suffix.Length || !val.StartsWith(prefix) || !val.EndsWith(suffix))
return null;
var subStr = val.Substring(prefix.Length, val.Length - prefix.Length - suffix.Length);
if (int.TryParse(subStr, out var number))
return number;
return null;
}
return values.Select(TryParseItem).Where(v => v.HasValue).Select(v => v.Value).ToList();
}
This version splits all the parts of string.
public static List<int> FormObjectIndexExtractor(List<string> values, string prefix, string suffix)
{
List<int> ret = new List<int>();
Regex r = new Regex("^([a-zA-Z]+)(\\d+)([a-zA-Z]+)$");
foreach (var s in values)
{
var match = r.Match(s);
if (match.Success)
{
if (match.Groups[1].ToString() == prefix && match.Groups[3].ToString() == suffix)
{
ret.Add(int.Parse(match.Groups[2].ToString()));
}
}
}
return ret;
}
or alternatively:
public static List<int> FormObjectIndexExtractor(List<string> values, string prefix, string suffix)
{
List<int> ret = new List<int>();
Regex r = new Regex($@"^{prefix}(\d+){suffix}$");
foreach (var s in values)
{
var match = r.Match(s);
if (match.Success)
{
ret.Add(int.Parse(match.Groups[1].ToString()));
}
}
return ret;
}
This is the more generic version.
Regex match:
starts with 'vehicle'
matches a number
ends with 'City'.
Parse and return as List<int>:
var indexes = values.Where(a => Regex.IsMatch(a, @"^vehicle\d+City$"))
.Select(k => int.Parse(Regex.Match(k, @"\d+").Value)).ToList();

Split the string after a word

I want to split a string after a word, not after a character.
Example string:
A-quick-brown-fox-jumps-over-the-lazy-dog
I want to split the string after "jumps-".
Can I use the stringname.Split("jumps-") function?
I want the following output:
over-the-lazy-dog.
I suggest using IndexOf and Substring since you actually want a suffix ("String after word"), not a split:
string source = "A-quick-brown-fox-jumps-over-the-lazy-dog";
string split = "jumps-";
// over-the-lazy-dog
string result = source.Substring(source.IndexOf(split) + split.Length);
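The one-liner above assumes the word is always present. When IndexOf returns -1, adding split.Length produces an unrelated start position, so Substring either returns the wrong tail or throws. A guarded variant might look like the sketch below; SuffixFinder and TryGetAfter are hypothetical names chosen for this example.

```csharp
using System;

public static class SuffixFinder
{
    // Hypothetical helper: sets result to the text after the first occurrence
    // of marker and returns true, or returns false when marker is absent.
    public static bool TryGetAfter(string source, string marker, out string result)
    {
        int index = source.IndexOf(marker, StringComparison.Ordinal);
        if (index < 0)
        {
            result = null; // caller decides what a missing marker means
            return false;
        }

        result = source.Substring(index + marker.Length);
        return true;
    }
}
```

For the example string, TryGetAfter(source, "jumps-", out var tail) returns true and sets tail to over-the-lazy-dog.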
var theString = "A-quick-brown-fox-jumps-over-the-lazy-dog.";
var afterJumps = theString.Split(new[] { "jumps-" }, StringSplitOptions.None)[1]; //Index 0 would be what is before 'jumps-', index 1 is after.
I usually use extension methods:
public static string join(this string[] strings, string delimiter) { return string.Join(delimiter, strings); }
public static string[] splitR(this string str, params string[] delimiters) { return str.Split(delimiters, StringSplitOptions.RemoveEmptyEntries); }
//public static string[] splitL(this string str, string delimiter = " ", int limit = -1) { return vb.Strings.Split(str, delimiter, limit); }
public static string before(this string str, string delimiter) { int i = (str ?? "").IndexOf(delimiter ?? ""); return i < 0 ? str : str.Remove(i); } // or return str.splitR(delimiter).First();
public static string after(this string str, string delimiter) { int i = (str ?? "").LastIndexOf(delimiter ?? ""); return i < 0 ? str : str.Substring(i + delimiter.Length); } // or return str.splitR(delimiter).Last();
sample use:
stringname.after("jumps-").splitR("-"); // splitR removes empty entries
You could extend the Split() method. In fact, I did this a few months ago. Probably not the prettiest code but it gets the job done. This method splits at every jumps-, not just at the first one.
public static class StringExtensions
{
public static string[] Split(this String Source, string Separator)
{
if (String.IsNullOrEmpty(Source))
throw new Exception("Source string is null or empty!");
if (String.IsNullOrEmpty(Separator))
throw new Exception("Separator string is null or empty!");
char[] _separator = Separator.ToArray();
int LastMatch = 0;
List<string> Result = new List<string>();
Func<char[], char[], bool> Matches = (source1, source2) =>
{
for (int i = 0; i < source1.Length; i++)
{
if (source1[i] != source2[i])
return false;
}
return true;
};
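// Use <= so that a separator at the very end of the string is also found.
for (int i = 0; _separator.Length + i <= Source.Length; i++)
{
if (Matches(_separator, Source.Substring(i, _separator.Length).ToArray()))
{
Result.Add(Source.Substring(LastMatch, i - LastMatch));
LastMatch = i + _separator.Length;
// Jump past the separator so that matches cannot overlap.
i = LastMatch - 1;
}
}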
Result.Add(Source.Substring(LastMatch, Source.Length - LastMatch));
return Result.ToArray();
}
}

Get a special, signed part of a string [duplicate]

I'm trying to develop a method that will match all strings between two strings:
I've tried this but it returns only the first match:
string ExtractString(string s, string start,string end)
{
// You should check for errors in real-world code, omitted for brevity
int startIndex = s.IndexOf(start) + start.Length;
int endIndex = s.IndexOf(end, startIndex);
return s.Substring(startIndex, endIndex - startIndex);
}
Let's suppose we have this string
String Text = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
I would like a C# function that does the following:
public List<string> ExtractFromString(String Text,String Start, String End)
{
List<string> Matched = new List<string>();
.
.
.
return Matched;
}
// Example of use
ExtractFromString("A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2","A1","A2")
// Will return :
// FIRSTSTRING
// SECONDSTRING
// THIRDSTRING
Thank you for your help !
private static List<string> ExtractFromBody(string body, string start, string end)
{
List<string> matched = new List<string>();
int position = 0;
while (true)
{
// Find the next start marker at or after the current position.
int indexStart = body.IndexOf(start, position, StringComparison.Ordinal);
if (indexStart == -1)
{
break;
}
int contentStart = indexStart + start.Length;
// Find the end marker that follows it; stop if there is none.
int indexEnd = body.IndexOf(end, contentStart, StringComparison.Ordinal);
if (indexEnd == -1)
{
break;
}
matched.Add(body.Substring(contentStart, indexEnd - contentStart));
// Continue searching after the end marker, without copying the string.
position = indexEnd + end.Length;
}
return matched;
}
Here is a solution using Regex. Don't forget to include the following using directive:
using System.Text.RegularExpressions;
It will correctly return only text between the start and end strings given.
Will not be returned:
akslakhflkshdflhksdf
Will be returned:
FIRSTSTRING
SECONDSTRING
THIRDSTRING
It uses the regular expression pattern [start string](.+?)[end string], where the lazy group captures the text in between.
The start and end strings are escaped in case they contain regular expression special characters.
private static List<string> ExtractFromString(string source, string start, string end)
{
var results = new List<string>();
string pattern = string.Format(
"{0}({1}){2}",
Regex.Escape(start),
".+?",
Regex.Escape(end));
foreach (Match m in Regex.Matches(source, pattern))
{
results.Add(m.Groups[1].Value);
}
return results;
}
You could make that into an extension method of String like this:
public static class StringExtensionMethods
{
public static List<string> EverythingBetween(this string source, string start, string end)
{
var results = new List<string>();
string pattern = string.Format(
"{0}({1}){2}",
Regex.Escape(start),
".+?",
Regex.Escape(end));
foreach (Match m in Regex.Matches(source, pattern))
{
results.Add(m.Groups[1].Value);
}
return results;
}
}
Usage:
string source = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
string start = "A1";
string end = "A2";
List<string> results = source.EverythingBetween(start, end);
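Two properties of the .+? pattern are worth keeping in mind: it needs at least one character between the markers, and by default the dot does not match a newline. If empty or multi-line segments can occur, a variant along the following lines may fit better. This is only a sketch; BetweenFinder and Between are hypothetical names, and switching to .*? with RegexOptions.Singleline is an assumption about what the caller wants.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class BetweenFinder
{
    // Hypothetical helper: returns every segment between start and end,
    // including empty segments and segments that span several lines.
    public static List<string> Between(string source, string start, string end)
    {
        // .*? is lazy and may match nothing; Singleline lets '.' match '\n'.
        string pattern = Regex.Escape(start) + "(.*?)" + Regex.Escape(end);
        return Regex.Matches(source, pattern, RegexOptions.Singleline)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToList();
    }
}
```

For the input "A1A2A1XA2" this returns an empty string followed by X, whereas the .+? pattern would return the single match A2A1X because it has to consume at least one character after the first A1.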
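Another option is a plain Split on both markers. Note that it also returns any text that sits between an A2 and the next A1 (akslakhflkshdflhksdf in the example):
text.Split(new[] {"A1", "A2"}, StringSplitOptions.RemoveEmptyEntries);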
You can split the string into an array using the start identifier, as in the following code:
String str = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
String[] arr = str.Split(new[] { "A1" }, StringSplitOptions.None);
Then iterate through the array and cut each element at its A2 marker (removing only the last 2 characters is not enough when other text follows the A2). You'll also need to discard the first array element, which is empty when the string starts with A1.
The code is untested; I'm currently on a mobile device.
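The split-then-trim idea above can be sketched as follows. One assumption differs from the description: each piece is cut at the first end marker rather than shortened by two characters, so text that trails the A2 is dropped as well. SplitThenTrim and Extract are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class SplitThenTrim
{
    // Hypothetical helper: splits on the start marker, then keeps the part
    // of each piece that comes before the first end marker.
    public static List<string> Extract(string source, string start, string end)
    {
        var results = new List<string>();
        string[] pieces = source.Split(new[] { start }, StringSplitOptions.None);

        // pieces[0] is whatever precedes the first start marker, so skip it.
        for (int i = 1; i < pieces.Length; i++)
        {
            int endIndex = pieces[i].IndexOf(end, StringComparison.Ordinal);
            if (endIndex >= 0)
            {
                results.Add(pieces[i].Substring(0, endIndex));
            }
        }
        return results;
    }
}
```

For the example string this gives FIRSTSTRING, SECONDSTRING and THIRDSTRING; pieces without an end marker are ignored.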
This is a generic solution, and I believe more readable code. Not tested, so beware.
public static IEnumerable<IList<T>> SplitBy<T>(this IEnumerable<T> source,
Func<T, bool> startPredicate,
Func<T, bool> endPredicate,
bool includeDelimiter)
{
var l = new List<T>();
foreach (var s in source)
{
if (startPredicate(s))
{
if (l.Any())
{
l = new List<T>();
}
l.Add(s);
}
else if (l.Any())
{
l.Add(s);
}
if (endPredicate(s))
{
if (includeDelimiter)
yield return l;
else
yield return l.GetRange(1, l.Count - 2);
l = new List<T>();
}
}
}
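In your case the text first has to be broken into tokens that keep the delimiters, because SplitBy compares whole elements and a string on its own enumerates single characters. For example:
var text = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
var tokens = Regex.Split(text, "(A1|A2)"); // the capture group keeps A1 and A2 in the result
var splits = tokens.SplitBy(x => x == "A1", x => x == "A2", false);
This is not the most efficient approach when you do not want the delimiters included in the result (as in your case), but it is efficient for the opposite case. To speed up your case, one can call GetEnumerator directly and make use of MoveNext.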
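The closing remark about GetEnumerator and MoveNext could look roughly like the sketch below, which never stores the delimiters and therefore needs no GetRange call. It is an illustration under assumptions, not the answer's own code: SegmentExtensions and Between are hypothetical names, a second start delimiter inside an open segment is treated as content, and an unterminated final segment is dropped.

```csharp
using System;
using System.Collections.Generic;

public static class SegmentExtensions
{
    // Hypothetical variant of SplitBy that yields only the elements found
    // strictly between a start delimiter and the next end delimiter.
    public static IEnumerable<List<T>> Between<T>(this IEnumerable<T> source,
        Func<T, bool> isStart, Func<T, bool> isEnd)
    {
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            while (e.MoveNext())
            {
                if (!isStart(e.Current)) continue; // skip until a start delimiter

                var segment = new List<T>();
                bool closed = false;
                while (e.MoveNext())
                {
                    if (isEnd(e.Current)) { closed = true; break; }
                    segment.Add(e.Current);
                }

                if (closed) yield return segment; // drop an unterminated tail
            }
        }
    }
}
```

Like SplitBy, it works on a sequence of tokens, so a string has to be tokenized before it is passed in.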

c# Sum to string of Char array

So I have these two functions,
public static string[] CharToHex(string str, string prefix, string delimeter)
{
List<string> list = new List<string>();
foreach (char c in str)
{
list.Add(prefix + String.Format("{0:X2}",(int)c) + delimeter);
}
return list.ToArray();
}
public static string[] StrToChar(string str, string prefix, string delimeter)
{
List<string> list = new List<string>();
foreach (char c in str)
{
list.Add(prefix + (int)c + delimeter);
}
return list.ToArray();
}
Basically, I'm trying to show the summed value of both returned arrays in a label.
I created a function to calculate a sum:
public static string ArraySum(int[] array)
{
string sum = array.Sum().ToString();
return sum;
}
And another function to take the string array and convert it to a string,
public static string StringArrayToString(string[] array)
{
StringBuilder builder = new StringBuilder();
foreach (string value in array)
{
builder.Append(value);
}
return builder.ToString();
}
This is how I'm putting it all together,
string[] dec = StrToChar(txtInput.Text, txtPrefix.Text, txtDelimiter.Text);
string[] hex = CharToHex(txtInput.Text, txtPrefix.Text, txtDelimiter.Text);
string decStr = StringArrayToString(dec);
string hexStr = StringArrayToString(hex);
int[] decCount = dec.Select(x => int.Parse(x)).ToArray();
int[] hexCount = hex.Select(x => int.Parse(x)).ToArray();
var builder = new StringBuilder();
Array.ForEach(decCount, x => builder.Append(x));
var res = builder.ToString();
txtDecimal.Text = decStr;
txtHex.Text = hexStr;
lblDecimalSum.Text = res;
The issue is that this isn't working, and it also seems horribly inefficient. There has to be an easier way of doing all of this; in addition, my sum isn't properly summing the array elements.
I'm not entirely sure how to go about this, and any assistance or feedback would be greatly appreciated.
Thank you kindly.
If I understand you correctly, you're trying to add the values of the characters in a string together, not parse an int from a string and add those together. If that's the case, you can do it with LINQ:
string x = "xasdgdfhdsfh";
int sum = x.Sum(b => b);
In fact, using LINQ, you can accomplish everything you want to do:
string x = "xasdgdfhdsfh";
string delim = txtDelimiter.Text;
string prefix = txtPrefix.Text;
lblDecimalSum.Text = x.Sum(c => c).ToString();
txtDecimal.Text =
string.Join(delim, x.Select(c => prefix + ((int)c).ToString()));
txtHex.Text =
string.Join(delim, x.Select(c => prefix + ((int)c).ToString("X2")));
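To try the same idea outside a form, the three results can be wrapped in small static methods. This is a sketch only: CharCodes and its method names are hypothetical, and the invariant culture is an assumption so the output does not depend on the machine's regional settings.

```csharp
using System.Globalization;
using System.Linq;

public static class CharCodes
{
    // Sum of the UTF-16 code unit values of all characters in text.
    public static int Sum(string text) => text.Sum(c => (int)c);

    // Decimal code of each character, prefixed and joined with the delimiter.
    public static string ToDecimal(string text, string prefix, string delimiter) =>
        string.Join(delimiter,
            text.Select(c => prefix + ((int)c).ToString(CultureInfo.InvariantCulture)));

    // Two-digit (or longer) uppercase hex code of each character.
    public static string ToHex(string text, string prefix, string delimiter) =>
        string.Join(delimiter,
            text.Select(c => prefix + ((int)c).ToString("X2", CultureInfo.InvariantCulture)));
}
```

For the input "AB", Sum returns 131, ToDecimal("AB", "", ",") returns 65,66 and ToHex("AB", "0x", " ") returns 0x41 0x42. Because string.Join is used, the delimiter appears only between items, not after the last one as in the question's original functions.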
