I have a small project where I have an input sentence where it is possible for the user to specify variations:
The {small|big} car is {red|blue}
Above is a sample sentence i want to split into 4 sentences, like this:
The small car is red
The big car is red
The small car is blue
The big car is blue
I can't seem to wrap my mind around the problem. Maybe someone can helt me pls.
Edit
Here is my initial code
Regex regex = new Regex("{(.*?)}", RegexOptions.Singleline);
MatchCollection collection = regex.Matches(richTextBox1.Text);
string data = richTextBox1.Text;
//build amount of variations
foreach (Match match in collection)
{
string[] alternatives = match.Value.Split(new char[] { '|', '{', '}' }, StringSplitOptions.RemoveEmptyEntries);
foreach (string alternative in alternatives)
{
//here i get problems
}
}
It sounds like you need a dynamic cartesian function for this. Eric Lippert's blog post written in response to Generating all Possible Combinations.
Firstly, we need to parse the input string:
Regex ex = new Regex(#"(?<=\{)(?<words>\w+(\|\w+)*)(?=\})");
var sentence = "The {small|big} car is {red|blue}";
then the input string should be modified to be used in string.Format-like functions:
int matchCount = 0;
var pattern = ex.Replace(sentence, me =>
{
return (matchCount++).ToString();
});
// pattern now contains "The {0} car is {1}"
then we need to find all the matches and to apply Eric's excellent CartesianProduct extension method:
var set = ex.Matches(sentence)
.Cast<Match>()
.Select(m =>
m.Groups["words"].Value
.Split('|')
).CartesianProduct();
foreach (var item in set)
{
Console.WriteLine(pattern, item.ToArray());
}
this will produce:
The small car is red
The small car is blue
The big car is red
The big car is blue
and, finally, the CartesianProduct method (taken from here):
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(
this IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] {item}));
}
private void ExpandString( List<string> result, string text )
{
var start = text.IndexOf('{');
var end = text.IndexOf('}');
if (start >= 0 && end > start)
{
var head = text.Substring(0, start);
var list = text.Substring(start + 1, end - start - 1).Split('|');
var tail = text.Substring(end + 1);
foreach (var item in list)
ExpandString(result, head + item + tail);
}
else
result.Add(text);
}
Use like:
var result = new List<string>();
ExpandString(result, "The {small|big} car is {red|blue}");
If you don't know the number of variations, recursion is your friend:
static public IEnumerable<string> permute(string template)
{
List<string> options;
string before;
string after;
if (FindFirstOptionList(template, out options, out before, out after))
{
foreach (string option in options)
{
foreach (string permutation in permute(before + option + after))
{
yield return permutation;
}
}
}
else
{
yield return template;
}
}
static public bool FindFirstOptionList(string template, out List<string> options, out string before, out string after)
{
before = string.Empty;
after = string.Empty;
options = new List<string>(0);
if (template.IndexOf('{') == -1)
{
return false;
}
before = template.Substring(0, template.IndexOf('{'));
template = template.Substring(template.IndexOf('{') + 1);
if (template.IndexOf('}') == -1)
{
return false;
}
after = template.Substring(template.IndexOf('}') + 1);
options = template.Substring(0, template.IndexOf('}')).Split('|').ToList();
return true;
}
use is similar to danbystrom's solution, except this one returns an IEnumerable instead of manipulating one of the calling parameters. Beware syntax errors, etc
static void main()
{
foreach(string permutation in permute("The {small|big} car is {red|blue}"))
{
Console.WriteLine(permutation);
}
}
I would propose to split the input text into an ordered list of static and dynamic parts. Each dynamic part itself contains a list that stores its values and an index that represents the currently selected value. This index is intially set to zero.
To print out all possible combinations you at first have to implement a method that prints the complete list using the currently set indices of the dynamic parts. For the first call all indices will be set to zero.
Now you can increment the index of the first dynamic part and print the complete list. This will give you the first variation. Repeat this until you printed all possible values of the remaining dynamic parts.
Consider nesting iterative loops. Something like...
foreach(string s in someStringSet)
{
foreach(string t in someOtherStringSet)
{
// do something with s and t
}
}
Perhaps you are looking for this:
Edited version
static void Main(string[] args)
{
var thisstring = "The {Small|Big} car is {Red|Blue}";
string FirstString = thisstring.Substring(thisstring.IndexOf("{"), (thisstring.IndexOf("}") - thisstring.IndexOf("{")) + 1);
string[] FirstPossibility = FirstString.Replace("{", "").Replace("}", "").Split('|');
thisstring = thisstring.Replace(FirstString, "[0]");
string SecondString = thisstring.Substring(thisstring.IndexOf("{"), (thisstring.IndexOf("}") - thisstring.IndexOf("{")) + 1);
string[] SecondPosibility = SecondString.Replace("{", "").Replace("}", "").Split('|');
thisstring = thisstring.Replace(SecondString, "{1}").Replace("[0]", "{0}");
foreach (string tempFirst in FirstPossibility)
{
foreach (string tempSecond in SecondPosibility)
{
Console.WriteLine(string.Format(thisstring, tempFirst, tempSecond));
}
}
Console.Read();
}
Something like this should work:
private void Do()
{
string str = "The {small|big} car is {red|blue}";
Regex regex = new Regex("{(.*?)}", RegexOptions.Singleline);
int i = 0;
var strWithPlaceHolders = regex.Replace(str, m => "{" + (i++).ToString() + "}");
var collection = regex.Matches(str);
var alternatives = collection.OfType<Match>().Select(m => m.Value.Split(new char[] { '|', '{', '}' }, StringSplitOptions.RemoveEmptyEntries));
var replacers = GetReplacers(alternatives);
var combinations = new List<string>();
foreach (var replacer in replacers)
{
combinations.Add(string.Format(strWithPlaceHolders, replacer));
}
}
private IEnumerable<object[]> GetReplacers(IEnumerable<string[]> alternatives)
{
return GetAllPossibilities(0, alternatives.ToList());
}
private IEnumerable<object[]> GetAllPossibilities(int level, List<string[]> list)
{
if (level == list.Count - 1)
{
foreach (var elem in list[level])
yield return new[] { elem };
}
else
{
foreach (var elem in list[level])
{
var thisElemAsArray = new object[] { elem };
foreach (var subPossibilities in GetAllPossibilities(level + 1, list))
yield return thisElemAsArray.Concat(subPossibilities).ToArray();
}
}
yield break;
}
string color = SomeMethodToGetColor();
string size = SomeMethodToGetSize();
string sentence = string.Format("The {0} car is {1}", size, color);
Related
I have this on a string that is generated from a list how can I add a blank value in between the string.
using (var file = File.CreateText()
{
foreach (var permutation in result)
{
file.WriteLine(string.Join(",", permutation));
//i++;
}
}
Here is the result:
string result = A,B,C,D,E,F,G,H,I,J
How can I add a blank value ", ," in the result at a specific index
Example: A,B,C,D,E,F, ,G,H,I,J
Note: result is a permutation and length is not consistent
Result is a IEnumerable can it be split?
You can split your string with the comma as the separator, go through your characters and add them to a list. Add an extra empty string at your desired index and join the data back together. This way you don't have to deal with deciding if you need to add a commar or not when concatenating the strings.
var index = 6;
var data = "A,B,C,D,E,F,G,H,I,J";
var chars = data.Split(new [] { ',' });
List<string> resultData = new List<string>();
var i = 0;
foreach (var c in chars) {
if (i == index) {
resultData.Add(" ");
}
i++;
resultData.Add(c);
}
var result = string.Join(",", resultData.ToArray();
You can use the following basic algorithm to follow your questions code logic:
using (var file = File.CreateText())
{
int i = -1;// Loop counter
int termReplacementIndex = 3;// Input or constant to find letter to replace
foreach (var permutation in result)
{
i++;
if (i == ((termReplacementIndex * 2) - 2))// Only stops on letters not commas
{
file.WriteLine(string.Join(",", " ,"));// Adding of blank " "
}
file.WriteLine(string.Join(",", permutation));// Adding of letter
}
}
If permutation is a list you can just insert a null value at the required index:
using System;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
var list = new List<string>();
list.Add("test 1");
list.Add("test 2");
list.Add("test 3");
list.Insert(2, null);
var str = string.Join(",", list);
Console.WriteLine(str); // test 1,test 2,,test 3
}
}
Modify the Collection before joining it so you wont have to split it and iterate it again.
As the side of permutation vary you could be trying to insert at index X when X is larger greater permutation.Count() instead of an ArgumentOutOfRangeException I decide to add the element at the end. But you will have to define that.
public string CsvProjection(List<string> inputs, int needleIndex, string needleValue)
{
if (needleIndex < 0)
{
throw new ArgumentOutOfRangeException("needleIndex must be positive.");
}
if (needleIndex > inputs.Count())
{//Either throw an exception because out of bound or add to the end
inputs.Add(needleValue);
}
inputs.Insert(needleIndex, needleValue);
return string.Join(",", inputs);
}
Usage simply add the value and the index variable:
var needleIndex= 2;
var needleValue = "My New Value";
using (var file = File.CreateText()
{
foreach (var permutation in result)
{
file.WriteLine(CsvProjection(permutation, needleIndex, needleValue));
}
}
I'm trying to develop a method that will match all strings between two strings:
I've tried this but it returns only the first match:
string ExtractString(string s, string start,string end)
{
// You should check for errors in real-world code, omitted for brevity
int startIndex = s.IndexOf(start) + start.Length;
int endIndex = s.IndexOf(end, startIndex);
return s.Substring(startIndex, endIndex - startIndex);
}
Let's suppose we have this string
String Text = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2"
I would like a c# function doing the following :
public List<string> ExtractFromString(String Text,String Start, String End)
{
List<string> Matched = new List<string>();
.
.
.
return Matched;
}
// Example of use
ExtractFromString("A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2","A1","A2")
// Will return :
// FIRSTSTRING
// SECONDSTRING
// THIRDSTRING
Thank you for your help !
private static List<string> ExtractFromBody(string body, string start, string end)
{
List<string> matched = new List<string>();
int indexStart = 0;
int indexEnd = 0;
bool exit = false;
while (!exit)
{
indexStart = body.IndexOf(start);
if (indexStart != -1)
{
indexEnd = indexStart + body.Substring(indexStart).IndexOf(end);
matched.Add(body.Substring(indexStart + start.Length, indexEnd - indexStart - start.Length));
body = body.Substring(indexEnd + end.Length);
}
else
{
exit = true;
}
}
return matched;
}
Here is a solution using RegEx. Don't forget to include the following using statement.
using System.Text.RegularExpressions
It will correctly return only text between the start and end strings given.
Will not be returned:
akslakhflkshdflhksdf
Will be returned:
FIRSTSTRING
SECONDSTRING
THIRDSTRING
It uses the regular expression pattern [start string].+?[end string]
The start and end strings are escaped in case they contain regular expression special characters.
private static List<string> ExtractFromString(string source, string start, string end)
{
var results = new List<string>();
string pattern = string.Format(
"{0}({1}){2}",
Regex.Escape(start),
".+?",
Regex.Escape(end));
foreach (Match m in Regex.Matches(source, pattern))
{
results.Add(m.Groups[1].Value);
}
return results;
}
You could make that into an extension method of String like this:
public static class StringExtensionMethods
{
public static List<string> EverythingBetween(this string source, string start, string end)
{
var results = new List<string>();
string pattern = string.Format(
"{0}({1}){2}",
Regex.Escape(start),
".+?",
Regex.Escape(end));
foreach (Match m in Regex.Matches(source, pattern))
{
results.Add(m.Groups[1].Value);
}
return results;
}
}
Usage:
string source = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
string start = "A1";
string end = "A2";
List<string> results = source.EverythingBetween(start, end);
text.Split(new[] {"A1", "A2"}, StringSplitOptions.RemoveEmptyEntries);
You can split the string into an array using the start identifier in following code:
String str = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
String[] arr = str.Split("A1");
Then iterate through your array and remove the last 2 characters of each string (to remove the A2). You'll also need to discard the first array element as it will be empty assuming the string starts with A1.
Code is untested, currently on a mobile
This is a generic solution, and I believe more readable code. Not tested, so beware.
public static IEnumerable<IList<T>> SplitBy<T>(this IEnumerable<T> source,
Func<T, bool> startPredicate,
Func<T, bool> endPredicate,
bool includeDelimiter)
{
var l = new List<T>();
foreach (var s in source)
{
if (startPredicate(s))
{
if (l.Any())
{
l = new List<T>();
}
l.Add(s);
}
else if (l.Any())
{
l.Add(s);
}
if (endPredicate(s))
{
if (includeDelimiter)
yield return l;
else
yield return l.GetRange(1, l.Count - 2);
l = new List<T>();
}
}
}
In your case you can call,
var text = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
var splits = text.SplitBy(x => x == "A1", x => x == "A2", false);
This is not the most efficient when you do not want the delimiter to be included (like your case) in result but efficient for opposite cases. To speed up your case one can directly call the GetEnumerator and make use of MoveNext.
I am writing a simple console application to would allow me to count the occurrence of each unique word.
for example the console will allow the user to type a sentence, once press enter the system should count the number of time each words occurs. so far I can only count characters. any help would be appreciated.
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Please enter string");
string input = Convert.ToString(Console.ReadLine());
Dictionary<string, int> objdict = new Dictionary<string, int>();
foreach (var j in input)
{
if (objdict.ContainsKey(j.ToString()))
{
objdict[j.ToString()] = objdict[j.ToString()] + 1;
}
else
{
objdict.Add(j.ToString(), 1);
}
}
foreach (var temp in objdict)
{
Console.WriteLine("{0}:{1}", temp.Key, temp.Value);
}
Console.ReadLine();
}
}
Try this method:
private void countWordsInALIne(string line, Dictionary<string, int> words)
{
var wordPattern = new Regex(#"\w+");
foreach (Match match in wordPattern.Matches(line))
{
int currentCount=0;
words.TryGetValue(match.Value, out currentCount);
currentCount++;
words[match.Value] = currentCount;
}
}
Call the above method like this:
var words = new Dictionary<string, int>(StringComparer.CurrentCultureIgnoreCase);
countWordsInALine(line, words);
In the words dictionary you will find the words (key) along with its occurance frequency (value).
Just call Split method passing single space (assuming word is seperated by single space) and it would give collection of each word then iterate over each element of collection with the same logic you were having.
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Please enter string");
string input = Convert.ToString(Console.ReadLine());
Dictionary<string, int> objdict = new Dictionary<string, int>();
foreach (var j in input.Split(" "))
{
if (objdict.ContainsKey(j))
{
objdict[j] = objdict[j] + 1;
}
else
{
objdict.Add(j, 1);
}
}
foreach (var temp in objdict)
{
Console.WriteLine("{0}:{1}", temp.Key, temp.Value);
}
Console.ReadLine();
}
}
You need to split the string on spaces (or any other characters which you consider to delimit words). Try changing the loop to this:
foreach (string Word in input.Split(' ')) {
}
Might I suggest a ternary-tree to make things efficient?
Here's a link to a C# implementation: http://igoro.com/archive/efficient-auto-complete-with-a-ternary-search-tree/
After first inserting into the tree, you could simply call "Contains" with one of the implementations above to make things quick
Try this...
var theList = new List<string>() { "Alpha", "Alpha", "Beta", "Gamma", "Delta" };
theList.GroupBy(txt => txt)
.Where(grouping => grouping.Count() > 1)
.ToList()
.ForEach(groupItem => Console.WriteLine("{0} duplicated {1} times with these values {2}",
groupItem.Key,
groupItem.Count(),
string.Join(" ", groupItem.ToArray())));
Console.ReadKey();
http://omegacoder.com/?p=792
How can I split string (from a textbox) by commas excluding those in double quotation marks (without getting rid of the quotation marks), along with other possible punctuation marks (e.g. ' . ' ' ; ' ' - ')?
E.g. If someone entered the following into the textbox:
apple, orange, "baboons, cows", rainbow, "unicorns, gummy bears"
How can I split the above string into the following (say, into a List)?
apple
orange
"baboons, cows"
rainbow
"Unicorns, gummy bears..."
Thank you for your help!
You could try the below regex which uses positive lookahead,
string value = #"apple, orange, ""baboons, cows"", rainbow, ""unicorns, gummy bears""";
string[] lines = Regex.Split(value, #", (?=(?:""[^""]*?(?: [^""]*)*))|, (?=[^"",]+(?:,|$))");
foreach (string line in lines) {
Console.WriteLine(line);
}
Output:
apple
orange
"baboons, cows"
rainbow
"unicorns, gummy bears"
IDEONE
Try this:
Regex str = new Regex("(?:^|,)(\"(?:[^\"]+|\"\")*\"|[^,]*)", RegexOptions.Compiled);
foreach (Match m in str.Matches(input))
{
Console.WriteLine(m.Value.TrimStart(','));
}
You may also try to look at FileHelpers
Much like a CSV parser, instead of Regex, you can loop through each character, like so:
public List<string> ItemStringToList(string inputString)
{
var itemList = new List<string>();
var currentIem = "";
var quotesOpen = false;
for (int i = 0; i < inputString.Length; i++)
{
if (inputString[i] == '"')
{
quotesOpen = !quotesOpen;
continue;
}
if (inputString[i] == ',' && !quotesOpen)
{
itemList.Add(currentIem);
currentIem = "";
continue;
}
if (currentIem == "" && inputString[i] == ' ') continue;
currentIem += inputString[i];
}
if (currentIem != "") itemList.Add(currentIem);
return itemList;
}
Example test usage:
var test1 = ItemStringToList("one, two, three");
var test2 = ItemStringToList("one, \"two\", three");
var test3 = ItemStringToList("one, \"two, three\"");
var test4 = ItemStringToList("one, \"two, three\", four, \"five six\", seven");
var test5 = ItemStringToList("one, \"two, three\", four, \"five six\", seven");
var test6 = ItemStringToList("one, \"two, three\", four, \"five six, seven\"");
var test7 = ItemStringToList("\"one, two, three\", four, \"five six, seven\"");
You could change it to use StringBuilder if you want faster character joining.
Try with this it will work u c an split array string in many waysif you want to split by white space just put a space in (' ') .
namespace LINQExperiment1
{
class Program
{
static void Main(string[] args)
{
string[] sentence = new string[] { "apple", "orange", "baboons cows", " rainbow", "unicorns gummy bears" };
Console.WriteLine("option 1:"); Console.WriteLine("————-");
// option 1: Select returns three string[]’s with
// three strings in each.
IEnumerable<string[]> words1 =
sentence.Select(w => w.Split(' '));
// to get each word, we have to use two foreach loops
foreach (string[] segment in words1)
foreach (string word in segment)
Console.WriteLine(word);
Console.WriteLine();
Console.WriteLine("option 2:"); Console.WriteLine("————-");
// option 2: SelectMany returns nine strings
// (sub-iterates the Select result)
IEnumerable<string> words2 =
sentence.SelectMany(segment => segment.Split(','));
// with SelectMany we have every string individually
foreach (var word in words2)
Console.WriteLine(word);
// option 3: identical to Opt 2 above written using
// the Query Expression syntax (multiple froms)
IEnumerable<string> words3 =from segment in sentence
from word in segment.Split(' ')
select word;
}
}
}
This was trickier than I thought, a good practical problem I think.
Below is the solution I came up with for this. One thing I don't like about my solution is having to add double quotations back and the other one being names of the variables :p:
internal class Program
{
private static void Main(string[] args)
{
string searchString =
#"apple, orange, ""baboons, cows. dogs- hounds"", rainbow, ""unicorns, gummy bears"", abc, defghj";
char delimeter = ',';
char excludeSplittingWithin = '"';
string[] splittedByExcludeSplittingWithin = searchString.Split(excludeSplittingWithin);
List<string> splittedSearchString = new List<string>();
for (int i = 0; i < splittedByExcludeSplittingWithin.Length; i++)
{
if (i == 0 || splittedByExcludeSplittingWithin[i].StartsWith(delimeter.ToString()))
{
string[] splitttedByDelimeter = splittedByExcludeSplittingWithin[i].Split(delimeter);
for (int j = 0; j < splitttedByDelimeter.Length; j++)
{
splittedSearchString.Add(splitttedByDelimeter[j].Trim());
}
}
else
{
splittedSearchString.Add(excludeSplittingWithin + splittedByExcludeSplittingWithin[i] +
excludeSplittingWithin);
}
}
foreach (string s in splittedSearchString)
{
if (s.Trim() != string.Empty)
{
Console.WriteLine(s);
}
}
Console.ReadKey();
}
}
Another Regex solution:
private static IEnumerable<string> Parse(string input)
{
// if used frequently, should be instantiated with Compiled option
Regex regex = new Regex(#"(?<=^|,\s)(\""(?:[^\""]|\""\"")*\""|[^,\s]*)");
return regex.Matches(inputData).Where(m => m.Success);
}
I'm trying to develop a method that will match all strings between two strings:
I've tried this but it returns only the first match:
string ExtractString(string s, string start,string end)
{
// You should check for errors in real-world code, omitted for brevity
int startIndex = s.IndexOf(start) + start.Length;
int endIndex = s.IndexOf(end, startIndex);
return s.Substring(startIndex, endIndex - startIndex);
}
Let's suppose we have this string
String Text = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2"
I would like a c# function doing the following :
public List<string> ExtractFromString(String Text,String Start, String End)
{
List<string> Matched = new List<string>();
.
.
.
return Matched;
}
// Example of use
ExtractFromString("A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2","A1","A2")
// Will return :
// FIRSTSTRING
// SECONDSTRING
// THIRDSTRING
Thank you for your help !
private static List<string> ExtractFromBody(string body, string start, string end)
{
List<string> matched = new List<string>();
int indexStart = 0;
int indexEnd = 0;
bool exit = false;
while (!exit)
{
indexStart = body.IndexOf(start);
if (indexStart != -1)
{
indexEnd = indexStart + body.Substring(indexStart).IndexOf(end);
matched.Add(body.Substring(indexStart + start.Length, indexEnd - indexStart - start.Length));
body = body.Substring(indexEnd + end.Length);
}
else
{
exit = true;
}
}
return matched;
}
Here is a solution using RegEx. Don't forget to include the following using statement.
using System.Text.RegularExpressions
It will correctly return only text between the start and end strings given.
Will not be returned:
akslakhflkshdflhksdf
Will be returned:
FIRSTSTRING
SECONDSTRING
THIRDSTRING
It uses the regular expression pattern [start string].+?[end string]
The start and end strings are escaped in case they contain regular expression special characters.
private static List<string> ExtractFromString(string source, string start, string end)
{
var results = new List<string>();
string pattern = string.Format(
"{0}({1}){2}",
Regex.Escape(start),
".+?",
Regex.Escape(end));
foreach (Match m in Regex.Matches(source, pattern))
{
results.Add(m.Groups[1].Value);
}
return results;
}
You could make that into an extension method of String like this:
public static class StringExtensionMethods
{
public static List<string> EverythingBetween(this string source, string start, string end)
{
var results = new List<string>();
string pattern = string.Format(
"{0}({1}){2}",
Regex.Escape(start),
".+?",
Regex.Escape(end));
foreach (Match m in Regex.Matches(source, pattern))
{
results.Add(m.Groups[1].Value);
}
return results;
}
}
Usage:
string source = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
string start = "A1";
string end = "A2";
List<string> results = source.EverythingBetween(start, end);
text.Split(new[] {"A1", "A2"}, StringSplitOptions.RemoveEmptyEntries);
You can split the string into an array using the start identifier in following code:
String str = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
String[] arr = str.Split("A1");
Then iterate through your array and remove the last 2 characters of each string (to remove the A2). You'll also need to discard the first array element as it will be empty assuming the string starts with A1.
Code is untested, currently on a mobile
This is a generic solution, and I believe more readable code. Not tested, so beware.
public static IEnumerable<IList<T>> SplitBy<T>(this IEnumerable<T> source,
Func<T, bool> startPredicate,
Func<T, bool> endPredicate,
bool includeDelimiter)
{
var l = new List<T>();
foreach (var s in source)
{
if (startPredicate(s))
{
if (l.Any())
{
l = new List<T>();
}
l.Add(s);
}
else if (l.Any())
{
l.Add(s);
}
if (endPredicate(s))
{
if (includeDelimiter)
yield return l;
else
yield return l.GetRange(1, l.Count - 2);
l = new List<T>();
}
}
}
In your case you can call,
var text = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
var splits = text.SplitBy(x => x == "A1", x => x == "A2", false);
This is not the most efficient when you do not want the delimiter to be included (like your case) in result but efficient for opposite cases. To speed up your case one can directly call the GetEnumerator and make use of MoveNext.