Find count of each consecutive characters - c#

I need to find the count of each run of consecutive characters.
Ex: aaaabbccaa
output: 4a2b2c2a
A character may repeat later in the string, but I only need to count consecutive occurrences. I also need to maintain the original sequence.
I tried the following, but it groups all occurrences of a character together, so it was not useful.
str.GroupBy(c => c).Select(g => new { g.Key, Count = g.Count() }).ToList().ForEach(x => str+= x.Count + "" + x.Key)

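Regular expression to the rescue?
// requires: using System.Linq; using System.Text.RegularExpressions;
var myString = "aaaabbccaa";
var pattern = @"(\w)\1*"; // one word character followed by any repeats of it
var regExp = new Regex(pattern);
var matches = regExp.Matches(myString);
// Cast<Match>() keeps this working on .NET Framework, where MatchCollection is non-generic
var tab = matches.Cast<Match>().Select(x => String.Format("{0}{1}", x.Value.Length, x.Value[0]));
var result = String.Join("", tab); // "4a2b2c2a"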

Here is a LINQ solution:
var input = "aaaabbccaa";
var result = string.IsNullOrEmpty(input) ? "" : string.Join("",input.Skip(1)
.Aggregate((t:input[0].ToString(),o:Enumerable.Empty<string>()),
(a,c)=>a.t[0]==c ? (a.t+c,a.o) : (c.ToString(),a.o.Append(a.t)),
a=>a.o.Append(a.t).Select(p => $"{p.Length}{p[0]}")));
Here is the iterator solution:
var result = RleString("aaaabbccaa");
private static IEnumerable<(char chr, int count)> Rle(string s)
{
if (string.IsNullOrEmpty(s)) yield break;
var lastchar = s.First(); // or s[0]
var count = 1;
foreach (char letter in s.Skip(1))
{
if (letter != lastchar)
{
yield return (lastchar, count);
lastchar = letter;
count = 0;
}
count++;
}
if (count > 0)
yield return (lastchar, count);
}
private static string RleString(string s)
{
return String.Join("",Rle(s).Select(z=>$"{z.count}{z.chr}"));
}

Non-LINQ solution (dotnetfiddle):
using System;
using System.Text;
public class Program
{
public static void Main()
{
// produces 4a2b2c2a
Console.WriteLine(GetConsecutiveGroups("aaaabbccaa"));
}
private static string GetConsecutiveGroups(string input)
{
var result = new StringBuilder();
var sb = new StringBuilder();
foreach (var c in input)
{
if (sb.Length == 0 || sb[sb.Length - 1] == c)
{
sb.Append(c);
}
else
{
result.Append($"{sb.Length}{sb[0]}");
sb.Clear();
sb.Append(c);
}
}
if (sb.Length > 0)
{
result.Append($"{sb.Length}{sb[0]}");
}
return result.ToString();
}
}

This small program will do the trick, but it's not a nice single-line LINQ statement. Just my two cents.
using System;
using System.Linq;
using System.Collections.Generic;
public class Simple {
public static void Main() {
var text = "aaaabbccaa"; // prints 4a2b2c2a
var lista = new List<string>();
var previousLetter = text.Substring(0, 1); // start from the first character (assumes text is not empty)
var item = string.Empty;
foreach (char letter in text)
{
if (previousLetter == letter.ToString()){
item += letter.ToString();
}
else
{
lista.Add(item);
item = letter.ToString();
}
previousLetter = letter.ToString();
}
lista.Add(item);
// each item is one run: print its length followed by its character
foreach (var i in lista)
Console.Write(i.Length + i.Substring(0, 1));
Console.WriteLine();
}
}

Here is my non-LINQ version that is quite fast compared to LINQ or Regex:
var prevChar = str[0];
var ct = 1;
var s = new StringBuilder();
var len = str.Length;
for (int j2 = 1; j2 < len; ++j2) {
if (str[j2] == prevChar)
++ct;
else {
s.Append(ct);
s.Append(prevChar);
ct = 1;
prevChar = str[j2];
}
}
s.Append(ct);
s.Append(prevChar);
var final = s.ToString();
My LINQ version looks like this, but uses a couple of extension methods I already had:
var ans = str.GroupByRuns().Select(s => $"{s.Count()}{s.Key}").Join();
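The answer does not show its GroupByRuns or Join extension methods. A minimal sketch of what such helpers might look like follows; both bodies are assumptions written for illustration, not the author's code. This version yields (Key, Count) tuples instead of groupings, so the call site uses the Count field rather than Count().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RunExtensions
{
    // Hypothetical stand-in for GroupByRuns: yields one (Key, Count) pair
    // per run of equal adjacent items, preserving the original order.
    public static IEnumerable<(T Key, int Count)> GroupByRuns<T>(this IEnumerable<T> source)
    {
        var comparer = EqualityComparer<T>.Default;
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext()) yield break;    // empty input: no runs
            var key = e.Current;
            var count = 1;
            while (e.MoveNext())
            {
                if (comparer.Equals(e.Current, key))
                {
                    count++;
                }
                else
                {
                    yield return (key, count); // run ended, emit it
                    key = e.Current;
                    count = 1;
                }
            }
            yield return (key, count);         // emit the final run
        }
    }

    // Hypothetical stand-in for Join: concatenates the pieces.
    public static string Join(this IEnumerable<string> parts, string separator = "")
        => string.Join(separator, parts);
}
```

With these helpers, "aaaabbccaa".GroupByRuns().Select(r => $"{r.Count}{r.Key}").Join() produces 4a2b2c2a.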

var chars = "aaaabbccaa".ToCharArray();
int counter = 1;
for (var i = 0; i < chars.Count(); i++)
{
if (i + 1 >= chars.Count() || chars[i] != chars[i + 1])
{
Console.Write($"{counter}{chars[i]}");
counter = 1;
}
else
{
counter++;
}
}

You could have a character var and a counter var outside your LINQ scope to keep track of the previous character and the current count, and then use a LINQ ForEach, but I am just as curious as the rest as to why you insist on doing this. Even if you do, the solution may not be as easy to read as an iterative version, and readability and maintenance overhead are very important if anyone else is ever going to read it.
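To make the suggestion above concrete, here is a rough sketch of the captured-variable approach. The class and method names are made up for illustration, and it uses List<T>.ForEach as the "LINQ foreach". It also shows the readability cost the answer warns about: the lambda silently mutates state declared outside it.

```csharp
using System;
using System.Linq;
using System.Text;

public static class CapturedStateRle
{
    public static string Encode(string input)
    {
        var sb = new StringBuilder();
        char prev = '\0'; // previous character, captured by the lambda below
        int count = 0;    // length of the current run, also captured

        (input ?? string.Empty).ToList().ForEach(c =>
        {
            if (count > 0 && c != prev)
            {
                sb.Append(count).Append(prev); // run ended, flush it
                count = 0;
            }
            prev = c;
            count++;
        });

        if (count > 0) sb.Append(count).Append(prev); // flush the last run
        return sb.ToString();
    }
}
```

CapturedStateRle.Encode("aaaabbccaa") returns 4a2b2c2a, the same result as the plain loops in the other answers.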

Related

How can I use indexof and substring to find words in a string?

In the constructor :
var tempFR = File.ReadAllText(file);
GetResults(tempFR);
Then :
private List<string> GetResults(string file)
{
List<string> results = new List<string>();
string word = textBox1.Text;
string[] words = word.Split(new string[] { ",," }, StringSplitOptions.None);
for(int i = 0; i < words.Length; i++)
{
int start = file.IndexOf(words[i], 0);
results.Add(file.Substring(start));
}
return results;
}
words contains, in this case, 3 words: System, public, test
I want to find all the words in the file and add them to the list results using IndexOf and Substring.
The way it is now, the start value is -1 all the time.
To clear some things up:
This is a screenshot of textBox1:
That is why I'm using two commas to split and get the words.
This screenshot shows the words after splitting them from textBox1:
And this is the file string content:
I want to add to the list results all the words in the file.
Looking at the last screenshot, there should be 11 results:
three times the word using, three times the word system, and five times the word public.
But the variable start is -1.
Update:
I tried Barns' solution(s), but they are not working well for me.
First, the code that performs the search, then loops over the files and reports to the BackgroundWorker:
int numberofdirs = 0;
void DirSearch(string rootDirectory, string filesExtension, string[] textToSearch, BackgroundWorker worker, DoWorkEventArgs e)
{
List<string> filePathList = new List<string>();
int numberoffiles = 0;
try
{
filePathList = SearchAccessibleFilesNoDistinct(rootDirectory, null, worker, e).ToList();
}
catch (Exception err)
{
}
label21.Invoke((MethodInvoker)delegate
{
label21.Text = "Phase 2: Searching in files";
});
MyProgress myp = new MyProgress();
myp.Report4 = filePathList.Count.ToString();
foreach (string file in filePathList)
{
try
{
var tempFR = File.ReadAllText(file);
_busy.WaitOne();
if (worker.CancellationPending == true)
{
e.Cancel = true;
return;
}
bool reportedFile = false;
for (int i = 0; i < textToSearch.Length; i++)
{
if (tempFR.IndexOf(textToSearch[i], StringComparison.InvariantCultureIgnoreCase) >= 0)
{
if (!reportedFile)
{
numberoffiles++;
myp.Report1 = file;
myp.Report2 = numberoffiles.ToString();
myp.Report3 = textToSearch[i];
myp.Report5 = FindWordsWithtRegex(tempFR, textToSearch);
backgroundWorker1.ReportProgress(0, myp);
reportedFile = true;
}
}
}
numberofdirs++;
label1.Invoke((MethodInvoker)delegate
{
label1.Text = string.Format("{0}/{1}", numberofdirs, myp.Report4);
label1.Visible = true;
});
}
catch (Exception err)
{
}
}
}
I already have the words array in textToSearch and the file content in tempFR, and then I'm using Barns' first solution:
private List<string> FindWordsWithtRegex(string filecontent, string[] words)
{
var res = new List<string>();
foreach (var word in words)
{
Regex reg = new Regex(word);
var c = reg.Matches(filecontent);
int k = 0;
foreach (var g in c)
{
Console.WriteLine(g.ToString());
res.Add(g + ":" + k++);
}
}
Console.WriteLine("Results of FindWordsWithtRegex");
res.ForEach(f => Console.WriteLine(f));
Console.WriteLine();
return res;
}
But the results I'm getting in the List res are not the same as the output in Barns' solution(s). These are the results I'm getting in the List res for the first file:
In this case there are two words, system and using, but it found only using (3 times), although system also appears 3 times in the file content. The output format is also not the same as in Barns' solutions:
Here is an alternative using Regex instead of using IndexOf. Note I have created my own string to parse, so my results will be a bit different.
EDIT
private List<string> FindWordsWithCountRegex(string filecontent, string[] words)
{
var res = new List<string>();
foreach (var word in words)
{
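// Regex.Escape treats the word as literal text, even if it contains characters like '.' or '+'
Regex reg = new Regex(Regex.Escape(word), RegexOptions.IgnoreCase);
var c = reg.Matches(filecontent).Count; // Count is a property of MatchCollection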
res.Add(word + ":" + c);
}
return res;
}
Simply change this part and use a single char, typically a space, not a comma:
string[] words = word.Split(' ');
int start = file.IndexOf(words[i],0);
start will be -1 if the word is not found.
MSDN: IndexOf(String, Int32)
for(int i = 0; i < words.Length; i++)
{
int start = file.IndexOf(words[i], 0);
// only add to results if word is found (index >= 0)
if (start >= 0) results.Add(file.Substring(start));
}
If you want all appearances of the words, you need an extra loop:
for (int i = 0; i < words.Length; i++)
{
if (words[i].Length == 0) continue; // an empty search word would never advance
int startIdx = 0;
while (startIdx < file.Length)
{
int idx = file.IndexOf(words[i], startIdx);
if (idx < 0) break; // no further occurrence of this word
// add to results
results.Add(file.Substring(idx));
// continue the search after the end of the word just found
startIdx = idx + words[i].Length;
}
}
Maybe you want a case-insensitive search:
file.IndexOf(words[i], 0, StringComparison.CurrentCultureIgnoreCase);
MSDN: StringComparer Class
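As a hedged illustration of combining the two ideas above (finding every occurrence and ignoring case), the following sketch collects the start index of each match. The class name WordSearch, the method name FindAll, and the OrdinalIgnoreCase default are assumptions chosen for the example, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;

public static class WordSearch
{
    // Returns the start index of every non-overlapping occurrence of word in text.
    public static List<int> FindAll(string text, string word,
        StringComparison comparison = StringComparison.OrdinalIgnoreCase)
    {
        var positions = new List<int>();
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(word)) return positions;

        int start = 0;
        while (start <= text.Length - word.Length)
        {
            int idx = text.IndexOf(word, start, comparison);
            if (idx < 0) break;        // IndexOf returns -1 when nothing more is found
            positions.Add(idx);
            start = idx + word.Length; // resume after the end of this match
        }
        return positions;
    }
}
```

For the text "using System; using System.IO; public" and the word "system", this returns the indexes 6 and 20. The count per word is then simply the length of the returned list.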

Array of string management

I have an array of strings. I want to take all the strings in an interval of this array, up to the point where a string contains a stop marker.
Something like:
string [] arrayReading = {
"e","x","a","takefromhere",
"keeptaking","keeptaking","dont'ttakefromhere","m","p","l","e"
};
I have tried:
List<string> result = new List<string>();
for (int i = 0; i < arrayReading.Length; i++)
{
if (arrayReading[i].Contains("takefromhere"))
{
result.Add(arrayReading[i]);
if (!arrayReading[i + 1].Contains("dont'ttakefromhere"))
{
result.Add(arrayReading[i + 1]);
if (!arrayReading[i + 2].Contains("dont'ttakefromhere"))
{
result.Add(arrayReading[i + 2]);
}
}
}
}
It seems to work, but it's not as dynamic as I want, because I may need to take 20 values between "takefromhere" and "dont'ttakefromhere".
When querying you can try Linq:
using System.Linq;
...
List<string> result = arrayReading
.SkipWhile(item => item != "takefromhere")
.TakeWhile(item => item != "dont'ttakefromhere")
.ToList();
Or if you want good old loop solution:
List<string> result = new List<string>();
bool taking = false;
foreach (string item in arrayReading) {
if (!taking)
taking = item == "takefromhere";
if (taking) {
if (item == "dont'ttakefromhere")
break;
result.Add(item);
}
}
Let's have a look:
Console.Write(string.Join("; ", result));
Outcome:
takefromhere; keeptaking; keeptaking

Make every other a-z letter Upper / Lower case, ignoring whitespace

Can somebody tell me what I am doing wrong, please? I can't seem to get the expected output, i.e. ignore whitespace and alternate upper/lowercase only on a-z characters, regardless of the number of whitespace characters.
my code:
var sentence = "dancing sentence";
var charSentence = sentence.ToCharArray();
var rs = "";
for (var i = 0; i < charSentence.Length; i++)
{
if (charSentence[i] != ' ')
{
if (i % 2 == 0 && charSentence[i] != ' ')
{
rs += charSentence[i].ToString().ToUpper();
}
else if (i % 2 == 1 && charSentence[i] != ' ')
{
rs += sentence[i].ToString().ToLower();
}
}
else
{
rs += " ";
}
}
Console.WriteLine(rs);
Expected output: DaNcInG sEnTeNcE
Actual output: DaNcInG SeNtEnCe
I use a flag instead of i because (as you mentioned) whitespace makes the index-based algorithm go wrong:
var sentence = "dancing sentence";
var charSentence = sentence.ToCharArray();
var rs = "";
var flag = true;
for (var i = 0; i < charSentence.Length; i++)
{
if (charSentence[i] != ' ')
{
if (flag)
{
rs += charSentence[i].ToString().ToUpper();
}
else
{
rs += sentence[i].ToString().ToLower();
}
flag = !flag;
}
else
{
rs += " ";
}
}
Console.WriteLine(rs);
Try a simple finite state automaton with just two states (upper == true/false); another suggestion is to use StringBuilder:
private static string ToDancing(string value) {
if (string.IsNullOrEmpty(value))
return value;
bool upper = false;
StringBuilder sb = new StringBuilder(value.Length);
foreach (var c in value)
if (char.IsLetter(c))
sb.Append((upper = !upper) ? char.ToUpper(c) : char.ToLower(c));
else
sb.Append(c);
return sb.ToString();
}
Test
var sentence = "dancing sentence";
Console.Write(ToDancing(sentence));
Outcome
DaNcInG sEnTeNcE
I think you should declare one more variable called isUpper. Now you have two variables, i indicates the index of the character that you are iterating next and isUpper indicates whether a letter should be uppercase.
You increment i as usual, but set isUpper to true at first:
// before the loop
bool isUpper = true;
Then, rather than checking whether i is divisible by 2, check isUpper:
if (isUpper)
{
rs += charSentence[i].ToString().ToUpper();
}
else
{
rs += sentence[i].ToString().ToLower();
}
Immediately after the above if statement, "flip" isUpper:
isUpper = !isUpper;
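Put together, the fragments above might look like the following sketch. The class and method names are invented for the example, and it swaps the string concatenation for a StringBuilder and a foreach loop; the isUpper logic is the part taken from the answer.

```csharp
using System;
using System.Text;

public static class Dancing
{
    public static string Alternate(string sentence)
    {
        var rs = new StringBuilder(sentence.Length);
        bool isUpper = true; // the first non-space character is uppercased

        foreach (char c in sentence)
        {
            if (c == ' ')
            {
                rs.Append(c); // spaces are copied and do not flip the flag
                continue;
            }
            rs.Append(isUpper ? char.ToUpper(c) : char.ToLower(c));
            isUpper = !isUpper; // flip only after a non-space character
        }
        return rs.ToString();
    }
}
```

Dancing.Alternate("dancing sentence") returns DaNcInG sEnTeNcE, because the flag, unlike the index i, does not advance on spaces.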
Linq version
var sentence = "dancing sentence";
int i = 0;
string result = string.Concat(sentence.Select(x => { i += x == ' ' ? 0 : 1; return i % 2 != 0 ? char.ToUpper(x) : char.ToLower(x); }));
Sidenote:
please replace charSentence[i].ToString().ToUpper() with char.ToUpper(charSentence[i])
Thanks @Dmitry Bychenko, that is the best approach. But I thought about what the solution could be from the OP's mindset (they might be a beginner), so here is my code as another solution.
It is lengthy code. I don't like it myself, but I am still presenting it.
class Program
{
static void Main(string[] args)
{
var sentence = "dancing sentence large also";
string newString = string.Empty;
StringBuilder newStringdata = new StringBuilder();
string[] arr = sentence.Split(' ');
for (int i=0; i< arr.Length;i++)
{
if (i==0)
{
newString = ReturnEvenModifiedString(arr[i]);
newStringdata.Append(newString);
}
else
{
if(char.IsUpper(newString[newString.Length - 1]))
{
newString = ReturnOddModifiedString(arr[i]);
newStringdata.Append(" ");
newStringdata.Append(newString);
}
else
{
newString = ReturnEvenModifiedString(arr[i]);
newStringdata.Append(" ");
newStringdata.Append(newString);
}
}
}
Console.WriteLine(newStringdata.ToString());
Console.Read();
}
//For Even Test
private static string ReturnEvenModifiedString(string initialString)
{
string newString = string.Empty;
var temparr = initialString.ToCharArray();
for (var i = 0; i < temparr.Length; i++)
{
if (temparr[i] != ' ')
{
if (i % 2 == 0 && temparr[i] != ' ')
{
newString += temparr[i].ToString().ToUpper();
}
else
{
newString += temparr[i].ToString().ToLower();
}
}
}
return newString;
}
//For Odd Test
private static string ReturnOddModifiedString(string initialString)
{
string newString = string.Empty;
var temparr = initialString.ToCharArray();
for (var i = 0; i < temparr.Length; i++)
{
if (temparr[i] != ' ')
{
if (i % 2 != 0 && temparr[i] != ' ')
{
newString += temparr[i].ToString().ToUpper();
}
else
{
newString += temparr[i].ToString().ToLower();
}
}
}
return newString;
}
}
OUTPUT
DaNcInG sEnTeNcE lArGe AlSo

Where can I find a _simple_, easy to understand implementation of an LR(1) parser generator?

Where can I find a simple (as much as possible, but no simpler!) implementation of an LR(1) parser generator?
I'm not looking for performance, just the ability to generate the LR(1) states (item sets).
C++, C#, Java, and Python would all work for me.
I've written a very simple one in C# and wanted to share it here.
It basically populates the action lookup table, which tells you which state to shift to or which rule to use for reduction.
If the number is nonnegative, then it denotes a new state; if it's negative, then its one's complement (i.e. ~x) denotes the rule index.
All you need now is to make a lexer and to do the actual parsing with the action table.
Note 1: It might be quite slow at generating the states for a real-world grammar, so you might want to think twice before using it in production code!
Note 2: You might want to double-check its correctness, since I've only checked it a bit.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using size_t = System.UInt32;
public class LRParser
{
private string[] symbols; // index => symbol
private IDictionary<string, size_t> interned = new SortedDictionary<string, size_t>(); // symbol => index
private int[/*state*/, /*lookahead*/] actions; // If >= 0, represents new state after shift. If < 0, represents one's complement (i.e. ~x) of reduction rule.
public LRParser(params KeyValuePair<string, string[]>[] grammar)
{
this.interned.Add(string.Empty, new size_t());
foreach (var rule in grammar)
{
if (!this.interned.ContainsKey(rule.Key))
{ this.interned.Add(rule.Key, (size_t)this.interned.Count); }
foreach (var symbol in rule.Value)
{
if (!this.interned.ContainsKey(symbol))
{ this.interned.Add(symbol, (size_t)this.interned.Count); }
}
}
this.symbols = this.interned.ToArray().OrderBy(p => p.Value).Select(p => p.Key).ToArray();
var syntax = Array.ConvertAll(grammar, r => new KeyValuePair<size_t, size_t[]>(this.interned[r.Key], Array.ConvertAll(r.Value, s => this.interned[s])));
var nonterminals = Array.ConvertAll(this.symbols, s => new List<size_t>());
for (size_t i = 0; i < syntax.Length; i++) { nonterminals[syntax[i].Key].Add(i); }
var firsts = Array.ConvertAll(Enumerable.Range(0, this.symbols.Length).ToArray(), s => nonterminals[s].Count > 0 ? new HashSet<size_t>() : new HashSet<size_t>() { (size_t)s });
int old;
do
{
old = firsts.Select(l => l.Count).Sum();
foreach (var rule in syntax)
{
foreach (var i in First(rule.Value, firsts))
{ firsts[rule.Key].Add(i); }
}
} while (old < firsts.Select(l => l.Count).Sum());
var actions = new Dictionary<int, IDictionary<size_t, IList<int>>>();
var states = new Dictionary<HashSet<Item>, int>(HashSet<Item>.CreateSetComparer());
var todo = new Stack<HashSet<Item>>();
var root = new Item(0, 0, new size_t());
todo.Push(new HashSet<Item>());
Closure(root, todo.Peek(), firsts, syntax, nonterminals);
states.Add(new HashSet<Item>(todo.Peek()), states.Count);
while (todo.Count > 0)
{
var set = todo.Pop();
var closure = new HashSet<Item>();
foreach (var item in set)
{ Closure(item, closure, firsts, syntax, nonterminals); }
var grouped = Array.ConvertAll(this.symbols, _ => new HashSet<Item>());
foreach (var item in closure)
{
if (item.Symbol >= syntax[item.Rule].Value.Length)
{
IDictionary<size_t, IList<int>> map;
if (!actions.TryGetValue(states[set], out map))
{ actions[states[set]] = map = new Dictionary<size_t, IList<int>>(); }
IList<int> list;
if (!map.TryGetValue(item.Lookahead, out list))
{ map[item.Lookahead] = list = new List<int>(); }
list.Add(~(int)item.Rule);
continue;
}
var next = item;
next.Symbol++;
grouped[syntax[item.Rule].Value[item.Symbol]].Add(next);
}
for (size_t symbol = 0; symbol < grouped.Length; symbol++)
{
var g = new HashSet<Item>();
foreach (var item in grouped[symbol])
{ Closure(item, g, firsts, syntax, nonterminals); }
if (g.Count > 0)
{
int state;
if (!states.TryGetValue(g, out state))
{
state = states.Count;
states.Add(g, state);
todo.Push(g);
}
IDictionary<size_t, IList<int>> map;
if (!actions.TryGetValue(states[set], out map))
{ actions[states[set]] = map = new Dictionary<size_t, IList<int>>(); }
IList<int> list;
if (!map.TryGetValue(symbol, out list))
{ map[symbol] = list = new List<int>(); }
list.Add(state);
}
}
}
this.actions = new int[states.Count, this.symbols.Length];
for (int i = 0; i < this.actions.GetLength(0); i++)
{
for (int j = 0; j < this.actions.GetLength(1); j++)
{ this.actions[i, j] = int.MinValue; }
}
foreach (var p in actions)
{
foreach (var q in p.Value)
{ this.actions[p.Key, q.Key] = q.Value.Single(); }
}
foreach (var state in states.OrderBy(p => p.Value))
{
Console.WriteLine("State {0}:", state.Value);
foreach (var item in state.Key.OrderBy(i => i.Rule).ThenBy(i => i.Symbol).ThenBy(i => i.Lookahead))
{
Console.WriteLine(
"\t{0}: {1} \xB7 {2} | {3} → {4}",
this.symbols[syntax[item.Rule].Key],
string.Join(" ", syntax[item.Rule].Value.Take((int)item.Symbol).Select(s => this.symbols[s]).ToArray()),
string.Join(" ", syntax[item.Rule].Value.Skip((int)item.Symbol).Select(s => this.symbols[s]).ToArray()),
this.symbols[item.Lookahead] == string.Empty ? "\x04" : this.symbols[item.Lookahead],
string.Join(
", ",
Array.ConvertAll(
actions[state.Value][item.Symbol < syntax[item.Rule].Value.Length ? syntax[item.Rule].Value[item.Symbol] : item.Lookahead].ToArray(),
a => a >= 0 ? string.Format("state {0}", a) : string.Format("{0} (rule {1})", this.symbols[syntax[~a].Key], ~a))));
}
Console.WriteLine();
}
}
private static void Closure(Item item, HashSet<Item> closure /*output*/, HashSet<size_t>[] firsts, KeyValuePair<size_t, size_t[]>[] syntax, IList<size_t>[] nonterminals)
{
if (closure.Add(item) && item.Symbol < syntax[item.Rule].Value.Length)
{
foreach (var r in nonterminals[syntax[item.Rule].Value[item.Symbol]])
{
foreach (var i in First(syntax[item.Rule].Value.Skip((int)(item.Symbol + 1)), firsts))
{ Closure(new Item(r, 0, i == new size_t() ? item.Lookahead : i), closure, firsts, syntax, nonterminals); }
}
}
}
private struct Item : IEquatable<Item>
{
public size_t Rule;
public size_t Symbol;
public size_t Lookahead;
public Item(size_t rule, size_t symbol, size_t lookahead)
{
this.Rule = rule;
this.Symbol = symbol;
this.Lookahead = lookahead;
}
public override bool Equals(object obj) { return obj is Item && this.Equals((Item)obj); }
public bool Equals(Item other)
{ return this.Rule == other.Rule && this.Symbol == other.Symbol && this.Lookahead == other.Lookahead; }
public override int GetHashCode()
{ return this.Rule.GetHashCode() ^ this.Symbol.GetHashCode() ^ this.Lookahead.GetHashCode(); }
}
private static IEnumerable<size_t> First(IEnumerable<size_t> symbols, IEnumerable<size_t>[] map)
{
foreach (var symbol in symbols)
{
bool epsilon = false;
foreach (var s in map[symbol])
{
if (s == new size_t()) { epsilon = true; }
else { yield return s; }
}
if (!epsilon) { yield break; }
}
yield return new size_t();
}
private static KeyValuePair<K, V> MakePair<K, V>(K k, V v) { return new KeyValuePair<K, V>(k, v); }
private static void Main(string[] args)
{
var sw = Stopwatch.StartNew();
var parser = new LRParser(
MakePair("start", new string[] { "exps" }),
MakePair("exps", new string[] { "exps", "exp" }),
MakePair("exps", new string[] { }),
MakePair("exp", new string[] { "INTEGER" })
);
Console.WriteLine(sw.ElapsedMilliseconds);
}
}
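The action encoding described in this answer (a nonnegative entry is the state to shift to, a negative entry is the one's complement of a rule index) can be illustrated with a small sketch. The helper names below are hypothetical and not part of LRParser; int.MinValue mirrors the value the constructor uses to fill empty table cells.

```csharp
using System;

public static class ActionEncoding
{
    // Same "no action" marker the constructor writes into unused cells.
    public const int NoAction = int.MinValue;

    public static int EncodeShift(int state) => state;          // stored as-is (>= 0)
    public static int EncodeReduce(int ruleIndex) => ~ruleIndex; // ~0 == -1, ~1 == -2, ...

    public static string Describe(int action)
    {
        if (action == NoAction) return "error";
        return action >= 0
            ? $"shift to state {action}"
            : $"reduce by rule {~action}"; // applying ~ again recovers the rule index
    }
}
```

For example, ActionEncoding.Describe(ActionEncoding.EncodeReduce(2)) gives "reduce by rule 2", and the stored value is -3. Using ~x rather than -x keeps rule 0 distinguishable from state 0.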
LRSTAR 9.1 is a minimal LR(1) and LR(*) parser generator. You can use it to verify that your parser generator is giving the correct states, by using option '/s'. LRSTAR has been tested against HYACC and found to be giving the correct LR(1) states. 20 grammars are provided with LRSTAR and 6 Microsoft Visual Studio projects.

Create Space Between Capital Letters and Skip Space Between Consecutive

I found a way to add a space so that "ThisCourse" becomes "This Course":
Add Space Before Capital Letter By (EtienneT) LINQ Statement
But I cannot turn "ThisCourseID" into "This Course ID", without a space inside "ID".
Is there a way to do this in LINQ?
Well, if it has to be a single linq statement...
var s = "ThisCourseIDMoreXYeahY";
s = string.Join(
string.Empty,
s.Select((x,i) => (
char.IsUpper(x) && i>0 &&
( char.IsLower(s[i-1]) || (i<s.Count()-1 && char.IsLower(s[i+1])) )
) ? " " + x : x.ToString()));
Console.WriteLine(s);
Output: "This Course ID More X Yeah Y"
var s = "ThisCourseID";
for (var i = 1; i < s.Length; i++)
{
if (char.IsLower(s[i - 1]) && char.IsUpper(s[i]))
{
s = s.Insert(i, " ");
}
}
Console.WriteLine(s); // "This Course ID"
You can improve this using StringBuilder if you are going to use this on very long strings, but for your purpose, as you presented it, it should work just fine.
FIX:
var s = "ThisCourseIDSomething";
for (var i = 1; i < s.Length - 1; i++)
{
if (char.IsLower(s[i - 1]) && char.IsUpper(s[i]) ||
s[i - 1] != ' ' && char.IsUpper(s[i]) && char.IsLower(s[i + 1]))
{
s = s.Insert(i, " ");
}
}
Console.WriteLine(s); // This Course ID Something
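The StringBuilder suggestion mentioned in this answer could look roughly like the sketch below. The names are made up for the example. It reads from the unmodified input instead of inserting into the string being scanned, and its second rule (a capital that follows a capital and precedes a lowercase letter) closely follows, but is not identical to, the condition in the fix.

```csharp
using System;
using System.Text;

public static class CaseSplitter
{
    public static string InsertSpaces(string s)
    {
        if (string.IsNullOrEmpty(s)) return s;

        var sb = new StringBuilder(s.Length * 2);
        sb.Append(s[0]);
        for (int i = 1; i < s.Length; i++)
        {
            // "sC" in "ThisCourse": a word starts after a lowercase letter
            bool lowerToUpper = char.IsLower(s[i - 1]) && char.IsUpper(s[i]);
            // "DS" + "o" in "IDSomething": the last capital of a run starts a new word
            bool endOfAcronym = char.IsUpper(s[i - 1]) && char.IsUpper(s[i])
                                && i + 1 < s.Length && char.IsLower(s[i + 1]);
            if (lowerToUpper || endOfAcronym) sb.Append(' ');
            sb.Append(s[i]);
        }
        return sb.ToString();
    }
}
```

CaseSplitter.InsertSpaces("ThisCourseIDSomething") returns "This Course ID Something". Each character is appended once, so the work stays linear in the input length, whereas repeated string.Insert calls copy the string every time.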
You don't need LINQ - but you could 'enumerate' and use lambda to make it more generic...
(though not sure if any of this makes sense)
static IEnumerable<string> Split(this string text, Func<char?, char?, char, int?> shouldSplit)
{
StringBuilder output = new StringBuilder();
char? before = null;
char? before2nd = null;
foreach (var c in text)
{
var where = shouldSplit(before2nd, before, c);
if (where != null)
{
var str = output.ToString();
switch(where)
{
case -1:
output.Remove(0, str.Length -1);
yield return str.Substring(0, str.Length - 1);
break;
case 0: default:
output.Clear();
yield return str;
break;
}
}
output.Append(c);
before2nd = before;
before = c;
}
yield return output.ToString();
}
...and call it like this e.g. ...
static IEnumerable<string> SplitLines(this string text)
{
return text.Split((before2nd, before, now) =>
{
if ((before2nd ?? 'A') == '\r' && (before ?? 'A') == '\n') return 0; // split on 'now'
return null; // don't split
});
}
static IEnumerable<string> SplitOnCase(this string text)
{
return text.Split((before2nd, before, now) =>
{
if (char.IsLower(before ?? 'A') && char.IsUpper(now)) return 0; // split on 'now'
if (char.IsUpper(before2nd ?? 'a') && char.IsUpper(before ?? 'a') && char.IsLower(now)) return -1; // split one char before
return null; // don't split
});
}
...and somewhere...
var text = "ToSplitOrNotToSplitTHEQuestionIsNow";
var words = text.SplitOnCase();
foreach (var word in words)
Console.WriteLine(word);
text = "To\r\nSplit\r\nOr\r\nNot\r\nTo\r\nSplit\r\nTHE\r\nQuestion\r\nIs\r\nNow";
words = text.SplitLines();
foreach (var word in words)
Console.WriteLine(word);
:)
