C# StringBuilder: Check if it ends with a new line

I have a StringBuilder that accumulates code. In some cases, it has 2 empty lines between code blocks, and I'd like to make that 1 empty line.
How can I check if the current code already has an empty line at the end? (I prefer not to use its ToString() method because of performance issues.)

You can access any character of your StringBuilder with its index, like you would with a String.
var sb = new StringBuilder();
sb.Append("Hello world!\n");
Console.WriteLine(sb[sb.Length - 1] == '\n'); // True
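
Building on the indexer, here is a minimal sketch of a check for a trailing empty line that also copes with an empty builder and with Windows-style "\r\n" endings. The names BuilderChecks, TrailingNewLineCount and EndsWithBlankLine are made up for this illustration, and it assumes that a lone '\n' and a "\r\n" pair should both count as one line terminator.

```csharp
using System.Text;

public static class BuilderChecks
{
    // Counts the line terminators at the very end of the builder.
    // "\r\n" is counted as a single terminator; a lone '\n' also counts.
    public static int TrailingNewLineCount(StringBuilder sb)
    {
        int count = 0;
        int i = sb.Length - 1;
        while (i >= 0 && sb[i] == '\n')
        {
            count++;
            i--;
            // Skip the '\r' of a Windows-style "\r\n" pair.
            if (i >= 0 && sb[i] == '\r')
            {
                i--;
            }
        }
        return count;
    }

    // Two or more trailing terminators mean the text ends with an empty line.
    public static bool EndsWithBlankLine(StringBuilder sb)
    {
        return TrailingNewLineCount(sb) >= 2;
    }
}
```

Only the last few characters are read, so ToString() is never called.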

You can normalize the newlines, using a regex:
var test = @"hello
moop
hello";
var regex = new Regex(@"(?:\r\n|[\r\n])+");
var newLinesNormalized = regex.Replace(test, Environment.NewLine);
output:
hello
moop
hello

Single line check. Uses a string type, not StringBuilder, but you should get the basic idea.
if (theString.Substring(theString.Length - Environment.NewLine.Length, Environment.NewLine.Length).Contains(Environment.NewLine))
{
//theString does end with a NewLine
}
else
{
//theString does NOT end with a NewLine
}

Here is the complete example.
class string_builder
{
string previousvalue = null;
StringBuilder sB;
public string_builder()
{
sB = new StringBuilder();
}
public void AppendToStringBuilder(string new_val)
{
// check for null/empty first; previousvalue is null on the first call
if (!String.IsNullOrEmpty(previousvalue) && previousvalue.EndsWith("\n"))
{
sB.Append(new_val);
}
else
{
sB.AppendLine(new_val);
}
previousvalue = new_val;
}
}
class Program
{
public static void Main(string[] args)
{
string_builder sb = new string_builder();
sb.AppendToStringBuilder("this is line1\n");
sb.AppendToStringBuilder("this is line2");
sb.AppendToStringBuilder("\nthis is line3\n");
}
}

I've got a 'funny' answer:
var sb = new StringBuilder();
sb.AppendLine("test");
sb.AppendLine("test2");
Console.WriteLine(sb.ToString().TrimEnd('\n').Length != sb.ToString().Length); //true

Since I don't care about 2 empty lines in the middle of the code, the simplest way is to use
myCode.Replace(string.Format("{0}{0}", Environment.NewLine),Environment.NewLine);
This option doesn't require any changes to classes that use the code accumulator.
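
A rough sketch of a variant that leaves single empty lines alone and only shrinks longer runs. It assumes the code lives in a StringBuilder (whose Replace edits in place) and that newLine is a non-empty string; BlankLineCollapser and CollapseBlankLines are names invented for this example.

```csharp
using System.Text;

public static class BlankLineCollapser
{
    // Collapses runs of empty lines so that at most one empty line remains.
    // newLine is passed in (for example Environment.NewLine) and must not be empty.
    public static StringBuilder CollapseBlankLines(StringBuilder sb, string newLine)
    {
        string three = newLine + newLine + newLine; // two empty lines in a row
        string two = newLine + newLine;             // a single empty line
        int before;
        do
        {
            before = sb.Length;
            sb.Replace(three, two); // StringBuilder.Replace modifies the builder itself
        } while (sb.Length != before); // repeat until a pass changes nothing
        return sb;
    }
}
```

Each pass removes at least one newline from every long run, so the loop stops once no run of three newlines is left.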

In case anyone ends up here like I did, here is a general method to check the end of a StringBuilder for an arbitrary string without having to use ToString on it.
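// extension methods must be declared inside a static class
public static bool EndsWith(this StringBuilder haystack, string needle)
{
// index of the last character of each sequence
var needleLength = needle.Length - 1;
var haystackLength = haystack.Length - 1;
if (haystackLength < needleLength)
{
return false;
}
// compare backwards; <= so the first character of needle is checked too
for (int i = 0; i <= needleLength; i++)
{
if (haystack[haystackLength - i] != needle[needleLength - i])
{
return false;
}
}
return true;
}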
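
For the original goal (exactly one empty line between code blocks), another option is to skip the check and normalize the tail before appending the next block. A small sketch under that assumption; BuilderTrim, TrimEndNewLines and EnsureSingleBlankLine are hypothetical names, not part of StringBuilder.

```csharp
using System.Text;

public static class BuilderTrim
{
    // Removes every trailing '\r' and '\n' by shrinking Length; no string is allocated.
    public static StringBuilder TrimEndNewLines(StringBuilder sb)
    {
        int end = sb.Length;
        while (end > 0 && (sb[end - 1] == '\n' || sb[end - 1] == '\r'))
        {
            end--;
        }
        sb.Length = end; // setting Length to a smaller value truncates the builder
        return sb;
    }

    // Leaves the builder ending with exactly one empty line.
    public static StringBuilder EnsureSingleBlankLine(StringBuilder sb, string newLine)
    {
        TrimEndNewLines(sb);
        return sb.Append(newLine).Append(newLine);
    }
}
```

Calling EnsureSingleBlankLine(sb, Environment.NewLine) before each new block gives the same result whether the previous block ended with zero, one or several newlines.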

Related

Deleting substrings "dynamically" from a line in C#

I have a file with some lines in the text file like this
This is a test value with {MyTestValue = 0.34} How do I delete the test value? My line also has {MySecondTestValue = 0.35}
The value of MyTestValue is not the same value in each line.
Is there a way to determine the number of chars up to the closing brace and delete everything within the braces? So my output would be something like:
This is a test value with {} How do I delete the test value? My line also has {MySecondTestValue = 0.35}
Possible implementation via regular expressions:
String source = "This is a test value with {MyTestValue = 0.34} How do I delete the test value?";
// lazy .*? stops at the first '}'; a greedy .* would run to the last '}' on the line
String result = Regex.Replace(source, "{.*?}", (MatchEvaluator) ((match) => "{}"));
string line="This is a test value with {MyTestValue = 0.34} How do I delete the test value?";
int index1=line.IndexOf('{');
int index2=line.IndexOf('}');
// remove only the text between the braces, leaving "{}"
line=line.Remove(index1+1,index2-index1-1);
Try this
string output = Regex.Replace(input, @"{MyTestValue = [0-9.]+}", "{}");
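
If the key to clear varies, the same idea can be parameterized. This is only a sketch: BraceCleaner and ClearValue are invented names, and it assumes each group has the shape {Key = value} with no nested braces.

```csharp
using System.Text.RegularExpressions;

public static class BraceCleaner
{
    // Empties the braces that hold the given key, e.g. "{MyTestValue = 0.34}" -> "{}".
    // Brace groups with other keys are left untouched.
    public static string ClearValue(string line, string key)
    {
        // Regex.Escape guards against keys that contain regex metacharacters.
        // [^}]* stops at the first closing brace, so a match cannot run into a later group.
        string pattern = @"\{" + Regex.Escape(key) + @"\s*=[^}]*\}";
        return Regex.Replace(line, pattern, "{}");
    }
}
```

With the line from the question, ClearValue(line, "MyTestValue") empties the first group and keeps {MySecondTestValue = 0.35} as it is.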
StringBuilder is an efficient way to build strings piece by piece. You can create a custom method that works with it:
static string[] ClearValues(string[] dirtyLines, string[] ignoreValuesList)
{
string[] result = new string[dirtyLines.Length];
bool ignore = false; StringBuilder s = new StringBuilder();
StringBuilder s2 = new StringBuilder();
for (int i = 0; i < dirtyLines.Length; i++)
{
for (int i2 = 0; i2 < dirtyLines[i].Length; i2++)
{
if (dirtyLines[i][i2] == '{') { s2.Clear(); s.Append(dirtyLines[i][i2]); ignore = true; continue; }
if (dirtyLines[i][i2] == '}') { if(ignoreValuesList.Contains(s2.ToString())) s.Append(s2.ToString()); s.Append(dirtyLines[i][i2]); ignore = false; continue; }
if (!ignore) { s.Append(dirtyLines[i][i2]); } else { s2.Append(dirtyLines[i][i2]); }
}
result[i] = s.ToString();
s.Clear();
}
return result;
}
Example of usage :
static void Main()
{
string[] dirtyLines =
{
"This is a test value with {MyTestValue = 0.34} How do I delete the test value?",
"This is {SomeOther = 11} How do I delete the test value?",
"{X = 134} How do {Y = 500} I delete the {Z = 400}test value?",
};
Stopwatch s = new Stopwatch();
s.Start();
string[] clean = ClearValues(dirtyLines, new[] { "Y = 500", "Z = 400" });
s.Stop();
for (int i = 0; i < clean.Length; i++)
{
Console.WriteLine(clean[i]);
}
Console.WriteLine("\nIt took {0} ms and {1} CPU ticks for method to execute", s.ElapsedMilliseconds, s.ElapsedTicks);
Console.ReadKey();
}
Output:

How can I write this Java code in C#

I have an array of integers called digits
public String toDecimalString() {
StringBuilder b = new StringBuilder(9 * digits.length);
Formatter f = new Formatter(b);
f.format("%d", digits[0]);
for(int i = 1 ; i < digits.length; i++) {
f.format("%09d", digits[i]);
}
return b.toString();
}
I tried
String.Format("%09d", digits[i]);
but I think I'm doing something wrong
I'm not really familiar with java formatters, but I think this is what you want
var str = string.Format("{0:D9}", digits[i]);
Or even better
var str = digits[i].ToString("D9");
To join all these strings I suggest this:
var str = string.Join(string.Empty, digits.Select(d => d.ToString("D9")));
Further Reading
Standard Numeric Format Strings
Custom Numeric Format Strings
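
A short sketch of how the format strings above combine into the whole method. The Java loop leaves the first chunk unpadded, whereas a Join over every chunk with "D9" would pad it too, so the first element is handled separately here. DecimalChunks is a made-up class name, and the sketch assumes digits is non-empty and holds non-negative values.

```csharp
using System.Linq;

public static class DecimalChunks
{
    // Mirrors the Java loop: first chunk unpadded ("D"),
    // every following chunk zero-padded to 9 digits ("D9").
    public static string ToDecimalString(int[] digits)
    {
        return digits[0].ToString("D")
             + string.Concat(digits.Skip(1).Select(d => d.ToString("D9")));
    }
}
```

For example, ToDecimalString(new[] { 12, 345, 6 }) gives "12000000345000000006".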
I think you want something like
StringBuilder sb = new StringBuilder();
// first chunk without padding
sb.Append(String.Format("{0:D}", digits[0]));
for (int i = 1; i < digits.Length; i++) {
// remaining chunks zero-padded to 9 digits
sb.Append(String.Format("{0:D9}", digits[i]));
}
Copy the Java code and paste it directly into your C# code, then make these changes in your toDecimalString() method:
f.format to f.Format
digits.length to digits.Length
b.toString() to b.ToString()
and then paste this class to your code:
public partial class Formatter: IFormatProvider, ICustomFormatter {
public String Format(String format, object arg, IFormatProvider formatProvider=null) {
if(!format.StartsWith("%")||!format.EndsWith("d"))
throw new NotImplementedException();
m_Builder.Append(String.Format("{0:D"+format.Substring(1, format.Length-2)+"}", arg));
return m_Builder.ToString();
}
object IFormatProvider.GetFormat(Type formatType) {
return typeof(ICustomFormatter)!=formatType?null:this;
}
public Formatter(StringBuilder b) {
this.m_Builder=b;
}
StringBuilder m_Builder;
}
Note that the class implements only the minimum your question requires; you would need to add code if you extend the requirements further.
public string toDecimalString()
{
StringBuilder b = new StringBuilder(9 * digits.Length);
var str = digits[0].ToString("D");
b.Append(str);
for (int i = 1; i < digits.Length; i++)
{
var str2 = digits[i].ToString("D9");
b.Append(str2);
}
return b.ToString();
}
Thanks for all the answers, I finally reached a solution as above

How to insert/remove hyphen to/from a plain string in c#?

I have a string like this;
string text = "6A7FEBFCCC51268FBFF";
And I have one method for which I want to insert the logic for appending the hyphen after 4 characters to 'text' variable. So, the output should be like this;
6A7F-EBFC-CC51-268F-BFF
Appending hyphen to above 'text' variable logic should be inside this method;
public void GetResultsWithHyphen
{
// append hyphen after 4 characters logic goes here
}
And I want also remove the hyphen from a given string such as 6A7F-EBFC-CC51-268F-BFF. So, removing hyphen from a string logic should be inside this method;
public void GetResultsWithOutHyphen
{
// Removing hyphen after 4 characters logic goes here
}
How can I do this in C# (for desktop app)?
What is the best way to do this?
Appreciate everyone's answer in advance.
GetResultsWithOutHyphen is easy (and should return a string instead of void):
public string GetResultsWithOutHyphen(string input)
{
// Removing hyphen after 4 characters logic goes here
return input.Replace("-", "");
}
for GetResultsWithHyphen, there may be slicker ways to do it, but here's one way:
public string GetResultsWithHyphen(string input)
{
// append hyphen after 4 characters logic goes here
string output = "";
int start = 0;
while (start < input.Length)
{
output += input.Substring(start, Math.Min(4,input.Length - start)) + "-";
start += 4;
}
// remove the trailing dash
return output.Trim('-');
}
Use regex:
public String GetResultsWithHyphen(String inputString)
{
return Regex.Replace(inputString, @"(\w{4})(\w{4})(\w{4})(\w{4})(\w{3})",
@"$1-$2-$3-$4-$5");
}
and for removal:
public String GetResultsWithOutHyphen(String inputString)
{
return inputString.Replace("-", "");
}
Here's the shortest regex I could come up with. It will work on strings of any length. Note that the \B token will prevent it from matching at the end of a string, so you don't have to trim off an extra hyphen as with some answers above.
using System;
using System.Text.RegularExpressions;
namespace ConsoleApplication2
{
class Program
{
static void Main(string[] args)
{
string text = "6A7FEBFCCC51268FBFF";
for (int i = 0; i <= text.Length;i++ )
Console.WriteLine(hyphenate(text.Substring(0, i)));
}
static string hyphenate(string s)
{
var re = new Regex(@"(\w{4}\B)");
return re.Replace (s, "$1-");
}
static string dehyphenate (string s)
{
return s.Replace("-", "");
}
}
}
// requires using System.Linq; the SelectMany lambda receives (element, index)
var hyphenText = new string(
text
.SelectMany((ch, i) => i%4 == 3 && i != text.Length-1 ? new[]{ch, '-'} : new[]{ch})
.ToArray()
);
something along the lines of:
public string GetResultsWithHyphen(string inText)
{
var counter = 0;
var outString = string.Empty;
while (counter < inText.Length)
{
// skip position 0 so the result does not start with a hyphen
if (counter > 0 && counter % 4 == 0)
outString = string.Format("{0}-{1}", outString, inText.Substring(counter, 1));
else
outString += inText.Substring(counter, 1);
counter++;
}
return outString;
}
This is rough code and may not be perfectly, syntactically correct
public static string GetResultsWithHyphen(string str) {
return Regex.Replace(str, "(.{4})", "$1-");
//if you don't want trailing -
//return Regex.Replace(str, "(.{4})(?!$)", "$1-");
}
public static string GetResultsWithOutHyphen(string str) {
//if you just want to remove the hyphens:
//return str.Replace("-", "");
//if you REALLY want to remove hyphens only if they occur after 4 places:
return Regex.Replace(str, "(.{4})-", "$1");
}
For removing:
String textHyphenRemoved=text.Replace("-",""); // removes all of the hyphens
for adding
StringBuilder strBuilder = new StringBuilder();
for (int startPos = 0; startPos < text.Length; startPos += 4)
{
//if it isn't the start of the string add a hyphen
if (startPos > 0)
strBuilder.Append("-");
//add the next group; the last one may be shorter than 4
strBuilder.Append(text.Substring(startPos, Math.Min(4, text.Length - startPos)));
}
string textWithHyphens = strBuilder.ToString();
Do note that my adding code is untested.
GetResultsWithOutHyphen method
public string GetResultsWithOutHyphen(string input)
{
return input.Replace("-", "");
}
GetResultsWithHyphen method
You could pass a variable instead of four for flexibility.
public string GetResultsWithHyphen(string input)
{
string output = "";
int start = 0;
while (start < input.Length)
{
char bla = input[start];
output += bla;
start += 1;
// no hyphen after the final group
if (start % 4 == 0 && start < input.Length)
{
output += "-";
}
}
return output;
}
This worked for me when I had a value for a social security number (123456789) and needed it to display as (123-45-6789) in a listbox.
ListBox1.Items.Add("SS Number : " & vbTab & Format(SSNArray(i), "###-##-####"))
In this case I had an array of Social Security Numbers. This line of code alters the formatting to put a hyphen in.
Caller
public static void Main()
{
var text = new Text("THISisJUSTanEXAMPLEtext");
var convertText = text.Convert();
Console.WriteLine(convertText);
}
Callee
public class Text
{
private string _text;
private int _jumpNo = 4;
public Text(string text)
{
_text = text;
}
public Text(string text, int jumpNo)
{
_text = text;
_jumpNo = jumpNo < 1 ? _jumpNo : jumpNo;
}
public string Convert()
{
if (string.IsNullOrEmpty(_text))
{
return string.Empty;
}
if (_text.Length < _jumpNo)
{
return _text;
}
var convertText = _text.Substring(0, _jumpNo);
int start = _jumpNo;
while (start < _text.Length)
{
convertText += "-" + _text.Substring(start, Math.Min(_jumpNo, _text.Length - start));
start += _jumpNo;
}
return convertText;
}
}

Regular Expression - Is this possible?

Rather than describing what I want (it's difficult to explain), Let me provide an example of what I need to accomplish in C# using a regular expression:
"HelloWorld" should be transformed to "Hello World"
"HelloWORld" should be transformed to "Hello WO Rld" //Two consecutive letters in capital should be treatead as one word
"helloworld" should be transformed to "helloworld"
EDIT:
"HellOWORLd" should be transformed to "Hell OW OR Ld"
Every 2-consecutive capital letters should be considered one word.
Is this possible?
This is fully working C# code, not just the regex:
Console.WriteLine(
Regex.Replace(
"HelloWORld",
"(?<!^)(?<wordstart>[A-Z]{1,2})",
" ${wordstart}", RegexOptions.Compiled));
And it prints:
Hello WO Rld
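
To check the pattern against all four examples from the question, it can be wrapped in a small helper. This is just a sketch; WordSplitter and Split are names chosen for this illustration.

```csharp
using System.Text.RegularExpressions;

public static class WordSplitter
{
    // Inserts a space before each run of one or two capitals,
    // except when the run sits at the very start of the string.
    public static string Split(string input)
    {
        return Regex.Replace(input, "(?<!^)(?<wordstart>[A-Z]{1,2})", " ${wordstart}");
    }
}
```

Split("HelloWorld") gives "Hello World", Split("helloworld") is unchanged, and Split("HellOWORLd") gives "Hell OW OR Ld". Inputs that start with two capitals are not covered by the question's examples and would need a separate look.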
Update
To make this more Unicode/international aware, consider replacing [A-Z] with \p{Lu} (a Unicode code point that represents an uppercase letter). The result for the current input would be the same. So here is a slightly more compelling example:
Console.WriteLine(Regex.Replace(
@"ÉclaireürfØÑJßå",
@"(?<!^)(?<wordstart>\p{Lu}{1,2})",
@" ${wordstart}",
RegexOptions.Compiled));
The regular expression engine is not a transformative thing by nature, but rather a pattern matching (and replacing) engine. People often mistake the replace part of Regex, thinking that it can do more than it's designed to.
Back to your question, though... Regex cannot do what you want, instead, you should write your own parser to do this. With C#, if you're familiar with the language, this task is somewhat trivial.
It's a case of "You're using the wrong tool for the job".
Here are regular expressions that detect what you are looking for:
([A-Z]\w*?)[A-Z]
this matches any uppercase letter from A to Z once, followed by alphanumerics up to the next uppercase letter.
([A-Z]{2}\w*?)[A-Z]
this matches any uppercase letter from A to Z exactly 2 times.
Regex is a matching engine: you can parse the input string and use Regex.IsMatch to find candidate matches, then insert spaces into the output string.
string f(string input)
{
//'lowerUPPER' -> 'lower UPPER'
var x = Regex.Replace(input, "([a-z])([A-Z])","$1 $2");
//'UPPER' -> 'UP PE R'
return Regex.Replace(x, "([A-Z]{2})","$1 ");
}
class Program
{
static void Main(string[] args)
{
Print(Parse("HelloWorld"));
Print(Parse("HelloWORld"));
Print(Parse("helloworld"));
Print(Parse("HellOWORLd"));
Console.ReadLine();
}
static void Print(IEnumerable<string> input)
{
foreach (var s in input)
{
Console.Write(s);
Console.Write(' ');
}
Console.WriteLine();
}
static IEnumerable<string> Parse(string input)
{
var sb = new StringBuilder();
for (int i = 0; i < input.Length; i++)
{
if (!char.IsUpper(input[i]))
{
sb.Append(input[i]);
continue;
}
if (sb.Length > 0)
{
yield return sb.ToString();
sb.Clear();
}
sb.Append(input[i]);
// bounds check: the last character has no successor
if (i + 1 < input.Length && char.IsUpper(input[i + 1]))
{
sb.Append(input[++i]);
yield return sb.ToString();
sb.Clear();
}
}
if (sb.Length > 0)
{
yield return sb.ToString();
}
}
}
I don't think a regular expression is needed in this case.
Try this:
static void Main(string[] args)
{
var input = "HellOWORLd";
var i = 0;
var x = 4;
var len = input.Length;
var output = new List<string>();
while (x <= len)
{
output.Add(SubStr(input, i, x));
i = x;
x += 2;
}
var ret = output.ToArray(); //["Hell","OW", "OR", "Ld"]
Console.ReadLine();
}
static string SubStr(string str, int start, int end)
{
var len = str.Length;
if (start >= 0 && end <= len)
{
var ret = new StringBuilder();
for (int i = 0; i < len; i++)
{
if (i == start)
{
do
{
ret.Append(str[i]);
i++;
} while (i != end);
}
}
return ret.ToString();
}
return null;
}

Replace non-numeric with empty string

Quick add on requirement in our project. A field in our DB to hold a phone number is set to only allow 10 characters. So, if I get passed "(913)-444-5555" or anything else, is there a quick way to run a string through some kind of special replace function that I can pass it a set of characters to allow?
Regex?
Definitely regex:
string CleanPhone(string phone)
{
Regex digitsOnly = new Regex(@"[^\d]");
return digitsOnly.Replace(phone, "");
}
or within a class to avoid re-creating the regex all the time:
private static Regex digitsOnly = new Regex(@"[^\d]");
public static string CleanPhone(string phone)
{
return digitsOnly.Replace(phone, "");
}
Depending on your real-world inputs, you may want some additional logic there to do things like strip out leading 1's (for long distance) or anything trailing an x or X (for extensions).
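
A hedged sketch of that additional logic. PhoneCleaner and NormalizeUsPhone are hypothetical names, and the rules (cut everything from the first x or X, then drop a leading 1 only when 11 digits remain) are assumptions about US-style numbers rather than a complete phone parser.

```csharp
using System.Text.RegularExpressions;

public static class PhoneCleaner
{
    // Reduces input such as "1 (913) 444-5555 x123" to its ten digits.
    public static string NormalizeUsPhone(string phone)
    {
        if (phone == null) return null;

        // Drop an extension: everything from the first 'x' or 'X' onwards.
        int ext = phone.IndexOfAny(new[] { 'x', 'X' });
        if (ext >= 0)
        {
            phone = phone.Substring(0, ext);
        }

        // Keep digits only (\D matches any non-digit character).
        string digits = Regex.Replace(phone, @"\D", "");

        // Strip a long-distance prefix: a leading '1' on an 11-digit number.
        if (digits.Length == 11 && digits[0] == '1')
        {
            digits = digits.Substring(1);
        }
        return digits;
    }
}
```

The result can still be shorter or longer than 10 characters for unusual inputs, so a length check before writing to the DB field is still worthwhile.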
You can do it easily with regex:
string subject = "(913)-444-5555";
string result = Regex.Replace(subject, "[^0-9]", ""); // result = "9134445555"
You don't need to use Regex.
phone = new String(phone.Where(c => char.IsDigit(c)).ToArray())
Here's the extension method way of doing it.
public static class Extensions
{
public static string ToDigitsOnly(this string input)
{
Regex digitsOnly = new Regex(@"[^\d]");
return digitsOnly.Replace(input, "");
}
}
Using the Regex methods in .NET you should be able to match any non-digit character using \D, like so:
phoneNumber = Regex.Replace(phoneNumber, "\\D", String.Empty);
How about an extension method that doesn't use regex.
If you do stick to one of the Regex options at least use RegexOptions.Compiled in the static variable.
public static string ToDigitsOnly(this string input)
{
return new String(input.Where(char.IsDigit).ToArray());
}
This builds on Usman Zafar's answer converted to a method group.
For the best performance and lower memory consumption, try this:
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;
public class Program
{
private static Regex digitsOnly = new Regex(@"[^\d]");
public static void Main()
{
Console.WriteLine("Init...");
string phone = "001-12-34-56-78-90";
var sw = new Stopwatch();
sw.Start();
for (int i = 0; i < 1000000; i++)
{
DigitsOnly(phone);
}
sw.Stop();
Console.WriteLine("Time: " + sw.ElapsedMilliseconds);
var sw2 = new Stopwatch();
sw2.Start();
for (int i = 0; i < 1000000; i++)
{
DigitsOnlyRegex(phone);
}
sw2.Stop();
Console.WriteLine("Time: " + sw2.ElapsedMilliseconds);
Console.ReadLine();
}
public static string DigitsOnly(string phone, string replace = null)
{
if (replace == null) replace = "";
if (phone == null) return null;
var result = new StringBuilder(phone.Length);
foreach (char c in phone)
if (c >= '0' && c <= '9')
result.Append(c);
else
{
result.Append(replace);
}
return result.ToString();
}
public static string DigitsOnlyRegex(string phone)
{
return digitsOnly.Replace(phone, "");
}
}
The result on my computer is:
Init...
Time: 307
Time: 2178
I'm sure there's a more efficient way to do it, but I would probably do this:
string getTenDigitNumber(string input)
{
StringBuilder sb = new StringBuilder();
for(int i = 0; i < input.Length; i++)
{
int junk;
// TryParse takes a string and an out parameter
if(int.TryParse(input[i].ToString(), out junk))
sb.Append(input[i]);
}
}
return sb.ToString();
}
try this
public static string cleanPhone(string inVal)
{
char[] newPhon = new char[inVal.Length];
int i = 0;
foreach (char c in inVal)
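// >= and <= so that '0' and '9' themselves are kept
if (c.CompareTo('0') >= 0 && c.CompareTo('9') <= 0)
newPhon[i++] = c;
// build the string from the filled part only; char[].ToString() returns the type name
return new string(newPhon, 0, i);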
}
