How to convert a double into a floating-point string representation without scientific notation in the .NET Framework?
"Small" samples (effective numbers may be of any size, such as 1.5E200 or 1e-200) :
3248971234698200000000000000000000000000000000
0.00000000000000000000000000000000000023897356978234562
None of the standard number formats are like this, and a custom format also doesn't seem to allow having an open number of digits after the decimal separator.
This is not a duplicate of How to convert double to string without the power to 10 representation (E-05) because the answers given there do not solve the issue at hand. The accepted solution in this question was to use a fixed point (such as 20 digits), which is not what I want. A fixed point formatting and trimming the redundant 0 doesn't solve the issue either because the max width for fixed width is 99 characters.
Note: the solution has to deal correctly with custom number formats (e.g. other decimal separator, depending on culture information).
Edit: The question is really only about displaing aforementioned numbers. I'm aware of how floating point numbers work and what numbers can be used and computed with them.
For a general-purpose¹ solution you need to preserve 339 places:
doubleValue.ToString("0." + new string('#', 339))
The maximum number of non-zero decimal digits is 16. 15 are on the right side of the decimal point. The exponent can move those 15 digits a maximum of 324 places to the right. (See the range and precision.)
It works for double.Epsilon, double.MinValue, double.MaxValue, and anything in between.
The performance will be much greater than the regex/string manipulation solutions since all formatting and string work is done in one pass by unmanaged CLR code. Also, the code is much simpler to prove correct.
For ease of use and even better performance, make it a constant:
public static class FormatStrings
{
public const string DoubleFixedPoint = "0.###################################################################################################################################################################################################################################################################################################################################################";
}
¹ Update: I mistakenly said that this was also a lossless solution. In fact it is not, since ToString does its normal display rounding for all formats except r. Live example. Thanks, #Loathing! Please see Lothing’s answer if you need the ability to roundtrip in fixed point notation (i.e, if you’re using .ToString("r") today).
I had a similar problem and this worked for me:
doubleValue.ToString("F99").TrimEnd('0')
F99 may be overkill, but you get the idea.
This is a string parsing solution where the source number (double) is converted into a string and parsed into its constituent components. It is then reassembled by rules into the full-length numeric representation. It also accounts for locale as requested.
Update: The tests of the conversions only include single-digit whole numbers, which is the norm, but the algorithm also works for something like: 239483.340901e-20
using System;
using System.Text;
using System.Globalization;
using System.Threading;
public class MyClass
{
public static void Main()
{
Console.WriteLine(ToLongString(1.23e-2));
Console.WriteLine(ToLongString(1.234e-5)); // 0.00010234
Console.WriteLine(ToLongString(1.2345E-10)); // 0.00000001002345
Console.WriteLine(ToLongString(1.23456E-20)); // 0.00000000000000000100023456
Console.WriteLine(ToLongString(5E-20));
Console.WriteLine("");
Console.WriteLine(ToLongString(1.23E+2)); // 123
Console.WriteLine(ToLongString(1.234e5)); // 1023400
Console.WriteLine(ToLongString(1.2345E10)); // 1002345000000
Console.WriteLine(ToLongString(-7.576E-05)); // -0.00007576
Console.WriteLine(ToLongString(1.23456e20));
Console.WriteLine(ToLongString(5e+20));
Console.WriteLine("");
Console.WriteLine(ToLongString(9.1093822E-31)); // mass of an electron
Console.WriteLine(ToLongString(5.9736e24)); // mass of the earth
Console.ReadLine();
}
private static string ToLongString(double input)
{
string strOrig = input.ToString();
string str = strOrig.ToUpper();
// if string representation was collapsed from scientific notation, just return it:
if (!str.Contains("E")) return strOrig;
bool negativeNumber = false;
if (str[0] == '-')
{
str = str.Remove(0, 1);
negativeNumber = true;
}
string sep = Thread.CurrentThread.CurrentCulture.NumberFormat.NumberDecimalSeparator;
char decSeparator = sep.ToCharArray()[0];
string[] exponentParts = str.Split('E');
string[] decimalParts = exponentParts[0].Split(decSeparator);
// fix missing decimal point:
if (decimalParts.Length==1) decimalParts = new string[]{exponentParts[0],"0"};
int exponentValue = int.Parse(exponentParts[1]);
string newNumber = decimalParts[0] + decimalParts[1];
string result;
if (exponentValue > 0)
{
result =
newNumber +
GetZeros(exponentValue - decimalParts[1].Length);
}
else // negative exponent
{
result =
"0" +
decSeparator +
GetZeros(exponentValue + decimalParts[0].Length) +
newNumber;
result = result.TrimEnd('0');
}
if (negativeNumber)
result = "-" + result;
return result;
}
private static string GetZeros(int zeroCount)
{
if (zeroCount < 0)
zeroCount = Math.Abs(zeroCount);
StringBuilder sb = new StringBuilder();
for (int i = 0; i < zeroCount; i++) sb.Append("0");
return sb.ToString();
}
}
You could cast the double to decimal and then do ToString().
(0.000000005).ToString() // 5E-09
((decimal)(0.000000005)).ToString() // 0,000000005
I haven't done performance testing which is faster, casting from 64-bit double to 128-bit decimal or a format string of over 300 chars. Oh, and there might possibly be overflow errors during conversion, but if your values fit a decimal this should work fine.
Update: The casting seems to be a lot faster. Using a prepared format string as given in the other answer, formatting a million times takes 2.3 seconds and casting only 0.19 seconds. Repeatable. That's 10x faster. Now it's only about the value range.
This is what I've got so far, seems to work, but maybe someone has a better solution:
private static readonly Regex rxScientific = new Regex(#"^(?<sign>-?)(?<head>\d+)(\.(?<tail>\d*?)0*)?E(?<exponent>[+\-]\d+)$", RegexOptions.IgnoreCase|RegexOptions.ExplicitCapture|RegexOptions.CultureInvariant);
public static string ToFloatingPointString(double value) {
return ToFloatingPointString(value, NumberFormatInfo.CurrentInfo);
}
public static string ToFloatingPointString(double value, NumberFormatInfo formatInfo) {
string result = value.ToString("r", NumberFormatInfo.InvariantInfo);
Match match = rxScientific.Match(result);
if (match.Success) {
Debug.WriteLine("Found scientific format: {0} => [{1}] [{2}] [{3}] [{4}]", result, match.Groups["sign"], match.Groups["head"], match.Groups["tail"], match.Groups["exponent"]);
int exponent = int.Parse(match.Groups["exponent"].Value, NumberStyles.Integer, NumberFormatInfo.InvariantInfo);
StringBuilder builder = new StringBuilder(result.Length+Math.Abs(exponent));
builder.Append(match.Groups["sign"].Value);
if (exponent >= 0) {
builder.Append(match.Groups["head"].Value);
string tail = match.Groups["tail"].Value;
if (exponent < tail.Length) {
builder.Append(tail, 0, exponent);
builder.Append(formatInfo.NumberDecimalSeparator);
builder.Append(tail, exponent, tail.Length-exponent);
} else {
builder.Append(tail);
builder.Append('0', exponent-tail.Length);
}
} else {
builder.Append('0');
builder.Append(formatInfo.NumberDecimalSeparator);
builder.Append('0', (-exponent)-1);
builder.Append(match.Groups["head"].Value);
builder.Append(match.Groups["tail"].Value);
}
result = builder.ToString();
}
return result;
}
// test code
double x = 1.0;
for (int i = 0; i < 200; i++) {
x /= 10;
}
Console.WriteLine(x);
Console.WriteLine(ToFloatingPointString(x));
The problem using #.###...### or F99 is that it doesn't preserve precision at the ending decimal places, e.g:
String t1 = (0.0001/7).ToString("0." + new string('#', 339)); // 0.0000142857142857143
String t2 = (0.0001/7).ToString("r"); // 1.4285714285714287E-05
The problem with DecimalConverter.cs is that it is slow. This code is the same idea as Sasik's answer, but twice as fast. Unit test method at bottom.
public static class RoundTrip {
private static String[] zeros = new String[1000];
static RoundTrip() {
for (int i = 0; i < zeros.Length; i++) {
zeros[i] = new String('0', i);
}
}
private static String ToRoundTrip(double value) {
String str = value.ToString("r");
int x = str.IndexOf('E');
if (x < 0) return str;
int x1 = x + 1;
String exp = str.Substring(x1, str.Length - x1);
int e = int.Parse(exp);
String s = null;
int numDecimals = 0;
if (value < 0) {
int len = x - 3;
if (e >= 0) {
if (len > 0) {
s = str.Substring(0, 2) + str.Substring(3, len);
numDecimals = len;
}
else
s = str.Substring(0, 2);
}
else {
// remove the leading minus sign
if (len > 0) {
s = str.Substring(1, 1) + str.Substring(3, len);
numDecimals = len;
}
else
s = str.Substring(1, 1);
}
}
else {
int len = x - 2;
if (len > 0) {
s = str[0] + str.Substring(2, len);
numDecimals = len;
}
else
s = str[0].ToString();
}
if (e >= 0) {
e = e - numDecimals;
String z = (e < zeros.Length ? zeros[e] : new String('0', e));
s = s + z;
}
else {
e = (-e - 1);
String z = (e < zeros.Length ? zeros[e] : new String('0', e));
if (value < 0)
s = "-0." + z + s;
else
s = "0." + z + s;
}
return s;
}
private static void RoundTripUnitTest() {
StringBuilder sb33 = new StringBuilder();
double[] values = new [] { 123450000000000000.0, 1.0 / 7, 10000000000.0/7, 100000000000000000.0/7, 0.001/7, 0.0001/7, 100000000000000000.0, 0.00000000001,
1.23e-2, 1.234e-5, 1.2345E-10, 1.23456E-20, 5E-20, 1.23E+2, 1.234e5, 1.2345E10, -7.576E-05, 1.23456e20, 5e+20, 9.1093822E-31, 5.9736e24, double.Epsilon };
foreach (int sign in new [] { 1, -1 }) {
foreach (double val in values) {
double val2 = sign * val;
String s1 = val2.ToString("r");
String s2 = ToRoundTrip(val2);
double val2_ = double.Parse(s2);
double diff = Math.Abs(val2 - val2_);
if (diff != 0) {
throw new Exception("Value {0} did not pass ToRoundTrip.".Format2(val.ToString("r")));
}
sb33.AppendLine(s1);
sb33.AppendLine(s2);
sb33.AppendLine();
}
}
}
}
The obligatory Logarithm-based solution. Note that this solution, because it involves doing math, may reduce the accuracy of your number a little bit. Not heavily tested.
private static string DoubleToLongString(double x)
{
int shift = (int)Math.Log10(x);
if (Math.Abs(shift) <= 2)
{
return x.ToString();
}
if (shift < 0)
{
double y = x * Math.Pow(10, -shift);
return "0.".PadRight(-shift + 2, '0') + y.ToString().Substring(2);
}
else
{
double y = x * Math.Pow(10, 2 - shift);
return y + "".PadRight(shift - 2, '0');
}
}
Edit: If the decimal point crosses non-zero part of the number, this algorithm will fail miserably. I tried for simple and went too far.
In the old days when we had to write our own formatters, we'd isolate the mantissa and exponent and format them separately.
In this article by Jon Skeet (https://csharpindepth.com/articles/FloatingPoint) he provides a link to his DoubleConverter.cs routine that should do exactly what you want. Skeet also refers to this at extracting mantissa and exponent from double in c#.
I have just improvised on the code above to make it work for negative exponential values.
using System;
using System.Text.RegularExpressions;
using System.IO;
using System.Text;
using System.Threading;
namespace ConvertNumbersInScientificNotationToPlainNumbers
{
class Program
{
private static string ToLongString(double input)
{
string str = input.ToString(System.Globalization.CultureInfo.InvariantCulture);
// if string representation was collapsed from scientific notation, just return it:
if (!str.Contains("E")) return str;
var positive = true;
if (input < 0)
{
positive = false;
}
string sep = Thread.CurrentThread.CurrentCulture.NumberFormat.NumberDecimalSeparator;
char decSeparator = sep.ToCharArray()[0];
string[] exponentParts = str.Split('E');
string[] decimalParts = exponentParts[0].Split(decSeparator);
// fix missing decimal point:
if (decimalParts.Length == 1) decimalParts = new string[] { exponentParts[0], "0" };
int exponentValue = int.Parse(exponentParts[1]);
string newNumber = decimalParts[0].Replace("-", "").
Replace("+", "") + decimalParts[1];
string result;
if (exponentValue > 0)
{
if (positive)
result =
newNumber +
GetZeros(exponentValue - decimalParts[1].Length);
else
result = "-" +
newNumber +
GetZeros(exponentValue - decimalParts[1].Length);
}
else // negative exponent
{
if (positive)
result =
"0" +
decSeparator +
GetZeros(exponentValue + decimalParts[0].Replace("-", "").
Replace("+", "").Length) + newNumber;
else
result =
"-0" +
decSeparator +
GetZeros(exponentValue + decimalParts[0].Replace("-", "").
Replace("+", "").Length) + newNumber;
result = result.TrimEnd('0');
}
float temp = 0.00F;
if (float.TryParse(result, out temp))
{
return result;
}
throw new Exception();
}
private static string GetZeros(int zeroCount)
{
if (zeroCount < 0)
zeroCount = Math.Abs(zeroCount);
StringBuilder sb = new StringBuilder();
for (int i = 0; i < zeroCount; i++) sb.Append("0");
return sb.ToString();
}
public static void Main(string[] args)
{
//Get Input Directory.
Console.WriteLine(#"Enter the Input Directory");
var readLine = Console.ReadLine();
if (readLine == null)
{
Console.WriteLine(#"Enter the input path properly.");
return;
}
var pathToInputDirectory = readLine.Trim();
//Get Output Directory.
Console.WriteLine(#"Enter the Output Directory");
readLine = Console.ReadLine();
if (readLine == null)
{
Console.WriteLine(#"Enter the output path properly.");
return;
}
var pathToOutputDirectory = readLine.Trim();
//Get Delimiter.
Console.WriteLine("Enter the delimiter;");
var columnDelimiter = (char)Console.Read();
//Loop over all files in the directory.
foreach (var inputFileName in Directory.GetFiles(pathToInputDirectory))
{
var outputFileWithouthNumbersInScientificNotation = string.Empty;
Console.WriteLine("Started operation on File : " + inputFileName);
if (File.Exists(inputFileName))
{
// Read the file
using (var file = new StreamReader(inputFileName))
{
string line;
while ((line = file.ReadLine()) != null)
{
String[] columns = line.Split(columnDelimiter);
var duplicateLine = string.Empty;
int lengthOfColumns = columns.Length;
int counter = 1;
foreach (var column in columns)
{
var columnDuplicate = column;
try
{
if (Regex.IsMatch(columnDuplicate.Trim(),
#"^[+-]?[0-9]+(\.[0-9]+)?[E]([+-]?[0-9]+)$",
RegexOptions.IgnoreCase))
{
Console.WriteLine("Regular expression matched for this :" + column);
columnDuplicate = ToLongString(Double.Parse
(column,
System.Globalization.NumberStyles.Float));
Console.WriteLine("Converted this no in scientific notation " +
"" + column + " to this number " +
columnDuplicate);
}
}
catch (Exception)
{
}
duplicateLine = duplicateLine + columnDuplicate;
if (counter != lengthOfColumns)
{
duplicateLine = duplicateLine + columnDelimiter.ToString();
}
counter++;
}
duplicateLine = duplicateLine + Environment.NewLine;
outputFileWithouthNumbersInScientificNotation = outputFileWithouthNumbersInScientificNotation + duplicateLine;
}
file.Close();
}
var outputFilePathWithoutNumbersInScientificNotation
= Path.Combine(pathToOutputDirectory, Path.GetFileName(inputFileName));
//Create Directory If it does not exist.
if (!Directory.Exists(pathToOutputDirectory))
Directory.CreateDirectory(pathToOutputDirectory);
using (var outputFile =
new StreamWriter(outputFilePathWithoutNumbersInScientificNotation))
{
outputFile.Write(outputFileWithouthNumbersInScientificNotation);
outputFile.Close();
}
Console.WriteLine("The transformed file is here :" +
outputFilePathWithoutNumbersInScientificNotation);
}
}
}
}
}
This code takes an input directory and based on the delimiter converts all values in scientific notation to numeric format.
Thanks
try this one:
public static string DoubleToFullString(double value,
NumberFormatInfo formatInfo)
{
string[] valueExpSplit;
string result, decimalSeparator;
int indexOfDecimalSeparator, exp;
valueExpSplit = value.ToString("r", formatInfo)
.ToUpper()
.Split(new char[] { 'E' });
if (valueExpSplit.Length > 1)
{
result = valueExpSplit[0];
exp = int.Parse(valueExpSplit[1]);
decimalSeparator = formatInfo.NumberDecimalSeparator;
if ((indexOfDecimalSeparator
= valueExpSplit[0].IndexOf(decimalSeparator)) > -1)
{
exp -= (result.Length - indexOfDecimalSeparator - 1);
result = result.Replace(decimalSeparator, "");
}
if (exp >= 0) result += new string('0', Math.Abs(exp));
else
{
exp = Math.Abs(exp);
if (exp >= result.Length)
{
result = "0." + new string('0', exp - result.Length)
+ result;
}
else
{
result = result.Insert(result.Length - exp, decimalSeparator);
}
}
}
else result = valueExpSplit[0];
return result;
}
Being millions of programmers world wide, it's always a good practice to try search if someone has bumped into your problem already. Sometimes there's solutions are garbage, which means it's time to write your own, and sometimes there are great, such as the following:
http://www.yoda.arachsys.com/csharp/DoubleConverter.cs
(details: http://www.yoda.arachsys.com/csharp/floatingpoint.html)
string strdScaleFactor = dScaleFactor.ToString(); // where dScaleFactor = 3.531467E-05
decimal decimalScaleFactor = Decimal.Parse(strdScaleFactor, System.Globalization.NumberStyles.Float);
I don't know if my answer to the question can still be helpful. But in this case I suggest the "decomposition of the double variable into decimal places" to store it in an Array / Array of data of type String.
This process of decomposition and storage in parts (number by number) from double to string, would basically work with the use of two loops and an "alternative" (if you thought of workaround, I think you got it), where the first loop will extract the values from double without converting to String, resulting in blessed scientific notation and storing number by number in an Array. And this will be done using MOD - the same method to check a palindrome number, which would be for example:
String[] Array_ = new double[ **here you will put an extreme value of places your DOUBLE can reach, you must have a prediction**];
for (int i = 0, variableDoubleMonstrous > 0, i++){
x = variableDoubleMonstrous %10;
Array_[i] = x;
variableDoubleMonstrous /= 10;
}
And the second loop to invert the Array values (because in this process of checking a palindrome, the values invert from the last place, to the first, from the penultimate to the second and so on. Remember?) to get the original value:
String[] ArrayFinal = new String[the same number of "places" / indices of the other Array / Data array];
int lengthArray = Array_.Length;
for (int i = 0, i < Array_.Length, i++){
FinalArray[i] = Array_[lengthArray - 1];
lengthArray--;
}
***Warning: There's a catch that I didn't pay attention to. In that case there will be no "." (floating point decimal separator or double), so this solution is not generalized. But if it is really important to use decimal separators, unfortunately the only possibility (If done well, it will have a great performance) is:
**Use a routine to get the position of the decimal point of the original value, the one with scientific notation - the important thing is that you know that this floating point is before a number such as the "Length" position x, and after a number such as the y position - extracting each digit using the loops - as shown above - and at the end "export" the data from the last Array to another one, including the decimal place divider (the comma, or the period , if variable decimal, double or float) in the imaginary position that was in the original variable, in the "real" position of that matrix.
*** The concept of position is, find out how many numbers occur before the decimal point, so with this information you will be able to store in the String Array the point in the real position.
NEEDS THAT CAN BE MADE:
But then you ask:
But what about when I'm going to convert String to a floating point value?
My answer is that you use the second matrix of this entire process (the one that receives the inversion of the first matrix that obtains the numbers by the palindrome method) and use it for the conversion, but always making sure, when necessary, of the position of the decimal place in future situations, in case this conversion (Double -> String) is needed again.
But what if the problem is to use the value of the converted Double (Array of Strings) in a calculation. Then in this case you went around in circles. Well, the original variable will work anyway even with scientific notation. The only difference between floating point and decimal variable types is in the rounding of values, which depending on the purpose, it will only be necessary to change the type of data used, but it is dangerous to have a significant loss of information, look here
I could be wrong, but isn't it like this?
data.ToString("n");
http://msdn.microsoft.com/en-us/library/dwhawy9k.aspx
i think you need only to use IFormat with
ToString(doubleVar, System.Globalization.NumberStyles.Number)
example:
double d = double.MaxValue;
string s = d.ToString(d, System.Globalization.NumberStyles.Number);
My solution was using the custom formats.
try this:
double d;
d = 1234.12341234;
d.ToString("#########0.#########");
Just to build on what jcasso said what you can do is to adjust your double value by changing the exponent so that your favorite format would do it for you, apply the format, and than pad the result with zeros to compensate for the adjustment.
This works fine for me...
double number = 1.5E+200;
string s = number.ToString("#");
//Output: "150000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
I want to convert/find each of my string characters to (int) and reverse this operation.
I manage to do the first part,but the seconds one is giving me some problems.
string input;
string encrypt = ""; string decrypt = "";
input = textBox.Text;
foreach (char c in input)
{
int x = (int)c;
string s = x.ToString();
encrypt += s;
}
MessageBox.Show(encrypt);
foreach (int i in encrypt)
{
char c = (char)i;
string s = c.ToString();
decrypt += c;
}
MessageBox.Show(decrypt);
Thanks!
Here is a fixed program according to my advise above
string encrypt = ""; string decrypt = "";
string input = Console.ReadLine();
var length = input.Length;
int[] converted = new int[length];
for (int index = 0; index < length; index++)
{
int x = input[index];
string s = x.ToString();
encrypt += s;
converted[index] = x;
}
Console.WriteLine(encrypt);
for (int index = 0; index < converted.Length; index++)
{
char c = (char)converted[index];
string s = c.ToString();
decrypt += s;
}
Console.WriteLine(decrypt);
This will not work as is, because you're adding numbers to a string with no padding.
Let's assume the first three letter's values are '1','2','3', you'll have a string with "123".
Now, if you know each letter is 1 int length, you're good, but what happens if 12 is valid? and 23?
This might not be a "real" issues in your case because the values will probably be all 2 ints long, but it's very lacking (unless it's homework, in which case, oh well ...)
The ascii values for the alphabet will go from 65 for A to 122 z.
You can either pad them (say 3 chars per number, so 065 for A, and so on), delimit them (have ".", and split the string on that), use an array (like shahar's suggestion), lists, etc etc ...
In Your scenario, encryption may give output as you expected but its hard to decrypt the encrypted text using such mechanism. so I just do some customization on your code and make it workable here.
i suggest a similar one here:
string input;
string encrypt = ""; string decrypt = "";
int charCount = 0;
input = "textBox.Text";
foreach (char c in input)
{
int x = (int)c;
string s = x.ToString("000");
encrypt += s;
charCount++;
}
// MessageBox.Show(encrypt);
while (encrypt.Length > 0)
{
int item = Int32.Parse(encrypt.Substring(0, 3));
encrypt = encrypt.Substring(3);
char c = (char)item;
string s = c.ToString();
decrypt += c;
}
Reason for your code is not working:
You have declared encrypt as string and iterate through each integer in that string value, it is quiet not possible.
if you make that loop to iterate through each characters in that string value again it gives confusion. as :
lets take S as your input. its equivalent int value is 114 so if you make a looping means it will give 1,1,4, you will not get s back from it.
I need some help. I'm writing an error log using text file with exception details. With that I want my stack trace details to be written like the below and not in straight line to avoid the user from scrolling the scroll bar of the note pad or let's say on the 100th character the strings will be written to the next line. I don't know how to achieve that. Thanks in advance.
SAMPLE(THIS IS MY CURRENT OUTPUT ALL IN STRAIGHT LINE)
STACKTRACE:
at stacktraceabcdefghijklmnopqrstuvwxyztacktraceabcdefghijklmnopqrswxyztacktraceabcdefghijk
**MY DESIRED OUTPUT (the string will write to the next line after certain character count)
STACKTRACE:
at stacktraceabcdefghijklmno
pqrstuvwxyztacktraceabcdefgh
ijklmnopqrswxyztacktraceabcd
efghijk
MY CODE
builder.Append(String.Format("STACKTRACE:"));
builder.AppendLine();
builder.Append(logDetails.StackTrace);
Following example splits 10 characters per line, you can change as you like {N} where N can be any number.
var input = "stacktraceabcdefghijklmnopqrstuvwxyztacktraceabcdefghijklmnopqrswxyztacktraceabcdefghijk";
var regex = new Regex(#".{10}");
string result = regex.Replace(input, "$&" + Environment.NewLine);
Console.WriteLine(result);
Here is the Demo
you can use the following code:
string yourstring;
StringBuilder sb = new StringBuilder();
for(int i=0;i<yourstring.length;++i){
if(i%100==0){
sb.AppendLine();
}
sb.Append(yourstring[i]);
}
you may create a function for this
string splitat(string line, int charcount)
{
string toren = "";
if (charcount>=line.Length)
{
return line;
}
int totalchars = line.Length;
int loopcnt = totalchars / charcount;
int appended = 0;
for (int i = 0; i < loopcnt; i++)
{
toren += line.Substring(appended, charcount) + Environment.NewLine;
appended += charcount;
int left = totalchars - appended;
if (left>0)
{
if (left>charcount)
{
continue;
}
else
{
toren += line.Substring(appended, left) + Environment.NewLine;
}
}
}
return toren;
}
Best , Easiest and Generic Answer :). Just set the value of splitAt to the that number of character count after that u want it to break.
string originalString = "1111222233334444";
List<string> test = new List<string>();
int splitAt = 4; // change 4 with the size of strings you want.
for (int i = 0; i < originalString.Length; i = i + splitAt)
{
if (originalString.Length - i >= splitAt)
test.Add(originalString.Substring(i, splitAt));
else
test.Add(originalString.Substring(i,((originalString.Length - i))));
}
I need to remove all additional spaces in a string.
I use regex for matching strings and matched strings i replace with some others.
For better understanding please see examples below:
3 input strings:
Hello, how are you?
Hello , how are you?
Hello , how are you ?
This are 3 strings that should match by one pattern-regex.
It looks something like this:
Hello\s*,\s+how\s+are\s+you\s*?
It works fine but there is a perfomance problem.
If I have a lot of patterns (~20k) and try to execute each pattern it runs very slow (3-5 minutes).
Maybe there is better way for doing this?
for example use some 3d-party libs?
UPD: Folks, this question is not about how to do this. It's about how to do this with best perfomance. :)
Let me explain more detailed. The main goal is tokenize text. (replace some token with special symbols)
For example I have a token "nice try".
Then I input text "this is nice try".
result: "this is #tokenizedtext#" where #tokenizedtext# some special symbols. It doesen't matter in this case.
Next I have string "Mike said it was a nice try".
result should be "Mike said it was a #tokenizedtext#".
I think the main idea is clear.
So I can have a lot of tokens. When I process it I convert my token from "nice try" to pattern "nice\s+try". and try to replace with this pattern input text.
It works fine. But if in tokens there is more spaces and there is also punctuation then my regexes became bigger and works very slow.
Do you have some suggestions (technical or logic) for solving this problem?
I can suggest a few solutions.
First of all, avoid the static Regex method. Create an instance of it (and store it, don't call the constructor for each replacement!) and, if possible, use RegexOptions.Compiled. It should improve your performance.
Second, you can try to review your pattern. I'll do some profiling, but I'm currently undecisive between:
#"(?<=\s)\s+"
With replacement being an empty string or:
#"\s+"
With a space as a replacement. You can try this code, in the meanwhile:
var s = "Hello , how are you?";
var pattern = #"\s+";
var regex = new Regex(pattern, RegexOptions.Compiled);
var replaced = regex.Replace(s, " ");
EDIT: After having done some measurement, the second pattern seems to be faster. I'm editing my sample to adapt it.
EDIT 2: I've written an unsafe method. It's much faster than the other ones presented here, including the Regex ones, but, as the word itself says, it's unsafe. I don't think that there's any problem with the code I've written but I may be wrong -- So please, check it again and again in case there's a bug in the method.
static unsafe string TrimInternal(string input)
{
var length = input.Length;
var array = stackalloc char[length];
fixed (char* fix = input)
{
var ptr = fix;
var counter = 0;
var lastWasSpace = false;
while (*ptr != '\x0')
{
//Current char is a space?
var isSpace = *ptr == ' ';
//If it's a space but the last one wasn't
//Or if it's not a space
if (isSpace && !lastWasSpace || !isSpace)
//Write into the result array
array[counter++] = *ptr;
//The last character (before the next loop) was a space
lastWasSpace = isSpace;
//Increase the pointer
ptr++;
}
return new string(array, 0, counter);
}
}
Usage (compile with /unsafe):
var s = TrimInternal("Hello , how are you?");
Profiling made in Release build, optimizations on, 1000000 iterations:
My above solution with Regex: 00:00:03.2130121
The unsafe solution: 00:00:00.2063467
This might work for you. It should be pretty fast. Note that it also removes spaces at the end of the string; that might not be what you want...
using System;
namespace Demo
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine(">{0}<", RemoveExtraSpaces("Hello, how are you?"));
Console.WriteLine(">{0}<", RemoveExtraSpaces("Hello , how are you?"));
Console.WriteLine(">{0}<", RemoveExtraSpaces("Hello , how are you ?"));
}
public static string RemoveExtraSpaces(string text)
{
var buffer = new char[text.Length];
bool isSpaced = false;
int n = 0;
foreach (char c in text)
{
if (c == ' ')
{
isSpaced = true;
}
else
{
if (isSpaced)
{
if ((c != ',') && (c != '?'))
{
buffer[n++] = ' ';
}
isSpaced = false;
}
buffer[n++] = c;
}
}
return new string(buffer, 0, n);
}
}
}
Something of my own :
find all the position of WhiteSpacechar in string;
private static IEnumerable<int> GetWhiteSpacePos(string input)
{
int iPos = -1;
while ((iPos = input.IndexOf(" ", iPos + 1, StringComparison.Ordinal)) > -1)
{
yield return iPos;
}
}
Remove all whitespace that are in in sequence Returned from GetWhiteSpacePos
string original_string = "Hello , how are you ?";
var poss = GetWhiteSpacePos(original_string).ToList();
int startPos;
int endPos;
StringBuilder builder = new StringBuilder(original_string);
for (int i = poss.Count -1; i > 1; i--)
{
endPos = poss[i];
while ((poss[i] == poss[i - 1] + 1) && i > 1)
{
i--;
}
startPos = poss[i];
if (endPos - startPos > 1)
{
builder.Remove(startPos, endPos - startPos);
}
}
string new_string = builder.ToString();
You are using a very complex regex..simplify the regex and that would definitely increasre the performance
Use \s+ and replace it with a single space
Well, these kind of problems really trouble us. Use this code, and I'm sure you're getting the result for what you've asked. This command removes any extra white space between any string.
cleanString= Regex.Replace(originalString, #"\s", " ");
Hope thar works for you. Thanks.
And since this is a single Instruction. It will utilize less CPU resource and hence less CPU time, which ultimately increases your performance. Therefore A/C to me this method works the best when compared in terms of performance.
if its just a matter of SPACE;
try this
Source : http://www.codeproject.com/Articles/10890/Fastest-C-Case-Insenstive-String-Replace
private static string ReplaceEx(string original,
string pattern, string replacement)
{
int count, position0, position1;
count = position0 = position1 = 0;
string upperString = original.ToUpper();
string upperPattern = pattern.ToUpper();
int inc = (original.Length / pattern.Length) *
(replacement.Length - pattern.Length);
char[] chars = new char[original.Length + Math.Max(0, inc)];
while ((position1 = upperString.IndexOf(upperPattern,
position0)) != -1)
{
for (int i = position0; i < position1; ++i)
chars[count++] = original[i];
for (int i = 0; i < replacement.Length; ++i)
chars[count++] = replacement[i];
position0 = position1 + pattern.Length;
}
if (position0 == 0) return original;
for (int i = position0; i < original.Length; ++i)
chars[count++] = original[i];
return new string(chars, 0, count);
}
Usage:
string original_string = "Hello , how are you ?";
while (original_string.Contains(" "))
{
original_string = ReplaceEx(original_string, " ", " ");
}
Replacing the regex way:
string resultString = null;
try {
resultString = Regex.Replace(subjectString, #"\s+", " ", RegexOption.Compiled);
} catch (ArgumentException ex) {
// Syntax error in the regular expression
}
I'm trying to put the values of a string into a byte array with out changing the characters. This is because the string is in fact a byte representation of the data.
The goal is to move the input string into a byte array and then convert the byte array using:
string result = System.Text.Encoding.UTF8.GetString(data);
I hope someone can help me although I know it´s not a very good description.
EDIT:
And maybe I should explain that what I´m working on is a simple windows form with a textbox where users can copy the encoded data into it and then click preview to see the decoded data.
EDIT:
A little more code:
(inputText is a textbox)
private void button1_Click(object sender, EventArgs e)
{
string inputString = this.inputText.Text;
byte[] input = new byte[inputString.Length];
for (int i = 0; i < inputString.Length; i++)
{
input[i] = inputString[i];
}
string output = base64Decode(input);
this.inputText.Text = "";
this.inputText.Text = output;
}
This is a part of a windows form and it includes a rich text box. This code doesn´t work because it won´t let me convert type char to byte.
But if I change the line to :
private void button1_Click(object sender, EventArgs e)
{
string inputString = this.inputText.Text;
byte[] input = new byte[inputString.Length];
for (int i = 0; i < inputString.Length; i++)
{
input[i] = (byte)inputString[i];
}
string output = base64Decode(input);
this.inputText.Text = "";
this.inputText.Text = output;
}
It encodes the value and I don´t want that. I hope this explains a little bit better what I´m trying to do.
EDIT: The base64Decode function:
public string base64Decode(byte[] data)
{
try
{
string result = System.Text.Encoding.UTF8.GetString(data);
return result;
}
catch (Exception e)
{
throw new Exception("Error in base64Decode" + e.Message);
}
}
The string is not encoded using base64 just to be clear. This is just bad naming on my behalf.
Note this is just one line of input.
I've got it. The problem was I was always trying to decode the wrong format. I feel very stupid because when I posted the example input I saw this had to be hex and it was so from then on it was easy. I used this site for reference:
http://msdn.microsoft.com/en-us/library/bb311038.aspx
My code:
public string[] getHexValues(string s)
{
int j = 0;
string[] hex = new String[s.Length/2];
for (int i = 0; i < s.Length-2; i += 2)
{
string temp = s.Substring(i, 2);
this.inputText.Text = temp;
if (temp.Equals("0x")) ;
else
{
hex[j] = temp;
j++;
}
}
return hex;
}
public string convertFromHex(string[] hex)
{
string result = null;
for (int i = 0; i < hex.Length; i++)
{
int value = Convert.ToInt32(hex[i], 16);
result += Char.ConvertFromUtf32(value);
}
return result;
}
I feel quite dumb right now but thanks to everyone who helped, especially #Jon Skeet.
Are you saying you have something like this:
string s = "48656c6c6f2c20776f726c6421";
and you want these values as a byte array? Then:
public IEnumerable<byte> GetBytesFromByteString(string s) {
for (int index = 0; index < s.Length; index += 2) {
yield return Convert.ToByte(s.Substring(index, 2), 16);
}
}
Usage:
string s = "48656c6c6f2c20776f726c6421";
var bytes = GetBytesFromByteString(s).ToArray();
Note that the output of
Console.WriteLine(System.Text.ASCIIEncoding.ASCII.GetString(bytes));
is
Hello, world!
You obviously need to make the above method a lot safer.
Encoding has the reverse method:
byte[] data = System.Text.Encoding.UTF8.GetBytes(originalString);
string result = System.Text.Encoding.UTF8.GetString(data);
Debug.Assert(result == originalString);
But what you mean 'without converting' is unclear.
One way to do it would be to write:
string s = new string(bytes.Select(x => (char)c).ToArray());
That will give you a string that has one character for every single byte in the array.
Another way is to use an 8-bit character encoding. For example:
var MyEncoding = Encoding.GetEncoding("windows-1252");
string s = MyEncoding.GetString(bytes);
I'm think that Windows-1252 defines all 256 characters, although I'm not certain. If it doesn't, you're going to end up with converted characters. You should be able to find an 8-bit encoding that will do this without any conversion. But you're probably better off using the byte-to-character loop above.
If anyone still needs it this worked for me:
byte[] result = Convert.FromBase64String(str);
Have you tried:
string s = "....";
System.Text.UTF8Encoding.UTF8.GetBytes(s);