If I want to concatenate a string N times, which method should I prefer?
Take this code as an example:
public static string Repeat(this string instance, int times)
{
var result = string.Empty;
for (int i = 0; i < times; i++)
result += instance;
return result;
}
This method may be invoked with "times" set to 5, or 5000. What method should I prefer to use?
string.Join? StringBuilder? Just standard string.Concat?
A similar function is going to be implemented in a commercial library so I really need the "optimal" way to do this.
public static string Repeat(this string instance, int times)
{
    // Validate first, so a negative count is rejected even for null or empty input.
    if (times < 0) throw new ArgumentOutOfRangeException("times");
    if (times == 1 || string.IsNullOrEmpty(instance)) return instance;
    if (times == 0) return "";
    // Pre-size the builder so it never has to grow while appending.
    StringBuilder sb = new StringBuilder(instance.Length * times);
    for (int i = 0; i < times; i++)
        sb.Append(instance);
    return sb.ToString();
}
StringBuilder, of course. It is meant for fast string concatenation, because it does not create a new string object each time you append.
For details see here.
StringBuilder.
"result += instance;"
creates a new string each and every time you do the assignment, and assigns that new string to your variable, since strings are immutable.
Go with StringBuilder, definitely.
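For comparison, here is a minimal sketch of three ways to write the repeat operation that the question lists as candidates. The class and method names are made up for illustration, and the sketch assumes a non-null string and a non-negative count (no argument validation). StringBuilder.Insert(int, string, int) and string.Concat(IEnumerable<string>) are standard library calls; which variant is fastest for a given input is something to measure rather than assume.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper class, not part of the original question or answers.
public static class RepeatVariants
{
    // Pre-sized StringBuilder with an Append loop.
    public static string WithAppendLoop(string s, int times)
    {
        var sb = new StringBuilder(s.Length * times);
        for (int i = 0; i < times; i++)
            sb.Append(s);
        return sb.ToString();
    }

    // StringBuilder.Insert(index, value, count) inserts `count` copies in one call.
    public static string WithInsert(string s, int times)
    {
        return new StringBuilder(s.Length * times).Insert(0, s, times).ToString();
    }

    // string.Concat over a lazily repeated sequence.
    public static string WithConcat(string s, int times)
    {
        return string.Concat(Enumerable.Repeat(s, times));
    }

    public static void Main()
    {
        Console.WriteLine(WithAppendLoop("ab", 3)); // ababab
        Console.WriteLine(WithInsert("ab", 3));     // ababab
        Console.WriteLine(WithConcat("ab", 3));     // ababab
    }
}
```

All three produce the same string; they differ only in how many intermediate objects they allocate along the way.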
Never make an assumption that one method is faster than another -- you must always measure the performance of both and then decide.
Surprisingly, for smaller numbers of iterations, just a standard string concatenation (result += string) is often faster than using a string builder.
If you know that the number of iterations will always be the same (e.g. it will always be 50 iterations), then I would suggest that you make some performance measurements using different methods.
If you really want to get clever, make performance measurements against number of iterations and you can find the 'crossover point' where one method is faster than another and hard-code that threshold into the method:
if(iterations < 30)
{
CopyWithConcatenation();
}
else
{
CopyWithStringBuilder();
}
The exact performance crossover point will depend on the specific detail of your code and you'll never be able to find out what they are without making performance measurements.
To complicate things a little more, StringBuilder has 'neater' memory management than string concatenation (which creates more temporary instances), so this might also have an impact on overall performance outside of your string loop (like the next time the Garbage Collector runs).
Let us know how you got on.
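The crossover measurement described in this answer could be set up roughly as follows. This is only a sketch: the class name, the Measure helper, the test string "abc", and the round count are all assumptions chosen for illustration, and Stopwatch timings taken this way are noisy, so the numbers should be treated as a rough guide on your own machine.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical benchmark harness, not taken from the original answer.
public static class RepeatBenchmark
{
    public static string RepeatWithConcat(string s, int times)
    {
        var result = string.Empty;
        for (int i = 0; i < times; i++)
            result += s; // allocates a new string on every iteration
        return result;
    }

    public static string RepeatWithBuilder(string s, int times)
    {
        var sb = new StringBuilder(s.Length * times);
        for (int i = 0; i < times; i++)
            sb.Append(s);
        return sb.ToString();
    }

    // Runs `action` `rounds` times and returns the total elapsed time.
    public static TimeSpan Measure(Action action, int rounds)
    {
        action(); // warm-up call so JIT compilation is not counted
        var sw = Stopwatch.StartNew();
        for (int r = 0; r < rounds; r++)
            action();
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        const int rounds = 1000; // arbitrary; raise it for more stable numbers
        foreach (int times in new[] { 2, 5, 10, 30, 100, 1000 })
        {
            TimeSpan concat = Measure(() => RepeatWithConcat("abc", times), rounds);
            TimeSpan builder = Measure(() => RepeatWithBuilder("abc", times), rounds);
            Console.WriteLine("times={0}: concat={1}, builder={2}", times, concat, builder);
        }
    }
}
```

The value of `times` at which the builder column starts winning is the threshold you would hard-code, if you decide the extra branch is worth it at all.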
Related
This is a common question, but I hope it does not get tagged as a duplicate, since the nature of the question is different (please read the whole question, not only the title).
Unaware of the existence of String.Replace I wrote the following:
int theIndex = 0;
while ((theIndex = message.IndexOf(separationChar, theIndex)) != -1) //we found the character
{
theIndex++;
if (theIndex < message.Length)//not in the last position
{
message = message.Insert(theIndex, theTime);
}
else
{
// I don't think this is really necessary
break;
}
} //while finding characters
As you can see, I am inserting a String called "theTime" after each occurrence of separationChar in the message String.
Now, this works OK for small strings, but I have been given a really huge String (on the order of several hundred KB; by the way, is there a size limit for String or StringBuilder?) and it takes a lot of time...
So my questions are:
1) Is it more efficient if I just do
oldString = separationChar.ToString();
newString = oldString.Insert(oldString.Length, theTime);
message = message.Replace(oldString, newString);
2) Is there any other way I can process very long Strings to insert a String (theTime) when finding some char in a very fast and efficient way??
Thanks a lot
As Danny already mentioned, string.Insert() actually creates a new instance each time you use it, and these also have to be garbage collected at some point.
You could instead start with an empty StringBuilder to construct the result string:
public static string Replace(this string str, char find, string replacement)
{
StringBuilder result = new StringBuilder(str.Length); // initial capacity
int pointer = 0;
int index;
while ((index = str.IndexOf(find, pointer)) >= 0)
{
// Append the unprocessed data up to the character
result.Append(str, pointer, index - pointer);
// Append the replacement string
result.Append(replacement);
// Next unprocessed data starts after the character
pointer = index + 1;
}
// Append the remainder of the unprocessed data
result.Append(str, pointer, str.Length - pointer);
return result.ToString();
}
This will not cause a new string to be created (and garbage collected) for each occurrence of the character. Instead, when the internal buffer of the StringBuilder is full, it will create a new buffer chunk "of sufficient capacity". Quote from reference source, when its buffer is full:
Compute the length of the new block we need
We make the new chunk at least big enough for the current need (minBlockCharCount), but also as big as the current length (thus doubling capacity), up to a maximum
(so we stay in the small object heap, and never allocate really big chunks even if
the string gets really big).
Thank you for answering my question.
I am writing an answer because I have to report that I tried the solution in my question 1) and it is indeed more efficient according to the results of my program. String.Replace can replace a string (built from a char) with another string very fast.
oldString = separationChar.ToString();
newString = oldString.Insert(oldString.Length, theTime);
message = message.Replace(oldString, newString);
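As a sanity check on the two approaches discussed in this thread, here is a small sketch that performs the insertion once with a single-pass StringBuilder and once with String.Replace, and shows that they agree. The class and method names are hypothetical. One assumption to note: both versions also insert after a separator that is the last character of the message, whereas the loop in the question deliberately stops before doing that.

```csharp
using System;
using System.Text;

// Hypothetical helper class; names are illustrative only.
public static class SeparatorInsert
{
    // Single pass: copy each character and add `insertion`
    // after every occurrence of `separator`.
    public static string InsertAfterEach(string message, char separator, string insertion)
    {
        var sb = new StringBuilder(message.Length);
        foreach (char c in message)
        {
            sb.Append(c);
            if (c == separator)
                sb.Append(insertion);
        }
        return sb.ToString();
    }

    // The String.Replace form: replace "sep" with "sep" + insertion.
    public static string InsertAfterEachWithReplace(string message, char separator, string insertion)
    {
        string oldString = separator.ToString();
        return message.Replace(oldString, oldString + insertion);
    }

    public static void Main()
    {
        string message = "a;b;c";
        Console.WriteLine(InsertAfterEach(message, ';', "12:00"));            // a;12:00b;12:00c
        Console.WriteLine(InsertAfterEachWithReplace(message, ';', "12:00")); // a;12:00b;12:00c
    }
}
```

Neither version rescans text it has just inserted, so they stay correct even if the inserted string itself contains the separator.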
So here is my code:
if (txtboxAntwoord.Text == lblProvincie.Text)
{
}
What I want to achieve: the if statement should check whether the two texts are the same, while ignoring differences between upper- and lowercase.
Let's say lblProvincie's text is "Some Text". I want to check whether the text of txtboxAntwoord is the same, regardless of how it is capitalized.
You can use the .Equals method on string and pass in a string comparison option that ignores case.
if (string.Equals(txtboxAntwoord.Text, lblProvincie.Text,
StringComparison.OrdinalIgnoreCase))
for pure speed where culture-based comparison is unimportant
OR
if (string.Equals(txtboxAntwoord.Text, lblProvincie.Text,
StringComparison.CurrentCultureIgnoreCase))
if you need to take culture-based comparisons into account.
While this approach may be slightly more complicated, it is more efficient than the ToUpper() approach since new strings do not need to be allocated. It also has the advantage of being able to specify different comparison options such as CurrentCultureIgnoreCase.
While this may not be much of an impact on application performance in an isolated context, this will certainly make a difference when doing large amounts of string comparisons.
const string test1 = "Test1";
const string test2 = "test1";
var s1 = new Stopwatch();
s1.Start();
for (int i = 0; i < 1000000; i++)
{
if (!(test1.ToUpper() == test2.ToUpper()))
{
var x = "1";
}
}
s1.Stop();
s1.ElapsedMilliseconds.Dump();
var s2 = new Stopwatch();
s2.Start();
for (int i = 0; i < 1000000; i++)
{
if(!string.Equals(test1, test2,
StringComparison.OrdinalIgnoreCase))
{
var x = "1";
}
}
s2.Stop();
s2.ElapsedMilliseconds.Dump();
The first contrived example takes 265 milliseconds on my machine for 1 million iterations. The second takes only 25. In addition, the first example allocates new strings on every iteration, which the second does not.
Per Mike's suggestion in the comments, it is only fair to also profile CurrentCultureIgnoreCase. This is still more efficient than ToUpper, taking 114 milliseconds which is still over twice as fast as ToUpper and does not allocate additional strings.
You can use ToUpper() or ToLower() on both values so that both have the same case, upper or lower. You can do it like this:
if (txtboxAntwoord.Text.ToUpper() == lblProvincie.Text.ToUpper())
What you are looking for is called "case insensitive string comparison".
You can achieve it with Ehsan Sajjad's suggestion, but it would be inefficient, because for each comparison you would be generating at least one (in his example two, but that can be optimized) new string to contain the uppercase version of the string to compare to, and then immediately letting that string be garbage-collected.
David L's suggestion is bound to perform a lot better, though I would advise against StringComparison.OrdinalIgnoreCase, because it ignores the current culture.
Instead, use the following:
string.Equals( text1, text2, StringComparison.CurrentCultureIgnoreCase )
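The disagreement between the ordinal and culture-aware options is easiest to see with the letter "i" under Turkish casing rules. The sketch below uses hypothetical helper names and passes the culture explicitly instead of changing the thread's current culture. The culture-aware result depends on the globalization data available on the machine, so no expected output is claimed for it; with full Turkish culture data, "file" and "FILE" typically do not compare as equal.

```csharp
using System;
using System.Globalization;

// Hypothetical demo class; names are illustrative only.
public static class CaseInsensitiveDemo
{
    // Culture-independent, case-insensitive comparison.
    public static bool EqualsOrdinal(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    // Case-insensitive comparison under an explicitly supplied culture.
    public static bool EqualsInCulture(string a, string b, CultureInfo culture)
    {
        return string.Compare(a, b, culture, CompareOptions.IgnoreCase) == 0;
    }

    public static void Main()
    {
        Console.WriteLine(EqualsOrdinal("Some Text", "SOME TEXT")); // True
        Console.WriteLine(EqualsOrdinal("file", "FILE"));           // True

        // Result depends on the culture data installed on the machine.
        Console.WriteLine(EqualsInCulture("file", "FILE", CultureInfo.GetCultureInfo("tr-TR")));
    }
}
```

For text typed by a user and compared against a display label, either option can be defended. For identifiers, file names, and protocol tokens, the ordinal option avoids this kind of culture-dependent surprise.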
Console.Write(i) in each for or Console.Write(StringBuilder) at the end: Which one is better?
I have two functions, the first one prints within the for loop and the other one is printing at the end.
public static void checkmethod1(int value, Checkevent text)
{
Stopwatch stopwatch2 = new Stopwatch();
stopwatch2.Start();
StringBuilder builder = new StringBuilder();
switch (text)
{
case Checkevent.odd:
for (int i = 1; i <= value; i = i + 2)
{
builder.Append(i).Append(" ");
}
break;
case Checkevent.even:
for (int i = 2; i <= value; i = i + 2)
{
builder.Append(i).Append(" ");
}
break;
}
stopwatch2.Stop();
Console.WriteLine(builder);
Console.WriteLine("{0}", stopwatch2.Elapsed);
}
Function 2:
public static void checkmethod3(int value, Checkevent text)
{
Stopwatch stopwatch2 = new Stopwatch();
stopwatch2.Start();
switch (text)
{
case Checkevent.odd:
for (int i = 1; i <= value; i = i + 2)
{
Console.Write(i);
}
break;
case Checkevent.even:
for (int i = 2; i <= value; i = i + 2)
{
Console.Write(i);
}
break;
}
stopwatch2.Stop();
Console.Write("{0}", stopwatch2.Elapsed);
}
In this particular scenario I would prefer StringBuilder, as the loop does not take long enough to affect the user experience. StringBuilder generally requires less memory and gives better performance: each modification of a string creates a new string object, which is not the case with StringBuilder.
The first method executes Console.Write only once, but the second one executes it once per loop iteration. This makes the second one slow.
If you want the user to see the text in the console as it is produced, for example a log that shows the process flow, then showing it all at once (with StringBuilder) may not give the user a chance to read it. In that case you would write each log entry as it is generated, using Console.Write(string).
Deciding when to use string and when to use StringBuilder becomes easy once you understand how both work. One of their important behaviors is described below.
Using the StringBuilder Class in the .NET Framework
The String object is immutable. Every time you use one of the methods
in the System.String class, you create a new string object in memory,
which requires a new allocation of space for that new object. In
situations where you need to perform repeated modifications to a
string, the overhead associated with creating a new String object can
be costly. The System.Text.StringBuilder class can be used when you
want to modify a string without creating a new object. For example,
using the StringBuilder class can boost performance when concatenating
many strings together in a loop.
Edit
I tested the above two methods with three values (100, 10000 and 100000). The machine used has the following specs:
Operating System: Windows 7 Enterprise
Processor: Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz 3.40 GHz
Installed memory (RAM): 8.00 GB
System Type: 64 bit Operating System
Value     Time with StringBuilder    Time without StringBuilder
100       00:00:00.0000184           00:00:00.0037037
10000     00:00:00.0013233           00:00:00.2970272
100000    00:00:00.0133015           00:00:02.5853375
In the first method where you used the StringBuilder the Console.Write is executed only once but in other case it is executed as many times as the loop iterations. This makes the second one slow. Comparing StringBuilder with string concatenation is not applicable here.
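To separate the cost of building the text from the cost of writing it, the measurement could be split into three timers, as in the sketch below. The class and method names are hypothetical. Note that checkmethod1 as posted stops its stopwatch before Console.WriteLine(builder), so its reported time covers only the building step and not the single write.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical demo class; names are illustrative only.
public static class ConsoleWriteDemo
{
    // Builds "1 3 5 ... " up to `value` without touching the console.
    public static string BuildOddNumbers(int value)
    {
        var builder = new StringBuilder();
        for (int i = 1; i <= value; i += 2)
            builder.Append(i).Append(' ');
        return builder.ToString();
    }

    public static void Main()
    {
        const int value = 10000;

        // 1) Time spent building the string in memory.
        var buildTimer = Stopwatch.StartNew();
        string text = BuildOddNumbers(value);
        buildTimer.Stop();

        // 2) Time spent on one large console write.
        var singleWriteTimer = Stopwatch.StartNew();
        Console.WriteLine(text);
        singleWriteTimer.Stop();

        // 3) Time spent on one console write per number.
        var manyWritesTimer = Stopwatch.StartNew();
        for (int i = 1; i <= value; i += 2)
        {
            Console.Write(i);
            Console.Write(' ');
        }
        manyWritesTimer.Stop();

        Console.WriteLine();
        Console.WriteLine("build:        {0}", buildTimer.Elapsed);
        Console.WriteLine("single write: {0}", singleWriteTimer.Elapsed);
        Console.WriteLine("many writes:  {0}", manyWritesTimer.Elapsed);
    }
}
```

Comparing the second and third timers shows how much of the difference comes from the number of console calls rather than from StringBuilder itself.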
I want to know how return values work for strings in C#. In one of my functions, I generate HTML code and the string is really huge. I then return it from the function and insert it into the page. But I want to know: should I pass a huge string as a return value, or just insert it into the page from the same function?
When C# returns a string, does it create a new string from the old one, and return that?
Thanks.
Strings (or any other reference type) are not copied when returning from a function, only value types are.
System.String is a reference type (class) and so passing as parameter and returning only involve the copying of a reference (32 or 64 bits).
The size of the string is not relevant.
Returning a string is a cheap operation - as mentioned it's purely a matter of returning 32 or 64 bits (4 or 8 bytes).
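That a returned string is the same object rather than a copy can be checked with object.ReferenceEquals. In this small sketch the class, the GenerateHtml method, and the LastGenerated property are hypothetical names used only to keep a second reference to the string for comparison.

```csharp
using System;

// Hypothetical demo class; names are illustrative only.
public static class ReturnReferenceDemo
{
    // Keeps a second reference to the most recently generated string.
    public static string LastGenerated { get; private set; }

    // Builds a large string, remembers the reference, and returns it.
    public static string GenerateHtml()
    {
        LastGenerated = "<p>" + new string('x', 1000000) + "</p>";
        return LastGenerated;
    }

    public static void Main()
    {
        string html = GenerateHtml();
        // Same object: returning the string copied only the reference.
        Console.WriteLine(object.ReferenceEquals(html, LastGenerated)); // True
        Console.WriteLine(html.Length);                                  // 1000007
    }
}
```

The cost of the return is the same whether the string holds ten characters or a million.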
However, as Sten Petrov points out string + operations involve the creation of a new string, and can be a little expensive. If you wanted to save performance & memory I'd suggest doing something like this:
static int i = 0;
static void Main(string[] args)
{
while (Console.ReadLine() == "")
{
var pageSB = new StringBuilder();
foreach (var section in new[] { AddHeader(), AddContent(), AddFooter() })
for (int i = 0; i < section.Length; i++)
pageSB.Append(section[i]);
Console.Write(pageSB.ToString());
}
}
static StringBuilder AddHeader()
{
return new StringBuilder().Append("Hi ").AppendLine("World");
}
static StringBuilder AddContent()
{
return new StringBuilder()
.AppendFormat("This page has been viewed: {0} times\n", ++i);
}
static StringBuilder AddFooter()
{
return new StringBuilder().Append("Bye ").AppendLine("World");
}
Here we use the StringBuilders to hold a reference to all the strings we want to concat, and wait until the very end before joining them together. This'll save many unnecessary additions (which are memory and CPU heavy in comparison).
Of course, I doubt you'll actually see any need for this in practise - and if you do I'd spend some time learning about pooling etc. to help reduce the garbage created by all the string builders - and maybe consider creating a custom 'string holder' that suits your purposes better.
For some reason, this code works fine when I don't use a seed in the Random class, but if I try to use DateTime.Now to get a more random number, I get a StackOverflowException! My class is really simple. Could someone tell me what I'm doing wrong here? See MakeUniqueFileName.
public class TempUtil
{
private int strcmp(string s1, string s2)
{
try
{
for (int i = 0; i < s1.Length; i++)
if (s1[i] != s2[i]) return 0;
return 1;
}
catch (IndexOutOfRangeException)
{
return 0;
}
}
private int Uniqueness(object randomObj)
{
switch (randomObj.ToString())
{
case "System.Object":
case "System.String":
return randomObj.ToString()[0];
case "System.Int32":
return int.Parse(randomObj.ToString());
case "System.Boolean":
return strcmp(randomObj.ToString(), "True");
default:
return Uniqueness(randomObj.ToString());
}
}
public string MakeUniqueFileName()
{
return "C:\\windows\\temp\\" + new Random(Uniqueness(DateTime.Now)).NextDouble() + ".tmp";
}
}
You're calling DateTime.Now.ToString(), which doesn't give you one of the strings you're checking for... so you're recursing, calling it with the same string... which still isn't one of the strings you're looking for.
You don't need to use Random to demonstrate the problem. This will do it very easily:
Uniqueness(""); // Tick, tick, tick... stack overflow
What did you expect it to be doing? It's entirely unclear what your code is meant to be doing, but I suggest you ditch the Uniqueness method completely. In fact, I suggest you get rid of the whole class, and use Path.GetTempFileName instead.
In short:
It should say
switch (randomObj.GetType().ToString())
instead of
switch (randomObj.ToString())
But even then this isn't very clever.
You are passing a DateTime instance to your Uniqueness method.
This falls through and calls itself with ToString - on a DateTime instance this will be a formatted DateTime string (such as "21/01/2011 13:13:01").
Since this string doesn't match any of your switch cases (again), the method calls itself again, but the result of calling ToString on a string is the same string.
You have caused an infinite call stack that results in the StackOverflowException.
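The difference between the value's text and the type's name, which is what the switch trips over, can be seen in a few lines. The helper names below are made up for illustration.

```csharp
using System;

// Hypothetical demo class; names are illustrative only.
public static class TypeNameDemo
{
    // What the original switch inspects: the value's text.
    public static string ValueText(object o)
    {
        return o.ToString();
    }

    // What the case labels were written for: the runtime type's full name.
    public static string TypeName(object o)
    {
        return o.GetType().ToString();
    }

    public static void Main()
    {
        object now = DateTime.Now;
        Console.WriteLine(ValueText(now)); // culture-dependent date/time text
        Console.WriteLine(TypeName(now));  // System.DateTime

        // Calling ToString on a string returns the same text,
        // so the default branch can never make progress.
        string text = ValueText(now);
        Console.WriteLine(ValueText(text) == text); // True
        Console.WriteLine(TypeName(text));          // System.String
    }
}
```

With GetType() in the switch, a DateTime would reach the default branch once and then match the "System.String" case, so the recursion would end, although the resulting seed would still be of little use.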
There is no need to call Uniqueness - when creating a Random instance, it will be based on the current time anyway.
I suggest reading Random numbers from the C# in depth website.
The parameter-less constructor of Random already uses the current time as seed value. It uses the time ticks, used internally to represent a DateTime.
A problem with this approach, however, is that the time clock ticks very slowly compared to the CPU clock frequency. If you create a new instance of Random each time you need a random value, it may be that the clock did not tick between two calls, so both instances get the same seed and generate the same random number.
You can simply solve this problem by creating a single instance of Random.
public class TempUtil {
private static readonly Random random = new Random();
public string MakeUniqueFileName()
{
return @"C:\windows\temp\" + random.NextDouble() + ".tmp";
}
}
This will generate very good random numbers.
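The repeated-seed effect described above can be reproduced deterministically by passing the same seed explicitly, which stands in for two time-based seeds that happen to coincide. The class name, method names, and the seed value 12345 are arbitrary choices for this sketch.

```csharp
using System;

// Hypothetical demo class; names and seed are illustrative only.
public static class RandomSeedDemo
{
    private static readonly Random Shared = new Random(12345);

    // A fresh Random per call: with the same seed, every call
    // returns the first value of the same sequence.
    public static double NextFromNewInstance(int seed)
    {
        return new Random(seed).NextDouble();
    }

    // One shared instance: each call advances the same sequence.
    public static double NextFromShared()
    {
        return Shared.NextDouble();
    }

    public static void Main()
    {
        Console.WriteLine(NextFromNewInstance(12345) == NextFromNewInstance(12345)); // True
        Console.WriteLine(NextFromShared() == NextFromShared());                     // False
    }
}
```

Keep in mind that a shared Random instance is not safe for concurrent use from multiple threads without synchronization.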
By the way
System.IO.Path.GetTempFileName()
automatically creates an empty temporary file with a unique name and returns the full path of that file.
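A minimal usage sketch follows, with hypothetical class and method names. It also shows Path.GetRandomFileName combined with Path.GetTempPath, which yields a candidate path without creating the file, for cases where an empty placeholder file is not wanted.

```csharp
using System;
using System.IO;

// Hypothetical demo class; names are illustrative only.
public static class TempFileDemo
{
    // Creates an empty, uniquely named file in the temp directory
    // and returns its full path.
    public static string CreateTempFile()
    {
        return Path.GetTempFileName();
    }

    // Builds a random path in the temp directory without creating anything.
    public static string MakeCandidatePath()
    {
        return Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
    }

    public static void Main()
    {
        string path = CreateTempFile();
        try
        {
            Console.WriteLine(path);
            Console.WriteLine(File.Exists(path));         // True
            Console.WriteLine(new FileInfo(path).Length); // 0
            File.WriteAllText(path, "scratch data");
        }
        finally
        {
            // The caller is responsible for cleaning up the file.
            File.Delete(path);
        }

        Console.WriteLine(MakeCandidatePath());
    }
}
```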
Where to begin.
1. There is already a string compare. Use it. It has been debugged.
2. Your Uniqueness function is illogical. The first two case items return 'S', implicitly converted to an int. (Stacking the two case labels without a break is legal in C#, because the first one is empty.)
Your third case is like this:
if (x =="System.Int32") return int.Parse("System.Int32");
That throws a FormatException, because "System.Int32" is not a number.
Your fourth case is like this:
if (x == "System.Boolean") return strcmp("System.Boolean", "True");
Your default case is called recursively, causing the stack overflow (see comment above).
In order to fix this program, I recommend you read at least one good book on C#, then rethink your program, then write it. Perhaps Javascript would be a better fit.