Writing to console char by char, fastest way - C#

In a current project of mine I have to parse a string, and write parts of it to the console. While testing how to do this without too much overhead, I discovered that one way I was testing is actually faster than Console.WriteLine, which is slightly confusing to me.
I'm aware this is not the proper way to benchmark stuff, but I'm usually fine with a rough "this is faster than this", which I can tell after running it a few times.
static void Main(string[] args)
{
    var timer = new Stopwatch();

    timer.Restart();
    Test1("just a little test string.");
    timer.Stop();
    Console.WriteLine(timer.Elapsed);

    timer.Restart();
    Test2("just a little test string.");
    timer.Stop();
    Console.WriteLine(timer.Elapsed);

    timer.Restart();
    Test3("just a little test string.");
    timer.Stop();
    Console.WriteLine(timer.Elapsed);
}

static void Test1(string str)
{
    Console.WriteLine(str);
}

static void Test2(string str)
{
    foreach (var c in str)
        Console.Write(c);
    Console.Write('\n');
}

static void Test3(string str)
{
    using (var stream = new StreamWriter(Console.OpenStandardOutput()))
    {
        foreach (var c in str)
            stream.Write(c);
        stream.Write('\n');
    }
}
As you can see, Test1 is using Console.WriteLine. My first thought was to simply call Write for every char, see Test2. But this resulted in taking roughly twice as long. My guess would be that it flushes after every write, which makes it slower. So I tried Test3, using a StreamWriter (AutoFlush off), which resulted in being about 25% faster than Test1, and I'm really curious why that is. Or is it that writing to the console can't be benchmarked properly? (noticed some strange data when adding more test cases...)
Can someone enlighten me?
Also, if there's a better way to do this (going through a string and only writing parts of it to the console), feel free to comment on that.

First, I agree with the other comments that your test harness leaves something to be desired... I rewrote it and included it below. The results after the rewrite show a clear winner:
//Test 1 = 00:00:03.7066514
//Test 2 = 00:00:24.6765818
//Test 3 = 00:00:00.8609692
From this you can see you were correct: the buffered stream writer is more than 25% faster. It's faster only because it's buffered. Internally, the StreamWriter implementation uses a default buffer size of around 1-4 KB (depending on the stream type). If you construct the StreamWriter with an 8-byte buffer (the smallest allowed), most of the performance improvement disappears. You can also see this by adding a Flush() call after each write.
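The buffering effect can be seen without a console at all. In this sketch (my own illustration, not from the answer; the MemoryStream backing and the 8-character buffer are assumptions for demonstration), a StreamWriter with a deliberately tiny buffer spills to the underlying stream as it fills, and the final Flush() pushes out the remainder:

```csharp
using System;
using System.IO;
using System.Text;

class BufferDemo
{
    static void Main()
    {
        // Write into a MemoryStream so the buffer's behaviour is observable.
        // With an 8-character buffer, the writer spills to the backing stream
        // every time the buffer fills; only the tail waits for Flush().
        var backing = new MemoryStream();
        var writer = new StreamWriter(backing, Encoding.ASCII, 8);

        writer.Write("just a little test string.");
        long beforeFlush = backing.Length;  // most bytes already spilled

        writer.Flush();
        long afterFlush = backing.Length;   // all 26 bytes are in the stream now

        Console.WriteLine($"before flush: {beforeFlush}, after flush: {afterFlush}");
    }
}
```

With the default kilobyte-scale buffer, beforeFlush would be 0 instead: nothing reaches the underlying stream until the writer flushes or is disposed, which is exactly why the buffered Test3 beats per-character Console.Write.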
Here is the test rewritten to obtain the numbers above:
private static StreamWriter stdout = new StreamWriter(Console.OpenStandardOutput());

static void Main(string[] args)
{
    Action<string>[] tests = new Action<string>[] { Test1, Test2, Test3 };
    TimeSpan[] timings = new TimeSpan[tests.Length];

    // Repeat the entire sequence of tests many times to accumulate the result
    for (int i = 0; i < 100; i++)
    {
        for (int itest = 0; itest < tests.Length; itest++)
        {
            string text = String.Format("just a little test string, test = {0}, iteration = {1}", itest, i);
            Action<string> thisTest = tests[itest];

            // Clear the console so that each test begins from the same state
            Console.Clear();

            var timer = Stopwatch.StartNew();
            // Repeat the test many times; if this was not using the console
            // I would use a much higher number, say 10,000
            for (int j = 0; j < 100; j++)
                thisTest(text);
            timer.Stop();

            // Accumulate the result, but ignore the first run
            if (i != 0)
                timings[itest] += timer.Elapsed;
            // Depending on what you are benchmarking you may need to force GC here
        }
    }

    // Now print the results we have collected
    Console.Clear();
    for (int itest = 0; itest < tests.Length; itest++)
        Console.WriteLine("Test {0} = {1}", itest + 1, timings[itest]);
    Console.ReadLine();
}
static void Test1(string str)
{
    Console.WriteLine(str);
}

static void Test2(string str)
{
    foreach (var c in str)
        Console.Write(c);
    Console.Write('\n');
}

static void Test3(string str)
{
    foreach (var c in str)
        stdout.Write(c);
    stdout.Write('\n');
}

I ran your tests 10,000 times each, and the results are the following on my machine:
test1 - 0.6164241
test2 - 8.8143273
test3 - 0.9537039
This is the script I used:
static void Main(string[] args)
{
    Test1("just a little test string."); // warm up
    GC.Collect(); // compact heap
    GC.WaitForPendingFinalizers(); // and wait for the finalizer queue to empty

    Stopwatch timer = new Stopwatch();
    timer.Start();
    for (int i = 0; i < 10000; i++)
    {
        Test1("just a little test string.");
    }
    timer.Stop();
    Console.WriteLine(timer.Elapsed);
}

I changed the code to run each test 1000 times.
static void Main(string[] args) {
    var timer = new Stopwatch();

    timer.Restart();
    for (int i = 0; i < 1000; i++)
        Test1("just a little test string.");
    timer.Stop();
    TimeSpan elapsed1 = timer.Elapsed;

    timer.Restart();
    for (int i = 0; i < 1000; i++)
        Test2("just a little test string.");
    timer.Stop();
    TimeSpan elapsed2 = timer.Elapsed;

    timer.Restart();
    for (int i = 0; i < 1000; i++)
        Test3("just a little test string.");
    timer.Stop();
    TimeSpan elapsed3 = timer.Elapsed;

    Console.WriteLine(elapsed1);
    Console.WriteLine(elapsed2);
    Console.WriteLine(elapsed3);
    Console.Read();
}
My output:
00:00:05.2172738
00:00:09.3893525
00:00:05.9624869

I also ran this one 10000 times and got these results:
00:00:00.6947374
00:00:09.6185047
00:00:00.8006468
Which seems in keeping with what others observed. I was curious why Test3 was slower than Test1, so I wrote a fourth test:
timer.Start();
using (var stream = new StreamWriter(Console.OpenStandardOutput()))
{
    for (int i = 0; i < testSize; i++)
    {
        Test4("just a little test string.", stream);
    }
}
timer.Stop();
This one reuses the stream for each test, thus avoiding the overhead of recreating it each time. Result:
00:00:00.4090399
Although this is the fastest, it writes all the output at the end of the using block (when the writer is disposed and its buffer flushed), which may not be what you're after. I'd imagine this approach chews up more memory as well.
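If you want the buffered writer's throughput but still need output to appear as it is produced, one compromise (my own sketch, not part of the answer above) is to keep a single long-lived writer and flush once per line instead of once per character:

```csharp
using System;
using System.IO;

class LineBufferedOutput
{
    static void Main()
    {
        // One long-lived buffered writer for all output; per-character writes
        // stay cheap because they only touch the in-memory buffer.
        using (var stdout = new StreamWriter(Console.OpenStandardOutput()))
        {
            for (int i = 0; i < 3; i++)
            {
                foreach (var c in "just a little test string.")
                    stdout.Write(c);
                stdout.Write('\n');
                stdout.Flush(); // one flush per line, not one per character
            }
        }
    }
}
```

You pay one console round-trip per line rather than per character, which is usually the behaviour people actually expect from console output.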

Related

C# Unexpected Performance - Function Calls

I'm optimizing every line of code in my application, as performance is key. I'm testing all assumptions, as what I expect is not what I see in reality.
A strange occurrence to me is the performance of function calls. Below are two scenarios: incrementing an integer directly in the loop, and calling a function in the loop. I expected the function call to be slower; however, it is faster??
Can anyone explain this? I'm using .NET 4.7.1
Without function: 2808ms
With function: 2295ms
UPDATE:
Switching the order of the loops switches the runtimes as well - I don't understand why, but I'll accept it as it is. Running the two loops in separate applications gives similar results. I'll assume in future that a function call doesn't add any measurable overhead.
public static int a = 0;

public static void Increment()
{
    a = a + 1;
}

static void Main(string[] args)
{
    // There were suggestions that the first for loop always runs faster.
    // I have included a 'dummy' for loop here to warm up.
    a = 0;
    for (int i = 0; i < 1000; i++)
    {
        a = a + 1;
    }

    // Normal increment
    Stopwatch sw = new Stopwatch();
    sw.Start();
    a = 0;
    for (int i = 0; i < 900000000; i++)
    {
        a = a + 1;
    }
    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds);

    // Increment with function
    Stopwatch sw2 = new Stopwatch();
    sw2.Start();
    a = 0;
    for (int i = 0; i < 900000000; i++)
    {
        Increment();
    }
    sw2.Stop();
    Console.WriteLine(sw2.ElapsedMilliseconds);
    Console.ReadLine();
}
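One plausible explanation (my assumption; the post doesn't confirm it) is that the JIT simply inlines Increment, so both loops compile to nearly identical machine code, and the remaining difference comes from loop ordering and code alignment rather than call overhead. A sketch that makes real call overhead visible again by suppressing inlining with MethodImplOptions.NoInlining:

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

class InlineDemo
{
    static int a = 0;

    // Small enough that the JIT will normally inline it: the call disappears.
    static void IncrementInlined()
    {
        a = a + 1;
    }

    // Inlining suppressed: every iteration pays genuine call overhead.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void IncrementNotInlined()
    {
        a = a + 1;
    }

    static void Main()
    {
        const int N = 900000000;

        a = 0;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < N; i++)
            IncrementInlined();
        sw.Stop();
        Console.WriteLine("inlinable:   " + sw.ElapsedMilliseconds + "ms");

        a = 0;
        sw.Restart();
        for (int i = 0; i < N; i++)
            IncrementNotInlined();
        sw.Stop();
        Console.WriteLine("not inlined: " + sw.ElapsedMilliseconds + "ms");
    }
}
```

On most machines the NoInlining version should be measurably slower, which supports the inlining hypothesis without proving it; inspecting the JIT-generated assembly would be the definitive check.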

Weird DirectoryInfo.EnumerateFiles() performance pattern?

I was experimenting with DirectoryInfo.EnumerateFiles() and found this weird performance pattern that I can't understand. If I perform several successive enumerations, each successive enumeration takes longer than the previous one by a substantial amount of time. That is weird. But what's even weirder is that if I put the searches in a for loop, then with each iteration of the loop, the search times reset. Here's my method and my results:
static int i;

static void Main(string[] args)
{
    for (int j = 0; j < 15; j++)
    {
        var sw = new System.Diagnostics.Stopwatch();

        i = 0;
        sw.Start();
        EnumerateFiles();
        sw.Stop();
        Console.WriteLine(sw.Elapsed.ToString());

        i = 0;
        sw.Start();
        EnumerateFiles();
        sw.Stop();
        Console.WriteLine(sw.Elapsed.ToString());

        i = 0;
        sw.Start();
        EnumerateFiles();
        sw.Stop();
        Console.WriteLine(sw.Elapsed.ToString());

        i = 0;
        sw.Start();
        EnumerateFiles();
        sw.Stop();
        Console.WriteLine(sw.Elapsed.ToString());

        Console.WriteLine("====================================");
    }
    Console.ReadLine();
}

private static void EnumerateFiles()
{
    foreach (var item in new System.IO.DirectoryInfo("d:\\aac").EnumerateFiles("*.*", System.IO.SearchOption.AllDirectories))
    {
        i++;
    }
}
And the results:
Total files: 5386
00:00:00.2868080
00:00:00.5720745
00:00:00.8443089
00:00:01.1315225
====================================
00:00:00.2729422
00:00:00.5275304
00:00:00.8259863
00:00:01.0712183
====================================
00:00:00.2457264
00:00:00.4642581
00:00:00.6948112
00:00:00.9178203
====================================
00:00:00.2198666
00:00:00.4503493
00:00:00.6717144
00:00:00.8951899
====================================
00:00:00.2391378
00:00:00.4602923
00:00:00.6767395
00:00:00.9082248
====================================
// last one (15th iteration):
00:00:00.2138526
00:00:00.4437129
00:00:00.6626495
00:00:00.8794025
Does anyone know why that is happening?
Now, the answer to the question of why I am doing four searches in one iteration is: I was trying out some things, measuring the performance, and stumbled upon this anomaly, and now I want to know why it behaves like this.
Instead of all the sw.Start(); calls,
do sw = System.Diagnostics.Stopwatch.StartNew();
and try again.
The issue here is that you're not resetting the stopwatch, you're pausing it. Stop() keeps the accumulated elapsed time, so each successive Start() adds on top of the previous measurements.
You can also call sw.Reset(); sw.Start(); or sw.Restart() instead.
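The accumulation is easy to demonstrate: Stop() pauses without clearing the elapsed time, so a later Start() resumes from the previous total, while Restart() zeroes it first. A minimal sketch:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class StopwatchDemo
{
    static void Main()
    {
        var sw = new Stopwatch();

        sw.Start();
        Thread.Sleep(100);
        sw.Stop();

        // Start() resumes: the previous ~100 ms is still in the total.
        sw.Start();
        Thread.Sleep(100);
        sw.Stop();
        Console.WriteLine("after Start():   " + sw.Elapsed);  // roughly 200 ms

        // Restart() zeroes the counter before starting again.
        sw.Restart();
        Thread.Sleep(100);
        sw.Stop();
        Console.WriteLine("after Restart(): " + sw.Elapsed);  // roughly 100 ms
    }
}
```

This is exactly the pattern in the question: four Start()/Stop() pairs on one stopwatch make each printed time the running total, which looks like each enumeration "takes longer".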

Parallel.Foreach loop gets different result than For loop?

I've written a simple for loop iterating through an array, and a Parallel.ForEach loop doing the same thing. However, the results I get are different, so I want to ask: what the heck is going on? :D
class Program
{
    static void Main(string[] args)
    {
        long creating = 0;
        long reading = 0;
        long readingParallel = 0;

        for (int j = 0; j < 10; j++)
        {
            Stopwatch timer1 = new Stopwatch();
            Random rnd = new Random();
            int[] array = new int[100000000];

            timer1.Start();
            for (int i = 0; i < 100000000; i++)
            {
                array[i] = rnd.Next(5);
            }
            timer1.Stop();

            long result = 0;
            Stopwatch timer2 = new Stopwatch();
            timer2.Start();
            for (int i = 0; i < 100000000; i++)
            {
                result += array[i];
            }
            timer2.Stop();

            Stopwatch timer3 = new Stopwatch();
            long result2 = 0;
            timer3.Start();
            Parallel.ForEach(array, (item) =>
            {
                result2 += item;
            });
            if (result != result2)
            {
                Console.WriteLine(result + " - " + result2);
            }
            timer3.Stop();

            creating += timer1.ElapsedMilliseconds;
            reading += timer2.ElapsedMilliseconds;
            readingParallel += timer3.ElapsedMilliseconds;
        }

        // 10 iterations, so average over 10 (the original divided by 100)
        Console.WriteLine("Create : \t" + creating / 10);
        Console.WriteLine("Read: \t\t" + reading / 10);
        Console.WriteLine("ReadP: \t\t" + readingParallel / 10);
        Console.ReadKey();
    }
}
So inside that condition I get these results:
result = 200009295;
result2 = 35163054;
Is there anything wrong?
The += operator is non-atomic and actually performs multiple operations:
load value at location that result is pointing to, into memory
add array[i] to the in-memory value (I'm simplifying here)
write the result back to result
Since a lot of these add operations will be running in parallel it is not just possible, but likely that there will be races between some of these operations where one thread reads a result value and performs the addition, but before it has the chance to write it back, another thread grabs the old result value (which hasn't yet been updated) and also performs the addition. Then both threads write their respective values to result. Regardless of which one wins the race, you end up with a smaller number than expected.
This is why the Interlocked class exists.
Your code could very easily be fixed:
Parallel.ForEach(array, (item) =>
{
    Interlocked.Add(ref result2, item);
});
Don't be surprised if Parallel.ForEach ends up slower than the fully synchronous version in this case though. This is due to the fact that
the amount of work inside the delegate you pass to Parallel.ForEach is very small
Interlocked methods incur a slight but non-negligible overhead, which will be quite noticeable in this particular case
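If the Interlocked overhead matters, the usual remedy (a sketch on my part, using the localInit/localFinally overload of Parallel.ForEach; the array contents are just illustrative) is to accumulate into a per-thread subtotal and perform only one atomic add per worker thread:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class LocalSum
{
    static void Main()
    {
        int[] array = new int[1000000];
        for (int i = 0; i < array.Length; i++)
            array[i] = i % 5;

        long total = 0;
        Parallel.ForEach(
            array,
            () => 0L,                              // localInit: one subtotal per worker thread
            (item, state, local) => local + item,  // body: touches no shared state
            local => Interlocked.Add(ref total, local)); // localFinally: one atomic add per thread

        long expected = 0;
        foreach (var v in array)
            expected += v;
        Console.WriteLine(total + " == " + expected + ": " + (total == expected));
    }
}
```

Because the hot loop body is race-free by construction, the result is correct, and the contended atomic operation runs only a handful of times instead of a hundred million.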

Code sample that shows casting to uint is more efficient than range check

So I am looking at this question, and the general consensus is that the uint-cast version is more efficient than the range check against 0. Since the code is also in MS's implementation of List, I assume it is a real optimization. However, I have failed to produce a code sample that shows better performance for the uint version. I have tried different tests, and either something is missing or some other part of my code is dwarfing the time for the checks. My last attempt looks like this:
class TestType
{
    public TestType(int size)
    {
        MaxSize = size;
        Random rand = new Random(100);
        for (int i = 0; i < MaxIterations; i++)
        {
            indexes[i] = rand.Next(0, MaxSize);
        }
    }

    public const int MaxIterations = 10000000;
    private int MaxSize;
    private int[] indexes = new int[MaxIterations];

    public void Test()
    {
        var timer = new Stopwatch();
        int inRange = 0;
        int outOfRange = 0;

        timer.Start();
        for (int i = 0; i < MaxIterations; i++)
        {
            int x = indexes[i];
            if (x < 0 || x > MaxSize)
            {
                throw new Exception();
            }
            inRange += indexes[x];
        }
        timer.Stop();
        Console.WriteLine("Comparison 1: " + inRange + "/" + outOfRange + ", elapsed: " + timer.ElapsedMilliseconds + "ms");

        inRange = 0;
        outOfRange = 0;
        timer.Reset();
        timer.Start();
        for (int i = 0; i < MaxIterations; i++)
        {
            int x = indexes[i];
            if ((uint)x > (uint)MaxSize)
            {
                throw new Exception();
            }
            inRange += indexes[x];
        }
        timer.Stop();
        Console.WriteLine("Comparison 2: " + inRange + "/" + outOfRange + ", elapsed: " + timer.ElapsedMilliseconds + "ms");
    }
}

class Program
{
    static void Main()
    {
        TestType t = new TestType(TestType.MaxIterations);
        t.Test();
        TestType t2 = new TestType(TestType.MaxIterations);
        t2.Test();
        TestType t3 = new TestType(TestType.MaxIterations);
        t3.Test();
    }
}
The code is a bit of a mess because I tried many things to make the uint check perform faster, like moving the compared variable into a field of a class, generating random index accesses, and so on, but in every case the result seems to be the same for both versions. So is this change applicable on modern x86 processors, and can someone demonstrate it somehow?
Note that I am not asking for someone to fix my sample or explain what is wrong with it. I just want to see the case where the optimization does work.
if (x < 0 || x > MaxSize)
The comparison is performed by the CMP processor instruction (Compare). You'll want to take a look at Agner Fog's instruction tables document (PDF); it lists the cost of instructions. Find your processor in the list, then locate the CMP instruction.
For mine, Haswell, CMP takes 1 cycle of latency and 0.25 cycles of throughput.
A fractional cost like that needs an explanation: Haswell has 4 integer execution units that can execute instructions at the same time. When a program contains enough independent integer operations, like CMP, they can all execute at the same time, in effect making the program 4 times faster. You don't always manage to keep all 4 of them busy with your code; it is actually pretty rare. But you do keep 2 of them busy in this case. In other words, two comparisons take just as long as a single one: 1 cycle.
There are other factors at play that make the execution times identical. One thing that helps is that the processor can predict the branch very well; it can speculatively execute x > MaxSize in spite of the short-circuit evaluation. And it will in fact end up using the result, since the branch is never taken.
And the true bottleneck in this code is the array indexing; accessing memory is one of the slowest things the processor can do. So the "fast" version of the code isn't faster, even though it provides more opportunity for the processor to execute instructions concurrently. That isn't much of an opportunity today anyway; a processor has plenty of execution units to keep busy, which is incidentally the surplus that makes HyperThreading work. In both cases the processor bogs down at the same rate.
On my machine, I have to write code that occupies more than 4 engines to make it slower. Silly code like this:
if (x < 0 || x > MaxSize || x > 10000000 || x > 20000000 || x > 3000000) {
    outOfRange++;
}
else {
    inRange++;
}
Using 5 compares, now I can see a difference: 61 vs 47 msec. In other words, this is a way to count the number of integer engines in the processor. Hehe :)
So this is a micro-optimization that probably used to pay off a decade ago. It doesn't anymore. Scratch it off your list of things to worry about :)
I would suggest writing code which does not throw an exception when the index is out of range. Exceptions are incredibly expensive and can completely throw off your benchmark results.
The code below runs a timed-average benchmark: 1,000 iterations of 1,000,000 checks each.
using System;
using System.Diagnostics;

namespace BenchTest
{
    class Program
    {
        const int LoopCount = 1000000;
        const int AverageCount = 1000;

        static void Main(string[] args)
        {
            Console.WriteLine("Starting Benchmark");
            RunTest();
            Console.WriteLine("Finished Benchmark");
            Console.Write("Press any key to exit...");
            Console.ReadKey();
        }

        static void RunTest()
        {
            int cursorRow = Console.CursorTop;
            int cursorCol = Console.CursorLeft;
            long totalTime1 = 0;
            long totalTime2 = 0;
            long invalidOperationCount1 = 0;
            long invalidOperationCount2 = 0;

            for (int i = 0; i < AverageCount; i++)
            {
                Console.SetCursorPosition(cursorCol, cursorRow);
                Console.WriteLine("Running iteration: {0}/{1}", i + 1, AverageCount);

                int[] indexArgs = RandomFill(LoopCount, int.MinValue, int.MaxValue);
                int[] sizeArgs = RandomFill(LoopCount, 0, int.MaxValue);

                totalTime1 += RunLoop(TestMethod1, indexArgs, sizeArgs, ref invalidOperationCount1);
                totalTime2 += RunLoop(TestMethod2, indexArgs, sizeArgs, ref invalidOperationCount2);
            }

            PrintResult("Test 1", TimeSpan.FromTicks(totalTime1 / AverageCount), invalidOperationCount1);
            PrintResult("Test 2", TimeSpan.FromTicks(totalTime2 / AverageCount), invalidOperationCount2);
        }

        static void PrintResult(string testName, TimeSpan averageTime, long invalidOperationCount)
        {
            Console.WriteLine(testName);
            Console.WriteLine("  Average Time: {0}", averageTime);
            Console.WriteLine("  Invalid Operations: {0} ({1})", invalidOperationCount, (invalidOperationCount / (double)(AverageCount * LoopCount)).ToString("P3"));
        }

        static long RunLoop(Func<int, int, int> testMethod, int[] indexArgs, int[] sizeArgs, ref long invalidOperationCount)
        {
            Stopwatch sw = new Stopwatch();
            Console.Write("Running {0} sub-iterations", LoopCount);
            sw.Start();
            long startTickCount = sw.ElapsedTicks;
            for (int i = 0; i < LoopCount; i++)
            {
                invalidOperationCount += testMethod(indexArgs[i], sizeArgs[i]);
            }
            sw.Stop();
            long stopTickCount = sw.ElapsedTicks;
            long elapsedTickCount = stopTickCount - startTickCount;
            Console.WriteLine(" - Time Taken: {0}", new TimeSpan(elapsedTickCount));
            return elapsedTickCount;
        }

        static int[] RandomFill(int size, int minValue, int maxValue)
        {
            int[] randomArray = new int[size];
            Random rng = new Random();
            for (int i = 0; i < size; i++)
            {
                randomArray[i] = rng.Next(minValue, maxValue);
            }
            return randomArray;
        }

        static int TestMethod1(int index, int size)
        {
            return (index < 0 || index >= size) ? 1 : 0;
        }

        static int TestMethod2(int index, int size)
        {
            return ((uint)(index) >= (uint)(size)) ? 1 : 0;
        }
    }
}
You aren't comparing like with like.
The code you were talking about not only saved one branch by using the optimisation, but also 4 bytes of CIL in a small method.
In a small method 4 bytes can be the difference in being inlined and not being inlined.
And if the method calling that method is also written to be small, then that can mean two (or more) method calls are jitted as one piece of inline code.
And maybe, because it is inlined and available for analysis by the jitter, some of it is then optimised further still.
The real difference is not between index < 0 || index >= _size and (uint)index >= (uint)_size, but between code that has repeated efforts to minimise the method body size and code that does not. Look for example at how another method is used to throw the exception if necessary, further shaving off a couple of bytes of CIL.
(And no, that's not to say that I think all methods should be written like that, but there certainly can be performance differences when one does).
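The "repeated efforts to minimise the method body size" point refers to patterns like the ThrowHelper in List&lt;T&gt;. A hedged sketch (my own illustration; TinyList and its members are hypothetical, not the actual BCL code) of how moving the throw out keeps the hot path small and inlinable:

```csharp
using System;
using System.Runtime.CompilerServices;

class TinyList
{
    private readonly int[] _items = new int[16];
    private readonly int _size = 16;

    // The bounds check is a single unsigned compare (catches negative indexes
    // too, since they wrap to huge uint values), and the throw lives in a
    // separate, never-inlined method, so this getter's body stays tiny.
    public int Get(int index)
    {
        if ((uint)index >= (uint)_size)
            ThrowIndexOutOfRange();
        return _items[index];
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void ThrowIndexOutOfRange() =>
        throw new ArgumentOutOfRangeException("index");

    static void Main()
    {
        var list = new TinyList();
        Console.WriteLine(list.Get(3)); // prints 0
        try { list.Get(-1); }
        catch (ArgumentOutOfRangeException) { Console.WriteLine("caught"); }
    }
}
```

The payoff is not the compare itself but the CIL size: a getter this small is a good inlining candidate, and once inlined it can be optimised in the context of its caller.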

Am I undermining the efficiency of StringBuilder?

I've started using StringBuilder in preference to straight concatenation, but it seems like it's missing a crucial method. So, I implemented it myself, as an extension:
public static void Append(this StringBuilder stringBuilder, params string[] args)
{
    foreach (string arg in args)
        stringBuilder.Append(arg);
}
This turns the following mess:
StringBuilder sb = new StringBuilder();
...
sb.Append(SettingNode);
sb.Append(KeyAttribute);
sb.Append(setting.Name);
Into this:
sb.Append(SettingNode, KeyAttribute, setting.Name);
I could use sb.AppendFormat("{0}{1}{2}",..., but this seems even less preferred, and still harder to read. Is my extension a good method, or does it somehow undermine the benefits of StringBuilder? I'm not trying to prematurely optimize anything, as my method is more about readability than speed, but I'd also like to know I'm not shooting myself in the foot.
I see no problem with your extension. If it works for you it's all good.
I myself prefer:
sb.Append(SettingNode)
  .Append(KeyAttribute)
  .Append(setting.Name);
Questions like this can always be answered with a simple test case.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;

namespace SBTest
{
    class Program
    {
        private const int ITERATIONS = 1000000;

        private static void Main(string[] args)
        {
            Test1();
            Test2();
            Test3();
        }

        private static void Test1()
        {
            var sw = Stopwatch.StartNew();
            var sb = new StringBuilder();
            for (var i = 0; i < ITERATIONS; i++)
            {
                sb.Append("TEST" + i.ToString("00000"),
                          "TEST" + (i + 1).ToString("00000"),
                          "TEST" + (i + 2).ToString("00000"));
            }
            sw.Stop();
            Console.WriteLine("Testing Append() extension method...");
            Console.WriteLine("--------------------------------------------");
            Console.WriteLine("Test 1 iterations: {0:n0}", ITERATIONS);
            Console.WriteLine("Test 1 milliseconds: {0:n0}", sw.ElapsedMilliseconds);
            Console.WriteLine("Test 1 output length: {0:n0}", sb.Length);
            Console.WriteLine("");
        }

        private static void Test2()
        {
            var sw = Stopwatch.StartNew();
            var sb = new StringBuilder();
            for (var i = 0; i < ITERATIONS; i++)
            {
                sb.Append("TEST" + i.ToString("00000"));
                sb.Append("TEST" + (i + 1).ToString("00000"));
                sb.Append("TEST" + (i + 2).ToString("00000"));
            }
            sw.Stop();
            Console.WriteLine("Testing multiple calls to Append() built-in method...");
            Console.WriteLine("--------------------------------------------");
            Console.WriteLine("Test 2 iterations: {0:n0}", ITERATIONS);
            Console.WriteLine("Test 2 milliseconds: {0:n0}", sw.ElapsedMilliseconds);
            Console.WriteLine("Test 2 output length: {0:n0}", sb.Length);
            Console.WriteLine("");
        }

        private static void Test3()
        {
            var sw = Stopwatch.StartNew();
            var sb = new StringBuilder();
            for (var i = 0; i < ITERATIONS; i++)
            {
                sb.AppendFormat("{0}{1}{2}",
                    "TEST" + i.ToString("00000"),
                    "TEST" + (i + 1).ToString("00000"),
                    "TEST" + (i + 2).ToString("00000"));
            }
            sw.Stop();
            Console.WriteLine("Testing AppendFormat() built-in method...");
            Console.WriteLine("--------------------------------------------");
            Console.WriteLine("Test 3 iterations: {0:n0}", ITERATIONS);
            Console.WriteLine("Test 3 milliseconds: {0:n0}", sw.ElapsedMilliseconds);
            Console.WriteLine("Test 3 output length: {0:n0}", sb.Length);
            Console.WriteLine("");
        }
    }

    public static class SBExtentions
    {
        public static void Append(this StringBuilder sb, params string[] args)
        {
            foreach (var arg in args)
                sb.Append(arg);
        }
    }
}
On my PC, the output is:
Testing Append() extension method...
--------------------------------------------
Test 1 iterations: 1,000,000
Test 1 milliseconds: 1,080
Test 1 output length: 29,700,006
Testing multiple calls to Append() built-in method...
--------------------------------------------
Test 2 iterations: 1,000,000
Test 2 milliseconds: 1,001
Test 2 output length: 29,700,006
Testing AppendFormat() built-in method...
--------------------------------------------
Test 3 iterations: 1,000,000
Test 3 milliseconds: 1,124
Test 3 output length: 29,700,006
So your extension method is only slightly slower than the Append() method and is slightly faster than the AppendFormat() method, but in all 3 cases, the difference is entirely too trivial to worry about. Thus, if your extension method enhances the readability of your code, use it!
There's a little bit of overhead in creating the extra array, but I doubt it's a lot. You should measure.
If it turns out that the overhead of creating string arrays is significant, you can mitigate it by having several overloads - one for two parameters, one for three, one for four, etc. - so that only when you get to a higher number of parameters (e.g. six or seven) does it need to create the array. The overloads would be like this:
public static void Append(this StringBuilder builder, string item1, string item2)
{
    builder.Append(item1);
    builder.Append(item2);
}

public static void Append(this StringBuilder builder, string item1, string item2, string item3)
{
    builder.Append(item1);
    builder.Append(item2);
    builder.Append(item3);
}

public static void Append(this StringBuilder builder, string item1, string item2,
                          string item3, string item4)
{
    builder.Append(item1);
    builder.Append(item2);
    builder.Append(item3);
    builder.Append(item4);
}
// etc
And then one final overload using params, e.g.
public static void Append(this StringBuilder builder, string item1, string item2,
                          string item3, string item4, params string[] otherItems)
{
    builder.Append(item1);
    builder.Append(item2);
    builder.Append(item3);
    builder.Append(item4);
    foreach (string item in otherItems)
    {
        builder.Append(item);
    }
}
I'd certainly expect these (or just your original extension method) to be faster than using AppendFormat - which needs to parse the format string, after all.
Note that I didn't make these overloads call each other pseudo-recursively - I suspect they'd be inlined, but if they weren't the overhead of setting up a new stack frame etc could end up being significant. (We're assuming the overhead of the array is significant, if we've got this far.)
Other than a bit of overhead, I don't personally see any issues with it. Definitely more readable. As long as you're passing a reasonable number of params in I don't see the problem.
From a clarity perspective, your extension is ok.
It would probably be best to simply use the .Append(x).Append(y).Append(z) form if you never have more than about 5 or 6 items.
StringBuilder itself only nets you a performance gain over plain concatenation if you're processing many thousands of items. In addition, you'll be creating the array every time you call the method.
So if you're doing it for clarity, that's ok. If you're doing it for efficiency, then you're probably on the wrong track.
I wouldn't say you're undermining its efficiency, but you may be doing something inefficient when a more efficient method is available. AppendFormat is what I think you want here. If the {0}{1}{2} format string being used everywhere is too ugly, I tend to put my format strings in consts at the top, so the call site looks more or less the same as your extension:
sb.AppendFormat(SETTING_FORMAT, var1, var2, var3);
I haven't tested recently, but in the past, StringBuilder was actually slower than plain-vanilla string concatenation ("this " + "that") until you get to about 7 concatenations.
If this is string concatenation that is not happening in a loop, you may want to consider if you should be using the StringBuilder at all. (In a loop, I start to worry about allocations with plain-vanilla string concatenation, since strings are immutable.)
Potentially even faster, because it performs at most one reallocation/copy step, for many appends.
public static void Append(this StringBuilder stringBuilder, params string[] args)
{
    // Pre-size the builder so all the appends together cause at most one
    // reallocation/copy of the internal buffer.
    int required = stringBuilder.Length;
    foreach (string arg in args)
        required += arg.Length;
    if (stringBuilder.Capacity < required)
        stringBuilder.Capacity = required;
    foreach (string arg in args)
        stringBuilder.Append(arg);
}
Ultimately it comes down to which one results in less string creation. I have a feeling that the extension will result in a higher string count than using the string format, but the performance probably won't be that different.
Chris,
Inspired by this Jon Skeet response (second answer), I slightly rewrote your code. Basically, I added a TestRunner method which runs the passed-in function and reports the elapsed time, eliminating a little redundant code. Not to be smug, but rather as a programming exercise for myself. I hope it's helpful.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;

namespace SBTest
{
    class Program
    {
        private static void Main(string[] args)
        {
            // JIT everything
            AppendTest(1);
            AppendFormatTest(1);

            int iterations = 1000000;

            // Run tests
            TestRunner(AppendTest, iterations);
            TestRunner(AppendFormatTest, iterations);
            Console.ReadLine();
        }

        private static void TestRunner(Func<int, long> action, int iterations)
        {
            GC.Collect();
            var sw = Stopwatch.StartNew();
            long length = action(iterations);
            sw.Stop();
            Console.WriteLine("--------------------- {0} -----------------------", action.Method.Name);
            Console.WriteLine("iterations: {0:n0}", iterations);
            Console.WriteLine("milliseconds: {0:n0}", sw.ElapsedMilliseconds);
            Console.WriteLine("output length: {0:n0}", length);
            Console.WriteLine("");
        }

        private static long AppendTest(int iterations)
        {
            var sb = new StringBuilder();
            for (var i = 0; i < iterations; i++)
            {
                sb.Append("TEST" + i.ToString("00000"),
                          "TEST" + (i + 1).ToString("00000"),
                          "TEST" + (i + 2).ToString("00000"));
            }
            return sb.Length;
        }

        private static long AppendFormatTest(int iterations)
        {
            var sb = new StringBuilder();
            for (var i = 0; i < iterations; i++)
            {
                sb.AppendFormat("{0}{1}{2}",
                    "TEST" + i.ToString("00000"),
                    "TEST" + (i + 1).ToString("00000"),
                    "TEST" + (i + 2).ToString("00000"));
            }
            return sb.Length;
        }
    }

    public static class SBExtentions
    {
        public static void Append(this StringBuilder sb, params string[] args)
        {
            foreach (var arg in args)
                sb.Append(arg);
        }
    }
}
Here's the output:
--------------------- AppendTest -----------------------
iterations: 1,000,000
milliseconds: 1,274
output length: 29,700,006
--------------------- AppendFormatTest -----------------------
iterations: 1,000,000
milliseconds: 1,381
output length: 29,700,006
