LIST FOR 00:00:00.0000980
LIST FOREACH 00:00:00.0000007
ARRAY FOR 00:00:00.0028450
ARRAY FOREACH 00:00:00.0051233
I have always made performance heavy things with arrays, but it seems lists are much faster.
using System;
using System.Diagnostics;
using System.Collections.Generic;
public class Program
{
    public static void Main()
    {
        int[] testArray = new int[1000000];
        List<int> testList = new List<int>(1000000);

        var stopWatch = new Stopwatch();
        stopWatch.Start();
        for (int i = 0; i < testList.Count; i++)
        {
            var s = 1;
        }
        stopWatch.Stop();
        Console.WriteLine("LIST FOR " + stopWatch.Elapsed);

        stopWatch = new Stopwatch();
        stopWatch.Start();
        foreach (int i in testList)
        {
            var d = 1;
        }
        stopWatch.Stop();
        Console.WriteLine("LIST FOREACH " + stopWatch.Elapsed);

        stopWatch = new Stopwatch();
        stopWatch.Start();
        for (int i = 0; i < testArray.Length; i++)
        {
            var f = 1;
        }
        stopWatch.Stop();
        Console.WriteLine("ARRAY FOR " + stopWatch.Elapsed);

        stopWatch = new Stopwatch();
        stopWatch.Start();
        foreach (int i in testArray)
        {
            var h = 1;
        }
        stopWatch.Stop();
        Console.WriteLine("ARRAY FOREACH " + stopWatch.Elapsed);
    }
}
https://dotnetfiddle.net/RKWBJl
List<T> is internally backed by a T[]; it just replaces that array with a larger one whenever Capacity is exceeded. All the important operations ultimately run as array operations; the list merely adds the convenience of dynamic expansion.
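The growth behaviour is easy to observe through the Capacity property (a minimal sketch; the exact growth factors are an implementation detail of the runtime):

```csharp
using System;
using System.Collections.Generic;

public class CapacityDemo
{
    public static void Main()
    {
        var list = new List<int>(); // starts with no backing array
        int lastCapacity = -1;
        for (int i = 0; i < 100; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                // prints a new line each time the backing array is replaced
                Console.WriteLine("Count=" + list.Count + ", Capacity=" + list.Capacity);
                lastCapacity = list.Capacity;
            }
        }
    }
}
```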
I don't fully understand this, but I need to know which to use for heavy, frequent operations.
Basically I'm trying to determine which is better for achieving high FPS.
Those tests are not comparable.
This creates an array with a million entries.
int[] testArray = new int[1000000];
This creates an empty list, but with memory allocated in the background for up to a million entries (so it doesn't have to resize its internal buffer until it reaches a million).
List<int> testList = new List<int>(1000000);
So testArray.Length will be 1000000 here, while testList.Count will be 0 (since you never actually add any items to it).
Performance-wise, there are also a lot of other things to consider than just looping over the values. Adding or removing items can be very expensive for an array if it forces you to resize it, for example.
A second thing here is that your way of benchmarking these things is not really reliable either. If you want actual benchmarks, check out https://benchmarkdotnet.org, which does this properly and handles things such as warmup runs, memory allocations, running on different framework versions, etc.
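A sketch of what a comparable (if still crude) version of the original test might look like, with the list actually populated so both loops iterate a million elements (still subject to all the usual Stopwatch-benchmark caveats):

```csharp
using System;
using System.Diagnostics;
using System.Collections.Generic;

public class FairLoopTest
{
    public static void Main()
    {
        const int n = 1000000;
        int[] testArray = new int[n];
        var testList = new List<int>(n);
        for (int i = 0; i < n; i++)
            testList.Add(0); // now Count == Length == n

        long sum = 0;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < testList.Count; i++)
            sum += testList[i]; // read the element so the loop can't be optimized away
        sw.Stop();
        Console.WriteLine("LIST FOR  " + sw.Elapsed);

        sw = Stopwatch.StartNew();
        for (int i = 0; i < testArray.Length; i++)
            sum += testArray[i];
        sw.Stop();
        Console.WriteLine("ARRAY FOR " + sw.Elapsed);
    }
}
```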
I'm optimizing every line of code in my application, as performance is key. I'm testing all assumptions, as what I expect is not what I see in reality.
A strange occurrence to me is the performance of function calls. Below are two scenarios: incrementing an integer directly inside the loop, and incrementing it through a function call inside the loop. I expected the function call to be slower; however, it is faster?
Can anyone explain this? I'm using .NET 4.7.1
Without function: 2808ms
With function 2295ms
UPDATE:
Switching the order of the loops switches the runtimes as well. I don't understand why, but I'll accept it as it is. Running the two loops in separate applications gives similar results. I'll assume in the future that a function call doesn't add any meaningful overhead.
public static int a = 0;

public static void Increment()
{
    a = a + 1;
}

static void Main(string[] args)
{
    // There were suggestions that the first for loop always runs faster.
    // I have included a 'dummy' for loop here to warm up.
    a = 0;
    for (int i = 0; i < 1000; i++)
    {
        a = a + 1;
    }

    // Normal increment
    Stopwatch sw = new Stopwatch();
    sw.Start();
    a = 0;
    for (int i = 0; i < 900000000; i++)
    {
        a = a + 1;
    }
    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds);

    // Increment with function
    Stopwatch sw2 = new Stopwatch();
    sw2.Start();
    a = 0;
    for (int i = 0; i < 900000000; i++)
    {
        Increment();
    }
    sw2.Stop();
    Console.WriteLine(sw2.ElapsedMilliseconds);
    Console.ReadLine();
}
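One plausible explanation is that the JIT inlines the Increment call, making the two loops effectively identical. A way to test that hypothesis (a sketch, not a rigorous benchmark) is to forbid inlining and see whether a gap appears:

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public class InliningTest
{
    public static int a = 0;

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void IncrementNoInline()
    {
        a = a + 1;
    }

    public static void Main()
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < 900000000; i++)
            IncrementNoInline(); // a real call on every iteration
        sw.Stop();
        Console.WriteLine("NoInlining: " + sw.ElapsedMilliseconds + "ms");
    }
}
```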
I've written a simple for loop iterating through an array, and a Parallel.ForEach loop doing the same thing. However, the results I get are different, so I want to ask: what the heck is going on? :D
class Program
{
    static void Main(string[] args)
    {
        long creating = 0;
        long reading = 0;
        long readingParallel = 0;
        for (int j = 0; j < 10; j++)
        {
            Stopwatch timer1 = new Stopwatch();
            Random rnd = new Random();
            int[] array = new int[100000000];
            timer1.Start();
            for (int i = 0; i < 100000000; i++)
            {
                array[i] = rnd.Next(5);
            }
            timer1.Stop();

            long result = 0;
            Stopwatch timer2 = new Stopwatch();
            timer2.Start();
            for (int i = 0; i < 100000000; i++)
            {
                result += array[i];
            }
            timer2.Stop();

            Stopwatch timer3 = new Stopwatch();
            long result2 = 0;
            timer3.Start();
            Parallel.ForEach(array, (item) =>
            {
                result2 += item;
            });
            timer3.Stop();
            if (result != result2)
            {
                Console.WriteLine(result + " - " + result2);
            }

            creating += timer1.ElapsedMilliseconds;
            reading += timer2.ElapsedMilliseconds;
            readingParallel += timer3.ElapsedMilliseconds;
        }
        // 10 outer iterations, so divide by 10 for the average
        Console.WriteLine("Create : \t" + creating / 10);
        Console.WriteLine("Read: \t\t" + reading / 10);
        Console.WriteLine("ReadP: \t\t" + readingParallel / 10);
        Console.ReadKey();
    }
}
So inside that if condition I get results like:
result = 200009295;
result2 = 35163054;
Is there anything wrong?
The += operator is non-atomic and actually performs multiple operations:
load value at location that result is pointing to, into memory
add array[i] to the in-memory value (I'm simplifying here)
write the result back to result
Since a lot of these add operations will be running in parallel it is not just possible, but likely that there will be races between some of these operations where one thread reads a result value and performs the addition, but before it has the chance to write it back, another thread grabs the old result value (which hasn't yet been updated) and also performs the addition. Then both threads write their respective values to result. Regardless of which one wins the race, you end up with a smaller number than expected.
This is why the Interlocked class exists.
Your code could very easily be fixed:
Parallel.ForEach(array, (item) =>
{
    Interlocked.Add(ref result2, item);
});
Don't be surprised if Parallel.ForEach ends up slower than the fully synchronous version in this case though. This is due to the fact that
the amount of work inside the delegate you pass to Parallel.ForEach is very small
Interlocked methods incur a slight but non-negligible overhead, which will be quite noticeable in this particular case
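One way to keep the parallel version competitive (a sketch) is to accumulate a per-thread partial sum via the localInit/localFinally overload of Parallel.ForEach, so Interlocked is touched only once per worker thread instead of once per element:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class PartitionedSum
{
    public static void Main()
    {
        int[] array = new int[100000000];
        for (int i = 0; i < array.Length; i++)
            array[i] = i % 5;

        long total = 0;
        Parallel.ForEach(
            array,
            () => 0L,                                  // localInit: per-thread partial sum
            (item, state, local) => local + item,      // body: touches no shared state
            local => Interlocked.Add(ref total, local) // localFinally: one atomic add per thread
        );
        Console.WriteLine(total);
    }
}
```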
In a current project of mine I have to parse a string, and write parts of it to the console. While testing how to do this without too much overhead, I discovered that one way I was testing is actually faster than Console.WriteLine, which is slightly confusing to me.
I'm aware this is not the proper way to benchmark stuff, but I'm usually fine with a rough "this is faster than this", which I can tell after running it a few times.
static void Main(string[] args)
{
    var timer = new Stopwatch();

    timer.Restart();
    Test1("just a little test string.");
    timer.Stop();
    Console.WriteLine(timer.Elapsed);

    timer.Restart();
    Test2("just a little test string.");
    timer.Stop();
    Console.WriteLine(timer.Elapsed);

    timer.Restart();
    Test3("just a little test string.");
    timer.Stop();
    Console.WriteLine(timer.Elapsed);
}

static void Test1(string str)
{
    Console.WriteLine(str);
}

static void Test2(string str)
{
    foreach (var c in str)
        Console.Write(c);
    Console.Write('\n');
}

static void Test3(string str)
{
    using (var stream = new StreamWriter(Console.OpenStandardOutput()))
    {
        foreach (var c in str)
            stream.Write(c);
        stream.Write('\n');
    }
}
As you can see, Test1 is using Console.WriteLine. My first thought was to simply call Write for every char, see Test2. But this resulted in taking roughly twice as long. My guess would be that it flushes after every write, which makes it slower. So I tried Test3, using a StreamWriter (AutoFlush off), which resulted in being about 25% faster than Test1, and I'm really curious why that is. Or is it that writing to the console can't be benchmarked properly? (noticed some strange data when adding more test cases...)
Can someone enlighten me?
Also, if there's a better way to do this (going though a string and only writing parts of it to the console), feel free to comment on that.
First, I agree with the other comments that your test harness leaves something to be desired... I rewrote it and included it below. The results after the rewrite show a clear winner:
//Test 1 = 00:00:03.7066514
//Test 2 = 00:00:24.6765818
//Test 3 = 00:00:00.8609692
From this you can see you were right: the buffered stream writer is more than 25% faster. It's faster only because it's buffered. Internally the StreamWriter implementation uses a default buffer size of around 1-4 KB (depending on the stream type). If you construct the StreamWriter with an 8-byte buffer (the smallest allowed), most of the performance improvement disappears. You can also see this by adding a Flush() call after each write.
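To see the buffering effect directly, you can construct the writer with an explicitly tiny buffer (a sketch; whether 8 bytes is truly the minimum is a framework implementation detail):

```csharp
using System;
using System.IO;

public class BufferSizeDemo
{
    public static void Main()
    {
        // A tiny buffer forces a flush to the underlying stream on almost every
        // write, which should erase most of the advantage over Console.Write.
        using (var tiny = new StreamWriter(Console.OpenStandardOutput(),
                                           Console.OutputEncoding, bufferSize: 8))
        {
            foreach (var c in "just a little test string.")
                tiny.Write(c);
            tiny.Write('\n');
        }
    }
}
```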
Here is the test rewritten to obtain the numbers above:
private static StreamWriter stdout = new StreamWriter(Console.OpenStandardOutput());

static void Main(string[] args)
{
    Action<string>[] tests = new Action<string>[] { Test1, Test2, Test3 };
    TimeSpan[] timing = new TimeSpan[tests.Length];

    // Repeat the entire sequence of tests many times to accumulate the result
    for (int i = 0; i < 100; i++)
    {
        for (int itest = 0; itest < tests.Length; itest++)
        {
            string text = String.Format("just a little test string, test = {0}, iteration = {1}", itest, i);
            Action<string> thisTest = tests[itest];

            // Clear the console so that each test begins from the same state
            Console.Clear();
            var timer = Stopwatch.StartNew();

            // Repeat the test many times; if this was not using the console
            // I would use a much higher number, say 10,000
            for (int j = 0; j < 100; j++)
                thisTest(text);
            timer.Stop();

            // Accumulate the result, but ignore the first run
            if (i != 0)
                timing[itest] += timer.Elapsed;

            // Depending on what you are benchmarking you may need to force GC here
        }
    }

    // Now print the results we have collected
    Console.Clear();
    for (int itest = 0; itest < tests.Length; itest++)
        Console.WriteLine("Test {0} = {1}", itest + 1, timing[itest]);
    Console.ReadLine();
}

static void Test1(string str)
{
    Console.WriteLine(str);
}

static void Test2(string str)
{
    foreach (var c in str)
        Console.Write(c);
    Console.Write('\n');
}

static void Test3(string str)
{
    foreach (var c in str)
        stdout.Write(c);
    stdout.Write('\n');
}
I ran each of your tests 10000 times, and the results on my machine are the following:
test1 - 0.6164241
test2 - 8.8143273
test3 - 0.9537039
This is the code I used (shown for Test1; the other tests were measured the same way):
static void Main(string[] args)
{
    Test1("just a little test string."); // warm up
    GC.Collect(); // compact heap
    GC.WaitForPendingFinalizers(); // and wait for the finalizer queue to empty

    Stopwatch timer = new Stopwatch();
    timer.Start();
    for (int i = 0; i < 10000; i++)
    {
        Test1("just a little test string.");
    }
    timer.Stop();
    Console.WriteLine(timer.Elapsed);
}
I changed the code to run each test 1000 times.
static void Main(string[] args)
{
    var timer = new Stopwatch();

    timer.Restart();
    for (int i = 0; i < 1000; i++)
        Test1("just a little test string.");
    timer.Stop();
    TimeSpan elapsed1 = timer.Elapsed;

    timer.Restart();
    for (int i = 0; i < 1000; i++)
        Test2("just a little test string.");
    timer.Stop();
    TimeSpan elapsed2 = timer.Elapsed;

    timer.Restart();
    for (int i = 0; i < 1000; i++)
        Test3("just a little test string.");
    timer.Stop();
    TimeSpan elapsed3 = timer.Elapsed;

    Console.WriteLine(elapsed1);
    Console.WriteLine(elapsed2);
    Console.WriteLine(elapsed3);
    Console.Read();
}
My output:
00:00:05.2172738
00:00:09.3893525
00:00:05.9624869
I also ran this one 10000 times and got these results:
00:00:00.6947374
00:00:09.6185047
00:00:00.8006468
Which seems in keeping with what others observed. I was curious why Test3 was slower than Test1, so I wrote a fourth test:
timer.Start();
using (var stream = new StreamWriter(Console.OpenStandardOutput()))
{
    for (int i = 0; i < testSize; i++)
    {
        Test4("just a little test string.", stream);
    }
}
timer.Stop();
This one reuses the stream for each test, thus avoiding the overhead of recreating it each time. Result:
00:00:00.4090399
Although this is the fastest, it writes all the output at the end of the using block, which may not be what you are after. I would imagine that this approach would chew up more memory as well.
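If deferring all output to the end of the using block is a problem, the stream can be flushed at chosen points (a sketch assuming the same Test4(string, StreamWriter) shape as above):

```csharp
using System;
using System.IO;

public class FlushDemo
{
    static void Test4(string str, StreamWriter stream)
    {
        foreach (var c in str)
            stream.Write(c);
        stream.Write('\n');
    }

    public static void Main()
    {
        using (var stream = new StreamWriter(Console.OpenStandardOutput()))
        {
            for (int i = 0; i < 100; i++)
            {
                Test4("just a little test string.", stream);
                if (i % 10 == 9)
                    stream.Flush(); // make output visible every 10 lines instead of only at dispose
            }
        }
    }
}
```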
I can't figure out a discrepancy between the time it takes for the Contains method to find an element in an ArrayList and the time it takes for a small function that I wrote to do the same thing. The documentation states that Contains performs a linear search, so it's supposed to be O(n) and not using any faster method. However, while the exact values may not be relevant, the Contains method returns in 00:00:00.1087087 seconds while my function takes 00:00:00.1876165. It might not be much, but this difference becomes more evident when dealing with even larger arrays. What am I missing, and how should I write my function to match Contains's performance?
I'm using C# on .NET 3.5.
public partial class Window1 : Window
{
    public bool DoesContain(ArrayList list, object element)
    {
        for (int i = 0; i < list.Count; i++)
            if (list[i].Equals(element)) return true;
        return false;
    }

    public Window1()
    {
        InitializeComponent();
        ArrayList list = new ArrayList();
        for (int i = 0; i < 10000000; i++) list.Add("zzz " + i);
        Stopwatch sw = new Stopwatch();
        sw.Start();
        //Console.Out.WriteLine(list.Contains("zzz 9000000") + " " + sw.Elapsed);
        Console.Out.WriteLine(DoesContain(list, "zzz 9000000") + " " + sw.Elapsed);
    }
}
EDIT:
Okay, now, lads, look:
public partial class Window1 : Window
{
    public bool DoesContain(ArrayList list, object element)
    {
        int count = list.Count;
        for (int i = count - 1; i >= 0; i--)
            if (element.Equals(list[i])) return true;
        return false;
    }

    public bool DoesContain1(ArrayList list, object element)
    {
        int count = list.Count;
        for (int i = 0; i < count; i++)
            if (element.Equals(list[i])) return true;
        return false;
    }

    public Window1()
    {
        InitializeComponent();
        ArrayList list = new ArrayList();
        for (int i = 0; i < 10000000; i++) list.Add("zzz " + i);
        Stopwatch sw = new Stopwatch();
        long total = 0;
        int nr = 100;

        for (int i = 0; i < nr; i++)
        {
            sw.Reset();
            sw.Start();
            DoesContain(list, "zzz");
            total += sw.ElapsedMilliseconds;
        }
        Console.Out.WriteLine(total / nr);

        total = 0;
        for (int i = 0; i < nr; i++)
        {
            sw.Reset();
            sw.Start();
            DoesContain1(list, "zzz");
            total += sw.ElapsedMilliseconds;
        }
        Console.Out.WriteLine(total / nr);

        total = 0;
        for (int i = 0; i < nr; i++)
        {
            sw.Reset();
            sw.Start();
            list.Contains("zzz");
            total += sw.ElapsedMilliseconds;
        }
        Console.Out.WriteLine(total / nr);
    }
}
I averaged 100 runs for two versions of my function (forward and backward loop) and for the built-in Contains. The times I got were 136 and 133 milliseconds for my functions, and a distant winner of 87 for Contains. Now, if before you could argue that the data was scarce and I based my conclusions on a single isolated run, what do you say about this test? Not only does Contains perform better on average, it achieves consistently better results in every run. So, is there some kind of disadvantage here for third-party functions, or what?
First, you're not running it many times and comparing averages.
Second, your method isn't being jitted until it actually runs. So the just in time compile time is added into its execution time.
A true test would run each multiple times and average the results (any number of things could cause one or the other to be slower for run X out of a total of Y), and your assemblies should be pre-jitted using ngen.exe.
As you're using .NET 3.5, why are you using ArrayList to start with, rather than List<string>?
A few things to try:
You could see whether using foreach instead of a for loop helps
You could cache the count:
public bool DoesContain(ArrayList list, object element)
{
    int count = list.Count;
    for (int i = 0; i < count; i++)
    {
        if (list[i].Equals(element))
        {
            return true;
        }
    }
    return false;
}
You could reverse the comparison:
if (element.Equals(list[i]))
While I don't expect any of these to make a significant (positive) difference, they're the next things I'd try.
Do you need to do this containment test more than once? If so, you might want to build a HashSet<T> and use that repeatedly.
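For repeated membership tests, a set gives amortized O(1) lookups instead of an O(n) scan per call (a minimal sketch):

```csharp
using System;
using System.Collections.Generic;

public class SetLookupDemo
{
    public static void Main()
    {
        var set = new HashSet<string>();
        for (int i = 0; i < 1000000; i++)
            set.Add("zzz " + i);

        // Each Contains is a hash lookup, not a linear search.
        Console.WriteLine(set.Contains("zzz 900000")); // True
        Console.WriteLine(set.Contains("zzz"));        // False
    }
}
```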
I'm not sure if you're allowed to post Reflector code, but if you open the method using Reflector, you can see that's it's essentially the same (there are some optimizations for null values, but your test harness doesn't include nulls).
The only difference that I can see is that calling list[i] does bounds checking on i whereas the Contains method does not.
Using the code below I was able to get the following timings relatively consistently (within a few ms):
1: 190ms DoesContainRev
2: 198ms DoesContainRev1
3: 188ms DoesContainFwd
4: 203ms DoesContainFwd1
5: 199ms Contains
Several things to notice here.
This is run with release-compiled code from the command line. Many people make the mistake of benchmarking code inside the Visual Studio debugging environment; not to say anyone here did, but it's something to be careful of.
The list[i].Equals(element) appears to be just a bit slower than element.Equals(list[i]).
using System;
using System.Diagnostics;
using System.Collections;

namespace ArrayListBenchmark
{
    class Program
    {
        static void Main(string[] args)
        {
            Stopwatch sw = new Stopwatch();
            const int arrayCount = 10000000;
            ArrayList list = new ArrayList(arrayCount);
            for (int i = 0; i < arrayCount; i++) list.Add("zzz " + i);

            sw.Start();
            DoesContainRev(list, "zzz");
            sw.Stop();
            Console.WriteLine(String.Format("1: {0}", sw.ElapsedMilliseconds));
            sw.Reset();

            sw.Start();
            DoesContainRev1(list, "zzz");
            sw.Stop();
            Console.WriteLine(String.Format("2: {0}", sw.ElapsedMilliseconds));
            sw.Reset();

            sw.Start();
            DoesContainFwd(list, "zzz");
            sw.Stop();
            Console.WriteLine(String.Format("3: {0}", sw.ElapsedMilliseconds));
            sw.Reset();

            sw.Start();
            DoesContainFwd1(list, "zzz");
            sw.Stop();
            Console.WriteLine(String.Format("4: {0}", sw.ElapsedMilliseconds));
            sw.Reset();

            sw.Start();
            list.Contains("zzz");
            sw.Stop();
            Console.WriteLine(String.Format("5: {0}", sw.ElapsedMilliseconds));
            sw.Reset();
            Console.ReadKey();
        }

        public static bool DoesContainRev(ArrayList list, object element)
        {
            int count = list.Count;
            for (int i = count - 1; i >= 0; i--)
                if (element.Equals(list[i])) return true;
            return false;
        }

        public static bool DoesContainFwd(ArrayList list, object element)
        {
            int count = list.Count;
            for (int i = 0; i < count; i++)
                if (element.Equals(list[i])) return true;
            return false;
        }

        public static bool DoesContainRev1(ArrayList list, object element)
        {
            int count = list.Count;
            for (int i = count - 1; i >= 0; i--)
                if (list[i].Equals(element)) return true;
            return false;
        }

        public static bool DoesContainFwd1(ArrayList list, object element)
        {
            int count = list.Count;
            for (int i = 0; i < count; i++)
                if (list[i].Equals(element)) return true;
            return false;
        }
    }
}
With a really good optimizer there should be no difference at all, because the semantics are the same. However, the existing optimizer may not optimize your function as well as the hardcoded Contains is optimized. Some points for optimization:
comparing against a property each time can be slower than counting downwards and comparing against 0
the function call itself has a performance penalty
using iterators instead of explicit indexing can be faster (a foreach loop instead of a plain for)
First, if you know the element type ahead of time, I'd suggest using generics: List<string> instead of ArrayList. Under the hood, ArrayList.Contains actually does a bit more than what you are doing. The following is from Reflector:
public virtual bool Contains(object item)
{
    if (item == null)
    {
        for (int j = 0; j < this._size; j++)
        {
            if (this._items[j] == null)
            {
                return true;
            }
        }
        return false;
    }
    for (int i = 0; i < this._size; i++)
    {
        if ((this._items[i] != null) && this._items[i].Equals(item))
        {
            return true;
        }
    }
    return false;
}
Notice that it forks on being passed a null value for item. However, since all the values in your example are non-null, the additional null checks at the beginning and inside the second loop should in theory make it take longer, not shorter.
Are you positive you are dealing with fully compiled code? That is, when your code runs for the first time it gets JIT-compiled, whereas the framework is obviously already compiled.
After your Edit, I copied the code and made a few improvements to it.
The difference was not reproducible; it turns out to be a measuring/rounding issue.
To see that, change your runs to this form:
sw.Reset();
sw.Start();
for (int i = 0; i < nr; i++)
{
    DoesContain(list, "zzz");
}
total += sw.ElapsedMilliseconds;
Console.WriteLine(total / nr);
I just moved some lines. The JIT issue was insignificant with this number of repetitions.
My guess would be that ArrayList is written in C++ and could be taking advantage of some micro-optimizations (note: this is a guess).
For instance, in C++ you can use pointer arithmetic (specifically incrementing a pointer to iterate an array) to be faster than using an index.
Using an array structure, you can't search faster than O(n) without any additional information.
If you know that the array is sorted, you can use a binary search and spend only O(log n).
Otherwise you should use a set.
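For the sorted case, Array.BinarySearch does this (a sketch; it returns a non-negative index when the value is found):

```csharp
using System;

public class BinarySearchDemo
{
    public static void Main()
    {
        int[] sorted = new int[1000000];
        for (int i = 0; i < sorted.Length; i++)
            sorted[i] = i * 2; // sorted by construction

        // O(log n) instead of scanning the whole array.
        int index = Array.BinarySearch(sorted, 500000);
        Console.WriteLine(index >= 0 ? "found at " + index : "not found");
    }
}
```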
Revised after reading comments:
ArrayList does not use any hashing algorithm to enable fast lookup.
Use SortedList<TKey, TValue>, Dictionary<TKey, TValue> or System.Collections.ObjectModel.KeyedCollection<TKey, TValue> for fast access based on a key.
var list = new List<myObject>(); // Search is sequential
var dictionary = new Dictionary<myObject, myObject>(); // key based lookup, but no sequential lookup, Contains fast
var sortedList = new SortedList<myObject, myObject>(); // key based and sequential lookup, Contains fast
KeyedCollection<TKey, TValue> is also fast and allows indexed lookup; however, it needs to be inherited since it is abstract, so you normally need a specific collection per type. With the following, though, you can create a generic KeyedCollection:
public class GenericKeyedCollection<TKey, TValue> : KeyedCollection<TKey, TValue>
{
    private Func<TValue, TKey> keyExtractor;

    public GenericKeyedCollection(Func<TValue, TKey> keyExtractor)
    {
        this.keyExtractor = keyExtractor;
    }

    protected override TKey GetKeyForItem(TValue value)
    {
        return this.keyExtractor(value);
    }
}
The advantage of using the KeyedCollection is that the Add method does not require that a key is specified.
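Usage might look like this (a sketch assuming the GenericKeyedCollection class above is in scope; the Person type is a hypothetical example of mine):

```csharp
using System;

// Hypothetical item type, used only for illustration.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public class KeyedCollectionDemo
{
    public static void Main()
    {
        // The key is extracted from the item itself, so Add needs no separate key.
        var people = new GenericKeyedCollection<string, Person>(p => p.Name);
        people.Add(new Person { Name = "Ann", Age = 30 });
        people.Add(new Person { Name = "Bob", Age = 25 });

        Console.WriteLine(people["Ann"].Age); // keyed lookup
        Console.WriteLine(people[1].Name);    // indexed lookup (Collection<T> indexer)
    }
}
```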