Is there a faster way of doing this using C#?
double[,] myArray = new double[length1, length2];
for (int i = 0; i < length1; i++)
    for (int j = 0; j < length2; j++)
        myArray[i, j] = double.PositiveInfinity;
I remember that, back when I used C++, there was memset() for this kind of thing...
A multi-dimensional array is just a large block of memory, so we can treat it like one, similar to how memset() works. This requires unsafe code. I wouldn't say it's worth doing unless it's really performance critical. This is a fun exercise, though, so here are some benchmarks using BenchmarkDotNet:
public class ArrayFillBenchmark
{
const int length1 = 1000;
const int length2 = 1000;
readonly double[,] _myArray = new double[length1, length2];
[Benchmark]
public void MultidimensionalArrayLoop()
{
for (int i = 0; i < length1; i++)
for (int j = 0; j < length2; j++)
_myArray[i, j] = double.PositiveInfinity;
}
[Benchmark]
public unsafe void MultidimensionalArrayNaiveUnsafeLoop()
{
fixed (double* a = &_myArray[0, 0])
{
double* b = a;
for (int i = 0; i < length1; i++)
for (int j = 0; j < length2; j++)
*b++ = double.PositiveInfinity;
}
}
[Benchmark]
public unsafe void MultidimensionalSpanFill()
{
fixed (double* a = &_myArray[0, 0])
{
double* b = a;
var span = new Span<double>(b, length1 * length2);
span.Fill(double.PositiveInfinity);
}
}
[Benchmark]
public unsafe void MultidimensionalSseFill()
{
var vectorPositiveInfinity = Vector128.Create(double.PositiveInfinity);
fixed (double* a = &_myArray[0, 0])
{
double* b = a;
ulong i = 0;
int size = Vector128<double>.Count;
ulong length = length1 * length2;
for (; i < (length & ~(ulong)15); i += 16)
{
Sse2.Store(b+size*0, vectorPositiveInfinity);
Sse2.Store(b+size*1, vectorPositiveInfinity);
Sse2.Store(b+size*2, vectorPositiveInfinity);
Sse2.Store(b+size*3, vectorPositiveInfinity);
Sse2.Store(b+size*4, vectorPositiveInfinity);
Sse2.Store(b+size*5, vectorPositiveInfinity);
Sse2.Store(b+size*6, vectorPositiveInfinity);
Sse2.Store(b+size*7, vectorPositiveInfinity);
b += size*8;
}
for (; i < (length & ~(ulong)7); i += 8)
{
Sse2.Store(b+size*0, vectorPositiveInfinity);
Sse2.Store(b+size*1, vectorPositiveInfinity);
Sse2.Store(b+size*2, vectorPositiveInfinity);
Sse2.Store(b+size*3, vectorPositiveInfinity);
b += size*4;
}
for (; i < (length & ~(ulong)3); i += 4)
{
Sse2.Store(b+size*0, vectorPositiveInfinity);
Sse2.Store(b+size*1, vectorPositiveInfinity);
b += size*2;
}
for (; i < length; i++)
{
*b++ = double.PositiveInfinity;
}
}
}
}
Results:
| Method | Mean | Error | StdDev | Ratio |
|------------------------------------- |-----------:|----------:|----------:|------:|
| MultidimensionalArrayLoop | 1,083.1 us | 11.797 us | 11.035 us | 1.00 |
| MultidimensionalArrayNaiveUnsafeLoop | 436.2 us | 8.567 us | 8.414 us | 0.40 |
| MultidimensionalSpanFill | 321.2 us | 6.404 us | 10.875 us | 0.30 |
| MultidimensionalSseFill | 231.9 us | 4.616 us | 11.323 us | 0.22 |
MultidimensionalArrayLoop is slow because of bounds checking. The JIT emits code on every iteration that checks that [i, j] is inside the bounds of the array. The JIT can sometimes elide bounds checking; I know it does so for single-dimensional arrays, but I'm not sure it does for multi-dimensional ones.
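For reference, the single-dimensional case where the JIT does elide the check is a loop bounded directly by the array's Length; a minimal sketch:

```csharp
using System;

double[] flat = new double[1000];

// Because the loop condition compares i directly against flat.Length,
// the JIT can prove i is always in range and drop the per-access
// bounds check for this single-dimensional array.
for (int i = 0; i < flat.Length; i++)
    flat[i] = double.PositiveInfinity;

Console.WriteLine(double.IsPositiveInfinity(flat[999])); // True
```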
MultidimensionalArrayNaiveUnsafeLoop is essentially the same code as MultidimensionalArrayLoop, but without bounds checking. It's considerably faster, taking 40% of the time. It's 'naive', though, because the loop could still be improved by unrolling.
MultidimensionalSpanFill also has no bounds checks and is more or less the same as MultidimensionalArrayNaiveUnsafeLoop; however, Span<T>.Fill internally unrolls its loop, which is why it's a bit faster than our naive unsafe loop. It takes only 30% of the time of the original.
MultidimensionalSseFill improves on our first unsafe loop by doing two things: loop unrolling and vectorization. This requires a CPU with Sse2 support, but it lets us write 128 bits (16 bytes) in a single instruction. This gives us an additional speed boost, taking it down to 22% of the original. Interestingly, the same loop with Avx (256 bits) was consistently slower than the Sse2 version, so that benchmark is not included here.
But these numbers only apply to an array that is 1000x1000. As you change the size of the array, the results differ. For example, when we change the array size to 10000x10000, the results for all of the unsafe benchmarks are very close. This is probably because the larger array is dominated by memory fetches, which tends to equalize the smaller per-iteration improvements seen in the last three benchmarks.
There's a lesson in there somewhere, but I mostly just wanted to share these results, since it was a pretty fun experiment to do.
Here is a method that is not faster, but works with arrays of any rank, not just 2D.
public static class ArrayExtensions
{
    public static void Fill(this Array array, object value)
    {
        var indices = new int[array.Rank];
        Fill(array, 0, indices, value);
    }
    public static void Fill(Array array, int dimension, int[] indices, object value)
    {
        if (dimension < array.Rank)
        {
            for (int i = array.GetLowerBound(dimension); i <= array.GetUpperBound(dimension); i++)
            {
                indices[dimension] = i;
                Fill(array, dimension + 1, indices, value);
            }
        }
        else
        {
            array.SetValue(value, indices);
        }
    }
}
double[,] myArray = new double[x, y];
if( parallel == true )
{
stopWatch.Start();
System.Threading.Tasks.Parallel.For( 0, x, i =>
{
for( int j = 0; j < y; ++j )
myArray[i, j] = double.PositiveInfinity;
});
stopWatch.Stop();
Print( "Elapsed milliseconds: {0}", stopWatch.ElapsedMilliseconds );
}
else
{
stopWatch.Start();
for( int i = 0; i < x; ++i )
for( int j = 0; j < y; ++j )
myArray[i, j] = double.PositiveInfinity;
stopWatch.Stop();
Print("Elapsed milliseconds: {0}", stopWatch.ElapsedMilliseconds);
}
When setting x and y to 10000 I get 553 milliseconds for the single-threaded approach and 170 for the multi-threaded one.
It is also possible to fill a multidimensional array quickly without using the unsafe keyword (see the other answers to this question).
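One such approach, a sketch assuming .NET Core 2.1+ where MemoryMarshal.CreateSpan is available, is to create a span over the array's contiguous storage:

```csharp
using System;
using System.Runtime.InteropServices;

double[,] grid = new double[1000, 1000];

// CLR multidimensional arrays are one contiguous block, so a span
// over the first element covering grid.Length elements spans it all.
// No unsafe keyword needed, though this still bypasses the 2D bounds.
Span<double> span = MemoryMarshal.CreateSpan(ref grid[0, 0], grid.Length);
span.Fill(double.PositiveInfinity);

Console.WriteLine(double.IsPositiveInfinity(grid[999, 999])); // True
```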
I've run into something strange: when using AsSpan().Fill, it's twice as fast on a byte[] array as on an int[] or float[] array, even though they are all the same size in bytes. BUT it depends on the size of the arrays: on small arrays the speed is the same, but on larger ones the difference shows.
Here is a sample console application to illustrate:
internal unsafe class Program {
static byte[]? ByteFrame;
static Int32[]? Int32Frame;
static float[]? FloatFrame;
static int[]? ResetCacheArray;
static void Main(string[] args) {
// size vars
int Width = 1500;
int Height = 1500;
// Init frames
ByteFrame = new byte[Width * Height * 4];
ByteFrame.AsSpan().Fill(0);
Int32Frame = new Int32[Width * Height];
Int32Frame.AsSpan().Fill(0);
FloatFrame = new float[Width * Height];
FloatFrame.AsSpan().Fill(1);
ResetCacheArray = new int[10000 * 10000];
ResetCacheArray.AsSpan().Fill(1);
// warmup jitter
for(int i = 0; i < 200; i++) {
ClearByteFrameAsSpanFill(0);
ClearInt32FrameAsSpanFill(0);
ClearFloatFrameAsSpanFill(0f);
ClearCache();
}
Console.WriteLine(Environment.Is64BitProcess);
int TestIterations;
double nanoseconds;
double MsDuration;
double MB = 0;
double MBSec;
double GBSec;
TestIterations = 1;
nanoseconds = 1_000_000_000.0 * Stopwatch.GetTimestamp() / Stopwatch.Frequency;
for (int i = 0; i < TestIterations; i++) {
MB = ClearByteFrameAsSpanFill(0);
}
MsDuration = (((1_000_000_000.0 * Stopwatch.GetTimestamp() / Stopwatch.Frequency) - nanoseconds) / TestIterations) / 1000000;
MBSec = (MB / MsDuration) * 1000;
GBSec = MBSec / 1000;
Console.WriteLine("ClearByteFrameAsSpanFill: MS:" + MsDuration + " GB/s:" + (int)GBSec + " MB/s:" + (int)MBSec);
ClearCache();
TestIterations = 1;
nanoseconds = 1_000_000_000.0 * Stopwatch.GetTimestamp() / Stopwatch.Frequency;
for (int i = 0; i < TestIterations; i++) {
MB = ClearInt32FrameAsSpanFill(1);
}
MsDuration = (((1_000_000_000.0 * Stopwatch.GetTimestamp() / Stopwatch.Frequency) - nanoseconds) / TestIterations) / 1000000;
MBSec = (MB / MsDuration) * 1000;
GBSec = MBSec / 1000;
Console.WriteLine("ClearInt32FrameAsSpanFill: MS:" + MsDuration + " GB/s:" + (int)GBSec + " MB/s:" + (int)MBSec);
ClearCache();
TestIterations = 1;
nanoseconds = 1_000_000_000.0 * Stopwatch.GetTimestamp() / Stopwatch.Frequency;
for (int i = 0; i < TestIterations; i++) {
MB = ClearFloatFrameAsSpanFill(1f);
}
MsDuration = (((1_000_000_000.0 * Stopwatch.GetTimestamp() / Stopwatch.Frequency) - nanoseconds) / TestIterations) / 1000000;
MBSec = (MB / MsDuration) * 1000;
GBSec = MBSec / 1000;
Console.WriteLine("ClearFloatFrameAsSpanFill: MS:" + MsDuration + " GB/s:" + (int)GBSec + " MB/s:" + (int)MBSec);
ClearCache();
Console.ReadLine();
}
static double ClearByteFrameAsSpanFill(byte clearValue) {
ByteFrame.AsSpan().Fill(clearValue);
return ByteFrame.Length / 1000000;
}
static double ClearInt32FrameAsSpanFill(Int32 clearValue) {
Int32Frame.AsSpan().Fill(clearValue);
return (Int32Frame.Length * 4) / 1000000;
}
static double ClearFloatFrameAsSpanFill(float clearValue) {
FloatFrame.AsSpan().Fill(clearValue);
return (FloatFrame.Length * 4) / 1000000;
}
static void ClearCache() {
int sum = 0;
for (int i = 0; i < ResetCacheArray.Length; i++) {
sum += ResetCacheArray[i];
}
}
}
On my machine it outputs the following:
ClearByteFrameAsSpanFill: MS:0,4913 GB/s:18 MB/s:18318
ClearInt32FrameAsSpanFill: MS:0,4851 GB/s:18 MB/s:18552
ClearFloatFrameAsSpanFill: MS:0,458 GB/s:19 MB/s:19650
It varies a little from run to run, ± a few GB/s, but each operation takes roughly the same amount of time.
Now when I change the size variables to Width = 4500, Height = 4500, it outputs the following:
ClearByteFrameAsSpanFill: MS:3,4015 GB/s:23 MB/s:23813
ClearInt32FrameAsSpanFill: MS:7,635 GB/s:10 MB/s:10609
ClearFloatFrameAsSpanFill: MS:7,4429 GB/s:10 MB/s:10882
This will obviously vary with RAM speed from machine to machine, but on mine, at least: on "small" arrays the speed is the same, but on large arrays filling a byte array is twice as fast as filling an int or float array of the same byte length.
Does anyone have an explanation of this?
You are testing filling the byte array with 0 and filling the int array with 1:
ClearByteFrameAsSpanFill(0);
ClearInt32FrameAsSpanFill(1);
These cases have different optimisations.
If you fill an array of bytes with any value it will be around the same speed, because there's a processor instruction to fill a block of bytes with a specific byte value.
Although there may be processor instructions to fill an array of int or float values with non-zero values, they are likely to be slower than filling the block of memory with zero values.
I tried this out with the following code using BenchmarkDotNet:
[SimpleJob(RuntimeMoniker.Net60)]
public class UnderTest
{
[Benchmark]
public void FillBytesWithZero()
{
_bytes.AsSpan().Fill(0);
}
[Benchmark]
public void FillBytesWithOne()
{
_bytes.AsSpan().Fill(1);
}
[Benchmark]
public void FillIntsWithZero()
{
_ints.AsSpan().Fill(0);
}
[Benchmark]
public void FillIntsWithOne()
{
_ints.AsSpan().Fill(1);
}
const int COUNT = 1500 * 1500;
static readonly byte[] _bytes = new byte[COUNT * sizeof(int)];
static readonly int[] _ints = new int[COUNT];
}
With the following results:
For COUNT = 1500 * 1500:
| Method | Mean | Error | StdDev | Median |
|------------------ |---------:|---------:|---------:|---------:|
| FillBytesWithZero | 299.7 us | 7.82 us | 22.95 us | 299.3 us |
| FillBytesWithOne | 305.6 us | 11.46 us | 33.80 us | 293.3 us |
| FillIntsWithZero | 322.4 us | 2.37 us | 2.10 us | 321.6 us |
| FillIntsWithOne | 502.9 us | 27.68 us | 81.60 us | 534.4 us |
For COUNT = 4500 * 4500:
| Method | Mean | Error | StdDev |
|------------------ |---------:|----------:|----------:|
| FillBytesWithZero | 2.554 ms | 0.0307 ms | 0.0240 ms |
| FillBytesWithOne | 2.632 ms | 0.0522 ms | 0.1101 ms |
| FillIntsWithZero | 4.169 ms | 0.0258 ms | 0.0229 ms |
| FillIntsWithOne | 4.979 ms | 0.0488 ms | 0.0433 ms |
Note how filling a byte array with 0 or 1 is significantly faster.
If you inspect the source code for Span<T>.Fill() you'll see this:
public void Fill(T value)
{
if (Unsafe.SizeOf<T>() == 1)
{
// Special-case single-byte types like byte / sbyte / bool.
// The runtime eventually calls memset, which can efficiently support large buffers.
// We don't need to check IsReferenceOrContainsReferences because no references
// can ever be stored in types this small.
Unsafe.InitBlockUnaligned(ref Unsafe.As<T, byte>(ref _reference), Unsafe.As<T, byte>(ref value), (uint)_length);
}
else
{
// Call our optimized workhorse method for all other types.
SpanHelpers.Fill(ref _reference, (uint)_length, value);
}
}
This explains why filling a byte array is faster than filling an int array: it uses Unsafe.InitBlockUnaligned() for byte arrays and SpanHelpers.Fill() for all other element types.
Unsafe.InitBlockUnaligned() happens to be more performant; it's implemented as an intrinsic which performs the following:
ldarg .0
ldarg .1
ldarg .2
unaligned. 0x1
initblk
ret
Whereas SpanHelpers.Fill() is much less optimised.
It tries its best, using vectorised instructions to fill the memory if possible, but it can't compete with initblk. (It's too long to post here, but you can follow that link to look at it.)
One thing this doesn't explain is why filling an int array with zeroes is slightly faster than filling it with ones. To explain this you'd have to look at the actual processor instructions the JIT produces, but it's definitely faster to fill a block of bytes with all 0s than with the repeating pattern 1,0,0,0 (which is what an int value of 1 requires).
It's probably down to the comparative speeds of instructions like rep stosb (for bytes) and rep stosd (for dwords).
The outlier in these results is that the unaligned.1 initblk opcode sequence is about 50% faster for the smaller block size. The other times all scale up by approximately the increase in size of the memory block, i.e. around 9 times slower for the blocks that are 9 times bigger.
So the remaining question is: Why is initblk 50% faster per-byte for smaller buffer sizes (2_250_000 versus 20_250_000 bytes)?
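A practical takeaway from the above: when the fill value's bytes are all identical (e.g. 0, or -1 whose bytes are all 0xFF), you can reinterpret an int span as bytes with MemoryMarshal.AsBytes so that Fill takes the single-byte memset-style path; a sketch:

```csharp
using System;
using System.Runtime.InteropServices;

int[] ints = new int[1500 * 1500];

// Viewing the ints as bytes makes Span<byte>.Fill hit the
// initblk/memset fast path. Only valid when every byte of the
// target value is the same, e.g. 0x00 (zero) or 0xFF (-1 as int).
MemoryMarshal.AsBytes(ints.AsSpan()).Fill(0xFF);

Console.WriteLine(ints[0]); // -1
```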
I am trying to figure out whether this really is the fastest approach. I want this to be as fast as possible, cache friendly, and to have good time complexity.
DEMO: https://dotnetfiddle.net/BUGz8s
private static void InvokeMe()
{
int hz = horizontal.GetLength(0) * horizontal.GetLength(1);
int vr = vertical.GetLength(0) * vertical.GetLength(1);
int hzcol = horizontal.GetLength(1);
int vrcol = vertical.GetLength(1);
//Determine true from Horizontal information:
for (int i = 0; i < hz; i++)
{
if(horizontal[i / hzcol, i % hzcol] == true)
System.Console.WriteLine("True, on position: {0},{1}", i / hzcol, i % hzcol);
}
//Determine true position from vertical information:
for (int i = 0; i < vr; i++)
{
if(vertical[i / vrcol, i % vrcol] == true)
System.Console.WriteLine("True, on position: {0},{1}", i / vrcol, i % vrcol);
}
}
Pages I read:
Is there a "faster" way to iterate through a two-dimensional array than using nested for loops?
Fastest way to loop through a 2d array?
Time Complexity of a nested for loop that parses a matrix
Determining the big-O runtimes of these different loops?
EDIT: The code example is now closer to what I am dealing with. It's about determining a true point (x, y) on an N*N grid. The information available is the horizontal and vertical 2D arrays.
To avoid confusion: imagine that, over time, some positions in vertical or horizontal get set to true. This currently works perfectly well. All I am asking about is the current approach of using one for loop per 2D array, as shown, instead of two nested loops per 2D array.
Time complexity for the approaches with one loop and with nested loops is the same, O(row * col) (which is O(n^2) when row == col, as in your example), so any difference in execution time comes from the constant costs of the operations, since the traversal direction is the same. You can use BenchmarkDotNet to measure those. The following benchmark:
[SimpleJob]
public class Loops
{
int[, ] matrix = new int[10, 10];
[Benchmark]
public void NestedLoops()
{
int row = matrix.GetLength(0);
int col = matrix.GetLength(1);
for (int i = 0; i < row ; i++)
for (int j = 0; j < col ; j++)
{
matrix[i, j] = i * col + j + 1;
}
}
[Benchmark]
public void SingleLoop()
{
int row = matrix.GetLength(0);
int col = matrix.GetLength(1);
var l = row * col;
for (int i = 0; i < l; i++)
{
matrix[i / col, i % col] = i + 1;
}
}
}
Gives on my machine:
| Method      |     Mean |    Error |   StdDev |   Median |
|------------ |---------:|---------:|---------:|---------:|
| NestedLoops | 144.5 ns |  2.94 ns |  4.58 ns | 144.7 ns |
| SingleLoop  | 578.2 ns | 11.37 ns | 25.42 ns | 568.6 ns |
This makes the single loop actually slower.
If you change the loop body to some "dummy" operation, for example incrementing an outer variable or updating a fixed element (say, the first) of the matrix, you will see that the performance of both loops is roughly the same.
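If you specifically want a single loop without the per-iteration / and %, you can track the two indices incrementally; a sketch (variable names mine, not from the benchmark above):

```csharp
int[,] matrix = new int[10, 10];
int row = matrix.GetLength(0);
int col = matrix.GetLength(1);

// Single loop, but (r, c) are carried along instead of being
// recomputed with i / col and i % col on every iteration.
int r = 0, c = 0;
for (int i = 0; i < row * col; i++)
{
    matrix[r, c] = i + 1;
    if (++c == col) { c = 0; r++; }
}
```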
Did you consider
for (int i = 0; i < row; i++)
{
    for (int j = 0; j < col; j++)
    {
        Console.Write(string.Format("{0:00} ", matrix[i, j]));
    }
    Console.Write(Environment.NewLine);
}
It is basically the same loop as yours, but without the / and % that the compiler may or may not optimize away.
I have implemented a very simple binarySearch implementation in C# for finding integers in an integer array:
Binary Search
static int binarySearch(int[] arr, int i)
{
int low = 0, high = arr.Length - 1, mid;
while (low <= high)
{
mid = (low + high) / 2;
if (i < arr[mid])
high = mid - 1;
else if (i > arr[mid])
low = mid + 1;
else
return mid;
}
return -1;
}
When comparing it to C#'s native Array.BinarySearch() I can see that Array.BinarySearch() is more than twice as fast as my function, every single time.
MSDN on Array.BinarySearch:
Searches an entire one-dimensional sorted array for a specific element, using the IComparable generic interface implemented by each element of the Array and by the specified object.
What makes this approach so fast?
Test code
using System;
using System.Diagnostics;
class Program
{
static void Main()
{
Random rnd = new Random();
Stopwatch sw = new Stopwatch();
const int ELEMENTS = 10000000;
int temp;
int[] arr = new int[ELEMENTS];
for (int i = 0; i < ELEMENTS; i++)
arr[i] = rnd.Next(int.MinValue,int.MaxValue);
Array.Sort(arr);
// Custom binarySearch
sw.Restart();
for (int i = 0; i < ELEMENTS; i++)
temp = binarySearch(arr, i);
sw.Stop();
Console.WriteLine($"Elapsed time for custom binarySearch: {sw.ElapsedMilliseconds}ms");
// C# Array.BinarySearch
sw.Restart();
for (int i = 0; i < ELEMENTS; i++)
temp = Array.BinarySearch(arr,i);
sw.Stop();
Console.WriteLine($"Elapsed time for C# BinarySearch: {sw.ElapsedMilliseconds}ms");
}
static int binarySearch(int[] arr, int i)
{
int low = 0, high = arr.Length - 1, mid;
while (low <= high)
{
mid = (low+high) / 2;
if (i < arr[mid])
high = mid - 1;
else if (i > arr[mid])
low = mid + 1;
else
return mid;
}
return -1;
}
}
Test results
+------------+--------------+--------------------+
| Attempt No | binarySearch | Array.BinarySearch |
+------------+--------------+--------------------+
| 1 | 2700ms | 1099ms |
| 2 | 2696ms | 1083ms |
| 3 | 2675ms | 1077ms |
| 4 | 2690ms | 1093ms |
| 5 | 2700ms | 1086ms |
+------------+--------------+--------------------+
Your code is faster when run outside Visual Studio:
Yours vs Array's:
From VS - Debug mode: 3248 vs 1113
From VS - Release mode: 2932 vs 1100
Running exe - Debug mode: 3152 vs 1104
Running exe - Release mode: 559 vs 1104
Array's code might be already optimized in the framework but also does a lot more checking than your version (for instance, your version may overflow if arr.Length is greater than int.MaxValue / 2) and, as already said, is designed for a wide range of types, not just int[].
So, basically, yours is slower only when you are debugging your code, because the Array code always runs as optimized release code, with fewer checks behind the scenes.
I'm not quite sure how to ask this question, but I have two ways (so far) of building a lookup array.
Option 1 is:
bool[][][] myJaggedArray;
myJaggedArray = new bool[120][][];
for (int i = 0; i < 120; ++i)
{
    if ((i & 0x88) == 0)
    {
        // only 64 will be set
        myJaggedArray[i] = new bool[120][];
        for (int j = 0; j < 120; ++j)
        {
            if ((j & 0x88) == 0)
            {
                // only 64 will be set
                myJaggedArray[i][j] = new bool[60];
            }
        }
    }
}
Option 2 is:
bool[] myArray;
// [998520]
myArray = new bool[(120 | (120 << 7) | (60 << 14))];
Both ways work nicely, but is there another (better) way of doing a fast lookup, and which one would you take if speed/performance is what matters?
This would be used in a chessboard implementation (0x88), and access is mostly
[from][to][dataX] for option 1
[(from | (to << 7) | (dataX << 14))] for option 2
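For option 2, it can help to wrap the bit packing in small helpers so the encoding lives in one place; a sketch (names are illustrative, not from the question):

```csharp
using System;

// 7 bits each for from/to (0..127 covers 0x88 board indices),
// the remaining bits for dataX.
static int Pack(int from, int to, int dataX)
    => from | (to << 7) | (dataX << 14);

static (int From, int To, int DataX) Unpack(int index)
    => (index & 0x7F, (index >> 7) & 0x7F, index >> 14);

var (f, t, d) = Unpack(Pack(5, 10, 3));
Console.WriteLine($"{f} {t} {d}"); // 5 10 3
```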
I would suggest using one large array, because of the advantages of having one large memory block, but I would also encourage writing a special accessor for that array.
class MyCustomDataStore
{
bool[] array;
int sizex, sizey, sizez;
MyCustomDataStore(int x, int y, int z) {
array=new bool[x*y*z];
this.sizex = x;
this.sizey = y;
this.sizez = z;
}
    bool get(int px, int py, int pz) {
        // row-major index: px strides over sizey*sizez, py over sizez;
        // change the order to match the way you iterate
        return array[px * sizey * sizez + py * sizez + pz];
    }
}
I just updated dariusz's solution with an array of longs, for z-size <= 64.
edit2: updated to '<<' version, size fixed to 128x128x64
class MyCustomDataStore
{
long[] array;
MyCustomDataStore()
{
array = new long[128 | 128 << 7];
}
bool get(int px, int py, int pz)
{
    // shift a long (1L): pz can be up to 63, so an int shift would wrap
    return (array[px | (py << 7)] & (1L << pz)) != 0;
}
void set(int px, int py, int pz, bool val)
{
    long mask = 1L << pz;
    int index = px | (py << 7);
    if (val)
    {
        array[index] |= mask;
    }
    else
    {
        array[index] &= ~mask;
    }
}
}
edit: performance test:
filled and read a 128x128x64 array, 100 times
long: 9885ms, 132096B
bool: 9740ms, 1065088B
This is basically a restatement of this question: Java: Multi-dimensional array vs. One-dimensional but for C#.
I have a set number of elements that make sense to store as a grid.
Should I use a array[x*y] or a array[x][y]?
EDIT: Oh, so there are one dimensional array[x*y], multidimensional array[x,y] and jagged array[x][y], and I probably want jagged?
There are many advantages in C# to using jagged arrays (array[][]). They actually will often outperform multidimensional arrays.
That being said, I would personally use a multidimensional or jagged array instead of a single dimensional array, as this matches the problem space more closely. Using a one dimensional array is adding complexity to your implementation that does not provide real benefits, especially when compared to a 2D array, as internally, it's still a single block of memory.
I ran a test on unreasonably large arrays and was surprised to see that jagged arrays ([y][x]) appear to be faster than the single-dimensional array with manual multiplication [y * ySize + x]. Multidimensional arrays [,] are slower, but not by much.
Of course you would have to test on your particular arrays, but it seems the difference isn't large, so you should just use whichever approach fits what you are doing best.
0.280 (100.0% | 0.0%) 'Jagged array 5,059x5,059 - 25,593,481'
| 0.006 (2.1% | 2.1%) 'Allocate'
| 0.274 (97.9% | 97.9%) 'Access'
0.336 (100.0% | 0.0%) 'TwoDim array 5,059x5,059 - 25,593,481'
| 0.000 (0.0% | 0.0%) 'Allocate'
| 0.336 (99.9% | 99.9%) 'Access'
0.286 (100.0% | 0.0%) 'SingleDim array 5,059x5,059 - 25,593,481'
| 0.000 (0.1% | 0.1%) 'Allocate'
| 0.286 (99.9% | 99.9%) 'Access'
0.552 (100.0% | 0.0%) 'Jagged array 7,155x7,155 - 51,194,025'
| 0.009 (1.6% | 1.6%) 'Allocate'
| 0.543 (98.4% | 98.4%) 'Access'
0.676 (100.0% | 0.0%) 'TwoDim array 7,155x7,155 - 51,194,025'
| 0.000 (0.0% | 0.0%) 'Allocate'
| 0.676 (100.0% | 100.0%) 'Access'
0.571 (100.0% | 0.0%) 'SingleDim array 7,155x7,155 - 51,194,025'
| 0.000 (0.1% | 0.1%) 'Allocate'
| 0.571 (99.9% | 99.9%) 'Access'
for (int i = 6400000; i < 100000000; i *= 2)
{
int size = (int)Math.Sqrt(i);
int totalSize = size * size;
GC.Collect();
ProfileTimer.Push(string.Format("Jagged array {0:N0}x{0:N0} - {1:N0}", size, totalSize));
ProfileTimer.Push("Allocate");
double[][] Jagged = new double[size][];
for (int x = 0; x < size; x++)
{
Jagged[x] = new double[size];
}
ProfileTimer.PopPush("Allocate", "Access");
double total = 0;
for (int trials = 0; trials < 10; trials++)
{
for (int y = 0; y < size; y++)
{
for (int x = 0; x < size; x++)
{
total += Jagged[y][x];
}
}
}
ProfileTimer.Pop("Access");
ProfileTimer.Pop("Jagged array");
GC.Collect();
ProfileTimer.Push(string.Format("TwoDim array {0:N0}x{0:N0} - {1:N0}", size, totalSize));
ProfileTimer.Push("Allocate");
double[,] TwoDim = new double[size,size];
ProfileTimer.PopPush("Allocate", "Access");
total = 0;
for (int trials = 0; trials < 10; trials++)
{
for (int y = 0; y < size; y++)
{
for (int x = 0; x < size; x++)
{
total += TwoDim[y, x];
}
}
}
ProfileTimer.Pop("Access");
ProfileTimer.Pop("TwoDim array");
GC.Collect();
ProfileTimer.Push(string.Format("SingleDim array {0:N0}x{0:N0} - {1:N0}", size, totalSize));
ProfileTimer.Push("Allocate");
double[] Single = new double[size * size];
ProfileTimer.PopPush("Allocate", "Access");
total = 0;
for (int trials = 0; trials < 10; trials++)
{
for (int y = 0; y < size; y++)
{
int yOffset = y * size;
for (int x = 0; x < size; x++)
{
total += Single[yOffset + x];
}
}
}
ProfileTimer.Pop("Access");
ProfileTimer.Pop("SingleDim array");
}
Pros of array[x,y]:
- The runtime performs more checks for you. Each index access is checked to be within the allowed range of its dimension. With the other approach you could easily do something like a[y*numOfColumns + x] where x is greater than the "number of columns", and that code would silently read a wrong value instead of throwing an exception.
- Clearer index access. a[x,y] is cleaner than a[y*numOfColumns + x].
Pros of array[x*y]:
- Easier iteration over the entire array. You need only one loop instead of two.
And the winner is... I would prefer array[x,y].
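You can also get both sets of advantages by wrapping the flat array in a type with a 2D indexer; a minimal sketch (type and member names are mine):

```csharp
using System;

var g = new Grid<int>(3, 4);
g[2, 3] = 42;
Console.WriteLine(g[2, 3]); // 42

public readonly struct Grid<T>
{
    readonly T[] _data;
    readonly int _cols;

    public Grid(int rows, int cols)
    {
        _data = new T[rows * cols];
        _cols = cols;
    }

    // Clean a[x, y]-style access, flat row-major storage underneath.
    public T this[int row, int col]
    {
        get => _data[row * _cols + col];
        set => _data[row * _cols + col] = value;
    }

    // Expose the flat array for single-loop iteration.
    public T[] Flat => _data;
}
```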