According to many sources on the internet, arrays in C# are stored sequentially. That is, if I have a pointer to the first element in the array, say int *start = &array[0], then I can access array[i] by doing *(start + i).
However, I was looking through the C# Language Specification, which is stored in C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC#\Specifications\1033, and I cannot find anywhere that guarantees that this will be the case.
In practice this might not be an issue if, say, Microsoft and Mono keep their implementations in sync, but I was wondering whether there is an official source that guarantees that arrays are stored sequentially in memory.
Thanks!
From the ECMA specification for the CLR:
I.8.9.1 Array types
....
Array elements shall be laid out within the array object in row-major
order (i.e., the elements associated with the rightmost array
dimension shall be laid out contiguously from lowest to highest
index). The actual storage allocated for each array element can
include platform-specific padding. (The size of this storage, in
bytes, is returned by the sizeof instruction when it is applied to the
type of that array’s elements.)
So yes, in a compliant implementation of the ECMA-335 Common Language Infrastructure, the elements in an array will be laid out sequentially.
But there may be platform-specific padding applied, so implementations on 64-bit platforms may choose to allocate 64 bits for each Int32.
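The row-major wording in the quoted passage can be observed with a small sketch. It assumes that Buffer.BlockCopy copies the raw bytes of a multidimensional array of primitives in their in-memory order; the class and method names are made up for illustration.

```csharp
using System;

public static class RowMajorDemo
{
    // Copies the raw bytes of a 2-D int array into a 1-D array,
    // preserving the in-memory order of the elements.
    public static int[] Flatten(int[,] grid)
    {
        int[] flat = new int[grid.Length];
        Buffer.BlockCopy(grid, 0, flat, 0, grid.Length * sizeof(int));
        return flat;
    }
}
```

Calling RowMajorDemo.Flatten(new int[,] { { 1, 2, 3 }, { 4, 5, 6 } }) gives 1, 2, 3, 4, 5, 6: the rightmost dimension varies fastest.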
Yes, one-dimensional, zero-based arrays (vectors) in .NET are stored sequentially. In fact, you can use unsafe code with pointers to access array elements one-by-one by incrementing the pointer.
The ECMA-335 specification of the CLI, section I.8.9.1, says the following about arrays:
Array elements shall be laid out within the array object in row-major order (i.e., the elements associated with the rightmost array dimension shall be laid out contiguously from lowest to highest index). The actual storage allocated for each array element can include platform-specific padding.
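As a rough check of the pointer arithmetic from the question, the sketch below pins an int[] and reads each element through its raw address. It uses GCHandle and Marshal rather than an unsafe block so that no compiler switch is needed; the helper name is hypothetical, and the check assumes 4-byte int elements with no padding, which is what current runtimes do in practice.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ContiguityDemo
{
    // Returns true if reading at (address of element 0) + i * sizeof(int)
    // yields the same value as array[i] for every index i.
    public static bool RawReadsMatchIndexing(int[] array)
    {
        if (array.Length == 0) return true;

        // Pin the array so the GC cannot move it while we use its address.
        GCHandle handle = GCHandle.Alloc(array, GCHandleType.Pinned);
        try
        {
            IntPtr first = handle.AddrOfPinnedObject();
            for (int i = 0; i < array.Length; i++)
            {
                int viaPointer = Marshal.ReadInt32(first, i * sizeof(int));
                if (viaPointer != array[i]) return false;
            }
            return true;
        }
        finally
        {
            handle.Free(); // always unpin
        }
    }
}
```

This is the safe-code equivalent of *(start + i); with unsafe code enabled, a fixed statement would do the same job.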
Related
I'm looking for some documentation on the differences between an array's .Length and .LongLength properties.
Specifically, if the array's length is larger than Int32.MaxValue, will .Length throw an exception, return Int32.MaxValue, go negative, return 0?
(to clear up "possible duplicate" concerns: I'm not asking about the maximum length of an array, or the maximum size of a .NET CLR object. Assume a 64 bit system and a CLR version which supports large objects)
It is not possible to create a one-dimensional array with more than 2,147,483,591 elements (for comparison, int.MaxValue is 2,147,483,647). An OutOfMemoryException is thrown if you attempt to create an array with more elements. This means the LongLength property is still of no practical use for such arrays, and you can use the Length property instead.
I've tested it on the x64 platform using .NET 4.5. In order to create the array with 2,147,483,591 elements I've modified the configuration file and added:
<configuration>
  <runtime>
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>
Basically, I used this MSDN page to enable arrays that are greater than 2 gigabytes (GB) in total size. The real limit for arrays:
The maximum index in any single dimension is 2,147,483,591
(0x7FFFFFC7) for byte arrays and arrays of single-byte structures, and
2,146,435,071 (0X7FEFFFFF) for other types.
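To make the quoted figures concrete, here is a small sketch that spells the two limits out as constants and compares Length with LongLength on a one-dimensional array. The class and member names are illustrative only; the constants are just the hex values from the quote.

```csharp
using System;

public static class ArrayLengthDemo
{
    // Limits quoted from the documentation above.
    public const int MaxByteArrayIndex = 0x7FFFFFC7;   // 2,147,483,591
    public const int MaxOtherArrayIndex = 0x7FEFFFFF;  // 2,146,435,071

    // Length is an int and LongLength is a long; for a one-dimensional
    // array that fits within the limits above they report the same count.
    public static bool LengthsAgree(byte[] array)
    {
        int length = array.Length;
        long longLength = array.LongLength;
        return length == longLength;
    }
}
```

Both limits are below int.MaxValue (by 56 and by 1,048,576 respectively), which is why Length never overflows for a one-dimensional array.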
In my application I do System.Array.Resize once per frame. Initially I set my arrays to a maximum possible size, and then Resize them to something smaller. In some cases it may be a lot smaller, in others it may be just a little smaller. It appears to me though that the more elements there are to resize, the longer it takes. Perhaps my observations are wrong, and that is why I am asking here.
Yes, it should. Resizing involves allocating new memory of the size you want and copying the old array into the new one. The larger the array, the more there is to copy.
From MSDN:
This method allocates a new array with the specified size, copies
elements from the old array to the new one, and then replaces the old
array with the new one.
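The behavior described in the quote can be seen by keeping a second reference to the original array. This is only a sketch with a made-up helper name.

```csharp
using System;

public static class ResizeDemo
{
    // Resizes through a second variable and reports whether Array.Resize
    // produced a different array object than the one passed in.
    public static bool ResizeAllocatesNewArray(int[] original, int newSize)
    {
        int[] resized = original;
        Array.Resize(ref resized, newSize);
        // When newSize equals the current length, Array.Resize does nothing,
        // so the two references stay identical in that case.
        return !ReferenceEquals(original, resized);
    }
}
```

The original array is left untouched; only the variable passed by ref is pointed at the new, smaller copy.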
Without knowing too much about the code, I would suggest using a List<T> to manage the items and the resizing. When you need to hand the data to Unity, call list.ToArray().
This will still create the array and copy it, but only once per frame.
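A minimal sketch of that suggestion, assuming the per-frame data can be rebuilt into a list each frame. FrameBuffer, BuildFrame and the placeholder data are hypothetical names, not part of Unity.

```csharp
using System;
using System.Collections.Generic;

public sealed class FrameBuffer
{
    // The list keeps its capacity between frames, so Clear() followed by
    // Add() does not reallocate once the list has grown large enough.
    private readonly List<int> items = new List<int>(1024);

    public int[] BuildFrame(int count)
    {
        items.Clear();
        for (int i = 0; i < count; i++)
        {
            items.Add(i * i); // placeholder for the real per-frame data
        }
        // One allocation and one copy per frame, sized exactly to Count.
        return items.ToArray();
    }
}
```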
As other answers note, "resizing" an array requires copying all the elements, which is an O(N) operation when N gets large. Note that there are a number of approaches that can be used for copying arrays, with differing "setup" and "per-item" costs. A small array-copy operation may be processed 4 bytes at a time (or in some cases, one byte at a time), while a larger array operation would use special 16-byte operations to do most of the copying. These operations are limited to writing aligned 16-byte chunks of memory at a time. Depending upon source and destination alignment, a large array operation might require copying four groups of four bytes (the last byte of which will overlap the next group), many groups of 16 bytes, and four more groups of four bytes (the first byte of which will overlap the previous group). Determining how to subdivide the groups is a little tricky, so for smaller block-copy requests it's more efficient to use one- or four-byte operations.
Note that the real key to minimizing the expense of array resizing is to do it as seldom as possible. Whenever the List<T> type has to expand the size of its array, it doubles it. If its array starts at 16 items, then at the time it doubles the array to 256 elements, 128 slots will be empty, 64 elements will have been copied once, 32 will have been copied twice, 16 will have been copied three times, and the original 16 will have been copied four times. Note that while some elements will end up being copied lg(N) times, the total number of element copy operations in the process of building a list of size N will always be less than 2N.
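The 2N bound can be checked with a small simulation of a doubling buffer. This is a model of the growth policy described above, not the actual List<T> source; the names are invented for the example.

```csharp
using System;

public static class GrowthDemo
{
    // Simulates appending itemCount elements to a buffer that starts at
    // initialCapacity and doubles whenever it is full. Returns the total
    // number of element copies caused by the reallocations.
    public static long CountCopies(int itemCount, int initialCapacity)
    {
        if (initialCapacity < 1) throw new ArgumentOutOfRangeException(nameof(initialCapacity));

        long copies = 0;
        int capacity = initialCapacity;
        for (int count = 0; count < itemCount; count++)
        {
            if (count == capacity)
            {
                copies += count; // every existing element moves to the new array
                capacity *= 2;
            }
        }
        return copies;
    }
}
```

For example, GrowthDemo.CountCopies(129, 16) returns 240 (16 + 32 + 64 + 128), which is below 2 * 129 = 258.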
There's no way to access the backing array of a List<T> as an array, but it's fairly easy to re-implement the class in such a way as to expose the array, and make sure any methods that accept an array as a parameter allow one to specify the length of the portion to be used (instead of just accessing the Length property of the array).
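A rough sketch of such a re-implementation follows. ExposedList<T>, BackingArray and the Sum consumer are hypothetical names, and the class leaves out most of what List<T> offers (removal, enumeration, bounds checks).

```csharp
using System;

// Hypothetical minimal list that exposes its backing array.
public sealed class ExposedList<T>
{
    private T[] items;

    public int Count { get; private set; }

    public ExposedList(int capacity = 16)
    {
        items = new T[Math.Max(capacity, 1)];
    }

    // The backing array; only the first Count elements are meaningful.
    public T[] BackingArray => items;

    public void Add(T item)
    {
        if (Count == items.Length)
        {
            Array.Resize(ref items, items.Length * 2); // double when full
        }
        items[Count++] = item;
    }

    // Resets the logical length but keeps the allocated capacity.
    public void Clear() => Count = 0;
}

public static class ExposedListConsumer
{
    // A consumer takes the array plus the length of the portion in use,
    // instead of relying on buffer.Length.
    public static long Sum(int[] buffer, int count)
    {
        long total = 0;
        for (int i = 0; i < count; i++) total += buffer[i];
        return total;
    }
}
```

Callers must re-read BackingArray after any Add, since growth replaces the array.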
Yes. Array resizing is an O(n) operation. It has to copy each element into the new array.
However, maybe it would be better not to use arrays at all. What are the arrays used for? There might be a better data structure for your application.
This is a purely theoretical question, so please do not warn me of that in your answers.
If I am not mistaken, every array in .NET is indexed by an Int32, meaning the index ranges from 0 to Int32.MaxValue.
Supposing no memory/GC constraints are involved, an array in .NET can have up to 2147483648 (and not 2147483647) elements. Right?
Well, in theory that's true. In fact, in theory there could be support for larger arrays - see this Array.CreateInstance signature which takes long values for the lengths. You wouldn't be able to index such an array using the C# indexers, but you could use GetValue(long).
However, in practical terms, I don't believe any implementation supports such huge arrays. The CLR has a per-object limit a bit short of 2GB, so even a byte array can't actually have 2147483648 elements. A bit of experimentation shows that on my box, the largest array you can create is new byte[2147483591]. (That's on the 64 bit .NET CLR; the version of Mono I've got installed chokes on that.)
EDIT: Just looking at the CLI spec, it specifies that arrays have a lower bound and upper bound of an Int32. That would mean upper bounds over Int32.MaxValue are prohibited even though they can be expressed with the Array.CreateInstance calls. However, it also means it's permissible to have an array with bounds Int32.MinValue...Int32.MaxValue, i.e. 4294967296 elements in total.
EDIT: Looking again, ECMA 335 partition III section 4.20 (newarr) specifies that initializing a vector type with newarr has to take either a native int or int32 value. So it looks like while the normally-more-lenient "array" type in CLI terminology has to have int32 bounds, a "vector" type doesn't.
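Both points from this answer can be tried on a small scale: the long-based Array.CreateInstance and GetValue(long) overloads, and arrays whose lower bound is not zero. The sketch below uses tiny sizes only; the class and method names are made up.

```csharp
using System;

public static class ArrayCreationDemo
{
    // Creates an int array through the long[] overload of
    // Array.CreateInstance, then writes and reads one element using
    // the 64-bit SetValue/GetValue overloads.
    public static int RoundTripWithLongIndex(long length, long index, int value)
    {
        Array array = Array.CreateInstance(typeof(int), new long[] { length });
        array.SetValue(value, index);
        return (int)array.GetValue(index);
    }

    // Creates a one-dimensional int array whose valid indices run from
    // lowerBound to lowerBound + length - 1.
    public static Array CreateWithLowerBound(int lowerBound, int length)
    {
        return Array.CreateInstance(typeof(int), new[] { length }, new[] { lowerBound });
    }
}
```

An array created with a non-zero lower bound is not a "vector" in CLI terms, so it cannot be cast to int[] and has to be accessed through GetValue and SetValue.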
From the MSDN:
Parameters
sourceArray
The Array that contains the data to copy.
destinationArray
The Array that receives the data.
length
A 64-bit integer that represents the number of elements to copy. The integer must be between zero and Int32.MaxValue, inclusive.
Given that the permitted range of values is 0 to Int32.MaxValue, what is the motivation for adding this signature? It did not exist in .NET 1.0 and was only added in .NET 1.1. My only guess is that it prepares for 64-bit Framework implementations.
Curiously, Array also has overloads of GetValue that take either an Int32 or an Int64. But in practice you cannot have a single object larger than 2 gigabytes in the current implementation of the .NET Framework, so you can't actually create an array that allows such large indexes.
I guess if this restriction were lifted later then it would mean that they don't need to change the interface.
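For reference, calling the 64-bit overload looks like this. It is a sketch with an invented helper name; the length still has to fit within the range the documentation states.

```csharp
using System;

public static class LongCopyDemo
{
    // Copies the first count elements of source into a new array using
    // the Array.Copy(Array, Array, long) overload.
    public static int[] CopyPrefix(int[] source, long count)
    {
        int[] destination = new int[count];
        Array.Copy(source, destination, count);
        return destination;
    }
}
```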
Suppose that we have previously instantiated three objects A, B and C of class D,
and an array is now defined as below:
D[] arr = new D[3];
arr[0]=A;
arr[1]=B;
arr[2]=C;
Does the array contain references to the objects, or does it hold separate copies?
C# distinguishes reference types and value types.
A reference type is declared using the word class. Variables of these types contain references, so an array will be an array of references to the objects. Each reference is 4 bytes (on a 32-bit system) or 8 bytes (on a 64-bit system) large.
A value type is declared using the word struct. Values of this type are copied every time you assign them. An array of a value type contains copies of the values, so the size of the array is the size of the struct times the number of elements.
Normally when we say “object”, we refer to instances of a reference type, so the answer to your question is that the array contains references, not copies. But remember the difference, and make sure that you don’t accidentally create a large array of a large struct.
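The difference is easy to observe by changing the original variable after storing it in an array. RefPoint and ValPoint are hypothetical types standing in for any class and any struct.

```csharp
using System;

public class RefPoint { public int X; }   // reference type
public struct ValPoint { public int X; }  // value type

public static class ArraySemanticsDemo
{
    // The array slot and the variable refer to the same object,
    // so the change made through the variable is visible in the array.
    public static int ClassElementAfterMutation()
    {
        var a = new RefPoint { X = 1 };
        var arr = new RefPoint[1];
        arr[0] = a;      // stores a reference
        a.X = 99;
        return arr[0].X; // 99
    }

    // The array slot holds its own copy of the struct,
    // so the later change to the variable does not affect it.
    public static int StructElementAfterMutation()
    {
        var a = new ValPoint { X = 1 };
        var arr = new ValPoint[1];
        arr[0] = a;      // stores a copy
        a.X = 99;
        return arr[0].X; // 1
    }
}
```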
An array of reference types only contains references.
In a 32 bit application references are 32 bits (4 bytes), and in a 64 bit application references are 64 bits (8 bytes). So, you can calculate the approximate size by multiplying the array length with the reference size. (There are also a few extra bytes for internal variables for the array class, and some extra bytes are used for memory management.)
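That estimate can be written down directly, assuming a reference occupies IntPtr.Size bytes in the current process. The helper name is made up, and the object header and length field are deliberately left out.

```csharp
using System;

public static class ReferenceArraySize
{
    // Approximate payload of an array of references: one pointer-sized
    // slot per element. Header and bookkeeping bytes are not included.
    public static long ApproximateBytes(int length)
    {
        return (long)IntPtr.Size * length;
    }
}
```

For a million-element array this comes to about 4 MB in a 32-bit process and about 8 MB in a 64-bit process, regardless of how large the referenced objects are.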
You can look at the memory occupied by an array using WinDBG + SOS (or PSSCOR2). IIRC, an array of reference types is represented in memory by its length, followed by references to its elements, i.e. its exact size is PLATFORM_POINTER_SIZE * (array.Length + 1).
The array is made up of pointers (32-bit or 64-bit) that point to the objects. An object is an instance of a reference type; only value types are copied into the array itself.
As @Yves said, it holds references to the objects. The array is a block of memory, as it is in C.
So its size is sizeof(element) * count, plus whatever memory the object's own overhead requires.