Fast array copy in C#

I have a C# class that contains an int[] array (and a couple of other fields, but the array is the main thing). The code often creates copies of this class and profiling shows that the Array.Copy() call to copy this array takes a lot of time. What can I do to make it faster?
The array size is very small and constant: 12 elements. So ideally I'd like something like a C-style array: a single block of memory that's inside the class itself (not a pointer). Is this possible in C#? (I can use unsafe code if needed.)
I've already tried:
1) Using a UInt64 and bit-shifting instead of the array. (The values of each element are also very small.) This does make the copy fast, but slows down the program overall.
2) Using separate fields for each array element: int element0, int element1, int element2, etc. Again, this is slower overall when I have to access the element at a given index.

I would check out System.Buffer.BlockCopy if you are really concerned about speed.
http://msdn.microsoft.com/en-us/library/system.buffer.blockcopy.aspx
Simple Example:
int[] a = new int[] {1,2,3,4,5,6,7,8};
int[] b = new int[a.Length];
int size = sizeof(int);
int length = a.Length * size;
System.Buffer.BlockCopy(a, 0, b, 0, length);
Great discussion on it over here: Array.Copy vs Buffer.BlockCopy

This post is old, but anyone in a similar situation as the OP should have a look at fixed size buffers in structs. They are exactly what OP was asking for: an array of primitive types with a constant size stored directly in the class.
You can create a struct to represent your collection, which will contain the fixed size buffer. The data will be stored directly within the struct, which will be stored directly within your class. You can copy through simple assignment.
They come with a few caveats:
They can only be used with primitive types.
They require the "unsafe" keyword on your struct.
Size must be known at compile time.
It used to be that you had to use the fixed keyword and pointers to access them, but recent changes to C# catering to performance programming have made that unnecessary. You can now work with them just like arrays.
public unsafe struct MyIntContainer
{
private fixed int myIntegers[12];
public int this[int index]
{
get => this.myIntegers[index];
set => this.myIntegers[index] = value;
}
}
There is no built-in bound checking, so it would be best for you to include that yourself on such a property, encapsulating any functionality which skips bound checks inside of a method. I am on mobile, or I would have worked that into my example.
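A fleshed-out version of that struct with the bounds check included might look like the sketch below (the Length constant and the choice of IndexOutOfRangeException are mine, not from the answer):

```csharp
using System;

public unsafe struct MyIntContainer
{
    public const int Length = 12;
    private fixed int myIntegers[Length];

    public int this[int index]
    {
        get
        {
            // (uint) cast folds the negative check into one comparison
            if ((uint)index >= Length) throw new IndexOutOfRangeException();
            return this.myIntegers[index];
        }
        set
        {
            if ((uint)index >= Length) throw new IndexOutOfRangeException();
            this.myIntegers[index] = value;
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        var a = new MyIntContainer();
        a[3] = 42;
        MyIntContainer b = a;    // copies all 12 ints by simple assignment
        Console.WriteLine(b[3]); // 42
    }
}
```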

You asked about managed arrays. If you are content to use fixed / unsafe, this can be very fast.
A struct is assignable, like any primitive, and this is almost certainly faster than Buffer.BlockCopy() or any other method due to the lack of method-call overhead:
public unsafe struct MyStruct //the actual struct used, contains all
{
public int a;
public unsafe fixed byte buffer[16];
public ulong b;
//etc.
}
public unsafe struct FixedSizeBufferWrapper //contains _only_ the buffer
{
public unsafe fixed byte buffer[16];
}
unsafe
{
fixed (byte* bufferA = myStructA.buffer, bufferB = myStructB.buffer)
{
*((FixedSizeBufferWrapper*)bufferA) =
*((FixedSizeBufferWrapper*)bufferB);
}
}
We cast the fixed-size byte buffer in each of your original structs to the wrapper pointer type and dereference each pointer so that we can assign one to the other by value; assigning fixed buffers directly is not possible, hence the wrapper. It is basically zero overhead (it only affects values used in pointer arithmetic that is done anyway) and is only ever used for casting.
We have to cast because (at least in my version of C#) the buffer cannot be declared as anything other than a primitive type (usually byte), and we aren't allowed to cast inside fixed(...).
EDIT: This appears to get translated into a call to Buffer.Memcpy() (specifically Buffer.memcpy4() in my case, in Unity / Mono) under the hood to do the copy.
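As an aside, when you want to copy the whole containing struct rather than just the buffer, plain assignment already performs the bitwise copy with no wrapper needed. A minimal sketch (the Packet struct here is invented for illustration):

```csharp
using System;

public unsafe struct Packet // hypothetical example struct
{
    public int Id;
    public fixed byte Buffer[16];
}

public static class Demo
{
    public static unsafe void Main()
    {
        var a = new Packet { Id = 1 };
        a.Buffer[5] = 99;        // locals need no fixed statement

        Packet b = a;            // bitwise copy: Id and all 16 buffer bytes

        Console.WriteLine(b.Buffer[5]); // 99
    }
}
```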

Related

Stackalloc vs. Fixed sized buffer in C#. What is the difference

As far as I understand, the following code will create an array on the stack:
unsafe struct Foo
{
public fixed int bar[10];
}
var foo = new Foo();
And stackalloc statement will do the same thing:
public unsafe void foo(int length)
{
Span<int> bar = stackalloc int[length];
}
So I'm wondering what the difference between those approaches is. And also, what is the purpose of fixed-size buffers at all? Everyone talks about a performance boost, but I can't understand why I need them when I can already create an array on the stack with stackalloc. MSDN says that fixed-size buffers are used to "interoperate" with other platforms. So what does this "interoperation" look like?
The fixed statement only says that an array is inlined (fixed inside) the struct. Meaning the data stored in the array is directly stored in your struct. In your example the Foo struct would have the size needed to store 10 integer values. As structs are value types they are allocated on the stack. However they can also be copied to the heap when for example storing them in a reference type.
class Test1
{
private Foo Foo = new();
}
unsafe struct Foo
{
public fixed int bar[10];
}
The code above will compile and the private Foo instance will live on the managed heap.
Without the fixed statement (just a "normal" int[]) the data of the array would not be stored in the struct itself but on the heap. The struct would only own a reference to that array.
When using stackalloc the data is allocated on the stack and cannot be automatically moved to the heap by the CLR. This means that stackalloc'ed data remains on the stack and the compiler will enforce this by not allowing code like this:
unsafe class Test1
{
// CS8345: Field or auto-implemented property cannot be of type 'Span<int>' unless it is an instance member of a ref struct.
private Span<int> mySpan;
public Test1()
{
// CS8353: A result of a stackalloc expression of type 'Span<int>' cannot be used in this context because it may be exposed outside of the containing method
mySpan = stackalloc int[10];
}
}
Therefore you use stackalloc when you absolutely want to make sure that your allocated data cannot escape to the heap, which would increase pressure on the garbage collector (it's a performance thing). fixed, on the other hand, is mainly used for interop scenarios with native C/C++ libraries, which may use inlined buffers for one reason or another. When calling methods from the native world that take structs with inlined buffers as parameters, you must be able to recreate those structs in .NET, or else you couldn't easily work with native code (thus the fixed statement exists). Another reason to use fixed is to inline the data in your struct, which can allow for better caching when the CPU accesses it: it can read all the data in Foo in one go, without having to dereference a reference and jump around memory to reach an array stored elsewhere.
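As a sketch of that interop scenario, a native struct with an inline array can be mirrored field-for-field in C# (the Device struct and its 16-byte name are invented for illustration):

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical native declaration being mirrored:
//   struct Device { unsigned char name[16]; int id; };
[StructLayout(LayoutKind.Sequential)]
public unsafe struct Device
{
    public fixed byte name[16]; // inline, exactly like the C array
    public int id;
}

public static class Demo
{
    public static unsafe void Main()
    {
        // The layout matches the 20 bytes the C side expects, so a
        // Device* (or ref Device) can cross P/Invoke without marshalling.
        Console.WriteLine(sizeof(Device)); // 20
    }
}
```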

Contiguous hierarchical struct memory with fixed-size arrays in C#?

I have a task which in C would be trivial but which C# seems to make (intentionally?) impossible.
In C I would pre-allocate the entire data model of my simulation, via structs set up as a single, monolithic hierarchy, including fixed-size arrays of yet more structs, maybe containing more arrays. This is nigh-doable in C#, except for one thing...
In C#, we have the fixed keyword to specify fixed-size buffers (arrays) in each struct type - cool. However, this supports only primitives as the fixed-buffer element type, throwing a major spanner in the works of having a single monolithic, hierarchical, contiguously-allocated data model that begins to ensure optimal CPU cache access.
Other approaches I can see are the following:
1) Use structs that allocate the array elsewhere through a separate new (which would seem to defeat contiguity entirely) - standard practice, but not efficient.
2) Use fixed arrays of a primitive type (say byte), but then marshal these back and forth whenever I want to change things... will that even work easily? Could be very tedious.
3) Do (1) while assuming that the platform knows to move things around for maximum contiguity.
I am using .NET 2.0 under Unity 5.6.
Please take a look at the Span&lt;T&gt; and Memory&lt;T&gt; features of C# 7.2. I think they would solve your problem.
What is the difference between Span<T> and Memory<T> in C# 7.2?
Without access to Memory&lt;T&gt;, I ended up going with option (2), but no marshalling was necessary, only casting: use a fixed array of bytes in an unsafe struct and cast to/from it as follows:
using System.Collections;
using System.Collections.Generic;
using System.Runtime.InteropServices;
using UnityEngine;
public class TestStructWithFixed : MonoBehaviour
{
public const int MAX = 5;
public const int SIZEOF_ELEMENT = 8;
public struct Element
{
public uint x;
public uint y;
//8 bytes
}
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public unsafe struct Container
{
public int id; //4 bytes
public unsafe fixed byte bytes[MAX * SIZEOF_ELEMENT];
}
public Container container;
void Start ()
{
Debug.Log("SizeOf container="+Marshal.SizeOf(container));
Debug.Log("SizeOf element ="+Marshal.SizeOf(new Element()));
unsafe
{
Element* elements;
fixed (byte* bytes = container.bytes)
{
elements = (Element*) bytes;
//show zeroed bytes first...
for (int i = 0; i < MAX; i++)
Debug.Log("i="+i+":"+elements[i].x);
//low order bytes of Element.x are at 0, 8, 16, 24, 32 respectively for the 5 Elements
bytes[0 * SIZEOF_ELEMENT] = 4;
bytes[4 * SIZEOF_ELEMENT] = 7;
elements[2].x = 99;
//show modified bytes as part of Element...
for (int i = 0; i < MAX; i++)
Debug.Log("i="+i+":"+elements[i].x); //shows 4, 99, 7 at [0], [2], [4] respectively
} //close fixed here: pointers into the buffer must not be used once it is unpinned
}
}
}
unsafe access is very fast, with no marshalling or copies - exactly what I wanted.
If you're likely to be using 4-byte ints or floats for all your struct members, you might do even better to base your fixed buffer on such a type (uint is always a clean choice) - readily debuggable.
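For instance, rebasing the buffer above on uint might look like this sketch (Container2 and its constants are my own names, reusing the Element struct from the answer):

```csharp
using System;

public struct Element { public uint x, y; } // 8 bytes = 2 uints

public unsafe struct Container2 // hypothetical name
{
    public const int MAX = 5;
    public fixed uint data[MAX * 2]; // uint-based: debugger shows whole values
}

public static class Demo
{
    public static unsafe void Main()
    {
        var c = new Container2();
        fixed (uint* p = c.data)
        {
            var elements = (Element*)p;
            elements[2].x = 99;
        }
        // Element 2 starts at uint index 4; x is its first field.
        Console.WriteLine(c.data[4]); // 99
    }
}
```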
UPDATE 2021
I've revisited this topic this year, for prototyping in Unity 5 (due to fast compile / iteration times).
It can be easier to stick with one very large byte array and use it in managed code, rather than bothering with fixed + unsafe (by the way, since C# 7.3 it is no longer necessary to pin a fixed-size buffer with the fixed keyword every time you access it).
With fixed we lose type-safety; this being a natural shortcoming of interop data - whether interop between native and managed; CPU and GPU; or between Unity main thread code and that used for the new Burst / Jobs systems. The same applies for managed byte buffers.
Thus it can be easier to accept working with untyped managed buffers and writing offsets + sizes yourself. fixed / unsafe offers (a little) more convenience, but not much, since you equally have to specify compile-time struct field offsets and change them each time the data design changes. At least with managed VLAs I can sum offsets in code, though this means they are not compile-time constants, losing some optimisations.
The only real benefit of a fixed buffer over a managed VLA (in Unity) is that with the latter there is a chance the GC will move your entire data model somewhere else mid-play, which could cause hiccups, though I've yet to see how serious this is in production.
Managed arrays are not, however, directly supported by Burst.

Converting C++ Pointer Math to C#

I'm currently working on a project that requires converting some C++ code to a C# environment. For the most part, it's actually pretty straightforward, but I'm currently converting some lower-level memory manipulation functions and running into some uncertainty as to how to proceed.
In the C++ code, I've got a lot of instances of things like this (obviously quite simplified):
void SomeFunc(unsigned char* myMemoryBlock)
{
AnotherFunc(myMemoryBlock);
AnotherFunc(myMemoryBlock + memoryOffset);
}
void AnotherFunc(unsigned char* data)
{
// Also simplified - basically, modifying the
// byte pointed to by data and then increasing to the next item.
*data = 2;
data++;
*data = 5;
data++;
// And so on...
}
I'm thinking that in C#, I've basically got to treat the "unsigned char*" as a byte array (byte[]). But to perform a similar operation to the pointer arithmetic, is that essentially just increasing a "currentIndex" value for accessing the byte array? For something like AnotherFunc, I guess that means I'd also need to pass in a starting index, if the starting index isn't 0?
Just want to confirm this is how it should be done in C#, or if there's a better way to make that sort of conversion. Also, I can't use the "unsafe" keyword in my current environment, so actually using pointers is not possible!
The two functions treat myMemoryBlock as if it represented an array. You could replace a single myMemoryBlock parameter with a pair of myArray and myOffset, like this:
void SomeFunc(char[] myArray)
{
AnotherFunc(myArray, 0);
AnotherFunc(myArray, memoryOffset);
}
void AnotherFunc(char[] data, int offset)
{
// Also simplified - basically, modifying the
// byte pointed to by data and then increasing to the next item.
data[offset++] = 2;
data[offset++] = 5;
// And so on...
}
Note: the C++ type unsigned char is often used as a stand-in for "untyped block of memory" (as opposed to "a block of memory representing character data"). If this is the case in your situation, i.e. the pointer points to memory that is not necessarily character data, an array of byte would be a more appropriate choice.
Just like @dasblinkenlight said, the C# (and Java) way to deal with arbitrary pointers to memory data blocks (which are usually byte or char arrays) is to add an extra offset parameter to the methods that access the data blocks.
It is also common to add a third length parameter. Thus the general form for a method Foo() that is passed a block of memory is:
// Operate on 'block', starting at index 'offset',
// for 'length' elements
//
int Foo(byte[] block, int offset, int length)
{ ... }
You see this all over the place in the C# library. Another form that is common for methods that operate on two memory blocks (e.g., copying one block to another, or comparing one block to another, etc.) is this:
// Operate on blocks 'src' starting at index 'srcOff',
// and on block 'dst' starting at 'dstOff',
// for a total of 'length' elements
//
int Bar(byte[] src, int srcOff, byte[] dst, int dstOff, int length)
{ ... }
For methods that expect to operate on an entire memory block (array), these generally look like this:
// Overloaded version of Foo() that
// operates on the entire array 'block'
//
int Foo(byte[] block)
{
return Foo(block, 0, block.Length);
}
C# does away with pointers precisely to prevent pointer arithmetic (or rather, the errors that pointer arithmetic is vulnerable to).
Generally, any C++ memory block referred to by a pointer and memory offset is indeed best translated as an array in C# (hence why even C# arrays start at [0]). However, you should keep the array the same type as the underlying data - char[] instead of byte[]. And because this is a char[], you should look at what the overall use of the function is and consider switching to a string.

Accommodating nested unsafe structs in C#

What is the best way to accommodate the following:
Real time, performance critical application that interfaces with a native C dll for communicating with a proprietary back end.
The native api has hundreds upon hundreds of structs, nested structs and methods that pass data back and forth via these structs.
Want to use C# for logic, so decided on unsafe C# in favor of C++/CLI and marshaling. I know how to, and have, implemented this via the latter, so please don't reply "use C++/CLI". Marshaling hundreds of structs a hundred times a second introduces a significant enough delay that it warranted investigating unsafe C#.
Most of the C structs contain dozens of fields, so I'm looking for a method that needs minimal typing for each. At this point, I've got it down to running a VS macro that converts each line to its C# equivalent, setting arrays to fixed size where necessary. This works pretty well until I hit a nested struct array. So for example, I have these 2 structs:
[StructLayout(LayoutKind.Sequential,Pack=1)]
unsafe struct User{
int id;
fixed char name[12];
}
[StructLayout(LayoutKind.Sequential,Pack=1)]
unsafe struct UserGroup{
fixed char name[12];
fixed User users[512]; // not valid C#: fixed buffers allow only primitive element types
int somethingElse;
fixed char anotherThing[16];
}
What is the best way to accommodate fixed User users[512] so that to not have to do much during run time?
I have seen examples where the suggestion is to do
[StructLayout(LayoutKind.Sequential,Pack=1)]
unsafe struct UserGroup{
fixed char name[12];
User users_1;
User users_2;
...
User users_511;
int somethingElse;
fixed char anotherThing[16];
}
Another idea has been, to compute the size of User in bytes and just do this
[StructLayout(LayoutKind.Sequential,Pack=1)]
unsafe struct UserGroup{
fixed char name[12];
fixed byte Users[28*512];
int somethingElse;
fixed char anotherThing[16];
}
But that would mean I would have to give this struct special treatment every time I need to use it, or wrap it with some other code. There are enough of these in the API that I would like to avoid this approach, but if someone can demonstrate an elegant way, that could work as well.
A third approach, which eludes me enough that I can't produce an example (I think I saw it somewhere but can't find it anymore), is to specify a size for User, or somehow make it strictly sized, so that the fixed keyword could be used on it.
Can anyone recommend a reasonable approach that they have utilized and scales well under load?
The best way I could find to nest structs inside unsafe structs is to define them as fixed byte arrays and then provide a runtime conversion property for the field. For example:
[StructLayout(LayoutKind.Sequential,Pack=1)]
unsafe struct UserGroup{
fixed char name[12];
fixed User users[512];
int somethingElse;
fixed char anotherThing[16];
}
Turns into:
[StructLayout(LayoutKind.Sequential,Pack=1)]
unsafe struct UserGroup{
fixed char name[12];
fixed byte users[512 * Constants.SizeOfUser];
int somethingElse;
fixed char anotherThing[16];
public User[] Users
{
get
{
var retArr = new User[512];
fixed(User* retArrRef = retArr){
fixed(byte* usersFixed = users){
Memory.Copy(usersFixed, retArrRef, 512 * Constants.SizeOfUser);
}
}
return retArr;
}
}
}
Please note: this code uses the Memory.Copy function provided here: http://msdn.microsoft.com/en-us/library/aa664786(v=vs.71).aspx
The general explanation of the getter is as follows:
allocate a managed array for the return value
get and fix an unsafe pointer to it
get and fix an unsafe pointer to the byte array for the struct
copy the memory from one to the other
The reason the managed array is not stored back into the struct itself is that it would modify the struct's layout and it would no longer translate correctly, whereas the property is no problem when receiving the struct from unmanaged code. Alternatively, this could be wrapped in another managed object that does the storing.
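For completeness, a setter can be written symmetrically to the getter. The sketch below is self-contained: it uses Buffer.MemoryCopy in place of the answer's Memory.Copy helper, and shrinks the struct to 4 users of 4 bytes each for brevity (those counts, and the simplified User, are my own simplifications):

```csharp
using System;

public struct User { public int id; } // simplified to a single 4-byte field

public unsafe struct UserGroup
{
    public const int Count = 4;      // stands in for 512
    public const int SizeOfUser = 4; // stands in for Constants.SizeOfUser
    public fixed byte users[Count * SizeOfUser];

    public User[] Users
    {
        get
        {
            var ret = new User[Count];
            fixed (User* dst = ret)
            fixed (byte* src = users)
                Buffer.MemoryCopy(src, dst, Count * SizeOfUser, Count * SizeOfUser);
            return ret;
        }
        set // assumes value.Length == Count
        {
            fixed (User* src = value)
            fixed (byte* dst = users)
                Buffer.MemoryCopy(src, dst, Count * SizeOfUser, Count * SizeOfUser);
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        var g = new UserGroup();
        g.Users = new[] { new User { id = 7 }, new User { id = 8 },
                          new User { id = 9 }, new User { id = 10 } };
        Console.WriteLine(g.Users[2].id); // 9
    }
}
```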

Getting the size of a field in bytes with C#

I have a class, and I want to inspect its fields and report eventually how many bytes each field takes. I assume all fields are of type Int32, byte, etc.
How can I find out easily how many bytes does the field take?
I need something like:
Int32 a;
// int a_size = a.GetSizeInBytes;
// a_size should be 4
You can't, basically. It will depend on padding, which may well be based on the CLR version you're using and the processor etc. It's easier to work out the total size of an object, assuming it has no references to other objects: create a big array, use GC.GetTotalMemory for a base point, fill the array with references to new instances of your type, and then call GetTotalMemory again. Take one value away from the other, and divide by the number of instances. You should probably create a single instance beforehand to make sure that no new JITted code contributes to the number. Yes, it's as hacky as it sounds - but I've used it to good effect before now.
Just yesterday I was thinking it would be a good idea to write a little helper class for this. Let me know if you'd be interested.
EDIT: There are two other suggestions, and I'd like to address them both.
Firstly, the sizeof operator: this only shows how much space the type takes up in the abstract, with no padding applied round it. (It includes padding within a structure, but not padding applied to a variable of that type within another type.)
Next, Marshal.SizeOf: this only shows the unmanaged size after marshalling, not the actual size in memory. As the documentation explicitly states:
The size returned is actually the size of the unmanaged type. The unmanaged and managed sizes of an object can differ. For character types, the size is affected by the CharSet value applied to that class.
And again, padding can make a difference.
Just to clarify what I mean about padding being relevant, consider these two classes:
class FourBytes { byte a, b, c, d; }
class FiveBytes { byte a, b, c, d, e; }
On my x86 box, an instance of FourBytes takes 12 bytes (including overhead). An instance of FiveBytes takes 16 bytes. The only difference is the "e" variable - so does that take 4 bytes? Well, sort of... and sort of not. Fairly obviously, you could remove any single variable from FiveBytes to get the size back down to 12 bytes, but that doesn't mean that each of the variables takes up 4 bytes (think about removing all of them!). The cost of a single variable just isn't a concept which makes a lot of sense here.
Depending on the needs of the questionee, Marshal.SizeOf might or might not give you what you want. (Edited after Jon Skeet posted his answer).
using System;
using System.Runtime.InteropServices;
public class MyClass
{
public static void Main()
{
Int32 a = 10;
Console.WriteLine(Marshal.SizeOf(a));
Console.ReadLine();
}
}
Note that, as jkersch says, sizeof can be used, but unfortunately only with value types. If you need the size of a class, Marshal.SizeOf is the way to go.
Jon Skeet has laid out why neither sizeof nor Marshal.SizeOf is perfect. I guess the questionee needs to decide whether either is acceptable for his problem.
From Jon Skeet's recipe in his answer, I tried to make the helper class he was referring to. Suggestions for improvements are welcome.
public class MeasureSize<T>
{
private readonly Func<T> _generator;
private const int NumberOfInstances = 10000;
private readonly T[] _memArray;
public MeasureSize(Func<T> generator)
{
_generator = generator;
_memArray = new T[NumberOfInstances];
}
public long GetByteSize()
{
//Make one to make sure it is jitted
_generator();
long oldSize = GC.GetTotalMemory(false);
for(int i=0; i < NumberOfInstances; i++)
{
_memArray[i] = _generator();
}
long newSize = GC.GetTotalMemory(false);
return (newSize - oldSize) / NumberOfInstances;
}
}
Usage:
It should be created with a Func that generates new instances of T. Make sure the same instance is not returned every time. E.g., this would be fine:
public long SizeOfSomeObject()
{
var measure = new MeasureSize<SomeObject>(() => new SomeObject());
return measure.GetByteSize();
}
It can be done indirectly, without considering alignment.
The number of bytes a reference-type instance takes equals the size of the service fields plus the size of the type's fields.
Service fields (each takes 4 bytes on 32-bit, 8 bytes on 64-bit):
Sync block index
Pointer to the method table
+ Optionally (only for arrays) the array size
So, for a class without any fields, an instance takes 8 bytes on a 32-bit machine. For a class with one field that is a reference to an instance of the same class, the instance takes (on 64-bit):
sync block index + method table pointer + reference = 8 + 8 + 8 = 24 bytes
If it is a value type, it does not have any service fields, so it takes only the size of its fields. For example, a struct with one int field takes only 4 bytes of memory on a 32-bit machine.
I had to boil this down all the way to IL level, but I finally got this functionality into C# with a very tiny library.
You can get it (BSD licensed) at bitbucket
Example code:
using Earlz.BareMetal;
...
Console.WriteLine(BareMetal.SizeOf<int>()); //returns 4 everywhere I've tested
Console.WriteLine(BareMetal.SizeOf<string>()); //returns 8 on 64-bit platforms and 4 on 32-bit
Console.WriteLine(BareMetal.SizeOf<Foo>()); //returns 16 in some places, 24 in others. Varies by platform and framework version
...
struct Foo
{
int a, b;
byte c;
object foo;
}
Basically, what I did was write a quick class-method wrapper around the sizeof IL instruction. This instruction will get the raw amount of memory a reference to an object will use. For instance, if you had an array of T, then the sizeof instruction would tell you how many bytes apart each array element is.
This is extremely different from C#'s sizeof operator. For one, C# only allows pure value types because it's not really possible to get the size of anything else in a static manner. In contrast, the sizeof instruction works at a runtime level. So, however much memory a reference to a type would use during this particular instance would be returned.
You can see some more info and a bit more in-depth sample code at my blog
If you have the type, use the sizeof operator. It will return the type's size in bytes.
e.g.
Console.WriteLine(sizeof(int));
will output:
4
You can use method overloading as a trick to determine the field size:
public static int FieldSize(int Field) { return sizeof(int); }
public static int FieldSize(bool Field) { return sizeof(bool); }
public static int FieldSize(SomeStructType Field) { return sizeof(SomeStructType); }
The simplest way is: int size = *((int*)type.TypeHandle.Value + 1);
I know this is an implementation detail, but the GC relies on it, and it needs to be close to the start of the method table for efficiency. Moreover, considering how complex the GC code is, nobody will dare change it in the future. In fact it has worked for every minor/major version of .NET Framework and .NET Core. (I'm currently unable to test 1.0.)
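Spelled out, that trick looks like the sketch below. It relies on the CLR implementation detail described above (the base instance size sitting one int past the start of the method table), so treat it accordingly; the Example class is mine:

```csharp
using System;

public class Example { public int a, b; }

public static class Demo
{
    public static unsafe void Main()
    {
        // Reads the base instance size the CLR stores in the method table
        // (implementation detail: one int past TypeHandle.Value).
        int size = *((int*)typeof(Example).TypeHandle.Value + 1);
        Console.WriteLine(size); // e.g. 24 on 64-bit .NET: header + 2 ints + padding
    }
}
```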
If you want a more reliable way, emit a struct in a dynamic assembly with [StructLayout(LayoutKind.Auto)] and exactly the same fields in the same order, and take its size with the sizeof IL instruction. You may want to emit a static method within the struct which simply returns this value. Then add 2 * IntPtr.Size for the object header. This should give you the exact value.
But if your class derives from another class, you need to find the size of each base class separately and add them, plus 2 * IntPtr.Size again for the header. You can do this by getting the fields with the BindingFlags.DeclaredOnly flag.
System.Runtime.CompilerServices.Unsafe
Use System.Runtime.CompilerServices.Unsafe.SizeOf&lt;T&gt;() where T : unmanaged.
(When not running on .NET Core you need to install that NuGet package.)
Documentation states:
Returns the size of an object of the given type parameter.
It seems to use the sizeof IL instruction just as Earlz's solution does. (source)
The unmanaged constraint is new in C# 7.3
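For example (a minimal sketch; the Point struct is mine):

```csharp
using System;
using System.Runtime.CompilerServices;

struct Point { public int X, Y; }

class Demo
{
    static void Main()
    {
        Console.WriteLine(Unsafe.SizeOf<int>());   // 4
        Console.WriteLine(Unsafe.SizeOf<Point>()); // 8: two ints, no padding
    }
}
```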
