I'm looking for a data structure in .NET that keeps heterogeneous structs contiguous in memory in order to be CPU-cache-friendly.
This type of data structure is described on the T-machine.org blog, under Iteration 4.
In .NET, an array of value types (structs) keeps the data contiguous in memory, but this only works when every element has the same struct type.
I tried to create a ValueType[], but the structs are boxed, so the references are contiguous in memory but not the actual data.
After many attempts, I don't think this is possible natively in .NET. The only solution I can see is to manually manage the serialization and deserialization of structs in a byte array, but I don't think that will perform well.
Did you find a native solution, or a better solution than mine?
Edit 1:
I'm trying to implement an entity-component-system as described in the T-Machine.org blog.
No. There is no way to do Iteration 4 in C#. You can't decide where in memory a .NET struct or class will be placed. There is nothing similar to C++'s placement new.
But note that even Iteration 4 seems to have more problems than solutions:
At this point, our iterations are quite good, but we’re seeing some recurring problems:
Re-allocation of arrays when Components are added/removed (I’ve not covered this above – if you’re not familiar with the problem, google “C dynamic array”)
Fragmentation (affects every iteration after Iteration 1, which doesn’t get any worse simple because it’s already as bad as it could be)
Cross-referencing (which I skipped)
But if you have structs of around the same size, the union trick could be enough:
public enum StructType
{
Velocity = 0,
Position = 1,
Foo = 2,
Bar = 3,
}
public struct Velocity
{
public int Vx;
public int Vy;
}
public struct Position
{
public int X;
public int Y;
public int Z;
}
public struct Foo
{
public double Weight;
public double Height;
public int Age;
}
public struct Bar
{
public int ColorR;
public int ColorG;
public int ColorB;
public int Transparency;
}
[StructLayout(LayoutKind.Explicit)]
public struct SuperStruct
{
[FieldOffset(0)]
public StructType StructType;
[FieldOffset(4)]
public Velocity Velocity;
[FieldOffset(4)]
public Position Position;
[FieldOffset(4)]
public Foo Foo;
[FieldOffset(4)]
public Bar Bar;
}
Officially, C# has no C-style unions, but you can create them through the use of [StructLayout(LayoutKind.Explicit)] and [FieldOffset]. Note that overlapping fields are totally incompatible with reference types, and clearly the size of SuperStruct is determined by its biggest member. In this case it is 32 bytes: Foo's fields take 20 bytes, and padding brings the total (including the 4-byte StructType in front) up to a multiple of 8 bytes.
Clearly your array would be of type SuperStruct[]. Note that, following the Iteration 4 example, the StructType field isn't strictly necessary, because the type of each element is recorded elsewhere.
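As a rough sketch of how such an array of unions could be used, the example below stores two component kinds in one contiguous array and walks it by reference. The names Component, ComponentKind and UnionArrayDemo are hypothetical and only mirror the SuperStruct idea above with fewer members.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical tag, mirroring StructType above.
public enum ComponentKind { Velocity = 0, Position = 1 }

public struct Velocity { public int Vx, Vy; }
public struct Position { public int X, Y, Z; }

// Hypothetical cut-down version of SuperStruct: both payloads share offset 4.
[StructLayout(LayoutKind.Explicit)]
public struct Component
{
    [FieldOffset(0)] public ComponentKind Kind;
    [FieldOffset(4)] public Velocity Velocity;
    [FieldOffset(4)] public Position Position;
}

public static class UnionArrayDemo
{
    // Builds a small array; the elements live inline in the array, unboxed.
    public static Component[] BuildSample()
    {
        var components = new Component[3];
        components[0].Kind = ComponentKind.Position;
        components[0].Position = new Position { X = 1, Y = 2, Z = 3 };
        components[1].Kind = ComponentKind.Velocity;
        components[1].Velocity = new Velocity { Vx = 10, Vy = 20 };
        components[2].Kind = ComponentKind.Position;
        components[2].Position = new Position { X = 5 };
        return components;
    }

    // Iterates linearly and reads only the member selected by the tag.
    public static int SumOfX(Component[] components)
    {
        int sum = 0;
        for (int i = 0; i < components.Length; i++)
        {
            ref Component c = ref components[i]; // no copy of the element
            if (c.Kind == ComponentKind.Position)
            {
                sum += c.Position.X;
            }
        }
        return sum;
    }
}
```

The tag is checked before each read because reading the member that was not written last just reinterprets the same bytes.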
I came across something very much like the below at work. I have never worked with a C# codebase that makes such heavy use of structs before.
I have used fixed before to prevent the garbage collector from moving things while I work on something unsafe using pointers. But I've never seen it used while taking a pointer and passing it to a Span like this and then using the Span outside of the fixed statement.
Is this okay? I guess since Span is managed, once we pass the location to it, then if the GC moves the location of MyStruct, it should get the new location okay right?
[StructLayout(LayoutKind.Sequential)]
public unsafe struct MyInnerStruct
{
public uint InnerValueA;
public uint InnerValueB;
public float InnerValueC;
public long InnerValueD;
}
[StructLayout(LayoutKind.Sequential)]
public unsafe struct MyStruct
{
public MyInnerStruct Value0;
public MyInnerStruct Value1;
public MyInnerStruct Value2;
public MyInnerStruct Value3;
public MyInnerStruct Value4;
public MyInnerStruct Value5;
public MyInnerStruct Value6;
public MyInnerStruct Value7;
public MyInnerStruct Value8;
public MyInnerStruct Value9;
public int ValidValueCount;
public Span<MyInnerStruct> Values
{
get
{
fixed (MyInnerStruct* ptr = &Value0)
{
return new Span<MyInnerStruct>(ptr, ValidValueCount);
}
}
}
}
This is unsafe
Returning a Span<T> from a method is, in the general case, unsafe. It works as long as one guarantees that the object it refers to does not go out of scope and is not moved. For stack variables, this holds only if they live at least as long as the Span itself. With the above code, consider this example:
public Span<MyInnerStruct> Foo()
{
MyStruct s = default;
s.ValidValueCount = 2; // otherwise Values has length 0 and the index below throws
s.Value0.InnerValueA = 3;
s.Values[1].InnerValueB = 4; // this is fine
return s.Values; // Bang! We're returning a reference to the local s, which is illegal
}
With the use of a pointer to initialize the Span, the runtime will also lose all means of tracking the reference it points to. So if MyStruct is part of an object, you'll easily get a dangling reference after the next GC run. .NET 5.0 and above has a method MemoryMarshal.CreateSpan<T> that works around some of the issues, but its use is still dangerous.
A detailed analysis of the problems with such a struct can be found here. The discussed struct ValueArray<T> looks very similar to the example given in the OP. In the end, the decision was to remove it from the code base, because its use is just too dangerous.
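For illustration, here is a minimal sketch of the MemoryMarshal.CreateSpan<T> variant mentioned above. It avoids raw pointers, but it is not a fix: the caller still has to guarantee that the struct outlives the span. The types Inner, Holder and HolderSpans are hypothetical, cut-down stand-ins for the structs in the question.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct Inner
{
    public uint A;
    public uint B;
}

// Hypothetical, shortened version of MyStruct: three inline elements.
[StructLayout(LayoutKind.Sequential)]
public struct Holder
{
    public Inner Value0;
    public Inner Value1;
    public Inner Value2;
    public int ValidValueCount;
}

public static class HolderSpans
{
    // Creates a span over the first ValidValueCount inline elements.
    // The span is built from a managed reference rather than a pointer,
    // but it is only valid while 'holder' itself is alive.
    public static Span<Inner> AsSpan(ref Holder holder)
        => MemoryMarshal.CreateSpan(ref holder.Value0, holder.ValidValueCount);
}
```

Taking the struct by ref makes the dependency visible at the call site: the span is derived from a variable the caller owns, so returning it from the method that declared that variable would repeat the problem shown above.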
This is the source code in .NET 6
public unsafe Span(void* pointer, int length)
{
_pointer = new ByReference<T>(ref Unsafe.As<byte, T>(ref *(byte*)pointer));
_length = length;
}
Your code:
fixed (MyInnerStruct* ptr = &Value0)
{
return new Span<MyInnerStruct>(ptr, ValidValueCount);
}
I did some tests, and in them using the span like you did was always safe, even when the memory behind ptr was moved around by the garbage collector.
I also tested writing to the Span after the struct had been moved, and the original object was still correctly updated.
In those tests the Span kept returning the correct values because of the ByReference<T> it stores.
I am trying to allocate an array of structures in C#. For example,
public struct Channel {
int ChannelId;
// other stuff goes here...
}
public struct FrameTraffic {
public int FrameId;
public int MaxChannels;
public Channel[] Channels;
public FrameTraffic(int dummyCS0568 = 0)
{
this.FrameId = 0;
MaxChannels = TableMgr.MaxChannels;
Channels = new Channel[TableMgr.MaxChannels];
}
}
But when I go to allocate an array of FrameTraffic structures, I see that Channels is null. This tells me that Channels is a reference rather than an array of structures. Am I correct? If so, then allocating the Channels array shouldn't embed the array into the structure, but simply satisfy the reference in the structure. I want the structures embedded. Is there a way to do this? Or am I incorrect in my assumptions?
This answers the latter part of your question and disregards any other problems. Yes, you are correct: Channels is a reference to an array. If you want to embed the array in the struct, you can use a fixed-size buffer via the fixed and unsafe keywords. However, its size must be known at compile time, and its element type can only be one of the following value types, not a user-defined struct:
bool, byte, char, short, int, long, sbyte, ushort, uint, ulong, float, or double.
So in short, what you want to do is not possible. You may need to clarify why you need this, or rethink your problem.
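To make the fixed-size buffer option concrete, here is a minimal sketch. It assumes unsafe code is enabled for the project, and it stores only channel IDs (plain int values), since a user-defined Channel struct is not allowed as the element type. The names FrameTrafficFixed and FixedBufferDemo, and the size 40, are hypothetical.

```csharp
using System;

public unsafe struct FrameTrafficFixed
{
    public const int MaxChannels = 40; // hypothetical compile-time size

    public int FrameId;
    public fixed int ChannelIds[MaxChannels]; // embedded in the struct itself
}

public static class FixedBufferDemo
{
    // Writes to and reads from the embedded buffer of an array element.
    public static unsafe int Run()
    {
        var frames = new FrameTrafficFixed[2];
        fixed (FrameTrafficFixed* p = frames) // pin the array while using pointers
        {
            p[1].ChannelIds[3] = 7;
            return p[1].ChannelIds[3];
        }
    }
}
```

Because the buffer is part of the struct, new FrameTrafficFixed[2] allocates one block that already contains both buffers, and no constructor call is needed to make them non-null.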
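You need to use the correct marshalling attribute, and the array needs to have a fixed size, say 40. Note that this only controls how the struct is marshalled to unmanaged code; in managed memory, Channels is still a reference to a separately allocated array.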
public struct FrameTraffic
{
public int FrameId;
public int MaxChannels;
[MarshalAs(UnmanagedType.ByValArray, SizeConst = 40)]
public Channel[] Channels;
}
I was able to replicate the null issue, though I'm not sure it is the same as yours.
There are two things that I think could be causing this:
You are initializing the array with a size only, without assigning any values
You might be initializing FrameTraffic with the default constructor instead of the one you defined (this caused the actual NullReferenceException for me)
Below is how you can adjust your code (I have hardcoded the value that comes from TableMgr.MaxChannels, since I don't have that):
class Program
{
static void Main()
{
FrameTraffic fT = new FrameTraffic(0);
foreach (var item in fT.Channels)
{
Console.WriteLine(item.ChannelId);
}
Console.Read();
}
}
public struct Channel
{
public int ChannelId; // must be public if you want to read or assign it from outside the struct
// other stuff goes here...
}
public struct FrameTraffic
{
public int FrameId;
public Channel[] Channels;
public FrameTraffic(int dummyCS0568 = 0)
{
this.FrameId = 0;
const int MaxChannels = 1;
//array requires size and its values assigned here
Channels = new Channel[MaxChannels]{ new Channel { ChannelId = 1 } };
}
}
I started to learn about C# and I usually use C++.
There are a bunch of things that I'm trying to adapt, but std::array seems impossible...
I just want to run this kind of code:
public struct Foo {};
public struct Test
{
public Foo value[20];
};
I don't want to allocate each time I use this struct, and I don't want to use a class, ever...
I saw the fixed keyword, but it works only for basic types...
Is there no equivalent to something as simple as std::array?
I can even do that in C.
How would you solve this problem? (Even if it's still dynamically allocated...)
Using a fixed size buffer (fixed) is only possible for primitive types since its use is intended for interop. Array types are reference types, and so they can have dynamic size:
public struct Test
{
public Foo[] value;
}
Note however that copying the struct will only copy the reference, so the arrays will be identical. I suggest you either make the type immutable (by disabling writing to the array), or change struct to class and control cloning explicitly.
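A small sketch of the aliasing described above, with hypothetical field and class names:

```csharp
using System;

public struct Foo { public int Id; }

public struct Test
{
    public Foo[] Value; // a reference to a heap-allocated array
}

public static class AliasDemo
{
    public static int Run()
    {
        var a = new Test { Value = new Foo[20] };
        Test b = a;            // copies the struct, i.e. only the array reference
        b.Value[0].Id = 42;    // writes into the array shared by a and b
        return a.Value[0].Id;  // the change is visible through a as well
    }
}
```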
There is no such thing as a fixed size by-value array type in C# (although I have proposed it once). The closest thing you can get to it is a value tuple.
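Newer toolchains do offer something closer to this: assuming C# 12 on .NET 8 or later, the InlineArray attribute lets a struct hold a fixed number of elements by value, including user-defined structs. The sketch below relies on that assumption, and the names FooBuffer20 and InlineArrayDemo are hypothetical.

```csharp
using System;
using System.Runtime.CompilerServices;

public struct Foo { public int Id; }

// Assumes .NET 8+ / C# 12: the runtime repeats the single field 20 times.
[InlineArray(20)]
public struct FooBuffer20
{
    private Foo _element0;
}

public struct Test
{
    public FooBuffer20 Value; // 20 Foo elements stored inline, no heap array
}

public static class InlineArrayDemo
{
    public static int Run()
    {
        var t = new Test();
        t.Value[3].Id = 42;    // element access through the compiler-provided indexer
        Test copy = t;         // copies all 20 elements by value
        copy.Value[3].Id = 7;  // does not affect t
        return t.Value[3].Id;
    }
}
```

Unlike the Foo[] field, copying Test here duplicates the elements, which is the std::array behaviour the question asks for.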
So it seems like there is no way to avoid something as stupid as dynamically allocating something known at compile time. But that's C#, so I just need to... try to close my eyes.
Anyway, I did something to solve array aliases and fixed-size arrays at the same time (I didn't ask about array aliases in this question, though).
public abstract
class Array<T>
{
private T[] data;
protected Array(int size) { data = new T[size]; }
public T this[int i]
{
get { return data[i]; }
set { data[i] = value; }
}
};
public class Alias : Array<int>
{
static public int Length = 10;
public Alias() : base(Length) {}
};
And some people say it's quicker to write code with C#...
If someone has something better, I'll gladly take it!
I'm translating a library written in C++ to C#, and the keyword 'union' exists once. In a struct.
What's the correct way of translating it into C#? And what does it do? It looks something like this;
struct Foo {
float bar;
union {
int killroy;
float fubar;
} as;
};
You can use explicit field layouts for that:
[StructLayout(LayoutKind.Explicit)]
public struct SampleUnion
{
[FieldOffset(0)] public float bar;
[FieldOffset(4)] public int killroy;
[FieldOffset(4)] public float fubar;
}
Untested. The idea is that two variables have the same position in your struct. You can of course only use one of them.
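A minimal sketch of what sharing a position means in practice, using a hypothetical two-member union at offset 0: writing one member and reading the other reinterprets the same four bytes.

```csharp
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
public struct IntFloatUnion
{
    [FieldOffset(0)] public int AsInt;
    [FieldOffset(0)] public float AsFloat; // overlaps AsInt
}

public static class UnionDemo
{
    // Returns the raw IEEE 754 bit pattern of a float via the overlay.
    public static int BitsOf(float value)
    {
        var u = new IntFloatUnion();
        u.AsFloat = value;
        return u.AsInt;
    }
}
```

For example, UnionDemo.BitsOf(1.0f) yields 0x3F800000, the bit pattern of 1.0 as a single-precision float.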
More information about unions can be found in the struct tutorial.
You can't really decide how to deal with this without knowing something about how it is used. If it is merely being used to save space, then you can ignore it and just use a struct.
However, that is not usually why unions are used. There are two common reasons to use them. One is to provide two or more ways to access the same data. For instance, a union of an int and an array of 4 bytes is one (of many) ways to separate out the bytes of a 32-bit integer.
The other is when the data in the struct came from an external source such as a network data packet. Usually one element of the struct enclosing the union is an ID that tells you which flavor of the union is in effect.
In neither of these cases can you blindly ignore the union and convert it to a struct where the two (or more) fields do not coincide.
In C/C++, a union is used to overlay different members in the same memory location. If you have a union of an int and a float, they both use the same 4 bytes of memory, so writing to one obviously corrupts the other (since int and float have different bit layouts).
In .Net Microsoft went with the safer choice and didn't include this feature.
EDIT: except for interop
If you're using the union to map the bytes of one of the types to the other then in C# you can use BitConverter instead.
float fubar = 125f;
int killroy = BitConverter.ToInt32(BitConverter.GetBytes(fubar), 0);
or;
int killroy = 125;
float fubar = BitConverter.ToSingle(BitConverter.GetBytes(killroy), 0);
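If the goal is only to reinterpret the bits, a sketch without the intermediate byte array is possible, assuming a runtime that provides BitConverter.SingleToInt32Bits and BitConverter.Int32BitsToSingle (.NET Core 2.0 or later). The wrapper class BitsDemo is hypothetical.

```csharp
using System;

public static class BitsDemo
{
    // Reinterprets the 32 bits of a float as an int, without allocating.
    public static int FloatToBits(float value) => BitConverter.SingleToInt32Bits(value);

    // Reinterprets the 32 bits of an int as a float.
    public static float BitsToFloat(int bits) => BitConverter.Int32BitsToSingle(bits);
}
```

BitsDemo.FloatToBits(125f) gives 0x42FA0000, and passing that value back through BitsToFloat returns 125f again.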
You could write a simple wrapper, but in most cases just use an object; it is less confusing.
public class MyUnion
{
private object _id;
public T GetValue<T>() => (T)_id;
public void SetValue<T>(T value) => _id = value;
}
Personally, I would ignore the union altogether and implement killroy and fubar as separate fields.
public struct Foo
{
float bar;
int Kilroy;
float Fubar;
}
Using a UNION saves 32 bits of memory allocated by the int....not going to make or break an app these days.
public class Foo
{
public float bar;
public int killroy;
public float fubar
{
get{ return (float)killroy;}
set{ killroy = (int)value;}
}
}
I am calling a C++ function from C#. As arguments it receives a pointer to an array of structs.
struct A
{
int data;
}
int CFunction (A* pointerToFirstElementOfArray, int NumberOfArrayElements)
In C# I have created the same struct (as a class) and I marshal it correctly (the first element in the array is received correctly). Here is my C# definition of the C++ struct:
[StructLayout(LayoutKind.Sequential), Serializable]
class A
{
int data;
}
The first element is read correctly, so I know all the fields in the struct are marshalled correctly. The problem occurs when I try to send over an array of elements. How do I create an array of classes that will be in a single memory block (chunk), so the C++ function can increment the pointer to the array?
I guess I would need something similar to stackalloc; however, I believe that only works for primitive types?
Thank you for your help.
This is not intended to be a definitive answer, but merely an example of how you could accomplish this using stackalloc and unsafe code.
public unsafe class Example
{
private const int LENGTH = 10; // hypothetical element count; use whatever your API expects
[DllImport("library.dll")]
private static extern int CFunction(A* pointerToFirstElementOfArray, int NumberOfArrayElements);
public void DoSomething()
{
A* a = stackalloc A[LENGTH];
CFunction(a, LENGTH);
}
}
Also, pay attention to the packing of the struct that the API accepts. You may have to play around with the Pack property of the StructLayout attribute. The default value of Pack is 0, which selects the default packing for the current platform, but some APIs expect 1.
Edit:
For this to work you will have to change the declaration of A from a class to a struct.
public struct A
{
public int data;
}
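Once A is a struct, an ordinary managed array A[] is already a single memory block, so stackalloc is not the only option. The sketch below does not call the native function; it only illustrates the contiguity by viewing the array's memory as raw ints. ContiguousDemo is a hypothetical helper, and it assumes A contains nothing but the one int field.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct A
{
    public int data;
}

public static class ContiguousDemo
{
    // Reinterprets the array's memory as ints. This only lines up because
    // the A elements are stored back to back, one int each.
    public static int[] RawInts(A[] items)
        => MemoryMarshal.Cast<A, int>(items.AsSpan()).ToArray();
}
```

For an array holding data values 1, 2 and 3, RawInts returns those three ints in order, which is the layout a C++ function expects when it increments an A* pointer.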