Best way to write bytes to a byte array in a struct? - c#

I'm trying to create a packet struct that is basically a fixed-length byte builder. I have written a WriteByte function in three different ways and am wondering which is best (or whether there's a better way altogether) and which will keep the GC happy. Note that I have a position (Pos) that has to be updated whenever one or more bytes are written. The functions will expand to include WriteUInt16, WriteFloat, etc. I'm not sure what the best approach is; any advice is appreciated.
Should this be a struct? I'd like to do as little allocation as possible because these Packs will be created very frequently.
Should I put the WriteByte in the struct itself as a method (option 1), as an extension of Pack (passed by ref) (option 2), or just use a static helper class (passed by ref) (option 3)?
Here's the code:
public struct Pack
{
    public byte[] Data { get; internal set; }
    public int Pos { get; internal set; }
    public Pack(byte opcode, ushort size)
    {
        ++size; // make room for byte opcode
        Data = new byte[2 + size]; // make room to prepend the size (ushort)
        BitConverter.GetBytes(size).CopyTo(Data, 0);
        Data[2] = opcode;
        Pos = 3; // start writing at position 3
    }
    // option 1
    // use: pack.WriteByte(0x01)
    public void WriteByte(byte value) => Data[Pos++] = value;
}
public static class SPackExtensions
{
    // option 2
    // use: pack.WriteByte(0x01)
    public static void WriteByte(ref this Pack pack, byte value)
    {
        pack.Data[pack.Pos++] = value;
    }
}
public static class PackWriter
{
    // option 3
    // use: PackWriter.WriteByte(ref pack, 0x01)
    public static void WriteByte(ref Pack pack, byte value)
    {
        pack.Data[pack.Pos++] = value;
    }
}

"Should this be a struct? I'd like to do as little allocation as possible because these Packs will be created very frequently."
Maybe. Even with a struct you will still be allocating two objects: the temporary array returned by BitConverter.GetBytes(size), and the Data array itself. So the overhead of allocating the Pack as an object would probably be negligible, especially since it would probably be much smaller than the array. However, structs are recommended to be immutable. A struct is copied by value, so it can be quite confusing when passing one around: every copy shares the same Data array, but each copy has its own Pos.
But you should probably use BitConverter.TryWriteBytes or BinaryPrimitives.TryWriteUInt16LittleEndian (the size is a ushort) to avoid the allocation when writing the length.
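As a minimal sketch of that suggestion, the constructor below writes the length prefix straight into the buffer with BinaryPrimitives.WriteUInt16LittleEndian, so no temporary array is created. It assumes the wire format is little-endian (BitConverter.GetBytes uses the machine's byte order, which is little-endian on most hardware); the PackDemo class is only there for illustration.

```csharp
using System;
using System.Buffers.Binary;

public struct Pack
{
    public byte[] Data { get; }
    public int Pos { get; private set; }

    public Pack(byte opcode, ushort size)
    {
        ++size;                            // make room for the opcode byte
        var data = new byte[2 + size];     // 2 extra bytes for the ushort length prefix
        // Writes the prefix in place; no intermediate byte[] is allocated.
        // Assumption: the protocol expects little-endian byte order.
        BinaryPrimitives.WriteUInt16LittleEndian(data, size);
        data[2] = opcode;
        Data = data;
        Pos = 3;                           // payload starts after prefix + opcode
    }

    public void WriteByte(byte value) => Data[Pos++] = value;
}

public static class PackDemo
{
    public static void Main()
    {
        var pack = new Pack(opcode: 0x10, size: 4);
        pack.WriteByte(0x01);
        Console.WriteLine(BitConverter.ToString(pack.Data)); // 05-00-10-01-00-00-00
    }
}
```

The only remaining allocation per packet is the Data array itself.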
"Should I put the WriteByte in the struct itself as a method (option 1), as an extension of Pack (passed by ref) (option 2), or just use a static helper class (passed by ref) (option 3)?"
The generated code should be identical, so go for whatever is easiest to use and most readable. I would probably argue for option 1. In my opinion, extension methods or helper classes bring little advantage in this case.
"these Packs will be created very frequently"
When writing performance-optimized code, a good principle is to avoid allocations altogether. As far as I can see, this object would be impossible to put into an object pool and reuse, since the position cannot be reset. So I would consider basing the object around a span that is fetched from a pool of fixed-size buffers.
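One way this could look is sketched below. The PackWriter ref struct and its Finish method are hypothetical names, not part of the question; the sketch assumes buffers are rented from ArrayPool<byte>.Shared and that the length prefix counts the opcode plus payload, as in the original constructor.

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;

// Hypothetical writer over a caller-supplied buffer (not from the question).
public ref struct PackWriter
{
    private readonly Span<byte> _buffer;
    private int _pos;

    public PackWriter(Span<byte> buffer, byte opcode)
    {
        buffer[2] = opcode;   // bytes 0..1 are reserved for the length prefix
        _buffer = buffer;
        _pos = 3;
    }

    public void WriteByte(byte value) => _buffer[_pos++] = value;

    public void WriteUInt16(ushort value)
    {
        BinaryPrimitives.WriteUInt16LittleEndian(_buffer.Slice(_pos), value);
        _pos += sizeof(ushort);
    }

    // Fills in the length prefix (opcode + payload) and returns the bytes written so far.
    public ReadOnlySpan<byte> Finish()
    {
        BinaryPrimitives.WriteUInt16LittleEndian(_buffer, (ushort)(_pos - 2));
        return _buffer.Slice(0, _pos);
    }
}

public static class PackWriterDemo
{
    public static void Main()
    {
        byte[] rented = ArrayPool<byte>.Shared.Rent(64); // may be larger than 64 and is not cleared
        try
        {
            var writer = new PackWriter(rented, opcode: 0x10);
            writer.WriteByte(0x01);
            writer.WriteUInt16(0xBEEF);
            ReadOnlySpan<byte> packet = writer.Finish();
            Console.WriteLine(BitConverter.ToString(packet.ToArray())); // 04-00-10-01-EF-BE
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

Because the writer is created fresh over each rented buffer, the position never needs to be reset, and the size does not have to be known up front.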
Also, I would highly recommend that you profile your code. "Very frequently" might range from 1 Hz to 10^9 Hz depending on context, so you should check whether this is an actual problem in your particular case.

Related

What is an efficient equivalent in C# for Span<Span<T>>, which does not exist?

I was porting some older high-speed C++ code to C#, and the existing code made use of a pointer-based double-indirection pattern like this (written here in a C# syntax), using the stack as efficient temporary storage:
public struct Source {
    public byte* Start;
    public int Count;
}
public struct SourceSet {
    public Source* Start;
    public int Count;
}
Source* sources = stackalloc Source[n*k];
SourceSet* sourceSets = stackalloc SourceSet[n];
It populates the sources with sets of segments of the data, and populates the sourceSets with how many Sources are in each set.
The code works fine, but ideally, I would like to convert it to no longer use pointers and unsafe — to instead use something memory-safe like this, where each SourceSet would be populated by a .Slice() of the sources:
public struct Source {
    public int Start;
    public int Count;
}
Span<Source> sources = stackalloc Source[n*k];
Span<Span<Source>> sourceSets = stackalloc Span<Source>[n];
But I can't write that, because Span<Span<T>> can't exist in C# — Span<T> is a ref struct and thus can't be used as the type parameter of another Span<T>. And I can't use Memory<Memory<T>> as a replacement, because stackalloc only produces a Span<T> (or a pointer, but the goal is to avoid pointers).
(And yes, I understand that Span<Span<T>> will likely never be added to the language, because it could potentially allow the rules about span lifetimes to be violated.)
So what's a good, efficient equivalent in C# for safe double-pointer indirection on the stack, something like Span<Span<T>>, but that actually exists?
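One possible shape for this, offered only as a sketch and not as an established answer, is to store index ranges instead of nested spans: each SourceSet records an offset and count into the flat sources span, and a Slice is taken at the point of use. The int-based SourceSet layout and the SumSet helper are assumptions made for illustration.

```csharp
using System;

public struct Source { public int Start; public int Count; }     // range into the data
public struct SourceSet { public int Start; public int Count; }  // range into sources

public static class SourceSetDemo
{
    // Sums every byte covered by the sources belonging to one set.
    public static int SumSet(ReadOnlySpan<byte> data, ReadOnlySpan<Source> sources, SourceSet set)
    {
        int total = 0;
        // The "inner span" is materialized here, on demand, instead of being stored.
        foreach (Source s in sources.Slice(set.Start, set.Count))
            foreach (byte b in data.Slice(s.Start, s.Count))
                total += b;
        return total;
    }

    public static void Main()
    {
        Span<byte> data = stackalloc byte[] { 1, 2, 3, 4, 5, 6 };
        Span<Source> sources = stackalloc Source[3];
        Span<SourceSet> sourceSets = stackalloc SourceSet[2];

        sources[0] = new Source { Start = 0, Count = 2 };
        sources[1] = new Source { Start = 2, Count = 2 };
        sources[2] = new Source { Start = 4, Count = 2 };
        sourceSets[0] = new SourceSet { Start = 0, Count = 2 }; // sources 0 and 1
        sourceSets[1] = new SourceSet { Start = 2, Count = 1 }; // source 2

        foreach (SourceSet set in sourceSets)
            Console.WriteLine(SumSet(data, sources, set)); // prints 10, then 11
    }
}
```

Everything stays on the stack and no unsafe code is needed; the cost is one extra Slice call wherever a set is consumed.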

What happens when I assign a `ref` value to an `out` argument in C#?

I am writing a class that retrieves binary data and supports generically converting it into primitive types. It's supposed to be as efficient as possible. Here is what it looks like right now:
public abstract class MemorySource {
    public abstract Span<byte> ReadBytes(ulong address, int count);
    public unsafe bool TryRead<T>(ulong address, out T result) where T : unmanaged {
        Span<byte> buffer = ReadBytes(address, sizeof(T));
        result = default;
        // If the above line is commented, `result = ref <...>` won't compile, showing CS0177.
        if (!buffer.IsEmpty) {
            result = ref Unsafe.As<byte, T>(ref buffer.GetPinnableReference());
            return true;
        } else
            return false;
    }
}
Since I'm working with large amounts of memory, and my code is going to be performing a lot of small read operations, I want to minimize the number of times the memory is copied around.
The ReadBytes implementation will either a) create a span across part of an already existing array on the heap, or b) stackalloc a buffer and fill it with data from a remote source (depending on the data I'll be working with). The point is, it will not be allocating anything on the heap by itself.
I want my TryRead<T> method to return a typed reference to the span's memory, rather than copy that memory into a new value, and I want to know if that's possible. I've noticed that I can't assign a ref value to an out argument without initializing it, but I can after, which makes little sense if we assume that I'm assigning a reference.
I guess what I'm asking is, what's really going on in this code? Am I returning a reference to an existing value, or is that value being copied into a new stack-allocated one? And how would the behavior be different with stack-allocated and heap-allocated spans? Would GC know to update the reference of type T when moving the data, in case of a heap-allocated span being used?
Primitive types are value types anyway, so you shouldn't worry about allocating on the heap when reading them.
You cannot use stackalloc for this code, because you can't (or shouldn't try to) return a pointer to it, as it will be destroyed at the end of the function.
The code you have so far is dangerous, because you are returning a pinnable reference which is not actually pinned.
The reason you are having problems with the out parameter is that the else branch does not assign it at all. You should move the result = default; line into the else branch.
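Either way, you are far better off using MemoryMarshal for all of this. Note that this does not require unsafe code:
// requires: using System.Runtime.CompilerServices; using System.Runtime.InteropServices;
public bool TryRead<T>(ulong address, out T result) where T : unmanaged
{
    // Unsafe.SizeOf<T>() is used because sizeof(T) on a generic T needs an unsafe context.
    ReadOnlySpan<byte> buffer = ReadBytes(address, Unsafe.SizeOf<T>());
    if (!buffer.IsEmpty)
    {
        result = MemoryMarshal.Read<T>(buffer); // copies the value out of the buffer
        return true;
    }
    result = default;
    return false;
}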
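To make the MemoryMarshal approach concrete, here is a small self-contained sketch. ArrayMemorySource is a hypothetical array-backed stand-in for the asker's MemorySource, and it uses MemoryMarshal.TryRead, which returns false when the buffer is too short instead of throwing. The value is copied out, so the result does not alias the buffer.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical array-backed source, for illustration only.
public sealed class ArrayMemorySource
{
    private readonly byte[] _data;
    public ArrayMemorySource(byte[] data) => _data = data;

    // Returns an empty span when the requested range is out of bounds.
    public Span<byte> ReadBytes(ulong address, int count)
    {
        ulong length = (ulong)_data.Length;
        if (count < 0 || address > length || (ulong)count > length - address)
            return Span<byte>.Empty;
        return _data.AsSpan((int)address, count);
    }

    public bool TryRead<T>(ulong address, out T result) where T : unmanaged
    {
        ReadOnlySpan<byte> buffer = ReadBytes(address, Unsafe.SizeOf<T>());
        // TryRead copies sizeof(T) bytes into result, or sets default and returns false.
        return MemoryMarshal.TryRead(buffer, out result);
    }
}

public static class MemorySourceDemo
{
    public static void Main()
    {
        var source = new ArrayMemorySource(new byte[] { 0x78, 0x56, 0x34, 0x12 });
        if (source.TryRead<int>(0, out int value))
            Console.WriteLine(value.ToString("X8")); // 12345678 on a little-endian machine
        Console.WriteLine(source.TryRead<int>(2, out _)); // False: only 2 bytes remain
    }
}
```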

C# reinterpret bool as byte/int (branch-free)

Is it possible in C# to turn a bool into a byte or int (or any integral type, really) without branching?
In other words, this is not good enough:
var myInt = myBool ? 1 : 0;
We might say we want to reinterpret a bool as the underlying byte, preferably in as few instructions as possible. The purpose is to avoid branch prediction fails as seen here.
unsafe
{
    byte myByte = *(byte*)&myBool;
}
Another option is System.Runtime.CompilerServices.Unsafe, which requires a NuGet package on non-Core platforms:
byte myByte = Unsafe.As<bool, byte>(ref myBool);
The CLI specification only defines false as 0 and true as anything except 0, so technically speaking this might not work as expected on all platforms. However, as far as I know the C# compiler also makes the assumption that there are only two values for bool, so in practice I would expect it to work outside of mostly academic cases.
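A short sketch of the Unsafe.As variant follows. The BoolBits class and its method names are made up for illustration; ToInt01 additionally normalizes any non-zero byte to 1 using only arithmetic, in case a bool with an unusual bit pattern ever reaches the code. Whether the JIT actually emits branch-free machine code is something to verify by inspecting the disassembly, not something this sketch guarantees.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class BoolBits
{
    // Reinterprets the bool's single byte; yields 1 for a compiler-produced true.
    public static byte ToByte(bool value) => Unsafe.As<bool, byte>(ref value);

    // Maps 0 to 0 and any non-zero byte to 1 without a conditional in the C# source.
    public static int ToInt01(bool value)
    {
        int b = Unsafe.As<bool, byte>(ref value);
        // For b > 0, (b | -b) has its sign bit set; shifting it down gives 1.
        return (int)((uint)(b | -b) >> 31);
    }

    public static void Main()
    {
        Console.WriteLine(ToByte(true));   // 1
        Console.WriteLine(ToByte(false));  // 0
        Console.WriteLine(ToInt01(true));  // 1
    }
}
```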
The usual C# equivalent to "reinterpret cast" is to define a struct with fields of the types you want to reinterpret. That approach works fine in most cases. In your case, that would look like this:
[StructLayout(LayoutKind.Explicit)]
struct BoolByte
{
    [FieldOffset(0)]
    public bool Bool;
    [FieldOffset(0)]
    public byte Byte;
}
Then you can do something like this:
BoolByte bb = new BoolByte();
bb.Bool = true;
int myInt = bb.Byte;
Note that you only have to initialize the variable once, then you can set Bool and retrieve Byte as often as you like. This should perform as well or better than any approach involving unsafe code, calling methods, etc., especially with respect to addressing any branch-prediction issues.
It's important to point out that if you can read a bool as a byte, then of course anyone can write a bool as a byte, and the actual int value of the bool when it's true may or may not be 1. It technically could be any non-zero value.
All that said, this will make the code a lot harder to maintain. Both because of the lack of guarantees of what a true value actually looks like, and just because of the added complexity. It would be extremely rare to run into a real-world scenario that suffers from the missed branch-prediction issue you're asking about. Even if you had a legitimate real-world example, it's arguable that it would be better solved some other way. The exact alternative would depend on the specific real-world example, but one example might be to keep the data organized in a way that allows for batch processing on a given condition instead of testing for each element.
I strongly advise against doing something like this, until you have a demonstrated, reproducible real-world problem, and have exhausted other more idiomatic and maintainable options.
Here is a solution that takes more lines (and presumably more instructions) than I would like, but that actually solves the problem directly, i.e. by reinterpreting.
Since .NET Core 2.1, we have some reinterpret methods available in MemoryMarshal. We can treat our bool as a ReadOnlySpan<bool>, which in turn we can treat as a ReadOnlySpan<byte>. From there, it is trivial to read the single byte value.
var myBool = true;
var myBoolSpan = MemoryMarshal.CreateReadOnlySpan(ref myBool, length: 1);
var myByteSpan = MemoryMarshal.AsBytes(myBoolSpan);
var myByte = myByteSpan[0]; // =1
Maybe this would work? (source of the idea)
using System;
using System.Reflection.Emit;
namespace ConsoleApp10
{
    class Program
    {
        static Func<bool, int> BoolToInt;
        static Func<bool, byte> BoolToByte;
        static void Main(string[] args)
        {
            InitIL();
            Console.WriteLine(BoolToInt(true));
            Console.WriteLine(BoolToInt(false));
            Console.WriteLine(BoolToByte(true));
            Console.WriteLine(BoolToByte(false));
            Console.ReadLine();
        }
        static void InitIL()
        {
            var methodBoolToInt = new DynamicMethod("BoolToInt", typeof(int), new Type[] { typeof(bool) });
            var ilBoolToInt = methodBoolToInt.GetILGenerator();
            ilBoolToInt.Emit(OpCodes.Ldarg_0);
            ilBoolToInt.Emit(OpCodes.Ldc_I4_0); //these 2 lines
            ilBoolToInt.Emit(OpCodes.Cgt_Un); //might not be needed
            ilBoolToInt.Emit(OpCodes.Ret);
            BoolToInt = (Func<bool, int>)methodBoolToInt.CreateDelegate(typeof(Func<bool, int>));
            var methodBoolToByte = new DynamicMethod("BoolToByte", typeof(byte), new Type[] { typeof(bool) });
            var ilBoolToByte = methodBoolToByte.GetILGenerator();
            ilBoolToByte.Emit(OpCodes.Ldarg_0);
            ilBoolToByte.Emit(OpCodes.Ldc_I4_0); //these 2 lines
            ilBoolToByte.Emit(OpCodes.Cgt_Un); //might not be needed
            ilBoolToByte.Emit(OpCodes.Ret);
            BoolToByte = (Func<bool, byte>)methodBoolToByte.CreateDelegate(typeof(Func<bool, byte>));
        }
    }
}
Based on Microsoft's documentation for each opcode, the emitted IL does the following:
1. Ldarg_0 pushes the parameter (the bool) onto the evaluation stack.
2. Ldc_I4_0 pushes an int with the value 0.
3. Cgt_Un compares whether the parameter is greater than that value (possibly branching here?).
4. The result, 1 if the comparison is true and 0 otherwise, is returned by Ret.
Steps 2 and 3 can be removed, but then the return value could be something other than 0 or 1.
Like I said in the beginning, this code is taken from another answer. It seems to work, but it was slow when I benchmarked it; search for ".NET DynamicMethod slow" to find ways to make it faster.
You could maybe use GetHashCode on the bool?
It returns an int of 1 for true and 0 for false (although the method itself may be implemented with a conditional).
You can then write: var myByte = (byte)myBool.GetHashCode();

how to read structure which has array in c++

public struct Data
{
    public long id;
    public DateTime time;
    public string[] atts;
    public string[] names;
}
[Guid("3102C9D3-822E-4359-9383-9B3AF7D39F2C")]
public interface IData
{
    void GetEvents([MarshalAs(UnmanagedType.LPArray)] out DataResp[] resp);
}
I want to pass complex structs between C# and a C++ component, and populate them on the C++ side.
The client code looks like this:
Lib::Data* data;
long size = svc->GetEvents(&data);
for(size_t i = 0; i < size; ++i)
{
Lib::Data& current = data[i];
long val = current.Value;
bstr_t unit = current.Unit;
Can anyone help me read the string arrays from C++, and also marshal the DateTime?
You will face several issues while doing so. Most of the time, you will have to decorate the structs with several attributes.
For passing strings, I would recommend building a string of the size expected by C++ using StringBuilder, which will be sufficient.
One more issue I have faced several times is packing, which changes the alignment of the structure. To resolve it, make sure the structs are laid out sequentially by applying the proper attributes.
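To illustrate the packing point, the sketch below compares the marshaled size of the same two fields under the default packing and under Pack = 1. The struct names are invented for the example; the Pack value you actually need is whatever the C++ side was compiled with, which is an assumption you have to check against its headers or #pragma pack settings.

```csharp
using System;
using System.Runtime.InteropServices;

// Default packing: the int is aligned to 4 bytes, so 3 padding bytes follow Flag.
[StructLayout(LayoutKind.Sequential)]
public struct DefaultPacked
{
    public byte Flag;
    public int Value;
}

// Pack = 1: no padding between fields.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct TightlyPacked
{
    public byte Flag;
    public int Value;
}

public static class PackingDemo
{
    public static void Main()
    {
        Console.WriteLine(Marshal.SizeOf<DefaultPacked>()); // 8
        Console.WriteLine(Marshal.SizeOf<TightlyPacked>()); // 5
    }
}
```

If the managed and native layouts disagree on packing, every field after the first mismatch is read from the wrong offset.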
Further, if you need to pass any pointers, make sure you pin the objects before passing them; otherwise the runtime may move them.

How to determine the size of an instance?

I have set my project to accept unsafe code and have the following helper class to determine the size of an instance:
struct MyStruct
{
    public long a;
    public long b;
}
public static class CloneHelper
{
    public unsafe static void GetSize(BookSetViewModel book)
    {
        long n = 0;
        MyStruct inst;
        inst.a = 0;
        inst.b = 0;
        n = Marshal.SizeOf(inst);
    }
}
This works perfectly fine with a struct. However as soon as I use the actual class-instance that is passed in:
public unsafe static void GetSize(BookSetViewModel book)
{
    long n = 0;
    n = Marshal.SizeOf(book);
}
I get this error:
Type 'BookSetViewModel' cannot be marshaled as an unmanaged structure;
no meaningful size or offset can be computed.
Any idea how I could fix this?
Thanks,
Well, it really depends on what you mean by the "size" of an instance. There's the size of the single object in memory, but you usually need to think about any objects that the root object refers to. That's how much memory may be reclaimable after the root becomes eligible for garbage collection... but you can't just add them up, as those objects may be referred to by multiple other objects, and indeed there may be repeated references even within a single object.
This blog post shows some code I've used before to determine the size of the raw objects (header + fields), disregarding any extra cost due to the objects that one object refers to. It's not something I would use in production code, but it's useful for experimenting with how large an object is under varying circumstances.
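The blog post's code is not reproduced here. As a rough experiment in the same spirit, and only an assumption about how one might do it, the sketch below allocates many instances and divides the bytes allocated on the current thread by the instance count. ObjectSizeProbe and TwoLongs are hypothetical names, and the figure covers only the object itself (header plus fields), not anything it references.

```csharp
using System;

public sealed class TwoLongs
{
    public long A;
    public long B;
}

public static class ObjectSizeProbe
{
    // Estimates the per-instance allocation size by averaging over many allocations.
    public static long AverageAllocatedBytes<T>(Func<T> factory, int count = 100_000) where T : class
    {
        var keep = new T[count];     // allocated before measuring starts
        keep[0] = factory();         // warm-up so JIT work is not measured
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < count; i++)
            keep[i] = factory();
        long after = GC.GetAllocatedBytesForCurrentThread();
        GC.KeepAlive(keep);
        return (after - before) / count;
    }

    public static void Main()
    {
        // The exact number depends on the runtime and on whether the process is 32- or 64-bit.
        Console.WriteLine(AverageAllocatedBytes(() => new TwoLongs()));
    }
}
```

Like the approach described above, this is for experimenting, not for production code.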
