Greetings Overflowers,
I love the flexibility of memory mapped files in that you can read/write any value type.
Is there a way to do the same with byte arrays without having to copy them into, e.g., a memory-mapped buffer?
Regards
You can use the BitConverter class to convert between base data types and byte arrays.
You can read values directly from the array:
int value = BitConverter.ToInt32(data, pos);
To write data you convert it to a byte array, and copy it into the data:
BitConverter.GetBytes(value).CopyTo(data, pos);
You can bind a MemoryStream to a given byte array, set its Position property to go to a specific position within the array, and then use a BinaryReader or BinaryWriter to read or write values of different types from or to it.
You are looking for the MemoryStream class, which can be initialised (without copying!) from a fixed-size byte array.
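A minimal sketch of the MemoryStream approach, assuming a 16-byte array and an arbitrary offset of 4. The class and method names (MemoryStreamDemo, WriteValues, ReadValues) are made up for illustration. Note that BinaryWriter and BinaryReader always use little-endian byte order, and that a MemoryStream wrapped around an existing array cannot grow past the array's length.

```csharp
using System;
using System.IO;

static class MemoryStreamDemo
{
    // Writes an int and a double into 'data' starting at 'offset', in place.
    public static void WriteValues(byte[] data, int offset, int i, double d)
    {
        // This constructor wraps the existing array; it does not copy it.
        using (var stream = new MemoryStream(data))
        using (var writer = new BinaryWriter(stream))
        {
            stream.Position = offset;
            writer.Write(i); // 4 bytes
            writer.Write(d); // 8 bytes
        }
    }

    // Reads the same two values back from 'data' starting at 'offset'.
    public static (int, double) ReadValues(byte[] data, int offset)
    {
        using (var stream = new MemoryStream(data, writable: false))
        using (var reader = new BinaryReader(stream))
        {
            stream.Position = offset;
            int i = reader.ReadInt32();
            double d = reader.ReadDouble();
            return (i, d);
        }
    }

    static void Main()
    {
        byte[] data = new byte[16];
        WriteValues(data, 4, 42, 3.5);
        var (i, d) = ReadValues(data, 4);
        Console.WriteLine($"{i} {d}");
    }
}
```

The values land directly in the original array, so other code holding a reference to it sees the change immediately.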
(Using unsafe code)
The following sample shows how to fill a 16 byte array with two long values, which is something BitConverter still can't do without an additional copy operation:
byte[] bar = new byte[16];
long lValue1 = 1;
long lValue2 = 2;
unsafe {
fixed (byte* bptr = &bar[0]) {
long* lptr = (long*)bptr;
*lptr = lValue1;
// pointer arithmetic: for a long* pointer '+1' adds 8 bytes.
*(lptr + 1) = lValue2;
}
}
Or you could make your own StoreBytes() method:
// here the dest offset is in bytes
public static void StoreBytes(long lValue, byte[] dest, int iDestOffset) {
unsafe {
fixed (byte* bptr = &dest[iDestOffset]) {
long* lptr = (long*)bptr;
*lptr = lValue;
}
}
}
Reading values from a byte array is no problem with BitConverter since you can specify the offset in .ToInt64.
Alternative: use Buffer.BlockCopy, which copies raw bytes between arrays of different primitive types.
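A short sketch of the Buffer.BlockCopy route for the same "two longs into 16 bytes" case. The helper name StoreLongs is hypothetical. Keep in mind that this is still a copy (from a long[] into the byte[]), and that every offset and the count are given in bytes, not elements.

```csharp
using System;

static class BlockCopyDemo
{
    // Copies the raw bytes of 'values' into 'dest', starting at byte offset 'destOffset'.
    public static void StoreLongs(long[] values, byte[] dest, int destOffset)
    {
        // Source offset, destination offset and count are all measured in bytes.
        Buffer.BlockCopy(values, 0, dest, destOffset, values.Length * sizeof(long));
    }

    static void Main()
    {
        byte[] bar = new byte[16];
        StoreLongs(new long[] { 1, 2 }, bar, 0);
        Console.WriteLine(BitConverter.ToInt64(bar, 0)); // 1
        Console.WriteLine(BitConverter.ToInt64(bar, 8)); // 2
    }
}
```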
Related
I have a byte[] object that I'm using as a data buffer.
I want to "read" it as an array of either primitive or non-primitive structs without duplicating the byte[] data in memory.
The goal would be something like:
byte[] myBuffer;
//Buffer is populated
int[] asInts = PixieDust_ToInt(myBuffer);
MyStruct[] asMyStructs = PixieDust_ToMyStruct(myBuffer);
Is this possible? If so, how?
Is it possible? Practically, yes!
Since .NET Core 2.1, MemoryMarshal lets us do this for spans. If you are satisfied with a span instead of an array, then yes.
var intSpan = MemoryMarshal.Cast<byte, int>(myByteArray.AsSpan());
The int span will contain byteCount / 4 integers.
As for custom structs... The documentation claims to require a "primitive type" on both sides of the conversion. However, you might try using a ref struct and see whether that is the actual constraint. I wouldn't be surprised if it worked!
Note that ref structs are still very limiting, but the limitation makes sense for the kind of reinterpret casts that we are talking about.
Edit: Wow, the constraint is much less strict. It requires any struct, rather than a primitive. It does not even have to be a ref struct. There is only a runtime check that will throw if your struct contains a reference type anywhere in its hierarchy. That makes sense. So this should work for your custom structs as well as it does for ints. Enjoy!
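Here is a small sketch of what that looks like with a custom struct. Pixel, CastDemo and the method names are hypothetical, chosen only for the example. The span is a view over the original byte[], so writing through it changes the buffer with no copy involved.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical 4-byte struct with no reference-type fields.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Pixel
{
    public byte R, G, B, A;
}

static class CastDemo
{
    // Number of whole ints that fit in the buffer (byteCount / 4).
    public static int CountInts(byte[] buffer)
    {
        Span<int> ints = MemoryMarshal.Cast<byte, int>(buffer.AsSpan());
        return ints.Length;
    }

    // Sets the alpha byte of every pixel, writing straight into 'buffer'.
    public static void SetAlpha(byte[] buffer, byte alpha)
    {
        Span<Pixel> pixels = MemoryMarshal.Cast<byte, Pixel>(buffer.AsSpan());
        for (int i = 0; i < pixels.Length; i++)
            pixels[i].A = alpha; // the span indexer returns a ref, so this hits the buffer
    }

    static void Main()
    {
        byte[] buffer = new byte[8];
        SetAlpha(buffer, 255);
        Console.WriteLine(CountInts(buffer));          // 2
        Console.WriteLine(BitConverter.ToString(buffer));
    }
}
```

If the buffer length is not a multiple of the struct size, the trailing bytes are simply left out of the resulting span.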
You will not be able to do this. To have a MyStruct[] you'll need to actually create such an array of that type and copy the data over. You could, in theory, create your own custom type that acted as a collection, but was actually just a facade over the byte[], copying the bytes out into the struct objects as a given value was accessed, but if you end up actually accessing all of the values, this would end up copying all of the same data eventually, it would just potentially allow you to defer it a bit and may be helpful if you only actually use a small number of the values.
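To make the facade idea concrete, here is a rough sketch for the int case only. Int32View is a made-up name, and a real implementation would probably also implement IReadOnlyList<int> and validate indexes. Each access decodes 4 bytes from the underlying byte[] on demand, so nothing is copied up front.

```csharp
using System;

// Hypothetical facade that presents a byte[] as a list of ints.
sealed class Int32View
{
    private readonly byte[] _buffer;

    public Int32View(byte[] buffer)
    {
        _buffer = buffer ?? throw new ArgumentNullException(nameof(buffer));
    }

    // Number of whole ints available in the buffer.
    public int Count => _buffer.Length / sizeof(int);

    public int this[int index]
    {
        // Decode on access; uses the machine's native byte order.
        get => BitConverter.ToInt32(_buffer, index * sizeof(int));
        // Encode and write back into the shared buffer.
        set => BitConverter.GetBytes(value).CopyTo(_buffer, index * sizeof(int));
    }
}

static class Int32ViewDemo
{
    static void Main()
    {
        byte[] myBuffer = new byte[8];
        var asInts = new Int32View(myBuffer);
        asInts[1] = 5;
        Console.WriteLine(asInts.Count); // 2
        Console.WriteLine(asInts[1]);    // 5
    }
}
```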
Consider class System.BitConverter
This class has functions to reinterpret the bytes starting at a given index as an Int32, Int64, Double, Boolean, etc. and back from those types into a sequence of bytes.
Example:
int x = 0x12345678;
var xBytes = BitConverter.GetBytes(x);
// xBytes is a byte array of length 4: 0x78, 0x56, 0x34, 0x12 (on a little-endian machine)
var backToInt32 = BitConverter.ToInt32(xBytes, 0);
Or if your array contains mixed data:
double d = 3.1415;
short n = 42;
bool b = true;
ulong u = 0xFEDCBA9876543210;
// to array of bytes:
var dBytes = BitConverter.GetBytes(d);
var nBytes = BitConverter.GetBytes(n);
var bBytes = BitConverter.GetBytes(b);
var uBytes = BitConverter.GetBytes(u);
// Concat and ToArray need "using System.Linq;"
byte[] myBytes = dBytes.Concat(nBytes).Concat(bBytes).Concat(uBytes).ToArray();
// startIndexes in myBytes:
int startIndexD = 0;
int startIndexN = dBytes.Length;
int startIndexB = startIndexN + nBytes.Length;
int startIndexU = startIndexB + bBytes.Length;
// back to original elements
double dRestored = BitConverter.ToDouble(myBytes, startIndexD);
short nRestored = BitConverter.ToInt16(myBytes, startIndexN);
bool bRestored = BitConverter.ToBoolean(myBytes, startIndexB);
ulong uRestored = BitConverter.ToUInt64(myBytes, startIndexU);
The closest you will get in order to convert a byte[] to other base-types is
Byte[] b = GetByteArray();
using(BinaryReader r = new BinaryReader(new MemoryStream(b)))
{
r.ReadInt32();
r.ReadDouble();
r.Read...();
}
There is, however, no simple way to convert a byte[] to any kind of object[].
I want to export a C-like union into a byte array, like this:
[StructLayout(LayoutKind.Explicit)]
struct my_struct
{
[FieldOffset(0)]
public UInt32 my_uint;
[FieldOffset(0)]
public bool other_field;
}
public static void Main()
{
var test = new my_struct { my_uint = 0xDEADBEEF };
byte[] data = new byte[Marshal.SizeOf(test)];
IntPtr buffer = Marshal.AllocHGlobal(data.Length);
Marshal.StructureToPtr(test, buffer, false);
Marshal.Copy(buffer, data, 0, data.Length);
Marshal.FreeHGlobal(buffer);
foreach (byte b in data)
{
Console.Write("{0:X2} ", b);
}
Console.WriteLine();
}
The output we get (https://dotnetfiddle.net/gb1wRf) is 01 00 00 00 instead of the expected EF BE AD DE.
Now, what do we get if we change the other_field type to byte (for instance)?
Oddly, we get the output we wanted in the first place, EF BE AD DE (https://dotnetfiddle.net/DnXyMP)
Moreover, if we swap the original two fields, we again get the same output we wanted (https://dotnetfiddle.net/ziSQ5W)
Why is this happening? Why would the order of the fields matter? Is there a better (reliable) solution for doing the same thing?
This is an inevitable side-effect of the way a structure is marshaled. Starting point is that the structure value is not blittable, a side-effect of it containing a bool. Which takes 1 byte of storage in the managed struct but 4 bytes in the marshaled struct (UnmanagedType.Bool).
So the struct value cannot just be copied in one fell swoop, the marshaller needs to convert each individual member. So the my_uint is first, producing 4 bytes. The other_field is next, also producing 4 bytes at the exact same address. Which overwrites everything that my_uint produced.
The bool type is an oddity in general, it never produces a blittable struct. Not even when you apply [MarshalAs(UnmanagedType.U1)]. Which in itself has an interesting effect on your test, you'll now see that the 3 upper bytes produced by my_uint are preserved. But the result is still junk since the members are still converted one-by-one, now producing a single byte of value 0x01 at offset 0.
You can easily get what you want by declaring it as a byte instead, now the struct is blittable:
[StructLayout(LayoutKind.Explicit)]
struct my_struct {
[FieldOffset(0)]
public UInt32 my_uint;
[FieldOffset(0)]
private byte _other_field;
public bool other_field {
get { return _other_field != 0; }
set { _other_field = (byte)(value ? 1 : 0); }
}
}
I admit that I don't have an authoritative answer for why Marshal.StructureToPtr() behaves this way, other than that it is clearly doing more than just copying bytes. Rather, it must be interpreting the struct itself, marshaling each field individually to the destination via the normal rules for interpreting that field. Since bool is defined to only ever be one of two values, the non-zero value gets mapped to true, which marshals to raw bytes as 0x00000001.
Note that if you really just want the raw bytes from the struct value, you can do the copying yourself instead of going through the Marshal class. For example:
var test = new my_struct { my_uint = 0xDEADBEEF };
byte[] data = new byte[Marshal.SizeOf(test)];
unsafe
{
byte* pb = (byte*)&test;
for (int i = 0; i < data.Length; i++)
{
data[i] = pb[i];
}
}
Console.WriteLine(string.Join(" ", data.Select(b => b.ToString("X2"))));
Of course, for that to work you will need to enable unsafe code for your project. You can either do that for the project in question, or build the above into a separate helper assembly where unsafe is less risky (i.e. where you don't mind enabling it for other code, and/or don't care about the assembly being verifiable, etc.).
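If enabling unsafe is not an option, a possible alternative (a different technique from the pointer loop, and only available where MemoryMarshal.CreateSpan exists, i.e. .NET Core 2.1 or later) is to view the struct as a span of bytes. This is a sketch under that assumption. MyUnion mirrors the layout of my_struct, and RawBytesDemo and GetRawBytes are made-up names.

```csharp
using System;
using System.Runtime.InteropServices;

// Same explicit layout as my_struct in the question.
[StructLayout(LayoutKind.Explicit)]
struct MyUnion
{
    [FieldOffset(0)] public uint my_uint;
    [FieldOffset(0)] public bool other_field;
}

static class RawBytesDemo
{
    // Returns a copy of the struct's managed in-memory bytes, with no marshaling step.
    public static byte[] GetRawBytes(MyUnion value)
    {
        // Treat the single struct as a one-element span, then reinterpret it as bytes.
        Span<MyUnion> one = MemoryMarshal.CreateSpan(ref value, 1);
        return MemoryMarshal.AsBytes(one).ToArray();
    }

    static void Main()
    {
        byte[] data = GetRawBytes(new MyUnion { my_uint = 0xDEADBEEF });
        Console.WriteLine(BitConverter.ToString(data));
    }
}
```

Because no per-field conversion takes place, the bool member does not overwrite the uint. On a little-endian machine I would expect the bytes EF BE AD DE.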
I've found questions such as this one, which have come close to solving my dilemma. However, I've yet to find a clean approach to solving this issue in a generic manner.
I have a project that has a lot of structs that will be used for binary data transmission. This data needs to be big-endian and, of course, most .NET platforms are little-endian. This means that when I convert my structs to bytes, the byte order of my values is reversed.
Is there a fairly straight-forward approach to either forcing my structs to contain data in Big Endian format, or is there a way to generically write these structs to byte arrays (and byte arrays to structs) that output Big Endian data?
Here is some sample code for what I've already done.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public unsafe struct StructType_1
{
short shortVal;
ulong ulongVal;
int intVal;
}
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public unsafe struct StructType_2
{
long longVal_1;
long longVal_2;
long longVal;
int intVal;
}
...
public static class StructHelper
{
//How can I change the following methods so that the struct data
//is converted to and from BigEndian data?
public static byte[] StructToByteArray<T>(T structVal) where T : struct
{
int size = Marshal.SizeOf(structVal);
byte[] arr = new byte[size];
IntPtr ptr = Marshal.AllocHGlobal(size);
Marshal.StructureToPtr(structVal, ptr, true);
Marshal.Copy(ptr, arr, 0, size);
Marshal.FreeHGlobal(ptr);
return arr;
}
public static T StructFromByteArray<T>(byte[] bytes) where T : struct
{
int sz = Marshal.SizeOf(typeof(T));
IntPtr buff = Marshal.AllocHGlobal(sz);
Marshal.Copy(bytes, 0, buff, sz);
T ret = (T)Marshal.PtrToStructure(buff, typeof(T));
Marshal.FreeHGlobal(buff);
return ret;
}
}
If you don't mind reading and writing each field to a stream (which may have performance implications) you could use Jon Skeet's EndianBinaryWriter: https://stackoverflow.com/a/1540269/106159
The code would look something like this:
public unsafe struct StructType_2
{
long longVal_1;
long longVal_2;
long longVal;
int intVal;
}
using (MemoryStream memory = new MemoryStream())
{
using (EndianBinaryWriter writer = new EndianBinaryWriter(EndianBitConverter.Big, memory))
{
writer.Write(longVal_1);
writer.Write(longVal_2);
writer.Write(longVal);
writer.Write(intVal);
}
byte[] buffer = memory.ToArray();
// Use buffer
}
You would use the EndianBinaryReader for data going in the opposite direction.
This does of course have two fairly large drawbacks:
It's fiddly writing code to read and write each field.
It may be too slow to do it this way, depending on performance requirements.
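If pulling in a third-party writer is not an option, the same field-by-field idea can be sketched with BinaryPrimitives from System.Buffers.Binary (available since .NET Core 2.1). To be clear, this is a swapped-in technique, not the EndianBinaryWriter mentioned above. Message and its field names are hypothetical stand-ins for StructType_2, made public so the helper can reach them.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical stand-in for StructType_2 with accessible fields.
struct Message
{
    public long LongVal1;
    public long LongVal2;
    public long LongVal;
    public int IntVal;
}

static class BigEndianDemo
{
    public const int Size = 8 + 8 + 8 + 4; // 28 bytes on the wire

    // Serializes each field in big-endian order, regardless of the host's endianness.
    public static byte[] ToBigEndianBytes(Message m)
    {
        var bytes = new byte[Size];
        Span<byte> span = bytes;
        BinaryPrimitives.WriteInt64BigEndian(span.Slice(0, 8), m.LongVal1);
        BinaryPrimitives.WriteInt64BigEndian(span.Slice(8, 8), m.LongVal2);
        BinaryPrimitives.WriteInt64BigEndian(span.Slice(16, 8), m.LongVal);
        BinaryPrimitives.WriteInt32BigEndian(span.Slice(24, 4), m.IntVal);
        return bytes;
    }

    // Reads the fields back from a big-endian buffer.
    public static Message FromBigEndianBytes(ReadOnlySpan<byte> span)
    {
        return new Message
        {
            LongVal1 = BinaryPrimitives.ReadInt64BigEndian(span.Slice(0, 8)),
            LongVal2 = BinaryPrimitives.ReadInt64BigEndian(span.Slice(8, 8)),
            LongVal = BinaryPrimitives.ReadInt64BigEndian(span.Slice(16, 8)),
            IntVal = BinaryPrimitives.ReadInt32BigEndian(span.Slice(24, 4)),
        };
    }

    static void Main()
    {
        var m = new Message { LongVal1 = 1, LongVal2 = 2, LongVal = 3, IntVal = 0x01020304 };
        byte[] wire = ToBigEndianBytes(m);
        Console.WriteLine(BitConverter.ToString(wire, 24, 4)); // 01-02-03-04
        Console.WriteLine(FromBigEndianBytes(wire).IntVal == m.IntVal);
    }
}
```

It is still fiddly per-field code, but it avoids the stream and the intermediate allocations.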
Also have a look at the answers to this similar question: Marshalling a big-endian byte collection into a struct in order to pull out values
Given the example on the BitConverter that shows converting a uint, I would suspect not.
Once again I'm receiving structs via UDP from a C++ program.
Now I have ported the structs to C#. Example:
[Serializable]
struct sample
{
public int @in; // "in" is a C# keyword, so it needs the @ prefix
public byte[] arr;
public int[] arr2;
public float fl;
}
Ok so how does the Deserializer know when one array ends and the other begins?
Can I specify somehow how big the array is?
I don't want to use fixed, since this makes my code unsafe, and I also can't use a Constructor since structs are not allowed to contain constructors without parameters.
Any suggestions?
//edit:
the arrays are known to be 32 and 4 long.
the problem is that I don't know how to pass this information to the deserialiser
The sender is C++ and works like this:
char* pr = (char*)&sample;
int i = 0;
while (i < sizeof(sample))
{
udp.send(*(pr + i));
i++;
}
Now that you have told us that the arrays are of pre-defined length, the following statement becomes clearer:
I don't know how to pass this information to the deserialiser
In fact, it becomes moot. There is no pre-defined serializer that is going to help you here. You have two options:
A: write your own serializer, and process the data now that you know the format - perhaps using BinaryReader:
using(var reader = new BinaryReader(source)) {
int @in = reader.ReadInt32();
byte[] arr = reader.ReadBytes(32);
int[] arr2 = new int[4];
for(int i = 0 ; i < 4 ; i++) arr2[i] = reader.ReadInt32();
float fl = reader.ReadSingle();
var obj = /* something involving ^^^ */
}
B: buffer 56 bytes, and use really nasty unsafe / fixed / pointer-banging code
I strongly suggest the first. In particular, this will also allow you to address endianness if required.
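A fuller sketch of option A, under some assumptions that you would need to check against the sender: the C++ struct has no padding (with int, char[32], int[4], float every field sits on a 4-byte boundary, which gives the 56 bytes mentioned above), and the sender is little-endian, which is what BinaryReader expects. Sample, SampleReader and Parse are hypothetical names.

```csharp
using System;
using System.IO;

// Hypothetical managed counterpart of the C++ struct.
class Sample
{
    public int In;
    public byte[] Arr;
    public int[] Arr2;
    public float Fl;
}

static class SampleReader
{
    public const int PacketSize = 4 + 32 + 4 * 4 + 4; // 56 bytes

    // Parses one 56-byte datagram into a Sample.
    public static Sample Parse(byte[] packet)
    {
        if (packet == null || packet.Length < PacketSize)
            throw new ArgumentException("Packet is too short.", nameof(packet));

        using (var reader = new BinaryReader(new MemoryStream(packet)))
        {
            var s = new Sample();
            s.In = reader.ReadInt32();
            s.Arr = reader.ReadBytes(32);   // fixed length known in advance
            s.Arr2 = new int[4];            // fixed length known in advance
            for (int i = 0; i < s.Arr2.Length; i++)
                s.Arr2[i] = reader.ReadInt32();
            s.Fl = reader.ReadSingle();
            return s;
        }
    }
}
```

A typical call would be Sample s = SampleReader.Parse(receivedBytes); where receivedBytes is whatever your UDP receive call returned.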
IN THE NAME OF EVERYTHING SACRED TO YOU, DO NOT DO THIS:
using System;
using System.Runtime.InteropServices;
[StructLayout(LayoutKind.Explicit)]
unsafe struct sample
{
[FieldOffset(0)] public int @in;
[FieldOffset(4)] public fixed byte arr[32];
[FieldOffset(36)] public fixed int arr2[4];
[FieldOffset(52)] public float fl;
}
static class Program
{
unsafe static void Main()
{
byte[] buffer = new byte[56];
new Random().NextBytes(buffer); // some data...
sample result;
fixed(byte* tmp = buffer)
{
sample* ptr = (sample*) tmp;
result = ptr[0];
}
Console.WriteLine(result.@in);
Console.WriteLine(result.fl);
}
}
For larger buffers, you can treat ptr as an unsafe array of multiple sample, accessed by index:
int @in = ptr[i].@in;
(etc)
But honestly... there are so many things "evil" with that, I honestly don't know where to begin... just... unless you know absolutely what every line in there is doing, have done it before, and understand all the traps... DON'T EVEN THINK ABOUT IT
It depends on the format used to pass the structure over the wire.
If say it's json, then each field will have a key and the array will be surrounded by [].
If say it's xml, then you would expect an arr node with child nodes.
If it's some arbitrary format, you need to know the format.
Deserializers have some default behavior but if the passed data is not in default format, you need to tell them exactly how to deserialize.
And how is the raw data documented? I would expect there to be something to tell you here, for example, I might expect it to say the format is (purely an example)
4 bytes NBO Int32 (in)
4 bytes NBO Int32 (length of arr)
len bytes (arr)
4 bytes NBO (length of arr2)
4 * len bytes (arr2, each in NBO Int32)
4 bytes IEEE-754 (fl)
You need to know the format.
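Purely to illustrate the made-up layout above (it is not the asker's actual wire format), a parser for it could look roughly like this. I am assuming the float is also sent in network byte order, which the layout does not say. LengthPrefixedParser and Parse are hypothetical names.

```csharp
using System;
using System.Buffers.Binary;

static class LengthPrefixedParser
{
    // Parses: Int32 in | Int32 arrLen | arr bytes | Int32 arr2Len | arr2 Int32s | float fl,
    // with every multi-byte value in network byte order (big-endian).
    public static (int In, byte[] Arr, int[] Arr2, float Fl) Parse(ReadOnlySpan<byte> data)
    {
        int pos = 0;

        int inValue = BinaryPrimitives.ReadInt32BigEndian(data.Slice(pos, 4));
        pos += 4;

        int arrLen = BinaryPrimitives.ReadInt32BigEndian(data.Slice(pos, 4));
        pos += 4;
        byte[] arr = data.Slice(pos, arrLen).ToArray();
        pos += arrLen;

        int arr2Len = BinaryPrimitives.ReadInt32BigEndian(data.Slice(pos, 4));
        pos += 4;
        int[] arr2 = new int[arr2Len];
        for (int i = 0; i < arr2Len; i++)
        {
            arr2[i] = BinaryPrimitives.ReadInt32BigEndian(data.Slice(pos, 4));
            pos += 4;
        }

        // Assumption: the IEEE-754 bits are big-endian too.
        int flBits = BinaryPrimitives.ReadInt32BigEndian(data.Slice(pos, 4));
        float fl = BitConverter.Int32BitsToSingle(flBits);

        return (inValue, arr, arr2, fl);
    }
}
```

Slice throws if a length prefix points past the end of the data, so a truncated or corrupt packet fails loudly instead of producing garbage.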
Edit: if the C++ arrays are of a known fixed length, then you simply need to know those lengths in advance.
Is there a way to read binary data from file into an array like in C where I can pass a pointer of any type to the i/o functions? I am thinking of something like BinaryReader::ReadBytes(), but that returns a byte[] which I cannot cast to the desired array pointer type.
If you have a fixed size struct
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi, Pack = 1)]
struct MyFixedStruct
{
//..
}
You can then read it in in one go using this:
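public static T ReadStruct<T>(Stream stream) where T : struct
{
int size = Marshal.SizeOf(typeof(T));
byte[] buffer = new byte[size];
// Stream.Read may return fewer bytes than requested, so loop until the buffer is full.
int read = 0;
while (read < size)
{
int n = stream.Read(buffer, read, size - read);
if (n == 0) throw new EndOfStreamException();
read += n;
}
GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
try { return (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T)); }
finally { handle.Free(); } // always unpin the buffer, even if marshaling throws
}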
This reads in a byte array covering the size of the struct and then marshals the byte array into the structure. You can use it like this:
MyFixedStruct fixedStruct = ReadStruct<MyFixedStruct>(stream);
The struct may include array types as long as the array length is specified, e.g.:
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi, Pack = 1)]
public struct MyFixedStruct
{
[MarshalAs(UnmanagedType.ByValArray, SizeConst = 5)]
public int[] someInts; // 5 int's
//..
};
Edit:
I see you just want to read in a short array - In this case just read in the byte array and use Buffer.BlockCopy() to convert to the array you want:
byte[] someBytes = ..
short[] someShorts = new short[someBytes.Length/2];
Buffer.BlockCopy(someBytes, 0, someShorts, 0, someBytes.Length);
This is quite efficient, equivalent to a memcpy in C++ under the hood. The only other overhead you have of course is that the original byte array will be allocated and later garbage collected. This approach would work for any other primitive array type as well.
How about storing a serialized array of your struct in the file? You can build the array of structs easily. I'm not sure how to stream through the file, though, as you would in C.
How about using the Stream class, as it provides a generic view of a sequence of bytes?