difference of two methods for converting byte[] to structure in c# - c#

I'm doing some conversions between some structures and thier byte[] representation. I found two way to do this but the difference (performance, memory and ...) is not clear to me.
Method 1:
public static T ByteArrayToStructure<T>(byte[] buffer)
{
int length = buffer.Length;
IntPtr i = Marshal.AllocHGlobal(length);
Marshal.Copy(buffer, 0, i, length);
T result = (T)Marshal.PtrToStructure(i, typeof(T));
Marshal.FreeHGlobal(i);
return result;
}
Method 2:
public static T Deserialize<T>(byte[] buffer)
{
BinaryFormatter formatter = new BinaryFormatter();
using (System.IO.MemoryStream stream = new System.IO.MemoryStream(buffer))
{
return (T)formatter.Deserialize(stream);
}
}
so which one is better and what is the major difference?

You are talking about two different approaches and two different types of data. If you are working with raw values converted to Byte Array, go for the first method. If you are dealing with values serialized into Byte Array (they also contains serialization data), go for the second method. Two different situations, two different methods... they are not, let me say, "synonyms".
Int32 Serialized into Byte[] -> Length 54
Int32 Converted to Byte[] -> Length 4

When using the BinaryFormatter to Serialize your data, it will append meta data in the output stream for use during Deserialization. So for the two examples you have, youll find it wont produce the same T output given the same byte[] input. So youll need to decide if you care about the meta data in the binary output or not. If you dont care, method 2 is obviously cleaner. If you need it to be straight binary, then you'll have to use something like method 1.

Related

In C#, is it possible to cast a byte array into another type without creating a new array?

I have a byte[] object that I'm using as a data buffer.
I want to "read" it as an array of a either primitive/non-primitive structs without duplicating the byte[] data in memory.
The goal would be something like:
byte[] myBuffer;
//Buffer is populated
int[] asInts = PixieDust_ToInt(myBuffer);
MyStruct[] asMyStructs = PixieDust_ToMyStruct(myBuffer);
Is this possible? If so, how?
Is it possible? Practically, yes!
Since .NET Core 2.1, MemoryMarshal lets us do this for spans. If you are satisfied with a span instead of an array, then yes.
var intSpan = MemoryMarshal.Cast<byte, int>(myByteArray.AsSpan());
The int span will contain byteCount / 4 integers.
As for custom structs... The documentation claims to require a "primitive type" on both sides of the conversion. However, you might try using a ref struct and see that is the actual constraint. I wouldn't be surprised if it worked!
Note that ref structs are still very limiting, but the limitation makes sense for the kind of reinterpret casts that we are talking about.
Edit: Wow, the constraint is much less strict. It requires any struct, rather than a primitive. It does not even have to be a ref struct. There is only a runtime check that will throw if your struct contains a reference type anywhere in its hierarchy. That makes sense. So this should work for your custom structs as well as it does for ints. Enjoy!
You will not be able to do this. To have a MyStruct[] you'll need to actually create such an array of that type and copy the data over. You could, in theory, create your own custom type that acted as a collection, but was actually just a facade over the byte[], copying the bytes out into the struct objects as a given value was accessed, but if you end up actually accessing all of the values, this would end up copying all of the same data eventually, it would just potentially allow you to defer it a bit and may be helpful if you only actually use a small number of the values.
Consider class System.BitConverter
This class has functions to reinterpret the bytes starting at a given index as an Int32, Int64, Double, Boolean, etc. and back from those types into a sequence of bytes.
Example:
int32 x = 0x12345678;
var xBytes = BitConverter.GetBytes(x);
// bytes is a byte array with length 4: 0x78; 0x56; 0x34; 0x12
var backToInt32 = BitConverter.ToInt32(xBytes, 0);
Or if your array contains mixed data:
double d = 3.1415;
int16 n = 42;
Bool b = true;
Uint64 u = 0xFEDCBA9876543210;
// to array of bytes:
var dBytes = BitConverter.GetBytes(d);
var nBytes = BitConverter.GetBytes(n);
var bBytes = BitConverter.GetBytes(b);
var uBytes = BitConterter.GetBytes(u);
Byte[] myBytes = dBytes.Concat(nBytes).Concat(bBytes).Concat(uBytes).ToArray();
// startIndexes in myBytes:
int startIndexD = 0;
int startIndexN = dBytes.Count();
int startIndexB = startIndexN + nBytes.Count();
int startIndexU = startIndexB + bBytes.Count();
// back to original elements
double dRestored = Bitconverter.ToDouble(myBytes, startIndexD);
int16 nRestored = BitConverter.ToInt16(myBytes, startIndexN);
bool bRestored = BitConverter.ToBool(myBytes, startIndexB);
Uint64 uRestored = BitConverter.ToUint64(myBytes, startIndexU);
The closest you will get in order to convert a byte[] to other base-types is
Byte[] b = GetByteArray();
using(BinaryReader r = new BinaryReader(new MemoryStream(b)))
{
r.ReadInt32();
r.ReadDouble();
r.Read...();
}
There is however no simple way to convert a byte[] to any kind of object[]

Strange unmarshalling behavior with union in C#

I want to export a C-like union into a byte array, like this :
[StructLayout(LayoutKind.Explicit)]
struct my_struct
{
[FieldOffset(0)]
public UInt32 my_uint;
[FieldOffset(0)]
public bool other_field;
}
public static void Main()
{
var test = new my_struct { my_uint = 0xDEADBEEF };
byte[] data = new byte[Marshal.SizeOf(test)];
IntPtr buffer = Marshal.AllocHGlobal(data.Length);
Marshal.StructureToPtr(test, buffer, false);
Marshal.Copy(buffer, data, 0, data.Length);
Marshal.FreeHGlobal(buffer);
foreach (byte b in data)
{
Console.Write("{0:X2} ", b);
}
Console.WriteLine();
}
The output we get (https://dotnetfiddle.net/gb1wRf) is 01 00 00 00 instead of the expected EF BE AD DE.
Now, what do we get if we change the other_field type to byte (for instance) ?
Oddly, we get the output we wanted in the first place, EF BE AD DE (https://dotnetfiddle.net/DnXyMP)
Moreover, if we swap the original two fields, we again get the same output we wanted (https://dotnetfiddle.net/ziSQ5W)
Why is this happening? Why would the order of the fields matter ? Is there a better (reliable) solution for doing the same thing ?
This is an inevitable side-effect of the way a structure is marshaled. Starting point is that the structure value is not blittable, a side-effect of it containing a bool. Which takes 1 byte of storage in the managed struct but 4 bytes in the marshaled struct (UnmanagedType.Bool).
So the struct value cannot just be copied in one fell swoop, the marshaller needs to convert each individual member. So the my_uint is first, producing 4 bytes. The other_field is next, also producing 4 bytes at the exact same address. Which overwrites everything that my_uint produced.
The bool type is an oddity in general, it never produces a blittable struct. Not even when you apply [MarshalAs(UnmanagedType.U1)]. Which in itself has an interesting effect on your test, you'll now see that the 3 upper bytes produced by my_int are preserved. But the result is still junk since the members are still converted one-by-one, now producing a single byte of value 0x01 at offset 0.
You can easily get what you want by declaring it as a byte instead, now the struct is blittable:
[StructLayout(LayoutKind.Explicit)]
struct my_struct {
[FieldOffset(0)]
public UInt32 my_uint;
[FieldOffset(0)]
private byte _other_field;
public bool other_field {
get { return _other_field != 0; }
set { _other_field = (byte)(value ? 1 : 0); }
}
}
I admit that I don't have an authoritative answer for why Marshal.StructureToPtr() behaves this way, other than that clear it is doing more than just copying bytes. Rather, it must be interpreting the struct itself, marshaling each field individually to the destination via the normal rules for interpreting that field. Since bool is defined to only ever be one of two values, the non-zero value gets mapped to true, which marshals to raw bytes as 0x00000001.
Note that if you really just want the raw bytes from the struct value, you can do the copying yourself instead of going through the Marshal class. For example:
var test = new my_struct { my_uint = 0xDEADBEEF };
byte[] data = new byte[Marshal.SizeOf(test)];
unsafe
{
byte* pb = (byte*)&test;
for (int i = 0; i < data.Length; i++)
{
data[i] = pb[i];
}
}
Console.WriteLine(string.Join(" ", data.Select(b => b.ToString("X2"))));
Of course, for that to work you will need to enable unsafe code for your project. You can either do that for the project in question, or build the above into a separate helper assembly where unsafe is less risky (i.e. where you don't mind enabling it for other code, and/or don't care about the assembly being verifiable, etc.).

Deserialize an array of struct

Once again I'm receiving structs via UDP from a C++ programm,
Now I ported the structs to C#, Example:
[Serializable]
struct sample
{
public int in;
public byte[] arr;
public int[] arr2;
public float fl;
}
Ok so how does the Deserializer know when one array ends and the other begins?
Can specify somehow how big the array is?
I don't want to use fixed, since this makes my code unsafe, and I also can't use a Constructor since structs are not allowed to contain constructors without parameters.
Any suggestions?
//edit:
the arrays are known to be 32 and 4 long.
the problem is that I don't know how to pass this information to the deserialiser
then sender is C++ an works like this:
char* pr = &sample;
int i=0;
while (i<sizeof(sample))
{
udp.send(*(pr+i))
i++;
}
Now that you have told us that the lengths are of pre-defined length, then the following statement becomes clearer:
I don't know how to pass this information to the deserialiser
In fact, it becomes moot. There is no pre-defined serializer that is going to help you here. You have two options:
A: write your own serializer, and process the data now that you know the format - perhaps using BinaryReader:
using(var reader = new BinaryReader(source)) {
int in = reader.ReadInt32();
byte[] arr = reader.ReadBytes(32);
int[] arr2 = new int[4];
for(int i = 0 ; i < 4 ; i++) arr2[i] = reader.ReadInt32();
float fl = reader.ReadSingle();
var obj = /* something involving ^^^ */
}
B: buffer 56 bytes, and use really nasty unsafe / fixed / pointer-banging code
I strongly suggest the first. In particular, this will also allow you to address endianness if required.
IN THE NAME OF EVERYTHING SACRED TO YOU, DO NOT DO THIS:
using System;
using System.Runtime.InteropServices;
[StructLayout(LayoutKind.Explicit)]
unsafe struct sample
{
[FieldOffset(0)] public int #in;
[FieldOffset(4)] public fixed byte arr[32];
[FieldOffset(36)] public fixed int arr2[4];
[FieldOffset(52)] public float fl;
}
static class Program
{
unsafe static void Main()
{
byte[] buffer = new byte[56];
new Random().NextBytes(buffer); // some data...
sample result;
fixed(byte* tmp = buffer)
{
sample* ptr = (sample*) tmp;
result = ptr[0];
}
Console.WriteLine(result.#in);
Console.WriteLine(result.fl);
}
}
For larger buffers, you can treat ptr as an unsafe array of multiple sample, accessed by index:
int #in = ptr[i].#in;
(etc)
But honestly... there are so many things "evil" with that, I honestly don't know where to begin... just... unless you know absolutely what every line in there is doing, have done it before, and understand all the traps... DON'T EVEN THINK ABOUT IT
It depends on the format used to pass the structure over the wire.
If say it's json, then each field will have a key and the array will be surrounded by [].
If say it's xml, then you would expect an arr node with child nodes.
If it's some arbitrary format, you need to know the format.
Deserializers have some default behavior but if the passed data is not in default format, you need to tell them exactly how to deserialize.
And how is the raw data documented? I would expect there to be something to tell you here, for example, I might expect it to say the format is (purely an example)
4 bytes NBO Int32 (in)
4 bytes NBO Int32 (length of arr)
len bytes (arr)
4 bytes NBO (length of arr2)
4 * len bytes (arr2, each in NBO Int32)
4 bytes IEEE-754 (fl)
You need to know the format.
Edit: if the C++ arrays are of a known fixed length, then you simply need to know those lengths in advance.

How to read byte blocks into struct

I have this resource file which I need to process, wich packs a set of files.
First, the resource file lists all the files contained within, plus some other data, such as in this struct:
struct FileEntry{
byte Value1;
char Filename[12];
byte Value2;
byte FileOffset[3];
float whatever;
}
So I would need to read blocks exactly this size.
I am using the Read function from FileStream, but how can I specify the size of the struct?
I used:
int sizeToRead = Marshal.SizeOf(typeof(Header));
and then pass this value to Read, but then I can only read a set of byte[] which I do not know how to convert into the specified values (well I do know how to get the single byte values... but not the rest of them).
Also I need to specify an unsafe context which I don't know whether it's correct or not...
It seems to me that reading byte streams is tougher than I thought in .NET :)
Thanks!
Assuming this is C#, I wouldn't create a struct as a FileEntry type. I would replace char[20] with strings and use a BinaryReader - http://msdn.microsoft.com/en-us/library/system.io.binaryreader.aspx to read individual fields. You must read the data in the same order as it was written.
Something like:
class FileEntry {
byte Value1;
char[] Filename;
byte Value2;
byte[] FileOffset;
float whatever;
}
using (var reader = new BinaryReader(File.OpenRead("path"))) {
var entry = new FileEntry {
Value1 = reader.ReadByte(),
Filename = reader.ReadChars(12) // would replace this with string
FileOffset = reader.ReadBytes(3),
whatever = reader.ReadFloat()
};
}
If you insist having a struct, you should make your struct immutable and create a constructor with arguments for each of your field.
If you can use unsafe code:
unsafe struct FileEntry{
byte Value1;
fixed char Filename[12];
byte Value2;
fixed byte FileOffset[3];
float whatever;
}
public unsafe FileEntry Get(byte[] src)
{
fixed(byte* pb = &src[0])
{
return *(FileEntry*)pb;
}
}
The fixed keyword embeds the array in the struct. Since it is fixed, this can cause GC issues if you are constantly creating these and never letting them go. Keep in mind that the constant sizes are the n*sizeof(t). So the Filename[12] is allocating 24 bytes (each char is 2 bytes unicode) and FileOffset[3] is allocating 3 bytes. This matters if you're not dealing with unicode data on disk. I would recommend changing it to a byte[] and converting the struct to a usable class where you can convert the string.
If you can't use unsafe, you can do the whole BinaryReader approach:
public unsafe FileEntry Get(Stream src)
{
FileEntry fe = new FileEntry();
var br = new BinaryReader(src);
fe.Value1 = br.ReadByte();
...
}
The unsafe way is nearly instant, far faster, especially when you're converting a lot of structs at once. The question is do you want to use unsafe. My recommendation is only use the unsafe method if you absolutely need the performance boost.
Base on this article, only I have made it generic, this is how to marshal the data directly to the struct. Very useful on longer data types.
public static T RawDataToObject<T>(byte[] rawData) where T : struct
{
var pinnedRawData = GCHandle.Alloc(rawData,
GCHandleType.Pinned);
try
{
// Get the address of the data array
var pinnedRawDataPtr = pinnedRawData.AddrOfPinnedObject();
// overlay the data type on top of the raw data
return (T) Marshal.PtrToStructure(pinnedRawDataPtr, typeof(T));
}
finally
{
// must explicitly release
pinnedRawData.Free();
}
}
Example Usage:
[StructLayout(LayoutKind.Sequential)]
public struct FileEntry
{
public readonly byte Value1;
//you may need to play around with this one
[MarshalAs(UnmanagedType.ByValArray, SizeConst = 12)]
public readonly string Filename;
public readonly byte Value2;
[MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
public readonly byte[] FileOffset;
public readonly float whatever;
}
private static void Main(string[] args)
{
byte[] data =;//from file stream or whatever;
//usage
FileEntry entry = RawDataToObject<FileEntry>(data);
}
Wrapping your FileStream with a BinaryReader will give you dedicated Read*() methods for primitive types:
http://msdn.microsoft.com/en-us/library/system.io.binaryreader.aspx
Out of my head, you could probably mark your struct with [StructLayout(LayoutKind.Sequential)] (to ensure proper representation in memory) and use a pointer in unsafe block to actually fill the struct C-style. Going unsafe is not recommended if you don't really need it (interop, heavy operations like image processing and so on) however.
Not a full answer (it's been covered I think), but a specific note on the filename:
The Char type is probably not a one-byte thing in C#, since .Net characters are unicode, meaning they support character values far beyond 255, so interpreting your filename data as Char[] array will give problems. So the first step is definitely to read that as Byte[12], not Char[12].
A straight conversion from byte array to char array is also not advised, though, since in binary indices like this, filenames that are shorter than the allowed 12 characters will probably be padded with '00' bytes, so a straight conversion will result in a string that's always 12 characters long and might end on these zero-characters.
However, simply trimming these zeroes off is not advised, since reading systems for such data usually simply read up to the first encountered zero, and the data behind that in the array might actually contain garbage if the writing system doesn't bother to specifically clear its buffer with zeroes before putting the string into it. It's something a lot of programs don't bother doing, since they assume the reading system will only interpret the string up to the first zero anyway.
So, assuming this is indeed such a typical zero-terminated (C-style) string, saved in a one-byte-per-character text encoding (like ASCII, DOS-437 or Win-1252), the second step is to cut off the string on the first zero. You can easily do this with Linq's TakeWhile function. Then the third and final step is to convert the resulting byte array to string with whatever that one-byte-per-character text encoding it's written with happens to be:
public String StringFromCStringArray(Byte[] readData, Encoding encoding)
{
return encoding.GetString(readData.TakeWhile(x => x != 0).ToArray());
}
As I said, the encoding will probably be something like pure ASCII, which can be accessed from Encoding.ASCII, standard US DOS encoding, which is Encoding.GetEncoding(437), or Windows-1252, the standard US / western Europe Windows text encoding, which you can retrieve with Encoding.GetEncoding("Windows-1252").

Reading from/Writing to Byte Arrays in C# .net 4

Greetings Overflowers,
I love the flexibility of memory mapped files in that you can read/write any value type.
Is there a way to do the same with byte arrays without having to copy them into for e.g. a memory map buffers ?
Regards
You can use the BitConverter class to convert between base data types and byte arrays.
You can read values directly from the array:
int value = BitConverter.ToInt32(data, pos);
To write data you convert it to a byte array, and copy it into the data:
BitConverter.GetBytes(value).CopyTo(data, pos);
You can bind a MemoryStream to a given byte array, set it's property Position to go to a specific position within the array, and then use a BinaryReader or BinaryWriter to read / write values of different types from/to it.
You are searching the MemoryStream class which can be initialised (without copying!) from a fixed-size byte array.
(Using unsafe code)
The following sample shows how to fill a 16 byte array with two long values, which is something BitConverter still can't do without an additional copy operation:
byte[] bar = new byte[16];
long lValue1 = 1;
long lValue2 = 2;
unsafe {
fixed (byte* bptr = &bar[0]) {
long* lptr = (long*)bptr;
*lptr = lValue1;
// pointer arithmetic: for a long* pointer '+1' adds 8 bytes.
*(lptr + 1) = lValue2;
}
}
Or you could make your own StoreBytes() method:
// here the dest offset is in bytes
public static void StoreBytes(long lValue, byte[] dest, int iDestOffset) {
unsafe {
fixed (byte* bptr = &dest[iDestOffset]) {
long* lptr = (long*)bptr;
*lptr = lValue;
}
}
}
Reading values from a byte array is no problem with BitConverter since you can specify the offset in .ToInt64.
Alternative : use Buffer.BlockCopy, which can convert between array types.

Categories

Resources