I am working on software that will exchange data with a Siemens PLC (industrial controller).
In order for it to work, I need to be able to serialize and deserialize a byte array containing only the current values of the variables.
The problem I am facing is that Serialize/Deserialize methods add a lot of information beyond the current value of a variable.
The instance of the following class:
[Serializable]
public class VarMap
{
public byte var1;
public Int64 var2;
public Int32 var3;
}
Once serialized, it needs to be a byte array containing one value after the other, each occupying its size in bytes:
[var1 byte 1][var2 byte 1][var2 byte 2][var2 byte 3][var2 byte 4][var3 byte 1][var3 byte 2].
Any ideas how to make this happen dynamically, according to the declaration of the class?
For complex reference types that refer to other reference types, there is not a simple way to flatten those out into a pure sequence of bytes and there is a reason serializers have to use more complex formats to handle such cases.
That said, the type you show is what is called a blittable type.
Blittable types only contain fields of other primitive types whose bit pattern can be directly copied.
By changing your class to a struct, you could generically persist such types to disk directly. This does require the use of an unsafe block, but the code is relatively simple.
There are ways to do the same thing without using unsafe code, but using unsafe is the most straight-forward since you can just take a pointer and use that to write directly to disk, the network, etc.
First, change to a struct:
[StructLayout(LayoutKind.Sequential)]
public struct VarMap
{
public byte var1;
public long var2;
public int var3;
}
Then, create one and write it directly to disk:
VarMap vm = new VarMap
{
var1 = 254,
var2 = 1234,
var3 = 5678
};
unsafe
{
VarMap* p = &vm;
int sz = sizeof(VarMap);
using FileStream fs = new FileStream(@"out.bin", FileMode.Create);
ReadOnlySpan<byte> span = new ReadOnlySpan<byte>(p, sz);
fs.Write(span);
}
However, if you look at the result in a hex viewer you will notice something:
24 bytes were written to disk, not the 13 that we might assume based on the types of the fields in your data structure.
This is because structures are padded in memory for alignment purposes.
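The padding can also be observed without a hex viewer. The following is a minimal sketch, not part of the original answer: the names VarMapPadded, VarMapPacked and LayoutSizes are made up for illustration. It compares the size reported by Marshal.SizeOf for the default sequential layout against a variant declared with Pack = 1, which asks the runtime not to insert padding between fields.

```csharp
using System;
using System.Runtime.InteropServices;

// Default sequential layout: each field is aligned, so padding is inserted.
[StructLayout(LayoutKind.Sequential)]
public struct VarMapPadded
{
    public byte var1;
    public long var2;
    public int var3;
}

// Hypothetical packed variant: Pack = 1 removes the padding between fields.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct VarMapPacked
{
    public byte var1;
    public long var2;
    public int var3;
}

public static class LayoutSizes
{
    // Size of the padded layout (platform dependent, larger than 13).
    public static int PaddedSize() => Marshal.SizeOf<VarMapPadded>();

    // Size of the packed layout: 1 + 8 + 4 bytes.
    public static int PackedSize() => Marshal.SizeOf<VarMapPacked>();
}
```

Packing only removes the padding. The bytes are still in the machine's native byte order, so this alone may not match what the device expects.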
Depending on what the device you are sending these bytes to actually expects (and endian-ness might also be an issue), this kind of technique may or may not work.
If you need absolute control, you probably want to hand-write the serialization code yourself rather than trying to find some generic mechanism to do this. It might be possible using reflection, but probably simpler to just convert your fields to the raw byte representation directly.
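As a rough illustration of the hand-written route, here is a minimal sketch that converts each field to its raw bytes in declaration order. The VarMapCodec class and its method names are hypothetical, and big-endian byte order is an assumption made for the example; check what your device actually expects.

```csharp
using System;
using System.Buffers.Binary;

public struct VarMap
{
    public byte var1;
    public long var2;
    public int var3;
}

// Hypothetical helper: writes and reads the fields one by one, with no padding.
public static class VarMapCodec
{
    public const int Size = 1 + 8 + 4; // 13 bytes in total

    public static byte[] ToBytes(VarMap vm)
    {
        byte[] buffer = new byte[Size];
        buffer[0] = vm.var1;
        // Assumption: the receiver wants big-endian values.
        BinaryPrimitives.WriteInt64BigEndian(buffer.AsSpan(1, 8), vm.var2);
        BinaryPrimitives.WriteInt32BigEndian(buffer.AsSpan(9, 4), vm.var3);
        return buffer;
    }

    public static VarMap FromBytes(ReadOnlySpan<byte> data)
    {
        return new VarMap
        {
            var1 = data[0],
            var2 = BinaryPrimitives.ReadInt64BigEndian(data.Slice(1, 8)),
            var3 = BinaryPrimitives.ReadInt32BigEndian(data.Slice(9, 4)),
        };
    }
}
```

The offsets are spelled out by hand, so the wire format stays fixed no matter how the runtime lays the struct out in memory. The price is that the codec has to be updated whenever a field is added.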
I'll try to sum up my initial problem before coming to the actual question of this topic, just for better understanding. If you don't want to read it, skip the summing-up section and go straight to the second section, where I actually explain the problem.
Summing up the problem
I'm emulating an MMORPG server of a game that already exists (just for study; I already know publishing it in any way is illegal) and I'm having a lot of trouble "translating" the raw packet buffer into some structure in code that I can use, to avoid having to reference the data in the packets by its offsets in the buffer.
I have some background in reversing and C++. This problem is easily solved in C++ by doing the following (consider 'packetBuffer' to be a 'char*').
MyStructureType* packet = (MyStructureType*)&packetBuffer[0];
The problem starts with the fact that C# offers much less freedom in user memory management. I can still use pointers, but there are a lot of things I can't do (e.g. consider a struct X that I use to represent the packet Y: if X needs to declare a fixed-size array of another structure, I'm in trouble even if all the types involved are blittable, because C# just doesn't allow this). So the solution I chose is to make these packet structures classes, define their layout (using attributes such as StructLayout and MarshalAs), and then use the methods Marshal.PtrToStructure and Marshal.StructureToPtr to convert the raw buffer (byte[]) into a high-level representation and vice versa. Now we come to my actual problem.
The Actual Question
Well, as stated above, I have POD classes that I use as a higher-level representation of packet data (byte[]). Imagine the packet represents a complex structure which has some nested custom structures. It's all fine until I have to declare a fixed-size array of a "marshalable" class. If I declare an array with 3 elements of class X inside class Y, I end up with 3 references (pointers) to the actual data of X, not the 3 elements "hard-coded" in the buffer of Y. Some code to clarify below.
[StructLayout(LayoutKind.Sequential)]
class X
{
int i1;
int i2;
}
[StructLayout(LayoutKind.Sequential)]
class Y
{
X _x1;
[MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
X[] _x2;
}
Instantiating Y, I end up with an object that has 20 bytes instead of 32 (8 bytes for _x1 + 12 bytes for 3 pointers (4 bytes each), one for each of the three elements of _x2).
So, finally the question: how can I make _x2 be hard-coded in Y instead of storing the pointers?
Thanks.
You are right that there is a dumb restriction to primitives in fixed-size arrays. I have no idea why they don't just test for blittability.
Anyway, if you know you have 3 Foos in your fixed-size array, why don't you just create a wrapper struct that contains Foo1, Foo2, and Foo3 and make it a member of your struct?
BTW I advise against using classes for this kind of work. Use structs - it's what they're there for. Classes have two machine words of extra metadata in front.
struct Root
{
//fixed Leaf[] Leaves[3]; -- not allowed by C#
Leaf3 Leaves; //this is okay though, for some reason
}
struct Leaf
{
int x1;
int x2;
}
struct Leaf3
{
Leaf First; //a member cannot have the same name as its enclosing type, so not "Leaf3"
Leaf Second;
Leaf Third;
}
First I would start by using a struct. Then I ask why _x2 can't be a struct as well. Something like this:
struct XArray
{
X _x1;
X _x2;
X _x3;
}
In C# you get no built-in indexing over the fields of such a struct, but you can still access the elements as needed.
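If index-style access is wanted, it can be added by hand. The following is a minimal sketch under the assumption that exactly three elements are needed. The indexer and the Length constant are additions for illustration, not part of the answer above, and the fields of X are made public here so the example can be exercised.

```csharp
using System;

public struct X
{
    public int i1;
    public int i2;
}

public struct XArray
{
    private X _x1;
    private X _x2;
    private X _x3;

    public const int Length = 3;

    // Hand-written indexer that maps an index onto the three inline fields.
    public X this[int index]
    {
        get
        {
            switch (index)
            {
                case 0: return _x1;
                case 1: return _x2;
                case 2: return _x3;
                default: throw new IndexOutOfRangeException();
            }
        }
        set
        {
            switch (index)
            {
                case 0: _x1 = value; break;
                case 1: _x2 = value; break;
                case 2: _x3 = value; break;
                default: throw new IndexOutOfRangeException();
            }
        }
    }
}
```

The elements are still stored inline in the struct, so the memory layout is unchanged. Only the access syntax differs.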
You could try using a union. This is basically the struct hack for C#: it lets you access your Leafs by taking a Leaf* to the Leaf field and then using array or pointer syntax as you please.
[StructLayout(LayoutKind.Explicit)]
unsafe struct Root
{
[FieldOffset(0)]
Leaf Leaf;
[FieldOffset(0)]
fixed byte packing[24]; //reserves size of three Leafs.
}
[StructLayout(LayoutKind.Sequential, Pack = 4)]
unsafe struct Leaf
{
int X1;
int X2;
}
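To make the access pattern concrete, here is a minimal sketch of the union idea. It assumes three Leafs of 8 bytes each, renames the overlaid field to First, and adds hypothetical Get and Set helpers. None of these names come from the answer, and the code needs unsafe code to be enabled in the project.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct Leaf
{
    public int X1;
    public int X2;
}

[StructLayout(LayoutKind.Explicit)]
public unsafe struct Root
{
    public const int Count = 3; // assumption: 3 Leafs * 8 bytes = 24 bytes

    [FieldOffset(0)]
    public Leaf First;

    [FieldOffset(0)]
    private fixed byte packing[24]; // reserves room for all three Leafs

    // Hypothetical helper: reads the Leaf at the given index through a pointer.
    public Leaf Get(int index)
    {
        if ((uint)index >= Count) throw new IndexOutOfRangeException();
        fixed (Leaf* p = &First)
        {
            return p[index];
        }
    }

    // Hypothetical helper: writes the Leaf at the given index through a pointer.
    public void Set(int index, Leaf value)
    {
        if ((uint)index >= Count) throw new IndexOutOfRangeException();
        fixed (Leaf* p = &First)
        {
            p[index] = value;
        }
    }
}
```

The range check matters here because the pointer arithmetic itself is unchecked: an index of 3 or more would read or write past the 24 reserved bytes.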
I have a binary serialized object in memory and I want to read it from memory by using pointers (unsafe code) in C#. Please look at the following function, which reads from a memory stream.
static Results ReadUsingPointers(byte[] data)
{
unsafe
{
fixed (byte* packet = &data[0])
{
return *(Results*)packet;
}
}
}
At the statement return *(Results*)packet; I get the compile-time error "Cannot take the address of, get the size of, or declare a pointer to a managed type Results".
Here is my structure
public struct Results
{
public int Id;
public int Score;
public char[] Product;
}
As per my understanding, all properties of my struct are blittable properties, so why am I getting this error, and what should I do if I need to use char[] in my structure?
EDIT-1
Let me explain further (please note that the objects are mocked)...
Background:
I have an array of Results objects, I serialized them using binary serialization. Now, at later stages of my program, I need to de-serialize my data in memory as quickly as possible as the data volume is very large. So I was trying to see how unsafe code can help me there.
Let's say my structure doesn't include public char[] Product; then I get my data back at reasonably good speed. But with char[] it gives me an error (as the compiler should). I am looking for a solution that works with char[] in this context.
MSDN says:
Any of the following types may be a pointer type:
sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double, decimal, or bool.
Any enum type.
Any pointer type.
Any user-defined struct type that contains fields of unmanaged types only.
So you could define your struct as follows to fix the compiler error:
public struct Results
{
public int Id;
public int Score;
// Don't actually do this though.
public unsafe char* Product;
}
This way, you can point to the first element of an array.
However, based on your edited question, you need a different approach here.
I have an array of Results objects, I serialized them using binary serialization. Now, at later stages of my program, I need to de-serialize my data in memory as quickly as possible
Usually you would use BinaryFormatter for that purpose. If that is too slow, the question should rather be if serialization can be avoided in the first place.
You cannot expect that to work.
public struct Results
{
public int Id;
public int Score;
public char[] Product;
}
The char[] array Product is a managed type. Your code attempts to use the type Results*. That is a pointer type. The documentation states that you can declare pointers to any of the following:
sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double, decimal, or bool.
Any enum type.
Any pointer type.
Any user-defined struct type that contains fields of unmanaged types only.
Now, your struct clearly matches none of the first three bullets. And does not match the final bullet either because the array is a managed type.
As per my understanding, all properties of my struct are blittable properties.
Yes that is true, but not relevant. You need the members of the struct to be more than blittable.
Even if your code would compile, how would you imagine that it could work? Consider this expression:
*(Results*)packet
How could the compiler turn that into something that would create a new array and copy the correct number of elements of the array? So clearly the compiler has no hope of doing anything useful here and that of course is why the language rejects your code.
I don't think that unsafe code is going to help you here. When you serialize your array you will have to serialize the length, and then the array's content. To deserialize you need to read the length, create a new array of that length, and then read the content. Unsafe code cannot help with that. A simple memory copy of a statically defined type is no use because that would imply that the array's length was known at compile time. It is not.
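The length-then-content scheme described above can be sketched as follows. This is only an illustration: the ResultsCodec class and the byte format it produces are assumptions made for the example, not the binary serialization used in the question.

```csharp
using System;
using System.IO;

public struct Results
{
    public int Id;
    public int Score;
    public char[] Product;
}

// Hypothetical codec: every array is written as its length followed by its content.
public static class ResultsCodec
{
    public static byte[] Serialize(Results[] items)
    {
        var ms = new MemoryStream();
        using (var writer = new BinaryWriter(ms))
        {
            writer.Write(items.Length);
            foreach (Results r in items)
            {
                char[] product = r.Product ?? Array.Empty<char>();
                writer.Write(r.Id);
                writer.Write(r.Score);
                writer.Write(product.Length);                        // length first
                foreach (char c in product) writer.Write((ushort)c); // then content
            }
        }
        return ms.ToArray(); // ToArray still works after the stream is closed
    }

    public static Results[] Deserialize(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            var items = new Results[reader.ReadInt32()];
            for (int i = 0; i < items.Length; i++)
            {
                items[i].Id = reader.ReadInt32();
                items[i].Score = reader.ReadInt32();
                // Read the length, allocate the array, then read the content.
                var product = new char[reader.ReadInt32()];
                for (int j = 0; j < product.Length; j++)
                    product[j] = (char)reader.ReadUInt16();
                items[i].Product = product;
            }
            return items;
        }
    }
}
```

Each array length is only known once it has been read from the stream, which is why a single pointer cast over the whole buffer cannot replace this loop.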
Regarding your update, you said:
I have an array of Results objects which I serialized using binary serialization.
In order to deserialize you need code that understands the detailed layout of your binary serialization. The code in the question cannot do it.
What you perhaps have not understood yet is that you cannot expect to copy arbitrary blocks of memory, whose lengths are variable and only known at runtime, without something actually knowing those lengths. In effect you are hoping to be able to copy memory without anything in the system knowing how much to copy.
Your attempts to deserialize using an unsafe typecast and memory copy cannot work. You cannot expect any more detailed help without consideration of the binary format of your serialization.
I have a C# class that contains an int[] array (and a couple of other fields, but the array is the main thing). The code often creates copies of this class and profiling shows that the Array.Copy() call to copy this array takes a lot of time. What can I do to make it faster?
The array size is very small and constant: 12 elements. So ideally I'd like something like a C-style array: a single block of memory that's inside the class itself (not a pointer). Is this possible in C#? (I can use unsafe code if needed.)
I've already tried:
1) Using a UInt64 and bit-shifting instead of the array. (The values of each element are also very small.) This does make the copy fast, but slows down the program overall.
2) Using separate fields for each array element: int element0, int element1, int element2, etc. Again, this is slower overall when I have to access the element at a given index.
I would check out System.Buffer.BlockCopy if you are really concerned about speed.
http://msdn.microsoft.com/en-us/library/system.buffer.blockcopy.aspx
Simple Example:
int[] a = new int[] {1,2,3,4,5,6,7,8};
int[] b = new int[a.Length];
int size = sizeof(int);
int length = a.Length * size;
System.Buffer.BlockCopy(a, 0, b, 0, length);
Great discussion on it over here: Array.Copy vs Buffer.BlockCopy
This post is old, but anyone in a similar situation as the OP should have a look at fixed size buffers in structs. They are exactly what OP was asking for: an array of primitive types with a constant size stored directly in the class.
You can create a struct to represent your collection, which will contain the fixed size buffer. The data will be stored directly within the struct, which will be stored directly within your class. You can copy through simple assignment.
They come with a few caveats:
They can only be used with primitive types.
They require the "unsafe" keyword on your struct.
Size must be known at compile time.
It used to be that you had to use the fixed keyword and pointers to access them, but recent changes to C# catering to performance programming have made that unnecessary. You can now work with them just like arrays.
public unsafe struct MyIntContainer
{
private fixed int myIntegers[12];
public int this[int index]
{
get => this.myIntegers[index];
set => this.myIntegers[index] = value;
}
}
There is no built-in bound checking, so it would be best for you to include that yourself on such a property, encapsulating any functionality which skips bound checks inside of a method. I am on mobile, or I would have worked that into my example.
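One way the bounds check could be worked into the indexer is sketched below. The Length constant and the choice of IndexOutOfRangeException are assumptions made for this example, and the project must allow unsafe code.

```csharp
using System;

public unsafe struct MyIntContainer
{
    public const int Length = 12;

    private fixed int myIntegers[Length];

    public int this[int index]
    {
        get
        {
            // The cast to uint rejects negative and too-large indexes in one comparison.
            if ((uint)index >= Length) throw new IndexOutOfRangeException();
            return myIntegers[index];
        }
        set
        {
            if ((uint)index >= Length) throw new IndexOutOfRangeException();
            myIntegers[index] = value;
        }
    }
}
```

Because the buffer lives inside the struct, a plain assignment such as var copy = original; copies all twelve values, and later changes to one do not affect the other.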
You asked about managed arrays. If you are content to use fixed / unsafe, this can be very fast.
struct is assignable, like any primitive. Almost certainly faster than Buffer.BlockCopy() or any other method, due to the lack of method call overhead:
public unsafe struct MyStruct //the actual struct used, contains all
{
public int a;
public unsafe fixed byte buffer[16];
public ulong b;
//etc.
}
public unsafe struct FixedSizeBufferWrapper //contains _only_ the buffer
{
public unsafe fixed byte buffer[16];
}
unsafe
{
fixed (byte* bufferA = myStructA.buffer, bufferB = myStructB.buffer)
{
*((FixedSizeBufferWrapper*)bufferA) =
*((FixedSizeBufferWrapper*)bufferB);
}
}
We cast fixed-size byte buffers from each of your original structs to the wrapper pointer type and dereference each pointer SO THAT we can assign one to the other by value; assigning fixed buffers directly is not possible, hence the wrapper, which is basically zero overhead (it just affects values used in pointer arithmetic that is done anyway). That wrapper is only ever used for casting.
We have to cast because (at least in my version of C#) we cannot assign anything other than a primitive type (usually byte[]) as the buffer, and we aren't allowed to cast inside fixed(...).
EDIT: This appears to get translated into a call to Buffer.Memcpy() (specifically Buffer.memcpy4() in my case, in Unity / Mono) under the hood to do the copy.
Okay, so I am continuing to work on my little game engine to teach myself more C#/C++. Now I am attempting to write a way of storing data in a binary format that I created. (This is for learning; I want to do this myself from scratch.) What I am wondering is: what is the best way of dealing with variable-length arrays inside a structure when reading it in C++?
E.g. here is what I currently have for my structures:
[StructLayout(LayoutKind.Sequential)]
public struct FooBinaryHeader
{
public Int32 m_CheckSumLength;
public byte[] m_Checksum;
public Int32 m_NumberOfRecords;
public FooBinaryRecordHeader[] m_BinaryRecordHeaders;
public FooBinaryRecord[] m_BinaryRecords;
}
[StructLayout(LayoutKind.Sequential)]
public struct FooBinaryRecordHeader
{
public Int32 m_FileNameLength;
public char[] m_FileName;
public Int64 m_Offset;
}
[StructLayout(LayoutKind.Sequential)]
public struct FooBinaryRecord
{
public bool m_IsEncrypted;
public Int64 m_DataSize;
public byte[] m_Data;
}
Now how would I go about actually reading this in as a structure in C++? I was kinda hoping to get around reading each of the elements one by one and copying them into a structure.
The only real tutorial I found on this is this: http://www.gamedev.net/community/forums/topic.asp?topic_id=310409&whichpage=1
I'll take a wild guess and say reading this into a C++ structure is not really possible, correct?
There's no such thing as a variable length array in a structure.
Suppose I had a structure point such as
struct point
{
int x;
int y;
}
If I wanted an array of 5 of these, the compiler would essentially reserve space for 10 ints. What happens if I ask for an array of structures, of which each contains a variable length array? There's no way to align those in memory since we can't know how much space to reserve for each one.
What you can do is to declare a pointer to the type of which you want a variable length array, because a pointer is a constant size. Then, you allocate enough memory for however many instances of that type, and point to it that way. You'll probably need to also add a length field to the struct so you know exactly how far you can go past the pointer before you risk segfaulting.
It might get a little hairy going back and forth between managed and unmanaged code and allocating and freeing memory, but that's another good exercise for learning C++ and C# together, if anything.
You can read it from the binary format by mapping a copy of these structures. Each array should be treated as a pointer, and you should have an integer holding the size of the array.
For example in
C#
[StructLayout(LayoutKind.Sequential)]
public struct A
{
public Int32 m_CheckSumLength;
public byte[] m_Checksum;
}
C++
struct A {
int length;
char* vector;
};
Notes: byte has the same size of char.
When you read from the binary data, you can read the first 4 bytes (an int is 32 bits, i.e. 4 bytes) and allocate 4 + (the length just read) bytes. After that you can read directly into the allocated buffer and treat it as an A structure.
Use Marshal.StructureToPtr and copy the length of the structure.
I have a structure that represents a wire format packet. In this structure is an array of other structures. I have generic code that handles this very nicely for most cases but this array of structures case is throwing the marshaller for a loop.
Unsafe code is a no go since I can't get a pointer to a struct with an array (argh!).
I can see from this codeproject article that there is a very nice, generic approach involving C++/CLI that goes something like...
public ref class Reader abstract sealed
{
public:
generic <typename T> where T : value class
static T Read(array<System::Byte>^ data)
{
T value;
pin_ptr<System::Byte> src = &data[0];
pin_ptr<T> dst = &value;
memcpy((void*)dst, (void*)src,
/*System::Runtime::InteropServices::Marshal::SizeOf(T::typeid)*/
sizeof(T));
return value;
}
};
Now if just had the structure -> byte array / writer version I'd be set! Thanks in advance!
Using memcpy to copy an array of bytes to a structure is extremely dangerous if you are not controlling the byte packing of the structure. It is safer to marshal and unmarshal a structure one field at a time. Of course, you will lose the generic feature of the sample code you have given.
To answer your real question though (and consider this pseudo code):
public ref class Writer abstract sealed
{
public:
generic <typename T> where T : value class
static array<System::Byte>^ Write(T value)
{
array<System::Byte>^ buffer = gcnew array<System::Byte>(sizeof(T));
pin_ptr<System::Byte> dst = &buffer[0];
pin_ptr<T> src = &value;
memcpy((void*)dst, (void*)src,
/*System::Runtime::InteropServices::Marshal::SizeOf(T::typeid)*/
sizeof(T));
return buffer;
}
};
This is probably not the right way to go. CLR is allowed to add padding, reorder the items and alter the way it's stored in memory.
If you want to do this, be sure to add [System.Runtime.InteropServices.StructLayout] attribute to force a specific memory layout for the structure. In general, I suggest you not to mess with memory layout of .NET types.
Unsafe code can be made to do this, actually. See my post on reading structs from disk: Reading arrays from files in C# without extra copy.
Not altering the structure is certainly sound advice. I use liberal amounts of StructLayout attributes to specify the packing, layout and character encoding. Everything flows just fine.
My issue is just that I need a performant and preferably generic solution. Performance because this is a server application and generic for elegance. If you look at the codeproject link you'll see that the StructureToPtr and PtrToStructure methods perform on the order of 20 times slower than a simple unsafe pointer cast. This is one of those areas where unsafe code is full of win. C# will only let you have pointers to primitives (and it's not generic - can't get a pointer to a generic), so that's why CLI.
Thanks for the pseudocode, grieve. I'll see if it gets the job done and report back.
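For readers on newer versions of .NET than this thread assumes, a generic reader and writer can also be sketched in plain C# with System.Runtime.InteropServices.MemoryMarshal. This is not the approach discussed above. The PacketHeader struct and the StructBytes class are hypothetical, and the technique only applies to structs without managed references, so an embedded array would have to be a fixed-size buffer or separate fields.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wire-format struct, for illustration only.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct PacketHeader
{
    public ushort Opcode;
    public int Length;
}

public static class StructBytes
{
    // Copies the in-memory bytes of any unmanaged struct into a new array.
    public static byte[] Write<T>(T value) where T : unmanaged
    {
        ReadOnlySpan<T> single = MemoryMarshal.CreateReadOnlySpan(ref value, 1);
        return MemoryMarshal.AsBytes(single).ToArray();
    }

    // Reinterprets the leading bytes of the buffer as a T.
    // Throws if the buffer is shorter than the struct.
    public static T Read<T>(ReadOnlySpan<byte> data) where T : unmanaged
    {
        return MemoryMarshal.Read<T>(data);
    }
}
```

Both methods copy the struct's in-memory representation as it is, so packing has to be controlled with StructLayout and the bytes come out in the machine's native byte order.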
Am I missing something? Why not create a new array of the same size and initialise each element separately in a loop?
Using an array of byte data is quite dangerous unless you are targeting one platform only... for example, your method doesn't consider differing endianness between the source and destination arrays.
Something I don't really understand about your question as well is why having an array as a member in your class is causing a problem. If the class comes from a .NET language you should have no issues, otherwise, you should be able to take the pointer in unsafe code and initialise a new array by going through the elements pointed at one by one (with unsafe code) and adding them to it.