Marshalling structs with non-aligned arrays


I get an exception when trying to marshal this structure:
[StructLayout(LayoutKind.Explicit, Pack = 1)]
public struct Data
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 4, ArraySubType = UnmanagedType.U1)]
    [FieldOffset(0x1)]
    public byte[] a2;
}
It says:
"Could not load type 'WTF.Data' from assembly 'WTF, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null' because it contains an object field at offset 1 that is incorrectly aligned or overlapped by a non-object field."
When I change the offset from 1 to 0 or 4, everything works.
What am I doing wrong?
Thanks

The [StructLayout] attribute affects both the managed and the marshaled layout of the struct. That is a bit of a quirk in .NET, but blittable structs are a big win for interop, and the CLR can't ignore the fact that managed code always runs on an entirely unmanaged operating system. Passing a pointer to the managed version of a struct instead of creating a copy is a major perf win.
Your [FieldOffset] value violates a very strong guarantee of the .NET memory model: object reference assignments are always atomic. In the managed layout the byte[] field is an object reference, whatever its [MarshalAs] attribute says. "Atomic" is an expensive word that means another thread can never observe an invalid object reference that is only partially updated. Atomicity requires proper alignment, to a multiple of 4 in 32-bit mode and of 8 in 64-bit mode. If a reference is misaligned, the processor may need multiple memory bus cycles to glue the bytes together. That's bad: it causes tearing when another thread is updating the variable at the same time, so a reader gets part of the pointer value from the old value and part from the new. What's left is a corrupted pointer that crashes the garbage collector. Very bad.
This is obscure stuff from the high-level point of view of C#, but it is very important for providing basic execution guarantees. You can't get the field misaligned to offset 1, and there is no workaround as long as you use LayoutKind.Explicit. So don't use it.

Please see Hans Passant's answer first - aligned data is a good thing, and the CLR enforces it for a reason. It does seem to be possible to "cheat" though, if you for some reason really need or want to:
[StructLayout( LayoutKind.Sequential, Pack = 1 )]
public struct Data
{
    public byte Dummy;
    [MarshalAs( UnmanagedType.ByValArray, SizeConst = 4,
        ArraySubType = UnmanagedType.U1 )]
    public byte[] a2;
}
It is also possible with unsafe code:
[StructLayout( LayoutKind.Explicit, Pack = 1 )]
public unsafe struct Data
{
    [FieldOffset( 1 )]
    public fixed byte a2[4];
}
But again, probably not a good idea.
Edit: A third option would be to simply make the array 5 bytes long and have it at offset 0, and then just ignore the first byte. This does seem safer.
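To check what the marshaller actually produces for the first and third options, a minimal sketch along these lines may help. The type names DataWithDummy and DataPadded and the class ArrayOffsetDemo are made up for illustration; only Marshal.OffsetOf and Marshal.SizeOf are real APIs.

```csharp
using System;
using System.Runtime.InteropServices;

// Option 1 from the answer: sequential layout with a leading dummy byte.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct DataWithDummy
{
    public byte Dummy;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 4, ArraySubType = UnmanagedType.U1)]
    public byte[] a2;
}

// Option 3 from the answer: a 5-byte array at offset 0; element 0 is the ignored byte.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct DataPadded
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 5, ArraySubType = UnmanagedType.U1)]
    public byte[] raw;
}

public static class ArrayOffsetDemo
{
    // Offset of a2 in the unmanaged (marshaled) layout, not the managed one.
    public static int OffsetOfA2()
    {
        return (int)Marshal.OffsetOf(typeof(DataWithDummy), "a2");
    }

    public static int SizeOfDummyVariant()
    {
        return Marshal.SizeOf(typeof(DataWithDummy));
    }

    public static int SizeOfPaddedVariant()
    {
        return Marshal.SizeOf(typeof(DataPadded));
    }

    public static void Main()
    {
        Console.WriteLine(OffsetOfA2());          // 1
        Console.WriteLine(SizeOfDummyVariant());  // 5
        Console.WriteLine(SizeOfPaddedVariant()); // 5
    }
}
```

Both variants give the native side the same 5 bytes; in managed memory the array reference stays pointer-aligned, which is what the CLR insists on.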

Related

Marshaling an array of boolean vs marshaling a single boolean (defined as int) to bool in C#

In a C API I have BOOL defined as follows:
#ifndef BOOL
#define BOOL int
#endif
And I have a struct which, among others, has a simple BOOL member and an array of BOOLs:
struct SomeStruct
{
    BOOL bIsSomething;
    BOOL bHasSomething[5];
};
I found out that when I want to marshal the whole struct I have to treat the two members differently: the single BOOL I marshal as I1, and the fixed-length array I have to marshal as I4 (if I don't, the struct sizes won't match and I will have problems extracting an array of these structs into C#):
[StructLayout(LayoutKind.Sequential)]
public struct SomeNativeStruct
{
    [MarshalAs(UnmanagedType.I1)]
    public bool bIsSomething;
    [MarshalAs(UnmanagedType.ByValArray, ArraySubType = UnmanagedType.I4, SizeConst = 5)]
    public bool[] bHasSomething;
}
I suspect I am doing something wrong, because I don't see why I should need to marshal the same type differently depending on whether I get it as a fixed-size array or as a single member.
If I marshal them all as I4, I get a System.ArgumentException:
An unhandled exception of type 'System.ArgumentException' occurred in SomeDll.dll
Additional information: Type 'Namespace.Document+SomeNativeStruct' cannot be marshaled as an unmanaged structure; no meaningful size or offset can be computed.

bool is a tricky type to interop. There are many mutually incompatible definitions of what a boolean value is, so bool is considered a non-blittable type - that is, it needs to be truly marshalled, rather than just sticking a "totally a bool" tag on the data. And arrays of non-blittable types are doubly tricky.
The simplest solution would be to avoid using bool entirely. Just replace the bool[] with int[], and provided the original type is actually a 32-bit int (which depends on the compiler and platform), you'll get correct interop. You can then manually copy the interop struct to a managed struct with a saner layout, if you so choose - which also gives you full control over which int values are interpreted as true and false, respectively.
In general, native interop is always tricky; you need a good understanding of the actual memory layout as well as the meaning of the values and types you're dealing with. The types aren't enough - they're too ambiguous, especially in standard C (which is often the standard for native interop even today). Headers aren't enough - you also need the docs, and perhaps even a look in a (native) debugger.
Extra danger comes from the fact that there's no safety net that tells you you're doing things somewhat wrong - the wrong interop approach can appear to work just fine for years, and then suddenly blow up in your face when e.g. a true value happens to be 42 instead of the more usual -1, and your bitwise arithmetic breaks subtly (this can actually happen in C#, if you use unsafe code). Everything might work great for values smaller than 32768, and then break horribly. There are plenty of hard-to-catch error cases, so you need extra caution.
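A minimal sketch of the int-based approach, assuming BOOL really is a 32-bit int on the native side. SomeManagedStruct, FromNative and BoolInteropDemo are hypothetical names, and the unmanaged buffer here only stands in for memory that would normally come from the C API.

```csharp
using System;
using System.Linq;
using System.Runtime.InteropServices;

// Interop view: every BOOL is declared as a plain 32-bit int, so nothing needs converting.
[StructLayout(LayoutKind.Sequential)]
public struct SomeNativeStruct
{
    public int bIsSomething;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 5)]
    public int[] bHasSomething;
}

// Hypothetical managed view with real bools; the conversion rule is explicit.
public sealed class SomeManagedStruct
{
    public bool IsSomething;
    public bool[] HasSomething;

    public static SomeManagedStruct FromNative(SomeNativeStruct n)
    {
        return new SomeManagedStruct
        {
            IsSomething = n.bIsSomething != 0,                           // any non-zero value is true
            HasSomething = n.bHasSomething.Select(v => v != 0).ToArray()
        };
    }
}

public static class BoolInteropDemo
{
    // Copies the struct to unmanaged memory and reads it back, as a stand-in for a native call.
    public static SomeNativeStruct RoundTrip(SomeNativeStruct input)
    {
        IntPtr buffer = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(SomeNativeStruct)));
        try
        {
            Marshal.StructureToPtr(input, buffer, false);
            return (SomeNativeStruct)Marshal.PtrToStructure(buffer, typeof(SomeNativeStruct));
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }

    public static void Main()
    {
        var native = new SomeNativeStruct { bIsSomething = 1, bHasSomething = new[] { 0, 42, -1, 0, 1 } };
        SomeManagedStruct managed = SomeManagedStruct.FromNative(RoundTrip(native));
        Console.WriteLine(Marshal.SizeOf(typeof(SomeNativeStruct)));  // 24 = 6 * sizeof(int)
        Console.WriteLine(string.Join(",", managed.HasSomething));    // False,True,True,False,True
    }
}
```

Keeping the interop struct and the managed type separate means the "42 is also true" decision lives in one visible place instead of inside the marshaller.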

Marshaling complex nested structures containing booleans

I need to do complex marshaling of several nested structures that contain variable-length arrays of other structures, hence I decided to use ICustomMarshaler (see JaredPar's good tutorial here). But then I have a problem with a struct defined in C++ as:
struct AStruct {
    int32_t a;
    AType* b;
    int32_t bLength;
    bool aBoolean;
    bool bBoolean;
};
On the C# side, in the MarshalManagedToNative implementation of ICustomMarshaler I was using:
Marshal.WriteByte(intPtr, offset, Convert.ToByte(aBoolean));
offset += 1;
Marshal.WriteByte(intPtr, offset, Convert.ToByte(bBoolean));
But it was not working, because I discovered that each bool in the C++ struct seemed to take 2 bytes. Indeed, on x86 sizeof(AStruct) = 16, not 14. OK, bool is not guaranteed to take 1 byte, so I tried unsigned char and uint8_t instead, but the size is still 16.
Now, I know I could use an int32 instead of a boolean, but I care about the space taken, and several structs containing booleans flow to disk (I use the HDF5 file format and want to map those booleans to H5T_NATIVE_UINT8, defined in the HDF5 library, which takes 1 byte). Is there another way? I mean, can I have something inside a struct that is guaranteed to take 1 byte?
EDIT
The same problem also applies to int16 values: depending on how many values are present, alignment may make the final size of the struct different from what I expect. On the C# side I do not "see" the C++ struct; I simply write to unmanaged memory by following the definition of my structs in C++. It is quite a simple process, but if I instead have to think about the real space taken by the struct (either by guessing or by measuring it), it becomes more difficult and error-prone every time I modify the struct.

This answer is in addition to what Hans Passant has said.
It might be easiest to have your structures use a fixed packing size, so you can readily predict the member layout. Keep in mind, though, that this could affect performance.
The rest of this answer is specific to Microsoft Visual C++, but most compilers offer their own variant of this.
To get you started, check out the SO answer "#pragma pack effect" and MSDN: http://msdn.microsoft.com/en-us/library/2e70t5y1.aspx
What you often use is the idiom of a #pragma pack(push, n) followed by a #pragma pack(pop), which only affects packing for the structures defined between the two pragmas:
#pragma pack(push, 4)
struct someStructure
{
    char a;
    int b;
    ...
};
#pragma pack(pop)
This gives someStructure a predictable layout in which each member is aligned to at most 4 bytes.
EDIT: From the MSDN page on packing:
The alignment of a member will be on a boundary that is either a multiple of n
or a multiple of the size of the member, whichever is smaller.
So for pack(4) a char will be aligned on a 1-byte boundary, a short on a 2-byte boundary, and the rest on a 4-byte boundary.
Which value is best depends on your situation. You'll need to explicitly pack all structures you intend to access, and probably all structures that are members of structures you want to access.
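On the C# side the counterpart of #pragma pack is the Pack field of [StructLayout], and the "smaller of n and the member size" rule can be observed with Marshal.OffsetOf. The sketch below is only an illustration; the struct and class names are invented, and the field set (byte, int, short) is chosen so that each Pack value gives a different layout.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Pack1 { public byte a; public int b; public short s; }

[StructLayout(LayoutKind.Sequential, Pack = 2)]
public struct Pack2 { public byte a; public int b; public short s; }

[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct Pack4 { public byte a; public int b; public short s; }

public static class PackDemo
{
    // Returns "offset(a),offset(b),offset(s),size" for the marshaled layout of t.
    public static string Describe(Type t)
    {
        return string.Join(",",
            (int)Marshal.OffsetOf(t, "a"),
            (int)Marshal.OffsetOf(t, "b"),
            (int)Marshal.OffsetOf(t, "s"),
            Marshal.SizeOf(t));
    }

    public static void Main()
    {
        Console.WriteLine(Describe(typeof(Pack1))); // 0,1,5,7
        Console.WriteLine(Describe(typeof(Pack2))); // 0,2,6,8
        Console.WriteLine(Describe(typeof(Pack4))); // 0,4,8,12
    }
}
```

With Pack = 4 the short still sits on a 2-byte boundary (offset 8), and the total size is rounded up to a multiple of 4 so that the int stays aligned in arrays of the struct.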

sizeof(AStruct) = 16, not 14
That's correct. The struct has two extra, unused bytes at the end. They ensure that, if you put the struct in an array, the fields in each element are still properly aligned. In 32-bit mode, the int32_t and AType* members take 4 bytes each and should be aligned to a multiple of 4 so the processor can access them quickly. That can only be achieved if the structure size is a multiple of 4. Thus 14 is rounded up to 16.
Do keep in mind that this does not mean that the bool fields take 2 bytes. A C++ compiler uses just 1 byte for each of them. The extra 2 bytes are pure padding.
If you use Marshal.SizeOf(typeof(AStruct)) in your C# program, you'll discover that the struct you declared takes 20 bytes. That is not good, and it is the problem you are trying to fix. The bool members are the cause, an issue that goes way, way back to early versions of the C language, which did not have a bool type. The default marshaling that the CLR uses is compatible with BOOL, the typedef in the winapi, which is a 32-bit type.
So you have to be explicit about it when you declare the struct in your C# code: you have to tell the marshaller that you want the 1-byte type. You do that by declaring the struct member as byte, or by overriding the default marshaling:
[StructLayout(LayoutKind.Sequential)]
private struct AStruct {
    public int a;
    public IntPtr b;
    public int bLength;
    [MarshalAs(UnmanagedType.U1)]
    public bool aBoolean;
    [MarshalAs(UnmanagedType.U1)]
    public bool bBoolean;
}
And you'll see that Marshal.SizeOf() now returns 16. Do be aware that you have to force your program into 32-bit mode: make sure that the EXE project's Platform Target setting is x86.
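The effect of the default BOOL marshaling can be seen without depending on the platform target by leaving the pointer member out. This is a reduced sketch, not the asker's struct; DefaultBools, ByteBools and BoolSizeDemo are illustrative names.

```csharp
using System;
using System.Runtime.InteropServices;

// Default marshaling: each bool becomes a 4-byte winapi BOOL.
[StructLayout(LayoutKind.Sequential)]
public struct DefaultBools
{
    public int a;
    public bool aBoolean;
    public bool bBoolean;
}

// Overridden marshaling: each bool becomes a single byte, as in the C++ struct.
[StructLayout(LayoutKind.Sequential)]
public struct ByteBools
{
    public int a;
    [MarshalAs(UnmanagedType.U1)]
    public bool aBoolean;
    [MarshalAs(UnmanagedType.U1)]
    public bool bBoolean;
}

public static class BoolSizeDemo
{
    public static int OffsetOfSecondBool(Type t)
    {
        return (int)Marshal.OffsetOf(t, "bBoolean");
    }

    public static void Main()
    {
        Console.WriteLine(Marshal.SizeOf(typeof(DefaultBools)));      // 12 = 4 + 4 + 4
        Console.WriteLine(Marshal.SizeOf(typeof(ByteBools)));         // 8  = 4 + 1 + 1 + 2 bytes padding
        Console.WriteLine(OffsetOfSecondBool(typeof(DefaultBools)));  // 8
        Console.WriteLine(OffsetOfSecondBool(typeof(ByteBools)));     // 5
    }
}
```

The 2 bytes of tail padding in ByteBools are the same padding that turns 14 into 16 in the original AStruct.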

Marshaling a struct that has a variable array with zero elements

.NET 4, 64-bit. I have a C# structure that I intend to marshal to C:
[StructLayout(LayoutKind.Sequential)]
public struct ParentStruct
{
    public float[] FArray;
    public int FArrayLength;
}
to:
struct ParentStruct
{
    float* FArray;
    int FArrayLength;
};
The special circumstance here is that the array I need to copy, float[] FArray, is always pinned and has 0 elements. I am only interested in copying its pointer across to native code, not in the elements it holds (there are none!) and not in allocating memory on the native free store (heap). It will be pointing to a garbage location, and that is fine.
The technical reason for doing this is that float[] FArray points to an address in GPU memory, and once it is marshaled to the GPU it will be pointing to the right data.
I want to be able to marshal this struct to C, but I am not sure what the proper way to marshal it is.
I tried marshaling the structure as it is and got "Object contains non-primitive or non-blittable data."
I tried adding [MarshalAs(UnmanagedType.LPArray)] and got "Type 'Test.ParentStruct' cannot be marshaled as an unmanaged structure; no meaningful size or offset can be computed."
By the way, I do understand why I am getting these errors. What I really want is a way to marshal the struct given that my array is not a variable-length array (although it looks like one to the interop libraries), so you can think of my struct as having a fixed size.
N.B. I have to use a float[] rather than a uint or something like that because of IntelliSense and other constraints.

64 Bit P/Invoke Idiosyncrasy

I am trying to properly marshal some structs for a P/Invoke call, but am finding strange behavior when testing on a 64-bit OS.
I have a struct defined as:
/// <summary>http://msdn.microsoft.com/en-us/library/aa366870(v=VS.85).aspx</summary>
[StructLayout(LayoutKind.Sequential)]
private struct MIB_IPNETTABLE
{
    [MarshalAs(UnmanagedType.U4)]
    public UInt32 dwNumEntries;
    public IntPtr table; //MIB_IPNETROW[]
}
Now, to get the address of the table, I would like to do a Marshal.OffsetOf() call like so:
IntPtr offset = Marshal.OffsetOf(typeof(MIB_IPNETTABLE), "table");
This should be 4. I have dumped the bytes of the buffer to confirm this, and I also replaced the above call with a hard-coded 4 in my pointer arithmetic, which yielded correct results.
I do get the expected 4 if I instantiate MIB_IPNETTABLE and perform the following call:
IntPtr offset = (IntPtr)Marshal.SizeOf(ipNetTable.dwNumEntries);
Now, in a sequential struct the offset of a field should be the sum of the sizes of the preceding fields, correct? Or is it the case that for an unmanaged structure the offset really is 8 (on an x64 system), and becomes 4 only after some marshalling magic? Is there a way to get the OffsetOf() call to give me the correct offset? I can limp along using calls to SizeOf(), but OffsetOf() is simpler for larger structs.

In a 64-bit C/C++ build the offset of your table field would be 8 due to alignment requirements (unless you forced it otherwise). I suspect that the CLR is doing the same to you:
http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.layoutkind.aspx
The members of the object are laid out sequentially, in the order in which they appear when
exported to unmanaged memory. The members are laid out according to the packing specified in StructLayoutAttribute.Pack, and can be noncontiguous.
You may want to set StructLayoutAttribute.Pack, or use LayoutKind.Explicit along with the FieldOffset attribute on each field, if you need that level of control.
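A small sketch of what the answer describes, using two illustrative structs (TableDefault and TablePack4 are made-up names) that differ only in their Pack value. One caveat that is my reading of the Windows headers rather than something stated above: the native MIB_IPNETTABLE appears to declare table as an inline array of MIB_IPNETROW, not as a pointer, which would explain why offset 4 matched the dumped buffer.

```csharp
using System;
using System.Runtime.InteropServices;

// Default packing: the pointer-sized field is aligned to its own size.
[StructLayout(LayoutKind.Sequential)]
public struct TableDefault
{
    public uint dwNumEntries;
    public IntPtr table;
}

// Pack = 4 caps member alignment at 4 bytes, also in a 64-bit process.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct TablePack4
{
    public uint dwNumEntries;
    public IntPtr table;
}

public static class OffsetDemo
{
    public static int TableOffset(Type t)
    {
        return (int)Marshal.OffsetOf(t, "table");
    }

    public static void Main()
    {
        // Prints 4 in a 32-bit process and 8 in a 64-bit process.
        Console.WriteLine(TableOffset(typeof(TableDefault)));
        // Prints 4 in both.
        Console.WriteLine(TableOffset(typeof(TablePack4)));
    }
}
```

So OffsetOf() is reporting the real marshaled offset; the sum of the preceding field sizes only equals the offset when no alignment padding is inserted.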

TypeLoadException on x64 but is fine on x86 with structlayouts

You'll need a 64-bit machine if you want to see the actual exception. I've created some dummy classes that reproduce the problem.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public class InnerType
{
    char make;
    char model;
    UInt16 series;
}
[StructLayout(LayoutKind.Explicit)]
public class OutterType
{
    [FieldOffset(0)]
    char blah;
    [FieldOffset(1)]
    char blah2;
    [FieldOffset(2)]
    UInt16 blah3;
    [FieldOffset(4)]
    InnerType details;
}
class Program
{
    static void Main(string[] args)
    {
        var t = new OutterType();
        Console.ReadLine();
    }
}
If I run this on the 64-bit CLR, I receive a type load exception:
System.TypeLoadException was unhandled
Message="Could not load type 'Sample.OutterType' from assembly 'Sample, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null' because it contains an object field at offset 4 that is incorrectly aligned or overlapped by a non-object field."
If I force the target CPU to 32-bit, it works fine.
Also, if I change InnerType from a class to a struct, it works. Can someone explain what's going on or what I am doing wrong?
Thanks

The part about overlapping types is misleading here. The problem is that in .NET, reference-type fields must always be aligned on pointer-size boundaries. Your union works on x86 because the field offset is 4 bytes, which is the pointer size on a 32-bit system, but fails on x64 because there the offset must be a multiple of 8. The same thing would happen if you set the offset to 3 or 5 on the x86 platform.
EDIT: For the doubters - I couldn't find a ready reference on the internet, but check out Expert .NET 2.0 IL Assembler by Serge Lidin, page 175.
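Following that rule, one way to make the type load on both platforms is to move the reference field to an offset that is a multiple of both 4 and 8. This is a sketch, not the asker's final code: OuterType and AlignmentDemo are illustrative names, the fields are made public to keep the example warning-free, and the first three offsets are kept as in the question.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public class InnerType
{
    public char make;
    public char model;
    public ushort series;
}

[StructLayout(LayoutKind.Explicit)]
public class OuterType
{
    [FieldOffset(0)] public char blah;
    [FieldOffset(1)] public char blah2;
    [FieldOffset(2)] public ushort blah3;
    // 8 is a multiple of the pointer size on both x86 (4) and x64 (8),
    // and the reference no longer shares bytes with any non-object field.
    [FieldOffset(8)] public InnerType details;
}

public static class AlignmentDemo
{
    // The rule from the answer: a reference field must sit on a pointer-size boundary.
    public static bool IsValidReferenceOffset(int offset)
    {
        return offset % IntPtr.Size == 0;
    }

    public static void Main()
    {
        var t = new OuterType { details = new InnerType() }; // loads without TypeLoadException
        Console.WriteLine(t.details != null);                // True
        Console.WriteLine(IsValidReferenceOffset(8));        // True on x86 and x64
    }
}
```

The price is a few bytes of unused space between blah3 and details in a 32-bit process.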

I have also noticed that you are packing your char fields into 1 byte each. Char types in .NET are 2 bytes in size. I cannot verify whether this is the actual issue, but I would double-check it.

If you want to place structs within other structs that are themselves LayoutKind.Explicit, you should use an explicit Size value (in bytes) if you expect them to work in different bitness modes (or on machines with different packing requirements).
What you are saying there is "lay things out sequentially and don't pack internally, but use as much space as you like at the end".
If you do not specify Size, the runtime is free to add as much space as it likes.
The reason it generally refuses to let structs and object types overlap is that the GC routine must be free to traverse the live object graph. While doing this it cannot know whether a unioned (overlapping) field is meaningful as an object reference or as raw bits (say an int or a float). Since it must traverse all live object references to behave correctly, it would end up traversing 'random' bits, which might point anywhere in the heap (or out of it), as if they were references, and before you know it you're General Protection Faulting.
Since references take up 32 or 64 bits depending on the runtime, you must use Explicit, only union references with references and value types with value types, ensure your reference types are aligned to the boundaries of both target platforms if they differ (note: runtime dependent, see below), and do one of the following:
Ensure that all reference fields are the last entry in the struct; the runtime is then free to make the struct bigger or smaller depending on its bitness.
Force all object references to consume 64 bits whether you are on a 32-bit or 64-bit environment.
Note on alignment:
Apologies, I was in error on the unaligned reference fields: the compiler removed the type load unless I performed some action with the struct.
[StructLayout(LayoutKind.Explicit)]
public struct Foo
{
    [FieldOffset(0)]
    public byte padding;
    [FieldOffset(1)]
    public string InvalidReference;
}
public static void RunSnippet()
{
    Foo foo;
    foo.padding = 0;
    foo.InvalidReference = "blah";
    // Console.WriteLine(foo); // uncomment this to fail
}
The relevant details are in the ECMA specification http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-335.pdf; see section 16.6.2, which mandates the alignment of native-size values, including &. It notes that the unaligned prefix instruction exists to work around this if required.
On Mono, however (both OS X Intel and Win32 Intel 32-bit), the above code works. Either the runtime does not respect the layouts and silently 'corrects' things, or it allows arbitrary alignment (historically Mono was less flexible than the MS runtime in this regard, which is surprising).
The CLI intermediate form generated by Mono does not include any .unaligned instruction prefixes, so it appears not to conform to the spec.
That'll teach me to only check on Mono.

I struggled with the same problem and hated that I couldn't find a clear reference on this topic on MSDN. After reading the answer here, I started concentrating on the x86 and x64 differences in .NET and found the following: Migrating 32-bit Managed Code to 64-bit. It clearly states that pointers are 4 bytes on x86 and 8 bytes on x64. I hope it is helpful for others.
There are, by the way, many related questions here on Stack Overflow. I will add two of them that mention other interesting things:
Incorrectly-aligned/non-object field in struct
Incorrectly aligned or overlapped by a non-object field error

Maybe something goes wrong with the UInt16 because it is not CLS-compliant (see here: http://msdn.microsoft.com/en-us/library/system.uint16.aspx)
