MarshalAs attribute case study

When should we use this attribute, and why do we need it? For example, if a native C function takes a pointer to unsigned char as a parameter, and I know it expects an array of unsigned chars to fill, why can't I simply pass a byte array from C#? Is marshalling necessary?

The runtime will be able to automatically determine how to marshal data between native and managed code in most cases, so you generally don't need to specify the attribute. MarshalAs is only necessary when there is an ambiguity in the definition (and you want to tell the runtime precisely how to marshal the data) or if you require non-default behaviour.
In my experience, MarshalAs is only really required when working with strings, since native code has so many different representations: Unicode or ANSI, null-terminated C strings or not, and so on.

Another use of the MarshalAs attribute is marshalling fixed-size arrays embedded in structures: UnmanagedType.ByValArray for arrays and UnmanagedType.ByValTStr for fixed-size strings, each with the SizeConst parameter. For example, many structures in the Windows API contain fixed-size strings.
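A minimal sketch of the fixed-size case. The DeviceInfo struct and its fields are hypothetical and do not come from any real API; the point is only that SizeConst tells the marshaler how many bytes each inline member occupies on the native side.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical native layout, for illustration only:
//   struct DeviceInfo { char name[32]; unsigned char mac[6]; };
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public struct DeviceInfo
{
    // Inline, fixed-length character buffer (not a pointer to a string).
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
    public string Name;

    // Inline, fixed-length byte array (not a pointer to an array).
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 6)]
    public byte[] Mac;
}

public static class FixedSizeDemo
{
    // Size of the unmanaged representation: 32 + 6 bytes.
    public static int NativeSize() => Marshal.SizeOf<DeviceInfo>();
}
```

Without the attributes, the marshaler has no way to know the buffer lengths for these two members, because a managed string or byte[] carries no fixed size in its type.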

From the Microsoft documentation on type marshaling:
Marshaling is the process of transforming types when they need to cross between managed and native code. Marshaling is needed because
the types in the managed and unmanaged code are different. In managed
code, for instance, you have a String, while in the unmanaged world
strings can be Unicode ("wide"), non-Unicode, null-terminated, ASCII,
etc. By default, the P/Invoke subsystem tries to do the right thing
based on the default behavior, described on this article. However, for
those situations where you need extra control, you can employ the
MarshalAs attribute to specify what is the expected type on the
unmanaged side.
Generally, the runtime tries to do the "right thing" when marshaling
to require the least amount of work from you.
Which types need special handling is explained in the documentation on Blittable and Non-Blittable Types:
Most data types have a common representation in both managed and
unmanaged memory and do not require special handling by the interop
marshaler. These types are called blittable types because they do not
require conversion when passed between managed and unmanaged code.
Non-blittable types are the answer to your question. The marshaler has to convert the following types, and these are where MarshalAs can come into play:
Array, Boolean, char, class, object, string, value type (structure), delegates, and unmanaged arrays that are either COM-style safe arrays or C-style arrays with fixed or variable length.
Unmanaged structures can also contain embedded arrays or Booleans (non-blittable types). Here you have to be careful, according to the documentation:
Structures that are returned from platform invoke calls must be
blittable types. Platform invoke does not support non-blittable
structures as return types.
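One way to see the difference is to compare a type's managed size with the size the marshaler uses for it by default. This is only a sketch; the BlittableDemo class name is made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

public static class BlittableDemo
{
    // int is blittable: same size and layout on both sides.
    public static int ManagedIntSize => sizeof(int);
    public static int NativeIntSize => Marshal.SizeOf<int>();

    // bool is non-blittable: 1 byte managed, marshaled by default as a 4-byte Win32 BOOL.
    public static int ManagedBoolSize => sizeof(bool);
    public static int NativeBoolSize => Marshal.SizeOf<bool>();

    // char is non-blittable: 2 bytes (UTF-16) managed, 1 byte under the default ANSI marshaling.
    public static int ManagedCharSize => sizeof(char);
    public static int NativeCharSize => Marshal.SizeOf<char>();
}
```

When the two sizes differ, the marshaler has to convert the value rather than copy it, and MarshalAs is how you override the conversion it picks by default.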

Related

C#/CIL: type of native int

I am writing some tools to help validate IL that is emitted at runtime. A part of this validation involves maintaining a Stack<Type> as OpCodes are emitted so that future OpCodes that utilize these stack elements can be validated as using the proper types. I am confused as to how to handle the ldind.i opcode, however.
The Microsoft documentation states:
The ldind.i instruction indirectly loads a native int value from the
specified address (of type native int, &, or *) onto the stack as a
native int.
In C#, native int is not defined, and I am confused as to what type most accurately represents this data. How can I determine what its size is, and which C# type should be used to represent it? I am concerned it will vary by system hardware.

To my mind, you'd be better off looking at how the VES is defined and using a dedicated enum to model the types on the stack rather than C# visible types. Otherwise you're in for a rude surprise when we get to the floating point type.
From MS Partition I.pdf [1], Section 12.1:
The CLI model uses an evaluation stack [...] However, the CLI supports only a subset of these types in its operations upon values stored on its evaluation stack—int32, int64, and native int. In addition, the CLI supports an internal data type to represent floating-point values on the internal evaluation stack. The size of the internal data type is implementation-dependent.
So those, as well as things like references, are the things you should track, and I'd recommend you do that with an explicit model of the VES stack using its terms.
[1] ECMA C# and Common Language Infrastructure Standards
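A rough sketch of what such an explicit model could look like. The VesStackType enum and the VesModel helper are hypothetical names, and only the ldind family is covered; a real validator would need every opcode.

```csharp
using System;
using System.Reflection.Emit;

// Evaluation-stack categories in the VES's own terms, not C# types.
public enum VesStackType
{
    Int32,
    Int64,
    NativeInt,
    Float,          // the internal floating-point type ("F")
    ObjectRef,      // object reference ("O")
    ManagedPointer, // "&"
    ValueType
}

public static class VesModel
{
    // What a given ldind.* instruction leaves on the stack.
    public static VesStackType ResultOfLoadIndirect(OpCode op)
    {
        if (op == OpCodes.Ldind_I) return VesStackType.NativeInt;
        if (op == OpCodes.Ldind_I8) return VesStackType.Int64;
        if (op == OpCodes.Ldind_R4 || op == OpCodes.Ldind_R8) return VesStackType.Float;
        if (op == OpCodes.Ldind_Ref) return VesStackType.ObjectRef;
        // Small integer loads are widened to int32 on the stack.
        if (op == OpCodes.Ldind_I1 || op == OpCodes.Ldind_U1 ||
            op == OpCodes.Ldind_I2 || op == OpCodes.Ldind_U2 ||
            op == OpCodes.Ldind_I4 || op == OpCodes.Ldind_U4)
            return VesStackType.Int32;
        throw new ArgumentException("Not an ldind opcode: " + op.Name);
    }

    // native int is pointer-sized; IntPtr is its closest C#-visible counterpart.
    public static int NativeIntSizeInBytes => IntPtr.Size;
}
```

Tracking NativeInt as its own category avoids having to decide up front whether it is 4 or 8 bytes; IntPtr.Size answers that at run time for the current process.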

Where are pointers used in C#?

I am pretty new to the C# environment; I have been coding in C++ for a while now. Where exactly would one use pointers in C#? Is it advisable to use pointers in C#?

You can use pointers, but only in "unsafe" mode. You will rarely need them in normal C# coding.
All classes inherit from Object and are called "Reference" types. A variable of such a type holds a reference, which is essentially a pointer under the hood, and hides details like addresses from the programmer.
These are your "pointers", but you never delete them because managed objects are garbage collected, and you access them as if you had the object itself (no -> operator).
Types like int, double, and char are "Value" types and are passed by value (just like in C++). You can also pass these by reference, but you have to use the ref keyword in the function signature and when calling it.
User-defined struct types are the other case of "Value" types (see MSDN).
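The passing rules above can be sketched in safe code, with no pointers involved. Box and PassingDemo are made-up names for illustration.

```csharp
using System;

// A reference type: variables of type Box hold a reference to the object.
public sealed class Box
{
    public int Value;
}

public static class PassingDemo
{
    // Value type passed by value: the caller's variable is untouched.
    public static void Increment(int x) { x++; }

    // Value type passed by reference: the caller's variable is modified.
    public static void IncrementRef(ref int x) { x++; }

    // Reference type: the method sees the same object as the caller.
    public static void Mutate(Box b) { b.Value++; }
}
```

Calling Increment(n) leaves n unchanged, IncrementRef(ref n) changes it, and Mutate(box) changes the object the caller also sees, which covers most cases where a C++ programmer would reach for a pointer.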

The only place that you can actually use pointers in C# is in unsafe regions of code which is generally frowned upon. The rest of the time, the runtime will manage memory for you.

Create COM interface returning a pointer that is marshalled as IntPtr in C#

I want to declare a COM Interface in MIDL that allows for returning a pointer (like in the ID3D11Blob). I understand that pointers are a special thing in COM because of the stubs generated for RPC calls. I do not need RPC, but only want to access the COM server from C#. The question is: can I declare the interface in such a way that the C# stub returns an IntPtr? I have tried to add [local] to enable void pointers, but that does not suffice.
The interface should look in MIDL like
[local] void *PeekData(void)
and in C# like
IntPtr PeekData()
Is this possible? If so, how?
Thanks in advance,
Christoph
Edit: To rephrase the question: Why is
HRESULT GetData([in, out, size_is(*size)] BYTE data[], [in, out] ULONG *size);
becoming
void GetData(ref byte, ref uint)
and how can I avoid the first parameter becoming a single byte in C#?

This goes wrong because you imported the COM server declarations from a type library. Type libraries were originally designed to support a subset of COM originally called "OLE Automation", which restricts the kinds of types you can use for method arguments. In particular, raw pointers are not permitted. An array must be declared as a SAFEARRAY, which ensures that the caller can always index the array safely: safe arrays carry extra metadata that describes the rank and the lower/upper bounds of the array.
The [size_is] attribute is only understood by MIDL, it is used to create the proxy and the stub for the interface. Knowing how many elements the array contains is also important when it needs to be copied into an interop packet that's sent on the wire to the stub.
Since type libraries don't support a declaration like this, the [size_is] attribute is stripped and the type library importer only sees BYTE*. That is ambiguous: it can be a byte passed by reference or a pointer to an array of bytes. The importer chooses the former since it has no hope of making an array work; it doesn't know the size of the array. So you get ref byte.
To fix this issue, you have to alter the interop library so you can provide the proper declaration of the method. That requires the [MarshalAs] attribute to declare the byte[] argument an LPArray, with the SizeParamIndex property set to tell the CLR that the array size is determined by the size argument. There are two basic ways to go about it:
1. Decompile the interop library with ildasm.exe, modify the .il file and put it back together with ilasm.exe. You'd use a sample C# declaration that you look at with ildasm.exe to know how to edit the IL. This is the approach that Microsoft recommends.
2. Use a good decompiler that can decompile IL back to C#. Reflector and ILSpy are popular. Copy/paste the generated code into a source file of your project and edit the method, applying the [MarshalAs] attribute. The advantage is that editing is easier and you no longer depend on the interop library.
In either case, you want to make sure that the COM server is stable so you don't have to do this very often. If it is not then modifying the server itself is highly recommended, use a safe array.
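For the second approach, the edited declaration might look roughly like the sketch below. The interface name IDataSource and the all-zero GUID are placeholders, not the real server's values, and whether the marshaler honours a by-reference size parameter in your scenario is something to verify against the actual COM server.

```csharp
using System;
using System.Runtime.InteropServices;

// Placeholder interface: substitute the real name and IID of the COM interface.
[ComImport]
[Guid("00000000-0000-0000-0000-000000000000")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IDataSource
{
    // MIDL: HRESULT GetData([in, out, size_is(*size)] BYTE data[], [in, out] ULONG *size);
    // SizeParamIndex = 1 points at the zero-based index of the 'size' parameter.
    void GetData(
        [In, Out, MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)] byte[] data,
        ref uint size);
}
```

The first parameter is now a byte[] instead of ref byte, so the caller allocates a buffer, passes its length in size, and reads back the number of bytes written.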

I think I found the solution on http://msdn.microsoft.com/en-gb/library/z6cfh6e6(v=vs.110).aspx#cpcondefaultmarshalingforarraysanchor2: This is the default behaviour for C-style arrays. One can avoid that by using SAFEARRAYs.

What do HRESULT, DWORD, and HANDLE mean in unmanaged code?

I was reading about marshaling, and I'm confused about what these mean in unmanaged code:
HRESULT, DWORD, and HANDLE.
The original text is:
You already know that there is no such compatibility between managed and unmanaged environments. In other words, .NET does not contain such the types HRESULT, DWORD, and HANDLE that exist in the realm of unmanaged code. Therefore, you need to find a .NET substitute or create your own if needed. That is what called marshaling.

short answer:
it is just telling you that you must "map" one data type used in one programming language to another data type used in a different programming language, and the data types must match.
quick answer:
For this one, the details may not be correct, but the concept is.
These are a few of the data types defined in the Windows header files for C/C++. They are typedefs that abstract the primitive data types of C/C++ into more meaningful names used in Windows programming. For instance, DWORD is always a 32-bit unsigned integer, on both 32-bit and 64-bit Windows, while pointer-sized types such as HANDLE and DWORD_PTR are 32 bits wide in a 32-bit process and 64 bits wide in a 64-bit process. The idea is to provide an abstraction layer between the data types the platform needs and the data types used by the language.
During marshalling, this "dword" will be converted to the CLR data type you specify in the DllImport declaration. This is an important point.
Let's say you want to call a Windows API method that takes a DWORD parameter. When declaring this call in C# using DllImport, you must specify the parameter data type as System.UInt32. If you don't, "bad things will happen".
For example, suppose you mistakenly specify the parameter data type as System.UInt64. When the actual call is made, the stack can become corrupt because more bytes are placed on the stack than the API call expects. This can lead to completely unexpected behavior, such as crashing the application, crashing Windows, invalid return values, or whatever.
That is why it is important to specify the correct data type.
data types in question:
DWORD is a 32-bit unsigned integer, the CLR type System.UInt32.
HANDLE maps to the CLR types IntPtr, UIntPtr, or HandleRef.
HRESULT is System.Int32 or System.UInt32.
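As a sketch of how these mappings show up in declarations: GetTickCount and GetCurrentProcess are real kernel32 functions, while the WinTypes class and its Failed helper are illustrative names. The extern methods only resolve on Windows, and only when they are actually called.

```csharp
using System;
using System.Runtime.InteropServices;

public static class WinTypes
{
    // C: DWORD GetTickCount(void);   DWORD -> uint (System.UInt32)
    [DllImport("kernel32.dll")]
    public static extern uint GetTickCount();

    // C: HANDLE GetCurrentProcess(void);   HANDLE -> IntPtr (pointer-sized)
    [DllImport("kernel32.dll")]
    public static extern IntPtr GetCurrentProcess();

    // HRESULT -> int (System.Int32); the top bit signals failure.
    public const int S_OK = 0;
    public const int E_FAIL = unchecked((int)0x80004005);

    // Mirrors the FAILED() macro from the Windows headers.
    public static bool Failed(int hr) => hr < 0;
}
```

Declaring HRESULT as a signed int makes the failure test a simple comparison with zero, which is why Int32 is usually the more convenient of the two choices.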
References:
Using P/Invoke to Call Unmanaged APIs from Your Managed Classes at http://msdn.microsoft.com/en-us/library/aa719104(v=vs.71).aspx has a table listing the Windows data type with its corresponding CLR data type that specifically answers your question.
Windows Data Types (Windows) at http://msdn.microsoft.com/en-us/library/aa383751(v=VS.85).aspx
.NET Column: Calling Win32 DLLs in C# with P/Invoke at http://msdn.microsoft.com/en-us/magazine/cc164123.aspx

HRESULT: http://en.wikipedia.org/wiki/HRESULT
In the field of computer programming, the HRESULT is a data type used
in Windows operating systems, and the earlier IBM/Microsoft OS/2
Operating system, used to represent error conditions, and warning
conditions. The original purpose of HRESULTs was to formally lay out
ranges of error codes for both public and Microsoft internal use in
order to prevent collisions between error codes in different
subsystems of the OS/2 Operating System. HRESULTs are numerical error
codes. Various bits within an HRESULT encode information about the
nature of the error code, and where it came from. HRESULT error codes
are most commonly encountered in COM programming, where they form the
basis for a standardized COM error handling convention.
DWORD: http://en.wikipedia.org/wiki/DWORD#Size_families
HANDLE: http://en.wikipedia.org/wiki/Handle_(computing)
In computer programming, a handle is an abstract reference to a
resource. Handles are used when application software references blocks
of memory or objects managed by another system, such as a database or
an operating system. While a pointer literally contains the address of
the item to which it refers, a handle is an abstraction of a reference
which is managed externally; its opacity allows the referent to be
relocated in memory by the system without invalidating the handle,
which is impossible with pointers. The extra layer of indirection also
increases the control the managing system has over operations
performed on the referent. Typically the handle is an index or a
pointer into a global array of tombstones.

HRESULT, DWORD, and HANDLE are typedefs (i.e., they represent plain data types) defined by Microsoft for use by programmers compiling unmanaged code in a Windows environment. They are defined in a C (or C++) header file provided by Microsoft that is typically included automatically in unmanaged Windows projects created within Microsoft Visual Studio.

How to marshal a list in C#

I have to send a list from C# to C++. The C# list is List<string> MyList, and the C++ code accepts it as list<wstring> cppList. How do I use MarshalAs for this?
Thanks

It is always wiser to avoid marshaling complex types between native and managed code.
In the case of List, the two types differ completely, as they have different memory layouts for their items.
So the best way is to write a utility function in a native DLL that accepts an array of strings (char*), builds your native list manually, and then calls the desired method. It is easy to create a wrapper for that utility function.
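The managed side of such a wrapper might look like the sketch below. NativeUtil.dll and SetStrings are hypothetical and stand for whatever utility function you write; this version assumes Windows and passes wide strings, since the target container holds wstring.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class NativeListBridge
{
    // Hypothetical native export, which would build the std::list<std::wstring> itself:
    //   extern "C" void SetStrings(const wchar_t** items, int count);
    [DllImport("NativeUtil.dll", CharSet = CharSet.Unicode)]
    private static extern void SetStrings(
        [In, MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.LPWStr)] string[] items,
        int count);

    // List<string> has no native equivalent, so flatten it to a plain array first.
    public static string[] Flatten(List<string> list) => list.ToArray();

    public static void Send(List<string> list)
    {
        string[] items = Flatten(list);
        SetStrings(items, items.Length);
    }
}
```

The native function only sees an array of null-terminated wide strings plus a count, which keeps both containers' memory layouts out of the interop boundary.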

C# cannot P/Invoke complex C++ types. You will have to use C++/CLI, they might have a method for marshalling it across. Else, you will have to marshal each string across individually.

Strings in C# are UTF-16 (2-byte Unicode) strings, like wstring on Windows, so, if what you say is true, no special conversion of the string contents is necessary.
