When should I explicitly specify a StructLayout?


I'm fiddling with calling DLLs from C#, and came across the need to define my own structs. Lots of articles force a sequential layout for the struct with
[StructLayout(LayoutKind.Sequential)]
struct Foo ...
So, I followed suit, and my programme worked. Now, when I took the line out, it still worked. Why do I need it?

The internal layout of a managed struct is undocumented and undiscoverable. Implementation details like member order and packing are intentionally hidden. With the [StructLayout] attribute, you force the P/Invoke marshaller to impose a specific layout and packing.
That the default just happens to match what you need to get your code to work is merely an accident. Although not an uncommon one. Note the Type.StructLayoutAttribute property.
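A minimal sketch of what the attribute controls, assuming a hypothetical two-field struct (Foo, PackedFoo, Flag and Value are illustrative names, not from the question). Marshal.SizeOf and Marshal.OffsetOf report the layout the marshaller will use for unmanaged code, not the internal managed layout:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical struct: fields are marshalled in declaration order with
// default packing, so Value is aligned to a 4-byte boundary.
[StructLayout(LayoutKind.Sequential)]
struct Foo
{
    public byte Flag;
    public int Value;
}

// Same fields, but Pack = 1 removes the padding after Flag.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct PackedFoo
{
    public byte Flag;
    public int Value;
}

static class LayoutDemo
{
    static void Main()
    {
        Console.WriteLine(Marshal.SizeOf<Foo>());                // 8
        Console.WriteLine(Marshal.OffsetOf<Foo>("Value"));       // 4
        Console.WriteLine(Marshal.SizeOf<PackedFoo>());          // 5
        Console.WriteLine(Marshal.OffsetOf<PackedFoo>("Value")); // 1
    }
}
```

If the native side expects the packed form, only the second declaration matches it, which is why relying on the default is fragile.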

Interesting point. I'm sure I had code that failed until I put in an explicit LayoutKind.Sequential; however, I have confirmed that Sequential is the default for structures even in .NET 1.1.
Note that the VB Reference for Structure implies (at Remarks > Behaviour > Memory Consumption) that you do need to specify StructLayout to confirm the memory layout, but the documentation for StructLayoutAttribute states that Sequential is the default for structures in Microsoft compilers.
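One way to check the compiler default yourself is to inspect the type's metadata. This is a small sketch with hypothetical types (PlainStruct and PlainClass carry no attribute at all); it relies on the C# compiler's behaviour of emitting sequential layout for structs and auto layout for classes:

```csharp
using System;

// Neither type has a [StructLayout] attribute.
struct PlainStruct
{
    public int A;
    public byte B;
}

class PlainClass
{
    public int A;
    public byte B;
}

static class DefaultLayoutDemo
{
    static void Main()
    {
        // The C# compiler marks structs as sequential by default.
        Console.WriteLine(typeof(PlainStruct).IsLayoutSequential); // True

        // Classes default to auto layout instead.
        Console.WriteLine(typeof(PlainClass).IsLayoutSequential);  // False
        Console.WriteLine(typeof(PlainClass).IsAutoLayout);        // True
    }
}
```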

I am not entirely sure, but it may affect binary serialization: it might write out the fields in order with no naming or ordering information (resulting in a smaller file), but that is a complete guess.

Related

Why are so many simple types in the .Net framework not marked as serializable?

I find it a recurring inconvenience that a lot of simple types in the .Net framework are not marked as serializable. For example: System.Drawing.Point or Rectangle.
Both those structs only consist of primitive data and should be serializable in any format easily. However, because of the missing [System.Serializable] attribute, I can't use them with a BinaryFormatter.
Is there any reason for this, which I'm not seeing?

It is simply a question of efficiency. When a type is tagged as serializable, the compiler must map each of its fields onto a table of aliases. If all types were marked as serializable, every object containing or inheriting from them would also need to be mapped onto the table of aliases to process its serialization, even though you will probably never use it. That has a cost in memory and processing, and it is less safe. Test it with millions of elements and you will see.

Personally, I believe it has less to do with the need to pass the buck, and more to do with the fact of usefulness and actual use, coupled with the fact that the .NET Framework is simply that, a framework. It is designed to be a stepping stone which provides you the basics to complete tasks that would otherwise be daunting in other languages, rather than do everything for you.
There really isn't anything stopping you from creating your own serialization mechanisms and extensions which provide the functionality you're seeking, or relying on many of the other products out there which are FOSS or paid which achieve this for you OOB.
Granted, @Hans Passant's answer is, I think, very close to the truth, but there are a lot of other facets to this which go beyond simply "It's not my problem." You can take it whatever way you want, but the ultimate thing you need to get out of it is, "where can I go from here?"

If attributes are only constructed when they are reflected into, why are attribute constructors so limited?

As shown here, attribute constructors are not called until you reflect to get the attribute values. However, as you may also know, you can only pass compile-time constant values to attribute constructors. Why is this? I think many people would much prefer to do something like this:
[MyAttribute(new MyClass(foo, bar, baz, jQuery))]
than pass a string containing those values (causing stringly typed code too!) and then rely on Regex to try to recover each value instead of just using the actual value. Instead of compile-time warnings/errors, you end up depending on exceptions that might be thrown somewhere that has nothing to do with the class, except that a method it called uses some attributes that were typed wrong.
What limitation caused this?

Attributes are part of metadata. You need to be able to reflect on metadata in an assembly without running code in that assembly.
Imagine for example that you are writing a compiler that needs to read attributes from an assembly in order to compile some source code. Do you really want the code in the referenced assembly to be loaded and executed? Do you want to put a requirement on compiler writers that they write compilers that can run arbitrary code in referenced assemblies during the compilation? Code that might crash, or go into infinite loops, or contact databases that the developer doesn't have permission to talk to? The number of awful scenarios is huge and we eliminate all of them by requiring that attributes be dead simple.

The issue is with the constructor arguments. They need to come from somewhere, they are not supplied by code that consumes the attribute. They must be supplied by the Reflection plumbing when it creates the attribute object by calling its constructor. For which it needs the constructor argument values.
This starts at compile time with the compiler parsing the attribute and recording the constructor arguments. It stores those argument values in the assembly metadata in a binary format. At issue then is that the runtime needs a highly standardized way to deserialize those values, one that preferably doesn't depend on any of the .NET classes that you'd normally use to de/serialize data. There's no guarantee that such classes are actually available at runtime; they won't be in a very trimmed version of .NET like the Micro Framework.
Even something as common as binary serialization with the BinaryFormatter class is troublesome, note how it requires the [Serializable] attribute on the class to allow it to do its job. Versioning would also be an enormous problem, clearly such a serializer class could never change for the risk of breaking attributes in old assemblies.
This is a rock and a hard place, solved by the CLS designers by heavily restricting the allowed types for an attribute constructor. They didn't leave much: just the simple value types, string, Type, and a simple one-dimensional array of them. Deserializing them is never a problem since their binary representation is simple and unambiguous. Quite a restriction, but attributes can still be pretty expressive. The ultimate fallback is to use a string and decode that string in the constructor at runtime. Creating an object of MyClass isn't an issue; you can do so in the attribute constructor. However, you'll have to encode the arguments that this constructor needs as properties of the attribute.
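A minimal sketch of that workaround, using a hypothetical ApiInfoAttribute (the attribute, its parameters, and the Widget class are invented for illustration). The arguments are restricted to a string, a Type, and an int array; the richer Version object is built inside the constructor, which only runs once reflection asks for the attribute:

```csharp
using System;

[AttributeUsage(AttributeTargets.Class)]
sealed class ApiInfoAttribute : Attribute
{
    // Counts constructor runs so the demo can show when instantiation happens.
    public static int ConstructorCalls;

    public ApiInfoAttribute(string version, Type owner, params int[] codes)
    {
        ConstructorCalls++;
        // The complex object is created here, at reflection time,
        // from the simple value stored in metadata.
        ParsedVersion = new Version(version);
        Owner = owner;
        Codes = codes;
    }

    public Version ParsedVersion { get; }
    public Type Owner { get; }
    public int[] Codes { get; }
}

[ApiInfo("1.2", typeof(string), 404, 500)]
class Widget { }

static class AttributeDemo
{
    static void Main()
    {
        // Nothing has reflected on Widget yet, so no attribute object exists.
        Console.WriteLine(ApiInfoAttribute.ConstructorCalls);  // 0

        var info = (ApiInfoAttribute)Attribute.GetCustomAttribute(
            typeof(Widget), typeof(ApiInfoAttribute));

        Console.WriteLine(ApiInfoAttribute.ConstructorCalls);  // 1
        Console.WriteLine(info.ParsedVersion.Major);           // 1
        Console.WriteLine(info.Codes.Length);                  // 2
    }
}
```

A malformed version string here would still only fail at reflection time, which is the trade-off the question complains about.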

Probably the most correct answer as to why you can only use constants for attributes is that the C#/BCL design team did not judge supporting anything else important enough to be added (i.e. not worth the effort).
When you build, the C# compiler records the attributes you have placed in your code and serializes their constructor arguments and property values, so that they can be stored in the generated assembly. It was probably more important to ensure that attributes can be retrieved quickly and reliably than it was to support more complex scenarios.
Also, code that fails because some attribute property value is wrong is much easier to debug than some framework-internal deserialization error. Consider what would happen if the class definition for MyClass was defined in an external assembly - you compile and embed one version, then update the class definition for MyClass and run your application: boom!
On the other hand, it's seriously frustrating that DateTime instances are not constants.

What limitation caused this?
The reason it isn't possible to do what you describe is probably not caused by any limitation, but it's purely a language design decision. Basically, when designing the language they said "this should be possible but not this". If they really wanted this to be possible, the "limitations" would have been dealt with and this would be possible. I don't know the specific reasoning behind this decision though.
/.../ passing a string (causing stringly typed code too!) with those values, turned into strings, and then relying on Regex to try and get the value instead of just using the actual value /.../
I have been in similar situations. I sometimes wanted to use attributes with lambda expressions to implement something in a functional way. But after all, C# is not a functional language, and when I wrote the code in a non-functional way I did not need such attributes.
In short, I think of it like this: if I want to develop this in a functional way, I should use a functional language like F#. If I use C# and do it in a non-functional way, I don't need such attributes.
Perhaps you should simply reconsider your design and not use the attributes like you currently do.
UPDATE 1:
I claimed C# is not a functional language, but that is a subjective view and there is no rigorous definition of "functional language". I agree with Adam Wright, "/.../ As such, I wouldn't class C# as functional in general discussion - it is, at best, multi-paradigm, with some functional flavour." at Why is C# a functional programming language?
UPDATE 2:
I found this post by Jon Skeet: https://stackoverflow.com/a/294259/1105687 It regards not allowing generic attribute types, but the reasoning could be similar in this case:
Answer from Eric Lippert (paraphrased): no particular reason, except to avoid complexity in both the language and compiler for a use case which doesn't add much value.

Attributes, just metadata or needed?

A few days ago I asked what this attribute means:
[System.Runtime.InteropServices.DllImport("KERNEL32.DLL", EntryPoint="RtlZeroMemory")] public unsafe static extern bool ZeroMemory(byte* destination, int length);
I have learned that attributes are metadata, but what I do not understand is: is this needed in this case? I thought metadata is just that, metadata that can be omitted. Also, the code seems to run fine when I remove the attribute.
I would like to understand.
PS: Hans Passant mentioned it's covered by any book about .NET C#. It is not; the widely used one, VS 2010 by John Sharp, does not cover it.

The metadata does usually have a reason and a meaning. In this particular case it tells the compiler how to bind this external method definition (e.g. to which DLL import it matches).
Other attributes control how interop is performed by the framework, and yet others control how the object inspector displays data. Third-party attributes are also used extensively to control various behaviors, for instance for finding specific type information when performing reflection.

No, this attribute is absolutely required. It informs the CLR that what you've defined actually uses platform invocation services (P/Invoke) to call a function defined in unmanaged code.
Specifically, the RtlZeroMemory function, defined in the library kernel32.dll.
Without it, the compiler wouldn't know which function it was bound to, and the CLR wouldn't know which function to call at run-time.

This attribute does two things:
1) It informs the CLR that the C method being invoked lives in kernel32.dll.
2) It informs the CLR that the C method name is RtlZeroMemory, not ZeroMemory as it's named in code.
Yes, this attribute is 100% necessary. Any P/Invoke method must at least name the DLL the C method lives in.
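To make the two roles concrete, here is a hedged sketch of the same import written without unsafe code. The NativeMethods wrapper class and the IntPtr/UIntPtr signature are my assumptions rather than the asker's declaration (the native RtlZeroMemory returns nothing, so void is used instead of bool), and the call is guarded because kernel32.dll only exists on Windows:

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeMethods
{
    // "kernel32.dll" names the library; EntryPoint names the exported
    // function, so the C# method is free to be called ZeroMemory.
    [DllImport("kernel32.dll", EntryPoint = "RtlZeroMemory")]
    internal static extern void ZeroMemory(IntPtr destination, UIntPtr length);
}

static class ZeroMemoryDemo
{
    static void Main()
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            Console.WriteLine("kernel32.dll is only available on Windows.");
            return;
        }

        IntPtr buffer = Marshal.AllocHGlobal(4);
        try
        {
            Marshal.WriteInt32(buffer, 0x12345678);
            NativeMethods.ZeroMemory(buffer, new UIntPtr(4u));
            Console.WriteLine(Marshal.ReadInt32(buffer)); // 0
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```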

As your example shows, attributes are in fact needed in several key areas of .NET programming.
Attributes provide a model known as "Aspect-Oriented Programming" or AOP. Instead of having to write code that performs some specific task, such as serialization, DLL interop, logging, etc, you can instead simply decorate the classes or members on which you want these tasks performed with an attribute. Attributes are a special type of class, with members which can be invoked by the CLR as it runs your code, that will perform the task you wanted when you decorated the code.
You are correct in part; many attributes are intended simply to store metadata. DescriptionAttribute is a good one. However, even in this case, the attribute is important depending on how it's used. If you are decorating a member of a GUI class that you want to use in the designer, [Description()] provides valuable information to the user of the class in the designer, which may not be you. I've also seen and used many alternate uses for DescriptionAttribute; it can be applied to almost anything, so I've used it to provide "friendly names" for Enum constants, coupled with a GetDescription() extension method to grab them, when using Enums to populate drop-down lists.
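The "friendly names" idea can be sketched roughly as follows. The OrderStatus enum is hypothetical, and this GetDescription is one possible implementation of the extension method mentioned above, not a framework API:

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

enum OrderStatus
{
    [Description("In progress")]
    InProgress,
    Done   // no attribute: falls back to the constant's name
}

static class EnumExtensions
{
    // Returns the [Description] text of an enum constant, or its name if none is set.
    public static string GetDescription(this Enum value)
    {
        FieldInfo field = value.GetType().GetField(value.ToString());
        if (field == null)
        {
            return value.ToString(); // e.g. an undefined numeric value
        }

        var attribute = (DescriptionAttribute)Attribute.GetCustomAttribute(
            field, typeof(DescriptionAttribute));
        return attribute != null ? attribute.Description : value.ToString();
    }
}

static class DescriptionDemo
{
    static void Main()
    {
        Console.WriteLine(OrderStatus.InProgress.GetDescription()); // In progress
        Console.WriteLine(OrderStatus.Done.GetDescription());       // Done
    }
}
```

Here the attribute is pure metadata until GetDescription reads it; removing it changes nothing except the text shown to the user.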
So, while it's technically "metadata", an attribute's being "required" is governed by how much you want the task inherent in that attribute to be performed.

As for this particular attribute, I'm not too sure. To be honest, I've never seen it in almost a year of constant C#.
However, attributes in general can prove very useful. For instance, I was having issues with the VS2010 designer setting autocomplete properties in the wrong order, and getting run-time errors as a result. The solution was to add attributes to the autocomplete properties that prevented the designer from writing these properties to the design file, and instead setting the properties myself in the .cs file (in the proper order).
Summary: Attributes (usually) are not required, but can prove extremely useful.

Best practices for organizing .NET P/Invoke code to Win32 APIs

I am refactoring a large and complicated code base in .NET that makes heavy use of P/Invoke to Win32 APIs. The structure of the project is not the greatest and I am finding DllImport statements all over the place, very often duplicated for the same function, and also declared in a variety of ways:
The import directives and methods are sometimes declared as public, sometimes private, sometimes as static and sometimes as instance methods. My worry is that refactoring may have unintended consequences but this might be unavoidable.
Are there documented best practices I can follow that can help me out?
My instinct is to organize a static/shared Win32 P/Invoke API class that lists all of these methods and associated constants in one file... EDIT There are over 70 imports to the user32 DLL.
(The code base is made up of over 20 projects with a lot of windows message passing and cross-thread calls. It's also a VB.NET project upgraded from VB6 if that makes a difference.)

You might consider the way it was done in the .NET framework. It invariably declares a static class (Module in VB.NET) named NativeMethods that contains the P/Invoke declarations. You could be more organized than the Microsoft programmers; there are many duplicate declarations, because different teams worked on different parts of the framework.
However, if you want to share this among all projects you have to declare these declarations Public instead of Friend. Which isn't great, it ought to be an implementation detail. I think you can solve that by re-using the source code file in every project that needs it. Normally taboo but okay in this case, I think.
I personally declare them as needed in the source code file that needs them, making them Private. That also really helps when lying about the argument types, especially for SendMessage.

Organize them into a [Safe|Unsafe]NativeMethods class. Mark the class as internal static. If you need to expose them to your own assemblies, you can use InternalsVisibleTo - though it'd be more appropriate if you could group related ones into each assembly.
Each method should be static - I honestly wasn't aware you could even mark instance methods with DllImport.
As a first step - I'd probably move everything to a Core assembly (if you have one), or create a Product.Native assembly. Then you can find dupes and overlaps easily, and look for managed equivalents. If your p/invokes are a mess, I don't suspect you have much in the way of layering in the other assemblies that will guide your grouping.

Why not create a single file called Win32.vb and within that logically group the P/Invokes into separate namespaces? For instance, a GDI namespace could hold all GDI P/Invokes, a User32 namespace could hold all P/Invokes that reside in User32, and so on. It may be painful at first, but at least you will have centralized namespaces all contained within that file. Have a look here to see what I mean... What do you think?

Are your P/Invoke calls an artifact of the migration from VB6? I have migrated 300,000 lines of code from VB6 to C# (Windows.Forms and System.EnterpriseServices), and eliminated all but a handful of P/Invoke calls; there is nearly always a managed equivalent. If you are refactoring, you may want to consider doing something similar. The resulting code should be far easier to maintain.

The recommended way is to have one NativeMethods class per assembly, with internal visibility, containing all the DllImport methods. That way you always know where your imported functions are and you avoid duplicate declarations.
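A rough sketch of that arrangement, written in C# rather than VB.NET. The two user32 imports and the WindowHelper wrapper are illustrative choices, not taken from the question's code base:

```csharp
using System;
using System.Runtime.InteropServices;

// One internal static class per assembly holds every P/Invoke declaration
// and its constants, so duplicates are easy to spot.
internal static class NativeMethods
{
    internal const uint WM_CLOSE = 0x0010;

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    internal static extern bool IsWindow(IntPtr hWnd);

    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    internal static extern IntPtr SendMessage(
        IntPtr hWnd, uint msg, IntPtr wParam, IntPtr lParam);
}

// Hypothetical managed wrapper: the rest of the code calls this
// and never touches the raw imports directly.
internal static class WindowHelper
{
    internal static bool TryClose(IntPtr hWnd)
    {
        if (!NativeMethods.IsWindow(hWnd))
        {
            return false;
        }

        NativeMethods.SendMessage(hWnd, NativeMethods.WM_CLOSE, IntPtr.Zero, IntPtr.Zero);
        return true;
    }
}

static class NativeMethodsDemo
{
    static void Main()
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            Console.WriteLine("user32.dll is only available on Windows.");
            return;
        }

        // A null handle is never a valid window, so nothing is sent.
        Console.WriteLine(WindowHelper.TryClose(IntPtr.Zero)); // False
    }
}
```

Keeping the imports internal and exposing only wrappers also gives one place to look for managed replacements later.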

What I typically try to do in this case is what you are talking about: create various classes, static or not, that provide the functionality, so that it can be re-used as needed. Depending on the nature of the calls, I'd shy away from a static class implementation, but that will depend on your specific implementation.
Expansion on Above as requested.
Given the nature of P/Invoke, especially if a number of calls are needed and are of varying areas of implementation I find it better to group like items together, this way you are not pulling in a lot of other clutter, or other DLL imports when not needed.
The desire to stay away from static methods is due to calls to unmanaged resources and the potential for memory leaks etc.

Why does StyleCop recommend prefixing method or property calls with "this"?

I have been trying to follow StyleCop's guidelines on a project, to see if the resulting code was better in the end. Most rules are reasonable or a matter of opinion on coding standard, but there is one rule which puzzles me, because I haven't seen anyone else recommend it, and because I don't see a clear benefit to it:
SA1101: The call to {method or property name} must begin with the 'this.' prefix to indicate that the item is a member of the class.
On the downside, the code is clearly more verbose that way, so what are the benefits of following that rule? Does anyone here follow that rule?

I don't really follow this guidance unless I'm in one of the scenarios where you need it:
there is an actual ambiguity - mainly this impacts either constructors (this.name = name;) or things like Equals (return this.id == other.id;)
you want to pass a reference to the current instance
you want to call an extension method on the current instance
Other than that I consider this clutter. So I turn the rule off.
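The three scenarios listed above can be sketched like this (Person, PersonExtensions and their members are hypothetical names used only for illustration):

```csharp
using System;
using System.Collections.Generic;

class Person
{
    private readonly string name;

    public Person(string name)
    {
        // 1. Ambiguity: the parameter hides the field, so "this." is required.
        this.name = name;
    }

    // No ambiguity here, so the prefix is optional.
    public string Name { get { return name; } }

    public void AddTo(List<Person> registry)
    {
        // 2. Passing a reference to the current instance.
        registry.Add(this);
    }

    public string Describe()
    {
        // 3. An extension method needs an explicit receiver, even on the current instance.
        return this.Shout();
    }
}

static class PersonExtensions
{
    public static string Shout(this Person person)
    {
        return person.Name.ToUpperInvariant() + "!";
    }
}

static class ThisDemo
{
    static void Main()
    {
        var registry = new List<Person>();
        var ada = new Person("Ada");
        ada.AddTo(registry);

        Console.WriteLine(registry.Count);  // 1
        Console.WriteLine(ada.Describe());  // ADA!
    }
}
```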

It can make code clearer at a glance. When you use this, it's easier to:
Tell static and instance members apart. (And distinguish instance methods from delegates.)
Distinguish instance members from local variables and parameters (without using a naming convention).

I think this article explains it a little
http://blogs.msdn.microsoft.com/sourceanalysis/archive/2008/05/25/a-difference-of-style.aspx
...a brilliant young developer at Microsoft (ok, it was me) decided to take it upon himself to write a little tool which could detect variances from the C# style used within his team. StyleCop was born. Over the next few years, we gathered up all of the C# style guidelines we could find from the various teams within Microsoft, and picked out all of the best practices which were common to these styles. These formed the first set of StyleCop rules. One of the earliest rules that came out of this effort was the use of the this prefix to call out class members, and the removal of any underscore prefixes from field names. C# style had officially grown apart from its old C++ tribe.

this.This
this.Does
this.Not
this.Add
this.Clarity
this.Nor
this.Does
this.This
this.Add
this.Maintainability
this.To
this.Code
The usage of "this.", when used excessively or as a forced style requirement, is nothing more than a contrivance used under the guise that there is < 1% of developers that really do not understand code or what they are doing, and it makes things painful for the 99% who want to write easily readable and maintainable code.
As soon as you start typing, IntelliSense will list the content available in the scope where you are typing. "this." is not necessary to expose class members, and unless you are completely clueless about what you are coding for, you should be able to easily find the item you need.
Even if you are completely clueless, use "this." to hint what is available, but don't leave it in code. There are also a slew of add-ons like ReSharper that help to bring clarity to the scope and expose the contents of objects more efficiently. It is better to learn how to use the tools provided to you than to develop a bad habit that is hated by a large number of your co-workers.
Any developer that does not inherently understand the scope of static, local, class or global content should not rely on "hints" to indicate the scope. "this." is worse than Hungarian notation, as at least Hungarian notation provided an idea about the type the variable is referencing and serves some benefit. I would rather see "_" or "m" used to denote class field members than see "this." everywhere.
I have never had an issue, nor seen an issue with a fellow developer who repeatedly fights with code scope or writes code that is always buggy because of not using "this." explicitly. It is an unwarranted fear that "this." prevents future code bugs, and it is often the argument used where ignorance is valued.
Coders grow with experience. "this." is like asking someone to put training wheels on their bike as an adult because it is what they first had to use to learn how to ride a bike. An adult might fall off a bike 1 in 1,000 times they get on it, but that is no reason to force them to use training wheels.
"this." should be banned from the language definition for C#, unfortunately there is only one reason for using it, and that is to resolve ambiguity, which could also be easily resolved through better code practices.

A few basic reasons for using this (and I coincidentally always prefix class values with the name of the class of which they are a part as well - even within the class itself).
1) Clarity. You know right this instant which variables you declared in the class definition and which you declared as locals, parameters and whatnot. In two years, you won't know that and you'll go on a wondrous voyage of re-discovery that is absolutely pointless and not required if you specifically state the parent up front. Somebody else working on your code has no idea from the get-go and thus benefits instantly.
2) Intellisense. If you type 'this.' you get all instance-specific members and properties in the help. It makes finding things a lot easier, especially if you're maintaining somebody else's code or code you haven't looked at in a couple of years. It also helps you avoid errors caused by misconceptions of what variables and methods are declared where and how. It can help you discover errors that otherwise wouldn't show up until the compiler choked on your code.
3) Granted you can achieve the same effect by using prefixes and other techniques, but this begs the question of why you would invent a mechanism to handle a problem when there is a mechanism to do so built into the language that is actually supported by the IDE? If you touch-type, even in part, it will ultimately reduce your error rate, too, by not forcing you to take your fingers out of the home position to get to the underscore key.
I see lots of young programmers who make a big deal out of the time they will save by not typing a character or two. Most of your time will be spent debugging, not coding. Don't worry so much about your typing speed. Worry more about how quickly you can understand what is going on in the code. If you save a total of five minutes coding and wind up spending an extra ten minutes debugging, you've slowed yourself down, no matter how fast you look like you're going.

Note that the compiler doesn't care whether you prefix references with this or not (unless there's a name collision with a local variable and a field or you want to call an extension method on the current instance.)
It's up to your style. Personally I remove this. from code as I think it decreases the signal to noise ratio.
Just because Microsoft uses this style internally doesn't mean you have to. StyleCop seems to be a MS-internal tool gone public. I'm all for adhering to the Microsoft conventions around public things, such as:
type names are in PascalCase
parameter names are in camelCase
interfaces should be prefixed with the letter I
use singular names for enums, except for when they're [Flags]
...but what happens in the private realms of your code is, well, private. Do whatever your team agrees upon.
Consistency is also important. It reduces cognitive load when reading code, especially if the code style is as you expect it. But even when dealing with a foreign coding style, if it's consistent then it won't take long to become used to it. Use tools like ReSharper and StyleCop to ensure consistency where you think it's important.
Using .NET Reflector suggests that Microsoft isn't that great at adhering to the StyleCop coding standards in the BCL anyway.

I do follow it, because I think it's really convenient to be able to tell apart access to static and instance members at first glance.
And of course I have to use it in my constructors, because I normally give the constructor parameters the same names as the field their values get assigned to. So I need "this" to access the fields.

In addition it is possible to duplicate variable names in a function so using 'this' can make it clearer.
class foo
{
    private string aString;

    public void SetString(string aString)
    {
        // this.aString refers to the class field
        // aString refers to the method parameter
        this.aString = aString;
    }
}

I follow it mainly for IntelliSense reasons. It is so nice typing this. and getting a concise list of properties, methods, etc.
