Strange struct constructor compiler workaround

Structs cannot contain explicit parameterless constructors, such as:
public struct Person
{
public string Name { get; }
public int Age { get; }
public Person() { Name = string.Empty; Age = 0; }
}
However, this is allowed:
public struct Person
{
public string Name { get; }
public int Age { get; }
public Person(string name = null, int age = 0) { Name = name; Age = age; }
}
Any ideas why? Any reason this is bad to do?

The original answer (see the update below):
The second one is allowed because it is not parameterless. But I wouldn't use optional parameters here because it is very confusing: if you call new Person(), your constructor will not be executed (you can check this by replacing the default values with something other than null and zero):
public struct Person
{
public string Name { get; }
public int Age { get; }
public Person(string name = "Bob", int age = 42)
{
Name = name;
Age = age;
}
}
So new Person() is the same as default(Person), both will use the initobj MSIL instruction instead of calling a constructor.
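The behaviour described above can be checked with a small sketch. The Program class and the variable names are only illustrative; the struct is the one from the answer. With only an all-optional constructor declared, new Person() and array elements come out zero-initialized, while supplying at least one argument does run the constructor.

```csharp
using System;

public struct Person
{
    public string Name { get; }
    public int Age { get; }

    // Not a parameterless constructor: both parameters merely have defaults.
    public Person(string name = "Bob", int age = 42)
    {
        Name = name;
        Age = age;
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new Person();          // zero-initialized, constructor not called
        var b = new Person("Alice");   // constructor called, age falls back to 42
        var people = new Person[3];    // elements are zero-initialized as well

        Console.WriteLine(a.Name == null);        // True
        Console.WriteLine(a.Age);                 // 0
        Console.WriteLine($"{b.Name}, {b.Age}");  // Alice, 42
        Console.WriteLine(people[0].Age);         // 0
    }
}
```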
So why would it be a problem if you could define a default constructor for structs? Consider the following examples.
private void Test()
{
Person p1 = new Person(); // ok, this is clear, use ctor if possible, otherwise initobj
Person p2 = default(Person); // use initobj
Person p3 = CreateGeneric<Person>(); // and here?
Person[] persons = new Person[100]; // do we have initialized persons?
}
public T CreateGeneric<T>() where T : new()
{
return new T();
}
So real parameterless constructors are not allowed for structs in C# (though the CLR supports them). Parameterless constructors were actually planned for C# 6.0; however, they caused so many compatibility problems that in the end the feature was removed from the final release.
Update 2022:
Starting with C# 10, parameterless struct constructors are actually supported, owing to record structs with field initializers. From the docs:
If a record struct does not contain a primary constructor nor any instance constructors, and the record struct has field initializers, the compiler will synthesize a public parameterless instance constructor.
But it is not as if everything were obvious now. Let's revisit the examples above:
new Person(): Even this case is a bit confusing. Obviously, if you have a parameterless constructor, it will call that one. But unlike in case of classes, if there is no parameterless constructor, it will use the initobj instruction even if there is a constructor overload with optional parameters only (the OP's 2nd example).
default(Person): This is clear, the initobj instruction will be used
CreateGeneric<Person>(): It turns out that it also invokes the parameterless struct constructor... well, at least for the first time. But when targeting .NET Framework, it fails to invoke the constructor for subsequent calls.
new Person[100]: No constructor call (which is actually expected)
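A minimal sketch of three of these cases, assuming a compiler with C# 10 or later. The constructor body and the Program class are illustrative; the generic CreateGeneric case is left out because, as noted above, its behaviour depends on the target framework.

```csharp
using System;

public struct Person
{
    public string Name { get; }
    public int Age { get; }

    // Explicit parameterless struct constructor: requires C# 10 or later.
    public Person()
    {
        Name = "Bob";
        Age = 42;
    }
}

public static class Program
{
    public static void Main()
    {
        Person viaNew = new Person();          // runs the constructor
        Person viaDefault = default(Person);   // zero-initialized, no constructor
        Person[] array = new Person[2];        // elements zero-initialized

        Console.WriteLine(viaNew.Name);              // Bob
        Console.WriteLine(viaDefault.Name == null);  // True
        Console.WriteLine(array[0].Age);             // 0
    }
}
```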
And the feature has some further unexpected implications:
If a field has an initializer and there are only parameterized constructors, then new MyStruct() does not initialize the fields.
Parameter default value initialization like void Method(Person p = new Person()) also fails to call the parameterless constructor. In the meantime this was 'fixed' by emitting CS1736 if there is a parameterless constructor; otherwise, it is allowed and means just default.
If you target .NET Framework, Activator.CreateInstance(Type) also works incorrectly (behaves the same way as CreateGeneric<Person>() above): the constructor is invoked only for the first time.
Expression.New(Type) also works incorrectly, not just in .NET Framework but on all .NET/Core platforms prior to version 6.0.
And the story is not over. It now seems that auto-synthesizing the parameterless constructor will be removed, which makes the first bullet point above illegal and will be a breaking change from C# 10. There are also further open issues, such as this one, which also need some changes to the language specification.

A parameter-less constructor for a struct will make it more tempting to create mutable structs which are considered evil.
var person = new Person();
person.Age = 35;
...
I am sure there are other reasons, but a major pain is that structs are copied as they are passed around, so it is easy to change the wrong copy and therefore easy to make an error that is difficult to diagnose.
public void IncreaseAge(Person p)
{
p.Age += 1; // does not change the original value passed in, only the copy
}
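To make the copy pitfall concrete, here is a small sketch. The mutable Age field and the IncreaseAgeByRef helper are assumptions added for illustration; they are not part of the original example.

```csharp
using System;

public struct Person
{
    public int Age;   // mutable on purpose, to show the pitfall
}

public static class Program
{
    // Receives a copy; the caller's value is untouched.
    public static void IncreaseAge(Person p) { p.Age += 1; }

    // Receives a reference to the caller's variable.
    public static void IncreaseAgeByRef(ref Person p) { p.Age += 1; }

    public static void Main()
    {
        var person = new Person { Age = 35 };

        IncreaseAge(person);
        Console.WriteLine(person.Age);   // 35

        IncreaseAgeByRef(ref person);
        Console.WriteLine(person.Age);   // 36
    }
}
```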

Any ideas why?
This constructor is not implicitly parameterless -- it has 2 parameters.
Any reason this is bad to do?
Here is a reason: The code can be hard to reason about.
You may expect that if you wrote:
var people = new Person[100];
that every Person in that array would have the default values of the optional arguments, but this is untrue: no constructor runs for array elements.

The values of optional arguments are hardcoded into the calling code.
If you later choose to change your defaults, all calling code that has already been compiled will retain the values it was compiled with until it is recompiled.
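A single-assembly sketch cannot show the stale-default problem itself, but it can show where the default lives. The Greeter class below is hypothetical; the point is that the default is recorded in the parameter's metadata and the compiler copies it into each call site.

```csharp
using System;
using System.Reflection;

public static class Greeter
{
    public static string Greet(string greeting = "hello") => greeting;
}

public static class Program
{
    public static void Main()
    {
        // The compiler emits this call as Greeter.Greet("hello").
        Console.WriteLine(Greeter.Greet());   // hello

        // The default is stored in the parameter's metadata, where compilers read it.
        ParameterInfo p = typeof(Greeter).GetMethod("Greet").GetParameters()[0];
        Console.WriteLine(p.HasDefaultValue);   // True
        Console.WriteLine(p.DefaultValue);      // hello
    }
}
```

If Greeter lived in a separate library and its default changed, callers compiled against the old version would keep passing "hello" until rebuilt.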

Related

In C#, is there a way to reference the calling class in a constructor?

I am making a constructor for a class in C#, and I would like it to fill its values differently depending on the type of class that called it.
For example, there is a class called Employer and a class called Person.
When an instance of Employer calls new Person(); I would like the constructor in Person to set the new person's Employed variable to true.
Is this possible in C#?
I tried searching for an answer but was unsure how to word the question.

You can't do it automatically, no. (You could grab a stack trace and parse that, but it would be horribly brittle in the face of JIT compiler optimizations etc.) I'd argue that doing so would make the code brittle and hard to maintain, too - the effect would be like "spooky action at a distance".
The simplest option is to add a bool employed parameter in the constructor instead. Then it's really obvious at every call site how you want the constructed object to behave.
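A minimal sketch of that suggestion. The Hire method on Employer is a hypothetical name used only for illustration.

```csharp
using System;

public class Person
{
    public bool Employed { get; }

    public Person(bool employed)
    {
        Employed = employed;
    }
}

public class Employer
{
    // The employer states its intent explicitly instead of being detected.
    public Person Hire() => new Person(employed: true);
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(new Employer().Hire().Employed);        // True
        Console.WriteLine(new Person(employed: false).Employed);  // False
    }
}
```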

There are a few different ways to do this. The first is to overload the constructor.
public Person() {
this.Employed = false;
}
public Person(bool employed) {
this.Employed = employed;
}
The second that comes to mind is to populate the expected values when instantiating the object.
Person myPerson = new Person {Employed = true };

You can have multiple constructors with different inputs for a class:
public Person() {
this.Employed = false;
}
public Person(bool employed) {
this.Employed = employed;
}
public Person(bool employed, bool isEmployee) {
if (isEmployee)
this.Employed = true;
else
this.Employed = false;
}
and use appropriate inputs wherever you call:
Person p = new Person(true,true);

Field initializer accessing 'this' reloaded

This question is an extension of Cristi Diaconescu's about the illegality of field initializers accessing this in C#.
This is illegal in C#:
class C
{
int i = 5;
double[] dd = new double[i]; //Compiler error: A field initializer cannot reference the non-static field, method, or property.
}
Ok, so the reasonable explanation to why this is illegal is given by, among others, Eric Lippert:
In short, the ability to access the receiver before the constructor body runs is a feature of marginal benefits that makes it easier to write buggy programs. The C# language designers therefore disabled it entirely. If you need to use the receiver then put that logic in the constructor body.
Also, the C# specifications are pretty straightforward (up to a point):
A variable initializer for an instance field cannot reference the instance being created. Thus, it is a compile-time error to reference this in a variable initializer, as it is a compile-time error for a variable initializer to reference any instance member through a simple-name.
So my question is: what does "through a simple-name" mean?
Is there some alternative mechanism where this would be legal? I am certain that almost every word in the specification is there for a very specific reason, so what is the reason of limiting the illegality of this particular code to references through simple names?
EDIT: I've not worded my question too well. I'm not asking for the definition of "simple-name"; I am asking about the reason behind limiting the illegality to that particular scenario. If it is always illegal to reference any instance member in any way, then why specify it so narrowly? And if it's not, then what mechanism would be legal?

It isn't possible, in the general case, to determine whether an expression refers to the object being constructed, so prohibiting it and requiring compilers to diagnose it would require the impossible. Consider
partial class A {
public static A Instance = CreateInstance();
public int a = 3;
public int b = Instance.a;
}
It's possible, and as far as I know perfectly valid, even if it is a horrible idea, to create an object with FormatterServices.GetUninitializedObject(typeof(A)), set A.Instance to that, and then call the constructor. When b is initialised, the object reads its own a member.
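// Needs: using System; using System.Reflection; using System.Reflection.Emit; using System.Runtime.Serialization;
partial class A {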
public static A CreateInstance() {
Instance = (A)FormatterServices.GetUninitializedObject(typeof(A));
var constructor = typeof(A).GetConstructor(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic, null, Type.EmptyTypes, null);
var helperMethod = new DynamicMethod(string.Empty, typeof(void), new[] { typeof(A) }, typeof(A).Module, true);
var ilGenerator = helperMethod.GetILGenerator();
ilGenerator.Emit(OpCodes.Ldarg_0);
ilGenerator.Emit(OpCodes.Call, constructor);
ilGenerator.Emit(OpCodes.Ret);
var constructorInvoker = (Action<A>)helperMethod.CreateDelegate(typeof(Action<A>));
constructorInvoker(Instance);
return Instance;
}
}
static class Program {
static void Main() {
Console.WriteLine("A.Instance = (a={0}, b={1})", A.Instance.a, A.Instance.b);
}
}
You can only get compiler errors for what's detectable at compile time.

According to the documentation:
A simple-name consists of a single identifier.
I suppose they clarify this because this.i is equivalent to i within a class method, when no variable named i is in scope. They've already forbidden the use of this outside of an instance method:
class C
{
int i = 5;
double[] dd = new double[this.i];
//Compiler error: Keyword 'this' is not available in the current context.
}
If this language wasn't there, some might read this as meaning you could reference instance variables simply by omitting the keyword this.
The best alternative is to use a constructor:
class C
{
int i = 5;
double[] dd;
C()
{
dd = new double[i];
}
}
You can also do this:
class C
{
public int i = 5;
}
class D
{
double[] dd = new double[new C().i];
}
Thanks to the fact that the two members are in different classes, the order in which they are initialized is unambiguous.

You can always do really messed up stuff when unmanaged code comes into play. Consider this:
public class A
{
public int n = 42;
public int k = B.Foo();
public A()
{
}
}
public class B
{
public static unsafe int Foo()
{
//get a pointer to the newly created instance of A
//through some trickery.
//Possibly put some distinctive field value in `A` to make it easier to find
int i = 0;
int* p = &i;
//get p to point to n in the new instance of `A`
return *p;
}
}
I spent a bit of time trying to actually implement this (for kicks) but gave up after a bit. That said, you can get a pointer to the heap and then just start looking around for something that you can recognize as an instance of A and then grab the n value from it. It would be hard, but it is possible.

I think you are just misreading the last sentence. The spec flatly states that an instance field initializer cannot reference the instance being created. It then simply cites examples. You cannot use this, and for the same reason you cannot use a "simple-name", because a simple-name access implicitly uses this. The spec is not narrowing the cases. It is simply calling out some specific constructions that are illegal. Another one would be using base to access a protected field from a base class.

Is it bad practice to initialise fields outside of an explicit constructor [duplicate]

Possible Duplicate:
Initialize class fields in constructor or at declaration?
We are arguing about coding practices. The examples here are a little too simple, but the real deal has several constructors. In order to initialise the simple values (eg dates to their min value) I have moved the code out of the constructors and into the field definitions.
public class ConstructorExample
{
string _string = "John";
}
public class ConstructorExample2
{
string _string;
public ConstructorExample2()
{
_string = "John";
}
}
How should it be done by the book? I tend to be very case by case and so am maybe a little lax about this kind of thing. However, I feel that Occam's razor tells me to move the initialisation out of multiple constructors. Of course, I could always move this shared initialisation into a private method.
The question is essentially ... is initialising fields where they are defined as opposed to the constructor bad in any way?
The argument I am facing is one of error handling, but I do not feel it is relevant, as there are no possible exceptions that won't be picked up at compile time.

Note that all such field declaration-level initialization will be performed once for each constructor-chain, even if the constructor by itself sets the field to something else.
If you chain constructors together, the fields will be initialized in the common, first, constructor that is called.
Look at this example:
using System;
namespace ClassLibrary3
{
public class Class1
{
private string _Name = "Lasse";
public Class1()
{
}
public Class1(int i)
: this()
{
}
public Class1(bool b)
{
_Name = "Test";
}
}
}
This code compiles as if it had been written like this:
using System;
namespace ClassLibrary3
{
public class Class1
{
private string _Name;
public Class1()
{
_Name = "Lasse";
}
public Class1(int i)
: this()
{
// not here, as this() takes care of it
}
public Class1(bool b)
{
_Name = "Lasse";
_Name = "Test";
}
}
}
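The lowering above can be observed at run time with a sketch like the following. The Tracked class, its static Log list and the Init helper are hypothetical names introduced only to record the order of events.

```csharp
using System;
using System.Collections.Generic;

public class Tracked
{
    public static readonly List<string> Log = new List<string>();

    // Helper that records each time the field initializer runs.
    private static string Init(string value)
    {
        Log.Add("initializer");
        return value;
    }

    private string _name = Init("Lasse");

    public Tracked() { Log.Add("Tracked()"); }

    // Chains to this(), so the initializer is not repeated here.
    public Tracked(int i) : this() { Log.Add("Tracked(int)"); }

    // Does not chain, so the initializer runs first, then the body overwrites.
    public Tracked(bool b) { _name = "Test"; Log.Add("Tracked(bool)"); }

    public string Name => _name;
}

public static class Program
{
    public static void Main()
    {
        new Tracked(1);
        Console.WriteLine(string.Join(", ", Tracked.Log));
        // initializer, Tracked(), Tracked(int)

        Tracked.Log.Clear();
        Console.WriteLine(new Tracked(true).Name);          // Test
        Console.WriteLine(string.Join(", ", Tracked.Log));  // initializer, Tracked(bool)
    }
}
```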

It's not necessarily bad to initialize values outside of the constructor, and the problem you have here:
string _string;
public ConstructorExample2()
{
_string = "John";
}
is that if you have multiple constructors, you have to remember to do one of the following:
1. Reinitialize _string in every constructor
2. Separate the logic out into a common method and call that method in every constructor
3. Call the constructor with the logic in it, from the other constructors. (Chain the constructors)
Now this isn't necessarily a problem, but you have to remember to do it. By initializing it outside of the constructor, it's done for you. It's one less thing you need to remember to do.

Microsoft FxCop by default recommends field initializers over using the constructor. This question is also a duplicate of this one and should provide some insight.
With static classes, you'll have to note some subtleties as addressed at this question.

In the above example the assignment of "John" to _string has no logical reliance on any variables and therefore it should be outside of the constructor in the field initializer.
So long as it is not possible to initialize the object in a non-usable state, it doesn't matter.
When the code is compiled both approaches will be the same anyway.

Not sure about C#, but in Java source code they seem to prefer the constructor, example:
public class String{
char[] value;
int offset;
...
public String(){
value = new char[0];
offset = 0;
...
}
}

I think for simple initializations like that it's fine to do it in the declaration. However, I don't understand the error handling argument. Even if there is an exception in the initialization, I think you will find that your normal error handling mechanism will work the same. It will still throw an exception when you call the constructor.
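A short sketch of that point, using a hypothetical Config class whose field initializer always fails. The exception surfaces from the new expression, so an ordinary try/catch around the construction handles it.

```csharp
using System;

public class Config
{
    // Throws FormatException while the object is being constructed.
    private int _port = int.Parse("not a number");

    public int Port => _port;
}

public static class Program
{
    // Returns false when construction fails with the initializer's exception.
    public static bool TryCreate()
    {
        try
        {
            new Config();
            return true;
        }
        catch (FormatException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryCreate());   // False
    }
}
```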

I tend to initialize things in the get accessor, where they are first used. If null then initialize and all that.
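The pattern described here might look like the sketch below. The Basket class and the Allocations counter are made up for illustration, and this simple form is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

public class Basket
{
    private List<string> _items;

    // Counts how often the backing list was created (illustration only).
    public int Allocations { get; private set; }

    public List<string> Items
    {
        get
        {
            // Create the list on first access, reuse it afterwards.
            if (_items == null)
            {
                _items = new List<string>();
                Allocations++;
            }
            return _items;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var basket = new Basket();
        basket.Items.Add("apple");
        basket.Items.Add("pear");

        Console.WriteLine(basket.Items.Count);   // 2
        Console.WriteLine(basket.Allocations);   // 1
    }
}
```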

I prefer to initialize simple fields like that outside of the constructor.
It shouldn't cause any issues since compilation actually moves those initializations into the constructor at compile-time anyway.

If the initialization of the variable will be the same, no matter what arguments are passed to the constructor, then it doesn't make sense to clutter the constructor method with the unnecessary initialization code. In this case, I initialize in-place.

Initialising the fields in the constructor is better. This way, if/when a different constructor is added, you know that all the fields are starting with null/default values and you can initialise them appropriately.

What's the difference between an object initializer and a constructor?

What are the differences between the two and when would you use an "object initializer" over a "constructor" and vice-versa? I'm working with C#, if that matters. Also, is the object initializer method specific to C# or .NET?

Object Initializers were added in C# 3 in order to simplify setting up an object at the point where you create it.
Constructors run, given 0 or more parameters, and are used to create and initialize an object before the calling method gets the handle to the created object. For example:
MyObject myObjectInstance = new MyObject(param1, param2);
In this case, the constructor of MyObject will be run with the values param1 and param2. These are both used to create the new MyObject in memory. The created object (which is setup using those parameters) gets returned, and set to myObjectInstance.
In general, it's considered good practice to have a constructor require the parameters needed in order to completely setup an object, so that it's impossible to create an object in an invalid state.
However, there are often "extra" properties that could be set, but are not required. This could be handled through overloaded constructors, but leads to having lots of constructors that aren't necessarily useful in the majority of circumstances.
This leads to object initializers. An Object Initializer lets you set properties or fields on your object after it's been constructed, but before it can be used by anything else. For example:
MyObject myObjectInstance = new MyObject(param1, param2)
{
MyProperty = someUsefulValue
};
This will behave about the same as if you do this:
MyObject myObjectInstance = new MyObject(param1, param2);
myObjectInstance.MyProperty = someUsefulValue;
However, in multi-threaded environments the atomicity of the object initializer may be beneficial, since it prevents the object from being in a not-fully initialized state (see this answer for more details) - it's either null or initialized like you intended.
Also, object initializers are simpler to read (especially when you set multiple values), so they give you the same benefit as many overloads on the constructor, without the need to have many overloads complicating the API for that class.

A constructor is a defined method on a type which takes a specified number of parameters and is used to create and initialize an object.
An object initializer is code that runs on an object after a constructor and can be used to succinctly set any number of fields on the object to specified values. The setting of these fields occurs after the constructor is called.
You would use a constructor without the help of an object initializer if the constructor sufficiently set the initial state of the object. An object initializer however must be used in conjunction with a constructor. The syntax requires the explicit or implicit use (VB.Net and C#) of a constructor to create the initial object. You would use an object initializer when the constructor does not sufficiently initialize the object to your use and a few simple field and/or property sets would.

When you do
Person p = new Person { Name = "a", Age = 23 };
this is what an object initializer essentially does:
Person tmp = new Person(); //creates temp object calling default constructor
tmp.Name = "a";
tmp.Age = 23;
p = tmp;
Now this facilitates behaviour like this. Knowing how object initializers work is important.
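One observable consequence of the temporary variable, as a sketch: if a property setter throws inside the initializer, the target variable is never assigned. The validating Age setter and the Demo method are assumptions added for this illustration.

```csharp
using System;

public class Person
{
    private int _age;

    public string Name { get; set; }

    public int Age
    {
        get => _age;
        set
        {
            if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
            _age = value;
        }
    }
}

public static class Program
{
    // Returns whatever ended up in the variable after a failed initializer.
    public static Person Demo()
    {
        Person p = null;
        try
        {
            // The setter throws before the temporary is copied into p.
            p = new Person { Name = "a", Age = -1 };
        }
        catch (ArgumentOutOfRangeException)
        {
        }
        return p;
    }

    public static void Main()
    {
        Console.WriteLine(Demo() == null);   // True
    }
}
```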

If you have properties that MUST be set on your object for it to work properly, one way is to expose just a single constructor which requires those mandatory properties as parameters.
In that case, you cannot create your object without specifying those mandatory properties. Something like that cannot be enforced by object initializers.
Object initializers are really just a "syntax convenience" to shorten initial assignments. Nice, but not really very functionally relevant.
Marc

A constructor is a method (possibly) accepting parameters and returning a new instance of a class. It may contain initialization logic.
Below you can see an example of a constructor.
public class Foo
{
private SomeClass s;
public Foo(string s)
{
this.s = new SomeClass(s); // "this." is needed: the parameter s hides the field s
}
}
Now consider the following example:
public class Foo
{
public SomeClass s { get; set; }
public Foo() {}
}
You could achieve the same result as in the first example using an object initializer, assuming that you can access SomeClass, with the following code:
new Foo() { s = new SomeClass(someString) }
As you can see, an object initializer allows you to specify values for public fields and public (settable) properties at the same time construction is performed, and that's especially useful when the constructor doesn't supply any overload initializing certain fields.
Please note, however, that object initializers are just syntactic sugar and that, after compilation, they won't really differ from a sequence of assignments.

Object initializers can be useful to initialize some small collection which can be used for testing purposes in the initial program creation stage. The code example is below:
class Program
{
static void Main(string[] args)
{
List<OrderLine> ordersLines = new List<OrderLine>()
{
new OrderLine {Platform = "AmazonUK", OrderId = "200-2255555-3000012", ItemTitle = "Test product 1"},
new OrderLine {Platform = "AmazonUK", OrderId = "200-2255555-3000013", ItemTitle = "Test product 2"},
new OrderLine {Platform = "AmazonUK", OrderId = "200-2255555-3000013", ItemTitle = "Test product 3"}
};
}
}
class OrderLine
{
public string Platform { get; set; }
public string OrderId { get; set; }
public string ItemTitle { get; set; }
}
Here is the catch. The code example above declares no constructor and works correctly, but suppose a constructor with parameters is added to the OrderLine class, for example:
public OrderLine(string platform, string orderId, string itemTitle)
{
Platform = platform;
OrderId = orderId;
ItemTitle = itemTitle;
}
The compiler will then report an error: There is no argument given that corresponds to the required formal parameter…. This can be fixed by adding an explicit parameterless constructor to the OrderLine class:
public OrderLine() {}
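Adding the parameterless constructor is one fix. Another option, sketched below, is to pass the constructor arguments and still use an object initializer for the remaining properties. The two-parameter constructor here is an assumption for illustration, not the one from the answer.

```csharp
using System;

public class OrderLine
{
    public string Platform { get; set; }
    public string OrderId { get; set; }
    public string ItemTitle { get; set; }

    // Hypothetical constructor taking only the mandatory values.
    public OrderLine(string platform, string orderId)
    {
        Platform = platform;
        OrderId = orderId;
    }
}

public static class Program
{
    public static void Main()
    {
        // Constructor arguments first, then the initializer sets the rest.
        var line = new OrderLine("AmazonUK", "200-2255555-3000012")
        {
            ItemTitle = "Test product 1"
        };

        Console.WriteLine($"{line.Platform} | {line.OrderId} | {line.ItemTitle}");
        // AmazonUK | 200-2255555-3000012 | Test product 1
    }
}
```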

Object initializers are especially useful in LINQ query expressions. Query expressions make frequent use of anonymous types, which can only be initialized by using an object initializer, as shown in the code example below:
var orderLineReceiver = new { ReceiverName = "Name Surname", ReceiverAddress = "Some address" };
More about it - Object and collection initializers

Now, years later, I am reconsidering the use of constructors over object initializers. I have always liked object initializers, as they are quick and easy. Pick which fields you want to set, set them, and you are done.
But then along came the nullable context, wherein you must specify which properties are nullable and which are not. If you skip the constructor and use an object initializer instead, the compiler cannot be assured that your object is in fact whole (no null properties which should in fact be non-null). A properly written and used constructor resolves all of that.
But an even better solution is to use the "required" keyword on the members that are expected to be populated on creation, whether through a constructor or an object initializer. It's a new keyword in C# 11, which comes with .NET 7.
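A minimal sketch of the required keyword, assuming C# 11 on .NET 7 or later. The Customer class and its members are hypothetical.

```csharp
#nullable enable
using System;

public class Customer
{
    public required string Name { get; init; }   // every caller must set this
    public string? Nickname { get; init; }       // optional, may stay null
}

public static class Program
{
    public static void Main()
    {
        var customer = new Customer { Name = "Ada" };

        // new Customer { Nickname = "A" } would be rejected at compile time,
        // because the required member Name is not set.

        Console.WriteLine(customer.Name);              // Ada
        Console.WriteLine(customer.Nickname == null);  // True
    }
}
```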

C# .Net 4.0 Named and Default Parameters

What value do you think named and default parameters will add in C#.Net 4.0?
What would be a good use for these (that hasn't already been achieved with overloading and overriding)?

It can make constructors simpler, especially for immutable types (which are important for threading) - see here for a full discussion. Not as nice as it should be perhaps, but nicer than having lots of overloads. You obviously can't use object initializers with immutable objects, so the usual:
new Foo {Id = 25, Name = "Fred"}
isn't available; I'll settle for:
new Foo (Id: 25, Name: "Fred")
This can be extended to the general idea of simplifying overloads, but in most cases I'd prefer overloads that advertise the legal combinations. Constructors are a bit different, IMO, since you are just (typically) defining the initial state.
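For illustration, an immutable Foo along these lines could look like the sketch below. The parameter names and default values are assumptions; the answer only shows the call site.

```csharp
using System;

public sealed class Foo
{
    public int Id { get; }
    public string Name { get; }

    // One constructor with defaults instead of several overloads.
    public Foo(int id = 0, string name = "")
    {
        Id = id;
        Name = name;
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new Foo(id: 25, name: "Fred");
        var b = new Foo(name: "Wilma");   // id falls back to 0

        Console.WriteLine($"{a.Id} {a.Name}");   // 25 Fred
        Console.WriteLine($"{b.Id} {b.Name}");   // 0 Wilma
    }
}
```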
The COM side of things is also important to a lot of people, but I simply don't use much COM interop - so this isn't as important to me.
Edit re comments; why didn't they just use the same syntax that attributes use? Simple - it can be ambiguous with other members / variables (which isn't an issue with attributes); take the example:
[XmlElement("foo", Namespace = "bar")]
which uses one regular parameter (to the ctor, "foo"), and one named assignment. So suppose we use this for regular named arguments:
SomeMethod("foo", SecondArg = "bar");
(which could also be a constructor; I've used a method for simplicity)
Now... what if we have a variable or a property called SecondArg? This would be ambiguous between using SecondArg as a named argument to SomeMethod, and assigning "bar" to SecondArg, and passing "bar" as a regular argument.
To illustrate, this is legal in C# 3.0:
static void SomeMethod(string x, string y) { }
static void Main()
{
string SecondArg;
SomeMethod("foo", SecondArg = "bar");
}
Obviously, SecondArg could be a property, field, variable, etc...
The alternative syntax doesn't have this ambiguity.
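The difference between the two forms can be seen in a sketch like this one. The Join method and the local named second are illustrative; the local deliberately shares its name with the parameter.

```csharp
using System;

public static class Program
{
    public static string Join(string first, string second) => first + "/" + second;

    public static void Main()
    {
        string second = "old";

        // Named argument: binds "bar" to the parameter, the local is untouched.
        Console.WriteLine(Join("foo", second: "bar"));   // foo/bar
        Console.WriteLine(second);                       // old

        // Assignment expression: changes the local, then passes its value positionally.
        Console.WriteLine(Join("foo", second = "baz"));  // foo/baz
        Console.WriteLine(second);                       // baz
    }
}
```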
Edit - this section by 280Z28: Sorry for adding this here, but it's not really a unique answer and it's too long for the comments and includes code. You hinted at the ambiguity but your example didn't highlight the deciding case. I think the example you gave points out something that could be confusing, but the required {} around object initializers prevents an underlying syntactical ambiguity. My explanation for the following code is embedded as the multi-line block comment.
[AttributeUsage(AttributeTargets.Class)]
public sealed class SomeAttribute : Attribute
{
public SomeAttribute() { }
public SomeAttribute(int SomeVariable)
{
this.SomeVariable = SomeVariable;
}
public int SomeVariable
{
get;
set;
}
}
/* Here's the true ambiguity: When you add an attribute, and only in this case
* there would be no way without a new syntax to use named arguments with attributes.
* This is a particular problem because attributes are a prime candidate for
* constructor simplification for immutable data types.
*/
// This calls the constructor with 1 arg
[Some(SomeVariable: 3)]
// This calls the constructor with 0 args, followed by setting a property
[Some(SomeVariable = 3)]
public class SomeClass
{
}

It will help to dodge the problem of providing a decent API to work with Office applications! :)
Some parts of the Office API are okay, but there are edge cases that were clearly designed for use from a language with optional/named parameters. So that's why C# has to have them.

Optional parameters also avoid the problem where classes provide dozens of methods that are just variations on the arguments accepted.
Consider the Exception class. Instead of one constructor with optional arguments, it has four constructors for each combination of 'has message' and 'has inner exception'. That's alright, but now consider what happens if you provide a null value to a constructor taking an innerException. Does it act exactly like the constructor with no innerException parameter, sort of like the constructor with no innerException parameter, or does it throw a null reference exception?
A single constructor with 2 optional parameters would have made it more obvious that passing a null innerException was equivalent to not including it at all. A perfect place for default arguments.
Also don't forget that now every derived Exception class also has to include 4 constructors, which is a pointless hassle.
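As a sketch of the idea, a derived exception with a single constructor and optional parameters might look like this. OrderException is a hypothetical type, and some tooling and conventions still expect the usual set of exception constructors.

```csharp
using System;

public class OrderException : Exception
{
    // One constructor covers the message/inner-exception combinations.
    public OrderException(string message = null, Exception innerException = null)
        : base(message, innerException)
    {
    }
}

public static class Program
{
    public static void Main()
    {
        var plain = new OrderException("boom");
        var wrapped = new OrderException("boom", new InvalidOperationException("cause"));

        Console.WriteLine(plain.Message);                  // boom
        Console.WriteLine(plain.InnerException == null);   // True
        Console.WriteLine(wrapped.InnerException.Message); // cause
    }
}
```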

It will make COM interop a lot easier.
Until C# 4, VB.Net was a much better language for interop. Without defaults you have massive lists of dummy ref parameters in C#.

Brevity of code is the obvious one that springs to mind. Why define several overloads when you can define one function? Also, if you have two identically typed parameters, it isn't always possible to construct the full set of overloads you might need.

Also this doesn't compile:
[AttributeUsage(AttributeTargets.Property, Inherited = false, AllowMultiple = true)]
sealed class MyAttribute : Attribute
{
public MyAttribute(object a = null)
{
}
}
class Test
{
[My] // [My(a: "asd")]
int prop1 { get; set; }
}
while this does:
[AttributeUsage(AttributeTargets.Property, Inherited = false, AllowMultiple = true)]
sealed class MyAttribute : Attribute
{
public MyAttribute()
{
}
public object a { get; set; }
}
class Test
{
[My] // [My(a=null)]
int prop1 { get; set; }
}
