Understanding C# field initialization requirements

Consider the following code:
public class Progressor
{
private IProgress<int> progress = new Progress<int>(OnProgress);
private void OnProgress(int value)
{
//whatever
}
}
This gives the following error on compilation:
A field initializer cannot reference the non-static field, method, or property 'Progressor.OnProgress(int)'
I understand the restriction it is complaining about, but I don't understand why it is an issue, given that the field can be initialized in the constructor instead, as follows:
public class Progressor
{
private IProgress<int> progress;
public Progressor()
{
progress = new Progress<int>(OnProgress);
}
private void OnProgress(int value)
{
//whatever
}
}
What is the difference in C# regarding the field initialization vs constructor initialization that requires this restriction?

Field initializers run before the base class constructor is called, so at that point this is not yet a valid object. Any method call that passes this as an argument at this point produces unverifiable code and throws a VerificationException where unverifiable code is not allowed, for example in security-transparent code.
10.11.2 Instance variable initializers
When an instance constructor has no constructor initializer, or it has a constructor initializer of the form base(...), that constructor implicitly performs the initializations specified by the variable-initializers of the instance fields declared in its class. This corresponds to a sequence of assignments that are executed immediately upon entry to the constructor and before the implicit invocation of the direct base class constructor. The variable initializers are executed in the textual order in which they appear in the class declaration.
10.11.3 Constructor execution
Variable initializers are transformed into assignment statements, and these assignment statements are executed before the invocation of the base class instance constructor. This ordering ensures that all instance fields are initialized by their variable initializers before any statements that have access to that instance are executed.
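The ordering described in the quoted sections can be observed with a small sketch. The class names Base and Derived and the Log helper are made up for illustration; Log is static so that the initializer itself stays legal.

```csharp
using System;

public class Base
{
    public Base()
    {
        Console.WriteLine("2. Base constructor body");
    }
}

public class Derived : Base
{
    // Legal initializer: it only calls a static helper, which logs when it runs.
    private readonly int _value = Log("1. Derived field initializer");

    public Derived()
    {
        Console.WriteLine("3. Derived constructor body");
    }

    public int Value
    {
        get { return _value; }
    }

    private static int Log(string message)
    {
        Console.WriteLine(message);
        return 42;
    }
}

public static class Program
{
    public static void Main()
    {
        new Derived();
    }
}
```

The three messages are printed in numeric order: the derived class's field initializer runs first, then the base constructor body, then the derived constructor body.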

Everything in my answer is just my thoughts on 'why it would be dangerous to allow that kind of access'. I don't know if that's the real reason why it was restricted.
The C# spec says that field initialization happens in the order in which fields are declared in the class:
10.5.5.2. Instance field initialization
The variable initializers are executed in the textual order in which
they appear in the class declaration.
Now, let's say the code you've mentioned were possible, so that you could call an instance method from a field initializer. It would make the following code possible:
public class Progressor
{
private string _first = "something";
private string _second = GetMyString();
private string GetMyString()
{
return "this is really important string";
}
}
So far so good. But let's abuse that power a little bit:
public class Progressor
{
private string _first = "something";
private string _second = GetMyString();
private string _third = "hey!";
private string GetMyString()
{
_third = "not hey!";
return "this is really important string";
}
}
So _second gets initialized before _third. GetMyString runs and assigns "not hey!" to _third, but later _third's own field initializer runs and sets it back to "hey!". Neither useful nor readable, right?
You could also use _third within GetMyString method:
public class Progressor
{
private string _first = "something";
private string _second = GetMyString();
private string _third = "hey!";
private string GetMyString()
{
return _third.Substring(0, 1);
}
}
What would you expect the value of _second to be? Well, before field initializers run, all fields hold their default values. For string that is null, so you'd get an unexpected NullReferenceException.
So, in my opinion, the designers decided it is simply easier to prevent people from making that kind of mistake at all.
You could say: OK, let's disallow accessing properties and calling methods, but allow using fields that are declared above the one being initialized. Something like:
public class Progressor
{
private string _first = "something";
private string _second = _first.ToUpperInvariant();
}
but not
public class Progressor
{
private string _first = "something";
private string _second = _third.ToUpperInvariant();
private string _third = "another";
}
That seems useful and safe. But there is still a way to abuse it!
public class Progressor
{
private Lazy<string> _first = new Lazy<string>(GetMyString);
private string _second = _first.Value;
private string GetMyString()
{
// pick one from above examples
}
}
And all the problems with methods come back again.

Section 10.5.5.2: Instance field initialization describes this behavior:
A variable initializer for an instance field cannot reference the
instance being created. Thus, it is a compile-time error to reference
this in a variable initializer, as it is a compile-time error for a
variable initializer to reference any instance member through a
simple-name
This behavior applies to your code because OnProgress is an implicit reference to the instance being created.

The answer is more or less, the designers of C# preferred it that way.
Since all field initializers are translated into instructions in the constructor(s) which go before any other statements in the constructor, there is no technical reason why this should not be possible. So it is a design choice.
The good thing about a constructor is that it makes it clear in what order the assignments are done.
Note that with static members, the C# designers chose differently. For example:
static int a = 10;
static int b = a;
is allowed, and different from this (also allowed):
static int b = a;
static int a = 10;
which can be confusing.
If you make:
partial class C
{
static int b = a;
}
and elsewhere (in other file):
partial class C
{
static int a = 10;
}
I do not even think it is well-defined what will happen, since as far as I know the relative order of the two partial declarations is not specified.
Of course for your particular example with delegates in an instance field initializer:
Action<int> progress = OnProgress; // ILLEGAL (non-static method OnProgress)
there is really no problem since it is not a read or an invocation of the non-static member. Rather the method info is used, and it does not depend on any initialization. But according to the C# Language Specification it is still a compile-time error.
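For completeness, here is a minimal sketch of the two forms that do compile. The field names and the Report method are made up for illustration; the point is only that a static method group is accepted in an initializer, while an instance method group has to wait for the constructor body.

```csharp
using System;

public class Progressor
{
    // Legal: a static method does not involve the instance being created.
    private readonly IProgress<int> _staticProgress =
        new Progress<int>(OnStaticProgress);

    // OnProgress needs 'this', so this field is assigned in the constructor.
    private readonly IProgress<int> _instanceProgress;

    public Progressor()
    {
        _instanceProgress = new Progress<int>(OnProgress);
    }

    public void Report(int value)
    {
        _staticProgress.Report(value);
        _instanceProgress.Report(value);
    }

    private static void OnStaticProgress(int value)
    {
        // whatever (no access to instance state here)
    }

    private void OnProgress(int value)
    {
        // whatever (instance state is available here)
    }
}

public static class Program
{
    public static void Main()
    {
        var progressor = new Progressor();
        progressor.Report(1);
    }
}
```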

Related

Why is `this` not available in C# 6.0 Auto-Property Initialization?

I have the following class:
public class Foo
{
public Nested Bar { get; } = new Nested(this);
public class Nested
{
public Nested(Foo foo)
{
foo.DoSomething();
}
}
private void DoSomething()
{
}
}
However, I get this compile error:
Keyword 'this' is not available in the current context
I can fix it by simply not using Auto-Property Initializer, and explicitly move it into a constructor instead:
public Nested Bar { get; }
public Foo()
{
this.Bar = new Nested(this);
}
Why is it so? Isn't Auto-Property Initializer actually translated into constructor code in IL?
Simply: you can't use this in initializers. The idea is to prevent an incomplete object from escaping - Nested(this) could do anything to your object, leading to very confusing and hard to understand bugs. Keep in mind that initializers execute before any constructor that you add. The same thing fails for field initializers too, in exactly the same way:
private Nested _field = new Nested(this);
Essentially, initializers are intended to perform simple initializations - fixing the 98% problem. Anything involving this is more complex, and you'll need to write your own constructor - and take the blame for any timing issues :)
Why is it so? Isn't Auto-Property Initializer actually translated into constructor code in IL?
The rules for automatically implemented property initializers are the same as those for field initializers, for the same reason. Note that property initializers are executed before base class constructor bodies, just like field initializers, so you're still in the context of a "somewhat uninitialized" object; more so than during a constructor body.
So you should imagine that the property is being converted into this:
private readonly Nested bar = new Nested(this); // Invalid
public Nested Bar
{
get { return bar; }
}
In short, this restriction is to stop you from getting yourself into trouble. If you need to refer to this when initializing a property, just do it manually in a constructor, as per your second example. (It's relatively rare in my experience.)

C# 6 auto-properties - read once or every time?

I follow a pattern when setting certain properties whereby I check to see if the corresponding field is empty, returning the field if not and setting it if so. I frequently use this for reading configuration settings, for example, so that the setting is read lazily and so that it is only read once. Here is an example:
private string DatabaseId
{
get
{
if (string.IsNullOrEmpty(databaseId))
{
databaseId = CloudConfigurationManager.GetSetting("database");
}
return databaseId;
}
}
I have started to use C# 6 autoproperty initialization as it really cleans up and makes my code more concise. I would like to do something like this:
private string DatabaseId { get; } = CloudConfigurationManager.GetSetting("database");
But I'm not sure how the compiler interprets it in this case. Will this have the same effect as my first block of code, setting the (automatically implemented) field once, and thereafter reading from the field? Or will this call the CloudConfigurationManager every time I get DatabaseId?
What you show:
private string DatabaseId { get; } = CloudConfigurationManager.GetSetting("database");
Is an "Auto-Property Initializer", keyword being "initializer", from MSDN Blogs: C# : The New and Improved C# 6.0:
The auto-property initializer allows assignment of properties directly within their declaration. For read-only properties, it takes care of all the ceremony required to ensure the property is immutable.
Initializers run once per instance (or once per type for static members). See C# Language Specification, 10.4.5 Variable initializers:
For instance fields, variable initializers correspond to assignment statements that are executed when an instance of the class is created.
So that code compiles to something like this:
public class ContainingClass
{
private readonly string _databaseId;
public string DatabaseId { get { return _databaseId; } }
public ContainingClass()
{
_databaseId = CloudConfigurationManager.GetSetting("database");
}
}
For static variables, this kind of looks the same:
private static string DatabaseId { get; } = CloudConfigurationManager.GetSetting("database");
Compiles to, more or less:
public class ContainingClass
{
private static readonly string _databaseId;
public static string DatabaseId { get { return _databaseId; } }
static ContainingClass()
{
_databaseId = CloudConfigurationManager.GetSetting("database");
}
}
Though not entirely, as when the type doesn't have a static constructor, "static field initializers are executed at an implementation-dependent time prior to the first use of a static field of that class".
C# 6.0 readonly auto property will create a field and invoke the initializer only once.
However, that is not equivalent to what you have there. In your code, CloudConfigurationManager.GetSetting is called only when someone first reads the DatabaseId property, whereas with a read-only auto-property it is called when the instance is constructed (or, for a static property, when the type is initialized), whether or not the property is ever read.
This difference may or may not matter; it depends. If the call is expensive, you can use Lazy<T>, which is roughly equivalent to what you have.
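A minimal sketch of the Lazy<T> variant follows. LoadSetting is a hypothetical stand-in for CloudConfigurationManager.GetSetting, and Repository is a made-up class name; the lambda is allowed in the field initializer because it only calls a static method.

```csharp
using System;

public class Repository
{
    // Hypothetical stand-in for CloudConfigurationManager.GetSetting.
    private static string LoadSetting(string key)
    {
        Console.WriteLine("Loading setting '" + key + "'");
        return "value-for-" + key;
    }

    // The factory runs on first access to Value, not at construction time.
    private readonly Lazy<string> _databaseId =
        new Lazy<string>(() => LoadSetting("database"));

    public string DatabaseId
    {
        get { return _databaseId.Value; }
    }
}

public static class Program
{
    public static void Main()
    {
        var repository = new Repository();
        Console.WriteLine("Constructed");
        Console.WriteLine(repository.DatabaseId);
        Console.WriteLine(repository.DatabaseId);
    }
}
```

The "Loading setting" message appears once, after "Constructed", and the second read of DatabaseId returns the cached value without calling LoadSetting again.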
It will set the value only once and after that just read it.
However there's a slight difference in the sense that you now no longer have a databaseId field. In your first example you basically check for id == null || id == "" to set the database string. That means that if you create a new instance with databaseId set to an empty string, the first example will still get the ID from the settings.
The second example however will see that empty string as a valid value and remain with it.
First code:
if(id == null || id == "") // Get ID from settings
Second code:
if(id == null) // Get ID from settings
An auto-property has automatically a backing field. In this case, this field will only be assignable from the constructor or from the auto-property initializer. Your new code is better than the first one. It will only make one call to CloudConfigurationManager.GetSetting("database");. In the first example, you have to make a check every time your property get is called.

Field initializer accessing 'this' reloaded

This question is an extension of Cristi Diaconescu's about the illegality of field initializers accessing this in C#.
This is illegal in C#:
class C
{
int i = 5;
double[] dd = new double[i]; //Compiler error: A field initializer cannot reference the non-static field, method, or property.
}
Ok, so the reasonable explanation to why this is illegal is given by, among others, Eric Lippert:
In short, the ability to access the receiver before the constructor body runs is a feature of marginal benefits that makes it easier to write buggy programs. The C# language designers therefore disabled it entirely. If you need to use the receiver then put that logic in the constructor body.
Also, the C# specifications are pretty straightforward (up to a point):
A variable initializer for an instance field cannot reference the instance being created. Thus, it is a compile-time error to reference this in a variable initializer, as it is a compile-time error for a variable initializer to reference any instance member through a simple-name.
So my question is: what does "through a simple-name" mean?
Is there some alternative mechanism where this would be legal? I am certain that almost every word in the specification is there for a very specific reason, so what is the reason of limiting the illegality of this particular code to references through simple names?
EDIT: I've not worded my question too well. I'm not asking for the definition of "simple-name", I am asking about the reason behind limiting the illegality to that particular scenario. If it is always illegal to reference any instance member in any which way, then why specify it so narrowly? And if its not, then what mechanism would be legal?
It isn't possible, in the general case, to determine whether an expression refers to the object being constructed, so prohibiting it and requiring compilers to diagnose it would require the impossible. Consider
partial class A {
public static A Instance = CreateInstance();
public int a = 3;
public int b = Instance.a;
}
It's possible, and as far as I know perfectly valid, even if it is a horrible idea, to create an object with FormatterServices.GetUninitializedObject(typeof(A)), set A.Instance to that, and then call the constructor. When b is initialised, the object reads its own a member.
partial class A {
public static A CreateInstance() {
Instance = (A)FormatterServices.GetUninitializedObject(typeof(A));
var constructor = typeof(A).GetConstructor(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic, null, Type.EmptyTypes, null);
var helperMethod = new DynamicMethod(string.Empty, typeof(void), new[] { typeof(A) }, typeof(A).Module, true);
var ilGenerator = helperMethod.GetILGenerator();
ilGenerator.Emit(OpCodes.Ldarg_0);
ilGenerator.Emit(OpCodes.Call, constructor);
ilGenerator.Emit(OpCodes.Ret);
var constructorInvoker = (Action<A>)helperMethod.CreateDelegate(typeof(Action<A>));
constructorInvoker(Instance);
return Instance;
}
}
static class Program {
static void Main() {
Console.WriteLine("A.Instance = (a={0}, b={1})", A.Instance.a, A.Instance.b);
}
}
You can only get compiler errors for what's detectable at compile time.
According to the documentation:
A simple-name consists of a single identifier.
I suppose they clarify this because this.i is equivalent to i within a class method, when no variable named i is in scope. They've already forbidden the use of this outside of an instance method:
class C
{
int i = 5;
double[] dd = new double[this.i];
//Compiler error: Keyword 'this' is not available in the current context.
}
If this language wasn't there, some might read this as meaning you could reference instance variables simply by omitting the keyword this.
The best alternative is to use a constructor:
class C
{
int i = 5;
double[] dd;
C()
{
dd = new double[i];
}
}
You can also do this:
class C
{
public int i = 5;
}
class D
{
double[] dd = new double[new C().i];
}
Thanks to the fact that the two members are in different classes, the order in which they are initialized is unambiguous.
You can always do really messed up stuff when unmanaged code comes into play. Consider this:
public class A
{
public int n = 42;
public int k = B.Foo();
public A()
{
}
}
public class B
{
public static unsafe int Foo()
{
//get a pointer to the newly created instance of A
//through some trickery.
//Possibly put some distinctive field value in `A` to make it easier to find
int i = 0;
int* p = &i;
//get p to point to n in the new instance of `A`
return *p;
}
}
I spent a bit of time trying to actually implement this (for kicks) but gave up after a bit. That said, you can get a pointer to the heap and then just start looking around for something that you can recognize as an instance of A and then grab the n value from it. It would be hard, but it is possible.
I think you are just misreading the last sentence. The spec flatly states that an instance field initializer cannot reference the instance being created. It is then simply citing examples. You cannot use this, and for the same reason you cannot use a "simple-name", because a simple-name access implicitly uses this. The spec is not narrowing the cases. It is simply calling out some specific constructions that are illegal. Another one would be using base to access a protected field from a base class.

How can I ensure this private readonly array is actually private and readonly?

I'm trying to figure out how I am able to successfully change a "readonly" array. The code below runs successfully, but I'm quite confused as to why the dereferencing of a private/readonly array is legal, as marked below:
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
MyClass myClass = new MyClass();
myClass.Time[5] = 5; // Why is this legal? How can I make it illegal?
}
}
public class MyClass
{
private readonly uint[] time;
public IList<uint> Time
{
get { return time; }
}
public MyClass()
{
time = new uint[7];
}
}
}
As I Note above, I would expect that Time[5] would be illegal due to the fact that public IList Time does not have a setter.
How can I change MyClass to ensure that it is not legal to do myClass.Time[5] ?
Note: I've clarified the intent of this question, I was unclear at the start that the intention is to make this ILLEGAL. And I want to understand why its legal in the first place as is.
As I Note above, I would expect that
Time[5] would be illegal due to the
fact that public IList Time does not
have a setter.
The absence of a setter means that you can't assign a NEW collection to the property, but it doesn't stop callers from changing the CONTENTS of the array that the backing field currently refers to.
Additionally, how can I create an
array in the constructor which is
read-only and unchangeable outside of
this class?
You can initialize a readonly field either at its declaration or in the constructor of the class, as per MSDN.
As for how to fix this, the following MSDN article discusses this exact issue and some ways to remedy it. I am not sure what your requirements are, but I would recommend looking into implementing a custom collection using ReadOnlyCollectionBase and passing that along, or you can use ReadOnlyCollection<T>. The link to ReadOnlyCollectionBase provides an example of an implementation.
readonly means that the field itself cannot be changed (that is, you cannot say "this.time = new uint[10]" outside the constructor). Arrays are mutable objects, so anywhere you have a reference to the array, the possessor of that reference can change the values stored in that array.
readonly fields are writable only in the constructor (this includes field initializers)
Two options for you:
Do a shallow copy of the array in the Time property, so callers can't modify your copy of the array
Use ReadOnlyCollection to prevent modifications at all
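The second option might look like the sketch below. It assumes .NET 4.5 or later, where ReadOnlyCollection<T> also implements IReadOnlyList<T>; exposing that interface is my own choice here, not something the question requires. The wrapper is created in the constructor because a field initializer may not reference the instance field time.

```csharp
using System;
using System.Collections.Generic;

public class MyClass
{
    private readonly uint[] time = new uint[7];
    private readonly IReadOnlyList<uint> readOnlyTime;

    public MyClass()
    {
        // Array.AsReadOnly wraps the array without copying it.
        readOnlyTime = Array.AsReadOnly(time);
    }

    public IReadOnlyList<uint> Time
    {
        get { return readOnlyTime; }
    }
}

public static class Program
{
    public static void Main()
    {
        MyClass myClass = new MyClass();
        // myClass.Time[5] = 5;   // no longer compiles: the indexer has no setter

        // A determined caller can still cast, but the wrapper rejects the write.
        IList<uint> sneaky = (IList<uint>)myClass.Time;
        try
        {
            sneaky[5] = 5;
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("Write rejected");
        }
    }
}
```

Note that the wrapper is a view, not a snapshot: code inside MyClass can still modify time, and callers will see those changes through Time.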
The Time property has no setter, so you will not be able to do something like this:
static void Main(string[] args)
{
MyClass myClass = new MyClass();
myClass.Time = new List<uint>();
}
but you are still able to use the indexer of the returned list, so Time[5] is legal.
Additionally, how can I create an array in the constructor which is read-only and unchangeable outside of this class?
readonly fields can be assigned only in a constructor or in a field initializer; initialization right at the declaration is equivalent to initialization in the constructor.

Does C# resolve dependencies among static data members automatically?

If one static data member depends on another static data member, does C#/.NET guarantee the depended static member is initialized before the dependent member?
For example, we have one class like:
class Foo
{
public static string a = "abc";
public static string b = Foo.a + "def";
}
When Foo.b is accessed, is it always "abcdef" or can be "def"?
If this is not guaranteed, is there any better way to make sure depended member initialized first?
As said before, static field initialization is deterministic and follows the textual declaration order.
Take this, for example:
class Foo
{
public static string b = a + "def";
public static string a = "abc";
}
Foo.b will always result in "def".
For that matter, when there is a dependency between static fields, it is better to use a static constructor:
class Foo
{
public static string b;
public static string a;
static Foo()
{
a = "abc";
b = a + "def";
}
}
That way, you explicitly express your concern about the initialization order; or dependency for that matter (even if the compiler won't help if you accidentally swap the initialization statements.) The above will have the expected values stored in a and b (respectively "abc" and "abcdef").
However, things might get twisty (and implementation specific) for the initialization of static fields defined in multiple classes. The section 10.4.5.1 Static field initialization of the language specification talks about it some more.
It will always show "abcdef", because initialization goes top-down in source order, today just like before.
Static field initializers run before the first use of a static field of the class that holds them; if the class has a static constructor, they run immediately before it.
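To see the textual-order rule in action within a single class, here is a small runnable sketch. The class names TextualOrder and DeclaredInOrder are made up; they hold the two declaration orders discussed in this thread.

```csharp
using System;

public static class TextualOrder
{
    // b is initialized first, while a still holds its default value (null).
    public static string b = a + "def";
    public static string a = "abc";
}

public static class DeclaredInOrder
{
    public static string a = "abc";
    public static string b = a + "def";
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(TextualOrder.b);     // prints "def"
        Console.WriteLine(DeclaredInOrder.b);  // prints "abcdef"
    }
}
```

Concatenating null with a string yields the other string unchanged, which is why the first case gives "def" rather than throwing.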
