What is the behind-the-scenes difference between int? and int data types?
Is int? somehow a reference type?
? wraps the value type (T) in a Nullable<T> struct:
http://msdn.microsoft.com/en-us/library/b3h38hb0.aspx
In addition to "int?" being a shortcut for "Nullable<int>", there was also infrastructure put into the CLR in order to implicitly and silently convert between "int?" and "int". This also means that any boxing operation will implicitly box the actual value (i.e., it's impossible to box a Nullable<T> as a Nullable<T>; boxing always results in either the boxed value of T or a null reference).
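A short sketch of that boxing behavior (variable names are just illustrative):
int? some = 42;
int? none = null;

object boxedSome = some;   // boxes the underlying int 42, not a Nullable<int>
object boxedNone = none;   // boxing an empty Nullable<int> produces a null reference

Console.WriteLine(boxedSome.GetType());   // System.Int32
Console.WriteLine(boxedNone == null);     // True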
I ran into many of these issues when trying to create a Nullable<T> when you don't know T at compile time (you only know it at runtime). http://bradwilson.typepad.com/blog/2008/07/creating-nullab.html
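As a rough illustration of that runtime-only scenario (a sketch of the general technique, not the exact code from the blog post), you end up going through reflection, and the boxing rule above bites you again:
Type underlying = typeof(int);                                    // imagine this is only known at runtime
Type nullableType = typeof(Nullable<>).MakeGenericType(underlying);

// Activator.CreateInstance returns the boxed default value, and boxing an empty
// Nullable<int> yields null -- one of the surprises mentioned above.
object empty = Activator.CreateInstance(nullableType);
Console.WriteLine(empty == null);                                 // True

// To get a non-empty value, go through the Nullable<T>(T value) constructor:
object wrapped = Activator.CreateInstance(nullableType, 42);
Console.WriteLine(wrapped);                                       // 42 (boxed as a plain int)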
For one of the better "behind the scenes" discussions about nullable types, you should look at CLR via C# by Jeffrey Richter.
The whole of Chapter 18 is devoted to discussing nullable types in detail. The book is also excellent for many other areas of .NET CLR internals.
I learned that you must explicitly cast a nullable value type to a non-nullable value type, as the following example shows:
int? n = null;
//int m1 = n; // Doesn't compile
int n2 = (int)n; // Compiles, but throws an exception if n is null
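For completeness, the usual null-safe alternatives to the hard cast look like this (a small sketch):
int? n = null;

int a = n ?? -1;                  // null-coalescing: -1 when n is null
int b = n.GetValueOrDefault();    // 0 when n is null
int c = n.GetValueOrDefault(7);   // 7 when n is null

if (n.HasValue)
{
    int d = n.Value;              // only read Value after checking HasValue
}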
MS Document
This isn't my first question about nullable reference types, as I've been experimenting with them for a few months. But the more I experiment, the more confused I get and the less value I see in the feature.
Take this code for example:
string? nullableString = default(string?);
string nonNullableString = default(string);
int? nullableInt = default(int?);
int nonNullableInt = default(int);
Executing that gives:
nullableString => null
nonNullableString => null
nullableInt => null
nonNullableInt => 0
The default value of a (non-nullable) integer has always been 0, but
to me it doesn't make sense that a non-nullable string's default value is null.
Why this choice? It goes against the non-nullable principles we've always been used to.
I think a non-nullable string's default value should have been String.Empty.
I mean, somewhere deep down in the implementation of C# it must be specified that 0 is the default value of an int. We could also have chosen 1 or 2, but no, the consensus is 0. So can't we just specify that the default value of a string is String.Empty when the nullable reference types feature is activated? Moreover, it seems Microsoft would like to activate it by default in .NET 5 projects in the near future, so this feature would become the normal behavior.
Now the same example with an object:
Foo? nullableFoo = default(Foo?);
Foo nonNullableFoo = default(Foo);
This gives:
nullableFoo => null
nonNullableFoo => null
Again this doesn't make sense to me; in my opinion the default value of a Foo should be new Foo() (or a compile error if no parameterless constructor is available).
Why default to null for an object that isn't supposed to be null?
Now, extending this question even further:
string nonNullableString = null;
int nonNullableInt = null;
The compiler gives a warning for the 1st line which could be transformed into an error with a simple configuration in our .csproj file: <WarningsAsErrors>CS8600</WarningsAsErrors>.
And it gives a compilation error for the 2nd line as expected.
So the behavior between non-nullable value types and non-nullable reference types isn't the same but this is acceptable since I can override it.
However when doing that:
string nonNullableString = null!;
int nonNullableInt = null!;
The compiler is completely fine with the 1st line, no warning at all.
I discovered null! recently when experimenting with the nullable reference types feature, and I was expecting the compiler to be fine with the 2nd line too, but this isn't the case. Now I'm just really confused as to why Microsoft decided to implement different behaviors.
Considering it doesn't protect at all against having null in a non-nullable reference type variable, it seems this new feature doesn't change anything and doesn't improve developers' lives at all (as opposed to non-nullable value types, which could NOT be null and therefore don't need to be null-checked).
So in the end it seems the only value added is in terms of signatures. Developers can now be explicit about whether a method's return value can be null, or whether a property can be null (for example, in a C# representation of a database table where NULL is an allowed value in a column).
Besides that, I don't see how I can efficiently use this new feature. Could you please give me other useful examples of how you use nullable reference types?
I would really like to make good use of this feature to improve my life as a developer, but I really don't see how...
Thank you
You are very confused about how programming language design works.
Default values
The default value of a (non-nullable) integer has always been 0, but to me it doesn't make sense that a non-nullable string's default value is null. Why this choice? It goes against the non-nullable principles we've always been used to. I think a non-nullable string's default value should have been String.Empty.
Default values for variables are a basic feature of the language that is in C# since the very beginning. The specification defines the default values:
For a variable of a value_type, the default value is the same as the value computed by the value_type's default constructor ([see] Default constructors).
For a variable of a reference_type, the default value is null.
This makes sense from a practical standpoint, as one of the basic usages of defaults is when declaring a new array of values of a given type. Thanks to this definition, the runtime can just zero all the bits in the allocated array - default constructors for value types are always all-zero values in all fields and null is represented as an all-zero reference. That's literally the next line in the spec:
Initialization to default values is typically done by having the memory manager or garbage collector initialize memory to all-bits-zero before it is allocated for use. For this reason, it is convenient to use all-bits-zero to represent the null reference.
Now, the Nullable Reference Types (NRT) feature was released last year with C# 8. The choice here is not "let's implement default values to be null in spite of NRT" but rather "let's not waste time and resources trying to completely rework how the default keyword works because we're introducing NRTs". NRTs are annotations for programmers; by design they have zero impact on the runtime.
I would argue that not being able to specify default values for reference types is a similar case to not being able to define a parameterless constructor on a value type - the runtime needs a fast all-zero default, and null is a reasonable default for reference types. Not all types will have a reasonable default value - what is a reasonable default for a TcpClient?
If you want your own custom default, implement a static Default method or property and document it so that developers can use it as the default for that type. No need to change the fundamentals of the language.
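A minimal sketch of that convention, using a made-up Money type (the type and its members are purely illustrative):
public sealed class Money
{
    // Documented, explicit default instead of relying on default(Money), which is null.
    public static Money Default { get; } = new Money(0m, "USD");

    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }
}

// Callers opt in explicitly:
Money price = Money.Default;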
I mean, somewhere deep down in the implementation of C# it must be specified that 0 is the default value of an int. We could also have chosen 1 or 2, but no, the consensus is 0. So can't we just specify that the default value of a string is String.Empty when the nullable reference types feature is activated?
As I said, the deep down is that zeroing a range of memory is blazingly fast and convenient. There is no runtime component responsible for checking what the default of a given type is and repeating that value in an array when you create a new one, since that would be horribly inefficient.
Your proposal would basically mean that the runtime would have to somehow inspect the nullability metadata of strings at runtime and treat an all-zero non-nullable string value as an empty string. This would be a very involved change digging deep into the runtime just for this one special case of an empty string. It's much more cost-efficient to just use a static analyzer to warn you when you're assigning null instead of a sensible default to a non-nullable string. Fortunately we have such an analyzer, namely the NRT feature, which consistently refuses to compile my classes that contain definitions like this:
string Foo { get; set; }
by issuing a warning and forcing me to change that to:
string Foo { get; set; } = "";
(I recommend turning on Treat Warnings As Errors by the way, but it's a matter of taste.)
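For reference, the corresponding switches in an SDK-style .csproj look roughly like this (the nullable value for WarningsAsErrors is available in newer SDKs; adjust to taste):
<PropertyGroup>
  <Nullable>enable</Nullable>
  <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
  <!-- or, more selectively, promote only the nullability warnings: -->
  <!-- <WarningsAsErrors>nullable</WarningsAsErrors> -->
</PropertyGroup>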
Again this doesn't make sense to me; in my opinion the default value of a Foo should be new Foo() (or a compile error if no parameterless constructor is available). Why default to null for an object that isn't supposed to be null?
This would, among other things, render you unable to declare an array of a reference type without a default constructor. Most basic collections use an array as the underlying storage, including List<T>. And it would require you to allocate N default instances of a type whenever you make an array of size N, which is, again, horribly inefficient. Also, constructors can have side effects. I'm not going to ponder further how many things this would break; suffice to say it's hardly an easy change to make. Considering how complicated NRT was to implement anyway (the NullableReferenceTypesTests.cs file in the Roslyn repo has ~130,000 lines of code alone), the cost-efficiency of introducing such a change is... not great.
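A tiny illustration of why the all-zero default matters here (Person is just a placeholder class with no parameterless constructor):
var people = new Person[1_000_000];    // fine and fast: the runtime only zeroes the memory block
Console.WriteLine(people[0] == null);  // True -- no Person constructor ever ran

class Person { public Person(string name) { } }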
The bang operator (!) and Nullable Value Types
The compiler is completely fine with the 1st line, no warning at all. I discovered null! recently when experimenting with the nullable reference types feature, and I was expecting the compiler to be fine with the 2nd line too, but this isn't the case. Now I'm just really confused as to why Microsoft decided to implement different behaviors.
The null value is valid only for reference types and nullable value types. Nullable types are, again, defined in the spec:
A nullable type can represent all values of its underlying type plus an additional null value. A nullable type is written T?, where T is the underlying type. This syntax is shorthand for System.Nullable<T>, and the two forms can be used interchangeably. (...) An instance of a nullable type T? has two public read-only properties:
A HasValue property of type bool
A Value property of type T
An instance for which HasValue is true is said to be non-null. A non-null instance contains a known value and Value returns that value.
The reason for which you can't assign a null to int is rather obvious - int is a value type that takes 32-bits and represents an integer. The null value is a special reference value that is machine-word sized and represents a location in memory. Assigning null to int has no sensible semantics. Nullable<T> exists specifically for the purpose of allowing null assignments to value types to represent "no value" scenarios. But note that doing
int? x = null;
is purely syntactic sugar. The all-zero value of Nullable<T> is the "no value" scenario, since it means that HasValue is false. There is no magic null value being assigned anywhere, it's the same as saying = default -- it just creates a new all-zero struct of the given type T and assigns it.
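In other words, a small sketch of what that sugar amounts to:
int? a = null;
int? b = default;      // the same thing: an all-zero Nullable<int>
int? c = new int?();   // also the same

Console.WriteLine(a.HasValue);        // False
Console.WriteLine(a == b && b == c);  // True -- lifted equality: empty equals empty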
So again, the answer is -- no one deliberately tried to design this to work incompatibly with NRTs. Nullable value types are a much more fundamental feature of the language that has worked like this since its introduction in C# 2. And the way you propose it to work doesn't translate to a sensible implementation - would you want all value types to be nullable? Then all of them would have to have the HasValue field, which takes an additional byte and possibly screws up padding (I think a language that represents ints as a 40-bit type and not 32 would be considered heretical :) ).
The bang operator is used specifically to tell the compiler "I know that I'm dereferencing a nullable/assigning null to a non-nullable, but I'm smarter than you and I know for a fact this is not going to break anything". It disables static analysis warnings. But it does not magically expand the underlying type to accommodate for a null value.
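A quick sketch of what that means in practice:
string s = null!;              // no warning: the ! only silences the static analysis
Console.WriteLine(s.Length);   // still throws NullReferenceException at runtime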
Summary
Considering it doesn't protect at all against having null in a non-nullable reference type variable, it seems this new feature doesn't change anything and doesn't improve developers' lives at all (as opposed to non-nullable value types, which could NOT be null and therefore don't need to be null-checked).
So in the end it seems the only value added is in terms of signatures. Developers can now be explicit about whether a method's return value can be null, or whether a property can be null (for example, in a C# representation of a database table where NULL is an allowed value in a column).
From the official docs on NRTs:
This new feature provides significant benefits over the handling of reference variables in earlier versions of C# where the design intent can't be determined from the variable declaration. The compiler didn't provide safety against null reference exceptions for reference types (...) These warnings are emitted at compile time. The compiler doesn't add any null checks or other runtime constructs in a nullable context. At runtime, a nullable reference and a non-nullable reference are equivalent.
So you're right in that "the only value added is just in terms of signatures" and static analysis, which is the reason we have signatures in the first place. And that is not an improvement on developers' lives? Note that your line
string nonNullableString = default(string);
gives off a warning. If you did not ignore it (or even better, had Treat Warnings As Errors on) you'd get value - the compiler found a bug in your code for you.
Does it protect you from assigning null to a non-null reference type at runtime? No. Does it improve developers' lives? Thousand times yes. The power of the feature comes from warnings and nullable analysis done at compile time. If you ignore warnings issued by NRT, you're doing it at your own peril. The fact that you can ignore the compiler's helpful hand does not make it useless. After all you can just as well put your entire code in an unsafe context and program in C, doesn't mean that C# is useless since you can circumvent its safety guarantees.
Again this doesn't make sense to me; in my opinion the default value of a Foo should be new Foo() (or a compile error if no parameterless constructor is available)
That's an opinion, but: that isn't how it is implemented. default means null for reference-types, even if it is invalid according to nullability rules. The compiler spots this and warns you about it on the line Foo nonNullableFoo = default(Foo);:
Warning CS8600 Converting null literal or possible null value to non-nullable type.
As for string nonNullableString = null!; and
The compiler is completely fine with the 1st line, no warning at all.
You told it to ignore it; that's what the ! means. If you tell the compiler to not complain about something, it isn't valid to complain that it didn't complain.
So at the end it seems the only value added is just in terms of signatures.
No, it has lots more validity, but if you ignore the warnings that it does raise (CS8600 above), and if you suppress the other things that it does for you (!): yes, it will be less useful. So... don't do that?
I grabbed the source code of the Nullable<T> class from https://referencesource.microsoft.com/, put it into a file, and renamed the type to NullableZZ (I also put the source of NonVersionableAttribute into a separate file).
When I tried to build the following code:
static void Main(string[] args)
{
    NullableZZ<int> n1 = 100;
    NullableZZ<int> n2 = null;
}
I got this error:
Error CS0037 Cannot convert null to 'NullableZZ<int>' because it is a non-nullable value type ConsoleApp2 C:\Users\Roman2\source\repos\ConsoleApp2\ConsoleApp2\Program.cs
Why doesn't the C# compiler want to compile it? Does it have some "tricks" for compiling its "own" version of Nullable<T>?
Why doesn't the C# compiler want to compile it?
Because it doesn't have any specific knowledge of your class, but it does have specific knowledge of Nullable<T>.
Does it have some "tricks" for compiling its "own" version of Nullable<T>?
Yes. The null literal is convertible to Nullable<T> for any non-nullable value type T, and also to any reference type. It is not convertible to NullableZZ<int>. Also, int? is effectively shorthand for Nullable<int> - it has special treatment.
Basically look through the specification (e.g. the ECMA C# 5 spec) and observe everywhere that it talks about Nullable<T>. You'll find lots of places that it's mentioned.
Nullable value types have support in the framework, the language and the CLR:
The Nullable<T> type has to exist in the framework
The language has support as described in this answer
The CLR has support in terms of validating generic constraints and also boxing (where the null value of a nullable value type boxes to a null reference)
In C# there are 2 ways of casting:
foo as int
(int)foo
Why does the first line not compile and the second does?
Console.Write(49 as char);
Console.Write((char)49);
From MSDN:
You can use the as operator to perform certain types of conversions between compatible reference types or nullable types.
char is neither a reference type nor a nullable type. The as operator can't set the result to null when the conversion fails, because char isn't nullable. It would work with char?, though, even if it's useless in this situation.
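To illustrate the char? variant with a value whose conversion can actually fail at runtime, here is a small sketch using boxed values:
object boxedChar = 'x';
object boxedInt = 49;

char? c1 = boxedChar as char?;   // 'x'  -- unboxing a boxed char succeeds
char? c2 = boxedInt as char?;    // null -- the boxed value isn't a char, and char? can represent the failure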
When I write
Nullable<Nullable<DateTime>> test = null;
I get a compilation error:
The type 'System.DateTime?' must be a non-nullable value type in order to use it as a parameter 'T' in the generic type or method 'System.Nullable<T>'
But Nullable<T> is a struct so it's supposed to be non-nullable.
So I tried to create this struct:
public struct Foo<T> where T : struct
{
    private T value;

    public Foo(T value)
    {
        this.value = value;
    }

    public static explicit operator Foo<T>(T? value)
    {
        return new Foo<T>(value.Value);
    }

    public static implicit operator T?(Foo<T> value)
    {
        return new Nullable<T>(value.value);
    }
}
Now when I write
Nullable<Foo<DateTime>> test1 = null;
Foo<Nullable<DateTime>> test2 = null;
Foo<DateTime> test3 = null;
The first line is OK, but for the second and third lines I get the following two compilation errors:
The type 'System.DateTime?' must be a non-nullable value type in order to use it as a parameter 'T' in the generic type or method 'MyProject.Foo<T>' (second line only)
and
Cannot convert null to 'MyProject.Foo<System.DateTime?>' because it is a non-nullable value type
Foo<Nullable<DateTime>> test = new Foo<DateTime?>();
doesn't work either, even though Nullable<DateTime> is a struct.
Conceptually, I can understand why Nullable<T> requires a non-nullable T: it avoids having stuff like DateTime??????????. However, I can still have List<List<List<List<List<DateTime>>>>>...
So why this limitation, and why can't I reproduce this behavior in Foo<T>? Is this limitation enforced by the compiler, or is it intrinsic to the Nullable<T> code?
I read this question, but it just says that it is not possible; none of the answers say fundamentally why it's not possible.
But Nullable<T> is a struct so it's supposed to be non-nullable.
Nullable<T> is indeed a struct, but the precise meaning of the generic struct constraint as stated in the docs is:
The type argument must be a value type. Any value type except Nullable<T> can be specified. See Using Nullable Types (C# Programming Guide) for more information.
For the same reason, your line
Foo<Nullable<DateTime>> test2 = null;
results in the compiler error you are seeing, because your generic struct constraint restricts your generic T argument in such a way that Nullable<DateTime> must not be specified as an actual argument.
A rationale for this may have been to make calls such as
Nullable<Nullable<DateTime>> test = null;
less ambiguous: Does that mean you want to set test.HasValue to false, or do you actually want to set test.HasValue to true and test.Value.HasValue to false? With the given restriction to non-nullable type arguments, this confusion does not occur.
Lastly, the null assignment works with Nullable<T> because - as implied by the selected answers and their comments to this SO question and this SO question - the Nullable<T> type is supported by some compiler magic.
The error is saying that the type parameter of Nullable<T> should be non-nullable.
What you're doing is creating a Nullable type that has a nullable type parameter, which is not allowed:
Nullable<Nullable<DateTime>>
is the same as
Nullable<DateTime?>
Which is quite pointless. Why would you want a nullable type for a type that is already nullable?
Nullable<T> is just a type that was introduced in .NET 2.0 so that you are able to use 'nullable value types'. For instance, if you have a method which has an optional DateTime parameter, then instead of passing a 'magic value' like DateTime.MinValue, you can now pass null to that method if you do not want to use that parameter.
In generic classes, the where T : struct constraint means that the type T cannot be null.
However, nullable types are designed to add nullability to structs. Technically they are structs, but they behave as if they may contain a null value. Because of this ambiguity, the use of nullables is not allowed with the where T : struct constraint - see Constraints on Type Parameters.
Nullable types are not just generic structs with special C# compiler support. Nullable types are supported by the CLR itself (see CLR via C# by Jeffrey Richter), and it looks like this special CLR support is what makes them non-recursive.
The CLR supports special boxing/unboxing rules: after int? i = 1; object o = i; the variable o contains a boxed int value, not a Nullable<int> value. In the case of nested nullables, should o = (int??)1; contain an int or an int? value?
The CLR has special support for calling GetType and interface members - it calls the methods of the underlying type. This actually leads to the situation where calling GetType() on a Nullable<T> that has no value throws a NullReferenceException, because the instance is boxed to a null reference first.
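A small sketch of that GetType() behavior (GetType() is defined on object, so the nullable is boxed before the call, and the empty case boxes to null):
int? withValue = 5;
int? withoutValue = null;

Console.WriteLine(withValue.GetType());      // System.Int32, not Nullable<Int32>
Console.WriteLine(withoutValue.GetType());   // throws NullReferenceException: it boxed to null first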
As for C#, there are a lot of features in C# that are hard-coded for nullable types.
Based on the article Nullable Types (C# Programming Guide), the primary goal of introducing nullable types is to add null support for types that do not support nulls. Logically, since DateTime? already supports nulls, it shouldn't be allowed to be "more" nullable.
This document also plainly states that
Nested nullable types are not allowed. The following line will not compile: Nullable<Nullable<int>> n;
Special C# features of nullable types:
C# has special ?? operator. Should (int???)null ?? (int)1 resolve to (int??)1 or to (int)1 value?
Nullables have the special System.Nullable<T>.GetValueOrDefault method. What should it return for nested nullables?
Special processing for ? == null and ? != null operators. If the Nullable<Nullable<T>> contains Nullable<T> value, but this value is null, what should HasValue property return? What should be the result of comparison with null?
Special implicit conversions. Should int?? i = 10 be implicitly convertible?
Explicit conversions. Should int i = (int??)10; be supported?
Special support for the bool? type (see Using Nullable Types). E.g. (bool?)null | (bool?)true == true. A short sketch of these lifted behaviors follows below.
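A short sketch of a couple of these lifted behaviors:
bool? t = true;
bool? n = null;

Console.WriteLine(n | t);            // True  -- three-valued logic: null OR true is true
Console.WriteLine(n & t);            // prints an empty line: null AND true is null
Console.WriteLine(n == null);        // True  -- lifted equality against the null literal
Console.WriteLine((int?)null ?? 1);  // 1     -- ?? falls back when the nullable is empty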
So, should the CLR support a recursive GetType() call? Should it remove all the Nullable wrappers when boxing a value? If it should do so for one-level values, why not for all other levels as well? Too many options to consider, too much recursive processing.
The easiest solution is to make Nullable<Nullable<T>> non-compilable.
According to the documentation of the as operator, as "is used to perform certain types of conversions between compatible reference types". Since Nullable<T> is actually a value type, I would expect as not to work with it. However, this code compiles and runs:
object o = 7;
int i = o as int? ?? -1;
Console.WriteLine(i); // output: 7
Is this correct behavior? Is the documentation for as wrong? Am I missing something?
Is this correct behavior?
Yes.
Is the documentation for as wrong?
Yes. I have informed the documentation manager. Thanks for bringing this to my attention, and apologies for the error. Obviously no one remembered to update this page when nullable types were added to the language in C# 2.0.
Am I missing something?
You might consider reading the actual C# specification rather than the MSDN documentation; it is more definitive.
I read:
Note that the as operator only performs reference conversions and boxing conversions. The as operator cannot perform other conversions, such as user-defined conversions, which should instead be performed by using cast expressions.
And boxing conversions.....
Just a guess, but I'd say o already holds a boxed integer, and the as operator then unboxes it into a nullable.
From the documentation about the as keyword:
It is equivalent to the following expression except that expression is evaluated only one time.
expression is type ? (type)expression : (type)null
The reference for is use also states it works with reference types, however, you can also do stuff like this:
int? temp = null;
if (temp is int?)
{
// Do something
}
I'm guessing it is just an inaccuracy in the reference documentation, in that the type must be nullable (i.e. a nullable type or a reference type) instead of just a reference type.
Apparently the MSDN documentation on the as operator needs to be updated.
object o = 7;
int i = o as int ?? -1;
Console.WriteLine(i);
If you try the following code where we use the as operator with the value type int, you get the appropriate compiler error message that
The as operator must be used with a reference type or nullable type ('int' is a non-nullable value type)
There is an update though on the link in Community Content section that quotes:
The as operator must be used with a reference type or nullable type.
You're applying the 'as' to object, which is a reference type. It could be null, in which case the CLR has special support for unboxing the null reference to a nullable value type. This special unboxing is not supported for any other value type, so even though Nullable<T> is a value type, it does have certain special privileges.
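A short sketch of that special case (illustrative only):
object o = null;

int? viaAs = o as int?;     // null: unboxing a null reference into int? just means "no value"
int? viaCast = (int?)o;     // also null, for the same reason

// int direct = (int)o;     // compiles, but would throw NullReferenceException at runtime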