Just curious.
If you go:
string myString;
Its value is null.
But if you go:
int myInt;
What is the value of this variable in C#?
Thanks
David
Firstly, note that this is only applicable for fields, not local variables - those can't be read until they've been assigned, at least within C#. In fact the CLR initializes stack frames to 0 if you have an appropriate flag set - which I believe it is by default. It's rarely observable though - you have to go through some grotty hacks.
The default value of int is 0 - and for any type, it's essentially the value represented by a bit pattern full of zeroes. For a value type this is the equivalent of calling the parameterless constructor, and for a reference type this is null.
Basically the CLR wipes the memory clean with zeroes.
This is also the value given by default(SomeType) for any type.
default of int is 0
The default value for int is 0.
See here for the full list of default values per type: http://msdn.microsoft.com/en-us/library/83fhsxwc.aspx
Here is a table of default values for value types in C#:
http://msdn.microsoft.com/en-us/library/83fhsxwc.aspx
Reference types default value is usually null.
String is a reference type. Int is a value type. Reference types are simply a pointer on the stack directed at the heap, which may or may not contain a value. A value type is just the value on the stack, but it must always be set to something.
The value for an unitialized variable of type T is always default(T). For all reference types this is null, and for the value types see the link that #Blorgbeard posted (or write some code to check it).
Related
This isn't my first question about nullable reference types as it's been few months I'm experiencing with it. But the more I'm experiencing it, the more I'm confused and the less I see the value added by that feature.
Take this code for example
string? nullableString = default(string?);
string nonNullableString = default(string);
int? nullableInt = default(int?);
int nonNullableInt = default(int);
Executing that gives:
nullableString => null
nonNullableString => null
nullableInt => null
nonNullableInt => 0
The default value of an (non-nullable) integer has always been 0 but
to me it doesn't make sense a non-nullable string's default value is null.
Why this choice? This is opposed to the non-nullable principles we've always been used to.
I think the default non-nullable string's default value should have been String.Empty.
I mean somewhere deep down in the implementation of C# it must be specified that 0 is the default value of an int. We also could have chosen 1 or 2 but no, the consensus is 0. So can't we just specify the default value of a string is String.Empty when the Nullable reference type feature is activated? Moreover it seems Microsoft would like to activate it by default with .NET 5 projects in a near future so this feature would become the normal behavior.
Now same example with an object:
Foo? nullableFoo = default(Foo?);
Foo nonNullableFoo = default(Foo);
This gives:
nullableFoo => null
nonNullableFoo => null
Again this doesn't make sense to me, in my opinion the default value of a Foo should be new Foo() (or gives a compile error if no parameterless constructor is available).
Why by default setting to null an object that isn't supposed to be null?
Now extending this question even more
string nonNullableString = null;
int nonNullableInt = null;
The compiler gives a warning for the 1st line which could be transformed into an error with a simple configuration in our .csproj file: <WarningsAsErrors>CS8600</WarningsAsErrors>.
And it gives a compilation error for the 2nd line as expected.
So the behavior between non-nullable value types and non-nullable reference types isn't the same but this is acceptable since I can override it.
However when doing that:
string nonNullableString = null!;
int nonNullableInt = null!;
The compiler is completely fine with the 1st line, no warning at all.
I discovered null! recently when experiencing with the nullable reference type feature and I was expecting the compiler to be fine for the 2nd line too but this isn't the case. Now I'm just really confused as for why Microsoft decided to implement different behaviors.
Considering it doesn't protect at all against having null in a non-nullable reference type variable, it seems this new feature doesn't change anything and doesn't improve developers' lives at all (as opposed to non-nullable value types which could NOT be null and therefore don't need to be null-checked)
So at the end it seems the only value added is just in terms of signatures. Developers can now be explicit whether or not a method's return value could be null or not or if a property could be null or not (for example in a C# representation of a database table where NULL is an allowed value in a column).
Beside that I don't see how can I efficiently use this new feature, could you please give me other useful examples on how you use nullable reference types?
I really would like to make good use of this feature to improve my developer's life but I really don't see how...
Thank you
You are very confused by how programming language design works.
Default values
The default value of an (non-nullable) integer has always been 0 but to me it doesn't make sense a non-nullable string's default value is null. Why this choice? This is completely against non-nullable principles we've always been used to. I think the default non-nullable string's default value should have been String.Empty.
Default values for variables are a basic feature of the language that is in C# since the very beginning. The specification defines the default values:
For a variable of a value_type, the default value is the same as the value computed by the value_type's default constructor ([see] Default constructors).
For a variable of a reference_type, the default value is null.
This makes sense from a practical standpoint, as one of the basic usages of defaults is when declaring a new array of values of a given type. Thanks to this definition, the runtime can just zero all the bits in the allocated array - default constructors for value types are always all-zero values in all fields and null is represented as an all-zero reference. That's literally the next line in the spec:
Initialization to default values is typically done by having the memory manager or garbage collector initialize memory to all-bits-zero before it is allocated for use. For this reason, it is convenient to use all-bits-zero to represent the null reference.
Now Nullable Reference Types feature (NRT) was released last year with C#8. The choice here is not "let's implement default values to be null in spite of NRT" but rather "let's not waste time and resources trying to completely rework how the default keyword works because we're introducing NRTs". NRTs are annotations for programmers, by design they have zero impact on the runtime.
I would argue that not being able to specify default values for reference types is a similar case as for not being able to define a parameterless constructor on a value type - runtime needs a fast all-zero default and null values are a reasonable default for reference types. Not all types will have a reasonable default value - what is a reasonable default for a TcpClient?
If you want your own custom default, implement a static Default method or property and document it so that the developers can use that as a default for that type. No need to change the fundamentals of the language.
I mean somewhere deep down in the implementation of C# it must be specified that 0 is the default value of an int. We also could have chosen 1 or 2 but no, the consensus is 0. So can't we just specify the default value of a string is String.Empty when the Nullable reference type feature is activated?
As I said, the deep down is that zeroing a range of memory is blazingly fast and convenient. There is no runtime component responsible for checking what the default of a given type is and repeating that value in an array when you create a new one, since that would be horribly inefficient.
Your proposal would basically mean that the runtime would have to somehow inspect the nullability metadata of strings at runtime and treat an all-zero non-nullable string value as an empty string. This would be a very involved change digging deep into the runtime just for this one special case of an empty string. It's much more cost-efficient to just use a static analyzer to warn you when you're assigning null instead of a sensible default to a non-nullable string. Fortunately we have such analyzer, namely the NRT feature, which consistently refuses to compile my classes that contain definitions like this:
string Foo { get; set; }
by issuing a warning and forcing me to change that to:
string Foo { get; set; } = "";
(I recommend turning on Treat Warnings As Errors by the way, but it's a matter of taste.)
Again this doesn't make sense to me, in my opinion the default value of a Foo should be new Foo() (or gives a compile error if no parameterless constructor is available). Why by default setting to null an object that isn't supposed to be null?
This would, among other things, render you unable to declare an array of a reference type without a default constructor. Most basic collections use an array as the underlying storage, including List<T>. And it would require you to allocate N default instances of a type whenever you make an array of size N, which is again, horribly inefficient. Also the constructor can have side effects. I'm not going to further ponder how many things this would break, suffice to say it's hardly an easy change to make. Considering how complicated NRT was to implement anyway (the NullableReferenceTypesTests.cs file in the Roslyn repo has ~130,000 lines of code alone), the cost-efficiency of introducing such a change is... not great.
The bang operator (!) and Nullable Value Types
The compiler is completely fine with the 1st line, no warning at all. I discovered null! recently when experiencing with the nullable reference type feature and I was expecting the compiler to be fine for the 2nd line too but this isn't the case. Now I'm just really confused as for why Microsoft decided to implement different behaviors.
The null value is valid only for reference types and nullable value types. Nullable types are, again, defined in the spec:
A nullable type can represent all values of its underlying type plus an additional null value. A nullable type is written T?, where T is the underlying type. This syntax is shorthand for System.Nullable<T>, and the two forms can be used interchangeably. (...) An instance of a nullable type T? has two public read-only properties:
A HasValue property of type bool
A Value property of type T
An instance for which HasValue is true is said to be non-null. A non-null instance contains a known value and Value returns that value.
The reason for which you can't assign a null to int is rather obvious - int is a value type that takes 32-bits and represents an integer. The null value is a special reference value that is machine-word sized and represents a location in memory. Assigning null to int has no sensible semantics. Nullable<T> exists specifically for the purpose of allowing null assignments to value types to represent "no value" scenarios. But note that doing
int? x = null;
is purely syntactic sugar. The all-zero value of Nullable<T> is the "no value" scenario, since it means that HasValue is false. There is no magic null value being assigned anywhere, it's the same as saying = default -- it just creates a new all-zero struct of the given type T and assigns it.
So again, the answer is -- no one deliberately tried to design this to work incompatibly with NRTs. Nullable value types are a much more fundamental feature of the language that works like this since its introduction in C#2. And the way you propose it to work doesn't translate to a sensible implementation - would you want all value types to be nullable? Then all of them would have to have the HasValue field that takes an additional byte and possible screws up padding (I think a language that represents ints as a 40-bit type and not 32 would be considered heretical :) ).
The bang operator is used specifically to tell the compiler "I know that I'm dereferencing a nullable/assigning null to a non-nullable, but I'm smarter than you and I know for a fact this is not going to break anything". It disables static analysis warnings. But it does not magically expand the underlying type to accommodate for a null value.
Summary
Considering it doesn't protect at all against having null in a non-nullable reference type variable, it seems this new feature doesn't change anything and doesn't improve developers' life at all (as opposed to non-nullable value types which could NOT be null and therefore don't need to be null-checked)
So at the end it seems the only value added is just in terms of signatures. Developers can now be explicit whether or not a method's return value could be null or not or if a property could be null or not (for example in a C# representation of a database table where NULL is an allowed value in a column).
From the official docs on NRTs:
This new feature provides significant benefits over the handling of reference variables in earlier versions of C# where the design intent can't be determined from the variable declaration. The compiler didn't provide safety against null reference exceptions for reference types (...) These warnings are emitted at compile time. The compiler doesn't add any null checks or other runtime constructs in a nullable context. At runtime, a nullable reference and a non-nullable reference are equivalent.
So you're right in that "the only value added is just in terms of signatures" and static analysis, which is the reason we have signatures in the first place. And that is not an improvement on developers' lives? Note that your line
string nonNullableString = default(string);
gives off a warning. If you did not ignore it (or even better, had Treat Warnings As Errors on) you'd get value - the compiler found a bug in your code for you.
Does it protect you from assigning null to a non-null reference type at runtime? No. Does it improve developers' lives? Thousand times yes. The power of the feature comes from warnings and nullable analysis done at compile time. If you ignore warnings issued by NRT, you're doing it at your own peril. The fact that you can ignore the compiler's helpful hand does not make it useless. After all you can just as well put your entire code in an unsafe context and program in C, doesn't mean that C# is useless since you can circumvent its safety guarantees.
Again this doesn't make sense to me, in my opinion the default value of a Foo should be new Foo() (or gives a compile error if no parameterless constructor is available)
That's an opinion, but: that isn't how it is implemented. default means null for reference-types, even if it is invalid according to nullability rules. The compiler spots this and warns you about it on the line Foo nonNullableFoo = default(Foo);:
Warning CS8600 Converting null literal or possible null value to non-nullable type.
As for string nonNullableString = null!; and
The compiler is completely fine with the 1st line, no warning at all.
You told it to ignore it; that's what the ! means. If you tell the compiler to not complain about something, it isn't valid to complain that it didn't complain.
So at the end it seems the only value added is just in terms of signatures.
No, it has lots more validity, but if you ignore the warnings that it does raise (CS8600 above), and if you suppress the other things that it does for you (!): yes, it will be less useful. So... don't do that?
Please let me know if this is in the wrong place, or let me know a better place for it.
This question is not about the syntax, more the idea behind it.
I would like to know what null is essentially, or isn't as the case may be
First off, I just want to clarify exactly what null is. By my understanding, null is
technically nothing. So the statement
string s = null;
is technically incorrect? You are assigning a variable no value but a variable can't not have a value? Am I correct in thinking this?
My idea of null is that it's kind of like thinking, "If I go and get a drink of water, I will need a cup to put the water in". null is the space in which the data will be placed in, but the space doesn't exist yet. Following this idea:
string s = "";
would be more appropriate, no? And along this thought (though a bit less confusing)
int n = 0;
follows the same idea, where 0 is a value, but the value of 0 is nothing?
The original line of code is quite valid. In C#, any reference type variable can have no value, i.e. it refers to no object. null is quite different to an empty string. Consider a variable of some other type, e.g. Form. What would be the equivalent of an empty string then? A null in C# is basically the same as a NULL in a database.
Value types are a but different. Because reference type variables contain a reference, it is possible for them to refer to no object. Value type variables, on the other hand, contain a value and so cannot be null. A struct or enum is a value type and a class or delegate is a reference type. So, if a variable on the stack contains the value zero then, for a reference type that means no object, i.e. null, and for a value type that means the default value for that type, e.g. zero for numbers, false for bool and DateTime.MinValue for DateTime.
C# references are basically tarted-up pointers. Just as in C/C++ a pointer can be null to point to no object, so a C# reference-type variable can be null to refer to no object. In the case of strings, an empty string is an object and is very different to null. An empty balloon is still a balloon and very different to no balloon at all.
Your confusion likely lies in the difference between reference types and value types.
void Foo() {
string s = null;
}
This does not create a string. Instead, s is a reference to a string. However, s is currently referring to (or pointing to, in C terminology) nothing.
s ---> [nothing]
Now when we do this:
s = "Stack Overflow";
we are making s refer to that string. s itself doesn't contain the string.
s ----> "Stack Overflow"
Note that "" itself is a string, and does exist
s ----> ""
Strings are actually a bad example because of string interning.
Value types on the other hand line up with your confusion.
An int for example, must have a value. If you don't assign it one, it will (generally) take the default value of 0.
See more:
Value Types and Reference Types
This is not wrong as much as it is redundant because null is the default value of string
as int num = 0 is redundant because 0 is the default value of int
If you need to initialize your string then you should go for string s = "" or my personal favorite string s = string.Empty
Null means nothing. 0 is a value, but null means that there is no value. Actually in C#:
string s=null;
means that instance of the class string is null and has no value.
Imagine string s is just a variable pointing to a place in memory which was allocated for you.
string s = null; Is allocating a variable but it is not pointng to a place in your memory.
string s = "; micht be the same but it is pointing to a slot in your memory containing a iteral with an empty string.
I hope that made the problem a bit more clear.
I instantiate an array like this:
int array[] = new int[4];
What are the default values for those four members? Is it null, 0 or not exists?
It's 0. It can't be null, as null isn't a valid int value.
From section 7.6.10.4 of the C# 5 specification:
All elements of the new array instance are initialized to their default values (§5.2).
And from section 5.2:
The default value of a variable depends on the type of the variable and is determined as follows:
For a variable of a value-type, the default value is the same as the value computed by the value-type’s default constructor (§4.1.2).
For a variable of a reference-type, the default value is null.
Initialization to default values is typically done by having the memory manager or garbage collector initialize memory to all-bits-zero before it is allocated for use. For this reason, it is convenient to use all-bits-zero to represent the null reference.
(As an implementation detail, there's some trickiness around the first bullet point. Although C# itself doesn't allow you to declare a parameterless constructor for value types, you can create your own parameterless constructors for value types in IL. I don't believe those constructors are called in array initialization, but they will be called in a new X() expression in C#. It's outside the realm of the C# spec though, really.)
The default value of an automatically-initialized variable of type T, such as an array element or an instance field, is the same as the value of default(T). For reference types and pointer types, it's null. For numeric types, it is the zero of that type. For bool, it's false. For struct types, it is the struct value that has all its fields initialized to their default values.
From Arrays (C# Programming Guide):
The default values of numeric array elements are set to zero, and reference elements are set to null.
Integers cannot be NULL. They will have the value '0'. Even if you try to assign NULL to an int from code, you will not be able to do it.
What is this code doing? Specifically the default(XX) part. I've never seen it before.
Entities.BizTalkRequestResult result = default(Entities.BizTalkRequestResult);
It's not a cast; it compiles to the default value of Entities.BizTalkRequestResult. For a reference type, e.g., that's probably null. See MSDN: http://msdn.microsoft.com/en-us/library/xwth0h0d(v=vs.80).aspx
There is a misconception; this is not casting at all. The default operator or function returns the default value. ex: 0 for int and null for reference types.
default is often used with generics (default(T)) because we don't know the actual type at compile time.
It gives you the default value for the particular type inside the parentheses. E.g. 0 for primitives numeric types like int or float, or null for reference types. It's useful particularly when the type could vary, and you want to write general code that is applicable for all possible types.
I'm familiar with the C# specification, section 5.3 which says that a variable has to be assigned before use.
In C and unmanaged C++ this makes sense as the stack isn't cleared and the memory location used for a pointer could be anywhere (leading to a hard-to-track-down bug).
But I am under the impression that there are not truly "unassigned" values allowed by the runtime. In particular that a reference type that is not initialized will always have a null value, never the value left over from a previous invocation of the method or random value.
Is this correct, or have I been mistakenly assuming that a check for null is sufficient all these years? Can you have truly unintialized variables in C#, or does the CLR take care of this and there's always some value set?
I am under the impression that there are not truly "unassigned" values allowed by the runtime. In particular that a reference type that is not initialized will always have a null value, never the value left over from a previous invocation of the method or random value. Is this correct?
I note that no one has actually answered your question yet.
The answer to the question you actually asked is "sorta".
As others have noted, some variables (array elements, fields, and so on) are classified as being automatically "initially assigned" to their default value (which is null for reference types, zero for numeric types, false for bools, and the natural recursion for user-defined structs).
Some variables are not classified as initially assigned; local variables in particular are not initially assigned. They must be classified by the compiler as "definitely assigned" at all points where their values are used.
Your question then is actually "is a local variable that is classified as not definitely assigned actually initially assigned the same way that a field would be?" And the answer to that question is yes, in practice, the runtime initially assigns all locals.
This has several nice properties. First, you can observe them in the debugger to be in their default state before their first assignment. Second, there is no chance that the garbage collector will be tricked into dereferencing a bad pointer just because there was garbage left on the stack that is now being treated as a managed reference. And so on.
The runtime is permitted to leave the initial state of locals as whatever garbage happened to be there if it can do so safely. But as an implementation detail, it does not ever choose to do so. It zeros out the memory for a local variable aggressively.
The reason then for the rule that locals must be definitely assigned before they are used is not to prevent you from observing the garbage uninitialized state of the local. That is already unobservable because the CLR aggressively clears locals to their default values, the same as it does for fields and array elements. The reason this is illegal in C# is because using an unassigned local has high likelihood of being a bug. We simply make it illegal, and then the compiler prevents you from ever having such a bug.
As far as I'm aware, every type has a designated default value.
As per this document, fields of classes are assigned the default value.
http://msdn.microsoft.com/en-us/library/aa645756(v=vs.71).aspx
This document says that the following always have default values assigned automatically.
Static variables.
Instance variables of class instances.
Instance variables of initially assigned struct variables.
Array elements.
Value parameters.
Reference parameters.
Variables declared in a catch clause or a foreach statement.
http://msdn.microsoft.com/en-us/library/aa691173(v=vs.71).aspx
More information on the actual default values here:
Default values of C# types (C# reference)
It depends on where the variable is declared. Variables declared within a class are automatically initialized using the default value.
object o;
void Method()
{
if (o == null)
{
// This will execute
}
}
Variables declared within a method are not initialized, but when the variable is first used the compiler checks to make sure that it was initialized, so the code will not compile.
void Method()
{
object o;
if (o == null) // Compile error on this line
{
}
}
In particular that a reference type that is not initialized will always have a null value
I think you are mixing up local variables and member variables. Section 5.3 talks specifically about local variables. Unlike member variables that do get defaulted, local variables never default to the null value or anything else: they simply must be assigned before they are first read. Section 5.3 explains the rules that the compiler uses to determine if a local variable has been assigned or not.
There are 3 ways that a variable can be assigned an initial value:
By default -- this happens (for example) if you declare a class variable without assigning an initial value, so the initial value gets default(type) where type is whatever type you declare the variable to be.
With an initializer -- this happens when you declare a variable with an initial value, as in int i = 12;
Any point before its value is retrieved -- this happens (for example) if you have a local variable with no initial value. The compiler ensures that you have no reachable code paths that will read the value of the variable before it is assigned.
At no point will the compiler allow you to read the value of a variable that hasn't been initialized, so you never have to worry about what would happen if you tried.
All primitive data types have default values, so there isn't any need to worry about them.
All reference types are initialized to null values, so if you leave your reference types uninitialized and then call some method or property on that null ref type, you would get a runtime exception which would need to be handled gracefully.
Again, all Nullable types need to be checked for null or default value if they are not initialized as follows:
int? num = null;
if (num.HasValue == true)
{
System.Console.WriteLine("num = " + num.Value);
}
else
{
System.Console.WriteLine("num = Null");
}
//y is set to zero
int y = num.GetValueOrDefault();
// num.Value throws an InvalidOperationException if num.HasValue is false
try
{
y = num.Value;
}
catch (System.InvalidOperationException e)
{
System.Console.WriteLine(e.Message);
}
But, you will not get any compile error if you leave all your variables uninitialized as the compiler won't complain. It's only the run-time you need to worry about.