Comparing a nullable to null - c#

Basically, Nullable<T> is a structure, which explains why calling .HasValue never throws a NullReferenceException. I was wondering why, given a nullable which does not have a value, comparisons to null are always true, even when using Object.ReferenceEquals, which I thought would return false because it is a structure.
Is there special behaviour built into the CLR to make this possible? It would probably also explain why the generic struct constraint does not allow nullables.
Best Regards,
Oliver Hanappi

If you do:
int? x = null;
if (x == null)
that will use HasValue.
Using Object.ReferenceEquals will box the value first - and that will convert the null value into a null reference (which is indeed special CLR behaviour). When a nullable value type is boxed, the result is either a null reference or a box of the underlying value type. In other words, there's no such thing as a boxed value of a nullable value type.
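The boxing rule described above can be checked with a small sketch. The class and method names (BoxingDemo, BoxesToNull) are made up for illustration; only HasValue, Object.ReferenceEquals and GetType come from the framework.

```csharp
using System;

public static class BoxingDemo
{
    // Boxes the nullable (by assigning it to object) and reports
    // whether the resulting reference is null.
    public static bool BoxesToNull(int? value)
    {
        object boxed = value;
        return Object.ReferenceEquals(boxed, null);
    }

    public static void Main()
    {
        int? empty = null;
        int? five = 5;

        Console.WriteLine(empty.HasValue);       // False (no exception: it's a struct)
        Console.WriteLine(BoxesToNull(empty));   // True  (boxed to a null reference)
        Console.WriteLine(BoxesToNull(five));    // False (boxed to a real object)

        object boxedFive = five;
        Console.WriteLine(boxedFive.GetType());  // System.Int32, not Nullable<Int32>
    }
}
```

The last line is the practical consequence of "there's no such thing as a boxed value of a nullable value type": the box only ever holds the underlying type.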

Nullable comparisons do not always return true. Take this example, for instance:
int? n = null;
int? m = 5;
Console.WriteLine(n == null); // prints True
Console.WriteLine(m == null); // prints False
The CLR has special boxing behavior for Nullables, such that reference comparison works as you might expect. Essentially, when HasValue is true the underlying value is boxed into an object; when it is false, the result is a null reference.

Yes, Nullable<T> is a special struct that enjoys compiler support. I have blogged about what happens when it gets compiled into IL here.

Related

How does `Nullable<T> t = null` work? [duplicate]

The Nullable<T> type is defined as a struct. In .Net, you can't assign null to a struct because structs are value types that cannot be represented with null (with the exception of Nullable<T>).
int i = null; // won't compile - we all know this
int? i = null; // will compile, and I'm glad it does, and it should compile, but why?
How did Nullable<T> become an exception to the rule "You can't assign null to a value type?" The decompiled code for Nullable<T> offers no insight as to how this happens.
How did Nullable<T> become an exception to the rule "You can't assign null to a value type?"
By changing the language, basically. The null literal went from being "a null reference" to "the null value of the relevant type".
At execution time, "the null value" for a nullable value type is a value where the HasValue property returns false. So this:
int? x = null;
is equivalent to:
int? x = new int?();
It's worth separating the framework parts of Nullable<T> from the language and CLR aspects. In fact, the CLR itself doesn't need to know much about nullable value types - as far as I'm aware, the only important aspect is that the null value of a nullable value type is boxed to a null reference, and you can unbox a null reference to the null value of any nullable value type. Even that was only introduced just before .NET 2.0's final release.
The language support mostly consists of:
Syntactic sugar in the form of ? so int? is equivalent to Nullable<int>
Lifted operators
The changed meaning of null
The null-coalescing operator (??) - which isn't restricted to nullable value types
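As a rough sketch of the four language features listed above, the following puts them side by side. The names LanguageSupportDemo, Add and OrZero are invented for the example and are not part of any library.

```csharp
using System;

public static class LanguageSupportDemo
{
    // Lifted operator: the built-in int + is "lifted" to int?,
    // and the result is null if either operand is null.
    public static int? Add(int? a, int? b)
    {
        return a + b;
    }

    // Null-coalescing operator: falls back to 0 when a is null.
    public static int OrZero(int? a)
    {
        return a ?? 0;
    }

    public static void Main()
    {
        Nullable<int> longForm = 1;   // int? is sugar for exactly this type
        int? shortForm = longForm;

        int? none = null;             // the null literal here means new int?()
        Console.WriteLine(none.HasValue);                  // False
        Console.WriteLine(Add(shortForm, 2));              // 3
        Console.WriteLine(Add(shortForm, none).HasValue);  // False
        Console.WriteLine(OrZero(none));                   // 0

        // ?? also works on reference types.
        string name = null;
        Console.WriteLine(name ?? "(none)");               // (none)
    }
}
```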

Explanation of int? vs int [duplicate]

I've come across some code in C# that declares a variable as: int? number
What does the ? mean and how does this differ from just: int
int cannot be null.
int? is an alias for the Nullable<int> struct, which you can set to null.
int? is a shorthand for creating an instance of the generic System.Nullable<T> structure type. It allows you to make your variable nullable. Remember, given that the <ValueType>? syntax is a shorthand, you could declare your variables thus:
Nullable<int> i = 10;
int? is shorthand for Nullable<int> which allows you to pretend that an integer can handle nulls.
int? foo = null;
It is useful for indicating a lack of value where you would previously have used a magic value (such as -1), and also when dealing with database columns that allow null entries.
For a quasi-in-depth look, a Nullable<T> (introduced in .NET 2.0) is simply a wrapper over a value type T that exposes two properties, HasValue and Value, where HasValue is a boolean that indicates if the value has been set and Value (obviously enough) returns the value. It is an error to access Value if HasValue is false. Therefore, to access Value, it is good form to check HasValue first. Additionally, if you simply want to normalize any non-values to default values (0 for numeric types), you can use the method GetValueOrDefault() without needing to check HasValue.
Note that although you appear to set foo to null, it's not actually null under normal usage scenarios. null is simply additional syntactic sugar for this type. The above line of code translates to
Nullable<int> foo = new Nullable<int>();
Initializing the variable in this fashion simply sets the HasValue property to false.
However, in situations involving boxing, the value will actually box to null if HasValue is false (it will otherwise box to T). Be aware of the consequences! For example, in:
int? foo = null;
string bar = foo.ToString(); // this is fine, returns string.Empty
Type type = foo.GetType(); // blows up! GetType causes the value to box
// resulting in a NullReferenceException
That's a quick crash course. For more, visit the documentation.
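To make the HasValue / Value / GetValueOrDefault() points above concrete, here is a minimal sketch. AccessDemo and ValueThrows are hypothetical names used only for this illustration.

```csharp
using System;

public static class AccessDemo
{
    // Returns true if reading .Value throws, which happens
    // when HasValue is false.
    public static bool ValueThrows(int? n)
    {
        try
        {
            int unused = n.Value;
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        int? foo = null;
        int? bar = 7;

        Console.WriteLine(foo.GetValueOrDefault());    // 0
        Console.WriteLine(foo.GetValueOrDefault(-1));  // -1
        Console.WriteLine(bar.GetValueOrDefault());    // 7
        Console.WriteLine(ValueThrows(foo));           // True
        Console.WriteLine(ValueThrows(bar));           // False
    }
}
```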
It's syntactic compiler sugar for Nullable<int>
Basically, your number (or any other value type) can be null as well as having a value. You check for a value using the HasValue property. Nullables can be cast to their value types (although this will fail if they're null), or you can use the Value property (again, it will throw an exception if it is null).
One thing which usually appears to be overlooked when using nullable types is the GetValueOrDefault() method which returns default(T) if the object is null.
As @Kyle Trauberman points out in the comment, you can indeed compare it to null instead of checking HasValue. The type itself is a value type with overridden equality methods, so although it will never be null itself, it will return true when compared to null if it doesn't have a value.
A question mark after the type in a declaration means that the variable is nullable.
int? can be null, whereas int cannot.
Reference Nullable Types:
http://msdn.microsoft.com/en-us/library/1t3y8s4s(v=VS.100).aspx

Boxing / Unboxing Nullable Types - Why this implementation?

Extract from CLR via C# on Boxing / Unboxing value types ...
On Boxing: If the nullable instance is not null, the CLR takes the value out of the nullable instance and boxes it. In other words, a Nullable<Int32> with a value of 5 is boxed into a boxed-Int32 with a value of 5.
On Unboxing: Unboxing is simply the act of obtaining a reference to the unboxed portion of a boxed object. The problem is that a boxed value type cannot be simply unboxed into a nullable version of that value type because the boxed value doesn't have the boolean hasValue field in it. So, when unboxing a value type into a nullable version, the CLR must allocate a Nullable<T> object, initialize the hasValue field to true, and set the value field to the same value that is in the boxed value type. This impacts your application performance (memory allocation during unboxing).
Why did the CLR team go through so much trouble for Nullable types? Why was it not simply boxed into a Nullable<Int32> in the first place?
I remember this behavior was kind of last minute change. In early betas of .NET 2.0, Nullable<T> was a "normal" value type. Boxing a null valued int? turned it into a boxed int? with a boolean flag. I think the reason they decided to choose the current approach is consistency. Say:
int? test = null;
object obj = test;
if (test != null)
    Console.WriteLine("test is not null");
if (obj != null)
    Console.WriteLine("obj is not null");
In the former approach (box null -> boxed Nullable<T>), you wouldn't get "test is not null" but you'd get "obj is not null", which is weird.
Additionally, if they had boxed a nullable value to a boxed-Nullable<T>:
int? val = 42;
object obj = val;
if (obj != null) {
// Our object is not null, so intuitively it's an `int` value:
int x = (int)obj; // ...but this would have failed.
}
Beside that, I believe the current behavior makes perfect sense for scenarios like nullable database values (think SQL-CLR...)
Clarification:
The whole point of providing nullable types is to make it easy to deal with variables that have no meaningful value. They didn't want to provide two distinct, unrelated types. An int? should behave more or less like a simple int. That's why C# provides lifted operators.
So, when unboxing a value type into a nullable version, the CLR must allocate a Nullable<T> object, initialize the hasValue field to true, and set the value field to the same value that is in the boxed value type. This impacts your application performance (memory allocation during unboxing).
This is not true. The CLR would have to allocate memory on the stack to hold the variable whether or not it's nullable. There's no performance issue in allocating space for an extra boolean variable.
I think it makes sense to box a null value to a null reference. Having a boxed value saying "I know I would be an Int32 if I had a value, but I don't" seems unintuitive to me. Better to go from the value type version of "not a value" (a value with HasValue as false) to the reference type version of "not a value" (a null reference).
I believe this change was made on the feedback of the community, btw.
This also allows an interesting use of as even for value types:
object mightBeADouble = GetMyValue();
double? unboxed = mightBeADouble as double?;
if (unboxed != null)
{
...
}
This is more consistent with the way "uncertain conversions" are handled with reference types, than the previous:
object mightBeADouble = GetMyValue();
if (mightBeADouble is double)
{
double unboxed = (double) mightBeADouble;
...
}
(It may also perform better, as there's only a single execution time type check.)
A thing that you gain via this behavior is that the boxed version implements all interfaces supported by the underlying type. (The goal is to make Nullable<int> appear the same as int for all practical purposes.) Boxing to a boxed-Nullable<int> instead of a boxed-int would prevent this behavior.
From the MSDN Page,
double? d = 44.4;
object iBoxed = d;
// Access IConvertible interface implemented by double.
IConvertible ic = (IConvertible)iBoxed;
int i = ic.ToInt32(null);
string str = ic.ToString();
Also getting the int from a boxed version of a Nullable<int> is straightforward - Usually you can't unbox to a type other than the original src type.
float f = 1.5f;
object boxed_float = f;
int int_value = (int) boxed_float; // will blow up. Cannot unbox a float to an int, you *must* unbox to a float first.
float? nullableFloat = 1.4f;
boxed_float = nullableFloat;
float fValue = (float) boxed_float; // can unbox a float? to a float
Console.WriteLine(fValue);
Here you do not have to know if the original version was an int or a Nullable version of it. (Plus you get some perf too: you save the space of storing the hasValue boolean in the boxed object.)
I guess that is basically what it does. The description given includes your suggestion (ie boxing into a Nullable<T>).
The extra is that it sets the hasValue field after boxing.
I would posit that the reason for the behavior stems from the behavior of Object.Equals, most notably the fact that if the first object is null and the second object is not, Object.Equals returns false rather than call the Equals method on the second object.
If Object.Equals had called the Equals method on the second object in the case where the first object was null but the second was not, then an object which was a null-valued Nullable<T> could have returned True when compared to null. Personally, I think the proper remedy would have been to make the HasValue property of a Nullable<T> have nothing to do with the concept of a null reference. With regard to the overhead involved with storing a boolean flag on the heap, one could have provided that for every type Nullable<T> there would be a static boxed empty version, and then provided that unboxing the static boxed empty copy would yield an empty Nullable<T>, and unboxing any other instance would yield a populated one.

Is there anything I should worry about when using nullable types in .Net 2.0?

C# 2.0 gives me access to nullable types. This seems very convenient when I want the DateTime variable in the database to be null. Is there anything I should worry about when using nullable types or can I go overboard and make every type I have nullable?
Mitch's DBNull not being null is a good point. Also beware of this, which is legal C# but will (fortunately) produce a warning:
int x = 0;
// Much later
if (x == null)
It looks like it should be illegal, but it's valid because x can be implicitly converted to a nullable int. Never ignore this warning.
You should also be aware of what appear on the surface to be oddities with operators. For instance, usually you'd expect that:
if (x <= y || x >= y)
would always be true - but it's not when x and y are both null. However, x == y will be true - unlike in SQL and indeed unlike VB! The lifted operator behaviour is provided by the language, not the runtime, so you need to be aware of what each language you use will do.
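The comparison oddity above is easy to reproduce. This is a small sketch of C#'s lifted comparison operators only (other languages may behave differently, as noted); LiftedComparisonDemo, EitherOrdering and AreEqual are names invented for the example.

```csharp
using System;

public static class LiftedComparisonDemo
{
    // Lifted <= and >= return false whenever either operand is null.
    public static bool EitherOrdering(int? x, int? y)
    {
        return x <= y || x >= y;
    }

    // Lifted == treats two null values as equal.
    public static bool AreEqual(int? x, int? y)
    {
        return x == y;
    }

    public static void Main()
    {
        int? x = null;
        int? y = null;

        Console.WriteLine(EitherOrdering(x, y));  // False
        Console.WriteLine(AreEqual(x, y));        // True
        Console.WriteLine(EitherOrdering(1, 2));  // True
    }
}
```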
Finally, make sure you understand how boxing works with nullable types. A non-null value of a nullable value type is boxed as if it were non-nullable to start with, and a null value is boxed to a simple null reference. (i.e. no box is actually created - the runtime just returns null). You can unbox to a nullable value type from either a null reference or a box of the underlying type. Does that make sense? (I go into it in more detail in C# in Depth, and hopefully with a bit more clarity... but it's 8.45 and I haven't had coffee...)
EDIT: Okay, maybe an example will help:
int? i = 5;
int? j = null;
object x = i; // x = reference to boxed int (there's no such thing as a "boxed nullable int")
object y = j; // y = null (a simple null reference)
i = (int?) x; // Unboxing from boxed int to int? is fine.
j = (int?) y; // Unboxing from a null reference to int? is fine too.
Watch out for DBNull which is different to null
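A minimal sketch of the DBNull point: DBNull.Value is a real object, so it has to be handled separately when converting a raw database value to a nullable. The helper ToNullableInt is hypothetical, and it assumes the raw value is either null, DBNull.Value, or a boxed int.

```csharp
using System;

public static class DbNullDemo
{
    // Treats both a null reference and DBNull.Value as "no value".
    // Assumes any other input is a boxed int; anything else would
    // throw InvalidCastException on the unboxing cast.
    public static int? ToNullableInt(object raw)
    {
        if (raw == null || raw is DBNull)
        {
            return null;
        }
        return (int)raw;
    }

    public static void Main()
    {
        object fromDb = DBNull.Value;

        Console.WriteLine(fromDb == null);                  // False
        Console.WriteLine(ToNullableInt(fromDb).HasValue);  // False
        Console.WriteLine(ToNullableInt(42));               // 42
    }
}
```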
You should also be careful to only use nullable types where the logic of the class mandates that the values are really nullable. This may sound stupid, but if you "go overboard" and make everything nullable you may be making the code more complex than necessary.
When it's used at the appropriate places I think it improves code quality, but if it's redundant, don't do it.
There is a gotcha with GetType() to watch for, which is particularly apparent when using new() and generics:
static void Foo<T>() where T : new()
{
    T t = new T();
    string s = t.ToString(); // fine
    bool eq = t.Equals(t); // fine
    int hc = t.GetHashCode(); // fine
    Type type = t.GetType(); // BOOM!!! (when T is a Nullable<T>, e.g. Foo<int?>())
}
Basically, GetType() is unusual in that it isn't virtual, so the value always gets cast (boxed) to object. The unusual boxing rules mean that this calls GetType() on null, which isn't possible. So avoid calling GetType() if you think you might have empty Nullable<T> objects.
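One possible way around this, sketched under the assumption that you only need the declared type rather than a runtime type, is to use typeof(T) together with Nullable.GetUnderlyingType, neither of which boxes an instance. TypeDemo and Describe are made-up names.

```csharp
using System;

public static class TypeDemo
{
    // Describes T from its static type, so an empty Nullable<T>
    // is never boxed and GetType() is never called on null.
    public static string Describe<T>() where T : new()
    {
        Type declared = typeof(T);
        Type underlying = Nullable.GetUnderlyingType(declared);
        return underlying == null
            ? declared.Name
            : "Nullable of " + underlying.Name;
    }

    public static void Main()
    {
        Console.WriteLine(Describe<int>());   // Int32
        Console.WriteLine(Describe<int?>());  // Nullable of Int32
    }
}
```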
Also - note that some data-binding methods don't like Nullable<T> very much.
Interoperability can be a problem - e.g. exposing to COM clients or as a web service.
