What is the reason null doesn't evaluate to false in conditionals?
I first thought about assignments to avoid the bug of using = instead of ==, but this could easily be disallowed by the compiler.
if (someClass = someValue) // cannot convert someClass to bool. Ok, nice
if (someClass) // Cannot convert someClass to bool. Why?
if (someClass != null) // More readable?
I think it's fairly reasonable to assume that null means false. Other languages use this convention too, and I've never had a bug because of it.
Edit: And I'm of course referring to reference types.
A good comment by Daniel Earwicker on the assignment bug... This compiles without a warning because it evaluates to bool:
bool bool1 = false, bool2 = true;
if (bool1 = bool2)
{
// oops... false == true, and bool1 became true...
}
It's a specific design feature in the C# language: if statements accept only a bool.
IIRC this is for safety: specifically, so that your first if (someClass = someValue) fails to compile.
Edit: One benefit is that it makes the if (42 == i) convention ("yoda comparisons") unnecessary.
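For illustration, a minimal sketch of that point; the commented-out line shows the compile error C# gives:
int i = 0;
// In C/C++ the typo "if (i = 42)" compiles and silently assigns, which is why
// people reverse the operands; in C# the straight form is already rejected:
// if (i = 42) { }   // error: cannot implicitly convert type 'int' to 'bool'
if (i == 42) { }     // the natural order is safe, so Yoda ordering buys nothing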
"I think it's fairly reasonable to assume that null means false"
Not in C#. false is a value of the bool struct, a value type, and value types cannot be null. If you want the behaviour you describe, you have to define a custom conversion from your particular type to bool:
public class MyClass
{
public static implicit operator bool(MyClass instance)
{
return instance != null;
}
}
With the above, I could then do:
if (instance) {
}
etc.
"I think it's fairly reasonable to assume that null means false"
I don't agree. IMHO, more often than not, false means "no". Null means "I don't know"; i.e. completely indeterminate.
One thing that comes to mind: what about value types, like int? An int can't be null, so would it always evaluate to true? You could assume that an int of 0 is false, but that quickly gets complicated, because 0 is a valid value (where perhaps 0 should evaluate to true, because the programmer deliberately set it) and not just a default value.
There are a lot of edge cases where null isn't an option, or sometimes it's an option, and other times it's not.
They put in things like this to protect the programmer from making mistakes. It's along the same lines as why you can't fall through in case statements.
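A minimal sketch of the ambiguity described above, using int and its nullable counterpart int? (names are illustrative only):
int count = 0;           // 0 is a valid, deliberately chosen value
int? maybeCount = null;  // "no value" is a separate state from 0
// If 0 were treated as false, a hypothetical if (count) could not distinguish
// "the programmer set it to 0" from "nothing meaningful here",
// whereas int? keeps those two situations apart.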
Just use if(Convert.ToBoolean(someClass))
http://msdn.microsoft.com/en-us/library/wh2c31dd.aspx
Parameters
value
Type: System.Object
An object that implements the IConvertible interface, or null.
Return Value
Type: System.Boolean
true or false, which reflects the value returned by invoking the IConvertible.ToBoolean method for the underlying type of value. If value is null, the method returns false.
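Worth noting, per the parameter description quoted above: Convert.ToBoolean(object) returns false for null, but for a non-null object that does not implement IConvertible it throws InvalidCastException rather than returning true, so it is not a general-purpose null test. A minimal sketch:
object missing = null;
object present = new object();
bool b1 = Convert.ToBoolean(missing);    // false, as documented above
// bool b2 = Convert.ToBoolean(present); // throws InvalidCastException:
//                                       // object does not implement IConvertible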
As far as I know, this is a feature that you see in dynamic languages, which C# is not (per the language specification, an if statement accepts only a bool or an expression that evaluates to bool).
I don't think it's reasonable to assume that null is false in every case. It makes sense in some cases, but not in others. For example, assume that you have a flag that can have three values: set, unset, and un-initialized. In this case, set would be true, unset would be false and un-initialized would be null. As you can see, in this case the meaning of null is not false.
Because null and false are different things.
A perfect example is bool? foo
If foo's value is true, then its value is true.
If foo's value is false, then its value is false
If foo has nothing assigned to it, its value is null.
These are three logically separate conditions.
Think of it another way
"How much money do I owe you?"
"Nothing" and "I don't have that information" are two distinctly separate answers.
"What is the reason null doesn't evaluate to false in conditionals? I first thought about assignments to avoid the bug of using = instead of =="
That isn't the reason. We know this because if the two variables being compared happen to be of type bool then the code will compile quite happily:
bool a = ...
bool b = ...
if (a = b)
Console.WriteLine("Weird, I expected them to be different");
If b is true, the message is printed (and a is now true, making the subsequent debugging experience consistent with the message, thus confusing you even more...)
The reason null is not convertible to bool is simply that C# avoids implicit conversion unless requested by the designer of a user-defined type. The C++ history book is full of painful stories caused by implicit conversions.
At the machine level, most people who "cannot think of any technological reason null should be equal to false" have it backwards.
Code is run by CPUs.
Most (if not all) CPUs work with bits, groups of bits, and interpretations of those groups of bits. Something can be 0, 1, a byte, a word, a dword, a qword, and so on.
Note that on the x86 platform bytes are octets (8 bits) and words are usually 16 bits, but this is not a necessity. Older CPUs had 4-bit words, and even today's low-end embedded controllers often use 7 or 12 bits per word.
In machine code, a comparison result is simply "equal", "zero", "greater", "less", "greater or equal", or "less or equal". There is no such thing as null, false or true.
As a convention, true is 1, false is 0, and a null pointer is either 0x00, 0x0000, 0x00000000, or 0x0000000000000000, depending on address bus width.
C# is one of the exceptions: its bool is a distinct type, whose two possible values are not raw machine values you can freely mix with integers and pointers, but values of a dedicated type (think enum in C, or PTR in x86 assembly).
This is by design.
It is important to note, though, that such design decisions are deliberate, elaborate ones, while the traditional, straightforward way is to assume that 0, null and false are equal.
C# doesn't convert the condition expression for you, as C++ does. You need to explicitly convert the value to a boolean if you want the if statement to accept it.
It's simply the type system of C# compared to languages like PHP, Perl, etc.
A condition only accepts Boolean values; null does not have the type Boolean, so it doesn't work there.
As for the NULL example in C/C++ you mentioned in another comment, it has to be said that C (before C99) and pre-standard C++ had no dedicated boolean type (afair C++'s bool converts freely to and from int anyway, but that's another matter), and they also have no null references, only NULL (=> 0) pointers.
Of course the compiler designers could implement an automatic conversion for any nullable type to boolean but that would cause other problems, i.e.:
Assuming that foo is not null:
if (foo)
{
// do stuff
}
Which state of foo is true?
Always if it's not null?
But what if you want your type to be convertible to boolean (e.g. for your tri-state or quantum-logic class)?
That would mean you would have two different conversions to bool, the implicit and the explicit, which would both behave differently.
I don't even dare to imagine what should happen if you do
if (!!foo) // common pattern in C to normalize a value used as boolean,
// in this case might be abused to create a boolean from an object
{
}
I think the forced (foo == null) comparison is good, since it also adds clarity to your code; it's easier to understand what you are really checking for.
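To make the implicit/explicit distinction concrete, here is a hedged sketch (MyType is a made-up example): if the designer provides only an explicit conversion to bool, C# still refuses the bare if (foo) and forces the intent to be spelled out at the call site.
public class MyType
{
    // explicit only: each call site has to ask for the conversion
    public static explicit operator bool(MyType instance)
    {
        return instance != null;
    }
}
MyType foo = new MyType();
// if (foo) { }       // does not compile: no implicit conversion to bool
if ((bool)foo) { }    // compiles: the conversion was requested explicitly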
Using this construct:
var dict = new Dictionary<int, string>();
var result = (dict?.TryGetValue(1, out var value) ?? false) ? value : "Default";
I get an error saying CS0165 use of unassigned local variable 'value' which is not what I expect. How could value possibly be undefined? If the dictionary is null the inner statement will return false which will make the outer statement evaluate to false, returning Default.
What am I missing here? Is it just the compiler being unable to evaluate the statement fully? Or have I messed it up somehow?
Your analysis is correct. It is not the analysis the compiler makes, because the compiler makes the analysis that is required by the C# specification. That analysis is as follows:
If the condition of a condition ? consequence : alternative expression is a compile-time constant true, then the alternative branch is not reachable; if false, then the consequence branch is not reachable; otherwise, both branches are reachable.
The condition in this case is not a constant, therefore the consequence and alternative are both reachable.
local variable value is only definitely assigned if dict is not null, and therefore value is not definitely assigned when the consequence is reached.
But the consequence requires that value be definitely assigned
So that's an error.
The compiler is not as smart as you, but it is an accurate implementation of the C# specification. (Note that I have not sketched out here the additional special rules for this situation, which include predicates like "definitely assigned after a true expression" and so on. See the C# spec for details.)
Incidentally, the C# 2.0 compiler was too smart. For example, if you had a condition like 0 * x == 0 for some int local x it would deduce "that condition is always true no matter what the value of x is" and mark the alternative branch as unreachable. That analysis was correct in the sense that it matched the real world, but it was incorrect in the sense that the C# specification clearly says that the deduction is only to be made for compile-time constants, and equally clearly says that expressions involving variables are not constant.
Remember, the purpose of this thing is to find bugs, and what is more likely? Someone wrote 0 * x == 0 ? foo : bar intending that it have the meaning "always foo", or that they've written a bug by accident? I fixed the bug in the compiler and since then it has strictly matched the specification.
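A small sketch of that spec rule (GetSomeInt is a hypothetical stand-in): only a compile-time constant condition lets the definite-assignment analysis prune a branch.
int x = GetSomeInt();   // hypothetical helper returning some int
int y;
if (true) { y = 1; }    // constant condition: the "false" path is impossible
Console.WriteLine(y);   // fine, y is definitely assigned here
int z;
if (0 * x == 0) { z = 1; }  // always true at run time, but not a constant
// Console.WriteLine(z);    // error CS0165: use of unassigned local variable 'z'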
In your case there is no bug, but the code is too complicated for the compiler to analyze, so it is probably also too complicated to expect humans to analyze. See if you can simplify it. What I might do is:
public static V GetValueOrDefault<K, V>(
this Dictionary<K, V> d,
K key,
V defaultValue)
{
if (d != null && d.TryGetValue(key, out var value))
return value;
return defaultValue;
}
…
var result = dict.GetValueOrDefault(1, "Default");
The goal should be to make the call site readable; I think my call site is considerably more readable than yours.
Is it just the compiler being unable to evaluate the statement fully?
Yes, more or less.
The compiler does not track "unassigned"; it tracks the opposite, "definitely assigned". It has to stop somewhere; in this case it would need to incorporate knowledge about the library method TryGetValue(), and it doesn't.
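For what it's worth, a hedged alternative workaround (same Dictionary<int, string> assumption as the question): giving value an initial value satisfies the definite-assignment rules, so CS0165 goes away.
var dict = new Dictionary<int, string>();
string value = null;  // initialized up front, so definite assignment is satisfied
var result = (dict?.TryGetValue(1, out value) ?? false) ? value : "Default";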
Today I stumbled upon an interesting bug I wrote. I have a set of properties which can be set through a general setter. These properties can be value types or reference types.
public void SetValue( TEnum property, object value )
{
if ( _properties[ property ] != value )
{
// Only come here when the new value is different.
}
}
When writing a unit test for this method I found out the condition is always true for value types. It didn't take me long to figure out this is due to boxing/unboxing. It didn't take me long either to adjust the code to the following:
public void SetValue( TEnum property, object value )
{
if ( !_properties[ property ].Equals( value ) )
{
// Only come here when the new value is different.
}
}
The thing is I'm not entirely satisfied with this solution. I'd like to keep a simple reference comparison, unless the value is boxed.
The current solution I am thinking of is only calling Equals() for boxed values. Doing a check for a boxed value seems a bit overkill, though. Isn't there an easier way?
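For context, a minimal sketch of the boxing behaviour behind the bug: two boxed copies of the same int are distinct objects, so the reference comparison performed by != on object always reports them as different.
int i = 5;
object a = i;  // boxed copy #1
object b = i;  // boxed copy #2
Console.WriteLine(a != b);       // True: two distinct boxes, compared by reference
Console.WriteLine(a.Equals(b));  // True: Int32.Equals compares the boxed values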
If you need different behaviour when you're dealing with a value-type then you're obviously going to need to perform some kind of test. You don't need an explicit check for boxed value-types, since all value-types will be boxed** due to the parameter being typed as object.
This code should meet your stated criteria: If value is a (boxed) value-type then call the polymorphic Equals method, otherwise use == to test for reference equality.
public void SetValue(TEnum property, object value)
{
bool equal = ((value != null) && value.GetType().IsValueType)
? value.Equals(_properties[property])
: (value == _properties[property]);
if (!equal)
{
// Only come here when the new value is different.
}
}
( ** And, yes, I know that Nullable<T> is a value-type with its own special rules relating to boxing and unboxing, but that's pretty much irrelevant here.)
Equals() is generally the preferred approach.
The default implementation of .Equals() does a simple reference comparison for reference types, so in most cases that's what you'll be getting. Equals() might have been overridden to provide some other behavior, but if someone has overridden .Equals() in a class, it's because they want to change the equality semantics for that type, and it's better to let that happen unless you have a compelling reason not to. Bypassing it by using == can lead to confusion when your class sees two things as different while every other class agrees that they're the same.
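A small sketch of that difference, using string as an example of a type that overrides Equals (the new string(...) call just forces a second, distinct instance):
object a = "abc";
object b = new string("abc".ToCharArray());  // distinct instance, equal contents
Console.WriteLine(a == b);       // False: == on object compares references
Console.WriteLine(a.Equals(b));  // True: string overrides Equals for value equality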
Since the input parameter's type is object, you will always get a boxed value inside the method's context.
I think your only chance is to change the method's signature and to write different overloads.
How about this:
if(object.ReferenceEquals(first, second)) { return; }
if(first.Equals(second)) { return; }
// they must differ, right?
Update
I realized this doesn't work as expected for a certain case:
For value types, ReferenceEquals returns false so we fall back to Equals, which behaves as expected.
For reference types where ReferenceEquals returns true, we consider them "same" as expected.
For reference types where ReferenceEquals returns false and Equals returns false, we consider them "different" as expected.
For reference types where ReferenceEquals returns false and Equals returns true, we consider them "same" even though we want "different"
So the lesson is "don't get clever"
I suppose
I'd like to keep a simple reference comparison, unless the value is boxed.
is somewhat equivalent to
If the value is boxed, I'll do a non-"simple reference comparison".
This means the first thing you'll need to do is to check whether the value is boxed or not.
If there exists a method to check whether an object is a boxed value type, it is probably at least as complex as the "overkill" check you linked to, unless that isn't the simplest way after all. Nonetheless, there should be a "simplest way" to determine whether an object is a boxed value type. It's unlikely that this simplest way is simpler than just using the Equals() method, but I've bookmarked this question to find out, just in case.
(not sure if I was logical)
Can this overload of XPathNavigator.Evaluate return null ?
// Can "result" be null ?
object result = xmlDoc.CreateNavigator().Evaluate(xpathString);
If the answer is no, then why does ReSharper say that result may be null?
string str = result.ToString(); // Resharper: Possible NullReferenceException
I found nothing in the documentation about an input that might cause it to return null. I also tried inspecting the Reference Source for this function, but it was unfruitful.
I know that R# uses code annotations, but I still don't trust this warning, as I tried different inputs and none of them returned null.
Looking at the code, it does look like it would be highly unlikely to get a null from XPathNavigator.Evaluate. There are a couple of possible code paths that might get you a null, but I suspect they're pathological edge cases (if evaluating a function that should be a number function, but isn't, or if the operand to a query is already null). I doubt these would happen under normal circumstances.
I don't know why ReSharper has the [CanBeNull] annotation on the return value. If I had to guess, I'd say it's because the method is virtual, and therefore there's no way to guarantee that the implementation will always return a value. Or because it calls an abstract method on another class that doesn't have any null-ness guarantees, and there's no check on the return of that value, so again, there's no guarantee that it won't be null.
The annotations are based on static control flow analysis, and that can only get you so far. ReSharper will provide the strongest hints that it can. If it knows it's not null, it will annotate it so, if it doesn't know, it will flag it [CanBeNull], and err on the side of caution.
I have code where I get a string as input, and I compare it with an integer.
I saw that integer variable also has an Equals function that accepts a string parameter.
I used it directly, thinking it would typecast the string.
It did not give any compile-time or runtime error, but it always returns a false result.
For example,
int sessionId = 1;
string requestId = "1";
return sessionId.Equals(requestId);
sessionId.Equals(requestId) always gives false.
What is the reason for such behavior? And if there is a reason, why is it allowed to run without error?
Integers and strings are always different, and thus "1".Equals(1) returns false.
It compiles because object.Equals(object other) takes an object as the right side, and thus accepts any type.
The reason this happens is that the string "1" is not the same as the integer 1, so Equals returns false.
Why is such behavior supported? Because the Equals method takes an object as its parameter, and a string is an object, so you are "allowed" to do it. As you have found, it's not very useful in this case.
To solve your problem either get a string representation of the integer, or parse your string to an integer, then compare.
E.g. Try
return (sessionId.ToString() == requestId);
or
return (sessionId == int.Parse(requestId));
If you choose the latter, you may need to consider whether the Parse could fail and how you might handle that.
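A sketch of that caveat using int.TryParse, which avoids the exception when requestId is not a valid integer (this assumes the same sessionId and requestId variables as in the question):
if (int.TryParse(requestId, out int parsedId))
{
    return sessionId == parsedId;
}
return false; // requestId was not a number, so it cannot match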
Yes, Equals takes any type on the right side because its parameter is of type object. But inside, the method requires the same type as the left-hand side. IMHO there's no need to throw an exception for a type mismatch, because the caller only wants to know whether the two values are equal.
Decompiled Equals of int:
public override bool Equals(object obj)
{
return obj is int && this == (int)obj;
}
If someone shows you a car and a banana, and asks whether they are the same thing, would you throw a temper tantrum because a car is a vehicle and a banana is a fruit, or would you simply say "no, they are not the same thing"?
In many languages, trying to compare an integer and a string will yield a compiler error, because the compiler knows that the integer and the string cannot possibly be the same thing and thus any code that tried to compare them would almost certainly be erroneous. On the other hand, when you say sessionId.Equals(requestId), the compiler knows that you are asking that requestId be passed to the Int32 override of Equals. Since that override can accept a reference to any kind of heap object, it has no problem passing the string "1". That method in turn knows that it was given something which isn't the same as an Int32 with the value 1. It doesn't know that the calling code can't possibly supply anything that would match an Int32; all it knows is that the particular value isn't the same, and because the value isn't the same it perfectly happily returns false.
Shouldn’t we be using String.Compare for string comparison and forget about Equals?
I had the same problem, and I believe the function Equals should throw an exception. In my case, I was comparing a string with a Boolean.
The discussion has by now gone the wrong way. This is my view:
If a comparison between objects belonging to two different classes always returns false, then they do not need to be compared in the first place.
If there is a need for a function that bypasses type checking, there should be one. However, positioning Equals as a recommended method for string comparison while at the same time introducing the possibility of unneeded bugs (which may sit in your program for eternity) is somewhat irresponsible.
Moreover, it is extremely misleading that the call String.Equals(string1, string2, StringComparison.xxx) also accepts non-string arguments, and not only string1.Equals(string2).
If that is by design, then it is poor design.
Convert.ToString(null)
returns
null
As I expected.
But
Convert.ToString(null as object)
returns
""
Why are these different?
There are 2 overloads of ToString that come into play here
Convert.ToString(object o);
Convert.ToString(string s);
The C# compiler essentially tries to pick the most specific overload which will work with the input. A null value is convertible to any reference type. In this case string is more specific than object and hence it will be picked as the winner.
In the null as object case, you've fixed the type of the expression as object. This means it's no longer compatible with the string overload, and the compiler picks the object overload as the only compatible one remaining.
The really hairy details of how this tie-breaking works are covered in section 7.4.3 of the C# language spec.
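A minimal illustration of that tie-breaking with made-up overloads (Describe is hypothetical; the same rule is what picks Convert.ToString(string) for the bare null literal):
static void Describe(object o) { Console.WriteLine("object overload"); }
static void Describe(string s) { Console.WriteLine("string overload"); }
Describe(null);             // "string overload": string is more specific than object
Describe(null as object);   // "object overload": the expression is typed as object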
Following on from JaredPar's excellent overload resolution answer - the question remains "why does Convert.ToString(string) return null, but Convert.ToString(object) return string.Empty"?
And the answer to that is...because the docs say so:
Convert.ToString(string) returns "the specified string instance; no actual conversion is performed."
Convert.ToString(object) returns "the string representation of value, or String.Empty if value is null."
EDIT:
As to whether this is a "bug in the spec", "very bad API design", "why was it specified like this", etc. - I'll take a shot at some rationale for why I don't see it as big deal.
System.Convert has methods for converting every base type to itself. This is strange: since no conversion is needed or possible, the methods end up just returning the parameter. Convert.ToString(string) behaves the same. I presume these are here for code-generation scenarios.
Convert.ToString(object) has 3 choices when passed null. Throw, return null, or return string.Empty. Throwing would be bad - doubly so with the assumption these are used for generated code. Returning null requires your caller do a null check - again, not a great choice in generated code. Returning string.Empty seems a reasonable choice. The rest of System.Convert deals with value types - which have a default value.
It's debatable whether returning null is more "correct", but string.Empty is definitely more usable. Changing Convert.ToString(string) would mean breaking the "no actual conversion" rule. Since System.Convert is a static utility class, each method can be logically treated on its own. There are very few real-world scenarios where this behavior should be "surprising", so let usability win over (possible) correctness.