Null-coalescing with an out parameter gives unexpected error - C#

Using this construct:
var dict = new Dictionary<int, string>();
var result = (dict?.TryGetValue(1, out var value) ?? false) ? value : "Default";
I get error CS0165, "Use of unassigned local variable 'value'", which is not what I expect. How could value possibly be unassigned? If the dictionary is null, the inner expression returns false, which makes the outer condition false, so the result is "Default".
What am I missing here? Is it just the compiler being unable to evaluate the expression fully, or have I messed it up somehow?

Your analysis is correct. It is not the analysis the compiler makes, because the compiler makes the analysis that is required by the C# specification. That analysis is as follows:
If the condition of a condition ? consequence : alternative expression is the compile-time constant true, then the alternative branch is not reachable; if false, then the consequence branch is not reachable; otherwise, both branches are reachable.
The condition in this case is not a constant, therefore the consequence and alternative are both reachable.
The local variable value is only definitely assigned if dict is not null, and therefore value is not definitely assigned when the consequence is reached.
But the consequence requires that value be definitely assigned.
So that's an error.
The compiler is not as smart as you, but it is an accurate implementation of the C# specification. (Note that I have not sketched out here the additional special rules for this situation, which include predicates like "definitely assigned after a true expression" and so on. See the C# spec for details.)
Incidentally, the C# 2.0 compiler was too smart. For example, if you had a condition like 0 * x == 0 for some int local x it would deduce "that condition is always true no matter what the value of x is" and mark the alternative branch as unreachable. That analysis was correct in the sense that it matched the real world, but it was incorrect in the sense that the C# specification clearly says that the deduction is only to be made for compile-time constants, and equally clearly says that expressions involving variables are not constant.
Remember, the purpose of this thing is to find bugs, and what is more likely? Someone wrote 0 * x == 0 ? foo : bar intending that it have the meaning "always foo", or that they've written a bug by accident? I fixed the bug in the compiler and since then it has strictly matched the specification.
In your case there is no bug, but the code is too complicated for the compiler to analyze, so it is probably also too complicated to expect humans to analyze. See if you can simplify it. What I might do is:
public static V GetValueOrDefault<K, V>(
    this Dictionary<K, V> d,
    K key,
    V defaultValue)
{
    // Extension method; it needs to be declared in a non-generic static class.
    if (d != null && d.TryGetValue(key, out var value))
        return value;
    return defaultValue;
}
…
var result = dict.GetValueOrDefault(1, "Default");
The goal should be to make the call site readable; I think my call site is considerably more readable than yours.

Is it just the compiler being unable to evaluate the statement fully?
Yes, more or less.
The compiler does not track "unassigned"; it tracks the opposite, "definitely assigned". It has to stop somewhere; in this case it would need to incorporate knowledge about the library method TryGetValue(), and it doesn't.
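For what it's worth, here is a minimal inline rewrite (my own sketch, not from either answer) that does compile: replacing the ?. … ?? false construction with a plain && lets the definite-assignment rules see that value is assigned whenever the condition is true.
var dict = new Dictionary<int, string>();
// 'value' is "definitely assigned after true expression" of the &&,
// so the compiler accepts its use in the consequence of the ?:
var result = dict != null && dict.TryGetValue(1, out var value) ? value : "Default";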

Related

Non-short-circuiting boolean operators and C# 7 Pattern matching

I'm currently writing a C# application targeting .NET 4.7 (C# 7). I am confused after I tried using the new way of declaring a variable utilizing the "is" keyword:
if (variable is MyClass classInstance)
This way it works, but when doing:
if (true & variable is MyClass classInstance)
{
    var a = classInstance;
}
Visual Studio (I'm using 2017) shows me the error "Use of unassigned local variable 'classInstance'". Using the short-circuiting version of & (&&), it works fine. Am I missing something about the & operator? (I know the short-circuiting versions are much more commonly used, but at this point I'm just curious.)
This one hurt my head, but I think I have got it figured out.
This confusion is caused by two quirks: the way the is operation leaves the variable unassigned (not null) when the test fails, and the way the compiler optimizes away Boolean, but not bitwise, operations.
Quirk 1. If the cast fails, the variable is unassigned (not null)
Per the documentation for the new is syntax:
If exp is true and is is used with an if statement, varname is assigned and has local scope within the if statement only.
If you read between the lines, this means that if the is cast fails, the variable is considered unassigned. This may be counterintuitive (some might expect it to be null instead). This means that any code within the if block that relies on the variable will not compile if there is any chance the overall if clause could evaluate to true without a type match present. So for example
This compiles:
if (instance is MyClass y)
{
    var x = y;
}
And this compiles:
if (true && instance is MyClass y)
{
    var x = y;
}
But this does not (here the if block can be reached when f is true even though the pattern was never evaluated):
void Test(bool f)
{
    if (f || instance is MyClass y)
    {
        var x = y; // Error: Use of unassigned local variable
    }
}
Quirk 2. Boolean operations are optimized away, bitwise ones are not
When the compiler detects a predetermined Boolean result, it will not emit unreachable code, and it skips certain validations as a result. For example:
This compiles:
void Test(bool f)
{
    object neverAssigned;
    if (false && f)
    {
        var x = neverAssigned; // OK (never executes)
    }
}
But if you use & instead of &&, it does not compile:
void Test(bool f)
{
    object neverAssigned;
    if (false & f)
    {
        var x = neverAssigned; // Error: Use of unassigned local variable
    }
}
When the compiler sees something like true && it just ignores it completely. Thus
if (true && instance is MyClass y)
Is exactly the same as
if (instance is MyClass y)
But this
if (true & instance is MyClass y)
Is NOT the same. The compiler still needs to emit code that performs the & operation and uses its result in the conditional. Or even if it doesn't, the current C# 7 compiler apparently performs the same validations as if it did. This may seem a little strange, but bear in mind that when you use & instead of &&, the right-hand operand is guaranteed to be evaluated, and though that seems unimportant in this example, the general case must allow for additional complicating factors, such as operator overloading.
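A quick illustration of that evaluation guarantee (my own sketch, not part of the answer): with & the right-hand operand runs even when the left operand is false, which is exactly why the compiler cannot simply discard it the way it discards a constant true &&.
static bool SideEffect()
{
    Console.WriteLine("right operand evaluated");
    return true;
}
// ... elsewhere:
bool off = false;
bool a = off && SideEffect(); // prints nothing: && short-circuits when the left operand is false
bool b = off & SideEffect();  // prints "right operand evaluated": & always evaluates both operands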
How the quirks combine
In the last example, the result of the if clause is determined at run time, not compile time. So the compiler can't be certain that y will end up being assigned before the contents of the if block are executed. Thus you get
if (true & instance is MyClass y)
{
    var x = y; // Error: use of unassigned local variable
}
TLDR
In the situation of a compound logical operation, C# can't be sure that the overall if condition will resolve to true if and only if the cast is successful. Absent that certainty, it can't allow access to the variable, since the variable might be unassigned. An exception is made when the expression can be reduced to a non-compound operation at compile time, for example by removing true &&.
Workaround
I think the way we are meant to use the new is syntax is as a single condition with an if clause. Adding true && at the beginning works because the compiler simply removes it. But anything else combined with the new syntax creates ambiguity about whether the new variable will be in an unassigned state when the code block runs, and the compiler can't allow that.
The workaround of course is to nest your conditions instead of combining them:
Won't work:
void Test(bool f)
{
    if (f & instance is MyClass y)
    {
        var x = y; // Error: Use of unassigned local variable
    }
}
Works fine:
void Test(bool f)
{
    if (f)
    {
        if (instance is MyClass y)
        {
            var x = y; // Works
        }
    }
}
This is due to the rules on definite assignment, which have a special case for &&, but don't have a special case for &. I believe it's working as intended by the C# design team, but that the specification may have a little bit more work to do, at least in terms of clarity.
From the ECMA C# 5 standard, section 5.3.3.24:
For an expression expr of the form expr-first && expr-second:
...
The definite assignment state of v after expr is determined by:
If expr-first is a constant expression with the value false, then the definite assignment state of v after expr is the same as the definite assignment state of v after expr-first.
Otherwise, if the state of v after expr-first is definitely assigned, then the state of v after expr is definitely assigned.
Otherwise, if the state of v after expr-second is definitely assigned, and the state of v after expr-first is “definitely assigned after false expression”, then the state of v after expr is definitely assigned.
Otherwise, if the state of v after expr-second is definitely assigned or “definitely assigned after true expression”, then the state of v after expr is “definitely assigned after true expression”.
...
The relevant part for this case is the last rule quoted above. classInstance is "definitely assigned after true expression" with the pattern (expr-second above), and none of the earlier cases apply, so the overall state at the end of the condition is "definitely assigned after true expression". That means that within the if statement body, it's definitely assigned.
There's no equivalent clause for the & operator. While there potentially could be, it would be complicated by the types involved - it would have to only apply to the & operator when used with bool expressions, and I don't think most of the rest of definite assignment deals with types in that way.
Note that you don't need to use pattern matching to see this.
Consider the following program:
using System;
class Program
{
    static void Main()
    {
        bool a = false;
        bool x;
        bool y = true;
        if (true & (y && (x = a)))
        {
            Console.WriteLine(x);
        }
    }
}
The expression y && (x = a) is another one where x ends up being "definitely assigned after true expression". Again, the code above fails to compile due to x not being definitely assigned, whereas if you change the & to && it compiles. So at least this isn't an issue in pattern matching itself.
What confuses me somewhat is why x isn't still "definitely assigned after true expression" due to 5.3.3.21 ("General rules for expressions with embedded expressions"), which contains:
The definite assignment state of v at the end of expr is the same as the definite assignment state at the end of exprn.
I suspect this is meant to only include "definitely assigned" or "not definitely assigned", rather than including the "definitely assigned after true expression" part - although it's not as clear as it should be.
Am I missing something about the & operator?
No, I don't think so. Your expectations seem correct to me. Consider this alternative example, which does compile without error:
object variable = null;
MyClass classInstance;
if (true & ((classInstance = variable as MyClass) != null))
{
    var a = classInstance;
}
var b = classInstance;
(To me, it's more interesting to consider the assignment outside the if body, since that's where the short-circuiting would affect behavior.)
With the explicit assignment, the compiler recognizes classInstance as definitely assigned, in the assignments of both a and b. It should be able to do the same thing with the new syntax.
With logical and, short-circuiting or not shouldn't matter. Your first value is true, so the second operand always has to be evaluated to get the value of the whole expression. As you've noted, the compiler does treat & and && differently though, which is unexpected.
A variation on this is this code:
static void M3()
{
    object variable = null;
    if (true | variable is MyClass classInstance)
    {
        var a = classInstance;
    }
}
The compiler correctly identifies classInstance as not definitely assigned when || is used, but has the same apparent misbehavior with | (i.e. also saying that classInstance is not definitely assigned), even though with the non-short-circuiting operator, the assignment must happen regardless.
Again, the above works correctly when the assignment is explicit, rather than using the new syntax.
If this were just about the definite assignment rules not being addressed with the new syntax, then I would expect && to be as broken as &. But it's not. The compiler handles that correctly. And indeed, in the feature documentation (I hesitate to say "specification", because there's no ECMA-ratified C# 7 specification yet), it reads:
The type_pattern both tests that an expression is of a given type and casts it to that type if the test succeeds. This introduces a local variable of the given type named by the given identifier. That local variable is definitely assigned when the result of the pattern-matching operation is true. [emphasis mine]
Since short-circuiting produces correct behavior without pattern matching, and since pattern matching produces correct behavior without short-circuiting (and definite-assignment is explicitly addressed in the feature description), I would say this is straight-up a compiler bug. There's probably some overlooked interaction between non-short-circuiting Boolean operators and the way the pattern-matched expression is evaluated that causes the definite assignment to get lost in the shuffle.
You should consider reporting it to the authorities. I think these days, the Roslyn GitHub issue-tracking is where they track this sort of thing. It might help if you explain in your report how you found this and why that particular syntax is important in your scenario (since in the code you posted, the && operator works equivalently…the non-short-circuiting & doesn't seem to confer any advantage to the code).

Need in IS operator in C# [duplicate]

In C# there is the is operator for checking whether an object is compatible with some type. This operator tries to cast the object to the type and returns true if the cast succeeds (or false if it fails).
From Jeffrey Richter CLR via C#:
The is operator checks whether an object is compatible with a given type, and the result of the evaluation is a Boolean: true or false.
if (o is Employee)
{
    Employee e = (Employee) o;
    // Use e within the remainder of the 'if' statement.
}
In this code, the CLR is actually checking the object's type twice: the is operator first checks to see if o is compatible with the Employee type. If it is, inside the if statement, the CLR again verifies that o refers to an Employee when performing the cast. The CLR's type checking improves security, but it certainly comes at a performance cost, because the CLR must determine the actual type of the object referred to by the variable (o), and then the CLR must walk the inheritance hierarchy, checking each base type against the specified type (Employee).
Also, from the same book:
Employee e = o as Employee;
if (e != null)
{
    // Use e within the 'if' statement.
}
In this code, the CLR checks if o is compatible with the Employee type, and if it is, as returns a non-null reference to the same object. If o is not compatible with the Employee type, the as operator returns null. Notice that the as operator causes the CLR to verify an object's type just once. The if statement simply checks whether e is null; this check can be performed faster than verifying an object's type.
So, my question is: why do we need the is operator? What are the cases where the is operator is preferable to as?
why do we need is operator?
We don't need it. It is redundant. If the is operator were not in the language you could emulate it by simply writing
(x as Blah) != null
for reference types and
(x as Blah?) != null
for value types.
In fact, that is all is is; if you look at the IL, both is and as compile down to the same IL instruction.
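To make the emulation concrete, here is a small snippet (mine, not from the answer) showing both forms for a reference type and for a value type:
object x = "hello";
bool viaIs = x is string;               // true
bool viaAs = (x as string) != null;     // equivalent for a reference type

object y = 42;
bool viaIsValue = y is int;             // true
bool viaAsValue = (y as int?) != null;  // for a value type, the 'as' target must be nullable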
Your first question cannot be answered because it presumes a falsehood. Why do we need this operator? We don't need it, so there is no reason why we need it. So this is not a productive question to ask.
Which are the cases when is operator is more preferable over as.
I think you meant to ask
why would I write the "inefficient" code that does two type checks -- is followed by a cast -- when I could write the efficient code that does one type check using as and a null check?
First of all, the argument from efficiency is weak. Type checks are cheap, and even if they are expensive, they're probably not the most expensive thing you do. Don't change code that looks perfectly sensible just to save those few nanoseconds. If you think the code looks nicer or is easier to read using is rather than as, then use is rather than as. There is no product in the marketplace today whose success or failure was predicated on using as vs is.
Or, look at it the other way. Both is and as are evidence that your program doesn't even know what the type of a value is, and programs where the compiler cannot work out the types tend to be (1) buggy, and (2) slow. If you care so much about speed, don't write programs that do one type test instead of two; write programs that do zero type tests instead of one! Write programs where typing can be determined statically.
Second, there are situations in C# where you need an expression, not a statement, and C# unfortunately does not have "let" expressions outside of queries. You can write
... e is Manager ? ((Manager)e).Reports : 0 ...
as an expression but pre C# 7 there was no way to write
Manager m = e as Manager;
in an expression context. In a query you could write either
from e in Employees
select e is Manager ? ((Manager)e).Reports : 0
or
from e in Employees
let m = e as Manager
select m == null ? 0 : m.Reports
but there is no "let" in an expression context outside of queries. It would be nice to be able to write
... let m = e as Manager in m == null ? 0 : m.Reports ...
in an arbitrary expression. But we can get some of the way there. In C# 7 you'll (probably) be able to write
e is Manager m ? m.Reports : 0 ...
which is a nice sugar and eliminates the inefficient double-check. The is-with-new-variable syntax nicely combines everything together: you get a Boolean type test and a named, typed reference.
Now, what I just said is a slight lie; as of C# 6 you can write the code above as
(e as Manager)?.Reports ?? 0
which does the type check once. But pre C# 6.0 you were out of luck; you pretty much always had to do the type check twice if you were in an expression context.
With C# 7, the is operator can be less wordy than as.
Compare this
Employee e = o as Employee;
if (e != null)
{
    // Use e within the 'if' statement.
}
and this
if (o is Employee e)
{
    // Use e within the 'if' statement.
}
Information from here. Section Pattern Matching with Is Expressions
There are times when you might want to just check the type, not actually go to the effort of casting it.
As such, you can just use the is operator to confirm your object is compatible and do whatever logic you want, whereas in other scenarios you may just want to cast (safely or otherwise) and use the returned value.
Ultimately, because is just returns a boolean, you can use it for checking.
as and the (T)MyType cast are used to actually perform the cast, either safely (yielding null on failure) or throwing an exception, respectively.
How to: Safely Cast by Using as and is Operators (C# Programming Guide)
At least one use case I can think of is checking whether a certain variable is of a value type (as cannot be used in that case).
For instance,
var x = ...;
if (x is bool)
{
    // do something
}
It can also be useful when you don't necessarily need to use the cast, but are simply interested whether or not something is of a certain underlying type.
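A short snippet of my own (assuming a boxed bool in a variable x) showing why as is unavailable there while is still works, including the C# 7 pattern form:
object x = true;
// bool b = x as bool;      // does not compile: 'as' needs a reference type or nullable value type
bool? maybe = x as bool?;   // works, but needs the nullable wrapper; null when x is not a bool
if (x is bool flag)         // C# 7: test and capture the unboxed value in one step
{
    // use flag here
}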

Does the ?? operator guarantee only to run the left-hand argument once?

For example, given this code:
int? ReallyCleverFunction()
{
    return 2; // Imagine this did something that took a while
}
void Something()
{
    int i = ReallyCleverFunction() ?? 42;
}
... is it guaranteed that the function will only be called once? In my test it's only called once, but I can't see any documentation stating I can rely on that always being the case.
Edit
I can guess how it is implemented, but c'mon: we're developers. We shouldn't be muddling through on guesses and assumptions. For example, will all future implementations be the same? Will another platform's implementation of the language be the same? That depends on the specifications of the language and what guarantees it offers. For example, a future or different platform implementation (not a good one, but it's possible) may do this in the ?? implementation:
return ReallyCleverFunction() == null ? 42 : ReallyCleverFunction();
That would call ReallyCleverFunction twice if it didn't return null. (Although this looks like a ridiculous implementation, if you replace the function call with a nullable variable it looks quite reasonable: return a == null ? 42 : a.)
As stated above, I know in my test it's only called once, but my question is does the C# Specification guarantee/specify that the left-hand side will only be called once? If so, where? I can't see any such mention in the C# Language Specification for ?? operator (where I originally looked for the answer to my query).
The ?? operator will evaluate the left side once, and the right side zero or once, depending on the result of the left side.
Your code
int i = ReallyCleverFunction() ?? 42;
Is equivalent to (and this is actually very close to what the compiler actually generates):
int? temp = ReallyCleverFunction();
int i = temp.HasValue ? temp.Value : 42;
Or more simply:
int i = ReallyCleverFunction().GetValueOrDefault(42);
Either way you look at it, ReallyCleverFunction is only called once.
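One way to convince yourself (a throwaway sketch of mine, using a hypothetical Producer standing in for ReallyCleverFunction) is to count the calls with a side effect:
int calls = 0;
int? Producer()
{
    calls++;
    return null; // force the right-hand side of ?? to be used
}

int i = Producer() ?? 42;
Console.WriteLine($"{i}, calls = {calls}"); // prints "42, calls = 1"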
The ?? operator has nothing to do with how the left-hand operand is produced. The left-hand expression runs first, and then the ?? operator evaluates its result.
In your code, ReallyCleverFunction will only run once.
It will only be called once. If the value returned by ReallyCleverFunction is null, the value 42 is used; otherwise the returned value is used.
It will evaluate the function, then evaluate the left value (which is the result of the function) of the ?? operator. There is no reason it would call the function twice.
It is called once, and following the documentation I believe this should be sufficient to assume it is only called once:
It returns the left-hand operand if the operand is not null; otherwise it returns the right operand.
?? operator
It will only be run once because the ?? operator is just a shortcut.
Your line
int i = ReallyCleverFunction() ?? 42;
is the same as
int? temp = ReallyCleverFunction();
int i;
if (temp != null)
{
    i = temp.Value;
}
else
{
    i = 42;
}
The compiler does the hard work.
Firstly, the answer is yes, it does guarantee that the left-hand operand will be evaluated only once, by inference from section 7.13 of the official C# language specification.
Section 7.13 always treats a as a value that is evaluated once and then used in the rest of the processing. It says the following about the ?? operator (in every bullet, note the phrase "a is first evaluated"):
• If b is a dynamic expression, the result type is dynamic. At run-time, a is first evaluated. If a is not null, a is converted to dynamic, and this becomes the result. Otherwise, b is evaluated, and this becomes the result.
• Otherwise, if A exists and is a nullable type and an implicit conversion exists from b to A0, the result type is A0. At run-time, a is first evaluated. If a is not null, a is unwrapped to type A0, and this becomes the result. Otherwise, b is evaluated and converted to type A0, and this becomes the result.
• Otherwise, if A exists and an implicit conversion exists from b to A, the result type is A. At run-time, a is first evaluated. If a is not null, a becomes the result. Otherwise, b is evaluated and converted to type A, and this becomes the result.
• Otherwise, if b has a type B and an implicit conversion exists from a to B, the result type is B. At run-time, a is first evaluated. If a is not null, a is unwrapped to type A0 (if A exists and is nullable) and converted to type B, and this becomes the result. Otherwise, b is evaluated and becomes the result.
As a side note, the link given at the end of the question is not the C# language specification, despite its first-place ranking in a Google search for the "C# language specification".

Why doesn't null evaluate to false?

What is the reason null doesn't evaluate to false in conditionals?
I first thought it was about assignments, to avoid the bug of using = instead of ==, but this could easily be disallowed by the compiler.
if (someClass = someValue) // cannot convert someClass to bool. Ok, nice
if (someClass) // Cannot convert someClass to bool. Why?
if (someClass != null) // More readable?
I think it's fairly reasonable to assume that null means false. There are other languages that use this too, and I've not had a bug because of it.
Edit: And I'm of course referring to reference types.
A good comment by Daniel Earwicker on the assignment bug... This compiles without a warning because it evaluates to bool:
bool bool1 = false, bool2 = true;
if (bool1 = bool2)
{
// oops... false == true, and bool1 became true...
}
It's a specific design feature in the C# language: if statements accept only a bool.
IIRC this is for safety: specifically, so that your first if (someClass = someValue) fails to compile.
Edit: One benefit is that it makes the if (42 == i) convention ("yoda comparisons") unnecessary.
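For example (my illustration): the assignment-instead-of-comparison typo simply doesn't compile when the variable isn't a bool, so the reversed ordering buys you nothing.
int i = 0;
// if (i = 42) { }  // does not compile: cannot implicitly convert type 'int' to 'bool'
if (i == 42) { }    // just as safe as the Yoda form if (42 == i)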
"I think it's fairly reasonable to assume that null means false"
Not in C#. false is a value of the Boolean struct, which is a value type, and non-nullable value types cannot be null. If you wanted to do what you describe, you'd have to create a custom conversion from your particular type to boolean:
public class MyClass
{
    public static implicit operator bool(MyClass instance)
    {
        return instance != null;
    }
}
With the above, I could then do:
if (instance) {
}
etc.
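For completeness (my addition, using the MyClass above): the conversion also gives a null reference the behavior the question asks for, because the operator receives null and returns false.
MyClass instance = null;
if (instance)   // calls the implicit operator with a null reference
{
    // never reached: instance != null is false
}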
"I think it's fairly reasonable to assume that null means false"
I don't agree. IMHO, more often than not, false means "no". Null means "I don't know"; i.e. completely indeterminate.
One thing that comes to mind: what about value types, like int? An int can't be null, so would it always evaluate to true? You could assume that 0 is false, but that starts to get really complicated, because 0 is a valid value (where maybe 0 should evaluate to true, because the programmer set it) and not just a default value.
There are a lot of edge cases where null isn't an option, or sometimes it's an option, and other times it's not.
They put in things like this to protect the programmer from making mistakes. It's along the same lines as why you can't fall through in case statements.
Just use if(Convert.ToBoolean(someClass))
http://msdn.microsoft.com/en-us/library/wh2c31dd.aspx
Parameters
value
Type: System.Object
An object that implements the IConvertible interface, or null.
Return Value
Type: System.Boolean
true or false, which reflects the value returned by invoking the IConvertible.ToBoolean method for the underlying type of value. If value is null, the method returns false.
As far as I know, this is a feature that you see in dynamic languages, which C# is not (per the language specification, an if statement only accepts a bool or an expression that evaluates to bool).
I don't think it's reasonable to assume that null is false in every case. It makes sense in some cases, but not in others. For example, assume that you have a flag that can have three values: set, unset, and un-initialized. In this case, set would be true, unset would be false and un-initialized would be null. As you can see, in this case the meaning of null is not false.
Because null and false are different things.
A perfect example is bool? foo
If foo's value is true, then its value is true.
If foo's value is false, then its value is false
If foo has nothing assigned to it, its value is null.
These are three logically separate conditions.
Think of it another way
"How much money do I owe you?"
"Nothing" and "I don't have that information" are two distinctly separate answers.
What is the reason null doesn't evaluate to false in conditionals?
I first thought about assignments to avoid the bug of using = instead of ==
That isn't the reason. We know this because if the two variables being compared happen to be of type bool then the code will compile quite happily:
bool a = ...
bool b = ...
if (a = b)
    Console.WriteLine("Weird, I expected them to be different");
If b is true, the message is printed (and a is now true, making the subsequent debugging experience consistent with the message, thus confusing you even more...)
The reason null is not convertible to bool is simply that C# avoids implicit conversion unless requested by the designer of a user-defined type. The C++ history book is full of painful stories caused by implicit conversions.
Structurally, most people who "cannot think of any technological reason null should be equal to false" get it wrong.
Code is run by CPUs.
Most (if not all) CPUs have bits, groups of bits and interpretations of groups of bits. That said, something can be 0, 1, a byte, a word, a dword, a qword and so on.
Note that on the x86 platform, bytes are octets (8 bits) and words are usually 16 bits, but this is not a necessity. Older CPUs had words of 4 bits, and even today's low-end embedded controllers often use 7 or 12 bits per word.
That said, something is either "equal", "zero", "greater", "less", "greater or equal", or "less or equal" in machine code. There is no such thing as null, false or true.
As a convention, true is 1, false is 0, and a null pointer is either 0x00, 0x0000, 0x00000000, or 0x0000000000000000, depending on address bus width.
C# is one of the exceptions, as it is an indirect type, where the two possible values 0 and 1 are not an immediate value, but an index of a structure (think enum in C, or PTR in x86 assembly).
This is by design.
It is important to note, though, that such design decisions are elaborate decisions, while the traditional, straightforward way is to assume that 0, null and false are equal.
C# doesn't convert the parameter the way C++ does. You need to explicitly convert the value to a boolean if you want the if statement to accept it.
It's simply the type system of C# compared to languages like PHP, Perl, etc.
A condition only accepts Boolean values; null does not have the type Boolean, so it doesn't work there.
As for the NULL example in C/C++ you mentioned in another comment: it has to be said that C traditionally had no boolean type (C++ does have bool, though it converts freely to and from integers, but that's another matter), and neither language has null references, only NULL (=> 0) pointers.
Of course the compiler designers could implement an automatic conversion from any nullable type to boolean, but that would cause other problems, e.g.:
Assuming that foo is not null:
if (foo)
{
    // do stuff
}
Which state of foo is true?
Always, as long as it's not null?
But what if you want your type to be convertible to boolean (e.g. from your tri-state or quantum-logic class)?
That would mean you would have two different conversions to bool, the implicit and the explicit, which would behave differently.
I don't even dare to imagine what should happen if you do
if (!!foo) // common pattern in C to normalize a value used as boolean;
           // in this case it might be abused to create a boolean from an object
{
}
I think the forced (foo == null) check is good since it also adds clarity to your code; it's easier to understand what you really check for.

Are there any good reasons why ternaries in C# are limited?

Fails:
object o = ((1==2) ? 1 : "test");
Succeeds:
object o;
if (1 == 2)
{
    o = 1;
}
else
{
    o = "test";
}
The error in the first statement is:
Type of conditional expression cannot be determined because there is no implicit conversion between 'int' and 'string'.
Why does there need to be, though? I'm assigning those values to a variable of type object.
Edit: The example above is trivial, yes, but there are examples where this would be quite helpful:
int? subscriptionID; // comes in as a parameter
EntityParameter p1 = new EntityParameter("SubscriptionID", DbType.Int32)
{
    Value = ((subscriptionID == null) ? DBNull.Value : subscriptionID),
};
use:
object o = ((1==2) ? (object)1 : "test");
The issue is that the type of the conditional expression cannot be unambiguously determined. That is to say, between int and string there is no best choice: the compiler only uses the type of one operand when the other operand converts to it implicitly, and here neither int nor string converts to the other, so you have to supply a cast yourself.
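To illustrate that rule with my own examples (not the answerer's): the compiler picks one operand's type only when the other operand converts to it implicitly.
long l = (1 == 2) ? 1 : 2L;                 // fine: int converts implicitly to long, so the type is long
object ok = (1 == 2) ? (object)1 : "test";  // fine: string converts implicitly to object
// object bad = (1 == 2) ? 1 : "test";      // error: neither int nor string converts to the other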
Edit:
In your second example:
int? subscriptionID; // comes in as a parameter
EntityParameter p1 = new EntityParameter("SubscriptionID", DbType.Int32)
{
    Value = subscriptionID.HasValue ? (object)subscriptionID : DBNull.Value,
};
PS:
That is not called the 'ternary operator.' It is a ternary operator, but it is called the 'conditional operator.'
Though the other answers are correct, in the sense that they make true and relevant statements, there are some subtle points of language design here that haven't been expressed yet. Many different factors contribute to the current design of the conditional operator.
First, it is desirable for as many expressions as possible to have an unambiguous type that can be determined solely from the contents of the expression. This is desirable for several reasons. For example: it makes building an IntelliSense engine much easier. You type x.M(some-expression. and IntelliSense needs to be able to analyze some-expression, determine its type, and produce a dropdown BEFORE IntelliSense knows what method x.M refers to. IntelliSense cannot know what x.M refers to for sure if M is overloaded until it sees all the arguments, but you haven't typed in even the first argument yet.
Second, we prefer type information to flow "from inside to outside", because of precisely the scenario I just mentioned: overload resolution. Consider the following:
void M(object x) {}
void M(int x) {}
void M(string x) {}
...
M(b ? 1 : "hello");
What should this do? Should it call the object overload? Should it sometimes call the string overload and sometimes call the int overload? What if you had another overload, say M(IComparable x) -- when do you pick it?
Things get very complicated when type information "flows both ways". Saying "I'm assigning this thing to a variable of type object, therefore the compiler should know that it's OK to choose object as the type" doesn't wash; it's often the case that we don't know the type of the variable you're assigning to because that's what we're in the process of attempting to figure out. Overload resolution is exactly the process of working out the types of the parameters, which are the variables to which you are assigning the arguments, from the types of the arguments. If the types of the arguments depend on the types to which they're being assigned, then we have a circularity in our reasoning.
Type information does "flow both ways" for lambda expressions; implementing that efficiently took me the better part of a year. I've written a long series of articles describing some of the difficulties in designing and implementing a compiler that can do analysis where type information flows into complex expressions based on the context in which the expression is possibly being used; part one is here:
http://blogs.msdn.com/ericlippert/archive/2007/01/10/lambda-expressions-vs-anonymous-methods-part-one.aspx
You might say "well, OK, I see why the fact that I'm assigning to object cannot be safely used by the compiler, and I see why it's necessary for the expression to have an unambiguous type, but why isn't the type of the expression object, since both int and string are convertible to object?" This brings me to my third point:
Third, one of the subtle but consistently-applied design principles of C# is "don't produce types by magic". When given a list of expressions from which we must determine a type, the type we determine is always in the list somewhere. We never magic up a new type and choose it for you; the type you get is always one that you gave us to choose from. If you say to find the best type in a set of types, we find the best type IN that set of types. In the set {int, string}, there is no best common type, the way there is in, say, "Animal, Turtle, Mammal, Wallaby". This design decision applies to the conditional operator, to type inference unification scenarios, to inference of implicitly typed array types, and so on.
The reason for this design decision is that it makes it easier for ordinary humans to work out what the compiler is going to do in any given situation where a best type must be determined; if you know that a type that is right there, staring you in the face, is going to be chosen then it is a lot easier to work out what is going to happen.
It also avoids us having to work out a lot of complex rules about what's the best common type of a set of types when there are conflicts. Suppose you have types {Foo, Bar}, where both classes implement IBlah, and both classes inherit from Baz. Which is the best common type, IBlah, that both implement, or Baz, that both extend? We don't want to have to answer this question; we want to avoid it entirely.
Finally, I note that the C# compiler actually gets the determination of the types subtly wrong in some obscure cases. My first article about that is here:
http://blogs.msdn.com/ericlippert/archive/2006/05/24/type-inference-woes-part-one.aspx
It's arguable that in fact the compiler does it right and the spec is wrong; the implementation design is in my opinion better than the spec'd design.
Anyway, that's just a few reasons for the design of this particular aspect of the ternary operator. There are other subtleties here, for instance, how the CLR verifier determines whether a given set of branching paths are guaranteed to leave the correct type on the stack in all possible paths. Discussing that in detail would take me rather far afield.
Why feature X is this way is often a very hard question to answer; it's much easier to describe the actual behavior.
My educated guess as to why: the conditional operator lets you succinctly use a boolean expression to pick between two related values. They must be related because they are being used in a single location. If the user instead picks two unrelated values, perhaps they had a subtle typo or bug in their code, and the compiler is better off alerting them to this rather than implicitly casting to object, which may be something they did not expect.
"int" is a primitive type, not an object while "string" is considered more of a "primitive object". When you do something like "object o = 1", you're actually boxing the "int" to an "Int32". Here's a link to an article about boxing:
http://msdn.microsoft.com/en-us/magazine/cc301569.aspx
Generally, boxing should be avoided due to performance loses that are hard to trace.
When you use a ternary expression, the compiler does not look at the assignment variable at all to determine what the final type is. To break down your original statement into what the compiler is doing:
Statement:
object o = ((1==2) ? 1 : "test");
Compiler:
1. What are the types of "1" and "test" in '((1==2) ? 1 : "test")'? Do they match?
2. Does the final type from #1 match the assignment operator type for 'object o'?
Since the compiler doesn't evaluate #2 until #1 is done, it fails.
