This subject has been discussed many times and is clear: Is relying on && short-circuiting safe in .NET?
What I cannot find is a clear answer if the short circuit is also reliable when out parameter is used in the right part of the if clause:
int foo;
....
if ( false == Int32.TryParse( bar, out foo ) || 0 == foo )
When I try this code then it works. But I am just a bit worried if the behavior is reliable, because I don't know how the compiler is translating the code and accessing foo.
My question:
Is the access to foo really done after the TryParse which would consider any changed value or the compiler can sometimes read the value before TryParse [because of some optimizations] and therefore I cannot rely on this code.
Is the access to foo really done after the TryParse which would
consider any changed value
Yes, this will be the case.
or the compiler can sometimes read the value before TryParse [because
of some optimizations] and therefore I cannot rely on this code.
No, it will not read foo before the call to Int32.TryParse; one being that the logical OR will always evaluate the left side and only evaluates the right side if it's necessary. two, it will not evaluate the right side first because the foo variable is not initialised at that point which would cause a compilation error (assuming it's a local variable).
You can verify this works by using the new language features in C#:
if (!int.TryParse("0", out int foo) || foo == 0)
{
//foo == 0 or the tryparse failed
//foo still exists here though, so if it fails, it will still == 0
}
You can see that the declaration of foo happens inside the int.TryParse call, which wouldn't happen if foo == 0 was evaluated first. This is legal syntax in C# 7 and up. Really this is all syntactic sugar for this:
int foo = 0
if (!int.TryParse("0", out foo) || foo == 0)
{
//foo == 0 or the tryparse failed
}
Related
Im currently writing an C# application, targeting .NET 4.7 (C# 7). I am confused after I tried using the new way of declaring a variable utilizing the "is" keyword:
if (variable is MyClass classInstance)
This way it works, but when doing:
if (true & variable is MyClass classInstance)
{
var a = classInstance;
}
Visual Studio (I'm using 2017) shows me the the Error Use of unassigned local variable 'classInstance'. Using the short-circuting version of & (&&) it works fine. Am I missing something about the & operator? (I know using the shortcircuting versions are much more commonly used, but at this point I'm just curious)
This one hurt my head, but I think I have got it figured out.
This confusion is caused by two quirks: the way the is operation leaves the variable undeclared (not null), and the way the compiler optimizes away Boolean, but not bitwise, operations.
Quirk 1. If the cast fails, the variable is unassigned (not null)
Per the documentation for the new is syntax:
If exp is true and is is used with an if statement, varname is assigned and has local scope within the if statement only.
If you read between the lines, this means that if the is cast fails, the variable is considered unassigned. This may be counterintuitive (some might expect it to be null instead). This means that any code within the if block that relies on the variable will not compile if there is any chance the overall if clause could evaluate to true without a type match present. So for example
This compiles:
if (instance is MyClass y)
{
var x = y;
}
And this compiles:
if (true && instance is MyClass y)
{
var x = y;
}
But this does not:
void Test(bool f)
{
if (f && instance is MyClass y)
{
var x = y; //Error: Use of unassigned local variable
}
}
Quirk 2. Boolean operations are optimized away, binary ones are not
When the compiler detects a predestined Boolean result, the compiler will not emit unreachable code, and skips certain validations as a result. For example:
This compiles:
void Test(bool f)
{
object neverAssigned;
if (false && f)
{
var x = neverAssigned; //OK (never executes)
}
}
But if you use & instead of &&, it does not compile:
void Test(bool f)
{
object neverAssigned;
if (false & f)
{
var x = neverAssigned; //Error: Use of unassigned local variable
}
}
When the compiler sees something like true && it just ignores it completely. Thus
if (true && instance is MyClass y)
Is exactly the same as
if (instance is MyClass y)
But this
if (true & instance is MyClass y)
Is NOT the same. The compiler still needs to emit code that performs the & operation and uses its output in a conditional statement. Or even if it doesn't, the current C# 7 compiler apparently performs the same validations as if it were. This may seem a little strange, but bear in mind that when you use & instead of &&, there is a guarantee that the & must execute, and though it seems unimportant in this example, the general case must allow for additional complexifying factors, such as operator overloading.
How the quirks combine
In the last example, the result of the if clause is determined at run time, not compile time. So the compiler can't be certain that y will end up being assigned before the contents of the if block are executed. Thus you get
if (true & instance is MyClass y)
{
var x = y; //Error: use of unassigned local variable
}
TLDR
In the situation of a compound logical operation, c# can't be sure that the overall if condition will resolve to true if and only if the cast is successful. Absent that certainty, it can't allow access to the variable, since it might be unassigned. An exception is made when the expression can be reduced to non-compound operation at compile time, for example by removing true &&.
Workaround
I think the way we are meant to use the new is syntax is as a single condition with an if clause. Adding true && at the beginning works because the compiler simply removes it. But anything else combined with the new syntax creates ambiguity about whether the new variable will be in an unassigned state when the code block runs, and the compiler can't allow that.
The workaround of course is to nest your conditions instead of combining them:
Won't work:
void Test(bool f)
{
if (f & instance is MyClass y)
{
var x = y; //Error: Use of unassigned local variable
}
}
Works fine:
void Test(bool f)
{
if (f)
{
if (instance is MyClass y)
{
var x = y; //Works
}
}
}
This is due to the rules on definite assignment, which have a special case for &&, but don't have a special case for &. I believe it's working as intended by the C# design team, but that the specification may have a little bit more work to do, at least in terms of clarity.
From the ECMA C# 5 standard, section 5.3.3.24:
For an expression expr of the form expr-first && expr-second:
...
The definite assignment state of v after expr is determined by:
If expr-first is a constant expression with the value false, then the definite assignment state of v after expr is the same as the definite assignment state of v after expr-first.
Otherwise, if the state of v after expr-first is definitely assigned, then the state of v after expr is definitely assigned.
Otherwise, if the state of v after expr-second is definitely assigned, and the state of v after expr-first is “definitely assigned after false expression”, then the state of v after expr is definitely assigned.
Otherwise, if the state of v after expr-second is definitely assigned or “definitely assigned after true expression”, then the state of v after expr is “definitely assigned after true expression”.
...
The relevant part for this case is the one I've highlighted in bold. classInstance is "definitely assigned after true expression" with the pattern (expr-second above), and none of the earlier cases apply, so the overall state at the end of the condition is "definitely assigned after true expression". That means within the if statement body, it's definitely assigned.
There's no equivalent clause for the & operator. While there potentially could be, it would be complicated by the types involved - it would have to only apply to the & operator when used with bool expressions, and I don't think most of the rest of definite assignment deals with types in that way.
Note that you don't need to use pattern matching to see this.
Consider the following program:
using System;
class Program
{
static void Main()
{
bool a = false;
bool x;
bool y = true;
if (true & (y && (x = a)))
{
Console.WriteLine(x);
}
}
}
The expression y && (x = a) is another one where x ends up being "definitely assigned after true expression". Again, the code above fails to compile due to x not being definitely assigned, whereas if you change the & to && it compiles. So at least this isn't an issue in pattern matching itself.
What confuses me somewhat is why x isn't still "definitely assigned after true expression" due to 5.3.3.21 ("5.3.3.21 General rules for expressions with embedded expressions"), which contains:
The definite assignment state of v at the end of expr is the same as the definite assignment state at the end of exprn.
I suspect this is meant to only include "definitely assigned" or "not definitely assigned", rather than including the "definitely assigned after true expression" part - although it's not as clear as it should be.
Am I missing something about the & operator?
No, I don't think so. Your expectations seem correct to me. Consider this alternative example, which does compile without error:
object variable = null;
MyClass classInstance;
if (true & ((classInstance = variable as MyClass) != null))
{
var a = classInstance;
}
var b = classInstance;
(To me, it's more interesting to consider the assignment outside the if body, since that's where the short-circuiting would affect behavior.)
With the explicit assignment, the compiler recognizes classInstance as definitely assigned, in the assignments of both a and b. It should be able to do the same thing with the new syntax.
With logical and, short-circuiting or not shouldn't matter. Your first value is true, so the second half should always need to be evaluated to get the whole expression. As you've noted, the compiler does treat the & and && differently though, which is unexpected.
A variation on this is this code:
static void M3()
{
object variable = null;
if (true | variable is MyClass classInstance)
{
var a = classInstance;
}
}
The compiler correctly identifies classInstance as not definitely assigned when || is used, but has the same apparent misbehavior with | (i.e. also saying that classInstance is not definitely assigned), even though with the non-short-circuiting operator, the assignment must happen regardless.
Again, the above works correctly with the assignment is explicit, rather than using the new syntax.
If this were just about the definite assignment rules not being addressed with the new syntax, then I would expect && to be as broken as &. But it's not. The compiler handles that correctly. And indeed, in the feature documentation (I hesitate to say "specification", because there's no ECMA-ratified C# 7 specification yet), it reads:
The type_pattern both tests that an expression is of a given type and casts it to that type if the test succeeds. This introduces a local variable of the given type named by the given identifier. That local variable is definitely assigned when the result of the pattern-matching operation is true. [emphasis mine]
Since short-circuiting produces correct behavior without pattern matching, and since pattern matching produces correct behavior without short-circuiting (and definite-assignment is explicitly addressed in the feature description), I would say this is straight-up a compiler bug. There's probably some overlooked interaction between non-short-circuiting Boolean operators and the way the pattern-matched expression is evaluated that causes the definite assignment to get lost in the shuffle.
You should consider reporting it to the authorities. I think these days, the Roslyn GitHub issue-tracking is where they track this sort of thing. It might help if you explain in your report how you found this and why that particular syntax is important in your scenario (since in the code you posted, the && operator works equivalently…the non-short-circuiting & doesn't seem to confer any advantage to the code).
This question already has answers here:
c# casting with is and as
(5 answers)
Closed 6 years ago.
In C# there is is operator for checking if object is compatible with some type. This operators tries to cast object to some type and if casting is successful it returns true (or false if casting fails).
From Jeffrey Richter CLR via C#:
The is operator checks whether an object is compatible with a given
type, and the result of the evaluation is a Boolean: true or false.
if (o is Employee)
{
Employee e = (Employee) o;
// Use e within the remainder of the 'if' statement.
}
In this code, the CLR is actually checking the object’s type twice:
The is operator first checks to see if o is compatible with the
Employee type. If it is, inside the if statement, the CLR again
verifies that o refers to an Employee when performing the cast. The
CLR’s type checking improves security, but it certainly comes at a
performance cost, because the CLR must determine the actual type of
the object referred to by the variable (o), and then the CLR must walk
the inheritance hierarchy, checking each base type against the
specified type (Employee).
Also, from the same book:
Employee e = o as Employee;
if (e != null)
{
// Use e within the 'if' statement.
}
In this code, the CLR checks if o is compatible with the Employee
type, and if it is, as returns a non-null reference to the same
object. If o is not compatible with the Employee type, the as operator
returns null. Notice that the as operator causes the CLR to verify an
object’s type just once. The if statement simply checks whether e is
null; this check can be performed faster than verifying an object’s
type.
So, my question is: why do we need is operator? Which are the cases when is operator is more preferable over as.
why do we need is operator?
We don't need it. It is redundant. If the is operator were not in the language you could emulate it by simply writing
(x as Blah) != null
for reference types and
(x as Blah?) != null
for value types.
In fact, that is all is is; if you look at the IL, both is and as compile down to the same IL instruction.
Your first question cannot be answered because it presumes a falsehood. Why do we need this operator? We don't need it, so there is no reason why we need it. So this is not a productive question to ask.
Which are the cases when is operator is more preferable over as.
I think you meant to ask
why would I write the "inefficient" code that does two type checks -- is followed by a cast -- when I could write the efficient code that does one type check using as and a null check?
First of all, the argument from efficiency is weak. Type checks are cheap, and even if they are expensive, they're probably not the most expensive thing you do. Don't change code that looks perfectly sensible just to save those few nanoseconds. If you think the code looks nicer or is easier to read using is rather than as, then use is rather than as. There is no product in the marketplace today whose success or failure was predicated on using as vs is.
Or, look at it the other way. Both is and as are evidence that your program doesn't even know what the type of a value is, and programs where the compiler cannot work out the types tend to be (1) buggy, and (2) slow. If you care so much about speed, don't write programs that do one type test instead of two; write programs that do zero type tests instead of one! Write programs where typing can be determined statically.
Second, there are situations in C# where you need an expression, not a statement, and C# unfortunately does not have "let" expressions outside of queries. You can write
... e is Manager ? ((Manager)e).Reports : 0 ...
as an expression but pre C# 7 there was no way to write
Manager m = e as Manager;
in an expression context. In a query you could write either
from e in Employees
select e is Manager ? ((Manager)e).Reports : 0
or
from e in Employees
let m = e as Manager
select m == null ? 0 : m.Reports
but there is no "let" in an expression context outside of queries. It would be nice to be able to write
... let m = e as Manager in m == null ? 0 : m.Reports ...
in an arbitrary expression. But we can get some of the way there. In C# 7 you'll (probably) be able to write
e is Manager m ? m.Reports : 0 ...
which is a nice sugar and eliminates the inefficient double-check. The is-with-new-variable syntax nicely combines everything together: you get a Boolean type test and a named, typed reference.
Now, what I just said is a slight lie; as of C# 6 you can write the code above as
(e as Manager)?.Reports ?? 0
which does the type check once. But pre C# 6.0 you were out of luck; you pretty much always had to do the type check twice if you were in an expression context.
With C# 7 operator is can be less wordy then as
Compare this
Employee e = o as Employee;
if (e != null)
{
// Use e within the 'if' statement.
}
and this
if (o is Employee e)
{
// Use e within the 'if' statement.
}
Information from here. Section Pattern Matching with Is Expressions
There are times when you might want to just check the type not actually go through the effort of casting it.
As such you can just use the is operator to confirm your object is compatible, and do whatever logic you want. Whereas in other scenarios you may just want to cast (Safely or otherwise) and utilise the returned value.
Ultimately because is just returns a boolean, you can use it for checking.
as and the (T)MyType type casting are used to safely casting to null, or throwing an Exception respectively
How to: Safely Cast by Using as and is Operators (C# Programming Guide)
At least one use-case I can think of is when comparing if a certain variable is a value type (as cannot be used in that case).
For instance,
var x = ...;
if(x is bool)
{
// do something
}
It can also be useful when you don't necessarily need to use the cast, but are simply interested whether or not something is of a certain underlying type.
According to the Constraints on Type Parameters (C# Programming Guide) documentation it says, and I quote:
When applying the where T : class constraint, avoid the == and !=
operators on the type parameter because these operators will test for
reference identity only, not for value equality. This is the case even
if these operators are overloaded in a type that is used as an
argument. The following code illustrates this point; the output is
false even though the String class overloads the == operator.
With the following example:
public static void OpTest<T>(T s, T t) where T : class
{
System.Console.WriteLine(s == t);
}
static void Main()
{
string s1 = "target";
System.Text.StringBuilder sb = new System.Text.StringBuilder("target");
string s2 = sb.ToString();
OpTest<string>(s1, s2);
}
However, ReSharper (I've just started using the demo/trial version to see if it is any worth to me) gives a hint/tip to do null-checking of parameters like so:
public Node(T type, Index2D index2D, int f, Node<T> parent)
{
if (type == null) throw new ArgumentNullException("type");
if (index2D == null) throw new ArgumentNullException("index2D");
if (parent == null) throw new ArgumentNullException("parent");
}
(T is constrained with where T : class, new())
Can I safely follow ReSharper without running into problems that the C# documentation is trying to tell me to avoid?
Yes, that is fine. The documentation says not to compare two parameters because you're doing a reference comparison. The code ReSharper suggests is just ensuring the reference you've been passed is not null so that is safe to do.
In my opinion the main reason the C# docs recommend you don't do that is because if for example you do == with two strings that were passed as T1 and T2 it will do a reference comparison. Any other time you do stringA == stringB it will do a value comparison with the overload in the string class. It's really just warning against doing these types of comparisons because the operator overload that would normally be used (if you used that operator on two types declared in local scope) is not.
Yes.
Only a very strange implementation of the == operator would make x == null mean something other than ReferenceEquals(x, null). If you are using classes that have such a strange implementation of equality, you have bigger problems than it being inconsistent when checking for null using a generic type argument.
The documentation is telling you that == will use a reference comparison regardless of what the actual type of T is, even if it has overloaded the == operator. In many cases this isn't what users would expect.
In the case of the resharper code it's comparing the variable to null, which you want to be a reference comparison, not a value comparison, so what the documentation is warning you of is the correct behavior here.
You could however make it a bit more explicit by writing something like this, just to be clearer:
if(object.ReferenceEquals(type, null))//...
Yes, it's generally ok to assume that null is a special case, and that it's okay to do == or != on it. This is due, in part at least, to the fact that the recommendation for overriding Equals says that x.Equals(null) should be false.
If you didn't have the T : class constraint, it'd be a different story, since you'd see some possibly-unexpected behavior for structs and nullable structs (you might want to do default(T) instead of null there).
And if you want to be explicit that yes, you know you're comparing the reference to null, you can always use object.ReferenceEquals(x, null) (I'd just go with ==/!=, though).
I would say that this is OK.
null is kinda special, so you are not doing an actual comparison between two reference variables, you are just assuring that the reference variable is not a null reference.
Compare this with, for example, SQL. There you have a completely different syntax for comparing ( a = b) and for the null check (a is null). The guideline is that you should avoid the first, but the second is ok.
I came across some statements while surfing the .net source:
If( null != name)
If( false == check())
what is the difference between (name != null) and (Check() == false) statements from the above statements?
Can any one get me clear of this? Please.
I'm sure this is a duplicate, but I can't find it right now.
Code with the constant first is usually written by developers with an (old) background in C, where you might write this:
if (NULL == x)
instead of this:
if (x == NULL)
just in case you made a typo like this:
if (x = NULL) // Valid, but doesn't do what you want.
In most cases in C#, you don't need to do this - because the condition in an if statement has to be convertible to bool.
It would make a difference when you're comparing to a Boolean constant:
if ((Check() = false) // Eek!
... which is one reason to just write:
if (!Check())
Basically, putting the variable first is generally more readable - and in C# the reason for putting the constant first is almost never applicable anyway. (Modern C and C++ compilers will usually give you a warning anyway, so it's no longer a great idea in those languages either, as far as I'm aware...)
In terms of the order of evaluation, there is a theoretical difference in that the left-hand operand will be evaluated before the right-hand operand... but as the constants null and false don't have any side-effects, there's no real difference.
I think this style comes with C/C++ background, where the operator = which can also be used inside if statement, which could be used for comparison as well as assignment.
The statement
if( null != name) is same as if(name != null), so as the other statement
To safeguard from mistakenly assigning the value to the left hand side, this style of check is used in C/C++,
Take a look Coding Horror Yoda Conditions
Using if(constant == variable) instead of if(variable == constant),
like if(4 == foo). Because it's like saying "if blue is the sky" or
"if tall is the man".
The odd reversal looks like it was written by a C/C++ programmer... basically, in C/C++, if you accidentally type:
if(name = null)
instead of
if(name == null)
then that compiles and changes the value to null when invoked. A C/C++ habit to avoid this mistake is to use a constant such as null / false always on the left. This is because you can't reassign a constant, so it generates a compiler error if you use = instead of ==. In C# it (if(a = null)) doesn't usually compile, so that is not a concern.
There is no difference re the position here - both sides are evaluated. Idiomatic C# for this would be simply:
if(name != null) ...
if(!check()) ...
In c#, is there any difference in the excecution speed for the order in which you state the condition?
if (null != variable) ...
if (variable != null) ...
Since recently, I saw the first one quite often, and it caught my attention since I was used to the second one.
If there is no difference, what is the advantage of the first one?
It's a hold-over from C. In C, if you either use a bad compiler or don't have warnings turned up high enough, this will compile with no warning whatsoever (and is indeed legal code):
// Probably wrong
if (x = 5)
when you actually probably meant
if (x == 5)
You can work around this in C by doing:
if (5 == x)
A typo here will result in invalid code.
Now, in C# this is all piffle. Unless you're comparing two Boolean values (which is rare, IME) you can write the more readable code, as an "if" statement requires a Boolean expression to start with, and the type of "x=5" is Int32, not Boolean.
I suggest that if you see this in your colleagues' code, you educate them in the ways of modern languages, and suggest they write the more natural form in future.
There is a good reason to use null first: if(null == myDuck)
If your class Duck overrides the == operator, then if(myDuck == null) can go into an infinite loop.
Using null first uses a default equality comparator and actually does what you were intending.
(I hear you get used to reading code written that way eventually - I just haven't experienced that transformation yet).
Here is an example:
public class myDuck
{
public int quacks;
static override bool operator ==(myDuck a, myDuck b)
{
// these will overflow the stack - because the a==null reenters this function from the top again
if (a == null && b == null)
return true;
if (a == null || b == null)
return false;
// these wont loop
if (null == a && null == b)
return true;
if (null == a || null == b)
return false;
return a.quacks == b.quacks; // this goes to the integer comparison
}
}
Like everybody already noted it comes more or less from the C language where you could get false code if you accidentally forget the second equals sign. But there is another reason that also matches C#: Readability.
Just take this simple example:
if(someVariableThatShouldBeChecked != null
&& anotherOne != null
&& justAnotherCheckThatIsNeededForTestingNullity != null
&& allTheseChecksAreReallyBoring != null
&& thereSeemsToBeADesignFlawIfSoManyChecksAreNeeded != null)
{
// ToDo: Everything is checked, do something...
}
If you would simply swap all the null words to the beginning you can much easier spot all the checks:
if(null != someVariableThatShouldBeChecked
&& null != anotherOne
&& null != justAnotherCheckThatIsNeededForTestingNullity
&& null != allTheseChecksAreReallyBoring
&& null != thereSeemsToBeADesignFlawIfSoManyChecksAreNeeded)
{
// ToDo: Everything is checked, do something...
}
So this example is maybe a bad example (refer to coding guidelines) but just think about you quick scroll over a complete code file. By simply seeing the pattern
if(null ...
you immediately know what's coming next.
If it would be the other way around, you always have to scan to the end of the line to see the nullity check, just letting you stumble for a second to find out what kind of check is made there. So maybe syntax highlighting may help you, but you are always slower when those keywords are at the end of the line instead of the front.
I guess this is a C programmer that has switched languages.
In C, you can write the following:
int i = 0;
if (i = 1)
{
...
}
Notice the use of a single equal sign there, which means the code will assign 1 to the variable i, then return 1 (an assignment is an expression), and use 1 in the if-statement, which will be handled as true. In other words, the above is a bug.
In C# however, this is not possible. There is indeed no difference between the two.
In earlier times, people would forget the '!' (or the extra '=' for equality, which is more difficult to spot) and do an assignment instead of a comparison. putting the null in front eliminates the possibility for the bug, since null is not an l-value (I.E. it can't be assigned to).
Most modern compilers give a warning when you do an assignment in a conditional nowadays, and C# actually gives an error. Most people just stick with the var == null scheme since it's easier to read for some people.
I don't see any advantage in following this convention. In C, where boolean types don't exist, it's useful to write
if( 5 == variable)
rather than
if (variable == 5)
because if you forget one of the eaqual sign, you end up with
if (variable = 5)
which assigns 5 to variable and always evaluate to true. But in Java, a boolean is a boolean. And with !=, there is no reason at all.
One good advice, though, is to write
if (CONSTANT.equals(myString))
rather than
if (myString.equals(CONSTANT))
because it helps avoiding NullPointerExceptions.
My advice would be to ask for a justification of the rule. If there's none, why follow it? It doesn't help readability
To me it's always been which style you prefer
#Shy - Then again if you confuse the operators then you should want to get a compilation error or you will be running code with a bug - a bug that come back and bite you later down the road since it produced unexpected behaviour
As many pointed out, it is mostly in old C code it was used to identify compilation error, as compiler accepted it as legal
New programming language like java, go are smart enough to capture such compilation errors
One should not use "null != variable" like conditions in code as it very unreadable
One more thing... If you are comparing a variable to a constant (integer or string for ex.), putting the constant on the left is good practice because you'll never run into NullPointerExceptions :
int i;
if(i==1){ // Exception raised: i is not initialized. (C/C++)
doThis();
}
whereas
int i;
if(1==i){ // OK, but the condition is not met.
doThis();
}
Now, since by default C# instanciates all variables, you shouldn't have that problem in that language.