In C# 8, how do I detect impossible null checks?

I've started using nullable reference types in C# 8. So far, I'm loving the improvement except for one small thing.
I'm migrating an old code base, and it's filled with a lot of redundant or unreachable code, something like:
void Blah(SomeClass a) {
    if (a == null) {
        // this should be unreachable, since a is not nullable
    }
}
Unfortunately, I don't see any warning settings that can flag this code for me! Was this an oversight by Microsoft, or am I missing something?
I also use ReSharper, but none of its warning settings appear to capture this either. Has anybody else found a solution to this?
Edit: I'm aware that technically this is still reachable because the nullability checks aren't bulletproof. That's not really the point. In a situation like this, where I declare a parameter as NOT nullable, it is usually a mistake to check whether it's null. In the rare event that null gets passed in as a non-nullable type, I'd prefer to see the NullReferenceException and track down the offending code that passed in null by mistake.

It's really important to note that not only are the nullability checks not bulletproof, but while they're designed to discourage callers from passing null references, they do nothing to prevent it. Code that passes null to this method can still compile, and there isn't any runtime validation of the parameter values themselves.
If you’re certain that all callers will be using C# 8’s nullability context—e.g., this is an internal method—and you’re really diligent about resolving all warnings from Roslyn’s static flow analysis (e.g., you’ve configured your build server to treat them as errors) then you’re correct that these null checks are redundant.
As noted in the migration guide, however, any external code that isn’t using C# nullability context will be completely oblivious to this:
The new syntax doesn't provide runtime checking. External code might circumvent the compiler's flow analysis.
Given that, it’s generally considered a best practice to continue to provide guard clauses and other nullability checks in any public or protected members.
In fact, if you use Microsoft’s Code Analysis package—which I’d recommend—it will warn you to use a guard clause in this exact situation. They considered removing this for code in C# 8’s nullability context, but decided to maintain it due to the above concerns.
When you get these warnings from Code Analysis, you can wrap your code in a null check, as you've done here. But you can also throw an exception. In fact, you could throw another NullReferenceException—though that's definitely not recommended. In a case like this, you should instead throw an ArgumentNullException, and pass the name of the parameter to the constructor:
void Blah(SomeClass a) {
    if (a == null) {
        throw new ArgumentNullException(nameof(a));
    }
    …
}
This is much preferred over throwing a NullReferenceException at the source because it communicates to callers what they can do to avoid this scenario by explicitly naming the exact parameter (in this case) that was passed as a null. That's more useful than just getting a NullReferenceException—and, possibly a reference to your internal code—where the exception occurred.
Critically, this exception isn't meant to help you debug your code—that's what Code Analysis is doing for you. Instead, it's demonstrating that you've already identified the potential dereference of a null value, and you've accounted for it at the source.
Note: These guard clauses can add a lot of clutter to your code. My preference is to create a reusable internal utility that handles this via a single line. Alternatively, a single-line shorthand for the above code is:
void Blah(SomeClass a) {
    _ = a ?? throw new ArgumentNullException(nameof(a));
}
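For the reusable-utility approach mentioned above, a minimal sketch might look like the following (the Guard class and its AgainstNull method are illustrative names, not an existing API; SomeClass is the question's own type):

using System;

internal static class Guard
{
    // Returns the checked value so the guard can be used inline in assignments and constructors.
    internal static T AgainstNull<T>(T? value, string paramName) where T : class =>
        value ?? throw new ArgumentNullException(paramName);
}

// Usage:
void Blah(SomeClass a) {
    var checkedA = Guard.AgainstNull(a, nameof(a));
    // …
}

For what it's worth, .NET 6 and later ship a built-in one-liner for this, ArgumentNullException.ThrowIfNull, but that postdates C# 8.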
This is a really roundabout way of answering your original question, which is how to detect the presence of null checks made unnecessary by C#’s non-nullable reference types.
The short answer is that you can't; at this point, Roslyn's static flow analysis is focused on identifying the possibility of dereferencing null objects, not on detecting potentially extraneous checks.
The long answer, though, as outlined above, is that you shouldn’t; until Microsoft adds runtime validation, or mandates the nullability context, those null checks continue to provide value.

Related

How to find potential points of NullReferenceException in a static code analyzer utility?

We're developing a static code analysis tool that aims to improve code via some hints.
We want to find places where a developer has forgotten to check the nullability of a variable, property, or method return value and has accessed its members via dot notation, because that might throw a NullReferenceException.
For example this code:
class Program
{
    static void Main(string[] args)
    {
        var human = new Human();
        if (human.Name.Length > 10)
        {
            // Jeez! you have a long name;
        }
    }
}

public class Human
{
    public string Name { get; set; }
}
We use Mono.Cecil: we find the bodies of all methods of all types in a given assembly, and for each method body we walk its Instructions, checking for Callvirt operations. Yet that doesn't cover this example:
class Program
{
    static string name;

    static void Main(string[] args)
    {
        if (name.Length > 10)
        {
        }
    }
}
How can we find all of the accesses to members (variable, field, property, method) of a given nullable type?
Update:
In fact we're searching for OpCodes which represent member access for a given variable in IL. Is this possible?
The documentation for NullReferenceException helpfully documents the following:
The following Microsoft intermediate language (MSIL) instructions throw NullReferenceException: callvirt, cpblk, cpobj, initblk, ldelem.<type>, ldelema, ldfld, ldflda, ldind.<type>, ldlen, stelem.<type>, stfld, stind.<type>, throw, and unbox.
These break down to the following:
Array access: ldelem, ldelema, ldlen, stelem. The array reference must not be null.
Non-array member access: ldfld, ldflda, stfld. The object reference must not be null.
Method access: callvirt. The object reference must not be null. Property access is also method access, since it's calling the property getter/setter.
Pointer/reference access: cpblk, cpobj, initblk, ldind, stind. The pointer/reference must not be null. In verified managed code, these opcodes are not typically used in a context where their arguments could be null.
Throwing an exception: throw. The exception reference must not be null.
Unboxing: unbox. The object reference must not be null.
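Mechanically, enumerating those opcodes with Mono.Cecil (which the question already uses) could look something like this sketch; the opcode list is abbreviated to the object-dereferencing ones and the console reporting is purely illustrative:

using System;
using System.Linq;
using Mono.Cecil;
using Mono.Cecil.Cil;

static class NullDerefScanner
{
    // Subset of the opcodes listed above that dereference an object reference.
    static readonly OpCode[] NullThrowingOpCodes =
    {
        OpCodes.Callvirt, OpCodes.Ldfld, OpCodes.Ldflda, OpCodes.Stfld,
        OpCodes.Ldlen, OpCodes.Ldelema, OpCodes.Unbox
    };

    static void Main(string[] args)
    {
        // args[0] is the path to the assembly under analysis.
        var assembly = AssemblyDefinition.ReadAssembly(args[0]);
        foreach (var type in assembly.MainModule.Types)
            foreach (var method in type.Methods.Where(m => m.HasBody))
                foreach (var instruction in method.Body.Instructions)
                    if (NullThrowingOpCodes.Contains(instruction.OpCode))
                        Console.WriteLine($"{type.FullName}.{method.Name}: {instruction.OpCode} at IL_{instruction.Offset:X4}");
    }
}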
Tracing the opcode arguments back to variables/fields is another question altogether. This can be arbitrarily complicated since the opcodes only care about what's on the stack, not where it came from. In some cases you may be dealing with expressions (a[0].SomeMethod().FieldAccess, where any of a, a[0] and a[0].SomeMethod() could be null when they're not supposed to be).
You are better off not checking this on the IL level, but using Roslyn to provide you with analysis on the language level. Producing high quality feedback is much simpler with access to the source.
Even then, be aware that high quality static analysis for nullability isn't easy. You can certainly write an analyzer that will happily warn at every possible case where the programmer may have forgotten to check, but an analyzer like that becomes almost useless if the programmer is forced to insert tons of superfluous checks for references that are obviously never null. If you tie this to a TFS check-in policy, be prepared to receive death threats from both developers and managers who want to know why productivity has taken a nosedive.
There's a reason existing tools like Resharper add a lot of attributes for controlling the analysis, and there's a proposal up to add nullability checking to C# itself. Know what you're getting into before reinventing the wheel.

CC Suggesting Redundant Ensures

I have a piece of code which looks a little like this:
public TReturn SubRegion(TParam foo)
{
    Contract.Requires(foo != null);
    Contract.Ensures(Contract.Result<TReturn>() != null);
    if (!CheckStuff(foo))
        foo.Blah();
    return OtherStuff(foo);
}
CC is giving me a warning:
Warning 301 CodeContracts: Consider adding the postcondition Contract.Ensures(Contract.Result<TReturn>() != null); to provide extra-documentation to the library clients
Which is obviously completely redundant! I have several such redundant warnings and it's becoming a problem (real warnings getting buried in a torrent of redundant suggestions).
So I have two questions:
1) Am I missing something which means this is not a redundant recommendation? In which case what do I need to do to fix this warning?
2) Alternatively, if this is just a quirk of CCCheck and cannot be fixed how can I hide or suppress this warning?
N.b. Just in case you think my example is missing something important, the full code is the SubRegion method here.
Regarding 2: The documentation is pretty good, take a look at 6.6.10 Filtering Warning Messages:
To instruct the static contract checker not to emit a particular class of warnings for a method (a type, an assembly), annotate the method (the type, the assembly) with the attribute:
[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Contracts", warningFamily)]
where warningFamily is one of: Requires, Ensures, Invariant, NonNull, ArrayCreation, ArrayLowerBound, ArrayUpperBound, DivByZero, MinValueNegation.
If necessary, the static contract checker allows filtering a single warning message (instead of an entire family) as well. To do so you can annotate a method with the attribute
[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Contracts", warningFamily-ILOffset-MethodILOffset)]
where warningFamily is as above, and ILOffset and MethodILOffset are used by the static contract checker to determine the program point the warning refers to. The offsets can be obtained from the static contract checker by providing the -outputwarnmasks switch in the "Custom Options" entry in the VS pane. Check the Build Output Window for the necessary information.
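Applied to the SubRegion method from the question, suppressing the whole Ensures family on just that one method would look something like this (a sketch based on the documentation quoted above; CheckStuff and OtherStuff are the question's own helpers):

[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Contracts", "Ensures")]
public TReturn SubRegion(TParam foo)
{
    Contract.Requires(foo != null);
    Contract.Ensures(Contract.Result<TReturn>() != null);
    if (!CheckStuff(foo))
        foo.Blah();
    return OtherStuff(foo);
}

Suppressing only a single message instead requires the warningFamily-ILOffset-MethodILOffset form, with the offsets taken from the -outputwarnmasks output as described above.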

Why doesn't C# default to null for unassigned local variables?

Say I have something like this:
public IOrder SomeMethodOnAnOrderClass()
{
    IOrder myOrder = null;
    if (SomeOtherOrder != null)
    {
        myOrder = SomeOtherOrder.MethodThatCreatesACopy();
    }
    return myOrder;
}
Why did the makers of C# require myOrder to be explicitly set to null?
Is there ever a case where you would want to leave it unassigned?
Does the setting to null have a cost associated with it? Such that you would not want to always have unassigned variables set to null? (Even if they are later set to something else.)
Or is it required to make sure you have "dotted all your i's and crossed all your t's"?
Or is there some other reason?
They do default to null or, more accurately, your objects default to the value returned by default(T), which is different for value types.
This is a feature. There are all sorts of bugs in the wild caused by programmers using uninitialized variables. Not all languages give you such well defined behavior for this sort of thing (you know who you are...).
Apparently you haven't experienced that yet. Be happy and accept that the compiler is helping you to write better code.
In Why are local variables definitely assigned in unreachable statements? (thanks, MiMo for the link) Eric Lippert says:
The reason why we want to make this illegal is not, as many people believe, because the local variable is going to be initialized to garbage and we want to protect you from garbage. We do in fact automatically initialize locals to their default values. (Though the C and C++ programming languages do not, and will cheerfully allow you to read garbage from an uninitialized local.) Rather, it is because the existence of such a code path is probably a bug, and we want to throw you in the pit of quality; you should have to work hard to write that bug.
As far as I understand this, if a local variable is not assigned a value, it does not mean that the developer actually wanted to get default(T) when reading from it. It means (in the majority of cases) that the developer probably missed it and forgot to initialize it. That is more likely a bug than a situation where a developer consciously wants to initialize a local variable to default(T) just by declaring it.
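To make the definite-assignment rule concrete, here is the question's method with the explicit initialization removed (a sketch reusing the question's IOrder and SomeOtherOrder); the compiler rejects it because myOrder is not assigned on every path:

public IOrder SomeMethodOnAnOrderClass()
{
    IOrder myOrder;                 // declared, but only assigned on one path
    if (SomeOtherOrder != null)
    {
        myOrder = SomeOtherOrder.MethodThatCreatesACopy();
    }
    return myOrder;                 // error CS0165: Use of unassigned local variable 'myOrder'
}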

Is there a warning (error) similar to C4061 for C#?

Usually, if I use switch for enums in C#, I have to write something like this:
switch (e)
{
    case E.Value1:
        //...
        break;
    case E.Value2:
        //...
        break;
    //...
    default:
        throw new NotImplementedException("...");
}
In C++ (for VS) I could enable warnings C4061 and C4062 for this switch, make them errors and have a compile-time check. In C# I have to move this check to runtime...
Does anyone know how in C# I can have this checked in compile time? Maybe there is a warning, disabled by default, which I missed, or some other way?
No, there isn't a compile-time check - it's legitimate to have a switch/case which only handles some of the named values. It would have been possible to include one, but there are some issues.
Firstly, it's entirely valid (unfortunately) for an enum value not to have any of the "named" values:
enum Foo
{
    Bar = 0,
    Baz = 1
}
...
Foo nastyValue = (Foo) 50;
Given that any value is feasible within the switch/case, the compiler can't know that you didn't mean to try to handle an unnamed value.
Secondly, it wouldn't work well with Flags enums - the compiler doesn't really know which values are meant to be convenient combinations. It could infer that, but it would be a bit icky.
Thirdly, it's not always what you want - sometimes you really do only want to respond to a few cases. I wouldn't want to have to suppress warnings on a reasonably regular basis.
You can use Enum.IsDefined to check for this up front, but that's relatively inefficient.
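A runtime guard with Enum.IsDefined might look like this sketch (reusing the question's enum E; note it doesn't help with Flags combinations either):

static void Handle(E e)
{
    // Rejects values that don't correspond to a named member of E.
    if (!Enum.IsDefined(typeof(E), e))
        throw new ArgumentOutOfRangeException(nameof(e), e, "Unnamed enum value");

    switch (e)
    {
        case E.Value1:
            //...
            break;
        case E.Value2:
            //...
            break;
    }
}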
I agree that all of this is a bit of a pain - enums are a bit of a nasty area when it comes to .NET :(
I understand that this is necroposting, but nothing has really changed in this area of the compiler. So, I made a Roslyn analyzer for switch statements.
You can download the SwitchAnalyzer package.
It is a Roslyn analyzer and it supports:
Enums, including the | and & operations on them, so you can check flags as well (but not as a single int value).
Interface implementations (pattern matching) in the current data context.
Pattern matching for classes is not implemented in version 0.4 yet (but I hope to implement it soon).
To use it, just add the package to your project; you will get warnings for all uncovered cases if you don't have a default branch or if the default branch only throws an exception. And of course, you can enable the "Treat warnings as errors" option for your project, for all or for specific warnings. Feel free to contact me if you find any bugs.

Why can’t down-casting be checked at compile time?

Why can’t compiler detect at compile-time that obj references object of type B and thus reports an error when we try to cast it to type A?
public class A { }
public class B { }

static void Main(string[] args)
{
    B b = new B();
    object obj = (object)b;
    A a = (A)obj; // exception
}
Because of the halting problem. This essentially means that you cannot decide which execution path the program will follow (and there is a mathematical proof of that). For example, the following code may or may not be correct:
object o = SomeTest() ? (new A()) : (new B());
A a = (A)o;
If the SomeTest method always returns true then it is correct. Unfortunately, it is not possible to decide that. However, there is a lot of research going on in this field. Even though it cannot always be checked, there are tools that can sometimes verify that something will always succeed, or give you an example of an execution path for which the assumption fails.
A good example of this technique is Code Contracts, which will be part of Visual Studio 2010. I believe you could use them to prove that your down-cast will be correct. However, there is no explicit support for this, although it would be useful!
Let me turn the question around: if the compiler could prove that, then why would we need casts at all? The purpose of a cast is to tell the compiler "I know more about this code than you do, and I promise you that this cast is valid. I am so sure of that fact that I am willing to let you generate code that throws an exception if I'm wrong." The compiler can't prove that the cast is valid precisely because the cast is for scenarios where the compiler can't prove that it is valid.
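To illustrate that trade-off, here is a sketch of the runtime-checked alternatives to a hard cast, reusing the question's A class (Demonstrate is just an illustrative name; the `is` pattern form requires C# 7 or later):

static void Demonstrate(object obj)
{
    A viaCast = (A)obj;      // throws InvalidCastException at runtime if obj is not an A

    if (obj is A matched)    // pattern matching: no exception, just a branch that is skipped
    {
        // use matched here
    }

    A viaAs = obj as A;      // `as` yields null instead of throwing, so the caller must check
}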
A compiler certainly could implement checks that would work in trivial cases like this. But doing so would be unlikely to help "real" code very much, since programmers rarely write such obviously wrong code.
To handle more complicated cases, a compiler would have to perform much more complicated analysis. This is harder for the compiler writer to do, and is also slower for your machine to run, and it still wouldn't be able to catch every possible bad cast. And again, because most code doesn't have easily-identifiable errors of this sort, it's not clear that the payoff would be worth the cost of writing the analysis.
Two drawbacks of more complicated static analysis are error messages and false positives. First, having a tool explain a problem in code is often an order of magnitude harder than having the tool merely check for the problem. Second, as checked-for problems turn from "bad thing X will definitely happen" to "bad thing Y might happen", it becomes much more likely that the tool will flag things that aren't ever a problem in practice.
There's an interesting essay written by a company selling static analysis tools that was spun off from academic research. One thing they discovered is that they often made fewer sales with more complicated analyses! See A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World.
You want the compiler to follow the control flow, and determine ahead of time that the cast will cause an exception? Why bother? With a real program, the control flow will be too complicated to figure this out.
Even static analysis tools wouldn't be able to solve this problem. What if your code uses reflection?
void Test(string typeName)
{
    Type t = Type.GetType(typeName);
    object obj = Activator.CreateInstance(t);
    A a = (A)obj;
    // etc.
}
Will this throw an exception? There is absolutely no possible way to know the answer without actually running it. No amount of code-path analysis will unravel a bug that depends on the value of some particular parameter. And if you have to run the code to detect the bug, then that makes it a runtime error, not compile-time.
This is exactly the reason why you need to test your code. Compilers can't ensure that your code is correct, only that it's syntactically valid and follows whatever rules are in the grammar.
And although this might seem like a contrived example, reflection is used pretty much everywhere these days, from your O/R mapper to your DI framework. It's actually quite common in a modern application not to know the type of some instance, or at least not the specific concrete type, until runtime.
Because you'd sit there for days while compilers tried every possible path through your code.
As others have mentioned, the general problem is that the compiler would have to trace back through all possible execution paths to see where that variable may have come from - and then determine if the cast is valid.
Imagine if the object was passed in to the function, which then downcast it. The compiler would have to know the run-time type of the object passed in. The calling code may not even exist at compile time, if this is a library.
In a basic example like yours, one might think it would be easy for a compiler to intelligently look for all references to a particular object and then see if it's being illegally cast. But consider this counterexample:
public class A { }
public class B { }

static void Main(string[] args)
{
    B b = new B();
    object obj = (object)b;

    // re-using the obj reference
    obj = new A();
    A a = (A)obj; // cast is now valid
}
There are so many possible permutations of ways you could re-use and cast a particular base reference that a compiler writer would need to foresee. It gets even more complicated when the obj reference is passed in a parameter to a method. Compile-time checking becomes non-deterministic, making compilations times potentially much longer and still not guaranteeing it would be able to catch all invalid casts.
