C#: Should I bother checking for null in this situation?

Let's say I have this extension method:
public static bool HasFive<T>(this IEnumerable<T> subjects)
{
if(subjects == null)
throw new ArgumentNullException("subjects");
return subjects.Count() == 5;
}
Do you think this null check and exception throwing is really necessary? I mean, when I use the Count method, an ArgumentNullException will be thrown anyways, right?
I can maybe think of one reason why I should, but would just like to hear others' views on this. And yes, my reason for asking is partly laziness (I want to write as little as possible), but also because I think a bunch of null checking and exception throwing clutters up methods, which often end up twice as long as they need to be. Someone should know better than to send null into a method :p
Anyways, what do you guys think?
Note: Count() is an extension method and will throw an ArgumentNullException, not a NullReferenceException. See Enumerable.Count<TSource> Method (IEnumerable<TSource>). Try it yourself if you don't believe me =)
Note 2: After the answers given here I have been persuaded to check for null values more often. I am still lazy though, so I have started to use the Enforce class in Lokad Shared Libraries. I can recommend taking a look at it. With it, my example becomes:
public static bool HasFive<T>(this IEnumerable<T> subjects)
{
Enforce.Argument(() => subjects);
return subjects.Count() == 5;
}

Yes, it will throw an ArgumentNullException. I can think of two reasons for putting the extra checking in:
If you later go back and change the method to do something before calling subjects.Count() and forget to put the check in at that point, you could end up with a side effect before the exception is thrown, which isn't nice.
Currently, the stack trace will show subjects.Count() at the top, and probably with a message with the source parameter name. This could be confusing to the caller of HasFive who can see a subjects parameter name.
EDIT: Just to save me having to write it yet again elsewhere:
The call to subjects.Count() will throw an ArgumentNullException, not a NullReferenceException. Count() is another extension method here, and assuming the implementation in System.Linq.Enumerable is being used, that's documented (correctly) to throw an ArgumentNullException. Try it if you don't believe me.
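To make both reasons concrete, here is a minimal sketch. The names HasFiveUnchecked, HasFiveChecked and CallCount are invented for illustration; the only assumption about the framework is that Count() is the System.Linq.Enumerable implementation, which reports its own parameter name (source) rather than the caller's.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HasFiveDemo
{
    // Hypothetical side effect that runs before the sequence is touched.
    public static int CallCount;

    // No explicit check: the exception comes from Enumerable.Count().
    public static bool HasFiveUnchecked<T>(this IEnumerable<T> subjects)
    {
        CallCount++;                       // has already happened when Count() throws
        return subjects.Count() == 5;
    }

    // Explicit check: fails before any side effect and names our own parameter.
    public static bool HasFiveChecked<T>(this IEnumerable<T> subjects)
    {
        if (subjects == null)
            throw new ArgumentNullException("subjects");
        CallCount++;
        return subjects.Count() == 5;
    }
}

public static class Program
{
    public static void Main()
    {
        IEnumerable<int> nothing = null;

        try { nothing.HasFiveUnchecked(); }
        catch (ArgumentNullException e) { Console.WriteLine(e.ParamName); }

        try { nothing.HasFiveChecked(); }
        catch (ArgumentNullException e) { Console.WriteLine(e.ParamName); }

        Console.WriteLine(HasFiveDemo.CallCount);
    }
}
```

The first catch reports source, the second reports subjects, and the counter ends at 1 because only the unchecked version ran its side effect before failing.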
EDIT: Making this easier...
If you do a lot of checks like this you may want to make it simpler to do so. I like the following extension method:
internal static void ThrowIfNull<T>(this T argument, string name)
where T : class
{
if (argument == null)
{
throw new ArgumentNullException(name);
}
}
The example method in the question can then become:
public static bool HasFive<T>(this IEnumerable<T> subjects)
{
subjects.ThrowIfNull("subjects");
return subjects.Count() == 5;
}
Another alternative would be to write a version which checked the value and returned it like this:
internal static T NullGuard<T>(this T argument, string name)
where T : class
{
if (argument == null)
{
throw new ArgumentNullException(name);
}
return argument;
}
You can then call it fluently:
public static bool HasFive<T>(this IEnumerable<T> subjects)
{
return subjects.NullGuard("subjects").Count() == 5;
}
This is also helpful for copying parameters in constructors etc:
public Person(string name, int age)
{
this.name = name.NullGuard("name");
this.age = age;
}
(You might want an overload without the argument name for places where it's not important.)
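On newer toolchains the string literal can be avoided altogether. The sketch below assumes C# 6 or later for nameof and, for the last method, .NET 6 or later with its default C# 10 compiler, where ArgumentNullException.ThrowIfNull is built in and picks up the argument name automatically. Neither was available when this answer was written, and the names GuardDemo and HasFiveNet6 are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GuardDemo
{
    // Same helper as above, repeated so the example is self-contained.
    public static T NullGuard<T>(this T argument, string name) where T : class
    {
        if (argument == null)
        {
            throw new ArgumentNullException(name);
        }
        return argument;
    }

    // nameof keeps the reported name in sync if the parameter is renamed.
    public static bool HasFive<T>(this IEnumerable<T> subjects)
    {
        return subjects.NullGuard(nameof(subjects)).Count() == 5;
    }

    // Assumes .NET 6+: the framework helper captures "subjects" for us.
    public static bool HasFiveNet6<T>(this IEnumerable<T> subjects)
    {
        ArgumentNullException.ThrowIfNull(subjects);
        return subjects.Count() == 5;
    }
}

public static class Program
{
    public static void Main()
    {
        IEnumerable<string> nothing = null;
        try { nothing.HasFive(); }
        catch (ArgumentNullException e) { Console.WriteLine(e.ParamName); }
        try { nothing.HasFiveNet6(); }
        catch (ArgumentNullException e) { Console.WriteLine(e.ParamName); }
    }
}
```

Both calls report subjects as the parameter name without a hand-typed string.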

I think @Jon Skeet is absolutely spot on; however, I'd like to add the following thoughts:
Providing a meaningful error message is useful for debugging, logging and exception reporting. An exception thrown by the BCL is less likely to describe the specific circumstances of the exception WRT your codebase. Perhaps this is less of an issue with null checks which (most of the time) necessarily can't give you much domain-specific information - 'I was passed a null unexpectedly, no idea why' is pretty much the best you can do most of the time, however sometimes you can provide more information and obviously this is more likely to be relevant when dealing with other exception types.
The null check also serves as documentation: it clearly demonstrates to other developers, and to you when you come back to the code a year later, that someone might pass a null and that it would be problematic if they did.
Expanding on Jon's excellent point (you might do something before the null gets picked up), I think it is vitally important to engage in defensive programming. Checking for null before running other code is a form of defensive programming: you are taking into account that things might not work the way you expected (or that changes you didn't expect might be made in the future) and ensuring that, no matter what happens (assuming your null check isn't removed), such problems cannot arise.
It's a form of runtime assert that your parameter is not null. You can proceed on the assumption that it isn't.
The above assumption can result in slimmer code: you write the rest of your code knowing the parameter is not null, cutting down on extraneous subsequent null checks.

In my opinion you should check for the null value. Two things come to mind.
It makes explicit the possible errors that can happen during runtime.
It also gives you a chance to throw a better exception instead of a generic ArgumentNullException, making the reason for the exception more explicit.

The exception that you will get thrown will be an Object reference not set to an instance of an object.
Not the most useful of exceptions when tracking down the problem.
The way you have it there will give you much more useful information by specifically stating that it's your subjects reference that is null.

I think it is a good practice to do precondition checks at the top of the function. Maybe it's just my code that is full of bugs, but this practice has caught a lot of errors for me.
Also, it's much easier to figure out the source of the problem if you got an ArgumentNullException with the name of the parameter, thrown from the most relevant stack frame. Also, the code in the body of your function can change over time so I wouldn't depend on it catching precondition problems in the future.

It always depends on the context (in my opinion).
For instance, when writing a library (for others to use), it certainly makes sense to fully check each and every parameter and throw the appropriate exceptions.
When writing methods that are used inside a project, I usually skip those checks, attempting to reduce the size of the codebase. But even in this case, there might be a level (between application layers) where you still place such checks. It depends on the context, on the size of the project, on the size of the team working on it...
It certainly doesn't make sense doing it for small projects built by one person :)

It depends on the concrete method. In this case, I think the exception is not necessary, and it would be better if the extension method could deal with null.
public static bool HasFive<T>(this IEnumerable<T> subjects) {
if ( object.ReferenceEquals( subjects, null ) ) { return false; }
return subjects.Count() == 5;
}
If you call "items.HasFive()" and "items" is null, then it is true that items does not have five items.
But if you have an extension method like this:
public static T GetFifth<T>(this IEnumerable<T> subjects) {
...
}
An exception for "subjects == null" should be thrown, because there is no valid way to deal with it.

If you look at the source of the Enumerable class (System.Core.dll), where a lot of the default extension methods for IEnumerable<T> are defined, you can see that they all check their arguments for null references.
public static IEnumerable<TSource> Skip<TSource>(this IEnumerable<TSource> source, int count)
{
if (source == null)
{
throw Error.ArgumentNull("source");
}
return SkipIterator<TSource>(source, count);
}
It's a bit of an obvious point, but I tend to follow what I find in the base framework library source as you know that is more than likely to be best practices.
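The split into a checking wrapper and a separate SkipIterator is not accidental. In a method that contains yield return, no code runs until the result is enumerated, so a null check written inside the iterator itself would fire far away from the call that passed the null. A minimal sketch of the difference, using hypothetical names (LazyEvens, EagerEvens, EvensIterator) that are not part of the framework:

```csharp
using System;
using System.Collections.Generic;

public static class IteratorCheckDemo
{
    // The check lives inside the iterator: it does not run until enumeration starts.
    public static IEnumerable<int> LazyEvens(this IEnumerable<int> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        foreach (var n in source)
            if (n % 2 == 0) yield return n;
    }

    // The check runs immediately; iteration is delegated to a private iterator,
    // mirroring the Skip / SkipIterator pair shown above.
    public static IEnumerable<int> EagerEvens(this IEnumerable<int> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        return EvensIterator(source);
    }

    private static IEnumerable<int> EvensIterator(IEnumerable<int> source)
    {
        foreach (var n in source)
            if (n % 2 == 0) yield return n;
    }
}

public static class Program
{
    public static void Main()
    {
        IEnumerable<int> nothing = null;

        var lazy = nothing.LazyEvens();     // no exception here
        try { nothing.EagerEvens(); }
        catch (ArgumentNullException) { Console.WriteLine("eager: threw at the call"); }

        try { foreach (var n in lazy) { } }
        catch (ArgumentNullException) { Console.WriteLine("lazy: threw only on enumeration"); }
    }
}
```

If you write your own iterator-based extension methods, the wrapper form is the one that makes the argument check behave like the checks in Enumerable.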

Yes, for two reasons:
Firstly, the other extension methods on IEnumerable do and consumers of your code can expect yours to do so as well, but secondly and more importantly, if you have a long chain of operators in your query then knowing which one threw the exception is useful information.

In my opinion one should check for known conditions that will raise errors later on (at least for public methods). That way it's easier to detect the root of the problem.
I would raise a more informational exception like:
if (subjects == null)
{
throw new ArgumentNullException("subjects", "subjects is null.");
}

Related

When Implementing IEqualityComparer Should GetHashCode check for null?

When implementing IEqualityComparer<Product> (Product is a class), ReSharper complains that the null check below is always false:
public int GetHashCode(Product product)
{
// Check whether the object is null.
if (Object.ReferenceEquals(product, null))
return 0;
// ... other stuff ...
}
(Code example from MSDN VS.9 documentation of Enumerable.Except)
ReSharper may be wrong, but when searching for an answer, I came across the official documentation for IEqualityComparer<T> which has an example where null is not checked for:
public int GetHashCode(Box bx)
{
int hCode = bx.Height ^ bx.Length ^ bx.Width;
return hCode.GetHashCode();
}
Additionally, the documentation for GetHashCode() states that ArgumentNullException will be thrown when "The type of obj is a reference type and obj is null."
So, when implementing IEqualityComparer should GetHashCode check for null, and if so, what should it do with null (throw an exception or return a value)?
I'm interested most in .NET framework official documentation that specifies one way or another if null should be checked.
ReSharper is wrong.
Obviously code you write can call that particular GetHashCode method and pass in a null value. All known methods might ensure this will never happen, but obviously ReSharper can only take existing code (patterns) into account.
So in this case, check for null and do the "right thing".
Corollary: If the method in question was private, then ReSharper might analyze (though I'm not sure it does) the public code and verify that there is indeed no way that this particular private method will be called with a null reference, but since it is a public method, and one available through an interface, then ReSharper is wrong.
The documentation says that null values should never be hashable, and that attempting to do so should always result in an exception.
Of course, you're free to do whatever you want. If you want to create a hash based structure for which null keys are valid, you're free to do so, in this case you should simply ignore this warning.
ReSharper has some special case code here. It will not warn about the ReferenceEquals in this:
if (ReferenceEquals(obj, null)) { throw new ArgumentNullException("obj"); }
It will warn about the ReferenceEquals in this:
if (ReferenceEquals(obj, null)) { return 0; }
Throwing an ArgumentNullException exception is consistent with the contract specified in IEqualityComparer(Of T).GetHashCode
If you go to the definition of IEqualityComparer (F12) you'll also find further documentation:
// Exceptions:
// System.ArgumentNullException:
// The type of obj is a reference type and obj is null.
int GetHashCode(T obj);
So ReSharper is right that there is something wrong, but the error displayed doesn't match the change you should make to the code.
There is some nuance to this question.
The docs state that IEqualityComparer<T>.GetHashCode(T) throws on null input; however EqualityComparer<>.Default - which is almost certainly by far the most used implementation - does not throw.
Clearly, an implementation does not need to throw on null; it merely has the option to.
However, I'd argue that no implementation should ever throw on null here, it's just confusing, and a possible source of bugs. Exceptions are a pain in any case, being a non-local control flow mechanism, and that alone argues for using them when necessary only (i.e.: not here). But additionally, for IEqualityComparer specifically, the docs state that whenever Equals(x, y) then GetHashCode(x) should equal GetHashCode(y) - and Equals does allow nulls, and is not documented as throwing any exceptions.
The invariant that equality implies hashcode equality makes implementing things relying on those hashcodes much simpler. Having a gotcha with the null value is a design cost you should avoid paying without need. And here there is no need, ever.
In short:
do not throw from GetHashCode, even though it is allowed
and do check for nulls; ReSharper's warning is incorrect.
Doing this results in simpler code with fewer gotchas, and it follows the behavior of EqualityComparer<>.Default which is the most common implementation used.
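A sketch of a comparer that follows this advice is shown below. The Product class and its Name property are assumptions made for the example (the question does not show the real type), and ProductNameComparer is a hypothetical name. The point is only that Equals and GetHashCode agree on null: two nulls are equal, and null hashes to a fixed value.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Product class from the question.
public class Product
{
    public string Name { get; set; }
}

public class ProductNameComparer : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        if (ReferenceEquals(x, y)) return true;                      // same instance, or both null
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return string.Equals(x.Name, y.Name, StringComparison.Ordinal);
    }

    public int GetHashCode(Product product)
    {
        // null is a legal input to Equals, so give it a stable hash instead of throwing.
        if (ReferenceEquals(product, null)) return 0;
        return product.Name == null ? 0 : StringComparer.Ordinal.GetHashCode(product.Name);
    }
}

public static class Program
{
    public static void Main()
    {
        IEqualityComparer<Product> comparer = new ProductNameComparer();
        var a = new Product { Name = "apple" };
        var b = new Product { Name = "apple" };

        Console.WriteLine(comparer.Equals(a, b));
        Console.WriteLine(comparer.GetHashCode(a) == comparer.GetHashCode(b));
        Console.WriteLine(comparer.Equals(null, null));
        Console.WriteLine(comparer.GetHashCode(null));
    }
}
```

With this shape, "Equals(x, y) implies GetHashCode(x) == GetHashCode(y)" holds for every pair of inputs, including null.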

IEnumerable Extensions with No-Throw Guarantee

Personally, I'm a fan of the fluent interface syntax of the IEnumerable/List extension methods in C#, as a client. That is, I prefer syntax like this:
public void AddTheseGuysToSomeLocal(IEnumerable<int> values)
{
values.ToList().ForEach(v => _someLocal += v);
}
as opposed to a control structure like a foreach loop. I find that easier to process mentally, at a glance. Problem with this is that my code here is going to generate null reference exceptions if clients pass me a null argument.
Let's assume that I don't want to consider a null enumeration to be exceptional -- I want that to result in leaving some local as-is -- I would add a null guard. But, that's a little syntactically noisy for my taste, and that noise adds up in cases where you're chaining these together in a fluent interface and the interim results can be null as well.
So, I created a class called SafeEnumerableExtensions that offers a no-throw guarantee by treating nulls as empty enumerables (lists). Example methods include:
//null.ToList() returns empty list
public static List<T> SafeToList<T>(this IEnumerable<T> source)
{
return (source ?? new List<T>()).ToList();
}
//x.SafeForEach(y) is a no-op for null x or null y
//This is a shortcut that should probably go in a class called SafeListExtensions later
public static void SafeForEach<T>(this List<T> source, Action<T> action)
{
var myAction = action ?? new Action<T>(t => { });
var mySource = source ?? new List<T>();
mySource.ForEach(myAction);
}
public static void SafeForEach<T>(this IEnumerable<T> source, Action<T> action)
{
SafeToList(source).SafeForEach(action);
}
Now, my original method is prettier than if there were a null guard, but just as safe, since null results in a no-op:
public void AddTheseGuysToSomeLocal(IEnumerable<int> values)
{
values.SafeForEach(v => _someLocal += v);
}
So, my question is twofold. (1) I'm assuming that I'm not so original as to be the first person ever to have thought of this -- does anyone know if there is an existing library that does this or something similar? And (2) has anyone used said library or implemented a scheme like this and experienced unpleasant consequences or else can anyone foresee unpleasant consequences for doing something like this? Is this even a good idea?
(I did find this question when checking for duplicates, but I don't want to explicitly do this check in clients - I want the extension class to do this implicitly and not bother clients with that extra method call)
"And (2) has anyone used said library or implemented a scheme like this and experienced unpleasant consequences or else can anyone foresee unpleasant consequences for doing something like this? Is this even a good idea?"
Personally I would consider this a bad idea. In most cases passing a null enumeration or null Func is probably not intended.
You are "fixing" the problem which might lead to seemingly unrelated problems later down the road. Instead I would rather throw an exception in this case so that you find this problem in your code early on ("Fail fast").

What to name a variant of a Get() method?

I am developing an API for a repository-like abstraction. I have two methods:
// Throws an exception if object cannot be found
MyObj Get(MyIdType id);
// Returns false if object cannot be found; no exception
bool TryGet(MyIdType id, out MyObj obj);
There is a requirement for a third variant: one that returns null if object cannot be found, and does not throw an exception.
// Returns null if object cannot be found; no exception
MyObj ?????(MyIdType id);
I'm stuck on what to name it. GetOrDefault has been ruled out as confusing. GetIfNotNull has been suggested, but also seems unclear. GetOrNull is the most promising so far.
Does anyone have any other suggestions, or know of any public APIs whose conventions I can follow?
I would opt to not have a Get method that behaves differently in two situations. Why not have Get return null in all cases? Why throw an exception at all?
I would opt to leave it up to user code to throw an exception if a null value is returned, if required.
See this question for further guidance related to when to throw exceptions.
I'd go with GetOrDefault (as you suggested yourself) based on the LINQ extension method FirstOrDefault.
Maybe GetValue and GetValueOrDefault would sound better though.
How about: GetOrDefault
The ...OrDefault is fairly standard in LINQ.
You could try GetObjectOrReturnDefaultValue or, since you know it's a reference type GetObjectOrReturnNull. It's long and ugly, but it's not ambiguous.
I'd keep only bool TryGetXXXXX(out T value) variant on your interface and provide the rest as extension methods to it. It makes your interface itself very compact, but at the same time as useful as client wants.
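That suggestion might look something like the sketch below. Everything here is illustrative: IRepository, RepositoryExtensions and DictionaryRepository are invented names, the generic parameters stand in for MyIdType and MyObj, and KeyNotFoundException is just one plausible choice for the throwing variant.

```csharp
using System;
using System.Collections.Generic;

// The interface exposes only the Try pattern.
public interface IRepository<TId, TObj> where TObj : class
{
    bool TryGet(TId id, out TObj obj);
}

// The other variants are layered on top as extension methods.
public static class RepositoryExtensions
{
    // Throws if the object cannot be found.
    public static TObj Get<TId, TObj>(this IRepository<TId, TObj> repository, TId id)
        where TObj : class
    {
        if (repository == null) throw new ArgumentNullException("repository");
        TObj obj;
        if (!repository.TryGet(id, out obj))
            throw new KeyNotFoundException("No object found for id " + id);
        return obj;
    }

    // Returns null if the object cannot be found; no exception.
    public static TObj GetOrNull<TId, TObj>(this IRepository<TId, TObj> repository, TId id)
        where TObj : class
    {
        if (repository == null) throw new ArgumentNullException("repository");
        TObj obj;
        return repository.TryGet(id, out obj) ? obj : null;
    }
}

// Tiny in-memory implementation, only here to make the sketch runnable.
public class DictionaryRepository : IRepository<int, string>
{
    private readonly Dictionary<int, string> items = new Dictionary<int, string>();

    public void Add(int id, string value) { items[id] = value; }

    public bool TryGet(int id, out string obj) { return items.TryGetValue(id, out obj); }
}

public static class Program
{
    public static void Main()
    {
        var repository = new DictionaryRepository();
        repository.Add(1, "first");

        Console.WriteLine(repository.Get(1));
        Console.WriteLine(repository.GetOrNull(2) == null);
    }
}
```

Implementers only have to write TryGet, while callers still get all three call styles.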
In my opinion, you should stick with:
MyObj Get(MyIdType id);
Instead of throwing an exception here, simply return null. If there is a definite requirement to throw an exception or optionally, null, I would try:
MyObj Get (MyIdType id, bool ReturnDefault = false) // if .net 4
I don't particularly like this option - but sometimes requirements will override what we think feels right or natural.

Is there a way to mark a method as ensuring that T is not null?

For example, if I have a method defined as...
T Create()
{
T t = Factory.Create<T>();
// ...
Assert.IsNotNull(t, "Some message.");
// -or-
if (t == null) throw new Exception("...");
// -or- anything that verifies that it is not null
}
...and I am calling that method from somewhere else...
void SomewhereElse()
{
T t = Create();
// >><<
}
...at >><<, I know (meaning me, the person who wrote this) that t is guaranteed to not be null. Is there a way (an attribute, perhaps, that I have not found) to mark a method as ensuring that a reference type that it returns or otherwise passes out (perhaps an out parameter) is guaranteed by internal logic to not be null?
I have to sheepishly admit that ReSharper is mostly why I care as it highlights anything it thinks could cause either InvalidOperationException or NullReferenceException. I figure either it's reading something that I can mark on my methods or it just knows that Assert.IsNotNull, simple boolean checks or a few other things will remove the chance of something being null and that it can remove the highlight.
Any thoughts? Am I just falling victim to oh-my-god-resharper-highlights-it-I-have-to-fix-it disease?
If ReSharper is why you care then you can mark the Factory.Create<T>() method with their [NotNull] attribute described in their web help
Not sure how R# handles this, but the Contract.Assert method may be what you're looking for
You could put a constraint on T to only allow struct.
You could use a language extension that allows you to make stronger definitions of pre/post conditions for your function (contract based programming), like SpecSharp, or Code Contracts. Code Contracts seems to leverage built-in systems from C# 4.0. I have no experience with either - only heard of them.
Could you cast T to an object then check if it's null?
var o = (object)Factory.Create<T>();
if(o == null) throw new Exception();

Not all code paths return, but compiler treats it as if all paths return

I can't think of a good title, but my question is not as naive as it appears.
Consider this:
public static void ExitApp(string message)
{
// Do stuff
throw new Exception(...);
}
OR
public static void ExitApp(string message)
{
// Do stuff
System.Environment.Exit(-1);
}
Neither of these methods will ever return. But when you invoke these methods elsewhere:
public int DoStuff()
{
// Do stuff
if (foo == 0)
{
throw new Exception(...);
}
else if (foo == 1)
{
// Do other stuff
return ...;
}
else
{
ExitApp("Something borked");
}
}
Try to compile that and you will get a "not all code paths return a value" in DoStuff. It seems silly to trail the call to ExitApp with an Exception just to satisfy the compiler even though I know that it's good. There seems to be nothing in ExitApp() that I can do to indicate it will never return.
How can I indicate to the compiler that ExitApp never returns and, thus, that DoStuff's else block will never return either? It seems like a rather simple bug that its path checking fails to account for.
Even if I only use the first ExitApp (throws the exception) and that method returns an int the path checker is smart enough to realize that it will never return so it doesn't complain about the int type. This compiles fine:
public static int ExitApp(string message)
{
// Do stuff
throw new Exception(...);
}
However, given that it knows this ExitApp will never return an int it does not extrapolate that to DoStuff() so I'm inclined to believe there is no solution to my question. My only choice is to throw an exception after calling ExitApp.
public int DoStuff()
{
...
else
{
ExitApp("Something borked");
throw new NotImplementedException("Should not reach this");
}
}
Is there a reason for this behavior by the compiler?
I have an exception defined for this purpose: UnreachableException. It might seem superfluous, but it's an easy way to say "Hey, person reading this, this line should never be executed!". I usually use it for the default case of some switch statements, but it applies here as well.
Just throw one after the ExitApp line.
public int DoStuff()
{
// Do stuff
if (foo == 0)
{
throw new Exception(...);
}
else if (foo == 1)
{
// Do other stuff
return ...;
}
else
{
ExitApp("Something borked");
throw new UnreachableException();
}
}
The actual reason the language doesn't support declaring a method which always throws is just: it's not worth it. The language developers don't have unlimited time to apply every feature we can think of. They have to prioritize.
I'm betting this is the first time you've run into this situation, and look: explicitly throwing an exception deals with the issue. Why would they bother dealing with such a rare, easy to bypass case? They could be spending that time implementing optional parameters, dynamic, or a bunch of other things that will be more useful and used more often than being able to say a function always throws an exception.
That's not to say it will never be implemented. This type of method information is exactly the type of thing contracts are great at specifying. So maybe it will be included with code contracts.
The C# compiler doesn't support exception reporting, like the Java compiler does. Because of this, the compiler doesn't know (outside of the context of the method itself) that the method is guaranteed to throw an exception on every invocation.
I'm not going to repeat what's already been said here before, but if you're looking for a way such that the compiler will not bother you in regards to this error, you could do the following:
public int DoStuff()
{
var result = 0; //put default result here
// Do stuff
if (foo == 0)
{
throw new Exception(...);
}
else if (foo == 1)
{
// Do other stuff
result = ...;
}
else
{
ExitApp("Something borked");
}
return result;
}
The fact of the matter is that the compiler just isn't smart enough to understand in your case that all paths return values, so it's better to return the value outside of your conditional structure.
For the second case, System.Environment.Exit is part of the framework, not the C# language. The C# compiler doesn't "know" it's a non-returning function.
This is unfortunate. Visual C++ supports __declspec(noreturn), but I don't know of any similar construct in C#. I typically put a comment saying "Unreachable code" and an assert in these cases, and put in a return or a throw to make the compiler happy.
Yes a very good reason.
It means that you can make changes to ExitApp's implementation without suddenly having compiler errors turning up all over your application.
The compiler has no consistent way to infer what your ExitApp() method will do. Although it could analyse the source code and "guess" that it will never return there could just as easily be cases where the source isn't available and it wouldn't know. The only reasonable and consistent approach is for it not to analyse your code.
Because of this you've either got to add an exception or dummy return value.
C# chose not to implement checked exceptions, so although the compiler could do the static analysis and figure out that ExitApp or MethodThrows never returns, the team spent its energy elsewhere.
The compiler has no way of knowing that the behavior of System.Environment.Exit() causes it to not return.
It is complaining because it assumes the function will return and execution will continue.
Just add a simple return -1; statement after the Exit() call.
The simple answer is: the compiler is not that clever. It makes the basic assumption that any method call within a certain method (DoStuff in your case) will eventually complete in the general case (although it may sometimes throw an exception). That is a very fair assumption in general, though clearly not in every case, as you point out. That said, I don't see it as a problem that you have to add a line of code after the call to ExitApp. Either of the following would do the job fine and should never get called:
return 0;
// More than necessary, but potentially useful for spotting an undesirable case when your called method *does* return (undesirably in this situation).
throw new Exception("This should never happen because the previous call should never return");
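One further pattern, not mentioned in the answers above, avoids the dummy statement for the exception-throwing variant: let the helper do its work and return the exception, and have the caller write the throw. The compiler's reachability analysis does understand a throw statement, so no path is left without a return. This is only a sketch; Fatal is a hypothetical replacement for ExitApp, and foo is turned into a parameter so the example is self-contained.

```csharp
using System;

public static class ExitDemo
{
    // Does the shared "do stuff" work, then hands the exception back instead of throwing it.
    public static Exception Fatal(string message)
    {
        Console.Error.WriteLine("Fatal: " + message);    // e.g. logging or cleanup
        return new InvalidOperationException(message);
    }

    public static int DoStuff(int foo)
    {
        if (foo == 0)
        {
            throw new ArgumentException("foo must not be 0", "foo");
        }
        else if (foo == 1)
        {
            return 42;
        }
        else
        {
            // The throw is visible here, so the compiler accepts that this branch never returns.
            throw Fatal("Something borked");
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(ExitDemo.DoStuff(1));
        try { ExitDemo.DoStuff(2); }
        catch (InvalidOperationException e) { Console.WriteLine(e.Message); }
    }
}
```

This does not help with the Environment.Exit variant, which still needs a trailing return or throw as described above.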
