I frequently need to check whether an IEnumerable<T> is null before iterating over it with foreach or LINQ queries, so I often end up with code like this:
var myProjection = (myList ?? Enumerable.Empty<T>()).Select(x => x.Foo)...
Hence, I thought of adding this extension method to an Extensions class:
public static class MyExtensions
{
public static IEnumerable<T> AsEmptyIfNull<T>(this IEnumerable<T> source)
{
return source ?? Enumerable.Empty<T>();
}
}
One small issue comes to mind immediately when looking at this code: given the "instance-method look" of extension methods, maybe it should be implemented as a plain static method, because otherwise something like this is perfectly legal:
IEnumerable<int> list = null;
list.AsEmptyIfNull();
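To illustrate why that call is legal, here is a minimal sketch. It reuses the AsEmptyIfNull method from above; NullReceiverDemo and CountOfNull are names made up for the example. The compiler turns the call into an ordinary static method call, so nothing is dereferenced on the null variable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MyExtensions
{
    // Same method as in the question: coalesce a null sequence to an empty one.
    public static IEnumerable<T> AsEmptyIfNull<T>(this IEnumerable<T> source)
    {
        return source ?? Enumerable.Empty<T>();
    }
}

// Hypothetical demo class, not part of the original code.
public static class NullReceiverDemo
{
    public static int CountOfNull()
    {
        IEnumerable<int> list = null;

        // Compiles to MyExtensions.AsEmptyIfNull(list), a plain static call,
        // so no NullReferenceException is thrown on the null variable.
        return list.AsEmptyIfNull().Count();
    }
}
```

So the "instance" syntax is only syntax; whether that is surprising to readers of the calling code is the real question.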
Do you see any other drawbacks in using it?
Could such an extension lead to some kind of bad habit among developers if it is used heavily?
Bonus question:
Can you suggest a better name for it? :)
(English is not my first language, so I'm not very good at naming...)
Thanks in advance.
Methods that return an IEnumerable<T> should return an empty one, instead of null. So you wouldn't need this.
See this question: Is it better to return null or empty collection?
Otherwise, your code seems ok.
It's generally a bad idea to return a null instead of an empty sequence if you can control it. This is self-explanatory if you consider that when someone is asked to produce a collection, returning null is not like saying "the collection is empty" but "there is no such collection at all".
If you own the methods returning the enumerables, then returning an empty IEnumerable (which can even be a special purpose readonly static object if it might be returned a lot) is the way to go, period.
If you are forced to use a bad-mannered library that has the habit of returning null in such cases, then this extension method might be a solution, but again I wouldn't prefer it. It's probably better to wrap the bad-mannered methods in your own versions that do the coalescing where people won't see it. This way you get both the convenience of always having an enumerable instead of null and the correctness of not supporting the "return null" paradigm.
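A rough sketch of that wrapping idea follows. LegacyCatalog stands in for the bad-mannered library and Catalog for your own wrapper; both are hypothetical names invented for this example, not real APIs.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical third-party class that returns null instead of an empty sequence.
public class LegacyCatalog
{
    public IEnumerable<string> FindTitles(string author)
    {
        return author == "known" ? new[] { "Title A" } : null;
    }
}

// Hypothetical wrapper: the only place in the codebase that ever sees the null.
public class Catalog
{
    private readonly LegacyCatalog _inner = new LegacyCatalog();

    public IEnumerable<string> FindTitles(string author)
    {
        // Coalesce once, here, so callers always get something they can enumerate.
        return _inner.FindTitles(author) ?? Enumerable.Empty<string>();
    }
}
```

Callers then write catalog.FindTitles(author).Select(...) without any null handling of their own.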
Which is better for checking whether a list is null or empty?
var newList;
if(newList!= null)
or newList.Any()
In the code above, sometimes I check for not null and sometimes I use Any(). I don't know which one is best practice, and why.
Any advice?
Thanks in advance
These are not the same.
Any will throw an exception if used on a null reference.
With lists, you can think of .Any() as .Count() != 0 (*)
You may have to check for both, and you have to do the null check before calling Any() on your IEnumerable.
One way is to check for both in one go with the null-conditional operator ?., as in Thierry V's answer.
But sometimes you want to throw a custom Exception if you have a null value that you are not supposed to have, and treat an empty list as a correct input, so it all depends on the context.
Just remember that these are different.
(*) : as noted in a comment, .Any() is not actually implemented as Count() != 0. For lists, it's functionally equivalent, but it's best practice to use Any() to test whether an IEnumerable is empty or not, because Count() might need to go through all the elements.
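The difference in work can be made visible with a small experiment. This is only a sketch: AnyVersusCount, Numbers and the Produced counter are invented for the illustration. The sequence is a lazy iterator rather than a list, so Count() has no stored length to fall back on.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class.
public static class AnyVersusCount
{
    // Number of elements the iterator below has produced so far.
    public static int Produced;

    // A lazy sequence of 1000 numbers; it is not a collection, so Count() has to enumerate it.
    public static IEnumerable<int> Numbers()
    {
        for (int i = 0; i < 1000; i++)
        {
            Produced++;
            yield return i;
        }
    }

    public static int ProducedByAny()
    {
        Produced = 0;
        Numbers().Any();      // stops as soon as the first element arrives
        return Produced;
    }

    public static int ProducedByCount()
    {
        Produced = 0;
        Numbers().Count();    // walks the whole sequence
        return Produced;
    }
}
```

ProducedByAny() reports a single element produced, while ProducedByCount() reports all 1000.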
A null check and Any() serve different purposes.
Any() checks whether your list contains at least one item.
Before calling Any(), your list must not be null; if it is, Any() throws an ArgumentNullException.
Think about newList?.Any()
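One detail to keep in mind: newList?.Any() evaluates to a bool? rather than a bool, so it cannot go into an if directly. A minimal sketch of how it might be used (NullOrEmptyCheck and HasItems are made-up names):

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class.
public static class NullOrEmptyCheck
{
    public static bool HasItems(List<int> newList)
    {
        // newList?.Any() is null when newList is null, otherwise the result of Any().
        // Comparing with true maps both "null" and "empty" to false.
        return newList?.Any() == true;
    }
}
```

This needs a compiler that supports C# 6 or later.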
As the other answers say, != null and Any() are different.
I would write an extension method that does what you expect.
public static class ExtenstionArray
{
public static bool CheckAny<T>(this IEnumerable<T> list) {
return list != null && list.Any();
}
}
Then the check becomes easier to write:
if(newList.CheckAny())
Personally, I'm a fan of the fluent interface syntax of the IEnumerable/List extension methods in C#, as a client. That is, I prefer syntax like this:
public void AddTheseGuysToSomeLocal(IEnumerable<int> values)
{
values.ToList().ForEach(v => _someLocal += v);
}
as opposed to a control structure like a foreach loop. I find that easier to process mentally, at a glance. Problem with this is that my code here is going to generate null reference exceptions if clients pass me a null argument.
Let's assume that I don't want to consider a null enumeration to be exceptional -- I want that to result in leaving some local as-is -- I would add a null guard. But, that's a little syntactically noisy for my taste, and that noise adds up in cases where you're chaining these together in a fluent interface and the interim results can be null as well.
So, I created a class called SafeEnumerableExtensions that offers a no-throw guarantee by treating nulls as empty enumerables (lists). Example methods include:
//null.SafeToList() returns empty list
public static List<T> SafeToList<T>(this IEnumerable<T> source)
{
return (source ?? new List<T>()).ToList();
}
//x.SafeForEach(y) is a no-op for null x or null y
//This is a shortcut that should probably go in a class called SafeListExtensions later
public static void SafeForEach<T>(this List<T> source, Action<T> action)
{
var myAction = action ?? new Action<T>(t => { });
var mySource = source ?? new List<T>();
mySource.ForEach(myAction);
}
public static void SafeForEach<T>(this IEnumerable<T> source, Action<T> action)
{
SafeToList(source).SafeForEach(action);
}
Now, my original method is prettier than if there were a null guard, but just as safe, since nulls result in a no-op:
public void AddTheseGuysToSomeLocal(IEnumerable<int> values)
{
values.SafeForEach(v => _someLocal += v);
}
So, my question is twofold. (1) I'm assuming that I'm not so original as to be the first person ever to have thought of this -- does anyone know if there is an existing library that does this or something similar? And (2) has anyone used said library or implemented a scheme like this and experienced unpleasant consequences or else can anyone foresee unpleasant consequences for doing something like this? Is this even a good idea?
(I did find this question when checking for duplicates, but I don't want to explicitly do this check in clients - I want the extension class to do this implicitly and not bother clients with that extra method call)
"And (2) has anyone used said library or implemented a scheme like this and experienced unpleasant consequences or else can anyone foresee unpleasant consequences for doing something like this? Is this even a good idea?"
Personally I would consider this a bad idea. In most cases passing a null enumeration or null Func is probably not intended.
You are "fixing" the problem which might lead to seemingly unrelated problems later down the road. Instead I would rather throw an exception in this case so that you find this problem in your code early on ("Fail fast").
When I have a method that returns a collection of objects, what should I return if the object count is zero: null or just an empty List<T>? What is good practice?
public List<string> GetPupilsByClass(string className)
{
....
}
I'd definitely return an empty list so methods can still be called on the object without requiring null checks. There's a difference between returning an empty list and returning nothing at all, so the calling code probably isn't expecting to receive a null reference anyway (unless an exception occurs or something).
An empty list is what I'd expect as a caller. Null would indicate to me that the "conceptual list" is undefined, like null in a database.
Also, by always returning empty collections rather than null, clients like these will never fail:
foreach(var element in obj.Method()) ...
It depends on a number of factors, but an empty list would be a more typical return value, as otherwise the caller must know to perform null checking. The main time I'd return a null is if it was a method of this style:
bool Try*(args, out result)
The caller expects (on receiving false) not to even look at the value of result.
If you happen to be returning arrays, there is a nice cheat - you can store a zero-length typed array in a static field somewhere and return that. But ultimately an empty list isn't going to be a huge overhead to allocate, so just send that.
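The cached-array cheat could look roughly like this. PupilRegistry, NoPupils and the "1A" sample data are made up for the sketch. On .NET Framework 4.6 and later, Array.Empty<T>() hands out a shared empty array in the same spirit.

```csharp
using System;

// Hypothetical class, only here to show the shared empty array.
public static class PupilRegistry
{
    // One shared zero-length array; safe to share because it has no elements to mutate.
    private static readonly string[] NoPupils = new string[0];

    public static string[] GetPupilsByClass(string className)
    {
        if (className == "1A")
        {
            return new[] { "Ann", "Bob" };   // sample data for the sketch
        }

        // No allocation on the "nothing found" path, and never null.
        return NoPupils;
    }
}
```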
According to MS, you should never return a null string or array from a field or property, and I think this could be extended to methods (and may well be, somewhere that I haven't found).
On returning empty arrays:
String and Array properties should never return a null reference. Null can be difficult to understand in this context.
The general rule is that null, empty string (""), and empty (0 item) arrays should be treated the same way. Return an empty array instead of a null reference.
MSDN Reference.
Is this method an interface (ifc) method, meaning it is used by external objects that you do not control?
If the answer is yes, then I would return an empty collection, as the caller is not expecting an exception.
If this method is internal and you will be the only user, I would return null to save the memory allocated for an empty collection and the GC performance hit once you stop using it.
A better practice is to return IEnumerable<string>. Use yield return and yield break within your method to build up your collection. In this manner, you postpone the creation of the array. Look here for more information. You'll find IEnumerable has the benefit that it can be chained within extension methods and linq queries:
var results = from x in GetPupilsByClass(className) where x.StartsWith("A") select x;
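A minimal sketch of what such a yield-based method might look like. The School class and its _enrolments data are invented for the example; only the GetPupilsByClass name comes from the question.

```csharp
using System.Collections.Generic;

// Hypothetical class holding the data.
public class School
{
    // Hypothetical backing data: (class name, pupil name) pairs.
    private readonly List<KeyValuePair<string, string>> _enrolments =
        new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("1A", "Ann"),
            new KeyValuePair<string, string>("1A", "Bob"),
            new KeyValuePair<string, string>("2B", "Cid"),
        };

    public IEnumerable<string> GetPupilsByClass(string className)
    {
        if (className == null)
        {
            yield break;   // ends the sequence: the caller gets an empty one, never null
        }

        foreach (var enrolment in _enrolments)
        {
            if (enrolment.Key == className)
            {
                yield return enrolment.Value;   // produced one at a time, on demand
            }
        }
    }
}
```

An iterator method like this cannot return null at all, so the "null or empty" question disappears for callers.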
If you absolutely must return a complete list (due to the lazy nature of yield), then I would recommend changing your method signature to the following:
public bool TryGetPupilsByClass(string className, out ICollection<string> pupils)
This technique has three advantages.
Your intentions are clear -- pupils will be initialized if the return value is true. A user of your code does not have to guess which practice you've settled on.
You of course will not bother allocating a list in the event that the collection is empty, which saves on memory allocations.
ICollection<string> is strongly typed without revealing the storage mechanism you are using. Returning concrete classes should be used sparingly. Alternatives are things like IList<string>, ReadOnlyCollection<string>, and IEnumerable<string>, but the one you choose certainly makes your intentions of what the user can and can not do with the results much clearer.
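For illustration, the Try-style signature could be implemented along these lines. ClassRegister and its dictionary are assumptions made for the sketch; only the method signature is taken from the answer.

```csharp
using System.Collections.Generic;

// Hypothetical class with hypothetical storage, keyed by class name.
public class ClassRegister
{
    private readonly Dictionary<string, List<string>> _pupilsByClass =
        new Dictionary<string, List<string>>
        {
            { "1A", new List<string> { "Ann", "Bob" } },
            { "2B", new List<string>() },
        };

    public bool TryGetPupilsByClass(string className, out ICollection<string> pupils)
    {
        List<string> found;
        if (className != null
            && _pupilsByClass.TryGetValue(className, out found)
            && found.Count > 0)
        {
            pupils = found;
            return true;
        }

        // Nothing is allocated here; callers are expected to ignore pupils when false is returned.
        pupils = null;
        return false;
    }
}
```

A caller would write: if (register.TryGetPupilsByClass("1A", out pupils)) { ... } and only touch pupils inside the if.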
Always return an empty list. That way you avoid the most frequent exception: NullReferenceException.
It depends really on the context in which your code is being used. For most purposes, I return a list of zero objects, and I think it would be best here as well - it is more consistent with your normal results.
I generally follow the Clean Code guidance by Robert C. Martin, which recommends returning an empty list:
public List<string> GetPupilsByClass(string className)
{
....
}
I do it for the following reasons:
Caller method does not have to check for null.
This method does not have to do error handling.
Exceptions will be caught at a higher level method.
If I have a long list of objects that each has the possibility of returning null within a "Linq where" clause, e.g.
SomeSource.Where(srcItem=>(srcItem.DataMembers["SomeText"].Connection.ConnectedTo as Type1).Handler.ForceInvocation == true);
the indexer can return null and the "as" operator may return null. It is possible that the object does not have a connection (ie. The property is null).
If a null is encountered anywhere, I would like the where clause to return "false" for the item being evaluated. Instead, it aborts with a null reference exception.
It appears to me that this would be contrived to express within a single C# expression. I don't like to create a multi line statement or create a separate func for it.
Is there some use of the null coalescing operator that I'm missing?
You're looking for the ?. operator, which did not exist in C# when this was written (it was an often-requested feature, according to Eric Lippert, and was later added in C# 6 as the null-conditional operator).
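With a compiler that supports C# 6 or later, the filter from the question can be sketched with the null-conditional operator. All the types below are hypothetical stand-ins for the ones in the question, reduced to the members the expression touches; the indexer is written to return null for an unknown key, as the question describes.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the types in the question.
public class Handler { public bool ForceInvocation; }
public class Type1 { public Handler Handler; }
public class Connection { public object ConnectedTo; }
public class DataMember { public Connection Connection; }

public class DataMemberCollection
{
    private readonly Dictionary<string, DataMember> _members = new Dictionary<string, DataMember>();

    // Returns null for an unknown key.
    public DataMember this[string key]
    {
        get { DataMember m; return _members.TryGetValue(key, out m) ? m : null; }
        set { _members[key] = value; }
    }
}

public class SourceItem
{
    public DataMemberCollection DataMembers = new DataMemberCollection();
}

public static class NullConditionalFilter
{
    public static IEnumerable<SourceItem> ForcedItems(IEnumerable<SourceItem> someSource)
    {
        // Each ?. short-circuits to null; "null == true" is false, so the item is skipped.
        return someSource.Where(srcItem =>
            (srcItem.DataMembers["SomeText"]?.Connection?.ConnectedTo as Type1)
                ?.Handler?.ForceInvocation == true);
    }
}
```

The == true at the end matters: the chain produces a bool?, and the comparison turns a null result into false instead of an exception.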
The only possible suggestion I have is to write a method that takes an expression and uses it to check for any nulls. But this will come at a performance cost. Anyway, it might look like:
T TryOrDefault<T>(Expression<Func<T>> expression)
{
// Check every MemberExpression within expression one by one,
// looking for any nulls along the way.
// If a null is found, return default(T) or some default value.
// Otherwise...
Func<T> func = expression.Compile();
return func();
}
Using the andand operator from Ruby as inspiration, you could create an extension method that acts as a null guard.
public static U AndAnd<T, U>(this T obj, Func<T, U> func)
{
return obj == null ? default(U) : func(obj);
}
Your original code could then be rewritten as follows:
SomeSource.Where(srcItem => (srcItem.AndAnd(val => val.DataMembers["SomeText"]).AndAnd(val => val.Connection).AndAnd(val => val.ConnectedTo) as Type1).AndAnd(val => val.Handler).AndAnd(val => val.ForceInvocation));
Do be careful when returning non-boolean value types using this method - make sure you are familiar with the values returned by default(U).
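A small sketch of that pitfall, reusing the AndAnd method shown above. DefaultValueCaveat and its two methods are invented for the illustration.

```csharp
using System;

public static class NullGuardExtensions
{
    // Same method as above.
    public static U AndAnd<T, U>(this T obj, Func<T, U> func)
    {
        return obj == null ? default(U) : func(obj);
    }
}

// Hypothetical demo class.
public static class DefaultValueCaveat
{
    public static int LengthOf(string text)
    {
        // For a null string this yields default(int), i.e. 0,
        // which is indistinguishable from the length of "".
        return text.AndAnd(t => t.Length);
    }

    public static int? NullableLengthOf(string text)
    {
        // Projecting to int? keeps "no value" (null) distinct from a real 0.
        return text.AndAnd(t => (int?)t.Length);
    }
}
```

If the distinction between "missing" and "zero" matters to the caller, projecting to a nullable type is one way to keep it.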
"create a separate func for it"
This is the way to go. Do not be allergic to proper techniques. Methods you create are no more expensive (at runtime, and conceptually) than anonymous methods.
A while ago I wrote a project that mimics AndAnd that relies on DynamicProxy. It works fine, although I've not used it in prod. The only drawback is that it requires all of the members to be virtual or the returned types to be an interface so DynamicProxy can do its magic.
Check it here
https://bitbucket.org/mamadero/andand/overview
Let's say I have this extension method:
public static bool HasFive<T>(this IEnumerable<T> subjects)
{
if(subjects == null)
throw new ArgumentNullException("subjects");
return subjects.Count() == 5;
}
Do you think this null check and exception throwing is really necessary? I mean, when I use the Count method, an ArgumentNullException will be thrown anyways, right?
I can maybe think of one reason why I should, but would just like to hear others view on this. And yes, my reason for asking is partly laziness (want to write as little as possible), but also because I kind of think a bunch of null checking and exception throwing kind of clutters up the methods which often end up being twice as long as they really needed to be. Someone should know better than to send null into a method :p
Anyways, what do you guys think?
Note: Count() is an extension method and will throw an ArgumentNullException, not a NullReferenceException. See Enumerable.Count<TSource> Method (IEnumerable<TSource>). Try it yourself if you don't believe me =)
Note2: After the answers given here I have been persuaded to start checking more for null values. I am still lazy though, so I have started to use the Enforce class in Lokad Shared Libraries. Can recommend taking a look at it. Instead of my example I can do this instead:
public static bool HasFive<T>(this IEnumerable<T> subjects)
{
Enforce.Argument(() => subjects);
return subjects.Count() == 5;
}
Yes, it will throw an ArgumentNullException. I can think of two reasons for putting the extra checking in:
If you later go back and change the method to do something before calling subjects.Count() and forget to put the check in at that point, you could end up with a side effect before the exception is thrown, which isn't nice.
Currently, the stack trace will show subjects.Count() at the top, and probably with a message with the source parameter name. This could be confusing to the caller of HasFive who can see a subjects parameter name.
EDIT: Just to save me having to write it yet again elsewhere:
The call to subjects.Count() will throw an ArgumentNullException, not a NullReferenceException. Count() is another extension method here, and assuming the implementation in System.Linq.Enumerable is being used, that's documented (correctly) to throw an ArgumentNullException. Try it if you don't believe me.
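For anyone who wants to try it, a minimal sketch follows. CountOnNull and ParamNameReported are made-up names, and the System.Linq.Enumerable implementation of Count() is assumed. It also shows the parameter name the caller ends up seeing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class.
public static class CountOnNull
{
    // Returns the parameter name reported by the exception, or null if nothing was thrown.
    public static string ParamNameReported()
    {
        IEnumerable<int> subjects = null;
        try
        {
            subjects.Count();   // a static call: Enumerable.Count(subjects)
            return null;
        }
        catch (ArgumentNullException ex)
        {
            // The name belongs to Count's own parameter, not to "subjects".
            return ex.ParamName;
        }
    }
}
```

The reported name is "source", which is exactly the mismatch described in the second reason above.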
EDIT: Making this easier...
If you do a lot of checks like this you may want to make it simpler to do so. I like the following extension method:
internal static void ThrowIfNull<T>(this T argument, string name)
where T : class
{
if (argument == null)
{
throw new ArgumentNullException(name);
}
}
The example method in the question can then become:
public static bool HasFive<T>(this IEnumerable<T> subjects)
{
subjects.ThrowIfNull("subjects");
return subjects.Count() == 5;
}
Another alternative would be to write a version which checked the value and returned it like this:
internal static T NullGuard<T>(this T argument, string name)
where T : class
{
if (argument == null)
{
throw new ArgumentNullException(name);
}
return argument;
}
You can then call it fluently:
public static bool HasFive<T>(this IEnumerable<T> subjects)
{
return subjects.NullGuard("subjects").Count() == 5;
}
This is also helpful for copying parameters in constructors etc:
public Person(string name, int age)
{
this.name = name.NullGuard("name");
this.age = age;
}
(You might want an overload without the argument name for places where it's not important.)
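Such an overload could be sketched as below. This is only one possible shape: the Guard class name and the Name and Age properties are assumptions added for the example, the types are public here only so the sketch is easy to try out, and nameof requires C# 6 or later.

```csharp
using System;

// Hypothetical container for the guard methods.
public static class Guard
{
    public static T NullGuard<T>(this T argument, string name) where T : class
    {
        if (argument == null)
        {
            throw new ArgumentNullException(name);
        }
        return argument;
    }

    // Overload without the argument name, for places where it is not important.
    public static T NullGuard<T>(this T argument) where T : class
    {
        if (argument == null)
        {
            throw new ArgumentNullException();
        }
        return argument;
    }
}

public class Person
{
    private readonly string name;
    private readonly int age;

    public Person(string name, int age)
    {
        // nameof keeps the reported name in sync if the parameter is renamed.
        this.name = name.NullGuard(nameof(name));
        this.age = age;
    }

    public string Name { get { return name; } }
    public int Age { get { return age; } }
}
```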
I think @Jon Skeet is absolutely spot on, however I'd like to add the following thoughts:-
Providing a meaningful error message is useful for debugging, logging and exception reporting. An exception thrown by the BCL is less likely to describe the specific circumstances of the exception WRT your codebase. Perhaps this is less of an issue with null checks which (most of the time) necessarily can't give you much domain-specific information - 'I was passed a null unexpectedly, no idea why' is pretty much the best you can do most of the time, however sometimes you can provide more information and obviously this is more likely to be relevant when dealing with other exception types.
The null check is a form of documentation: it clearly shows other developers, and you if/when you come back to the code a year later, that it's possible someone might pass a null, and that it would be problematic if they did so.
Expanding on Jon's excellent point - you might do something before the null gets picked up - I think it is vitally important to engage in defensive programming. Checking for null before running other code is a form of defensive programming, as you are taking into account that things might not work the way you expected (or that changes might be made in the future that you didn't expect) and ensuring that no matter what happens (assuming your null check isn't removed) such problems cannot arise.
It's a form of runtime assert that your parameter is not null. You can proceed on the assumption that it isn't.
The above assumption can result in slimmer code, you write the rest of your code knowing the parameter is not null, cutting down on extraneous subsequent null checks.
In my opinion you should check for the null value. Two things come to mind.
It makes explicit the possible errors that can happen during runtime.
It also gives you a chance to throw a better exception instead of a generic ArgumentNullException. Thus, making the reason for the exception more explicit.
The exception that you will get thrown will be an Object reference not set to an instance of an object.
Not the most useful of exceptions when tracking down the problem.
The way you have it there will give you much more useful information by specifically stating that it's your subjects reference that is null.
I think it is good practice to do precondition checks at the top of the function. Maybe it's just my code that is full of bugs, but this practice has caught a lot of errors for me.
Also, it's much easier to figure out the source of the problem if you got an ArgumentNullException with the name of the parameter, thrown from the most relevant stack frame. Also, the code in the body of your function can change over time so I wouldn't depend on it catching precondition problems in the future.
It always depends on the context (in my opinion).
For instance, when writing a library (for others to use), it certainly makes sense to fully check each and every parameter and throw the appropriate exceptions.
When writing methods that are used inside a project, I usually skip those checks, attempting to reduce the size of the codebase. But even in this case, there might be a level (between application layers) where you still place such checks. It depends on the context, on the size of the project, on the size of the team working on it...
It certainly doesn't make sense doing it for small projects built by one person :)
It depends on the concrete method. In this case, I think the exception is not necessary, and the method is more useful if the extension method can deal with null.
public static bool HasFive<T>(this IEnumerable<T> subjects) {
if ( object.ReferenceEquals( subjects, null ) ) { return false; }
return subjects.Count() == 5;
}
If you call "items.HasFive()" and "items" is null, then it is true that items does not have five items.
But if you have extension method:
public static T GetFift<T>(this IEnumerable<T> subjects) {
...
}
The exception for "subjects == null" should be thrown, because there is no valid way to deal with it.
If you look at the source of the Enumerable class (System.Core.dll), where a lot of the default extension methods for IEnumerable<T> are defined, you can see that they all check their arguments for null references.
public static IEnumerable<TSource> Skip<TSource>(this IEnumerable<TSource> source, int count)
{
if (source == null)
{
throw Error.ArgumentNull("source");
}
return SkipIterator<TSource>(source, count);
}
It's a bit of an obvious point, but I tend to follow what I find in the base framework library source as you know that is more than likely to be best practices.
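One thing the Skip source also shows is the split into a checking method and a separate iterator method. Here is a sketch of why that shape matters for null checks; SkipFirst, SkipFirstLazyCheck and SkipFirstIterator are made-up names, not framework methods. In a method that uses yield, nothing in the body runs until enumeration starts, so a null check placed there fires late.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical extension class.
public static class SkipFirstExtensions
{
    // Naive version: because of "yield", the null check is deferred
    // until the caller starts enumerating the result.
    public static IEnumerable<T> SkipFirstLazyCheck<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        bool first = true;
        foreach (var item in source)
        {
            if (first) { first = false; continue; }
            yield return item;
        }
    }

    // Split version, mirroring the Skip/SkipIterator shape: the check runs at call time.
    public static IEnumerable<T> SkipFirst<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        return SkipFirstIterator(source);
    }

    private static IEnumerable<T> SkipFirstIterator<T>(IEnumerable<T> source)
    {
        bool first = true;
        foreach (var item in source)
        {
            if (first) { first = false; continue; }
            yield return item;
        }
    }
}
```

With the split version the exception surfaces on the line that passed the null, which is usually where you want to see it.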
Yes, for two reasons:
Firstly, the other extension methods on IEnumerable do and consumers of your code can expect yours to do so as well, but secondly and more importantly, if you have a long chain of operators in your query then knowing which one threw the exception is useful information.
In my opinion one should check for known conditions that will raise errors later on (at least for public methods). That way it's easier to detect the root of the problem.
I would raise a more informational exception like:
if (subjects == null)
{
throw new ArgumentNullException("subjects", "subjects is null.");
}