IEnumerable Extensions with No-Throw Guarantee

IEnumerable Extensions with No-Throw Guarantee - c#

Personally, I'm a fan of the fluent interface syntax of the IEnumerable/List extension methods in C#, as a client. That is, I prefer syntax like this:
public void AddTheseGuysToSomeLocal(IEnumerable<int> values)
{
values.ToList().ForEach(v => _someLocal += v);
}
as opposed to a control structure like a foreach loop. I find that easier to process mentally, at a glance. Problem with this is that my code here is going to generate null reference exceptions if clients pass me a null argument.
Let's assume that I don't want to consider a null enumeration to be exceptional -- I want that to result in leaving some local as-is -- I would add a null guard. But, that's a little syntactically noisy for my taste, and that noise adds up in cases where you're chaining these together in a fluent interface and the interim results can be null as well.
So, I created a class called SafeEnumerableExtensions that offers a no-throw guarantee by treating nulls as empty enumerables (lists). Example methods include:
//null.ToList() returns empty list
public static List<T> SafeToList<T>(this IEnumerable<T> source)
{
return (source ?? new List<T>()).ToList();
}
//x.SafeForEach(y) is a no-op for null x or null y
//This is a shortcut that should probably go in a class called SafeListExtensions later
public static void SafeForEach<T>(this List<T> source, Action<T> action)
{
var myAction = action ?? new Action<T>(t => { });
var mySource = source ?? new List<T>();
mySource.ForEach(myAction);
}
public static void SafeForEach<T>(this IEnumerable<T> source, Action<T> action)
{
SafeToList(source).SafeForEach(action);
}
Now, my original method is prettier than if there were a null guard, but just as safe, since null result in a no-op:
public void AddTheseGuysToSomeLocal(IEnumerable<int> values)
{
values.ForEach(v => _someLocal += v);
}
So, my question is twofold. (1) I'm assuming that I'm not so original as to be the first person ever to have thought of this -- does anyone know if there is an existing library that does this or something similar? And (2) has anyone used said library or implemented a scheme like this and experienced unpleasant consequences or else can anyone foresee unpleasant consequences for doing something like this? Is this even a good idea?
(I did find this question when checking for duplicates, but I don't want to explicitly do this check in clients - I want the extension class to do this implicitly and not bother clients with that extra method call)

And (2) has anyone used said library or implemented a scheme like this
and experienced unpleasant consequences or else can anyone foresee
unpleasant consequences for doing something like this? Is this even a
good idea?
Personally I would consider this a bad idea. In most cases passing a null enumeration or null Func is probably not intended.
You are "fixing" the problem which might lead to seemingly unrelated problems later down the road. Instead I would rather throw an exception in this case so that you find this problem in your code early on ("Fail fast").

Related

Generic method calls another generic with a concrete overload with an out parameter: concrete version is never used

There are several similar questions about generic methods with concrete overloads here on SO and most of them say essentially the same thing: Generic overload resolution is done at compile time so if you want to use a concrete overload, you may need to use dynamic in order to have the runtime decide which overload to use. OK, cool--I understand that. None of the questions I've found deal with an out parameter and I'm wondering if there's a better way to handle this than what I've done. Let's start with the most basic case (my code was all written and tested in LinqPad):
void Main()
{
string x = "";
var g = new GenericTest();
g.RunTest(x); //Ran from TryFoo<T>
g.RunTestWithDynamic(x);//Ran from TryFoo(string)
g.Foo(x); //Ran from TryFoo(string)
}
public class GenericTest
{
//public void RunTest(string withInput) => Foo(withInput); <-- This would fix it
public void RunTest<T>(T withInput) => Foo(withInput);
public void RunTestWithDynamic<T>(T withInput) => Foo((dynamic)withInput);
public void Foo<T>(T withInput) => Console.WriteLine("Ran from TryFoo<T>");
public void Foo(string withInput) => Console.WriteLine("Ran from TryFoo(string)");
}
Here are some things to note:
RunTest is a generic method that calls another generic method, Foo. When I call RunTest, it appears the compiler doesn't follow all the way from the call site to see that g.RunTeset is passing in x which is a string and link it all up so that the Foo(string) overload is called; instead, it just sees that RunTest<T> is calling Foo. It doesn't make different "paths" based on different input values of T. OK, that's fair and understandable.
If I call Foo directly from my Main method, the compiler is smart enough to see that we are calling Foo with a string directly and correctly selects the concrete overload.
As the linked SO posts describe, I can call RunTestWithDynamic which will change which overload is used at runtime based on value. It feels just a bit "hacky", but I'm good with this solution.
I've commented out a line: a concrete overload of RunTest. This would be essentially the same as calling Foo directly. If that were un-commented, it would fix everything. Alas, that is not an option for me in my case.
Now, what if the T is for an out parameter? Consider the pattern used by, say, int.TryParse where you return a bool to indicate if it succeeded or not, but the value you actually want is an out. Now you can't really do dynamic resolution because you can't cast an out parameter. I considered doing something where I make a default(T) and then casting that to dynamic, but if that ever works well anywhere else, there is the problem of reference types that default to null to deal with. Nope, that doesn't work, either.
In the end, the best I could come up with was this:
void Main()
{
string x;
var g = new GenericTest();
g.TryRunTest(out x); //Ran from TryFoo<T>
g.TryRunTestWithDynamic(out x); //Ran from TryFoo(string)
g.TryFoo(out x); //Ran from TryFoo(string)
}
public class GenericTest
{
//This would fix it, but in my case, not an option
//public bool TryRunTest(out string withOutput) => return TryFoo(out withOutput);
public bool TryRunTest<T>(out T withOutput)
{
return TryFoo(out withOutput);
}
public bool TryRunTestWithDynamic<T>(out T withOutput)
{
if(typeof(T) == typeof(string))
{
var retval = TryFoo(out string s);
withOutput = (T)(dynamic)s;
return retval;
}
return TryFoo(out withOutput);
}
public bool TryFoo<T>(out T withOutput)
{
withOutput = default(T);
Console.WriteLine("Ran from TryFoo<T>");
return true;
}
public bool TryFoo(out string withOutput)
{
withOutput = "Strings are special";
Console.WriteLine("Ran from TryFoo(string)");
return true;
}
}
You can see that TryRunTestWithDynamic has to look for the concrete string type specifically. I can't figure out a way to do it so that I can just add overloads and then use dynamic resolution to select the overload I want without having to spell it all out (and let's face it--spelling it all out kind of kills the whole point of having overloads in the first place).
In his post I linked to above (and here for good measure), Jon Skeet mentions using MethodInfo.MakeGenericMethod as an alternative. Here he talks about how to use it. I'm curious if that would help me here, but I can't figure out how to use it with out parameters.
So, here are my specific questions:
While, yes this DOES work, it is VERY clunky. Is there a better way to do this? Most importantly, is there a way I can consume overloads that wouldn't require me checking each type specifically?
Is there a way to use the MethodInfo.MakeGenericMethod route using out parameters and would that help me?
Let's say we could wave a magic wand and all of a sudden a language feature was added to C# that would give us the option to force dynamic resolution for generic methods (either for all cases, or only for the case where the generic parameter is for an out, like my main problem). Taking from the above examples, let's say we had something like this:
//The following are all invalid syntax but illustrate possible
// ways we may express that we want to force dynamic resolution
//For out parameters
public void TryFoo<T>(dynamic out T withOutput){...}
//"dynamic out" could be a special combination of keywords so that this only works
// with output variables. Not sure if that constraint buys us anything.
// Maybe we would want to allow it for regular (not out) generic parameters as well
public void Foo<T>(dynamic T withInput){...}
public void Foo<dynamic T>(T withInput){...}
public void dynamic Foo<T>(T withInput){...}
Let's not focus on the syntax itself; we could easily use a different notation--that isn't the point. The idea is simply that we are telling the runtime that when we get to TryFoo<T>... pause and check to see if there is a different, better fitting overload that we should switch to instead. I'm also not particular on whether we are checking for a better resolution on T specifically, or if it applies to the method in general.
Is there any reason why this might be a BAD thing? As he often does, Eric Lippert makes some interesting points about generic and concrete overloading resolution when inheritance is involved (Jon Skeet is also quoted in other answers, as well) and discusses why choosing the most specific overload isn't always the best choice. I'm wondering if having such a language construct might introduce similar problems.

OK, so I proposed the idea of adding support for opting-in to dynamic binding at the C# GitHub and was given this alternate solution which I like MUCH better than what I am currently doing: ((dynamic)this).TryFoo(out withOutput);
It would look like this:
public bool TryRunTestWithDynamic<T>(out T withOutput)
{
return ((dynamic)this).TryFoo(out withOutput);
}
I will reiterate the warning I was given: this only works for instance methods. In my case, that's fine, but if someone else had a method that was static, it wouldn't help, so the second part of my question (regarding MethodInfo.MakeGenericMethod) may still be relevant.

C# constructor chaining call order

I have a class that can be created with data or with a binary reader (which it uses to deserialize it's data). It does additional initialization with the data, and since I don't want to duplicate that code, I want to chain the constructors. Right now this chaining looks like this:
public Item(string id, int count = 1) { /*...*/ }
public Item(BinaryReader reader) : this(reader.ReadString(), reader.ReadInt32()) { /*...*/ }
Is order in which these read calls are made deterministic?
Clarification: I was talking about the order in which read.ReadString() and reader.ReadInt32() are called.

The exact behavior of the order of evaluation when calling methods with multiple arguments is documented in the C# specification Run-time evaluation of argument lists which states:
During the run-time processing of a function member invocation (Compile-time checking of dynamic overload resolution), the expressions or variable references of an argument list are evaluated in order, from left to right, as follows:
So in your case the order will be:
reader.ReadString()
reader.ReadInt32()
finally the other constructor will be called
Now, there are a couple of problem with this, mainly related to readability and exceptions.
The fact that you ask which call order is the correct one, the next programmer might have the same question. It would be better to refactor this into a different solution that doesn't come with a lot of questions attached.
Additionally, if you pass in a null reader you'll get a NullReferenceException instead of a ArgumentNullException, even if you should happen to validate this parameter not being null inside the constructor, simply because all this evaluation will happen before the constructor body executes. There are hacks to "fix" this but they are worse than the code you already have.
To refactor into a "better" solution (my opinion) you would create a factory method like this:
public static Item Create(BinaryReader reader)
{
if (reader == null) throw new ArgumentNullException(nameof(reader));
// optional: handle exceptions from reader.ReadString and ReadInt32
var s = reader.ReadString();
var i = reader.ReadInt32();
return new Item(s, i);
}

How can I use code contracts to check the value of a Lazy<T> without losing the advantages of lazy evaluation?

I have code that looks something like this:
public class Foo
{
private readonly Lazy<string> lazyBar;
public Foo()
{
lazyBar = new Lazy<string>(() => someExpression);
}
public string Bar
{
get { return lazyBar.Value; }
}
public void DoSomething()
{
Contract.Requires(Bar != null); // This evaluates Bar, but all I really require is that WHEN Bar is evaluated, it is not null.
...
}
}
Now, every place DoSomething is called, we must also prove that Bar is not null. But won't checking for this eliminate the benefit of lazy evaluation? Additionally, aren't we supposed to avoid side-effects in our contracts?
Is it possible to prove that someExpression will not resolve to null, without evaluating it?

Code Contracts doesn't have enough insight into Lazy<T> in order to make the connection between the original lambda and the result you get back.
What you really want to be able to state (for Lazy<T>) is that any contracts that hold about the lambda's return value also hold about the Value value, but meta-level contracts like this aren't possible at the moment.
What you could do is move someExpression into a method and then have a Contract.Ensures that result != null. This will then warn you if this condition does not hold. You can then put an Invariant onto the result; lazyBar.Value != null. This will mean that it isn't actually lazy, but for your release code you can build with CC in ReleaseRequires mode and these types of check will be eliminated (having a read in the manual of the different 'levels' of contract enforcement is highly recommended!)

After you edited in your comment in the code I can answer your question, but I don't think I'm answering the exact question lurking in the shadows here, you be the judge.
The only way to know what the property would resolve to if you read from it, is to actually read from it, evaluating the expression you constructed the underlying Lazy<T> field with.
Now, what you can do, is to check the underlying field to ask if it has been evaluated.
You can, in other words, do this:
if (lazyBar.IsValueCreated)
....
Now, whether that will work with a contract I'm not sure, but the following thing should allow you to check:
Contract.Requires(!lazyBar.IsValueCreated || lazyBar.Value != null);
Since I am not an expert on contracts, I don't know if the above code will be turned into an expression tree and evaluated either way.

This partly depends on the type that you're dealing.
For example, types that inherit from ICollection will not be itterated over in order to evaluate such methods as Count().
In the case where ICollection is implimented, the Count() method is optimised to check the Count property.
If you simply want to check that your Lazy object has "something" inside I suggest using the Any() method available to us in the System.Linq namespace.
This will then only need to do a single MoveNext in the collection before being able to determine whether there is "anything" inside it.

I can't understand why this doesn't work (and it really doesn't):
public class Foo
{
private readonly Lazy lazyBar;
public Foo()
{
lazyBar = new Lazy<string>(() => someExpression);
Contract.Ensures(!lazyBar.IsValueCreated);
}
[ContractInvariantMethod]
private static void Invariant()
{
Contract.Invariant(!lazyBar.IsValueCreated || lazyBar.Value != null);
}
public void DoSomething()
{
Contract.Assume(lazyBar.IsValueCreated);
lazyBar.Value.AccessSomeMethod; //Here Code Contracts detect "Possibly calling a method on a null reference"
...
}
}

IEnumerable<T> null coalescing Extension

I frequently face the problem to check whether an IEnumerable<T> is null before iterating over it through foreach or LINQ queries, then I often come into codes like this:
var myProjection = (myList ?? Enumerable.Empty<T>()).Select(x => x.Foo)...
Hence, I thought to add this extension-method to an Extensions class:
public static class MyExtensions
{
public static IEnumerable<T> AsEmptyIfNull<T>(this IEnumerable<T> source)
{
return source ?? Enumerable.Empty<T>();
}
}
A little issue comes immediately in my mind looking at this code, i.e., given the "instance-methods aspect" of extension-methods, it should be implemented as a mere static method, otherwise something like this would be perfectly legal:
IEnumerable<int> list = null;
list.AsEmptyIfNull();
Do you see any other drawback in using it ?
Can such extension leading to some kind of bad-trend in the developer(s), if massively used?
Bonus question:
Can you suggest a better name to it ? :)
(English is not my first language, then I'm not so good in naming...)
Thanks in advance.

Methods that return an IEnumerable<T> should return an empty one, instead of null. So you wouldn't need this.
See this question : Is it better to return null or empty collection?
Otherwise, your code seems ok.

It's generally a bad idea to return a null instead of an empty sequence if you can control it. This is self-explanatory if you consider that when someone is asked to produce a collection, returning null is not like saying "the collection is empty" but "there is no such collection at all".
If you own the methods returning the enumerables, then returning an empty IEnumerable (which can even be a special purpose readonly static object if it might be returned a lot) is the way to go, period.
If you are forced to use a bad-mannered library that has the habit of returning null in such cases, then this an extension method might be a solution, but again I wouldn't prefer it. It's probably better to wrap the bad-mannered methods in your own versions that do the coalescing where people won't see it. This way you get both the convenience of always having an enumerable instead of null and the correctness of not supporting the "return null" paradigm.

C#: Should I bother checking for null in this situation?

Lets say I have this extention method:
public static bool HasFive<T>(this IEnumerable<T> subjects)
{
if(subjects == null)
throw new ArgumentNullException("subjects");
return subjects.Count() == 5;
}
Do you think this null check and exception throwing is really necessary? I mean, when I use the Count method, an ArgumentNullException will be thrown anyways, right?
I can maybe think of one reason why I should, but would just like to hear others view on this. And yes, my reason for asking is partly laziness (want to write as little as possible), but also because I kind of think a bunch of null checking and exception throwing kind of clutters up the methods which often end up being twice as long as they really needed to be. Someone should know better than to send null into a method :p
Anyways, what do you guys think?
Note: Count() is an extension method and will throw an ArgumentNullException, not a NullReferenceException. See Enumerable.Count<TSource> Method (IEnumerable<TSource>). Try it yourself if you don't believe me =)
Note2: After the answers given here I have been persuaded to start checking more for null values. I am still lazy though, so I have started to use the Enforce class in Lokad Shared Libraries. Can recommend taking a look at it. Instead of my example I can do this instead:
public static bool HasFive<T>(this IEnumerable<T> subjects)
{
Enforce.Argument(() => subjects);
return subjects.Count() == 5;
}

Yes, it will throw an ArgumentNullException. I can think of two reasons for putting the extra checking in:
If you later go back and change the method to do something before calling subjects.Count() and forget to put the check in at that point, you could end up with a side effect before the exception is thrown, which isn't nice.
Currently, the stack trace will show subjects.Count() at the top, and probably with a message with the source parameter name. This could be confusing to the caller of HasFive who can see a subjects parameter name.
EDIT: Just to save me having to write it yet again elsewhere:
The call to subjects.Count() will throw an ArgumentNullException, not a NullReferenceException. Count() is another extension method here, and assuming the implementation in System.Linq.Enumerable is being used, that's documented (correctly) to throw an ArgumentNullException. Try it if you don't believe me.
EDIT: Making this easier...
If you do a lot of checks like this you may want to make it simpler to do so. I like the following extension method:
internal static void ThrowIfNull<T>(this T argument, string name)
where T : class
{
if (argument == null)
{
throw new ArgumentNullException(name);
}
}
The example method in the question can then become:
public static bool HasFive<T>(this IEnumerable<T> subjects)
{
subjects.ThrowIfNull("subjects");
return subjects.Count() == 5;
}
Another alternative would be to write a version which checked the value and returned it like this:
internal static T NullGuard<T>(this T argument, string name)
where T : class
{
if (argument == null)
{
throw new ArgumentNullException(name);
}
return argument;
}
You can then call it fluently:
public static bool HasFive<T>(this IEnumerable<T> subjects)
{
return subjects.NullGuard("subjects").Count() == 5;
}
This is also helpful for copying parameters in constructors etc:
public Person(string name, int age)
{
this.name = name.NullGuard("name");
this.age = age;
}
(You might want an overload without the argument name for places where it's not important.)

I think #Jon Skeet is absolutely spot on, however I'd like to add the following thoughts:-
Providing a meaningful error message is useful for debugging, logging and exception reporting. An exception thrown by the BCL is less likely to describe the specific circumstances of the exception WRT your codebase. Perhaps this is less of an issue with null checks which (most of the time) necessarily can't give you much domain-specific information - 'I was passed a null unexpectedly, no idea why' is pretty much the best you can do most of the time, however sometimes you can provide more information and obviously this is more likely to be relevant when dealing with other exception types.
The null check clearly demonstrates to other developers and you, a form of documentation, if/when you come back to the code a year later, that it's possible someone might pass a null, and it would be problematic if they did so.
Expanding on Jon's excellent point - you might do something before the null gets picked up - I think it is vitally important to engage in defensive programming. Checking for an exception before running other code is a form of defensive programming as you are taking into account things might not work the way you expected (or changes might be made in the future that you didn't expect) and ensuring that no matter what happens (assuming your null check isn't removed) such problems cannot arise.
It's a form of runtime assert that your parameter is not null. You can proceed on the assumption that it isn't.
The above assumption can result in slimmer code, you write the rest of your code knowing the parameter is not null, cutting down on extraneous subsequent null checks.

In my opinion you should check for the null value. Two things that comes to mind.
It makes explicit the possible errors that can happen during runtime.
It also gives you a chance to throw a better exception instead of a generic ArgumentNullException. Thus, making the reason for the exception more explicit.

The exception that you will get thrown will be an Object reference not set to an instance of an object.
Not the most useful of exceptions when tracking down the problem.
The way you have it there will give you much more useful information by specifically stating that it's your subjects reference that is null.

I think it is a good practice to do precondition checks at the top of the function. Maybe it's just my code that is full of bugs, but this practice catched a lot of errors for me.
Also, it's much easier to figure out the source of the problem if you got an ArgumentNullException with the name of the parameter, thrown from the most relevant stack frame. Also, the code in the body of your function can change over time so I wouldn't depend on it catching precondition problems in the future.

It always depends on the context (in my opinion).
For instance, when writing a library (for others to use), it certainly makes sense to fully check each and every parameter and throw the appropriate exceptions.
When writing methods that are used inside a project, I usually skip those checks, attempting to reduce the size of the codebase. But even in this case, there might be a level (between application layers) where you still place such checks. It depends on the context, on the size of the project, on the size of the team working on it...
It certainly doesn't make sense doing it for small projects built by one person :)

It depends on the concrete method. In this case - I think, the exception is not necesary and the better usage will be, if teh extension method can deal with null.
public static bool HasFive<T>(this IEnumerable<T> subjects) {
if ( object.ReferenceEquals( subjects, null ) ) { return false; }
return subjects.Count() == 5;
}
If you call "items.HasFive()" and the "items" is null, then is true that items has not five items.
But if you have extension method:
public static T GetFift<T>(this IEnumerable<T> subjects) {
...
}
The exception for "subjects == null" should be called, because there is no valid way, how to deal with it.

If you look at the source to the Enumerable class (System.Core.dll) where a lot of the default extension methods are defined for IEnumerables classes, you can see that they all check for null references with arguments.
public static IEnumerable<TSource> Skip<TSource>(this IEnumerable<TSource> source, int count)
{
if (source == null)
{
throw Error.ArgumentNull("source");
}
return SkipIterator<TSource>(source, count);
}
It's a bit of an obvious point, but I tend to follow what I find in the base framework library source as you know that is more than likely to be best practices.

Yes, for two reasons:
Firstly, the other extension methods on IEnumerable do and consumers of your code can expect yours to do so as well, but secondly and more importantly, if you have a long chain of operators in your query then knowing which one threw the exception is useful information.

In my opinion one should check for known conditions that will raise errors later on (at least for public methods). That way it's easier to detect the root of the problem.
I would raise a more informational exception like:
if (subjects == null)
{
throw new ArgumentNullException("subjects ", "subjects is null.");
}

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.