Is there any point calling Any() before a for each? - c#

I've been looking at some code I've refactored which makes use of the null object pattern so will always return an empty list if null.
However some of the other code within the function makes use of an Any() check for doing a ForEach().
Is there any benefit to doing so?
e.g.
var items = new List<Item>();
items.AddRange(GetItems(1));
if (items.Any())
{
    items.ForEach(item => { /* Do something with item */ });
}
I realise as long as there is no need for the null guard this is safe, but I wondered whether there are any best practices with LINQ and C# relating to this.
Cheers,
Jamie

The only check you could do is check if items is null, as you already said. But if you handle that case you should keep in mind to consider the else condition as well and how your application should react in that scenario.
Other than that, there is no need to check if there are items in the list if you just want to iterate over them.
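To illustrate the point above, here is a minimal sketch (the class and method names are hypothetical, not from the question) that counts how often the loop body runs with and without the Any() guard. For a non-null list the two counts should always match, which suggests the guard adds nothing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmptyLoopDemo
{
    // Counts how many times the loop body runs with and without the Any() guard.
    public static (int guarded, int unguarded) CountVisits(List<int> items)
    {
        int guarded = 0;
        int unguarded = 0;

        // Guarded version, as in the question.
        if (items.Any())
        {
            items.ForEach(item => guarded++);
        }

        // Unguarded version: on an empty list the body simply never runs.
        items.ForEach(item => unguarded++);

        return (guarded, unguarded);
    }
}
```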

Sorry, I can't speak to best practices for LINQ/C#, but I will share my thoughts on using Any() before ForEach().
However some of the other code within the function makes use of an Any() check for doing a ForEach().
Having this Any() before ForEach() passively suppresses a null reference exception and carries on with the operation. This appears to be good, but the code actually permits others to send null values.
I normally use Debug.Assert(items != null) in the places where it's possible to inject null values. This way these kinds of errors can be caught in unit tests, and those who call this code will be notified and won't be allowed to break it. If null is tolerated, we never address the root cause, which is that objects need to be initialized in the proper places.
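A minimal sketch of the Debug.Assert approach described above, assuming a hypothetical ItemProcessor.Process method. Debug.Assert is only active in builds that define DEBUG, so this is a development-time check rather than a production guard.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class ItemProcessor
{
    // Hypothetical method: processes every item and returns how many were handled.
    public static int Process(List<string> items)
    {
        // Fails loudly in debug builds so the caller that passed null gets fixed.
        Debug.Assert(items != null, "items must not be null; fix the caller");

        int processed = 0;
        // No Any() check: an empty list just means zero iterations.
        items.ForEach(item => processed++);
        return processed;
    }
}
```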

Related

Can the C# compiler throw an error or warning if a certain method is called in a loop

Often a developer on my team writes code in a loop that makes a relatively slow call (e.g. database access, a web service call, or another slow method). This is a super common mistake.
Yes, we practice code reviews, and we try to catch these and fix them before merging. However, failing early is better, right?
So is there a way to catch this mistake via the compiler?
Example:
Imagine this method
public ReturnObject SlowMethod(Something thing)
{
// method work
}
Below the method is called in a loop, which is a mistake.
public void Call(IEnumerable<Something> things)
{
    foreach (var thing in things)
        SlowMethod(thing); // Should throw compiler error or warning in a loop
}
Is there any way to decorate the above SlowMethod() with an attribute or compiler statement so that it would complain if used in a loop?
No, there is nothing in regular C# to prevent a method being used in a loop.
Your options:
Discourage usage in a loop by providing easier-to-use alternatives. Providing a second (or only) method that deals with collections will likely discourage callers from writing calls in a loop enough that it is no longer a major concern.
Try to write your own code analysis rule (starting tutorial: https://learn.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/tutorials/how-to-write-csharp-analyzer-code-fix).
Add run-time protection to the method if it is called more often than you'd like.
Obviously it makes sense to invoke those slow methods in a loop - you're trying to put work into preventing that, but that's putting work into something fundamentally negative. Why not do something positive instead? Obviously, you've provided an API that's convenient to use in a loop. So, provide some alternatives that are easier to use correctly where formerly an incorrect use in a loop would take place, like:
an iterable-based API that would make the loop implicit, to remove some of the latency since you'd have a full view of what will be iterated, and can hide the latency appropriately,
an async API that won't block the thread, with example code showing how to use it in the typical situations you've encountered thus far; remember that an API that's too hard to use correctly won't get used!
a lowest-common-denominator API: split the methods into a requester and a result provider, so that there'd naturally be two loops: one to submit all the requests, another to collect and process the results (I dislike this approach, since it doesn't make the code any nicer)
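As a rough sketch of the collection-based alternative suggested in both answers, the following assumes a hypothetical SlowService with a SlowMethodBatch overload. The RoundTrips counter only simulates the cost of a slow call; none of these members come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Something { public int Id { get; set; } }
public class ReturnObject { public int Id { get; set; } }

public class SlowService
{
    // Simulated count of expensive round trips (database, web service, ...).
    public int RoundTrips { get; private set; }

    // Single-item call: one round trip per invocation.
    public ReturnObject SlowMethod(Something thing)
    {
        RoundTrips++;
        return new ReturnObject { Id = thing.Id };
    }

    // Collection-based call: one simulated round trip for the whole batch.
    public IReadOnlyList<ReturnObject> SlowMethodBatch(IEnumerable<Something> things)
    {
        RoundTrips++;
        return things.Select(t => new ReturnObject { Id = t.Id }).ToList();
    }
}
```

With an overload like this available, the natural call site passes the whole sequence once, so the loop moves inside the API where the latency can be handled in one place.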

Safe usage of .FirstOrDefault()?

When using Enumerable.FirstOrDefault(), do I always need to catch the ArgumentNullException that can be thrown when the collection operated on is null?
In the past I've always just done something like this:
WorkflowColorItemType associatedColor = ColorItems
.Where(ci => ci.AssociatedState == WorkflowStateStatus.NotStarted)
.FirstOrDefault();
if (associatedColor != null)
{
this.ColorItems.CurrentColor = associatedColor;
}
In the context of this code snippet, I would never expect ColorItems to be null but is it good practice to be enclosing every instance of snippets like this in try catch blocks so I can handle the off chance that the ColorItems collection might be null?
If you don't expect the collection to ever be empty, and it would be an error in your program for it to be empty, then don't use FirstOrDefault in the first place, use First. Since it's not an expected situation for you to be in you want to draw attention to the problem because it's a sign that something is wrong.
If it's entirely valid for the collection to be empty, and you want to use the first item only if there is at least one item, then using FirstOrDefault and providing a null check is fine.
Apply the same logic to the collection being null, and not empty. If it is expected that the collection be allowed to be null, then check for it using an if. If it's not expected that the collection be allowed to be null (which is generally the case for most uses of collections) then you shouldn't check, and you want the exception to be thrown as it will draw attention to the bug in the code that is supposed to populate that collection. Trying to catch the exception and move on is trying to obscure the bug which prevents you from finding and fixing it.
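The distinction drawn in this answer can be sketched as follows, using hypothetical helper names and plain strings in place of the question's color items. First throws InvalidOperationException when nothing matches, FirstOrDefault returns the default value (null for reference types), and both throw ArgumentNullException when the source itself is null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstVersusFirstOrDefault
{
    // "No match" is a valid state here: the caller gets null and checks for it.
    public static string FindOrNull(IEnumerable<string> colors, string wanted)
    {
        return colors.FirstOrDefault(c => c == wanted);
    }

    // "No match" is a bug here: First throws InvalidOperationException,
    // which draws attention to the problem instead of hiding it.
    public static string FindOrThrow(IEnumerable<string> colors, string wanted)
    {
        return colors.First(c => c == wanted);
    }
}
```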
Yes, certainly. You need to be defensive all the time.
In fact, if associatedColor is null, that means there is something wrong, hence you need to handle it.
Code like yours used to be wrapped in a try/catch block to handle exceptions. Since exceptions are "expensive", a null check is a cheaper and nicer way to handle exceptional cases.
In any case, I would almost always use FirstOrDefault or something similar, like SingleOrDefault, and then do the null check.
The built in LINQ functions (like .Where() here) always return an empty enumerable if there are no results, not null. So there is no need to check for null after doing the .Where()
Depending on where ColorItems comes from, you should check for null on the object:
if (ColorItems != null)
{
}
else
{
}
There's no need to put a try/catch block around the code, but you should be checking for null just to be safe. In fact, using try/catch in a scenario like this, where you can just check for a null object, is a bad programming practice.
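A small sketch of the approach in this answer, assuming a hypothetical helper that works on state names as strings. It checks the collection for null up front, and relies on Where yielding an empty sequence (never null) when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullCheckBeforeQuery
{
    // Returns the first state equal to "NotStarted", or null when the
    // collection is null or contains no match. No try/catch is involved.
    public static string FirstNotStarted(IEnumerable<string> states)
    {
        if (states == null)
        {
            return null; // the caller decides how to react to a missing collection
        }

        // Where never returns null; with no matches it yields an empty sequence.
        IEnumerable<string> matches = states.Where(s => s == "NotStarted");
        return matches.FirstOrDefault();
    }
}
```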

Is .Select<T>(...) to be preferred before .Where<T>(...)?

I got into a discussion with two colleagues regarding a setup for an iteration over an IEnumerable (the contents of which will not be altered in any way during the operation). There are three conflicting theories on which approach is optimal. Both of the others (and I as well) are very certain, which has made me unsure, so for the sake of clarity I want to check with an external source.
The scenario is as follows. We had the code below as a starting point and discovered that some of the hazaas need not be acted upon. So, starting with the code below, we started to add a blocker for the action.
foreach(Hazaa hazaa in hazaas) ;
My suggestion is as follows.
foreach(Hazaa hazaa in hazaas.Where(element => condition)) ;
One of the guys wants to resolve it with a more explicit form, claiming that LINQ is not appropriate in this case (not sure why that would be so, but he seems very convinced). His solution is this.
foreach(Hazaa hazaa in hazaas)
    if(condition) ;
The other counter-suggestion is supported by the claim that Where risks repeating the filtering process needlessly, and that it's more certain to minimize the computational workload by picking the appropriate elements once and for all with Select.
foreach(Hazaa hazaa in hazaas.Select(element => condition)) ;
I argue that the first is obsolete, since LINQ can handle data objects quite well.
I also believe that Select-ing is in this case equivalently fast to Where-ing and no needless steps will be taken (e.g. the evaluation of the condition on the elements will only be performed once). If anything, it should be faster using Where because we won't be creating an extra instance of anything.
Who's right?
Select is inappropriate. It doesn't filter anything.
if is a possible solution, but Where is just as explicit.
Where executes the condition exactly once per item, just as the if. Additionally, it is important to note that the call to Where doesn't iterate the list. So, using Where you iterate the list exactly once, just like when using if.
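The claim that Where runs the condition once per item, and only when the sequence is enumerated, can be checked with a small sketch like this one (the class name and counters are hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhereEvaluationDemo
{
    // Reports how often the predicate ran before and after the foreach,
    // and how many items passed the filter.
    public static (int callsBeforeLoop, int callsAfterLoop, int visited) Run(IEnumerable<int> source)
    {
        int predicateCalls = 0;

        // Building the query does not iterate the source (deferred execution).
        IEnumerable<int> filtered = source.Where(x =>
        {
            predicateCalls++;
            return x % 2 == 0;
        });
        int callsBeforeLoop = predicateCalls;

        int visited = 0;
        foreach (int x in filtered)
        {
            visited++;
        }

        return (callsBeforeLoop, predicateCalls, visited);
    }
}
```

For an input of four items, the predicate has run zero times before the loop and four times after it, the same number of evaluations an if inside a plain foreach would perform.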
I think you are discussing with one person that didn't understand LINQ - the guy that wants to use Select - and one that doesn't like the functional aspect of LINQ.
I would go with Where.
The .Where() and the if(condition) approach will be the same.
But since LINQ is nicely readable, I'd prefer that.
The approach with .Select() is nonsense, since it will not return the Hazaa-Object, but an IEnumerable<Boolean>
To be clear about the functions:
myEnumerable.Where(a => isTrueFor(a)) //This is filtering
myEnumerable.Select(a => a.b) //This is projection
Where() will run a function that returns a Boolean for each item of the enumerable, and will return the item depending on the result of that function.
Select() will run a function for every item in the list and return the result of the function without doing any filtering.
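To make the difference concrete, here is a short sketch with a hypothetical condition (x > 2) applied to integers. The same lambda filters the sequence under Where but yields a sequence of Booleans under Select.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectVersusWhere
{
    // Filtering: keeps only the items for which the condition holds.
    public static List<int> Filter(IEnumerable<int> source)
    {
        return source.Where(x => x > 2).ToList();
    }

    // Projection: one Boolean per input item, nothing is removed.
    public static List<bool> Project(IEnumerable<int> source)
    {
        return source.Select(x => x > 2).ToList();
    }
}
```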

Should I redundantly test arguments (e.g collection emptiness)?

Is there a problem with the redundant collection checking here?:
void SomeMethod()
{
    var shapes = GetShapes();
    // maybe Assert(shapes.Any())?
    if (shapes.Any())
    {
        ToggleVisibility(shapes);
    }
}
void ToggleVisibility(IEnumerable<Shape> shapes)
{
    // maybe Assert(shapes.Any())?
    if (shapes.Any())
    {
        // do stuff
    }
}
I don't think there's a big problem here because calling Any() is not an expensive operation.
There is a minor problem in that the responsibility and behavior of ToggleVisibility is not declared. ToggleVisibility should let callers know how it will behave if shapes is empty or null. The best way to do this is through XML comments so that it shows up in Intellisense. This will let ToggleVisibility callers decide if they need to check if the collection is empty or null.
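One way the XML comments mentioned above might look, as a sketch. The Shape class with a Visible property and the chosen behavior (empty is allowed, null throws) are assumptions for illustration, not something stated in the question.

```csharp
using System;
using System.Collections.Generic;

public class Shape
{
    public bool Visible { get; set; }
}

public static class ShapeOperations
{
    /// <summary>Toggles the visibility of every shape in <paramref name="shapes"/>.</summary>
    /// <param name="shapes">Shapes to toggle. An empty sequence is allowed and changes nothing.</param>
    /// <exception cref="ArgumentNullException">Thrown when <paramref name="shapes"/> is null.</exception>
    public static void ToggleVisibility(IEnumerable<Shape> shapes)
    {
        if (shapes == null)
        {
            throw new ArgumentNullException(nameof(shapes));
        }

        // No Any() check needed: an empty sequence means zero iterations.
        foreach (Shape shape in shapes)
        {
            shape.Visible = !shape.Visible;
        }
    }
}
```

With the contract documented like this, callers can see from IntelliSense that they do not need their own emptiness check before calling.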
If you're adding those assertions for testing and debugging, sure, that makes sense.
In those situations, you want to be told when things don't go the way you expect them always to go.
In production, however, you probably don't want to tank the whole application by making calls on non-existent members of the shapes collection.
You can use the Code Contracts library. With it you can configure preconditions (validating incoming values), postconditions (validating results) and invariants (conditions that must always be true for a particular class) in your code.
I think the key here is knowing the responsibility. If you know every single place that will ever call ToggleVisibility and intend to always check beforehand, then it is fine not to check in the ToggleVisibility method.
For my part I would check it inside ToggleVisibility because it makes the caller code cleaner and if you call the ToggleVisibility function from 50 different places then you have considerably less code.
I would suggest that the answer is... as usual... "it depends". While calling Any on an IEnumerable is not expensive, is it really necessary? That depends on what you are planning on doing with your collection in the method.
Will your method throw an exception, or something else undesirable, because of an empty collection? Are you iterating over your collection with a foreach? If so, then having an empty collection wouldn't necessarily do any harm, though it may be against your business rules. Trying to iterate over a null collection is obviously different.
You use GetShapes() as an example framework for an answer. To expand on my idea, is it really illegal to ToggleVisibility() on an empty collection? It obviously won't do much, but if the user highlighted an empty set of shapes, and then clicked on the toggle visibility function, would it do anything bad?
If ToggleVisibility(IEnumerable<Shape>) is a private method (thus SomeMethod() must be in the same library), then I would definitely include the check only one time in the Release build. Whether the check is in one method or the other depends on what makes sense for what is happening. If the collection is expected to never be empty in a correct execution, then perhaps no check is needed. If ToggleVisibility(IEnumerable<Shape>) is being called from ten different places, and any of them may have an empty collection, then I would definitely relieve the caller of the burden of doing the check every time, and just stick it inside the method itself.
If ToggleVisibility(IEnumerable<Shape>) is part of a public API, then it should definitely do whatever parameter validation is necessary, since users of APIs are likely to do anything, and all parameters must be checked at all times. If the documentation for the method states that empty collections will be ignored, then SomeMethod() does not need to worry about it, obviously. Otherwise, SomeMethod() needs to do whatever it takes to verify that the collection that it is passing is valid, even if that means that redundant checks are made.

Best Practice - Removing item from generic collection in C#

I'm using C# in Visual Studio 2008 with .NET 3.5.
I have a generic dictionary that maps types of events to a generic list of subscribers. A subscriber can be subscribed to more than one event.
private static Dictionary<EventType, List<ISubscriber>> _subscriptions;
To remove a subscriber from the subscription list, I can use either of these two options.
Option 1:
ISubscriber subscriber; // defined elsewhere
foreach (EventType eventType in _subscriptions.Keys) { // "event" is a reserved keyword
    if (_subscriptions[eventType].Contains(subscriber)) {
        _subscriptions[eventType].Remove(subscriber);
    }
}
Option 2:
ISubscriber subscriber; // defined elsewhere
foreach (EventType eventType in _subscriptions.Keys) { // "event" is a reserved keyword
    _subscriptions[eventType].Remove(subscriber);
}
I have two questions.
First, notice that Option 1 checks for existence before removing the item, while Option 2 uses a brute force removal since Remove() does not throw an exception. Of these two, which is the preferred, "best-practice" way to do this?
Second, is there another, "cleaner," more elegant way to do this, perhaps with a lambda expression or using a LINQ extension? I'm still getting acclimated to these two features.
Thanks.
EDIT
Just to clarify, I realize that the choice between Options 1 and 2 is a choice of speed (Option 2) versus maintainability (Option 1). In this particular case, I'm not necessarily trying to optimize the code, although that is certainly a worthy consideration. What I'm trying to understand is if there is a generally well-established practice for doing this. If not, which option would you use in your own code?
Option 1 will be slower than Option 2. Lambda expressions and LINQ will be slower. I would use HashSet<> instead of List<>.
If you need confirmation about item removal, then Contains has to be used.
EDITED:
Since there is a high probability that your code will run inside a lock statement, and best practice is to reduce execution time inside a lock, it may be useful to apply Option 2. It looks like there is no best practice for using or not using Contains with Remove.
The Remove() method 'approaches O(1)' and is OK when a key does not exist.
But otherwise: when in doubt, measure. Getting some timings isn't that difficult...
Why enumerate the keys when all you're concerned with is the values?
foreach (List<ISubscriber> list in _subscriptions.Values)
{
list.Remove(subscriber);
}
That said, the LINQ solution suggested by Eric P is certainly more concise. Performance might be an issue, though.
I'd opt for the second option. Contains() and Remove() are both O(n) methods, and there's no reason to call both since Remove doesn't throw. At least with method 2, you're only calling one expensive operation instead of two.
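A minimal sketch of the second option, using strings in place of EventType and ISubscriber to stay self-contained (the class and method names are hypothetical). List<T>.Remove returns false instead of throwing when the item is absent, so its return value also gives confirmation of removal without a separate Contains call.

```csharp
using System;
using System.Collections.Generic;

public static class RemoveWithoutContains
{
    // Removes the subscriber from every list and returns how many lists
    // actually contained it. Each list is scanned once, not twice.
    public static int Unsubscribe(Dictionary<string, List<string>> subscriptions, string subscriber)
    {
        int removedFrom = 0;
        foreach (List<string> list in subscriptions.Values)
        {
            // Remove returns true only if the item was found and removed.
            if (list.Remove(subscriber))
            {
                removedFrom++;
            }
        }
        return removedFrom;
    }
}
```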
I don't know of a faster way to handle it.
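If you wanted to use LINQ to do this, I think something like the line below would work (not tested). Note that Values.All(x => x.Remove(subscriber)) would not, because All stops at the first list for which Remove returns false.
_subscriptions.Values.ToList().ForEach(x => x.Remove(subscriber));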
Might want to check the performance on that though.
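A caution on using LINQ quantifiers for side effects: Enumerable.All stops enumerating at the first element for which the predicate returns false, so a call such as Values.All(x => x.Remove(subscriber)) would stop at the first list that does not contain the subscriber. The sketch below, with hypothetical names and strings standing in for subscribers, compares that with a plain foreach.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AllShortCircuitDemo
{
    // Tries to remove via All() and returns how many lists still contain the subscriber.
    public static int RemainingAfterAll(List<List<string>> lists, string subscriber)
    {
        // All stops as soon as one Remove call returns false.
        lists.All(list => list.Remove(subscriber));
        return lists.Count(list => list.Contains(subscriber));
    }

    // Removes via foreach and returns how many lists still contain the subscriber.
    public static int RemainingAfterForeach(List<List<string>> lists, string subscriber)
    {
        foreach (List<string> list in lists)
        {
            list.Remove(subscriber);
        }
        return lists.Count(list => list.Contains(subscriber));
    }
}
```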
