When should I use a ThrowHelper method instead of throwing directly? - c#

When is it appropriate to use a ThrowHelper method instead of throwing directly?
void MyMethod() {
...
//throw new ArgumentNullException("paramName");
ThrowArgumentNullException("paramName");
...
}
void ThrowArgumentNullException(string paramName) {
throw new ArgumentNullException(paramName);
}
I've read that calling a ThrowHelper method (a method whose only purpose is to throw an exception) instead of throwing directly should yield smaller bytecode.
This, and the obvious encapsulation (another layer of indirection), may be good reasons not to throw directly, at least in some scenarios.
Anyway, IMO the drawbacks are not insubstantial either:
A part of the (exceptional) control flow is hidden
Exceptions end up having a more cryptic stacktrace
The compiler (2.0) will not recognize that ThrowHelper calls are exit points from a method, hence some work-around code is necessary.
My limited experience is that often the overall design gets worse.
int MyMethod(int i) {
switch (i) {
case 1:
return 1;
default:
ThrowMyException();
}
return 0; // Unreachable (but needed) code
}
This may partly be a matter of personal taste. Anyway what are your personal guidelines about this issue? Do you find it is a good idea to use ThrowHelpers for all those common tasks like method param validation (ThrowArgumentNullException(paramName) and such)?
Am I missing something obvious on this issue?
Btw I'm trying not to mix this issue with the validation issue, e.g. a method like:
ThrowIfNameIsNullOrEmpty(name);

My default approach is to throw directly from the exceptional code branch.
The DRY principle and the rule of three guide when I would wrap that in a method: if I find myself writing the same 'throw' code three or more times, I consider wrapping it in a helper method.
However, instead of a method that throws, it's much better to write a Factory Method that creates the desired Exception and then throw it from the original place:
public void DoStuff(string stuff)
{
// Do something
throw this.CreateException("Boo hiss!");
}
private MyException CreateException(string message)
{
return new MyException(message);
}
This preserves the stack trace.
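As a side note, this style also sidesteps the unreachable-code problem from the question, because the throw statement stays visible to the compiler at the call site. A minimal sketch (CreateMyException is a hypothetical helper, not something from the answer above):
int MyMethod(int i)
{
    switch (i)
    {
        case 1:
            return 1;
        default:
            // The throw stays at the call site, so the compiler sees the exit
            // point and no unreachable "return 0;" is needed.
            throw CreateMyException();
    }
}
private static Exception CreateMyException()
{
    return new InvalidOperationException("Unexpected value.");
}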

This seems like a needless abstraction to me. You have a method that does one thing, named as verbosely as the thing that it does.
The argument about less bytecode is practically meaningless since the size of your code rarely matters (you wouldn't be saving more than a kilobyte per instance of your program, unless you throw that exception from a ridiculous number of places in your source code). Meanwhile your stated drawbacks are all real concerns, especially where clarity of exception handling is concerned.
Basically abstracting away minor things like this usually comes back to bite you. (Some entertaining reading on the subject: http://www.joelonsoftware.com/articles/LeakyAbstractions.html)

I would say that the only reasonable time to do this is in something like the BCL where the amount of usage of the class is wide spread enough that the savings may be worth it. It would have to make sense, however. In your case, I don't think you're actually saving any space. Instances of ThrowHelper in the BCL reduce size by substituting enums for strings and method calls (to get string messages). Your case simply passes the same arguments and thus doesn't achieve any savings. It costs just as much to call your method as it does to throw the exception in place.
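For illustration, here is a hedged sketch of that enum-based pattern, loosely modelled on what the BCL's internal helper does; the names here are made up:
using System;
enum ExceptionArgument { index, array, value }
static class ThrowHelper
{
    // The call site only carries an enum member; the string (in the real BCL,
    // a localized resource lookup) is produced here, in one place.
    public static void ThrowArgumentNullException(ExceptionArgument argument)
    {
        throw new ArgumentNullException(argument.ToString());
    }
}
// Call sites then pass an enum member instead of a string literal:
// ThrowHelper.ThrowArgumentNullException(ExceptionArgument.index);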

From what I've understood, the smaller bytecode is pretty much the only advantage (See this question), so I don't expect you'll see a lot of arguments for it in the answers here.

I always try to throw my own exceptions for the mere purpose of customization. I can throw a message that is going to tell me in short order what the problem is (and possibly where it originated from). As far as performance issues go, I try to limit the number of exceptions thrown on the release versions of my software as much as possible, so I have never actually thought about it too much. I am interested to see what other say though.
That is my 2 cents. Please take it as such.

There are two justifiable reasons to do this. The first is if the same exception message / info is supposed to be shared in multiple places.
The second is for performance reasons. Historically a throw in a method would prevent that method from being inlined which can drastically impact performance for small methods. You see this very often in collection types where small methods that modify the collection need to check the bounds before performing the modification.
The idea is that we want the normal case to be inlined as far up the stack as possible, and to do so the throw has to stay in its own method, which is not inlined.
See these blog posts, among many others, for further explanation: https://devblogs.microsoft.com/dotnet/performance_improvements_in_net_7/#exceptions
https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-5/
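A minimal sketch of that pattern, assuming a small collection-like type; the attribute usage and names are illustrative, not taken from the posts linked above:
using System;
using System.Runtime.CompilerServices;
public sealed class TinyList
{
    private readonly int[] _items = new int[8];
    private int _size;
    public int this[int index]
    {
        get
        {
            // The hot path stays small enough for the JIT to inline;
            // the throw lives in a separate method that is never inlined.
            if ((uint)index >= (uint)_size)
                ThrowIndexOutOfRange();
            return _items[index];
        }
    }
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void ThrowIndexOutOfRange()
    {
        throw new ArgumentOutOfRangeException("index");
    }
}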

One advantage not yet mentioned to using either a throw-helper or exception-factory method is the possibility of passing a delegate for such a purpose. When using a throw-helper delegate, one gains the possibility of not throwing (which would in some cases be a good thing; in other cases, a bad thing). For example:
string TryGetEntry(string key, Func<ProblemCause, string> errorHandler)
{
    // ProblemCause is an enum describing why the lookup could not succeed.
    if (disposed)
        return errorHandler(ProblemCause.ObjectDisposed);
    else if (...entry_doesnt_exist...)
        return errorHandler(ProblemCause.ObjectNotFound);
    else
        return ...entry...;
}
string GetEntry(string key)
{
    return TryGetEntry(key, ThrowExceptionOnError);
}
bool TryGetEntry(string key, ref string result)
{
    bool ok = true;
    result = TryGetEntry(key, theProblem => { ok = false; return null; });
    return ok;
}
Using this approach, one may easily use one routine with a variety of error-handling strategies. One could eliminate the need for a closure by having the TryGetEntry method accept a generic type parameter along with a 'ref' parameter of that type and a delegate that accepts such a 'ref' parameter; the example is simpler, though, just using a closure.
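As a possible usage example (a hypothetical wrapper in the same spirit as the sketch above), a caller that wants a fallback value instead of an exception can pass a handler that ignores the problem:
string GetEntryOrDefault(string key, string fallback)
{
    // Every ProblemCause simply maps to the fallback value; nothing is thrown.
    return TryGetEntry(key, problem => fallback);
}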

Although having smaller bytecode may only save you a few KB, it definitely does affect performance, because more code has to be fetched and loaded to run your method, which also means more CPU cache is wasted. If you write high-perf code where every micro-optimisation matters, then it matters.
But, what is more important, you can localize (or may one day want to localize) your exception messages.
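A minimal sketch of what that could look like, assuming the messages live in a resource file; the resource name and member names are hypothetical:
using System;
using System.Resources;
static class Errors
{
    // One place to look up (and translate) messages instead of string literals
    // scattered across every throw site.
    private static readonly ResourceManager Resources =
        new ResourceManager("MyLib.Errors", typeof(Errors).Assembly);
    public static void ThrowInvalidState(string messageKey)
    {
        throw new InvalidOperationException(Resources.GetString(messageKey));
    }
}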

Can the C# compiler throw an error or warning if a certain method is called in a loop

Oftentimes a developer on my team writes code in a loop that makes a call that is relatively slow (e.g. database access, a web service call, or some other slow method). This is a super common mistake.
Yes, we practice code reviews, and we try to catch these and fix them before merging. However, failing early is better, right?
So is there a way to catch this mistake via the compiler?
Example:
Imagine this method
public ReturnObject SlowMethod(Something thing)
{
// method work
}
Below the method is called in a loop, which is a mistake.
public ReturnObject Call(IEnumerable<Something> things)
{
foreach (var thing in things)
SlowMethod(thing); // Should raise a compiler error or warning: slow call in a loop
}
Is there any way to decorate the above SlowMethod() with an attribute or compiler statement so that it would complain if used in a loop?
No, there is nothing in regular C# to prevent a method being used in a loop.
Your options:
discourage usage in a loop by providing easier-to-use alternatives. Providing a second (or only) method that deals with collections will likely discourage callers from writing the loop themselves, enough that it is no longer a major concern (see the sketch after this list).
try to write your own code analysis rule (starting tutorial: https://learn.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/tutorials/how-to-write-csharp-analyzer-code-fix)
add run-time protection to the method if it is called more often than you'd like.
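As referenced in the first option, here is a hedged sketch of a collection-friendly alternative, reusing the names from the question (the batched method itself is hypothetical):
// Assumes the question's SlowMethod, ReturnObject and Something types.
public IReadOnlyList<ReturnObject> SlowMethodBatch(IEnumerable<Something> things)
{
    // One obvious entry point for "many things"; the library is then free to
    // swap this loop for a genuinely batched implementation (one query, one
    // round trip) without every caller changing.
    var results = new List<ReturnObject>();
    foreach (var thing in things)
        results.Add(SlowMethod(thing));
    return results;
}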
Apparently it does make sense to invoke those slow methods in a loop - you're trying to put work into preventing that, but that's putting work into something fundamentally negative. Why not do something positive instead? Evidently you've provided an API that's convenient to use in a loop, so provide some alternatives that are easier to use correctly, where formerly an incorrect use in a loop would take place, like:
an iterable-based API that would make the loop implicit, to remove some of the latency since you'd have a full view of what will be iterated, and can hide the latency appropriately,
an async API that won't block the thread, with example code showing how to use it in the typical situations you've encountered thus far; remember that an API that's too hard to use correctly won't get used!
a lowest-common-denominator API: split the methods into a requester and a result provider, so that there'd naturally be two loops: one to submit all the requests, another to collect and process the results (I dislike this approach, since it doesn't make the code any nicer)

Should I throw on null parameters in private/internal methods?

I'm writing a library that has several public classes and methods, as well as several private or internal classes and methods that the library itself uses.
In the public methods I have a null check and a throw like this:
public int DoSomething(int number)
{
if (number == null)
{
throw new ArgumentNullException(nameof(number));
}
}
But then this got me thinking, to what level should I be adding parameter null checks to methods? Do I also start adding them to private methods? Should I only do it for public methods?
Ultimately, there isn't a uniform consensus on this. So instead of giving a yes or no answer, I'll try to list the considerations for making this decision:
Null checks bloat your code. If your procedures are concise, the null guards at the beginning of them may form a significant part of the overall size of the procedure, without expressing the purpose or behaviour of that procedure.
Null checks expressively state a precondition. If a method is going to fail when one of the values is null, having a null check at the top is a good way to demonstrate this to a casual reader without them having to hunt for where it's dereferenced. To improve this, people often use helper methods with names like Guard.AgainstNull, instead of having to write the check each time (a minimal sketch follows this list).
Checks in private methods are untestable. By introducing a branch in your code which you have no way of fully traversing, you make it impossible to fully test that method. This conflicts with the point of view that tests document the behaviour of a class, and that that class's code exists to provide that behaviour.
The severity of letting a null through depends on the situation. Often, if a null does get into the method, it'll be dereferenced a few lines later and you'll get a NullReferenceException. This really isn't much less clear than throwing an ArgumentNullException. On the other hand, if that reference is passed around quite a bit before being dereferenced, or if throwing an NRE will leave things in a messy state, then throwing early is much more important.
Some libraries, like .NET's Code Contracts, allow a degree of static analysis, which can add an extra benefit to your checks.
If you're working on a project with others, there may be existing team or project standards covering this.
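A minimal sketch of the Guard.AgainstNull idea mentioned above; the class and method names are illustrative, not any particular library's API:
using System;
static class Guard
{
    public static void AgainstNull(object value, string paramName)
    {
        if (value == null)
            throw new ArgumentNullException(paramName);
    }
}
// Usage at the top of a method:
// Guard.AgainstNull(customer, nameof(customer));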
If you're not a library developer, don't be defensive in your code
Write unit tests instead
In fact, even if you're developing a library, throwing is most of the time: BAD
1. Testing an int for null must never be done in C#:
It raises a warning CS4072, because it's always false.
2. Throwing an Exception means it's exceptional: abnormal and rare.
It should never be raised in production code, especially because exception stack trace traversal can be a CPU-intensive task, and you'll never be sure where the exception will be caught, whether it's caught and logged or simply silently swallowed (after killing one of your background threads), because you don't control the user code. There are no "checked exceptions" in C# (unlike Java), which means you never know - if it's not well documented - what exceptions a given method could raise. By the way, that kind of documentation must be kept in sync with the code, which is not always easy to do (it increases maintenance costs).
3. Exceptions increases maintenance costs.
As exceptions are thrown at runtime and only under certain conditions, they can be detected really late in the development process. As you may already know, the later an error is detected in the development process, the more expensive the fix will be. I've even seen exception-raising code make its way to production and not raise for a week, only to start raising every day thereafter (killing production, oops!).
4. Throwing on invalid input means you don't control input.
It's the case for public methods of libraries. However if you can check it at compile time with another type (for example a non nullable type like int) then it's the way to go. And of course, as they are public, it's their responsibility to check for input.
Imagine a user who passes what he thinks is valid data and then, as a side effect, a method deep in the call stack throws an ArgumentNullException.
What will be his reaction?
How can he cope with that?
Will it be easy for you to provide an explanation message ?
5. Private and internal methods should never ever throw exceptions related to their input.
You may throw exceptions in your code because an external component (maybe Database, a file or else) is misbehaving and you can't guarantee that your library will continue to run correctly in its current state.
Making a method public doesn't mean that it should (only that it can) be called from outside of your library (Look at Public versus Published from Martin Fowler). Use IOC, interfaces, factories and publish only what's needed by the user, while making the whole library classes available for unit testing. (Or you can use the InternalsVisibleTo mechanism).
6. Throwing exceptions without any explanation message is making fun of the user
No need to recall the feeling you get when a tool is broken without giving you any clue how to fix it. Yes, I know: you come to SO and ask a question...
7. Invalid input means it breaks your code
If your code can produce a valid output with the value then it's not invalid and your code should manage it. Add a unit test to test this value.
8. Think in user terms:
Do you like it when a library you use throws exceptions in your face? Like: "Hey, it's invalid, you should have known that!"
Even if, from your point of view - with your knowledge of the library internals - the input is invalid, how can you explain it to the user (be kind and polite):
Clear documentation (in Xml doc and an architecture summary may help).
Publish the xml doc with the library.
Clear error explanation in the exception if any.
Give the choice :
Look at the Dictionary class: which call do you prefer? Which call do you think is the fastest? Which call can raise an exception?
Dictionary<string, string> dictionary = new Dictionary<string, string>();
string res;
dictionary.TryGetValue("key", out res);
or
var other = dictionary["key"];
9. Why not using Code Contracts ?
It's an elegant way to avoid the ugly if-then-throw and to isolate the contract from the implementation, allowing you to reuse the contract for different implementations at the same time. You can even publish the contract to your library's users to further explain how to use the library.
As a conclusion, even if you can easily use throw, even if you can experience exceptions raising when you use .Net Framework, that doesn't mean it could be used without caution.
Here are my opinions:
General Cases
Generally speaking, it is better to check any inputs for validity before you process them in a method, for robustness reasons - be it in private, protected, internal, protected internal, or public methods. Although there is some performance cost to this approach, in most cases it is worth doing rather than spending more time debugging and patching the code later.
Strictly Speaking, however...
Strictly speaking, however, it is not always needed. Some methods, usually private ones, can be left without any input checking provided you have a full guarantee that there isn't a single call to the method with invalid inputs. This may give you some performance benefit, especially if the method is called frequently to do some basic computation or action. In such cases, checking input validity may impair performance significantly.
Public Methods
Now, public methods are trickier. This is because, more strictly speaking, although the access modifier alone can tell who can use the methods, it cannot tell who will use them. Moreover, it also cannot tell how the methods are going to be used (that is, whether the methods are going to be called with invalid inputs in the given scopes or not).
The Ultimate Determining Factor
Although access modifiers for methods in the code can hint on how to use the methods, ultimately, it is humans who will use the methods, and it is up to the humans how they are going to use them and with what inputs. Thus, in some rare cases, it is possible to have a public method which is only called in some private scope and in that private scope, the inputs for the public methods are guaranteed to be valid before the public method is called.
In such cases, even though the access modifier is public, there isn't any real need to check for invalid inputs, except for robust-design reasons. And why is this so? Because there are humans who know completely when and how the methods will be called!
Here we can see there is no guarantee that public methods always require checking for invalid inputs. And if this is true for public methods, it must also be true for protected, internal, protected internal, and private methods as well.
Conclusions
So, in conclusion, we can say a couple of things to help us making decisions:
Generally, it is better to have checks for any invalid inputs for robust design reason, provided that performance is not at stake. This is true for any type of access modifiers.
The invalid-input check can be skipped if doing so yields a significant performance gain, provided it can also be guaranteed that the scopes from which the methods are called always give them valid inputs.
Private methods are usually where we skip such checking, but there is no guarantee that we cannot do the same for public methods as well.
Humans are the ones who ultimately use the methods. Regardless of how the access modifiers can hint at the use of the methods, how the methods are actually used and called depends on the coders. Thus, we can only speak of general good practice, without restricting it to be the only way of doing things.
The public interface of your library deserves tight checking of preconditions, because you should expect the users of your library to make mistakes and violate the preconditions by accident. Help them understand what is going on in your library.
The private methods in your library do not require such runtime checking because you call them yourself. You are in full control of what you are passing. If you want to add checks because you are afraid to mess up, then use asserts. They will catch your own mistakes, but do not impede performance during runtime.
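A hedged illustration of that split, with made-up type names: throw for callers of the public surface, assert in your own private paths:
using System;
using System.Diagnostics;
public class Entity { }
public class Repository
{
    public void Save(Entity entity)
    {
        // Callers of the public surface can get this wrong, so throw.
        if (entity == null)
            throw new ArgumentNullException(nameof(entity));
        SaveCore(entity);
    }
    private void SaveCore(Entity entity)
    {
        // Only our own code reaches this point, so an assert is enough.
        Debug.Assert(entity != null, "SaveCore requires a non-null entity");
        // ... actual work ...
    }
}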
Though you tagged this language-agnostic, it seems to me that there probably isn't a general answer.
Notably, in your example you type-hinted the argument: with a language that enforces hinting, an error will fire as soon as the function is entered, before you can take any action.
In such a case, the only solution is to have checked the argument before calling your function... but since you're writing a library, that makes no sense!
On the other hand, with no hinting, it remains realistic to check inside the function.
So at this point in the reasoning, I'd already suggest giving up on hinting.
Now let's go back to your precise question: at what level should it be checked?
For a given piece of data, it should happen only at the highest level where it can "enter" (there may be several entry points for the same data), so logically it would concern only public methods.
That's the theory. But maybe you're planning a huge, complex library, so it might not be easy to be certain you've registered all the "entry points".
In that case, I'd suggest the opposite: consider simply applying your checks everywhere, then omit them only where you clearly see they are duplicates.
Hope this helps.
In my opinion you should ALWAYS check for "invalid" data - independent whether it is a private or public method.
Looked from the other way... why should you be able to work with something invalid just because the method is private? Doesn't make sense, right? Always try to use defensive programming and you will be happier in life ;-)
This is a question of preference. But consider instead why are you checking for null or rather checking for valid input. It's probably because you want to let the consumer of your library to know when he/she is using it incorrectly.
Let's imagine that we have implemented a class PersonList in a library. This list can only contain objects of the type Person. We have also implemented some operations on our PersonList, and therefore we do not want it to contain any null values.
Consider the two following implementations of the Add method for this list:
Implementation 1
public void Add(Person item)
{
if(_size == _items.Length)
{
EnsureCapacity(_size + 1);
}
_items[_size++] = item;
}
Implementation 2
public void Add(Person item)
{
if(item == null)
{
throw new ArgumentNullException("Cannot add null to PersonList");
}
if(_size == _items.Length)
{
EnsureCapacity(_size + 1);
}
_items[_size++] = item;
}
Let's say we go with implementation 1
Null values can now be added in the list
All operations implemented on the list will have to handle these null values
If we check for and throw an exception in our operations, the consumer will only be notified about the exception when he/she calls one of the operations, and at that point it will be very unclear what he/she has done wrong (it just wouldn't make any sense to go for this approach).
If we instead choose to go with implementation 2, we make sure input to our library has the quality that we require for our class to operate on it. This means we only need to handle this here and then we can forget about it while we are implementing our other operations.
It will also become clearer to the consumer that he/she is using the library in the wrong way when he/she gets an ArgumentNullException on .Add instead of in .Sort or similar.
To sum it up, my preference is to check for valid arguments when they are supplied by the consumer, rather than in the private/internal methods of the library. This basically means we have to check arguments in constructors/methods that are public and take parameters. Our private/internal methods can only be called from our public ones, and those have already checked the input, which means we are good to go!
Using Code Contracts should also be considered when verifying input.

Best practice for null testing [duplicate]

To avoid all the standard answers I could have Googled for, I will provide an example you all can attack at will.
C# and Java (and too many others) have, with plenty of types, some 'overflow' behaviour I don't like at all (e.g. type.MaxValue + type.SmallestValue == type.MinValue, for example int.MaxValue + 1 == int.MinValue).
But, given my vicious nature, I'll add some insult to this injury by extending this behaviour to, let's say, an overridden DateTime type. (I know DateTime is sealed in .NET, but for the sake of this example, I'm using a pseudo-language that is exactly like C#, except for the fact that DateTime isn't sealed.)
The overridden Add method:
/// <summary>
/// Increments this date with a timespan, but loops when
/// the maximum value for datetime is exceeded.
/// </summary>
/// <param name="ts">The timespan to (try to) add</param>
/// <returns>The Date, incremented with the given timespan.
/// If DateTime.MaxValue is exceeded, the sum will 'overflow' and
/// continue from DateTime.MinValue.
/// </returns>
public override DateTime Add(TimeSpan ts)
{
try
{
return base.Add(ts);
}
catch (ArgumentOutOfRangeException nb)
{
// calculate how much the MaxValue is exceeded
// regular program flow
TimeSpan saldo = ts - (DateTime.MaxValue - this);
return DateTime.MinValue.Add(saldo);
}
catch(Exception anyOther)
{
// 'real' exception handling.
}
}
Of course an if could solve this just as easily, but the fact remains that I just fail to see why you couldn't use exceptions (logically, that is; I can see that when performance is an issue, exceptions should be avoided in certain cases).
I think in many cases they are more clear than if-structures and don’t break any contract the method is making.
IMHO the “Never use them for regular program flow” reaction everybody seems to have is not as well founded as the strength of that reaction would justify.
Or am I mistaken?
I've read other posts, dealing with all kind of special cases, but my point is there's nothing wrong with it if you are both:
Clear
Honour the contract of your method
Shoot me.
Have you ever tried to debug a program raising five exceptions per second in the normal course of operation?
I have.
The program was quite complex (it was a distributed calculation server), and a slight modification at one side of the program could easily break something in a totally different place.
I wish I could have just launched the program and waited for exceptions to occur, but there were around 200 exceptions during start-up in the normal course of operations.
My point: if you use exceptions for normal situations, how do you locate unusual (i.e. exceptional) situations?
Of course, there are other strong reasons not to use exceptions too much, especially performance-wise.
Exceptions are basically non-local goto statements, with all the consequences of the latter. Using exceptions for flow control violates the principle of least astonishment and makes programs hard to read (remember that programs are written for programmers first).
Moreover, this is not what compiler vendors expect. They expect exceptions to be thrown rarely, and they usually let the throw code be quite inefficient. Throwing exceptions is one of the most expensive operations in .NET.
However, some languages (notably Python) use exceptions as flow-control constructs. For example, iterators raise a StopIteration exception if there are no further items. Even standard language constructs (such as for) rely on this.
My rule of thumb is:
If you can do anything to recover from an error, catch exceptions
If the error is a very common one (e.g. the user tried to log in with the wrong password), use return values (see the sketch after this list)
If you can't do anything to recover from an error, leave it uncaught (Or catch it in your main-catcher to do some semi-graceful shutdown of the application)
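As a sketch of the second rule (the types and names are invented for illustration), the wrong-password case reported as a return value rather than a throw:
public enum LoginResult { Success, WrongPassword, AccountLocked }
public class AuthService
{
    public LoginResult TryLogin(string user, string password)
    {
        // A wrong password is a common, expected outcome: report it with a
        // value instead of the cost and noise of an exception.
        if (!PasswordMatches(user, password))
            return LoginResult.WrongPassword;
        return LoginResult.Success;
    }
    private bool PasswordMatches(string user, string password)
    {
        return false; // stub for illustration
    }
}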
The problem I see with exceptions is from a purely syntactic point of view (I'm pretty sure the performance overhead is minimal). I don't like try-blocks all over the place.
Take this example:
try
{
DoSomeMethod(); //Can throw Exception1
DoSomeOtherMethod(); //Can throw Exception1 and Exception2
}
catch(Exception1)
{
//Okay something messed up, but is it SomeMethod or SomeOtherMethod?
}
.. Another example could be when you need to assign something to a handle using a factory, and that factory could throw an exception:
Class1 myInstance;
try
{
myInstance = Class1Factory.Build();
}
catch(SomeException)
{
// Couldn't instantiate class, do something else..
}
myInstance.BestMethodEver(); // Compile-time error: myInstance may be unassigned, which it potentially is.. :(
So, personally, I think you should keep exceptions for rare error conditions (out of memory, etc.) and use return values (value classes, structs or enums) to do your error checking instead.
Hope I understood your question correct :)
A first reaction to a lot of answers :
you're writing for the programmers and the principle of least astonishment
Of course! But an if just isn't clearer all the time.
It shouldn't be astonishing, e.g.: divide (1/x), catch (DivisionByZero) is clearer than any if to me (to Conrad and others). The fact that this kind of programming isn't expected is purely conventional - and indeed, that is still relevant. Maybe in my example an if would be clearer.
But DivisionByZero and FileNotFound, for that matter, are clearer than ifs.
Of course, if it's less performant and needed a zillion times per second, you should avoid it, but I still haven't read any good reason to avoid the overall design.
As far as the principle of least astonishment goes: there's a danger of circular reasoning here. Suppose a whole community uses a bad design; this design will become expected! Therefore the principle cannot be a grail and should be considered carefully.
exceptions for normal situations, how do you locate unusual (i.e. exceptional) situations?
Something like this shines through in many reactions. Just catch them, no? Your method should be clear, well documented, and honour its contract. I don't get that question, I must admit.
Debugging on all exceptions: the same; that's just done sometimes because the design of not using exceptions is common. My question was: why is it common in the first place?
Before exceptions, in C, there were setjmp and longjmp that could be used to accomplish a similar unrolling of the stack frame.
Then the same construct was given a name: "Exception". And most of the answers rely on the meaning of this name to argue about its usage, claiming that exceptions are intended to be used in exceptional conditions. That was never the intent in the original longjmp. There were just situations where you needed to break control flow across many stack frames.
Exceptions are slightly more general in that you can use them within the same stack frame too. This raises analogies with goto that I believe are wrong. Gotos are a tightly coupled pair (and so are setjmp and longjmp). Exceptions follow a loosely coupled publish/subscribe that is much cleaner! Therefore using them within the same stack frame is hardly the same thing as using gotos.
The third source of confusion relates to whether they are checked or unchecked exceptions. Of course, unchecked exceptions seem particularly awful to use for control flow and perhaps a lot of other things.
Checked exceptions however are great for control flow, once you get over all the Victorian hangups and live a little.
My favorite usage is a sequence of throw new Success() in a long fragment of code that tries one thing after the other until it finds what it is looking for. Each thing - each piece of logic - may have arbitrary nesting, so breaks are out, as is any kind of condition test. The if-else pattern is brittle. If I edit out an else or mess up the syntax in some other way, there is a hairy bug.
Using throw new Success() linearizes the code flow. I use locally defined Success classes -- checked of course -- so that if I forget to catch it the code won't compile. And I don't catch another method's Successes.
Sometimes my code checks for one thing after the other and only succeeds if everything is OK. In this case I have a similar linearization using throw new Failure().
Using a separate function messes with the natural level of compartmentalization. So the return solution is not optimal. I prefer to have a page or two of code in one place for cognitive reasons. I don't believe in ultra-finely divided code.
What JVMs or compilers do is less relevant to me unless there is a hotspot. I cannot believe there is any fundamental reason for compilers to not detect locally thrown and caught Exceptions and simply treat them as very efficient gotos at the machine code level.
As far as using them across functions for control flow -- i. e. for common cases rather than exceptional ones -- I cannot see how they would be less efficient than multiple break, condition tests, returns to wade through three stack frames as opposed to just restore the stack pointer.
I personally do not use the pattern across stack frames and I can see how it would require design sophistication to do so elegantly. But used sparingly it should be fine.
Lastly, regarding surprising virgin programmers, it is not a compelling reason. If you gently introduce them to the practice, they will learn to love it. I remember C++ used to surprise and scare the heck out of C programmers.
The standard answer is that exceptions are not regular and should be used in exceptional cases.
One reason, which is important to me, is that when I read a try-catch control structure in software I maintain or debug, I try to find out why the original coder used exception handling instead of an if-else structure. And I expect to find a good answer.
Remember that you write code not only for the computer but also for other coders. There is a semantic associated to an exception handler that you cannot throw away just because the machine doesn't mind.
Josh Bloch deals with this topic extensively in Effective Java. His suggestions are illuminating and should apply to .NET as well (except for the details).
In particular, exceptions should be used for exceptional circumstances. The reasons for this are usability-related, mainly. For a given method to be maximally usable, its input and output conditions should be maximally constrained.
For example, the second method is easier to use than the first:
/**
* Adds two positive numbers.
*
* @param addend1 greater than zero
* @param addend2 greater than zero
* @throws AdditionException if addend1 or addend2 is less than or equal to zero
*/
int addPositiveNumbers(int addend1, int addend2) throws AdditionException{
if( addend1 <= 0 ){
throw new AdditionException("addend1 is <= 0");
}
else if( addend2 <= 0 ){
throw new AdditionException("addend2 is <= 0");
}
return addend1 + addend2;
}
/**
* Adds two positive numbers.
*
* @param addend1 greater than zero
* @param addend2 greater than zero
*/
public int addPositiveNumbers(int addend1, int addend2) {
if( addend1 <= 0 ){
throw new IllegalArgumentException("addend1 is <= 0");
}
else if( addend2 <= 0 ){
throw new IllegalArgumentException("addend2 is <= 0");
}
return addend1 + addend2;
}
In either case, you need to check to make sure that the caller is using your API appropriately. But in the second case, you require it (implicitly). The soft Exceptions will still be thrown if the user didn't read the javadoc, but:
You don't need to document it.
You don't need to test for it (depending upon how aggressive your unit testing strategy is).
You don't require the caller to handle three use cases.
The ground-level point is that Exceptions should not be used as return codes, largely because you've complicated not only YOUR API, but the caller's API as well.
Doing the right thing comes at a cost, of course. The cost is that everyone needs to understand that they need to read and follow the documentation. Hopefully that is the case anyway.
How about performance? While load testing a .NET web app, we topped out at 100 simulated users per web server until we fixed a commonly occurring exception, and that number increased to 500 users.
I think that you can use Exceptions for flow control. There is, however, a flipside of this technique. Creating Exceptions is a costly thing, because they have to create a stack trace. So if you want to use Exceptions more often than for just signalling an exceptional situation you have to make sure that building the stack traces doesn't negatively influence your performance.
The best way to cut down the cost of creating exceptions is to override the fillInStackTrace() method like this:
public Throwable fillInStackTrace() { return this; }
Such an exception will have no stacktraces filled in.
Here are best practices I described in my blog post:
Throw an exception to state an unexpected situation in your software.
Use return values for input validation.
If you know how to deal with exceptions a library throws, catch them at the lowest level possible.
If you have an unexpected exception, discard current operation completely. Don’t pretend you know how to deal with them.
I don't really see how you're controlling program flow in the code you cited. You'll never see another exception besides the ArgumentOutOfRange exception. (So your second catch clause will never be hit). All you're doing is using an extremely costly throw to mimic an if statement.
Also you aren't performing the more sinister of operations where you just throw an exception purely for it to be caught somewhere else to perform flow control. You're actually handling an exceptional case.
Apart from the reasons stated, one reason not to use exceptions for flow control is that it can greatly complicate the debugging process.
For example, when I'm trying to track down a bug in VS I'll typically turn on "break on all exceptions". If you're using exceptions for flow control then I'm going to be breaking in the debugger on a regular basis and will have to keep ignoring these non-exceptional exceptions until I get to the real problem. This is likely to drive someone mad!!
Let's assume you have a method that does some calculations. There are many input parameters it has to validate, and it then returns a number greater than 0.
Using return values to signal a validation error is simple: if the method returned a number less than 0, an error occurred. But how do you tell which parameter didn't validate?
I remember from my C days a lot of functions returned error codes like this:
-1 - x less than MinX
-2 - x greater than MaxX
-3 - y less than MinY
etc.
Is it really less readable than throwing and catching an exception?
Because the code is hard to read, you may have trouble debugging it, you will introduce new bugs when fixing bugs after a long time, it is more expensive in terms of resources and time, and it annoys you if you are debugging your code and the debugger halts on the occurrence of every exception ;)
If you are using exception handlers for control flow, you are being too general and lazy. As someone else mentioned, you know something happened if you are handling processing in the handler, but what exactly? Essentially you are using the exception for an else statement, if you are using it for control flow.
If you don't know what possible state could occur, then you can use an exception handler for unexpected states, for example when you have to use a third-party library, or you have to catch everything in the UI to show a nice error message and log the exception.
However, if you do know what might go wrong, and you don't put an if statement or something to check for it, then you are just being lazy. Allowing the exception handler to be the catch-all for stuff you know could happen is lazy, and it will come back to haunt you later, because you will be trying to fix a situation in your exception handler based on a possibly false assumption.
If you put logic in your exception handler to determine what exactly happened, then you would be quite stupid for not putting that logic inside the try block.
Exception handlers are the last resort, for when you run out of ideas/ways to stop something from going wrong, or things are beyond your ability to control. Like, the server is down and times out and you can't prevent that exception from being thrown.
Finally, having all the checks done up front shows what you know or expect will occur and makes it explicit. Code should be clear in intent. What would you rather read?
You can use a hammer's claw to turn a screw, just like you can use exceptions for control flow. That doesn't mean it is the intended usage of the feature. The if statement expresses conditions, whose intended usage is controlling flow.
If you are using a feature in an unintended way while choosing to not use the feature designed for that purpose, there will be an associated cost. In this case, clarity and performance suffer for no real added value. What does using exceptions buy you over the widely-accepted if statement?
Said another way: just because you can doesn't mean you should.
As others have mentioned numerous times, the principle of least astonishment forbids using exceptions excessively for control-flow-only purposes. On the other hand, no rule is 100% correct, and there are always those cases where an exception is "just the right tool" - much like goto itself, by the way, which ships in the form of break and continue in languages like Java, and which is often the perfect way to jump out of heavily nested loops, which aren't always avoidable.
The following blog post explains a rather complex but also rather interesting use-case for a non-local ControlFlowException:
http://blog.jooq.org/2013/04/28/rare-uses-of-a-controlflowexception
It explains how inside of jOOQ (a SQL abstraction library for Java), such exceptions are occasionally used to abort the SQL rendering process early when some "rare" condition is met.
Examples of such conditions are:
Too many bind values are encountered. Some databases do not support arbitrary numbers of bind values in their SQL statements (SQLite: 999, Ingres 10.1.0: 1024, Sybase ASE 15.5: 2000, SQL Server 2008: 2100). In those cases, jOOQ aborts the SQL rendering phase and re-renders the SQL statement with inlined bind values. Example:
// Pseudo-code attaching a "handler" that will
// abort query rendering once the maximum number
// of bind values was exceeded:
context.attachBindValueCounter();
String sql;
try {
// In most cases, this will succeed:
sql = query.render();
}
catch (ReRenderWithInlinedVariables e) {
sql = query.renderWithInlinedBindValues();
}
If we explicitly extracted the bind values from the query AST to count them every time, we'd waste valuable CPU cycles for those 99.9% of the queries that don't suffer from this problem.
Some logic is available only indirectly via an API that we want to execute only "partially". The UpdatableRecord.store() method generates an INSERT or UPDATE statement, depending on the Record's internal flags. From the "outside", we don't know what kind of logic is contained in store() (e.g. optimistic locking, event listener handling, etc.) so we don't want to repeat that logic when we store several records in a batch statement, where we'd like to have store() only generate the SQL statement, not actually execute it. Example:
// Pseudo-code attaching a "handler" that will
// prevent query execution and throw exceptions
// instead:
context.attachQueryCollector();
// Collect the SQL for every store operation
for (int i = 0; i < records.length; i++) {
try {
records[i].store();
}
// The attached handler will result in this
// exception being thrown rather than actually
// storing records to the database
catch (QueryCollectorException e) {
// The exception is thrown after the rendered
// SQL statement is available
queries.add(e.query());
}
}
If we had externalised the store() logic into "re-usable" API that can be customised to optionally not execute the SQL, we'd be looking into creating a rather hard to maintain, hardly re-usable API.
Conclusion
In essence, our usage of these non-local gotos is just along the lines of what Mason Wheeler said in his answer:
"I just encountered a situation that I cannot deal with properly at this point, because I don't have enough context to handle it, but the routine that called me (or something further up the call stack) ought to know how to handle it."
Both usages of ControlFlowExceptions were rather easy to implement compared to their alternatives, allowing us to reuse a wide range of logic without refactoring it out of the relevant internals.
But the feeling of this being a bit of a surprise to future maintainers remains. The code feels rather delicate and while it was the right choice in this case, we'd always prefer not to use exceptions for local control flow, where it is easy to avoid using ordinary branching through if - else.
Typically there is nothing wrong, per se, with handling an exception at a low level. An exception IS a valid message that provides a lot of detail for why an operation cannot be performed. And if you can handle it, you ought to.
In general, if you know there is a high probability of failure that you can check for... you should do the check... e.g. if (obj != null) obj.Method().
In your case, I'm not familiar enough with the C# library to know if DateTime has an easy way to check whether a timestamp is out of bounds. If it does, just call something like if (IsValid(ts));
otherwise your code is basically fine.
So, basically it comes down to whichever way creates cleaner code... if the operation to guard against an expected exception is more complex than just handling the exception, then you have my permission to handle the exception instead of creating complex guards everywhere.
You might be interested in having a look at Common Lisp's condition system which is a sort of generalization of exceptions done right. Because you can unwind the stack or not in a controlled way, you get "restarts" as well, which are extremely handy.
This doesn't have anything much to do with best practices in other languages, but it shows you what can be done with some design thought in (roughly) the direction you are thinking of.
Of course there are still performance considerations if you're bouncing up and down the stack like a yo-yo, but it's a much more general idea than "oh crap, lets bail" kind of approach that most catch/throw exception systems embody.
I don't think there is anything wrong with using Exceptions for flow-control. Exceptions are somewhat similar to continuations and in statically typed languages, Exceptions are more powerful than continuations, so, if you need continuations but your language doesn't have them, you can use Exceptions to implement them.
Well, actually, if you need continuations and your language doesn't have them, you chose the wrong language and you should rather be using a different one. But sometimes you don't have a choice: client-side web programming is the prime example – there's just no way to get around JavaScript.
An example: Microsoft Volta is a project to allow writing web applications in straight-forward .NET, and let the framework take care of figuring out which bits need to run where. One consequence of this is that Volta needs to be able to compile CIL to JavaScript, so that you can run code on the client. However, there is a problem: .NET has multithreading, JavaScript doesn't. So, Volta implements continuations in JavaScript using JavaScript Exceptions, then implements .NET Threads using those continuations. That way, Volta applications that use threads can be compiled to run in an unmodified browser – no Silverlight needed.
But you won't always know what happens in the Method/s that you call. You won't know exactly where the exception was thrown. Without examining the exception object in greater detail....
I feel that there is nothing wrong with your example. On the contrary, it would be a sin to ignore the exception thrown by the called function.
In the JVM, throwing an exception is not that expensive, only creating the exception with new xyzException(...), because the latter involves a stack walk. So if you have some exceptions created in advance, you may throw them many times without costs. Of course, this way you can't pass data along with the exception, but I think that is a bad thing to do anyway.
There are a few general mechanisms via which a language could allow for a method to exit without returning a value and unwind to the next "catch" block:
Have the method examine the stack frame to determine the call site, and use the metadata for the call site to find either information about a try block within the calling method, or the location where the calling method stored the address of its caller; in the latter situation, examine metadata for the caller's caller to determine in the same fashion as the immediate caller, repeating until one finds a try block or the stack is empty. This approach adds very little overhead to the no-exception case (it does preclude some optimizations) but is expensive when an exception occurs.
Have the method return a "hidden" flag which distinguishes a normal return from an exception, and have the caller check that flag and branch to an "exception" routine if it's set. This approach adds 1-2 instructions to the no-exception case, but relatively little overhead when an exception occurs.
Have the caller place exception-handling information or code at a fixed address relative to the stacked return address. For example, with the ARM, instead of using the instruction "BL subroutine", one could use the sequence:
adr lr,next_instr
b subroutine
b handle_exception
next_instr:
To exit normally, the subroutine would simply do bx lr or pop {pc}; in case of an abnormal exit, the subroutine would either subtract 4 from LR before performing the return or use sub lr,#4,pc (depending upon the ARM variation, execution mode, etc.) This approach will malfunction very badly if the caller is not designed to accommodate it.
A language or framework which uses checked exceptions might benefit from having those handled with a mechanism like #2 or #3 above, while unchecked exceptions are handled using #1. Although the implementation of checked exceptions in Java is rather a nuisance, they would not be a bad concept if there were a means by which a call site could say, essentially, "This method is declared as throwing XX, but I don't expect it ever to do so; if it does, rethrow it as an 'unchecked' exception." In a framework where checked exceptions were handled in such a fashion, they could be an effective means of flow control for things like parsing methods which in some contexts may have a high likelihood of failure, but where failure should return fundamentally different information than success. I'm unaware of any frameworks that use such a pattern, however. Instead, the more common pattern is to use the first approach above (minimal cost for the no-exception case, but high cost when exceptions are thrown) for all exceptions.
One aesthetic reason:
A try always comes with a catch, whereas an if doesn't have to come with an else.
if (PerformCheckSucceeded())
DoSomething();
With try/catch, it becomes much more verbose.
try
{
PerformCheckSucceeded();
DoSomething();
}
catch
{
}
That's 6 lines of code too many.

Does inverting the "if" improve performance? [duplicate]

When I ran ReSharper on my code, for example:
if (some condition)
{
Some code...
}
ReSharper gave me the above warning (Invert "if" statement to reduce nesting), and suggested the following correction:
if (!some condition) return;
Some code...
I would like to understand why that's better. I always thought that using "return" in the middle of a method is problematic, somewhat like a "goto".
It is not only aesthetic, but it also reduces the maximum nesting level inside the method. This is generally regarded as a plus because it makes methods easier to understand (and indeed, many static analysis tools provide a measure of this as one of the indicators of code quality).
On the other hand, it also makes your method have multiple exit points, something that another group of people believes is a no-no.
Personally, I agree with ReSharper and the first group (in a language that has exceptions I find it silly to discuss "multiple exit points"; almost anything can throw, so there are numerous potential exit points in all methods).
Regarding performance: both versions should be equivalent (if not at the IL level, then certainly after the jitter is through with the code) in every language. Theoretically this depends on the compiler, but practically any widely used compiler of today is capable of handling much more advanced cases of code optimization than this.
A return in the middle of the method is not necessarily bad. It might be better to return immediately if it makes the intent of the code clearer. For example:
double getPayAmount() {
double result;
if (_isDead) result = deadAmount();
else {
if (_isSeparated) result = separatedAmount();
else {
if (_isRetired) result = retiredAmount();
else result = normalPayAmount();
};
}
return result;
};
In this case, if _isDead is true, we can immediately get out of the method. It might be better to structure it this way instead:
double getPayAmount() {
if (_isDead) return deadAmount();
if (_isSeparated) return separatedAmount();
if (_isRetired) return retiredAmount();
return normalPayAmount();
};
I've picked this code from the refactoring catalog. This specific refactoring is called: Replace Nested Conditional with Guard Clauses.
This is a bit of a religious argument, but I agree with ReSharper that you should prefer less nesting. I believe that this outweighs the negatives of having multiple return paths from a function.
The key reason for having less nesting is to improve code readability and maintainability. Remember that many other developers will need to read your code in the future, and code with less indentation is generally much easier to read.
Preconditions are a great example of where it is okay to return early at the start of the function. Why should the readability of the rest of the function be affected by the presence of a precondition check?
As for the negatives about returning multiple times from a method - debuggers are pretty powerful now, and it's very easy to find out exactly where and when a particular function is returning.
Having multiple returns in a function is not going to affect the maintenance programmer's job.
Poor code readability will.
As others have mentioned, there shouldn't be a performance hit, but there are other considerations. Aside from those valid concerns, this also can open you up to gotchas in some circumstances. Suppose you were dealing with a double instead:
public void myfunction(double exampleParam){
if(exampleParam > 0){
//Body will *not* be executed if Double.IsNaN(exampleParam)
}
}
Contrast that with the seemingly equivalent inversion:
public void myfunction(double exampleParam){
if(exampleParam <= 0)
return;
//Body *will* be executed if Double.IsNaN(exampleParam)
}
So in certain circumstances what appears to be a correctly inverted if might not be.
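One way to invert safely here, as a hedged note, is to negate the original condition rather than write its arithmetic opposite, so NaN takes the same path as before:
public void myfunction(double exampleParam){
    if (!(exampleParam > 0))
        return; // also returns for Double.NaN, exactly like the original guard
    //Body will *not* be executed if Double.IsNaN(exampleParam), matching the first version
}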
The idea of only returning at the end of a function came back from the days before languages had support for exceptions. It enabled programs to rely on being able to put clean-up code at the end of a method, and then being sure it would be called and some other programmer wouldn't hide a return in the method that caused the cleanup code to be skipped. Skipped cleanup code could result in a memory or resource leak.
However, in a language that supports exceptions, it provides no such guarantees. In a language that supports exceptions, the execution of any statement or expression can cause a control flow that causes the method to end. This means clean-up must be done through using the finally or using keywords.
Anyway, I'm saying I think a lot of people quote the 'only return at the end of a method' guideline without understanding why it was ever a good thing to do, and that reducing nesting to improve readability is probably a better aim.
I'd like to add that there is name for those inverted if's - Guard Clause. I use it whenever I can.
I hate reading code where there is an if at the beginning, two screens of code, and no else. Just invert the if and return. That way nobody will waste time scrolling.
http://c2.com/cgi/wiki?GuardClause
It doesn't only affect aesthetics, but it also prevents code nesting.
It can actually function as a precondition to ensure that your data is valid as well.
This is of course subjective, but I think it strongly improves on two points:
It is now immediately obvious that your function has nothing left to do if condition holds.
It keeps the nesting level down. Nesting hurts readability more than you'd think.
Multiple return points were a problem in C (and to a lesser extent C++) because they forced you to duplicate clean-up code before each of the return points. With garbage collection, the try/finally construct and using blocks, there's really no reason why you should be afraid of them.
Ultimately it comes down to what you and your colleagues find easier to read.
Guard clauses or preconditions (as you can probably see) check whether a certain condition is met and then break the flow of the program. They're great for places where you're really only interested in one outcome of an if statement. So rather than saying:
if (something) {
    // a lot of indented code
}
You reverse the condition and break if that reversed condition is fulfilled
if (!something) return false; // or another value to show your other code the function did not execute
// all the code from before, save a lot of tabs
return is nowhere near as dirty as goto. It allows you to pass a value to show the rest of your code that the function couldn't run.
You'll see the best examples of where this can be applied in nested conditions:
if (something) {
    DoSomething();
    if (somethingElse) {
        DoAnotherThing();
    } else {
        DoSomethingElse();
    }
}
vs:
if (!something) return;
DoSomething();
if (!somethingElse) { DoSomethingElse(); return; }
DoAnotherThing();
You'll find few people arguing the first is cleaner but of course, it's completely subjective. Some programmers like to know what conditions something is operating under by indentation, while I'd much rather keep method flow linear.
I won't suggest for one moment that preconditions will change your life, but you might find your code just that little bit easier to read.
Performance-wise, there will be no noticeable difference between the two approaches.
But coding is about more than performance. Clarity and maintainability are also very important, and in cases like this, where performance is not affected, they are the only things that matter.
There are competing schools of thought as to which approach is preferable.
One view is the one others have mentioned: the second approach reduces the nesting level, which improves code clarity. This is natural in an imperative style: when you have nothing left to do, you might as well return early.
Another view, from the perspective of a more functional style, is that a method should have only one exit point. In a functional language everything is an expression, so an if must always have an else clause; otherwise the if expression wouldn't always have a value. In that style, the first approach is more natural.
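The closest C# analogue (a small sketch, not taken from the answer above) is to write the branch as an expression, so the method still has exactly one exit point:
// Imperative style with an early return: two exit points.
string Describe(int n)
{
    if (n < 0) return "negative";
    return "non-negative";
}

// Expression style: the conditional operator must supply both branches,
// so there is exactly one return.
string DescribeFunctional(int n)
{
    return n < 0 ? "negative" : "non-negative";
}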
There are several good points made here, but multiple return points can be unreadable as well if the method is very long. That said, if you're going to use multiple return points, make sure your method is short; otherwise the readability bonus may be lost.
Performance comes in two parts: performance when the software is in production, and performance while developing and debugging. The last thing a developer wants is to wait for something trivial. In the end, compiling this with optimization enabled yields similar code either way, so it's good to know these little tricks that pay off in both scenarios.
The case in the question is clear, and ReSharper is correct. Rather than nesting if statements and creating new scopes, you set out a clear rule at the start of your method. It increases readability, it will be easier to maintain, and it reduces the number of rules one has to sift through to find where they want to go.
Personally I prefer only 1 exit point. It's easy to accomplish if you keep your methods short and to the point, and it provides a predictable pattern for the next person who works on your code.
eg.
bool PerformDefaultOperation()
{
    bool succeeded = false;
    DataStructure defaultParameters;
    if ((defaultParameters = this.GetApplicationDefaults()) != null)
    {
        succeeded = this.DoSomething(defaultParameters);
    }
    return succeeded;
}
This is also very useful if you just want to check the values of certain local variables within a function before it exits. All you need to do is place a breakpoint on the final return and you are guaranteed to hit it (unless an exception is thrown).
Avoiding multiple exit points can lead to performance gains. I am not sure about C# but in C++ the Named Return Value Optimization (Copy Elision, ISO C++ '03 12.8/15) depends on having a single exit point. This optimization avoids copy constructing your return value (in your specific example it doesn't matter). This could lead to considerable gains in performance in tight loops, as you are saving a constructor and a destructor each time the function is invoked.
But for 99% of the cases saving the additional constructor and destructor calls is not worth the loss of readability nested if blocks introduce (as others have pointed out).
Many good arguments here about how the code looks. But what about the compiled result?
Let's take a look at some C# code and its compiled IL form:
using System;

public class Test {
    public static void Main(string[] args) {
        if (args.Length == 0) return;
        if ((args.Length + 2) / 3 == 5) return;
        Console.WriteLine("hey!!!");
    }
}
Compile this simple snippet, open the generated .exe with ildasm, and inspect the result. I won't post the full disassembly, but I'll describe what it does.
The generated IL code does the following:
If the first condition is false, it jumps to the code for the second condition; if it's true, it jumps to the last instruction (note: the last instruction is a return).
For the second condition, the same thing happens once the result has been calculated: compare, then go to the Console.WriteLine if false, or to the end if true.
Print the message and return.
So it seems that the code will jump to the end. What if we do a normal if with nested code?
using System;

public class Test {
    public static void Main(string[] args) {
        if (args.Length != 0 && (args.Length + 2) / 3 != 5)
        {
            Console.WriteLine("hey!!!");
        }
    }
}
The IL instructions are quite similar. The difference is that before there were two jumps per condition: if false, go to the next piece of code; if true, go to the end. Now the IL flows better and has three jumps (the compiler optimized this a bit):
First jump: when Length is 0, to a point where the code jumps again (the third jump) to the end.
Second: in the middle of the second condition, to skip one instruction.
Third: if the second condition is false, jump to the end.
Anyway, the program counter will always jump.
In theory, inverting the if could lead to better performance if it increases the branch-prediction hit rate. In practice, it is very hard to know exactly how branch prediction will behave, especially after compilation, so I would not do it in my day-to-day development, except when I am writing assembly code.
More on branch prediction here.
That is simply controversial. There is no "agreement among programmers" on the question of early return. It's always subjective, as far as I know.
It's possible to make a performance argument, since it's better to have conditions that are written so they are most often true; it can also be argued that it is clearer. It does, on the other hand, create nested tests.
I don't think you will get a conclusive answer to this question.
There are a lot of insightful answers here already, but I would still like to point to a slightly different situation: instead of a precondition, which indeed belongs at the top of a function, think of step-by-step initialization, where you have to check that each step succeeded before continuing with the next. In that case you cannot check everything at the top.
I found my code really unreadable when writing an ASIO host application with Steinberg's ASIOSDK, because I followed the nesting paradigm. It went about eight levels deep, and I cannot see a design flaw there, as suggested by Andrew Bullock above. Of course, I could have moved some of the inner code into another function and nested the remaining levels there to make it more readable, but that seems rather arbitrary to me.
By replacing nesting with guard clauses, I even discovered a misconception of mine regarding a portion of clean-up code that should have occurred much earlier within the function rather than at the end. With nested branches I would never have seen that; you could even say they led to my misconception.
So this might be another situation where inverted ifs can contribute to a clearer code.
It's a matter of opinion.
My normal approach would be to avoid single line ifs, and returns in the middle of a method.
You wouldn't want lines like the one it suggests everywhere in your method, but there is something to be said for checking a bunch of assumptions at the top of your method and only doing the actual work if they all pass.
In my opinion an early return is fine if you are just returning void (or some useless return code you're never going to check), and it can improve readability because you avoid nesting while making it explicit that your function is done.
If you are actually returning a value, nesting is usually the better way to go, because you return the value from just one place (at the end) and that can make your code more maintainable in a whole lot of cases.
I'm not sure, but I think that R# tries to avoid far jumps. When you have an IF-ELSE, the compiler does something like this:
Condition false -> far jump to false_condition_label
true_condition_label:
    instruction1
    ...
    instruction_n
false_condition_label:
    instruction1
    ...
    instruction_n
end block
If the condition is true there is no jump and no L1 cache eviction, but the jump to false_condition_label can be far away, and then the processor has to refill its own cache. Synchronising the cache is expensive, so R# tries to replace far jumps with short jumps; in that case there is a higher probability that all the instructions are already in the cache.
I think it depends on what you prefer; as mentioned, there's no general agreement AFAIK.
To reduce the annoyance, you can downgrade this kind of warning to "Hint".
My idea is that the return "in the middle of a function" shouldn't be so "subjective".
The reason is quite simple, take this code:
function do_something( data ){
    if (!is_valid_data( data ))
        return false;

    do_something_that_takes_an_hour( data );

    instance = new object_with_very_painful_constructor( data );
    if ( instance is not valid ) {
        error_message( );
        return;
    }

    connect_to_database( );
    get_some_other_data( );
    return;
}
Maybe the first "return" is not SO intuitive, but it really saves you: it skips the hour-long operation (and everything after it) whenever the data is invalid.
There are too many "ideas" about clean code; it simply takes more practice to shed the "subjective" bad ones.
There are several advantages to this sort of coding, but for me the big win is that if you can return quickly you can improve the speed of your application. E.g., I know that because of precondition X I can return quickly with an error. This gets the error cases out of the way first and reduces the complexity of your code. In a lot of cases, because the CPU pipeline can now be cleaner, it can avoid pipeline stalls or flushes. Secondly, if you are in a loop, breaking or returning out quickly can save you a lot of CPU. Some programmers use loop invariants to do this sort of quick exit, but doing so can break your CPU pipeline and even create memory-seek problems, meaning the CPU needs to load from outside the cache. But basically I think you should do what you intended: end the loop or function, rather than create a complex code path just to implement some abstract notion of correct code. If the only tool you have is a hammer, then everything looks like a nail. A small sketch illustrating the "exit the loop as soon as you can" point follows below.
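Here is that sketch: a made-up IndexOf that exits the loop as soon as the answer is known, instead of carrying a flag all the way to a single return at the end.
// Returns the index of the first match, or -1 if none is found.
// Returning as soon as the match is found avoids scanning the rest
// of the array and keeps the loop body flat.
static int IndexOf(int[] values, int target)
{
    for (int i = 0; i < values.Length; i++)
    {
        if (values[i] == target)
            return i; // early exit: nothing left to do
    }
    return -1;
}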

Validating function arguments?

On a regular basis, I validate my function arguments:
public static void Function(int i, string s)
{
    Debug.Assert(i > 0);
    Debug.Assert(s != null);
    Debug.Assert(s.Length > 0);
}
Of course the checks are "valid" in the context of the function.
Is this common industry practice? What is common practice concerning function argument validation?
The accepted practice, if the values are not valid or will cause an exception later on, is as follows:
if (i <= 0)
    throw new ArgumentOutOfRangeException("i", "parameter i must be greater than 0");
if (string.IsNullOrEmpty(s))
    throw new ArgumentNullException("s", "the parameter s needs to be set ...");
So the list of basic argument exceptions is as follows:
ArgumentException
ArgumentNullException
ArgumentOutOfRangeException
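A sketch of how those three are typically chosen (the method and parameter names here are made up for illustration):
public void Register(string name, int age, DateTime start, DateTime end)
{
    if (name == null)
        throw new ArgumentNullException("name");                                   // the argument is null
    if (age < 0)
        throw new ArgumentOutOfRangeException("age", "age must not be negative");  // outside the valid range
    if (end < start)
        throw new ArgumentException("end must not be earlier than start");         // values are individually fine but invalid together
}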
What you wrote are preconditions, an essential element in Design by Contract. Google (or "StackOverflow":) that term and you'll find quite a lot of good information about it, and some bad information, too. Note that the methodology also includes postconditions and the concept of a class invariant.
Let's leave it clear that assertions are a valid mechanism.
Of course, they're usually (not always) not checked in Release mode, so this means that you have to test your code before releasing it.
If assertions are left enabled and an assertion is violated, the standard behaviour in some languages that use assertions (and in Eiffel in particular) is to throw an assertion violation exception.
Assertions that are left unchecked are neither a convenient nor an advisable mechanism if you're publishing a code library, nor (obviously) a way to validate potentially incorrect input directly. If you have "possibly incorrect input", design an input-validation layer as part of the normal behaviour of your program; but you can still freely use assertions in the internal modules.
Other languages, like Java, have more of a tradition of explicitly checking arguments and throwing exceptions if they're wrong, mainly because these languages don't have a strong "assert" or "design by contract" tradition.
(It may seem strange to some, but I find the differences in tradition respectable, and not necessarily evil.)
See also this related question.
You should not be using asserts to validate data in a live application. It is my understanding that asserts are meant to test whether the function is being used in the proper way, or that the function is returning the proper value, i.e. that the value you are getting is what you expected. They are used a lot in testing frameworks. They are meant to be turned off when the system is deployed, as they are slow. If you want to handle invalid cases, you should do so explicitly, as the poster above mentioned.
Any code that is callable over the network or via inter-process communication absolutely must have parameter validation, because otherwise it's a security vulnerability - and you have to throw an exception; Debug.Assert just will not do, because it only runs in debug builds.
Any code that other people on your team will use should also have parameter validation, simply because it helps them know it's their bug when they pass you an invalid value. Again you should throw exceptions, this time because you can add a nice description to the exception explaining what they did wrong and how to fix it.
Debug.Assert in your function is just to help YOU debug; it's a nice first line of defense, but it's not "real" validation.
For public functions, especially API calls, you should be throwing exceptions. Consumers would probably appreciate knowing that there was a bug in their code, and an exception is the guaranteed way of doing it.
For internal or private functions, Debug.Assert is fine (but not necessary, IMO). You won't be taking in unknown parameters, and your tests should catch any invalid values by expected output. But, sometimes, Debug.Assert will let you zero in on or prevent a bug that much quicker.
For public functions that are not API calls, or internal methods subject to other folks calling them, you can go either way. I generally prefer exceptions for public methods, and (usually) let internal methods do without exceptions. If an internal method is particularly prone to misuse, then an exception is warranted.
While you want to validate arguments, you don't want 4 levels of validation that you have to keep in sync (and pay the perf penalty for). So, validate at the external interface, and just trust that you and your co-workers are able to call functions appropriately and/or fix the bug that inevitably results.
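A sketch of that idea (ReportBuilder and its members are invented names): validate once at the public surface and let internal helpers just assert the assumption.
using System;
using System.Diagnostics;

public class ReportBuilder
{
    // Public surface: validate here, once.
    public string Build(string title)
    {
        if (string.IsNullOrEmpty(title))
            throw new ArgumentNullException("title");

        return BuildCore(title);
    }

    // Internal helper: callers are already trusted, so no duplicate validation -
    // at most a Debug.Assert that documents the assumption and disappears in release builds.
    private string BuildCore(string title)
    {
        Debug.Assert(!string.IsNullOrEmpty(title));
        return "Report: " + title;
    }
}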
Most of the time I don't use Debug.Assert; I would do something like this:
public static void Function(int i, string s)
{
    if (i <= 0 || String.IsNullOrEmpty(s))
        throw new ArgumentException("blah blah");
}
WARNING: This is air code, I haven't tested it.
You should use Assert to validate programmatic assumptions; that is, for the situation where
you're the only one calling that method
it should be an impossible state to get into
The Assert statements will allow you to double check that the impossible state is never reached. Use this where you would otherwise feel comfortable without validation.
For situations where the function is given bad arguments, but you can see that it's not impossible for it to receive those values (e.g. when someone else could call that code), you should throw exceptions (a la #Nathan W and #Robert Paulson) or fail gracefully (a la #Srdjan Pejic).
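Putting those rules together, a small sketch (the Account type and its members are made up) that throws for bad input a caller could plausibly supply, and asserts an invariant that should be impossible to violate:
using System;
using System.Diagnostics;

public class Account
{
    private decimal balance; // invariant: never negative

    public void Withdraw(decimal amount)
    {
        // Bad input from a caller: throw, because callers can get this wrong.
        if (amount <= 0)
            throw new ArgumentOutOfRangeException("amount", "amount must be positive");
        if (amount > balance)
            throw new InvalidOperationException("insufficient funds");

        balance -= amount;

        // Programmatic assumption: with the checks above this can never fail,
        // so an assert (compiled out of release builds) is enough.
        Debug.Assert(balance >= 0);
    }
}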
I try not to use Debug.Assert; rather, I write guards. If a function parameter does not have an expected value, I exit the function. Like this:
public static void Function(int i, string s)
{
    if (i <= 0)
    {
        /* exit and warn calling code */
    }
}
I find this reduces the amount of wrangling that needs to happen.
I won't speak to industry standards, but you could combine the bottom two asserts into a single line:
Debug.Assert(!String.IsNullOrEmpty(s));
