The IT department of a subsidiary of ours had a consulting company write them an ASP.NET application. Now it's having intermittent problems with mixing up who the current user is and has been known to show Joe some of Bob's data by mistake.
The consultants were brought back to troubleshoot and we were invited to listen in on their explanation. Two things stuck out.
First, the consultant lead provided this pseudo-code:
void MyFunction()
{
Session["UserID"] = SomeProprietarySessionManagementLookup();
Response.Redirect("SomeOtherPage.aspx");
}
He went on to say that the assignment of the session variable is asynchronous, which seemed untrue. Granted the call into the lookup function could do something asynchronously, but this seems unwise.
Given that alleged asynchronousness, his theory was that the session variable was not being assigned before the redirect's inevitable ThreadAbort exception was raised. This faulure then prevented SomeOtherPage from displaying the correct user's data.
Second, he gave an example of a coding best practice he recommends. Rather than writing:
int MyFunction(int x, int x)
{
try
{
return x / y;
}
catch(Exception ex)
{
// log it
throw;
}
}
the technique he recommended was:
int MyFunction(int x, int y, out bool isSuccessful)
{
isSuccessful = false;
if (y == 0)
return 0;
isSuccessful = true;
return x / y;
}
This will certainly work and could be better from a performance perspective in some situations.
However, from these and other discussion points it just seemed to us that this team was not well-versed technically.
Opinions?
Rule of thumb: If you need to ask if a consultant knows what he's doing, he probably doesn't ;)
And I tend to agree here. Obviously you haven't provided much, but they don't seem terribly competent.
I would agree. These guys seem quite incompetent.
(BTW, I'd check to see if in "SomeProprietarySessionManagementLookup," they're using static data. Saw this -- with behavior exactly as you describe on a project I inherited several months ago. It was a total head-slap moment when we finally saw it ... And wished we could get face to face with the guys who wrote it ... )
If the consultant has written an application that's supposed to be able to keep track of users and only show the correct data to the correct users and it doesn't do that, then clearly something's wrong. A good consultant would find the problem and fix it. A bad consultant would tell you that it was asynchronicity.
On the asynchronous part, the only way that could be true is if the assignment going on there is actually an indexer setter on Session that is hiding an asynchronous call with no callback indicating success/failure. This would seem to be a HORRIBLE design choice, and it looks like a core class in your framework, so I find it highly unlikely.
Usually asynchronous calls have a way to specify a callback so you can determine what the result is, or if the operation was successful. The documentation for Session should be pretty clear though on if it is actually hiding an asynchronous call, but yeah... doesn't look like the consultant knows what he is talking about...
The method call that is being assigned to the Session indexer cannot be asynch, because to get a value asynchronously, you HAVE to use a callback... no way around that, so if there is no explicit callback, it's definitely not asynch (well, internally there could be an asynchronous call, but the caller of the method would perceive it as synchronous, so it is irrelevant if the method internally for example invokes a web service asynchronously).
For the second point, I think this would be much better, and keep the same functionality essentially:
int MyFunction(int x, int y)
{
if (y == 0)
{
// log it
throw new DivideByZeroException("Divide by zero attempted!");
}
return x / y;
}
For the first point, that does indeed seem bizarre.
On the second one, it's reasonable to try to avoid division by 0 - it's entirely avoidable and that avoidance is simple. However, using an out parameter to indicate success is only reasonable in certain cases, such as int.TryParse and DateTime.TryParseExact - where the caller can't easily determine whether or not their arguments are reasonable. Even then, the return value is usually the success/failure and the out parameter is the result of the method.
Asp.net sessions, if you're using the built-in providers, won't accidentally give you someone else's session. SomeProprietarySessionManagementLookup() is the likely culprit and is returning bad values or just not working.
Session["UserID"] = SomeProprietarySessionManagementLookup();
First of all assigning the return value from an asynchronously SomeProprietarySessionManagementLookup() just wont work. The consultants code probably looks like:
public void SomeProprietarySessionManagementLookup()
{
// do some async lookup
Action<object> d = delegate(object val)
{
LookupSession(); // long running thing that looks up the user.
Session["UserID"] = 1234; // Setting session manually
};
d.BeginInvoke(null,null,null);
}
The consultant isn't totally full of BS, but they have written some buggy code. Response.Redirect() does throw a ThreadAbort, and if the proprietary method is asynchronous, asp.net doesn't know to wait for the asynchronous method to write back to the session before asp.net itself saves the session. This is probably why it sometimes works and sometimes doesn't.
Their code might work if the asp.net session is in-process, but a state server or db server wouldn't. It's timing dependent.
I tested the following. We use state server in development. This code works because the session is written to before the main thread finishes.
Action<object> d = delegate(object val)
{
System.Threading.Thread.Sleep(1000); // waits a little
Session["rubbish"] = DateTime.Now;
};
d.BeginInvoke(null, null, null);
System.Threading.Thread.Sleep(5000); // waits a lot
object stuff = Session["rubbish"];
if( stuff == null ) stuff = "not there";
divStuff.InnerHtml = Convert.ToString(stuff);
This next snippet of code doesn't work because the session was already saved back to state server by the time the asynchronous method gets around to setting a session value.
Action<object> d = delegate(object val)
{
System.Threading.Thread.Sleep(5000); // waits a lot
Session["rubbish"] = DateTime.Now;
};
d.BeginInvoke(null, null, null);
// wait removed - ends immediately.
object stuff = Session["rubbish"];
if( stuff == null ) stuff = "not there";
divStuff.InnerHtml = Convert.ToString(stuff);
The first step is for the consultant to make their code synchronous because their performance trick didn't work at all. If that fixes it, have the consultant properly implement using the Asynchronous Programming Design Pattern
I agree with him in part -- it's definitely better to check y for zero rather than catching the (expensive) exception. The out bool isSuccessful seems really dated to me, but whatever.
re: the asynchronous sessionid buffoonery -- may or may not be true, but it sounds like the consultant is blowing smoke for cover.
Cody's rule of thumb is dead right. If you have to ask, he probably doesn't.
It seems like point two its patently incorrect. .NET's standards explain that if a method fails it should throw an exception, which seems closer to the original; not the consulstant's suggestion. Assuming the exception is accurately & specifically describing the failure.
The consultants created the code in the first place right? And it doesn't work. I think you have quite a bit of dirt on them already.
The asynchronous answer sounds like BS, but there may be something in it. Presumably they have offered a suitable solution as well as pseudo-code describing the problem they themselves created. I would be more tempted to judge them on their solution rather than their expression of the problem. If their understanding is flawed their new solution won't work either. Then you'll know they are idiots. (In fact look round to see if you have a similar proof in any other areas of their code already)
The other one is a code style issue. There are a lot of different ways to cope with that. I personally don't like that style, but there will be circumstances under which it is suitable.
They're wrong on the async.
The assignment happens and then the page redirects. The function can start something asynchronously and return (and could even conceivably alter the Session in its own way), but whatever it does return has to be assigned in the code you gave before the redirect.
They're wrong on that defensive coding style in any low-level code and even in a higher-level function unless it's a specific business case that the 0 or NULL or empty string or whatever should be handled that way - in which case, it's always successful (that successful flag is a nasty code smell) and not an exception. Exceptions are for exceptions. You don't want to mask behaviors like this by coddling the callers of the functions. Catch things early and throw exceptions. I think Maguire covered this in Writing Solid Code or McConnell in Code Complete. Either way, it smells.
This guy does not know what he is doing. The obvious culprit is right here:
Session["UserID"] = SomeProprietarySessionManagementLookup();
I have to agree with John Rudy. My gut tells me the problem is in SomeProprietarySessionManagementLookup().
.. and your consultants do not sound to sure of themselves.
Storing in Session in not async. So that isn't true unless that function is async. But even so, since it isn't calling a BeginCall and have something to call on completion, the next line of code wouldn't execute until the Session line is complete.
For the second statement, while that could be used, it isn't exactly a best practice and you have a few things to note with it. You save the cost of throwing an exception, but wouldn't you want to know that you are trying to divide by zero instead of just moving past it?
I don't think that is a solid suggestion at all.
Quite strange. On the second item it may or may not be faster. It certainly isn't the same functionality though.
Typical "consultant" bollocks:
The problem is with whatever SomeProprietarySessionManagementLookup is doing
Exceptions are only expensive if they're thrown. Don't be afraid of try..catch, but throws should only occur in exceptional circumstances. If variable y shouldn't be zero then an ArgumentOutOfRangeException would be appropriate.
I'm guessing your consultant is suggesting use a status variable instead of exception for error handling is a better practice? I don't agree. How often does people forgot or too lazy to do error checking for return values? Also, pass/fail variable is not informative. There are more things can go wrong other than divide by zero like integer x/y is too big or x is NaN. When things go wrong, status variable cannot tell you what went wrong, but exception can. Exception is for exceptional case, and divide by zero or NaN are definitely exceptional cases.
The session thing is possible. It's a bug, beyond doubt, but it could be that the write arrives at whatever custom session state provider you're using after the next read. The session state provider API accommodates locking to prevent this sort of thing, but if the implementor has just ignored all that, your consultant could be telling the truth.
The second issue is also kinda valid. It's not quite idiomatic - it's a slightly reversed version of things like int.TryParse, which are there to avoid performance issues caused by throwing lots of exceptions. But unless you're calling that code an awful lot, it's unlikely it'll make a noticeable difference (compared to say, one less database query per page etc). It's certainly not something you should do by default.
If SomeProprietarySessionManagementLookup(); is doing an asynchronous assignment it would more likely look like this:
SomeProprietarySessionManagementLookup(Session["UserID"]);
The very fact that the code is assigning the result to Session["UserID"] would suggest that it is not supposed to be asynchronous and the result should be obtained before Response.Redirect is called. If SomeProprietarySessionManagementLookup is returning before its result is calculated they have a design flaw anyway.
The throw an exception or use an out parameter is a matter of opinion and circumstance and in actual practice won't amount to a hill of beans which ever way you do it. For the performance hit of exceptions to become an issue you would need to be calling the function a huge number of times which would probably be a problem in itself.
If the consultants deployed their ASP.NET application on your server(s), then they may have deployed it in uncompiled form, which means there would be a bunch of *.cs files floating around that you could look at.
If all you can find is compiled .NET assemblies (DLLs and EXEs) of theirs, then you should still be able to decompile them into somewhat readable source code. I'll bet if you look through the code you'll find them using static variables in their proprietary lookup code. You'd then have something very concrete to show your bosses.
This entire answer stream is full of typical programmer attitudes. It reminds me of Joel's 'Things you should never do' article (rewrite from scratch.) We don't really know anything about the system, other than there's a bug, and some guy posted some code online. There are so many unknowns that it is ridiculous to say "This guy does not know what he is doing."
Rather than pile on the Consultant, you could just as easily pile on the person who procured their services. No consultant is perfect, nor is a hiring manager ... but at the end of the day the real direction you should be taking is very clear: instead of trying to find fault you should expend energy into working collaboratively to find solutions. No matter how skilled someone is at their roles and responsibilities they will certainly have deficiencies. If you determine there is a pattern of incompentencies then you may choose to transition to another resource going forward, but assigning blame has never solved a single problem in history.
On the second point, I would not use exceptions here. Exceptions are reserved for exceptional cases.
However, division of anything by zero certainly does not equal zero (in math, at least), so this would be case specific.
Related
To avoid all standard-answers I could have Googled on, I will provide an example you all can attack at will.
C# and Java (and too many others) have with plenty of types some of ‘overflow’ behaviour I don’t like at all (e.g type.MaxValue + type.SmallestValue == type.MinValue for example : int.MaxValue + 1 == int.MinValue).
But, seen my vicious nature, I’ll add some insult to this injury by expanding this behaviour to, let’s say an Overridden DateTime type. (I know DateTime is sealed in .NET, but for the sake of this example, I’m using a pseudo language that is exactly like C#, except for the fact that DateTime isn’t sealed).
The overridden Add method:
/// <summary>
/// Increments this date with a timespan, but loops when
/// the maximum value for datetime is exceeded.
/// </summary>
/// <param name="ts">The timespan to (try to) add</param>
/// <returns>The Date, incremented with the given timespan.
/// If DateTime.MaxValue is exceeded, the sum wil 'overflow' and
/// continue from DateTime.MinValue.
/// </returns>
public DateTime override Add(TimeSpan ts)
{
try
{
return base.Add(ts);
}
catch (ArgumentOutOfRangeException nb)
{
// calculate how much the MaxValue is exceeded
// regular program flow
TimeSpan saldo = ts - (base.MaxValue - this);
return DateTime.MinValue.Add(saldo)
}
catch(Exception anyOther)
{
// 'real' exception handling.
}
}
Of course an if could solve this just as easy, but the fact remains that I just fail to see why you couldn’t use exceptions (logically that is, I can see that when performance is an issue that in certain cases exceptions should be avoided).
I think in many cases they are more clear than if-structures and don’t break any contract the method is making.
IMHO the “Never use them for regular program flow” reaction everybody seems to have is not that well underbuild as the strength of that reaction can justify.
Or am I mistaken?
I've read other posts, dealing with all kind of special cases, but my point is there's nothing wrong with it if you are both:
Clear
Honour the contract of your method
Shoot me.
Have you ever tried to debug a program raising five exceptions per second in the normal course of operation ?
I have.
The program was quite complex (it was a distributed calculation server), and a slight modification at one side of the program could easily break something in a totally different place.
I wish I could just have launched the program and wait for exceptions to occur, but there were around 200 exceptions during the start-up in the normal course of operations
My point : if you use exceptions for normal situations, how do you locate unusual (ie exceptional) situations ?
Of course, there are other strong reasons not to use exceptions too much, especially performance-wise
Exceptions are basically non-local goto statements with all the consequences of the latter. Using exceptions for flow control violates a principle of least astonishment, make programs hard to read (remember that programs are written for programmers first).
Moreover, this is not what compiler vendors expect. They expect exceptions to be thrown rarely, and they usually let the throw code be quite inefficient. Throwing exceptions is one of the most expensive operations in .NET.
However, some languages (notably Python) use exceptions as flow-control constructs. For example, iterators raise a StopIteration exception if there are no further items. Even standard language constructs (such as for) rely on this.
My rule of thumb is:
If you can do anything to recover from an error, catch exceptions
If the error is a very common one (eg. user tried to log in with the wrong password), use returnvalues
If you can't do anything to recover from an error, leave it uncaught (Or catch it in your main-catcher to do some semi-graceful shutdown of the application)
The problem I see with exceptions is from a purely syntax point of view (I'm pretty sure the perfomance overhead is minimal). I don't like try-blocks all over the place.
Take this example:
try
{
DoSomeMethod(); //Can throw Exception1
DoSomeOtherMethod(); //Can throw Exception1 and Exception2
}
catch(Exception1)
{
//Okay something messed up, but is it SomeMethod or SomeOtherMethod?
}
.. Another example could be when you need to assign something to a handle using a factory, and that factory could throw an exception:
Class1 myInstance;
try
{
myInstance = Class1Factory.Build();
}
catch(SomeException)
{
// Couldn't instantiate class, do something else..
}
myInstance.BestMethodEver(); // Will throw a compile-time error, saying that myInstance is uninitalized, which it potentially is.. :(
Soo, personally, I think you should keep exceptions for rare error-conditions (out of memory etc.) and use returnvalues (valueclasses, structs or enums) to do your error checking instead.
Hope I understood your question correct :)
A first reaction to a lot of answers :
you're writing for the programmers and the principle of least astonishment
Of course! But an if just isnot more clear all the time.
It shouldn't be astonishing eg : divide (1/x) catch (divisionByZero) is more clear than any if to me (at Conrad and others) . The fact this kind of programming isn't expected is purely conventional, and indeed, still relevant. Maybe in my example an if would be clearer.
But DivisionByZero and FileNotFound for that matter are clearer than ifs.
Of course if it's less performant and needed a zillion time per sec, you should of course avoid it, but still i haven't read any good reason to avoid the overal design.
As far as the principle of least astonishment goes : there's a danger of circular reasoning here : suppose a whole community uses a bad design, this design will become expected! Therefore the principle cannot be a grail and should be concidered carefully.
exceptions for normal situations, how do you locate unusual (ie exceptional) situations ?
In many reactions sth. like this shines trough. Just catch them, no? Your method should be clear, well documented, and hounouring it's contract. I don't get that question I must admit.
Debugging on all exceptions : the same, that's just done sometimes because the design not to use exceptions is common. My question was : why is it common in the first place?
Before exceptions, in C, there were setjmp and longjmp that could be used to accomplish a similar unrolling of the stack frame.
Then the same construct was given a name: "Exception". And most of the answers rely on the meaning of this name to argue about its usage, claiming that exceptions are intended to be used in exceptional conditions. That was never the intent in the original longjmp. There were just situations where you needed to break control flow across many stack frames.
Exceptions are slightly more general in that you can use them within the same stack frame too. This raises analogies with goto that I believe are wrong. Gotos are a tightly coupled pair (and so are setjmp and longjmp). Exceptions follow a loosely coupled publish/subscribe that is much cleaner! Therefore using them within the same stack frame is hardly the same thing as using gotos.
The third source of confusion relates to whether they are checked or unchecked exceptions. Of course, unchecked exceptions seem particularly awful to use for control flow and perhaps a lot of other things.
Checked exceptions however are great for control flow, once you get over all the Victorian hangups and live a little.
My favorite usage is a sequence of throw new Success() in a long fragment of code that tries one thing after the other until it finds what it is looking for. Each thing -- each piece of logic -- may have arbritrary nesting so break's are out as also any kind of condition tests. The if-else pattern is brittle. If I edit out an else or mess up the syntax in some other way, then there is a hairy bug.
Using throw new Success() linearizes the code flow. I use locally defined Success classes -- checked of course -- so that if I forget to catch it the code won't compile. And I don't catch another method's Successes.
Sometimes my code checks for one thing after the other and only succeeds if everything is OK. In this case I have a similar linearization using throw new Failure().
Using a separate function messes with the natural level of compartmentalization. So the return solution is not optimal. I prefer to have a page or two of code in one place for cognitive reasons. I don't believe in ultra-finely divided code.
What JVMs or compilers do is less relevant to me unless there is a hotspot. I cannot believe there is any fundamental reason for compilers to not detect locally thrown and caught Exceptions and simply treat them as very efficient gotos at the machine code level.
As far as using them across functions for control flow -- i. e. for common cases rather than exceptional ones -- I cannot see how they would be less efficient than multiple break, condition tests, returns to wade through three stack frames as opposed to just restore the stack pointer.
I personally do not use the pattern across stack frames and I can see how it would require design sophistication to do so elegantly. But used sparingly it should be fine.
Lastly, regarding surprising virgin programmers, it is not a compelling reason. If you gently introduce them to the practice, they will learn to love it. I remember C++ used to surprise and scare the heck out of C programmers.
The standard anwser is that exceptions are not regular and should be used in exceptional cases.
One reason, which is important to me, is that when I read a try-catch control structure in a software I maintain or debug, I try to find out why the original coder used an exception handling instead of an if-else structure. And I expect to find a good answer.
Remember that you write code not only for the computer but also for other coders. There is a semantic associated to an exception handler that you cannot throw away just because the machine doesn't mind.
Josh Bloch deals with this topic extensively in Effective Java. His suggestions are illuminating and should apply to .NET as well (except for the details).
In particular, exceptions should be used for exceptional circumstances. The reasons for this are usability-related, mainly. For a given method to be maximally usable, its input and output conditions should be maximally constrained.
For example, the second method is easier to use than the first:
/**
* Adds two positive numbers.
*
* #param addend1 greater than zero
* #param addend2 greater than zero
* #throws AdditionException if addend1 or addend2 is less than or equal to zero
*/
int addPositiveNumbers(int addend1, int addend2) throws AdditionException{
if( addend1 <= 0 ){
throw new AdditionException("addend1 is <= 0");
}
else if( addend2 <= 0 ){
throw new AdditionException("addend2 is <= 0");
}
return addend1 + addend2;
}
/**
* Adds two positive numbers.
*
* #param addend1 greater than zero
* #param addend2 greater than zero
*/
public int addPositiveNumbers(int addend1, int addend2) {
if( addend1 <= 0 ){
throw new IllegalArgumentException("addend1 is <= 0");
}
else if( addend2 <= 0 ){
throw new IllegalArgumentException("addend2 is <= 0");
}
return addend1 + addend2;
}
In either case, you need to check to make sure that the caller is using your API appropriately. But in the second case, you require it (implicitly). The soft Exceptions will still be thrown if the user didn't read the javadoc, but:
You don't need to document it.
You don't need to test for it (depending upon how aggresive your
unit testing strategy is).
You don't require the caller to handle three use cases.
The ground-level point is that Exceptions should not be used as return codes, largely because you've complicated not only YOUR API, but the caller's API as well.
Doing the right thing comes at a cost, of course. The cost is that everyone needs to understand that they need to read and follow the documentation. Hopefully that is the case anyway.
How about performance? While load testing a .NET web app we topped out at 100 simulated users per web server until we fixed a commonly-occuring exception and that number increased to 500 users.
I think that you can use Exceptions for flow control. There is, however, a flipside of this technique. Creating Exceptions is a costly thing, because they have to create a stack trace. So if you want to use Exceptions more often than for just signalling an exceptional situation you have to make sure that building the stack traces doesn't negatively influence your performance.
The best way to cut down the cost of creating exceptions is to override the fillInStackTrace() method like this:
public Throwable fillInStackTrace() { return this; }
Such an exception will have no stacktraces filled in.
Here are best practices I described in my blog post:
Throw an exception to state an unexpected situation in your software.
Use return values for input validation.
If you know how to deal with exceptions a library throws, catch them at the lowest level possible.
If you have an unexpected exception, discard current operation completely. Don’t pretend you know how to deal with them.
I don't really see how you're controlling program flow in the code you cited. You'll never see another exception besides the ArgumentOutOfRange exception. (So your second catch clause will never be hit). All you're doing is using an extremely costly throw to mimic an if statement.
Also you aren't performing the more sinister of operations where you just throw an exception purely for it to be caught somewhere else to perform flow control. You're actually handling an exceptional case.
Apart from the reasons stated, one reason not to use exceptions for flow control is that it can greatly complicate the debugging process.
For example, when I'm trying to track down a bug in VS I'll typically turn on "break on all exceptions". If you're using exceptions for flow control then I'm going to be breaking in the debugger on a regular basis and will have to keep ignoring these non-exceptional exceptions until I get to the real problem. This is likely to drive someone mad!!
Lets assume you have a method that does some calculations. There are many input parameters it has to validate, then to return a number greater then 0.
Using return values to signal validation error, it's simple: if method returned a number lesser then 0, an error occured. How to tell then which parameter didn't validate?
I remember from my C days a lot of functions returned error codes like this:
-1 - x lesser then MinX
-2 - x greater then MaxX
-3 - y lesser then MinY
etc.
Is it really less readable then throwing and catching an exception?
Because the code is hard to read, you may have troubles debugging it, you will introduce new bugs when fixing bugs after a long time, it is more expensive in terms of resources and time, and it annoys you if you are debugging your code and the debugger halts on the occurence of every exception ;)
If you are using exception handlers for control flow, you are being too general and lazy. As someone else mentioned, you know something happened if you are handling processing in the handler, but what exactly? Essentially you are using the exception for an else statement, if you are using it for control flow.
If you don't know what possible state could occur, then you can use an exception handler for unexpected states, for example when you have to use a third-party library, or you have to catch everything in the UI to show a nice error message and log the exception.
However, if you do know what might go wrong, and you don't put an if statement or something to check for it, then you are just being lazy. Allowing the exception handler to be the catch-all for stuff you know could happen is lazy, and it will come back to haunt you later, because you will be trying to fix a situation in your exception handler based on a possibly false assumption.
If you put logic in your exception handler to determine what exactly happened, then you would be quite stupid for not putting that logic inside the try block.
Exception handlers are the last resort, for when you run out of ideas/ways to stop something from going wrong, or things are beyond your ability to control. Like, the server is down and times out and you can't prevent that exception from being thrown.
Finally, having all the checks done up front shows what you know or expect will occur and makes it explicit. Code should be clear in intent. What would you rather read?
You can use a hammer's claw to turn a screw, just like you can use exceptions for control flow. That doesn't mean it is the intended usage of the feature. The if statement expresses conditions, whose intended usage is controlling flow.
If you are using a feature in an unintended way while choosing to not use the feature designed for that purpose, there will be an associated cost. In this case, clarity and performance suffer for no real added value. What does using exceptions buy you over the widely-accepted if statement?
Said another way: just because you can doesn't mean you should.
As others have mentioned numerously, the principle of least astonishment will forbid that you use exceptions excessively for control flow only purposes. On the other hand, no rule is 100% correct, and there are always those cases where an exception is "just the right tool" - much like goto itself, by the way, which ships in the form of break and continue in languages like Java, which are often the perfect way to jump out of heavily nested loops, which aren't always avoidable.
The following blog post explains a rather complex but also rather interesting use-case for a non-local ControlFlowException:
http://blog.jooq.org/2013/04/28/rare-uses-of-a-controlflowexception
It explains how inside of jOOQ (a SQL abstraction library for Java), such exceptions are occasionally used to abort the SQL rendering process early when some "rare" condition is met.
Examples of such conditions are:
Too many bind values are encountered. Some databases do not support arbitrary numbers of bind values in their SQL statements (SQLite: 999, Ingres 10.1.0: 1024, Sybase ASE 15.5: 2000, SQL Server 2008: 2100). In those cases, jOOQ aborts the SQL rendering phase and re-renders the SQL statement with inlined bind values. Example:
// Pseudo-code attaching a "handler" that will
// abort query rendering once the maximum number
// of bind values was exceeded:
context.attachBindValueCounter();
String sql;
try {
// In most cases, this will succeed:
sql = query.render();
}
catch (ReRenderWithInlinedVariables e) {
sql = query.renderWithInlinedBindValues();
}
If we explicitly extracted the bind values from the query AST to count them every time, we'd waste valuable CPU cycles for those 99.9% of the queries that don't suffer from this problem.
Some logic is available only indirectly via an API that we want to execute only "partially". The UpdatableRecord.store() method generates an INSERT or UPDATE statement, depending on the Record's internal flags. From the "outside", we don't know what kind of logic is contained in store() (e.g. optimistic locking, event listener handling, etc.) so we don't want to repeat that logic when we store several records in a batch statement, where we'd like to have store() only generate the SQL statement, not actually execute it. Example:
// Pseudo-code attaching a "handler" that will
// prevent query execution and throw exceptions
// instead:
context.attachQueryCollector();
// Collect the SQL for every store operation
for (int i = 0; i < records.length; i++) {
try {
records[i].store();
}
// The attached handler will result in this
// exception being thrown rather than actually
// storing records to the database
catch (QueryCollectorException e) {
// The exception is thrown after the rendered
// SQL statement is available
queries.add(e.query());
}
}
If we had externalised the store() logic into "re-usable" API that can be customised to optionally not execute the SQL, we'd be looking into creating a rather hard to maintain, hardly re-usable API.
Conclusion
In essence, our usage of these non-local gotos is just along the lines of what [Mason Wheeler][5] said in his answer:
"I just encountered a situation that I cannot deal with properly at this point, because I don't have enough context to handle it, but the routine that called me (or something further up the call stack) ought to know how to handle it."
Both usages of ControlFlowExceptions were rather easy to implement compared to their alternatives, allowing us to reuse a wide range of logic without refactoring it out of the relevant internals.
But the feeling of this being a bit of a surprise to future maintainers remains. The code feels rather delicate and while it was the right choice in this case, we'd always prefer not to use exceptions for local control flow, where it is easy to avoid using ordinary branching through if - else.
Typically there is nothing wrong, per se, with handling an exception at a low level. An exception IS a valid message that provides a lot of detail for why an operation cannot be performed. And if you can handle it, you ought to.
In general if you know there is a high probability of failure that you can check for... you should do the check... i.e. if(obj != null) obj.method()
In your case, i'm not familiar enough with the C# library to know if date time has an easy way to check whether a timestamp is out of bounds. If it does, just call if(.isvalid(ts))
otherwise your code is basically fine.
So, basically it comes down to whichever way creates cleaner code... if the operation to guard against an expected exception is more complex than just handling the exception; than you have my permission to handle the exception instead of creating complex guards everywhere.
You might be interested in having a look at Common Lisp's condition system which is a sort of generalization of exceptions done right. Because you can unwind the stack or not in a controlled way, you get "restarts" as well, which are extremely handy.
This doesn't have anything much to do with best practices in other languages, but it shows you what can be done with some design thought in (roughly) the direction you are thinking of.
Of course there are still performance considerations if you're bouncing up and down the stack like a yo-yo, but it's a much more general idea than "oh crap, lets bail" kind of approach that most catch/throw exception systems embody.
I don't think there is anything wrong with using Exceptions for flow-control. Exceptions are somewhat similar to continuations and in statically typed languages, Exceptions are more powerful than continuations, so, if you need continuations but your language doesn't have them, you can use Exceptions to implement them.
Well, actually, if you need continuations and your language doesn't have them, you chose the wrong language and you should rather be using a different one. But sometimes you don't have a choice: client-side web programming is the prime example – there's just no way to get around JavaScript.
An example: Microsoft Volta is a project to allow writing web applications in straight-forward .NET, and let the framework take care of figuring out which bits need to run where. One consequence of this is that Volta needs to be able to compile CIL to JavaScript, so that you can run code on the client. However, there is a problem: .NET has multithreading, JavaScript doesn't. So, Volta implements continuations in JavaScript using JavaScript Exceptions, then implements .NET Threads using those continuations. That way, Volta applications that use threads can be compiled to run in an unmodified browser – no Silverlight needed.
But you won't always know what happens in the Method/s that you call. You won't know exactly where the exception was thrown. Without examining the exception object in greater detail....
I feel that there is nothing wrong with your example. On the contrary, it would be a sin to ignore the exception thrown by the called function.
In the JVM, throwing an exception is not that expensive, only creating the exception with new xyzException(...), because the latter involves a stack walk. So if you have some exceptions created in advance, you may throw them many times without costs. Of course, this way you can't pass data along with the exception, but I think that is a bad thing to do anyway.
There are a few general mechanisms via which a language could allow for a method to exit without returning a value and unwind to the next "catch" block:
Have the method examine the stack frame to determine the call site, and use the metadata for the call site to find either information about a try block within the calling method, or the location where the calling method stored the address of its caller; in the latter situation, examine metadata for the caller's caller to determine in the same fashion as the immediate caller, repeating until one finds a try block or the stack is empty. This approach adds very little overhead to the no-exception case (it does preclude some optimizations) but is expensive when an exception occurs.
Have the method return a "hidden" flag which distinguishes a normal return from an exception, and have the caller check that flag and branch to an "exception" routine if it's set. This routine adds 1-2 instructions to the no-exception case, but relatively little overhead when an exception occurs.
Have the caller place exception-handling information or code at a fixed address relative to the stacked return address. For example, with the ARM, instead of using the instruction "BL subroutine", one could use the sequence:
adr lr,next_instr
b subroutine
b handle_exception
next_instr:
To exit normally, the subroutine would simply do bx lr or pop {pc}; in case of an abnormal exit, the subroutine would either subtract 4 from LR before performing the return or use sub lr,#4,pc (depending upon the ARM variation, execution mode, etc.) This approach will malfunction very badly if the caller is not designed to accommodate it.
A language or framework which uses checked exceptions might benefit from having those handled with a mechanism like #2 or #3 above, while unchecked exceptions are handled using #1. Although the implementation of checked exceptions in Java is rather nuisancesome, they would not be a bad concept if there were a means by which a call site could say, essentially, "This method is declared as throwing XX, but I don't expect it ever to do so; if it does, rethrow as an "unchecked" exception. In a framework where checked exceptions were handled in such fashion, they could be an effective means of flow control for things like parsing methods which in some contexts may have a high likelihood of failure, but where failure should return fundamentally different information than success. I'm unaware of any frameworks that use such a pattern, however. Instead, the more common pattern is to use the first approach above (minimal cost for the no-exception case, but high cost when exceptions are thrown) for all exceptions.
One aesthetic reason:
A try always comes with a catch, whereas an if doesn't have to come with an else.
if (PerformCheckSucceeded())
DoSomething();
With try/catch, it becomes much more verbose.
try
{
PerformCheckSucceeded();
DoSomething();
}
catch
{
}
That's 6 lines of code too many.
When I ran ReSharper on my code, for example:
if (some condition)
{
Some code...
}
ReSharper gave me the above warning (Invert "if" statement to reduce nesting), and suggested the following correction:
if (!some condition) return;
Some code...
I would like to understand why that's better. I always thought that using "return" in the middle of a method problematic, somewhat like "goto".
It is not only aesthetic, but it also reduces the maximum nesting level inside the method. This is generally regarded as a plus because it makes methods easier to understand (and indeed, many static analysis tools provide a measure of this as one of the indicators of code quality).
On the other hand, it also makes your method have multiple exit points, something that another group of people believes is a no-no.
Personally, I agree with ReSharper and the first group (in a language that has exceptions I find it silly to discuss "multiple exit points"; almost anything can throw, so there are numerous potential exit points in all methods).
Regarding performance: both versions should be equivalent (if not at the IL level, then certainly after the jitter is through with the code) in every language. Theoretically this depends on the compiler, but practically any widely used compiler of today is capable of handling much more advanced cases of code optimization than this.
A return in the middle of the method is not necessarily bad. It might be better to return immediately if it makes the intent of the code clearer. For example:
double getPayAmount() {
double result;
if (_isDead) result = deadAmount();
else {
if (_isSeparated) result = separatedAmount();
else {
if (_isRetired) result = retiredAmount();
else result = normalPayAmount();
};
}
return result;
};
In this case, if _isDead is true, we can immediately get out of the method. It might be better to structure it this way instead:
double getPayAmount() {
if (_isDead) return deadAmount();
if (_isSeparated) return separatedAmount();
if (_isRetired) return retiredAmount();
return normalPayAmount();
};
I've picked this code from the refactoring catalog. This specific refactoring is called: Replace Nested Conditional with Guard Clauses.
This is a bit of a religious argument, but I agree with ReSharper that you should prefer less nesting. I believe that this outweighs the negatives of having multiple return paths from a function.
The key reason for having less nesting is to improve code readability and maintainability. Remember that many other developers will need to read your code in the future, and code with less indentation is generally much easier to read.
Preconditions are a great example of where it is okay to return early at the start of the function. Why should the readability of the rest of the function be affected by the presence of a precondition check?
As for the negatives about returning multiple times from a method - debuggers are pretty powerful now, and it's very easy to find out exactly where and when a particular function is returning.
Having multiple returns in a function is not going to affect the maintainance programmer's job.
Poor code readability will.
As others have mentioned, there shouldn't be a performance hit, but there are other considerations. Aside from those valid concerns, this also can open you up to gotchas in some circumstances. Suppose you were dealing with a double instead:
public void myfunction(double exampleParam){
if(exampleParam > 0){
//Body will *not* be executed if Double.IsNan(exampleParam)
}
}
Contrast that with the seemingly equivalent inversion:
public void myfunction(double exampleParam){
if(exampleParam <= 0)
return;
//Body *will* be executed if Double.IsNan(exampleParam)
}
So in certain circumstances what appears to be a a correctly inverted if might not be.
The idea of only returning at the end of a function came back from the days before languages had support for exceptions. It enabled programs to rely on being able to put clean-up code at the end of a method, and then being sure it would be called and some other programmer wouldn't hide a return in the method that caused the cleanup code to be skipped. Skipped cleanup code could result in a memory or resource leak.
However, in a language that supports exceptions, it provides no such guarantees. In a language that supports exceptions, the execution of any statement or expression can cause a control flow that causes the method to end. This means clean-up must be done through using the finally or using keywords.
Anyway, I'm saying I think a lot of people quote the 'only return at the end of a method' guideline without understanding why it was ever a good thing to do, and that reducing nesting to improve readability is probably a better aim.
I'd like to add that there is name for those inverted if's - Guard Clause. I use it whenever I can.
I hate reading code where there is if at the beginning, two screens of code and no else. Just invert if and return. That way nobody will waste time scrolling.
http://c2.com/cgi/wiki?GuardClause
It doesn't only affect aesthetics, but it also prevents code nesting.
It can actually function as a precondition to ensure that your data is valid as well.
This is of course subjective, but I think it strongly improves on two points:
It is now immediately obvious that your function has nothing left to do if condition holds.
It keeps the nesting level down. Nesting hurts readability more than you'd think.
Multiple return points were a problem in C (and to a lesser extent C++) because they forced you to duplicate clean-up code before each of the return points. With garbage collection, the try | finally construct and using blocks, there's really no reason why you should be afraid of them.
Ultimately it comes down to what you and your colleagues find easier to read.
Guard clauses or pre-conditions (as you can probably see) check to see if a certain condition is met and then breaks the flow of the program. They're great for places where you're really only interested in one outcome of an if statement. So rather than say:
if (something) {
// a lot of indented code
}
You reverse the condition and break if that reversed condition is fulfilled
if (!something) return false; // or another value to show your other code the function did not execute
// all the code from before, save a lot of tabs
return is nowhere near as dirty as goto. It allows you to pass a value to show the rest of your code that the function couldn't run.
You'll see the best examples of where this can be applied in nested conditions:
if (something) {
do-something();
if (something-else) {
do-another-thing();
} else {
do-something-else();
}
}
vs:
if (!something) return;
do-something();
if (!something-else) return do-something-else();
do-another-thing();
You'll find few people arguing the first is cleaner but of course, it's completely subjective. Some programmers like to know what conditions something is operating under by indentation, while I'd much rather keep method flow linear.
I won't suggest for one moment that precons will change your life or get you laid but you might find your code just that little bit easier to read.
Performance-wise, there will be no noticeable difference between the two approaches.
But coding is about more than performance. Clarity and maintainability are also very important. And, in cases like this where it doesn't affect performance, it is the only thing that matters.
There are competing schools of thought as to which approach is preferable.
One view is the one others have mentioned: the second approach reduces the nesting level, which improves code clarity. This is natural in an imperative style: when you have nothing left to do, you might as well return early.
Another view, from the perspective of a more functional style, is that a method should have only one exit point. Everything in a functional language is an expression. So if statements must always have an else clauses. Otherwise the if expression wouldn't always have a value. So in the functional style, the first approach is more natural.
There are several good points made here, but multiple return points can be unreadable as well, if the method is very lengthy. That being said, if you're going to use multiple return points just make sure that your method is short, otherwise the readability bonus of multiple return points may be lost.
Performance is in two parts. You have performance when the software is in production, but you also want to have performance while developing and debugging. The last thing a developer wants is to "wait" for something trivial. In the end, compiling this with optimization enabled will result in similar code. So it's good to know these little tricks that pay off in both scenarios.
The case in the question is clear, ReSharper is correct. Rather than nesting if statements, and creating new scope in code, you're setting a clear rule at the start of your method. It increases readability, it will be easier to maintain, and it reduces the amount of rules one has to sift through to find where they want to go.
Personally I prefer only 1 exit point. It's easy to accomplish if you keep your methods short and to the point, and it provides a predictable pattern for the next person who works on your code.
eg.
bool PerformDefaultOperation()
{
bool succeeded = false;
DataStructure defaultParameters;
if ((defaultParameters = this.GetApplicationDefaults()) != null)
{
succeeded = this.DoSomething(defaultParameters);
}
return succeeded;
}
This is also very useful if you just want to check the values of certain local variables within a function before it exits. All you need to do is place a breakpoint on the final return and you are guaranteed to hit it (unless an exception is thrown).
Avoiding multiple exit points can lead to performance gains. I am not sure about C# but in C++ the Named Return Value Optimization (Copy Elision, ISO C++ '03 12.8/15) depends on having a single exit point. This optimization avoids copy constructing your return value (in your specific example it doesn't matter). This could lead to considerable gains in performance in tight loops, as you are saving a constructor and a destructor each time the function is invoked.
But for 99% of the cases saving the additional constructor and destructor calls is not worth the loss of readability nested if blocks introduce (as others have pointed out).
Many good reasons about how the code looks like. But what about results?
Let's take a look to some C# code and its IL compiled form:
using System;
public class Test {
public static void Main(string[] args) {
if (args.Length == 0) return;
if ((args.Length+2)/3 == 5) return;
Console.WriteLine("hey!!!");
}
}
This simple snippet can be compiled. You can open the generated .exe file with ildasm and check what is the result. I won't post all the assembler thing but I'll describe the results.
The generated IL code does the following:
If the first condition is false, jumps to the code where the second is.
If it's true jumps to the last instruction. (Note: the last instruction is a return).
In the second condition the same happens after the result is calculated. Compare and: got to the Console.WriteLine if false or to the end if this is true.
Print the message and return.
So it seems that the code will jump to the end. What if we do a normal if with nested code?
using System;
public class Test {
public static void Main(string[] args) {
if (args.Length != 0 && (args.Length+2)/3 != 5)
{
Console.WriteLine("hey!!!");
}
}
}
The results are quite similar in IL instructions. The difference is that before there were two jumps per condition: if false go to next piece of code, if true go to the end. And now the IL code flows better and has 3 jumps (the compiler optimized this a bit):
First jump: when Length is 0 to a part where the code jumps again (Third jump) to the end.
Second: in the middle of the second condition to avoid one instruction.
Third: if the second condition is false, jump to the end.
Anyway, the program counter will always jump.
In theory, inverting if could lead to better performance if it increases branch prediction hit rate. In practice, I think it is very hard to know exactly how branch prediction will behave, especially after compiling, so I would not do it in my day-to-day development, except if I am writing assembly code.
More on branch prediction here.
That is simply controversial. There is no "agreement among programmers" on the question of early return. It's always subjective, as far as I know.
It's possible to make a performance argument, since it's better to have conditions that are written so they are most often true; it can also be argued that it is clearer. It does, on the other hand, create nested tests.
I don't think you will get a conclusive answer to this question.
There are a lot of insightful answers there already, but still, I would to direct to a slightly different situation: Instead of precondition, that should be put on top of a function indeed, think of a step-by-step initialization, where you have to check for each step to succeed and then continue with the next. In this case, you cannot check everything at the top.
I found my code really unreadable when writing an ASIO host application with Steinberg's ASIOSDK, as I followed the nesting paradigm. It went like eight levels deep, and I cannot see a design flaw there, as mentioned by Andrew Bullock above. Of course, I could have packed some inner code to another function, and then nested the remaining levels there to make it more readable, but this seems rather random to me.
By replacing nesting with guard clauses, I even discovered a misconception of mine regarding a portion of cleanup-code that should have occurred much earlier within the function instead of at the end. With nested branches, I would never have seen that, you could even say they led to my misconception.
So this might be another situation where inverted ifs can contribute to a clearer code.
It's a matter of opinion.
My normal approach would be to avoid single line ifs, and returns in the middle of a method.
You wouldn't want lines like it suggests everywhere in your method but there is something to be said for checking a bunch of assumptions at the top of your method, and only doing your actual work if they all pass.
In my opinion early return is fine if you are just returning void (or some useless return code you're never gonna check) and it might improve readability because you avoid nesting and at the same time you make explicit that your function is done.
If you are actually returning a returnValue - nesting is usually a better way to go cause you return your returnValue just in one place (at the end - duh), and it might make your code more maintainable in a whole lot of cases.
I'm not sure, but I think, that R# tries to avoid far jumps. When You have IF-ELSE, compiler does something like this:
Condition false -> far jump to false_condition_label
true_condition_label:
instruction1
...
instruction_n
false_condition_label:
instruction1
...
instruction_n
end block
If condition is true there is no jump and no rollout L1 cache, but jump to false_condition_label can be very far and processor must rollout his own cache. Synchronising cache is expensive. R# tries replace far jumps into short jumps and in this case there is bigger probability, that all instructions are already in cache.
I think it depends on what you prefer, as mentioned, theres no general agreement afaik.
To reduce annoyment, you may reduce this kind of warning to "Hint"
My idea is that the return "in the middle of a function" shouldn't be so "subjective".
The reason is quite simple, take this code:
function do_something( data ){
if (!is_valid_data( data ))
return false;
do_something_that_take_an_hour( data );
istance = new object_with_very_painful_constructor( data );
if ( istance is not valid ) {
error_message( );
return ;
}
connect_to_database ( );
get_some_other_data( );
return;
}
Maybe the first "return" it's not SO intuitive, but that's really saving.
There are too many "ideas" about clean codes, that simply need more practise to lose their "subjective" bad ideas.
There are several advantages to this sort of coding but for me the big win is, if you can return quick you can improve the speed of your application. IE I know that because of Precondition X that I can return quickly with an error. This gets rid of the error cases first and reduces the complexity of your code. In a lot of cases because the cpu pipeline can be now be cleaner it can stop pipeline crashes or switches. Secondly if you are in a loop, breaking or returning out quickly can save you a lots of cpu. Some programmers use loop invariants to do this sort of quick exit but in this you can broke your cpu pipeline and even create memory seek problem and mean the the cpu needs to load from outside cache. But basically I think you should do what you intended, that is end the loop or function not create a complex code path just to implement some abstract notion of correct code. If the only tool you have is a hammer then everything looks like a nail.
if I have a specific exception that I expect when it is going to occur;
and to handle it for example I chose to display an error message upon its occurance, which would be better to do, and why?
Explanatory code:
try
{
string result = dictionary[key];
}
catch (KeyNotFoundException e)
{
//display error
}
or:
if(!dictionary.ContainsKey(key))
{
//display error
}
Generally, exceptions are used to indicate exceptional conditions - something that would not normally occur, but your program still needs to handle gracefully (eg a file being inaccessible or readonly, a network connection going down). Normal control flow, like checking for a value in a dictionary, should not use exceptions if there is an equivalent function that has the same effect without using exceptions.
Having extra try/catch statements in the code also makes it less readable, and having exception handlers around a block of code puts certain limitations on the CLR that can cause worse performance.
In your example, if it is expected that the dictionary will have a certain key value, I would do something like
string result;
if (!dictionary.TryGetValue(key, out result)
{
// display error
return; // or throw a specific exception if it really is a fatal error
}
// continue normal processing
This is a lot clearer than just having an exception handler round an element access
Neither.
The second option is better than the first. As you expect this to happen normally, it's better to avoid the exception. Exceptions should preferrably only be used for exceptional situations, i.e. something that you can't easily predict and test for.
The best option however is the TryGetValue method, as it does both check and fetch:
if (dictionary.TryGetValue(key, out result)) {
// use the result
} else {
// display error
}
The second approuch is better. Exception throwing can be very expensive.
The second approach is probably better. Remember that exceptions are used for exceptional circumstances. Use this principle to guide your decision:
If you require that the key exists in the dictionary as an application invariant, then assume that it is there and deal with the exception if it isn't there.
If your application code doesn't require that the entry exist in the dictionary, then call ContainsKey() first.
My guess is that the latter is probably the correct course of action.
Disclaimer: I generally eschew the advice that performance should be the primary consideration here. Only let performance impact your decision once you have proven that you have a bottleneck! Anything before that is premature optimization and will lead to unnecessarily complication application code.
The second approach is better for at least 3 reasons:
1) It is clearer. As a reader of your code, I expect an exception to indicate that something has gone wrong, even if it's handled.
2) When debugging with Visual Studio it's common to break on all exceptions, that makes it mildly annoying to deal with code which always throws an exception.
3) The second version is faster, but the effect is very small unless you are throwing many exceptions a second in a time-critical piece of code.
The second approach is better because throwing and hanlding exception has its performance hit. Throw rates above 100 per second are likely to noticeably impact the perfor-
mance of most applications. Consider Exceptions and Performance.
Exception handling is most useful when you need to provide an easy way out of a difficult situation - it can greatly simplify the code and decrease the potential for corner-case bugs.
It offers little advantage in very simple situations like this, and due to its performance penalty should not be used in such cases.
It all depends on what you're application is doing and what the specific code is doing.
As thecoop says exceptions should be used for exceptional conditions. As an illustration take writing to a file. If checking for the existence of the file would slow your application down and/or the absence of the file is a serious problem then allow the exception to occur and trap that. If it's not so critical or recreating the file isn't a problem then do the file existence check first.
I've seen it argued that all error handling should be done via exceptions (most recently in Clean Code by Robert Martin) but I don't agree.
It depends greatly on what the dictionary is meant to do. If there is a high chance that the key will not be found because of program design, then you should do the trygetvalue. However, if the design of your program is such that not finding the key is an exceptional event, you should use the exception handler method.
Just a reminder. It is not an exception.
An exception is something sounds like "no /etc on my UNIX machine".
If you get wrong sense, you'll write wrong code shown as above.
When I ran ReSharper on my code, for example:
if (some condition)
{
Some code...
}
ReSharper gave me the above warning (Invert "if" statement to reduce nesting), and suggested the following correction:
if (!some condition) return;
Some code...
I would like to understand why that's better. I always thought that using "return" in the middle of a method problematic, somewhat like "goto".
It is not only aesthetic, but it also reduces the maximum nesting level inside the method. This is generally regarded as a plus because it makes methods easier to understand (and indeed, many static analysis tools provide a measure of this as one of the indicators of code quality).
On the other hand, it also makes your method have multiple exit points, something that another group of people believes is a no-no.
Personally, I agree with ReSharper and the first group (in a language that has exceptions I find it silly to discuss "multiple exit points"; almost anything can throw, so there are numerous potential exit points in all methods).
Regarding performance: both versions should be equivalent (if not at the IL level, then certainly after the jitter is through with the code) in every language. Theoretically this depends on the compiler, but practically any widely used compiler of today is capable of handling much more advanced cases of code optimization than this.
A return in the middle of the method is not necessarily bad. It might be better to return immediately if it makes the intent of the code clearer. For example:
double getPayAmount() {
double result;
if (_isDead) result = deadAmount();
else {
if (_isSeparated) result = separatedAmount();
else {
if (_isRetired) result = retiredAmount();
else result = normalPayAmount();
};
}
return result;
};
In this case, if _isDead is true, we can immediately get out of the method. It might be better to structure it this way instead:
double getPayAmount() {
if (_isDead) return deadAmount();
if (_isSeparated) return separatedAmount();
if (_isRetired) return retiredAmount();
return normalPayAmount();
};
I've picked this code from the refactoring catalog. This specific refactoring is called: Replace Nested Conditional with Guard Clauses.
This is a bit of a religious argument, but I agree with ReSharper that you should prefer less nesting. I believe that this outweighs the negatives of having multiple return paths from a function.
The key reason for having less nesting is to improve code readability and maintainability. Remember that many other developers will need to read your code in the future, and code with less indentation is generally much easier to read.
Preconditions are a great example of where it is okay to return early at the start of the function. Why should the readability of the rest of the function be affected by the presence of a precondition check?
As for the negatives about returning multiple times from a method - debuggers are pretty powerful now, and it's very easy to find out exactly where and when a particular function is returning.
Having multiple returns in a function is not going to affect the maintainance programmer's job.
Poor code readability will.
As others have mentioned, there shouldn't be a performance hit, but there are other considerations. Aside from those valid concerns, this also can open you up to gotchas in some circumstances. Suppose you were dealing with a double instead:
public void myfunction(double exampleParam){
if(exampleParam > 0){
//Body will *not* be executed if Double.IsNan(exampleParam)
}
}
Contrast that with the seemingly equivalent inversion:
public void myfunction(double exampleParam){
if(exampleParam <= 0)
return;
//Body *will* be executed if Double.IsNan(exampleParam)
}
So in certain circumstances what appears to be a a correctly inverted if might not be.
The idea of only returning at the end of a function came back from the days before languages had support for exceptions. It enabled programs to rely on being able to put clean-up code at the end of a method, and then being sure it would be called and some other programmer wouldn't hide a return in the method that caused the cleanup code to be skipped. Skipped cleanup code could result in a memory or resource leak.
However, in a language that supports exceptions, it provides no such guarantees. In a language that supports exceptions, the execution of any statement or expression can cause a control flow that causes the method to end. This means clean-up must be done through using the finally or using keywords.
Anyway, I'm saying I think a lot of people quote the 'only return at the end of a method' guideline without understanding why it was ever a good thing to do, and that reducing nesting to improve readability is probably a better aim.
I'd like to add that there is name for those inverted if's - Guard Clause. I use it whenever I can.
I hate reading code where there is if at the beginning, two screens of code and no else. Just invert if and return. That way nobody will waste time scrolling.
http://c2.com/cgi/wiki?GuardClause
It doesn't only affect aesthetics, but it also prevents code nesting.
It can actually function as a precondition to ensure that your data is valid as well.
This is of course subjective, but I think it strongly improves on two points:
It is now immediately obvious that your function has nothing left to do if condition holds.
It keeps the nesting level down. Nesting hurts readability more than you'd think.
Multiple return points were a problem in C (and to a lesser extent C++) because they forced you to duplicate clean-up code before each of the return points. With garbage collection, the try | finally construct and using blocks, there's really no reason why you should be afraid of them.
Ultimately it comes down to what you and your colleagues find easier to read.
Guard clauses or pre-conditions (as you can probably see) check to see if a certain condition is met and then breaks the flow of the program. They're great for places where you're really only interested in one outcome of an if statement. So rather than say:
if (something) {
// a lot of indented code
}
You reverse the condition and break if that reversed condition is fulfilled
if (!something) return false; // or another value to show your other code the function did not execute
// all the code from before, save a lot of tabs
return is nowhere near as dirty as goto. It allows you to pass a value to show the rest of your code that the function couldn't run.
You'll see the best examples of where this can be applied in nested conditions:
if (something) {
do-something();
if (something-else) {
do-another-thing();
} else {
do-something-else();
}
}
vs:
if (!something) return;
do-something();
if (!something-else) return do-something-else();
do-another-thing();
You'll find few people arguing the first is cleaner but of course, it's completely subjective. Some programmers like to know what conditions something is operating under by indentation, while I'd much rather keep method flow linear.
I won't suggest for one moment that precons will change your life or get you laid but you might find your code just that little bit easier to read.
Performance-wise, there will be no noticeable difference between the two approaches.
But coding is about more than performance. Clarity and maintainability are also very important. And, in cases like this where it doesn't affect performance, it is the only thing that matters.
There are competing schools of thought as to which approach is preferable.
One view is the one others have mentioned: the second approach reduces the nesting level, which improves code clarity. This is natural in an imperative style: when you have nothing left to do, you might as well return early.
Another view, from the perspective of a more functional style, is that a method should have only one exit point. Everything in a functional language is an expression. So if statements must always have an else clauses. Otherwise the if expression wouldn't always have a value. So in the functional style, the first approach is more natural.
There are several good points made here, but multiple return points can be unreadable as well, if the method is very lengthy. That being said, if you're going to use multiple return points just make sure that your method is short, otherwise the readability bonus of multiple return points may be lost.
Performance is in two parts. You have performance when the software is in production, but you also want to have performance while developing and debugging. The last thing a developer wants is to "wait" for something trivial. In the end, compiling this with optimization enabled will result in similar code. So it's good to know these little tricks that pay off in both scenarios.
The case in the question is clear, ReSharper is correct. Rather than nesting if statements, and creating new scope in code, you're setting a clear rule at the start of your method. It increases readability, it will be easier to maintain, and it reduces the amount of rules one has to sift through to find where they want to go.
Personally I prefer only 1 exit point. It's easy to accomplish if you keep your methods short and to the point, and it provides a predictable pattern for the next person who works on your code.
eg.
bool PerformDefaultOperation()
{
bool succeeded = false;
DataStructure defaultParameters;
if ((defaultParameters = this.GetApplicationDefaults()) != null)
{
succeeded = this.DoSomething(defaultParameters);
}
return succeeded;
}
This is also very useful if you just want to check the values of certain local variables within a function before it exits. All you need to do is place a breakpoint on the final return and you are guaranteed to hit it (unless an exception is thrown).
Avoiding multiple exit points can lead to performance gains. I am not sure about C# but in C++ the Named Return Value Optimization (Copy Elision, ISO C++ '03 12.8/15) depends on having a single exit point. This optimization avoids copy constructing your return value (in your specific example it doesn't matter). This could lead to considerable gains in performance in tight loops, as you are saving a constructor and a destructor each time the function is invoked.
But for 99% of the cases saving the additional constructor and destructor calls is not worth the loss of readability nested if blocks introduce (as others have pointed out).
Many good reasons about how the code looks like. But what about results?
Let's take a look to some C# code and its IL compiled form:
using System;
public class Test {
public static void Main(string[] args) {
if (args.Length == 0) return;
if ((args.Length+2)/3 == 5) return;
Console.WriteLine("hey!!!");
}
}
This simple snippet can be compiled. You can open the generated .exe file with ildasm and check what is the result. I won't post all the assembler thing but I'll describe the results.
The generated IL code does the following:
If the first condition is false, jumps to the code where the second is.
If it's true jumps to the last instruction. (Note: the last instruction is a return).
In the second condition the same happens after the result is calculated. Compare and: got to the Console.WriteLine if false or to the end if this is true.
Print the message and return.
So it seems that the code will jump to the end. What if we do a normal if with nested code?
using System;
public class Test {
public static void Main(string[] args) {
if (args.Length != 0 && (args.Length+2)/3 != 5)
{
Console.WriteLine("hey!!!");
}
}
}
The results are quite similar in IL instructions. The difference is that before there were two jumps per condition: if false go to next piece of code, if true go to the end. And now the IL code flows better and has 3 jumps (the compiler optimized this a bit):
First jump: when Length is 0 to a part where the code jumps again (Third jump) to the end.
Second: in the middle of the second condition to avoid one instruction.
Third: if the second condition is false, jump to the end.
Anyway, the program counter will always jump.
In theory, inverting if could lead to better performance if it increases branch prediction hit rate. In practice, I think it is very hard to know exactly how branch prediction will behave, especially after compiling, so I would not do it in my day-to-day development, except if I am writing assembly code.
More on branch prediction here.
That is simply controversial. There is no "agreement among programmers" on the question of early return. It's always subjective, as far as I know.
It's possible to make a performance argument, since it's better to have conditions that are written so they are most often true; it can also be argued that it is clearer. It does, on the other hand, create nested tests.
I don't think you will get a conclusive answer to this question.
There are a lot of insightful answers there already, but still, I would to direct to a slightly different situation: Instead of precondition, that should be put on top of a function indeed, think of a step-by-step initialization, where you have to check for each step to succeed and then continue with the next. In this case, you cannot check everything at the top.
I found my code really unreadable when writing an ASIO host application with Steinberg's ASIOSDK, as I followed the nesting paradigm. It went like eight levels deep, and I cannot see a design flaw there, as mentioned by Andrew Bullock above. Of course, I could have packed some inner code to another function, and then nested the remaining levels there to make it more readable, but this seems rather random to me.
By replacing nesting with guard clauses, I even discovered a misconception of mine regarding a portion of cleanup-code that should have occurred much earlier within the function instead of at the end. With nested branches, I would never have seen that, you could even say they led to my misconception.
So this might be another situation where inverted ifs can contribute to a clearer code.
It's a matter of opinion.
My normal approach would be to avoid single line ifs, and returns in the middle of a method.
You wouldn't want lines like it suggests everywhere in your method but there is something to be said for checking a bunch of assumptions at the top of your method, and only doing your actual work if they all pass.
In my opinion early return is fine if you are just returning void (or some useless return code you're never gonna check) and it might improve readability because you avoid nesting and at the same time you make explicit that your function is done.
If you are actually returning a returnValue - nesting is usually a better way to go cause you return your returnValue just in one place (at the end - duh), and it might make your code more maintainable in a whole lot of cases.
I'm not sure, but I think, that R# tries to avoid far jumps. When You have IF-ELSE, compiler does something like this:
Condition false -> far jump to false_condition_label
true_condition_label:
instruction1
...
instruction_n
false_condition_label:
instruction1
...
instruction_n
end block
If condition is true there is no jump and no rollout L1 cache, but jump to false_condition_label can be very far and processor must rollout his own cache. Synchronising cache is expensive. R# tries replace far jumps into short jumps and in this case there is bigger probability, that all instructions are already in cache.
I think it depends on what you prefer, as mentioned, theres no general agreement afaik.
To reduce annoyment, you may reduce this kind of warning to "Hint"
My idea is that the return "in the middle of a function" shouldn't be so "subjective".
The reason is quite simple, take this code:
function do_something( data ){
if (!is_valid_data( data ))
return false;
do_something_that_take_an_hour( data );
istance = new object_with_very_painful_constructor( data );
if ( istance is not valid ) {
error_message( );
return ;
}
connect_to_database ( );
get_some_other_data( );
return;
}
Maybe the first "return" it's not SO intuitive, but that's really saving.
There are too many "ideas" about clean codes, that simply need more practise to lose their "subjective" bad ideas.
There are several advantages to this sort of coding but for me the big win is, if you can return quick you can improve the speed of your application. IE I know that because of Precondition X that I can return quickly with an error. This gets rid of the error cases first and reduces the complexity of your code. In a lot of cases because the cpu pipeline can be now be cleaner it can stop pipeline crashes or switches. Secondly if you are in a loop, breaking or returning out quickly can save you a lots of cpu. Some programmers use loop invariants to do this sort of quick exit but in this you can broke your cpu pipeline and even create memory seek problem and mean the the cpu needs to load from outside cache. But basically I think you should do what you intended, that is end the loop or function not create a complex code path just to implement some abstract notion of correct code. If the only tool you have is a hammer then everything looks like a nail.
I do not currently have this issue, but you never know, and thought experiments are always fun.
Ignoring the obvious problems that you would have to have with your architecture to even be attempting this, let's assume that you had some horribly-written code of someone else's design, and you needed to do a bunch of wide and varied operations in the same code block, e.g.:
WidgetMaker.SetAlignment(57);
contactForm["Title"] = txtTitle.Text;
Casserole.Season(true, false);
((RecordKeeper)Session["CasseroleTracker"]).Seasoned = true;
Multiplied by a hundred. Some of these might work, others might go badly wrong. What you need is the C# equivalent of "on error resume next", otherwise you're going to end up copying and pasting try-catches around the many lines of code.
How would you attempt to tackle this problem?
public delegate void VoidDelegate();
public static class Utils
{
public static void Try(VoidDelegate v) {
try {
v();
}
catch {}
}
}
Utils.Try( () => WidgetMaker.SetAlignment(57) );
Utils.Try( () => contactForm["Title"] = txtTitle.Text );
Utils.Try( () => Casserole.Season(true, false) );
Utils.Try( () => ((RecordKeeper)Session["CasseroleTracker"]).Seasoned = true );
Refactor into individual, well-named methods:
AdjustFormWidgets();
SetContactTitle(txtTitle.Text);
SeasonCasserole();
Each of those is protected appropriately.
I would say do nothing.
Yup thats right, do NOTHING.
You have clearly identified two things to me:
You know the architecture is borked.
There is a ton of this crap.
I say:
Do nothing.
Add a global error handler to send you an email every time it goes boom.
Wait until something falls over (or fails a test)
Correct that (Refactoring as necessary within the scope of the page).
Repeat every time a problem occurs.
You will have this cleared up in no time if it is that bad. Yeah I know it sounds sucky and you may be pulling your hair out with bugfixes to begin with, but it will allow you to fix the needy/buggy code before the (large) amount of code that may actually be working no matter how crappy it looks.
Once you start winning the war, you will have a better handle on the code (due to all your refactoring) you will have a better idea for a winning design for it..
Trying to wrap all of it in bubble wrap is probably going to take just a long to do and you will still not be any closer to fixing the problems.
It's pretty obvious that you'd write the code in VB.NET, which actually does have On Error Resume Next, and export it in a DLL to C#. Anything else is just being a glutton
for punishment.
Fail Fast
To elaborate, I guess I am questioning the question. If an exception is thrown, why would you want your code to simply continue as if nothing has happened? Either you expect exceptions in certain situations, in which case you write a try-catch block around that code and handle them, or there is an unexpected error, in which case you should prefer your application to abort, or retry, or fail. Not carry on like a wounded zombie moaning 'brains'.
This is one of the things that having a preprocessor is useful for. You could define a macro that swallows exceptions, then with a quick script add that macro to all lines.
So, if this were C++, you could do something like this:
#define ATTEMPT(x) try { x; } catch (...) { }
// ...
ATTEMPT(WidgetMaker.SetAlignment(57));
ATTEMPT(contactForm["Title"] = txtTitle.Text);
ATTEMPT(Casserole.Season(true, false));
ATTEMPT(((RecordKeeper)Session["CasseroleTracker"]).Seasoned = true);
Unfortunately, not many languages seem to include a preprocessor like C/C++ did.
You could create your own preprocessor and add it as a pre-build step. If you felt like completely automating it you could probably write a preprocessor that would take the actual code file and add the try/catch stuff in on its own (so you don't have to add those ATTEMPT() blocks to the code manually). Making sure it only modified the lines it's supposed to could be difficult though (have to skip variable declarations, loop constructs, etc to that you don't break the build).
However, I think these are horrible ideas and should never be done, but the question was asked. :)
Really, you shouldn't ever do this. You need to find what's causing the error and fix it. Swallowing/ignoring errors is a bad thing to do, so I think the correct answer here is "Fix the bug, don't ignore it!". :)
On Error Resume Next is a really bad idea in the C# world. Nor would adding the equivalent to On Error Resume Next actually help you. All it would do is leave you in a bad state which could cause more subtle errors, data loss and possibly data corruption.
But to give the questioner his due, you could add a global handler and check the TargetSite to see which method borked. Then you could at least know what line it borked on. The next part would be to try and figure out how to set the "next statement" the same way the debugger does it. Hopefully your stack won't have unwound at this point or you can re-create it, but it's certainly worth a shot. However, given this approach the code would have to run in Debug mode every time so that you would have your debug symbols included.
As someone mentioned, VB allows this. How about doing it the same way in C#? Enter trusty reflector:
This:
Sub Main()
On Error Resume Next
Dim i As Integer = 0
Dim y As Integer = CInt(5 / i)
End Sub
Translates into this:
public static void Main()
{
// This item is obfuscated and can not be translated.
int VB$ResumeTarget;
try
{
int VB$CurrentStatement;
Label_0001:
ProjectData.ClearProjectError();
int VB$ActiveHandler = -2;
Label_0009:
VB$CurrentStatement = 2;
int i = 0;
Label_000E:
VB$CurrentStatement = 3;
int y = (int) Math.Round((double) (5.0 / ((double) i)));
goto Label_008F;
Label_0029:
VB$ResumeTarget = 0;
switch ((VB$ResumeTarget + 1))
{
case 1:
goto Label_0001;
case 2:
goto Label_0009;
case 3:
goto Label_000E;
case 4:
goto Label_008F;
default:
goto Label_0084;
}
Label_0049:
VB$ResumeTarget = VB$CurrentStatement;
switch (((VB$ActiveHandler > -2) ? VB$ActiveHandler : 1))
{
case 0:
goto Label_0084;
case 1:
goto Label_0029;
}
}
catch (object obj1) when (?)
{
ProjectData.SetProjectError((Exception) obj1);
goto Label_0049;
}
Label_0084:
throw ProjectData.CreateProjectError(-2146828237);
Label_008F:
if (VB$ResumeTarget != 0)
{
ProjectData.ClearProjectError();
}
}
Rewrite the code. Try to find sets of statements which logically depend on each other, so that if one fails then the next ones make no sense, and hive them off into their own functions and put try-catches round them, if you want to ignore the result of that and continue.
This may help you in identifing the pieces that have the most problems.
# JB King
Thanks for reminding me. The Logging application block has a Instrumentation Event that can be used to trace events, you can find more info on the MS Enterprise library docs.
Using (New InstEvent)
<series of statements>
End Using
All of the steps in this using will be traced to a log file, and you can parse that out to see where the log breaks (ex is thrown) and id the high offenders.
Refactoring is really your best bet, but if you have a lot, this may help you pinpoint the worst offenders.
You could use goto, but it's still messy.
I've actually wanted a sort of single statement try-catch for a while. It would be helpful in certain cases, like adding logging code or something that you don't want to interrupt the main program flow if it fails.
I suspect something could be done with some of the features associated with linq, but don't really have time to look into it at the moment. If you could just find a way to wrap a statement as an anonymous function, then use another one to call that within a try-catch block it would work... but not sure if that's possible just yet.
If you can get the compiler to give you an expression tree for this code, then you could modify that expression tree by replacing each statement with a new try-catch block that wraps the original statement. This isn't as far-fetched as it sounds; for LINQ, C# acquired the ability to capture lambda expressions as expression trees that can be manipulated in user code at runtime.
This approach is not possible today with .NET 3.5 -- if for no other reason than the lack of a "try" statement in System.Linq.Expressions. However, it may very well be viable in a future version of C# once the merge of the DLR and LINQ expression trees is complete.
Why not use the reflection in c#? You could create a class that reflects on the code and use line #s as the hint for what to put in each individual try/catch block. This has a few advantages:
Its slightly less ugly as it doesn't really you require mangle your source code and you can use it only during debug modes.
You learn something interesting about c# while implementing it.
I however would recommend against any of this, unless of course you are taking over maintance of someelses work and you need to get a handle on the exceptions so you can fix them. Might be fun to write though.
Fun question; very terrible.
It'd be nice if you could use a macro. But this is blasted C#, so you might solve it with some preprocessor work or some external tool to wrap your lines in individual try-catch blocks. Not sure if you meant you didn't want to manually wrap them or that you wanted to avoid try-catch entirely.
Messing around with this, I tried labeling every line and jumping back from a single catch, without much luck. However, Christopher uncovered the correct way to do this. There's some interesting additional discussion of this at Dot Net Thoughts and at Mike Stall's .NET Blog.
EDIT: Of course. The try-catch / switch-goto solution listed won't actually compile since the try labels are out-of-scope in catch. Anyone know what's missing to make something like this compile?
You could automate this with a compiler preprocess step or maybe hack up Mike Stall's Inline IL tool to inject some error-ignorance.
(Orion Adrian's answer about examining the Exception and trying to set the next instruction is interesting too.)
All in all, it seems like an interesting and instructive exercise. Of course, you'd have to decide at what point the effort to simulate ON ERROR RESUME NEXT outweighs the effort to fix the code. :-)
Catch the errors in the UnhandledException Event of the application. That way, unhandled execptions can even be logged as to the sender and whatever other information the developer would reasonable.
Unfortunately you are probably out of luck. On Error Resume Next is a legacy option that is generally heavily discouraged, and does not have an equivalent to my knowledge in C#.
I would recommend leaving the code in VB (It sounds like that was the source, given your specific request for OnError ResumeNext) and interfacing with or from a C# dll or exe that implements whatever new code you need. Then preform refactoring to cause the code to be safe, and convert this safe code to C# as you do this.
You could look at integrating the Enterprise Library's Exception Handling component for one idea of how to handle unhandled exceptions.
If this is for ASP.Net applications, there is a function in the Global.asax called, "Application_Error" that gets called in most cases with catastrophic failure being the other case usually.
Ignoring all the reasons you'd want to avoid doing this.......
If it were simply a need to keep # of lines down, you could try something like:
int totalMethodCount = xxx;
for(int counter = 0; counter < totalMethodCount; counter++) {
try {
if (counter == 0) WidgetMaker.SetAlignment(57);
if (counter == 1) contactForm["Title"] = txtTitle.Text;
if (counter == 2) Casserole.Season(true, false);
if (counter == 3) ((RecordKeeper)Session["CasseroleTracker"]).Seasoned = true;
} catch (Exception ex) {
// log here
}
}
However, you'd have to keep an eye on variable scope if you try to reuse any of the results of the calls.
Hilite each line, one at a time, 'Surround with' try/catch. That avoids the copying pasting you mentioned