I recently came across this code in a project - which I assume was there by mistake:
if (condition)
{
    // Whatever...
};
Note the semicolon after the closing brace.
Does anyone know what the effect of this is?
I assume it does not have any effect, but would have thought it would have caused a compiler error.
This is a simple question with a simple answer, but I just wanted to add something relevant. Most people understand that it does nothing, and in the case you presented the semicolon is just an unnecessary empty statement.
But what is the rationale behind it?
Actually, empty statements are allowed for statements like these:
// Use an empty statement as the body of the while-loop.
while (Method())
    ;
I agree that it does nothing. But it can help certain loops conform to the syntactic requirements of the language, and I think this is what people should understand from it. As others said, I agree you can remove it; I just wanted to underline why C# allows it.
Further clarification
An empty statement is used when you don't need to perform an operation where a statement is required. It simply transfers control to the end point of the statement. It has no effect at all; it is purely a syntactic placeholder.
As stated by @PaulF, in the example above, you could use an empty block ({}) instead. It would be totally valid and have the same effect.
Again, it all comes down to style. You don't need it, but it can certainly help you conform to whatever rules your coding environment imposes.
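To make the equivalence concrete, here is a minimal sketch of the same loop written once with an empty statement and once with an empty block. The class name, the pending counter and ProcessOne are made up for illustration; ProcessOne merely stands in for Method() above.

```csharp
using System;

public static class EmptyStatementDemo
{
    private static int pending;

    // Hypothetical stand-in for Method(): handles one item and
    // returns true as long as there was an item to handle.
    private static bool ProcessOne()
    {
        if (pending == 0) return false;
        pending--;
        return true;
    }

    // The loop body is an empty statement.
    public static int DrainWithEmptyStatement(int count)
    {
        pending = count;
        while (ProcessOne())
            ;
        return pending;
    }

    // The same loop with an empty block as the body.
    public static int DrainWithEmptyBlock(int count)
    {
        pending = count;
        while (ProcessOne()) { }
        return pending;
    }

    public static void Main()
    {
        Console.WriteLine(DrainWithEmptyStatement(3)); // 0
        Console.WriteLine(DrainWithEmptyBlock(3));     // 0
    }
}
```

Both methods do exactly the same work; which form you pick is purely a matter of style.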
Common use-cases (where one could see empty statements)
While loop with empty body (same case that I underlined above)
void ProcessMessages()
{
    while (ProcessMessage())
        ; // Statement needed here.
}
goto statements (rarely used but still valid)
void F()
{
    //...
    if (done) goto exit;
    //...
    exit:
        ; // Statement needed here.
}
From MSDN
Class declaration (props to @EricLippert for bringing this one up)
class SomeClass
{
    ...
};
Note that in this case, as stated by @EricLippert in the comments section, this is simply a courtesy to C++ programmers who are used to typing semis after classes; C++ requires this.
Even though the general use of empty statements is debatable, mainly because of the confusion they can cause, in my opinion they have a place in C#, syntactically speaking. We must not forget that C# is an increment of C++ (which mostly explains the #, i.e. four "+" symbols in a two-by-two grid) and, for historical reasons, allowing empty statements eased the transition.
It doesn't seem to have any effect, though I wouldn't recommend writing code that way.
In the event that you ever want to add an else or else if after the ;, it won't compile.
Ex:
if (5 > 1) {
    // whatever
}; else {
    // whatever
}
This will not compile (note the ; before else)
Visual Studio will compile that as valid syntax: the extra ; is simply an empty statement. Your code will compile and the extra ; will not be an issue.
It can be deleted to clean up the code if you want to, but leaving it in will not cause any adverse effect.
Hope this helps.
When I ran ReSharper on my code, for example:
if (some condition)
{
    Some code...
}
ReSharper gave me the above warning (Invert "if" statement to reduce nesting), and suggested the following correction:
if (!some condition) return;
Some code...
I would like to understand why that's better. I always thought that using "return" in the middle of a method was problematic, somewhat like "goto".
It is not only aesthetic, but it also reduces the maximum nesting level inside the method. This is generally regarded as a plus because it makes methods easier to understand (and indeed, many static analysis tools provide a measure of this as one of the indicators of code quality).
On the other hand, it also makes your method have multiple exit points, something that another group of people believes is a no-no.
Personally, I agree with ReSharper and the first group (in a language that has exceptions I find it silly to discuss "multiple exit points"; almost anything can throw, so there are numerous potential exit points in all methods).
Regarding performance: both versions should be equivalent (if not at the IL level, then certainly after the jitter is through with the code) in every language. Theoretically this depends on the compiler, but practically any widely used compiler of today is capable of handling much more advanced cases of code optimization than this.
A return in the middle of the method is not necessarily bad. It might be better to return immediately if it makes the intent of the code clearer. For example:
double getPayAmount() {
    double result;
    if (_isDead) result = deadAmount();
    else {
        if (_isSeparated) result = separatedAmount();
        else {
            if (_isRetired) result = retiredAmount();
            else result = normalPayAmount();
        }
    }
    return result;
}
In this case, if _isDead is true, we can immediately get out of the method. It might be better to structure it this way instead:
double getPayAmount() {
    if (_isDead) return deadAmount();
    if (_isSeparated) return separatedAmount();
    if (_isRetired) return retiredAmount();
    return normalPayAmount();
}
I've picked this code from the refactoring catalog. This specific refactoring is called: Replace Nested Conditional with Guard Clauses.
This is a bit of a religious argument, but I agree with ReSharper that you should prefer less nesting. I believe that this outweighs the negatives of having multiple return paths from a function.
The key reason for having less nesting is to improve code readability and maintainability. Remember that many other developers will need to read your code in the future, and code with less indentation is generally much easier to read.
Preconditions are a great example of where it is okay to return early at the start of the function. Why should the readability of the rest of the function be affected by the presence of a precondition check?
As for the negatives about returning multiple times from a method - debuggers are pretty powerful now, and it's very easy to find out exactly where and when a particular function is returning.
Having multiple returns in a function is not going to affect the maintenance programmer's job.
Poor code readability will.
As others have mentioned, there shouldn't be a performance hit, but there are other considerations. Aside from those valid concerns, this also can open you up to gotchas in some circumstances. Suppose you were dealing with a double instead:
public void myfunction(double exampleParam){
    if(exampleParam > 0){
        //Body will *not* be executed if Double.IsNaN(exampleParam)
    }
}
Contrast that with the seemingly equivalent inversion:
public void myfunction(double exampleParam){
    if(exampleParam <= 0)
        return;
    //Body *will* be executed if Double.IsNaN(exampleParam)
}
So in certain circumstances what appears to be a correctly inverted if might not be.
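A small sketch of the NaN pitfall, plus one way to avoid it: negate the whole original condition instead of flipping the comparison operator. The class and method names are invented for this illustration, and each method simply returns whether the "body" would have run.

```csharp
using System;

public static class NaNInversionDemo
{
    // Original nested form: returns true only when the body would run.
    public static bool RunsBodyNested(double x)
    {
        if (x > 0)
        {
            return true; // body
        }
        return false;
    }

    // Naive inversion: "<=" is not the logical negation of ">" when x is NaN,
    // because every ordered comparison with NaN is false.
    public static bool RunsBodyNaiveGuard(double x)
    {
        if (x <= 0) return false;
        return true; // body
    }

    // Safe inversion: negate the original condition as a whole.
    public static bool RunsBodySafeGuard(double x)
    {
        if (!(x > 0)) return false;
        return true; // body
    }

    public static void Main()
    {
        Console.WriteLine(RunsBodyNested(double.NaN));     // False
        Console.WriteLine(RunsBodyNaiveGuard(double.NaN)); // True
        Console.WriteLine(RunsBodySafeGuard(double.NaN));  // False
    }
}
```

For ordinary numbers all three agree; only NaN separates the naive guard from the other two.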
The idea of only returning at the end of a function comes from the days before languages had support for exceptions. It enabled programs to rely on being able to put clean-up code at the end of a method, and then being sure it would be called and some other programmer wouldn't hide a return in the method that caused the cleanup code to be skipped. Skipped cleanup code could result in a memory or resource leak.
However, in a language that supports exceptions, it provides no such guarantees. In a language that supports exceptions, the execution of any statement or expression can cause a control flow that causes the method to end. This means clean-up must be done with finally blocks or using statements.
Anyway, I'm saying I think a lot of people quote the 'only return at the end of a method' guideline without understanding why it was ever a good thing to do, and that reducing nesting to improve readability is probably a better aim.
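As a rough sketch of that point: with try/finally or a using statement, the clean-up runs no matter which return is taken. Everything named here (CleanupDemo, Resource, the Log list) is hypothetical and exists only to make the order of events visible.

```csharp
using System;
using System.Collections.Generic;

public static class CleanupDemo
{
    // Records what happened, so the behaviour can be inspected.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical resource; Dispose stands in for real clean-up.
    private sealed class Resource : IDisposable
    {
        public void Dispose() { Log.Add("disposed"); }
    }

    public static void WithFinally(bool bailOut)
    {
        try
        {
            if (bailOut) return; // early exit
            Log.Add("work");
        }
        finally
        {
            Log.Add("cleanup"); // runs on return and on exceptions alike
        }
    }

    public static void WithUsing(bool bailOut)
    {
        using (new Resource())
        {
            if (bailOut) return; // Dispose is still called
            Log.Add("work");
        }
    }

    public static void Main()
    {
        WithFinally(true);
        WithUsing(true);
        Console.WriteLine(string.Join(", ", Log)); // cleanup, disposed
    }
}
```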
I'd like to add that there is a name for those inverted ifs: Guard Clause. I use it whenever I can.
I hate reading code where there is an if at the beginning, two screens of code and no else. Just invert the if and return. That way nobody will waste time scrolling.
http://c2.com/cgi/wiki?GuardClause
It doesn't only affect aesthetics, but it also prevents code nesting.
It can actually function as a precondition to ensure that your data is valid as well.
This is of course subjective, but I think it strongly improves on two points:
It is now immediately obvious that your function has nothing left to do if condition holds.
It keeps the nesting level down. Nesting hurts readability more than you'd think.
Multiple return points were a problem in C (and to a lesser extent C++) because they forced you to duplicate clean-up code before each of the return points. With garbage collection, the try/finally construct and using blocks, there's really no reason why you should be afraid of them.
Ultimately it comes down to what you and your colleagues find easier to read.
Guard clauses or pre-conditions (as you can probably see) check whether a certain condition is met and then break the flow of the program. They're great for places where you're really only interested in one outcome of an if statement. So rather than say:
if (something) {
    // a lot of indented code
}
You reverse the condition and break if that reversed condition is fulfilled:
if (!something) return false; // or another value to show your other code the function did not execute
// all the code from before, save a lot of tabs
return is nowhere near as dirty as goto. It allows you to pass a value to show the rest of your code that the function couldn't run.
You'll see the best examples of where this can be applied in nested conditions:
if (something) {
    DoSomething();
    if (somethingElse) {
        DoAnotherThing();
    } else {
        DoSomethingElse();
    }
}
vs:
if (!something) return;
DoSomething();
if (!somethingElse) { DoSomethingElse(); return; }
DoAnotherThing();
You'll find few people arguing the first is cleaner but of course, it's completely subjective. Some programmers like to know what conditions something is operating under by indentation, while I'd much rather keep method flow linear.
I won't suggest for one moment that precons will change your life or get you laid but you might find your code just that little bit easier to read.
Performance-wise, there will be no noticeable difference between the two approaches.
But coding is about more than performance. Clarity and maintainability are also very important. And, in cases like this where it doesn't affect performance, it is the only thing that matters.
There are competing schools of thought as to which approach is preferable.
One view is the one others have mentioned: the second approach reduces the nesting level, which improves code clarity. This is natural in an imperative style: when you have nothing left to do, you might as well return early.
Another view, from the perspective of a more functional style, is that a method should have only one exit point. Everything in a functional language is an expression. So if statements must always have an else clause. Otherwise the if expression wouldn't always have a value. So in the functional style, the first approach is more natural.
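For what it's worth, C# can approximate that expression-oriented style with the conditional operator. Below is a sketch of the pay-amount example from an earlier answer written as a single expression with a single exit point. The constant values and the parameter list are invented so the snippet runs on its own; they are not from the refactoring catalog.

```csharp
using System;

public static class PayAmountDemo
{
    // Hypothetical amounts, chosen only so the example is runnable.
    private const double DeadAmount = 0.0, SeparatedAmount = 100.0,
                         RetiredAmount = 200.0, NormalAmount = 300.0;

    // One expression, one exit point: every branch has to supply a value,
    // which is the "if must always have an else" rule in action.
    public static double GetPayAmount(bool isDead, bool isSeparated, bool isRetired) =>
        isDead      ? DeadAmount
      : isSeparated ? SeparatedAmount
      : isRetired   ? RetiredAmount
      :               NormalAmount;

    public static void Main()
    {
        Console.WriteLine(GetPayAmount(false, true, false));  // 100
        Console.WriteLine(GetPayAmount(false, false, false)); // 300
    }
}
```

Whether this reads better than a column of guard clauses is, again, a matter of taste.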
There are several good points made here, but multiple return points can be unreadable as well, if the method is very lengthy. That being said, if you're going to use multiple return points just make sure that your method is short, otherwise the readability bonus of multiple return points may be lost.
Performance is in two parts. You have performance when the software is in production, but you also want to have performance while developing and debugging. The last thing a developer wants is to "wait" for something trivial. In the end, compiling this with optimization enabled will result in similar code. So it's good to know these little tricks that pay off in both scenarios.
The case in the question is clear: ReSharper is correct. Rather than nesting if statements, and creating new scope in code, you're setting a clear rule at the start of your method. It increases readability, it will be easier to maintain, and it reduces the number of rules one has to sift through to find where they want to go.
Personally I prefer only 1 exit point. It's easy to accomplish if you keep your methods short and to the point, and it provides a predictable pattern for the next person who works on your code.
eg.
bool PerformDefaultOperation()
{
    bool succeeded = false;
    DataStructure defaultParameters;
    if ((defaultParameters = this.GetApplicationDefaults()) != null)
    {
        succeeded = this.DoSomething(defaultParameters);
    }
    return succeeded;
}
This is also very useful if you just want to check the values of certain local variables within a function before it exits. All you need to do is place a breakpoint on the final return and you are guaranteed to hit it (unless an exception is thrown).
Avoiding multiple exit points can lead to performance gains. I am not sure about C# but in C++ the Named Return Value Optimization (Copy Elision, ISO C++ '03 12.8/15) depends on having a single exit point. This optimization avoids copy constructing your return value (in your specific example it doesn't matter). This could lead to considerable gains in performance in tight loops, as you are saving a constructor and a destructor each time the function is invoked.
But for 99% of the cases saving the additional constructor and destructor calls is not worth the loss of readability nested if blocks introduce (as others have pointed out).
Many good reasons have been given about how the code looks. But what about the results?
Let's take a look at some C# code and its compiled IL form:
using System;
public class Test {
    public static void Main(string[] args) {
        if (args.Length == 0) return;
        if ((args.Length+2)/3 == 5) return;
        Console.WriteLine("hey!!!");
    }
}
This simple snippet can be compiled. You can open the generated .exe file with ildasm and see the result. I won't post the whole disassembly, but I'll describe the results.
The generated IL code does the following:
If the first condition is false, jumps to the code where the second is.
If it's true jumps to the last instruction. (Note: the last instruction is a return).
In the second condition the same happens after the result is calculated. Compare and: go to the Console.WriteLine if false or to the end if this is true.
Print the message and return.
So it seems that the code will jump to the end. What if we do a normal if with nested code?
using System;
public class Test {
    public static void Main(string[] args) {
        if (args.Length != 0 && (args.Length+2)/3 != 5)
        {
            Console.WriteLine("hey!!!");
        }
    }
}
The results are quite similar in IL instructions. The difference is that before there were two jumps per condition: if false go to next piece of code, if true go to the end. And now the IL code flows better and has 3 jumps (the compiler optimized this a bit):
First jump: when Length is 0 to a part where the code jumps again (Third jump) to the end.
Second: in the middle of the second condition to avoid one instruction.
Third: if the second condition is false, jump to the end.
Anyway, the program counter will always jump.
In theory, inverting if could lead to better performance if it increases branch prediction hit rate. In practice, I think it is very hard to know exactly how branch prediction will behave, especially after compiling, so I would not do it in my day-to-day development, except if I am writing assembly code.
More on branch prediction here.
That is simply controversial. There is no "agreement among programmers" on the question of early return. It's always subjective, as far as I know.
It's possible to make a performance argument, since it's better to have conditions that are written so they are most often true; it can also be argued that it is clearer. It does, on the other hand, create nested tests.
I don't think you will get a conclusive answer to this question.
There are a lot of insightful answers here already, but still, I would like to point to a slightly different situation: instead of preconditions, which should indeed be put at the top of a function, think of a step-by-step initialization, where you have to check that each step succeeds before continuing with the next. In this case, you cannot check everything at the top.
I found my code really unreadable when writing an ASIO host application with Steinberg's ASIOSDK, as I followed the nesting paradigm. It went like eight levels deep, and I cannot see a design flaw there, as mentioned by Andrew Bullock above. Of course, I could have packed some inner code to another function, and then nested the remaining levels there to make it more readable, but this seems rather random to me.
By replacing nesting with guard clauses, I even discovered a misconception of mine regarding a portion of cleanup-code that should have occurred much earlier within the function instead of at the end. With nested branches, I would never have seen that, you could even say they led to my misconception.
So this might be another situation where inverted ifs can contribute to clearer code.
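A sketch of what such a step-by-step initialization can look like with guard clauses. Nothing here comes from the ASIO SDK: the three steps, their error messages and the failAt parameter are all made up to show the shape of the code.

```csharp
using System;
using System.Collections.Generic;

public static class StepwiseInitDemo
{
    // Hypothetical initialization: each step must succeed before the next runs.
    // failAt names the step that should fail (0 = none); purely illustrative.
    public static string Initialize(int failAt, List<string> log)
    {
        if (!Step(1, failAt, log)) return "driver load failed";
        if (!Step(2, failAt, log)) return "buffer setup failed";
        if (!Step(3, failAt, log)) return "start failed";
        return "ok";
    }

    // Stand-in for a real initialization step; logs only when it succeeds.
    private static bool Step(int n, int failAt, List<string> log)
    {
        if (n == failAt) return false;
        log.Add("step " + n);
        return true;
    }

    public static void Main()
    {
        var log = new List<string>();
        Console.WriteLine(Initialize(2, log)); // buffer setup failed
        Console.WriteLine(log.Count);          // 1
    }
}
```

The nested equivalent would be three levels deep for three steps; with eight steps the difference is hard to ignore.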
It's a matter of opinion.
My normal approach would be to avoid single line ifs, and returns in the middle of a method.
You wouldn't want lines like it suggests everywhere in your method but there is something to be said for checking a bunch of assumptions at the top of your method, and only doing your actual work if they all pass.
In my opinion early return is fine if you are just returning void (or some useless return code you're never gonna check) and it might improve readability because you avoid nesting and at the same time you make explicit that your function is done.
If you are actually returning a returnValue - nesting is usually a better way to go cause you return your returnValue just in one place (at the end - duh), and it might make your code more maintainable in a whole lot of cases.
I'm not sure, but I think that R# tries to avoid far jumps. When you have IF-ELSE, the compiler does something like this:
Condition false -> far jump to false_condition_label
true_condition_label:
    instruction1
    ...
    instruction_n
false_condition_label:
    instruction1
    ...
    instruction_n
end block
If the condition is true there is no jump and the L1 cache is left alone, but the jump to false_condition_label can be very far and the processor must reload its cache. Synchronising the cache is expensive. R# tries to replace far jumps with short jumps, and in this case there is a bigger probability that all instructions are already in the cache.
I think it depends on what you prefer; as mentioned, there's no general agreement afaik.
To reduce the annoyance, you can lower the severity of this kind of warning to "Hint".
My idea is that the return "in the middle of a function" shouldn't be so "subjective".
The reason is quite simple, take this code:
function do_something( data ){
    if (!is_valid_data( data ))
        return false;
    do_something_that_take_an_hour( data );
    instance = new object_with_very_painful_constructor( data );
    if ( instance is not valid ) {
        error_message( );
        return ;
    }
    connect_to_database ( );
    get_some_other_data( );
    return;
}
Maybe the first "return" is not SO intuitive, but it's a real saving.
There are too many "ideas" about clean code that simply need more practice to lose their "subjective" bad ideas.
There are several advantages to this sort of coding, but for me the big win is that if you can return quickly you can improve the speed of your application. I.e. I know that because of Precondition X I can return quickly with an error. This gets rid of the error cases first and reduces the complexity of your code. In a lot of cases, because the CPU pipeline can now be cleaner, it can avoid pipeline stalls or switches. Secondly, if you are in a loop, breaking or returning out quickly can save you a lot of CPU. Some programmers use loop invariants to do this sort of quick exit, but with those you can break your CPU pipeline and even create memory seek problems, which means that the CPU needs to load from outside the cache. But basically I think you should do what you intended, that is, end the loop or function, not create a complex code path just to implement some abstract notion of correct code. If the only tool you have is a hammer then everything looks like a nail.
If I do this I get a System.StackOverflowException:
private string abc = "";
public string Abc
{
    get
    {
        return Abc; // Note the mistaken capitalization
    }
}
I understand why -- the property is referencing itself, leading to an infinite loop. (See previous questions here and here).
What I'm wondering (and what I didn't see answered in those previous questions) is why doesn't the C# compiler catch this mistake? It checks for some other kinds of circular reference (classes inheriting from themselves, etc.), right? Is it just that this mistake wasn't common enough to be worth checking for? Or is there some situation I'm not thinking of, when you'd want a property to actually reference itself in this way?
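For reference, a sketch of what was presumably intended: the getter returns the lower-case backing field. The class name Sample and the second property Def are made up for the example. An auto-implemented property, as in Def, sidesteps the typo altogether because there is no hand-written backing field to confuse with the property.

```csharp
using System;

public class Sample
{
    private string abc = "";

    // Returns the backing field (lower-case abc), not the property itself.
    public string Abc
    {
        get { return abc; }
    }

    // Auto-implemented property: the compiler generates the backing field.
    public string Def { get; set; } = "";

    public static void Main()
    {
        var s = new Sample();
        s.Def = "hello";
        Console.WriteLine(s.Abc.Length); // 0
        Console.WriteLine(s.Def);        // hello
    }
}
```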
You can see the "official" reason in the last comment here.
Posted by Microsoft on 14/11/2008 at 19:52
Thanks for the suggestion for Visual Studio!
You are right that we could easily detect property recursion, but we can't guarantee that there is nothing useful being accomplished by the recursion. The body of the property could set other fields on your object which change the behavior of the next recursion, could change its behavior based on user input from the console, or could even behave differently based on random values. In these cases, a self-recursive property could indeed terminate the recursion, but we have no way to determine if that's the case at compile-time (without solving the halting problem!).
For the reasons above (and the breaking change it would take to disallow this), we wouldn't be able to prohibit self-recursive properties.
Alex Turner
Program Manager
Visual C# Compiler
Another point in addition to Alex's explanation is that we try to give warnings for code which does something that you probably didn't intend, such that you could accidentally ship with the bug.
In this particular case, how much time would the warning actually save you? A single test run. You'll find this bug the moment you test the code, because it always immediately crashes and dies horribly. The warning wouldn't actually buy you much of a benefit here. The likelihood that there is some subtle bug in a recursive property evaluation is low.
By contrast, we do give a warning if you do something like this:
int customerId;
...
this.customerId = this.customerId;
There's no horrible crash-and-die, and the code is valid code; it assigns a value to a field. But since this is nonsensical code, you probably didn't mean to do it. Since it's not going to die horribly, we give a warning that there's something here that you probably didn't intend and might not otherwise discover via a crash.
A property referring to itself does not always lead to infinite recursion and stack overflow. For example, this works fine:
int count = 0;
public string Abc
{
    get
    {
        count++;
        if (count < 3) return Abc; // recurses until count reaches 3
        return "Foo";
    }
}
Above is a dummy example, but I'm sure one could come up with useful recursive code that is similar. The compiler cannot determine in general whether infinite recursion will happen (halting problem).
Generating a warning in the simple case would be helpful.
They probably considered that it would unnecessarily complicate the compiler without any real gain.
You will discover this typo easily the first time you call this property.
First of all, you'll get a warning for unused variable abc.
Second, there is nothing bad in the recursion, provided that it's not endless recursion. For example, the code might adjust some inner variables and then call the same getter recursively. There is, however, no way for the compiler to prove in general whether some recursion is endless or not (the problem is undecidable; it is equivalent to the halting problem). The compiler could catch some easy cases, but then the consumers would be surprised that the more complicated cases get through the compiler's checks.
The other cases that it checks for (except recursive constructors) are invalid IL.
In addition, all of those cases (even recursive constructors) are guaranteed to fail.
However, it is possible, albeit unlikely, to intentionally create a useful recursive property (using if statements).
When I ran ReSharper on my code, for example:
if (some condition)
{
Some code...
}
ReSharper gave me the above warning (Invert "if" statement to reduce nesting), and suggested the following correction:
if (!some condition) return;
Some code...
I would like to understand why that's better. I always thought that using "return" in the middle of a method problematic, somewhat like "goto".
It is not only aesthetic, but it also reduces the maximum nesting level inside the method. This is generally regarded as a plus because it makes methods easier to understand (and indeed, many static analysis tools provide a measure of this as one of the indicators of code quality).
On the other hand, it also makes your method have multiple exit points, something that another group of people believes is a no-no.
Personally, I agree with ReSharper and the first group (in a language that has exceptions I find it silly to discuss "multiple exit points"; almost anything can throw, so there are numerous potential exit points in all methods).
Regarding performance: both versions should be equivalent (if not at the IL level, then certainly after the jitter is through with the code) in every language. Theoretically this depends on the compiler, but practically any widely used compiler of today is capable of handling much more advanced cases of code optimization than this.
A return in the middle of the method is not necessarily bad. It might be better to return immediately if it makes the intent of the code clearer. For example:
double getPayAmount() {
double result;
if (_isDead) result = deadAmount();
else {
if (_isSeparated) result = separatedAmount();
else {
if (_isRetired) result = retiredAmount();
else result = normalPayAmount();
};
}
return result;
};
In this case, if _isDead is true, we can immediately get out of the method. It might be better to structure it this way instead:
double getPayAmount() {
if (_isDead) return deadAmount();
if (_isSeparated) return separatedAmount();
if (_isRetired) return retiredAmount();
return normalPayAmount();
};
I've picked this code from the refactoring catalog. This specific refactoring is called: Replace Nested Conditional with Guard Clauses.
This is a bit of a religious argument, but I agree with ReSharper that you should prefer less nesting. I believe that this outweighs the negatives of having multiple return paths from a function.
The key reason for having less nesting is to improve code readability and maintainability. Remember that many other developers will need to read your code in the future, and code with less indentation is generally much easier to read.
Preconditions are a great example of where it is okay to return early at the start of the function. Why should the readability of the rest of the function be affected by the presence of a precondition check?
As for the negatives about returning multiple times from a method - debuggers are pretty powerful now, and it's very easy to find out exactly where and when a particular function is returning.
Having multiple returns in a function is not going to affect the maintainance programmer's job.
Poor code readability will.
As others have mentioned, there shouldn't be a performance hit, but there are other considerations. Aside from those valid concerns, this also can open you up to gotchas in some circumstances. Suppose you were dealing with a double instead:
public void myfunction(double exampleParam){
if(exampleParam > 0){
//Body will *not* be executed if Double.IsNan(exampleParam)
}
}
Contrast that with the seemingly equivalent inversion:
public void myfunction(double exampleParam){
if(exampleParam <= 0)
return;
//Body *will* be executed if Double.IsNan(exampleParam)
}
So in certain circumstances what appears to be a a correctly inverted if might not be.
The idea of only returning at the end of a function came back from the days before languages had support for exceptions. It enabled programs to rely on being able to put clean-up code at the end of a method, and then being sure it would be called and some other programmer wouldn't hide a return in the method that caused the cleanup code to be skipped. Skipped cleanup code could result in a memory or resource leak.
However, in a language that supports exceptions, it provides no such guarantees. In a language that supports exceptions, the execution of any statement or expression can cause a control flow that causes the method to end. This means clean-up must be done through using the finally or using keywords.
Anyway, I'm saying I think a lot of people quote the 'only return at the end of a method' guideline without understanding why it was ever a good thing to do, and that reducing nesting to improve readability is probably a better aim.
I'd like to add that there is name for those inverted if's - Guard Clause. I use it whenever I can.
I hate reading code where there is if at the beginning, two screens of code and no else. Just invert if and return. That way nobody will waste time scrolling.
http://c2.com/cgi/wiki?GuardClause
It doesn't only affect aesthetics, but it also prevents code nesting.
It can actually function as a precondition to ensure that your data is valid as well.
This is of course subjective, but I think it strongly improves on two points:
It is now immediately obvious that your function has nothing left to do if condition holds.
It keeps the nesting level down. Nesting hurts readability more than you'd think.
Multiple return points were a problem in C (and to a lesser extent C++) because they forced you to duplicate clean-up code before each of the return points. With garbage collection, the try | finally construct and using blocks, there's really no reason why you should be afraid of them.
Ultimately it comes down to what you and your colleagues find easier to read.
Guard clauses or pre-conditions (as you can probably see) check to see if a certain condition is met and then breaks the flow of the program. They're great for places where you're really only interested in one outcome of an if statement. So rather than say:
if (something) {
// a lot of indented code
}
You reverse the condition and break if that reversed condition is fulfilled
if (!something) return false; // or another value to show your other code the function did not execute
// all the code from before, save a lot of tabs
return is nowhere near as dirty as goto. It allows you to pass a value to show the rest of your code that the function couldn't run.
You'll see the best examples of where this can be applied in nested conditions:
if (something) {
do-something();
if (something-else) {
do-another-thing();
} else {
do-something-else();
}
}
vs:
if (!something) return;
do-something();
if (!something-else) return do-something-else();
do-another-thing();
You'll find few people arguing the first is cleaner but of course, it's completely subjective. Some programmers like to know what conditions something is operating under by indentation, while I'd much rather keep method flow linear.
I won't suggest for one moment that precons will change your life or get you laid but you might find your code just that little bit easier to read.
Performance-wise, there will be no noticeable difference between the two approaches.
But coding is about more than performance. Clarity and maintainability are also very important. And, in cases like this where it doesn't affect performance, it is the only thing that matters.
There are competing schools of thought as to which approach is preferable.
One view is the one others have mentioned: the second approach reduces the nesting level, which improves code clarity. This is natural in an imperative style: when you have nothing left to do, you might as well return early.
Another view, from the perspective of a more functional style, is that a method should have only one exit point. Everything in a functional language is an expression, so an if expression must always have an else clause. Otherwise the if expression wouldn't always have a value. So in the functional style, the first approach is more natural.
There are several good points made here, but multiple return points can be unreadable as well, if the method is very lengthy. That being said, if you're going to use multiple return points just make sure that your method is short, otherwise the readability bonus of multiple return points may be lost.
Performance is in two parts. You have performance when the software is in production, but you also want to have performance while developing and debugging. The last thing a developer wants is to "wait" for something trivial. In the end, compiling this with optimization enabled will result in similar code. So it's good to know these little tricks that pay off in both scenarios.
The case in the question is clear: ReSharper is correct. Rather than nesting if statements and creating new scope in code, you're setting a clear rule at the start of your method. It increases readability, it will be easier to maintain, and it reduces the number of rules one has to sift through to find where they want to go.
Personally I prefer only 1 exit point. It's easy to accomplish if you keep your methods short and to the point, and it provides a predictable pattern for the next person who works on your code.
eg.
bool PerformDefaultOperation()
{
    bool succeeded = false;
    DataStructure defaultParameters;
    if ((defaultParameters = this.GetApplicationDefaults()) != null)
    {
        succeeded = this.DoSomething(defaultParameters);
    }
    return succeeded;
}
This is also very useful if you just want to check the values of certain local variables within a function before it exits. All you need to do is place a breakpoint on the final return and you are guaranteed to hit it (unless an exception is thrown).
Avoiding multiple exit points can lead to performance gains. I am not sure about C# but in C++ the Named Return Value Optimization (Copy Elision, ISO C++ '03 12.8/15) depends on having a single exit point. This optimization avoids copy constructing your return value (in your specific example it doesn't matter). This could lead to considerable gains in performance in tight loops, as you are saving a constructor and a destructor each time the function is invoked.
But for 99% of the cases saving the additional constructor and destructor calls is not worth the loss of readability nested if blocks introduce (as others have pointed out).
There are many good points here about how the code looks. But what about the results?
Let's take a look at some C# code and its compiled IL form:
using System;
public class Test {
public static void Main(string[] args) {
if (args.Length == 0) return;
if ((args.Length+2)/3 == 5) return;
Console.WriteLine("hey!!!");
}
}
This simple snippet can be compiled. You can open the generated .exe file with ildasm and check the result. I won't post all the IL, but I'll describe the results.
The generated IL code does the following:
If the first condition is false, jumps to the code where the second is.
If it's true jumps to the last instruction. (Note: the last instruction is a return).
In the second condition the same happens after the result is calculated. Compare and: go to the Console.WriteLine if false, or to the end if true.
Print the message and return.
So it seems that the code will jump to the end. What if we do a normal if with nested code?
using System;
public class Test {
public static void Main(string[] args) {
if (args.Length != 0 && (args.Length+2)/3 != 5)
{
Console.WriteLine("hey!!!");
}
}
}
The results are quite similar in IL instructions. The difference is that before there were two jumps per condition: if false go to next piece of code, if true go to the end. And now the IL code flows better and has 3 jumps (the compiler optimized this a bit):
First jump: when Length is 0 to a part where the code jumps again (Third jump) to the end.
Second: in the middle of the second condition to avoid one instruction.
Third: if the second condition is false, jump to the end.
Anyway, the program counter will always jump.
In theory, inverting if could lead to better performance if it increases branch prediction hit rate. In practice, I think it is very hard to know exactly how branch prediction will behave, especially after compiling, so I would not do it in my day-to-day development, except if I am writing assembly code.
More on branch prediction here.
That is simply controversial. There is no "agreement among programmers" on the question of early return. It's always subjective, as far as I know.
It's possible to make a performance argument, since it's better to have conditions that are written so they are most often true; it can also be argued that it is clearer. It does, on the other hand, create nested tests.
I don't think you will get a conclusive answer to this question.
There are a lot of insightful answers here already, but still, I would like to point to a slightly different situation: instead of preconditions, which should indeed be put at the top of a function, think of a step-by-step initialization, where you have to check that each step succeeds before continuing with the next. In this case, you cannot check everything at the top.
I found my code really unreadable when writing an ASIO host application with Steinberg's ASIOSDK, as I followed the nesting paradigm. It went like eight levels deep, and I cannot see a design flaw there, as mentioned by Andrew Bullock above. Of course, I could have packed some inner code to another function, and then nested the remaining levels there to make it more readable, but this seems rather random to me.
By replacing nesting with guard clauses, I even discovered a misconception of mine regarding a portion of cleanup-code that should have occurred much earlier within the function instead of at the end. With nested branches, I would never have seen that, you could even say they led to my misconception.
So this might be another situation where inverted ifs can contribute to a clearer code.
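To make the step-by-step initialization case concrete, here is a minimal sketch in C#. It is not the ASIO code from this answer: DriverStartup and the four delegates are hypothetical stand-ins for the real initialization steps. Each step gets its own guard, and the cleanup sits right next to the failure it belongs to instead of at the bottom of a deep nest.

```csharp
using System;
using System.Collections.Generic;

public static class DriverStartup
{
    // Hypothetical steps: each delegate reports success (true) or failure (false).
    public static bool Start(
        Func<bool> loadDriver,
        Func<bool> initDriver,
        Func<bool> createBuffers,
        Action unloadDriver,
        List<string> log)
    {
        if (!loadDriver())
        {
            log.Add("load failed");
            return false; // nothing to clean up yet
        }

        if (!initDriver())
        {
            log.Add("init failed");
            unloadDriver(); // cleanup happens here, not at the end of the method
            return false;
        }

        if (!createBuffers())
        {
            log.Add("buffers failed");
            unloadDriver();
            return false;
        }

        log.Add("started");
        return true;
    }
}
```

With nesting, the same method would be three levels deep before any real work started, and the unload calls would be spread over the else branches.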
It's a matter of opinion.
My normal approach would be to avoid single line ifs, and returns in the middle of a method.
You wouldn't want lines like it suggests everywhere in your method but there is something to be said for checking a bunch of assumptions at the top of your method, and only doing your actual work if they all pass.
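As a rough illustration of "check the assumptions at the top, then do the work", here is a small sketch. OrderMath and AveragePrice are made-up names for the example, not anything from the question.

```csharp
using System;

public static class OrderMath
{
    // Hypothetical example method: every assumption is checked first,
    // then the actual work runs at a single indentation level.
    public static decimal AveragePrice(decimal[] prices)
    {
        if (prices == null) throw new ArgumentNullException(nameof(prices));
        if (prices.Length == 0) return 0m;

        decimal total = 0m;
        foreach (decimal price in prices)
        {
            total += price;
        }
        return total / prices.Length;
    }
}
```

The two checks are the only early exits; everything after them can assume a non-empty array.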
In my opinion early return is fine if you are just returning void (or some useless return code you're never gonna check) and it might improve readability because you avoid nesting and at the same time you make explicit that your function is done.
If you are actually returning a returnValue - nesting is usually a better way to go cause you return your returnValue just in one place (at the end - duh), and it might make your code more maintainable in a whole lot of cases.
I'm not sure, but I think that R# tries to avoid far jumps. When you have an if-else, the compiler does something like this:
Condition false -> far jump to false_condition_label
true_condition_label:
instruction1
...
instruction_n
false_condition_label:
instruction1
...
instruction_n
end block
If the condition is true there is no jump and the L1 cache is not rolled out, but the jump to false_condition_label can be very far and the processor must roll out its own cache. Synchronising the cache is expensive. R# tries to replace far jumps with short jumps, in which case there is a higher probability that all instructions are already in the cache.
I think it depends on what you prefer; as mentioned, there's no general agreement afaik.
To reduce the annoyance, you can lower the severity of this kind of warning to "Hint".
My idea is that the return "in the middle of a function" shouldn't be so "subjective".
The reason is quite simple, take this code:
function do_something( data ){
    if (!is_valid_data( data ))
        return false;
    do_something_that_take_an_hour( data );
    instance = new object_with_very_painful_constructor( data );
    if ( instance is not valid ) {
        error_message( );
        return;
    }
    connect_to_database ( );
    get_some_other_data( );
    return;
}
Maybe the first "return" is not SO intuitive, but it really saves a lot: invalid data never reaches the hour-long call.
There are too many "ideas" about clean codes, that simply need more practise to lose their "subjective" bad ideas.
There are several advantages to this sort of coding, but for me the big win is that if you can return quickly you can improve the speed of your application. I.e., I know that because of precondition X I can return quickly with an error. This gets rid of the error cases first and reduces the complexity of your code. In a lot of cases, because the CPU pipeline can now be cleaner, it can stop pipeline crashes or switches. Secondly, if you are in a loop, breaking or returning out quickly can save you a lot of CPU. Some programmers use loop invariants to do this sort of quick exit, but in doing so you can break your CPU pipeline and even create memory seek problems, which means the CPU needs to load from outside the cache. But basically I think you should do what you intended, that is, end the loop or function, not create a complex code path just to implement some abstract notion of correct code. If the only tool you have is a hammer, then everything looks like a nail.
I do not currently have this issue, but you never know, and thought experiments are always fun.
Ignoring the obvious problems that you would have to have with your architecture to even be attempting this, let's assume that you had some horribly-written code of someone else's design, and you needed to do a bunch of wide and varied operations in the same code block, e.g.:
WidgetMaker.SetAlignment(57);
contactForm["Title"] = txtTitle.Text;
Casserole.Season(true, false);
((RecordKeeper)Session["CasseroleTracker"]).Seasoned = true;
Multiplied by a hundred. Some of these might work, others might go badly wrong. What you need is the C# equivalent of "on error resume next", otherwise you're going to end up copying and pasting try-catches around the many lines of code.
How would you attempt to tackle this problem?
public delegate void VoidDelegate();
public static class Utils
{
    public static void Try(VoidDelegate v) {
        try {
            v();
        }
        catch {}
    }
}
Utils.Try( () => WidgetMaker.SetAlignment(57) );
Utils.Try( () => contactForm["Title"] = txtTitle.Text );
Utils.Try( () => Casserole.Season(true, false) );
Utils.Try( () => ((RecordKeeper)Session["CasseroleTracker"]).Seasoned = true );
Refactor into individual, well-named methods:
AdjustFormWidgets();
SetContactTitle(txtTitle.Text);
SeasonCasserole();
Each of those is protected appropriately.
I would say do nothing.
Yup, that's right, do NOTHING.
You have clearly identified two things to me:
You know the architecture is borked.
There is a ton of this crap.
I say:
Do nothing.
Add a global error handler to send you an email every time it goes boom.
Wait until something falls over (or fails a test)
Correct that (Refactoring as necessary within the scope of the page).
Repeat every time a problem occurs.
You will have this cleared up in no time if it is that bad. Yeah I know it sounds sucky and you may be pulling your hair out with bugfixes to begin with, but it will allow you to fix the needy/buggy code before the (large) amount of code that may actually be working no matter how crappy it looks.
Once you start winning the war, you will have a better handle on the code (due to all your refactoring) and a better idea of a winning design for it.
Trying to wrap all of it in bubble wrap is probably going to take just as long, and you will still not be any closer to fixing the problems.
It's pretty obvious that you'd write the code in VB.NET, which actually does have On Error Resume Next, and export it in a DLL to C#. Anything else is just being a glutton for punishment.
Fail Fast
To elaborate, I guess I am questioning the question. If an exception is thrown, why would you want your code to simply continue as if nothing has happened? Either you expect exceptions in certain situations, in which case you write a try-catch block around that code and handle them, or there is an unexpected error, in which case you should prefer your application to abort, or retry, or fail. Not carry on like a wounded zombie moaning 'brains'.
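A small sketch of the distinction I mean, with made-up names (PortParser, DefaultPort): the one exception we expect is caught and handled, and everything else is left to propagate so the failure is visible.

```csharp
using System;

public static class PortParser
{
    // Hypothetical fallback value, chosen only for this example.
    public const int DefaultPort = 8080;

    public static int ParsePortOrDefault(string text)
    {
        // A null argument is a programming error, not an expected condition: fail fast.
        if (text == null) throw new ArgumentNullException(nameof(text));

        try
        {
            return int.Parse(text);
        }
        catch (FormatException)
        {
            // Expected situation: the text is not a number, so fall back.
            return DefaultPort;
        }
        // Anything else (e.g. OverflowException) is not caught and propagates to the caller.
    }
}
```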
This is one of the things that having a preprocessor is useful for. You could define a macro that swallows exceptions, then with a quick script add that macro to all lines.
So, if this were C++, you could do something like this:
#define ATTEMPT(x) try { x; } catch (...) { }
// ...
ATTEMPT(WidgetMaker.SetAlignment(57));
ATTEMPT(contactForm["Title"] = txtTitle.Text);
ATTEMPT(Casserole.Season(true, false));
ATTEMPT(((RecordKeeper)Session["CasseroleTracker"]).Seasoned = true);
Unfortunately, not many languages seem to include a preprocessor like C/C++ did.
You could create your own preprocessor and add it as a pre-build step. If you felt like completely automating it, you could probably write a preprocessor that would take the actual code file and add the try/catch stuff on its own (so you don't have to add those ATTEMPT() blocks to the code manually). Making sure it only modified the lines it's supposed to could be difficult though (you'd have to skip variable declarations, loop constructs, etc. so that you don't break the build).
However, I think these are horrible ideas and should never be done, but the question was asked. :)
Really, you shouldn't ever do this. You need to find what's causing the error and fix it. Swallowing/ignoring errors is a bad thing to do, so I think the correct answer here is "Fix the bug, don't ignore it!". :)
On Error Resume Next is a really bad idea in the C# world. Nor would adding the equivalent to On Error Resume Next actually help you. All it would do is leave you in a bad state which could cause more subtle errors, data loss and possibly data corruption.
But to give the questioner his due, you could add a global handler and check the TargetSite to see which method borked. Then you could at least know what line it borked on. The next part would be to try and figure out how to set the "next statement" the same way the debugger does it. Hopefully your stack won't have unwound at this point or you can re-create it, but it's certainly worth a shot. However, given this approach the code would have to run in Debug mode every time so that you would have your debug symbols included.
As someone mentioned, VB allows this. How about doing it the same way in C#? Enter trusty Reflector:
This:
Sub Main()
On Error Resume Next
Dim i As Integer = 0
Dim y As Integer = CInt(5 / i)
End Sub
Translates into this:
public static void Main()
{
// This item is obfuscated and can not be translated.
int VB$ResumeTarget;
try
{
int VB$CurrentStatement;
Label_0001:
ProjectData.ClearProjectError();
int VB$ActiveHandler = -2;
Label_0009:
VB$CurrentStatement = 2;
int i = 0;
Label_000E:
VB$CurrentStatement = 3;
int y = (int) Math.Round((double) (5.0 / ((double) i)));
goto Label_008F;
Label_0029:
VB$ResumeTarget = 0;
switch ((VB$ResumeTarget + 1))
{
case 1:
goto Label_0001;
case 2:
goto Label_0009;
case 3:
goto Label_000E;
case 4:
goto Label_008F;
default:
goto Label_0084;
}
Label_0049:
VB$ResumeTarget = VB$CurrentStatement;
switch (((VB$ActiveHandler > -2) ? VB$ActiveHandler : 1))
{
case 0:
goto Label_0084;
case 1:
goto Label_0029;
}
}
catch (object obj1) when (?)
{
ProjectData.SetProjectError((Exception) obj1);
goto Label_0049;
}
Label_0084:
throw ProjectData.CreateProjectError(-2146828237);
Label_008F:
if (VB$ResumeTarget != 0)
{
ProjectData.ClearProjectError();
}
}
Rewrite the code. Try to find sets of statements which logically depend on each other, so that if one fails then the next ones make no sense, and hive them off into their own functions and put try-catches round them, if you want to ignore the result of that and continue.
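A sketch of what one such group might look like, loosely based on the casserole lines from the question. Kitchen, TrySeasonCasserole and the delegates are hypothetical names; the point is only that the two statements live or die together.

```csharp
using System;
using System.Collections.Generic;

public static class Kitchen
{
    // Hypothetical group: recording the seasoning makes no sense if seasoning failed,
    // so both statements share one try-catch and the second is skipped on failure.
    public static bool TrySeasonCasserole(Action season, Action recordSeasoned, List<string> errors)
    {
        try
        {
            season();
            recordSeasoned();
            return true;
        }
        catch (Exception ex)
        {
            errors.Add("SeasonCasserole: " + ex.Message);
            return false;
        }
    }
}
```

The caller can ignore the return value and carry on with the next group, which is the "continue regardless" behaviour the question asked for, but at the level of groups rather than single lines.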
This may help you in identifying the pieces that have the most problems.
@JB King
Thanks for reminding me. The Logging application block has an Instrumentation Event that can be used to trace events; you can find more info in the MS Enterprise Library docs.
Using (New InstEvent)
<series of statements>
End Using
All of the steps in this using will be traced to a log file, and you can parse that out to see where the log breaks (ex is thrown) and id the high offenders.
Refactoring is really your best bet, but if you have a lot, this may help you pinpoint the worst offenders.
You could use goto, but it's still messy.
I've actually wanted a sort of single statement try-catch for a while. It would be helpful in certain cases, like adding logging code or something that you don't want to interrupt the main program flow if it fails.
I suspect something could be done with some of the features associated with linq, but don't really have time to look into it at the moment. If you could just find a way to wrap a statement as an anonymous function, then use another one to call that within a try-catch block it would work... but not sure if that's possible just yet.
If you can get the compiler to give you an expression tree for this code, then you could modify that expression tree by replacing each statement with a new try-catch block that wraps the original statement. This isn't as far-fetched as it sounds; for LINQ, C# acquired the ability to capture lambda expressions as expression trees that can be manipulated in user code at runtime.
This approach is not possible today with .NET 3.5 -- if for no other reason than the lack of a "try" statement in System.Linq.Expressions. However, it may very well be viable in a future version of C# once the merge of the DLR and LINQ expression trees is complete.
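For what it's worth, later framework versions did add a try expression: assuming .NET 4 or newer, System.Linq.Expressions has Expression.TryCatch and Expression.Catch. Below is a minimal sketch of the rewriting idea for a single captured statement. ExpressionWrapper and SwallowExceptions are names invented for this example, and it does not handle whole method bodies.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionWrapper
{
    // Rewrites a captured lambda so that its body runs inside
    // try { ... } catch (Exception) { } and returns the compiled result.
    public static Action SwallowExceptions(Expression<Action> statement)
    {
        // Force the body to type void so the try body and the catch body have matching types.
        Expression body = Expression.Block(typeof(void), statement.Body);

        Expression guarded = Expression.TryCatch(
            body,
            Expression.Catch(typeof(Exception), Expression.Empty()));

        return Expression.Lambda<Action>(guarded).Compile();
    }
}
```

One limitation to keep in mind: the C# compiler refuses to turn a lambda containing an assignment into an expression tree, so a line like contactForm["Title"] = txtTitle.Text could not be captured this way, whereas a plain method call such as Casserole.Season(true, false) could.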
Why not use the reflection in c#? You could create a class that reflects on the code and use line #s as the hint for what to put in each individual try/catch block. This has a few advantages:
It's slightly less ugly, as it doesn't really require you to mangle your source code, and you can use it only in debug mode.
You learn something interesting about C# while implementing it.
I would, however, recommend against any of this, unless of course you are taking over maintenance of someone else's work and you need to get a handle on the exceptions so you can fix them. Might be fun to write though.
Fun question; very terrible.
It'd be nice if you could use a macro. But this is blasted C#, so you might solve it with some preprocessor work or some external tool to wrap your lines in individual try-catch blocks. Not sure if you meant you didn't want to manually wrap them or that you wanted to avoid try-catch entirely.
Messing around with this, I tried labeling every line and jumping back from a single catch, without much luck. However, Christopher uncovered the correct way to do this. There's some interesting additional discussion of this at Dot Net Thoughts and at Mike Stall's .NET Blog.
EDIT: Of course. The try-catch / switch-goto solution listed won't actually compile since the try labels are out-of-scope in catch. Anyone know what's missing to make something like this compile?
You could automate this with a compiler preprocess step or maybe hack up Mike Stall's Inline IL tool to inject some error-ignorance.
(Orion Adrian's answer about examining the Exception and trying to set the next instruction is interesting too.)
All in all, it seems like an interesting and instructive exercise. Of course, you'd have to decide at what point the effort to simulate ON ERROR RESUME NEXT outweighs the effort to fix the code. :-)
Catch the errors in the UnhandledException event of the application. That way, unhandled exceptions can even be logged along with the sender and whatever other information the developer considers reasonable.
Unfortunately you are probably out of luck. On Error Resume Next is a legacy option that is generally heavily discouraged, and does not have an equivalent to my knowledge in C#.
I would recommend leaving the code in VB (it sounds like that was the source, given your specific request for On Error Resume Next) and interfacing with or from a C# dll or exe that implements whatever new code you need. Then perform refactoring to make the code safe, and convert this safe code to C# as you go.
You could look at integrating the Enterprise Library's Exception Handling component for one idea of how to handle unhandled exceptions.
If this is for ASP.NET applications, there is a function in Global.asax called "Application_Error" that gets called in most cases, with catastrophic failure usually being the exception.
Ignoring all the reasons you'd want to avoid doing this.......
If it were simply a need to keep # of lines down, you could try something like:
int totalMethodCount = xxx;
for(int counter = 0; counter < totalMethodCount; counter++) {
try {
if (counter == 0) WidgetMaker.SetAlignment(57);
if (counter == 1) contactForm["Title"] = txtTitle.Text;
if (counter == 2) Casserole.Season(true, false);
if (counter == 3) ((RecordKeeper)Session["CasseroleTracker"]).Seasoned = true;
} catch (Exception ex) {
// log here
}
}
However, you'd have to keep an eye on variable scope if you try to reuse any of the results of the calls.
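The same idea can be written without the counter by putting each line in a delegate and looping over them. A minimal sketch follows; StepRunner and RunAll are hypothetical names, and collecting the exceptions stands in for the "log here" comment above.

```csharp
using System;
using System.Collections.Generic;

public static class StepRunner
{
    // Runs every step in order. A failing step is recorded and the next step still runs.
    public static List<Exception> RunAll(params Action[] steps)
    {
        var failures = new List<Exception>();
        foreach (Action step in steps)
        {
            try
            {
                step();
            }
            catch (Exception ex)
            {
                failures.Add(ex); // stand-in for real logging
            }
        }
        return failures;
    }
}
```

Usage would look like StepRunner.RunAll(() => WidgetMaker.SetAlignment(57), () => Casserole.Season(true, false)). The scoping caveat still applies: anything a later step needs from an earlier one has to live in a variable declared outside the lambdas.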
Highlight each line, one at a time, and 'Surround with' try/catch. That avoids the copying and pasting you mentioned.