I've never seen a for loop initialized this way, and I don't understand why anyone would write it like this.
I'm doing some research into connecting to an IMAP server in .NET and started looking at code from a library named ImapX. I found the for loop in a method that writes data to a NetworkStream and then appears to read the response within the funky for loop. I don't want to copy and paste someone else's code verbatim, but here's the gist:
public bool SendData(string data)
{
    try
    {
        this.imapStreamWriter.Write(data);

        for (bool flag = true; flag; flag = false)
        {
            var s = this.imapStreamReader.ReadLine();
        }
    }
    catch (Exception)
    {
        return false;
    }

    return true;
}
Again, this isn't the exact code, but it's the general idea. That's all the method does; it doesn't use the server response, it just returns true if no exception was thrown. I just don't understand how or why the for loop is being used this way. Can anyone explain what advantages, if any, this offers?
It's a horrible way of executing a loop once. If you change the flag initializer to something which isn't always true, it may make slightly more sense, but not a lot. I've entirely unseriously suggested this code before now:
Animal animal = ...;

for (Dog dog = animal as Dog; dog != null; dog = null)
{
    // Use dog...
}
... as a way of using an as operator without "polluting" the outer scope. But it's language silliness, and not something I'd ever really use.
I had previously assumed you were looking at original source code, but from your comments it sounds like you're looking at Reflector's reconstituted C# based on the MSIL. It's important to understand that this does not necessarily bear a close resemblance to the original code as written.
At the MSIL level, there is no such thing as a for loop or a while loop. There are just conditional branch instructions. (See here: http://weblogs.asp.net/kennykerr/archive/2004/09/23/introduction-to-msil-part-6-common-language-constructs.aspx ) Any tool trying to reconstitute C# must make a number of guesses for how the code was originally structured. It seems probable that this wasn't originally written as a for loop, but that Reflector detected that the IL could have been generated by a for loop, and its heuristics made a bad guess that it should be a for loop and not something else.
If I look at the same code in ILSpy, it is rendered as a while loop. It's still redundant, but it looks a lot less weird. Furthermore, it's possible that the original code actually did something that has been optimized out, perhaps calls to [Conditional] methods or code marked with #if directives. Then again, perhaps the original code used to do something else but parts were commented out - the comments are not kept in the IL. Or maybe there was more code there in the past and it was just plain deleted.
In short, there are many ways what you see in Reflector can be very different from what was originally written. You should consider it a prettier presentation of IL, not an example of C# as it is written by humans.
I had this as a comment, but I suppose it's the answer as well.
The "loop" is pointless. It initializes flag to true and is declared to execute only while the flag is still true. However, flag is explicitly being set to false after the first iteration. So in this case it's guaranteed to run once and only once.
I suspect the author was trying to be sure that control flow would pause on ReadLine() - but it blocks anyway until a line is received, loop or no loop.
The for-loop is pretty much a no-op, as noted by others.
It does do one thing, though, that may or may not matter: it introduces a new scope. The variable s is scoped to the body of the for loop, and it goes out of scope once the loop is exited. You could get the same effect by simply creating a block, thus:
{
    var s = this.imapStreamReader.ReadLine();
}
Which is still pretty silly.
No idea what the original author was trying to accomplish with this -- perhaps trying to ensure that his s was destroyed/garbage-collected/disposed -- not that this technique would work (it wouldn't).
I have this method (modified code):
public static void PublishXmlForCustomTypes(MyOwnClass DefaultOutputInformation)
{
    if (DefaultOutputInformation != null)
    {
        // lots of code
    }
}
My whole method body was inside the if statement. After thinking about it, I changed it to this:
public static void PublishXmlForCustomTypes(MyOwnClass DefaultOutputInformation)
{
    if (DefaultOutputInformation == null)
    {
        return;
    }

    // lots of code
}
As far as I can tell from testing, the two seem to be strictly equivalent, but is that really the case?
I mean, the "return" statement gets us out of the method, right?
This is strictly equivalent and the second version is the way to go :)
Yes, that's absolutely fine.
Some people dogmatically stick to "one exit point per method" - which was appropriate when it was relatively tricky to make sure you always did the right amount of clean-up at the end of a function in C, for example... but it's not really necessary in C#.
Personally I think it's appropriate to return as soon as you know that you've done all the work you really want to in a method. Use try/finally or using statements to perform any extra "clean up however I exit" work.
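For example, here is a minimal sketch (the file-reading scenario is made up purely for illustration) of returning early while a using statement still guarantees that the clean-up runs on every exit path:

public string ReadFirstLine(string path)
{
    // The using statement disposes the reader no matter which return is taken.
    using (var reader = new System.IO.StreamReader(path))
    {
        string line = reader.ReadLine();
        if (line == null)
        {
            return string.Empty; // early exit - the reader is still disposed
        }
        return line;
    }
}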
Yes, return gets you out of the method. If you have a finally block and you call return from the try block, the finally block is executed anyway.
Yes, the return statement ends the method.
Yes, the return will exit you out of the code. It's generally good practice, as the very first step in a function, to verify that the parameters passed in are what you expect and to exit (via a return or by throwing an exception) so that you don't do unnecessary processing only to have to abort later in the function.
Yes, your assumption is correct.
For some background, learn about duality.
Yes, it is exactly the same. You can read the MSDN documentation about the return keyword to fully understand how it works: http://msdn.microsoft.com/en-us/library/1h3swy84.aspx
As to which way is better: both are fine, but the second version is more readable because the whole method body isn't wrapped in an if block. That way you can see what the condition does at a glance instead of having to read the whole method.
Indeed the return gets you out of the method, so it is equivalent to the first way you used. Which way is better depends on your code, although generally I would prefer the second version.
Looking at the revised code, the second one is the way to go. While the two are functionally equivalent, think about the case where you pass four different variables into a function and want to check each of them. Instead of a nasty four-level if statement with braces everywhere, the second approach lets you clean up the appearance of the code and avoid unnecessary levels of nesting. If you're writing in C/C++, you can even wrap this in a macro such as VERIFY_NOT_NULL(x) to keep the code nice and neat.
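For instance, here is a hedged C# sketch (the parameter names are invented, and object stands in for whatever real types you'd use) of how several guard clauses keep that four-parameter check flat:

public void Process(object order, object customer, object invoice, object account)
{
    // Each guard exits immediately instead of adding another nesting level.
    if (order == null) return;
    if (customer == null) return;
    if (invoice == null) return;
    if (account == null) return;

    // ... actual work, all at a single indentation level ...
}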
Readable/maintainable code trumps nano-seconds of performance 99% of the time.
In almost every program there are methods that don't need to be called every time, only under specific conditions.
It is very easy to check whether a method should be called. A simple if statement does the trick.
if (value == true)
{
    DoSomething();
}
But if you have many conditions the validation can get complicated and the code gets longer and longer.
So instead I wrote code where the method is called every time, and the method itself checks whether its code needs to be executed.
DoSomething(value);
... then ...
public void DoSomething(bool value)
{
    if (value == true)
    {
        // Do something here ...
    }
}
Now I have two ways of doing things. I am not exactly sure which way is the right way.
Or maybe there is even another option?
Clean Code — A Handbook of Agile Software Craftsmanship promotes not to write methods accepting a single boolean parameter because each method should do one thing and one thing only. If a method takes a boolean parameter to decide what to do, it automatically does two things: deciding what to do and actually doing something. The method should be refactored into two separate methods doing something and a single method deciding which of the two methods to call.
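A minimal sketch of that refactoring (the method names and scenario here are invented purely for illustration):

// Before: one method that both decides and acts.
public void Render(bool asSummary)
{
    if (asSummary)
    {
        // ... build the summary view ...
    }
    else
    {
        // ... build the detailed view ...
    }
}

// After: two methods that each do one thing; the caller decides which to call.
public void RenderSummary() { /* ... build the summary view ... */ }
public void RenderDetails() { /* ... build the detailed view ... */ }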
Furthermore, evaluating a boolean value using value == true is redundant and unnecessary. The value itself already represents a boolean state (true / false) and does not need to be compared to true again. In short, the best practice is to use if (value) instead of if (value == true) (or if ((value == true) == true), which seems idiotic but does not differ much from the if (value == true) approach).
I find the answer to this question to be fairly obvious - unless I'm missing something. Adapt to each situation.
The called function should do what it's intended to do. If its intention is to work on some set of arguments, by all means do the checking inside the function.
If you plan to call the function conditionally, do the checking outside.
Moving the check inside just so you can save some extra verification is not a good idea I think, since others might want to call your function and not know whether it actually works given their parameters. I say, unless checking inside is imperative, leave the checking outside.
EDIT:
I just re-read your question...
You basically have:
void foo(bool actuallyExecuteFoo)
{
    // ...
}
Really? REALLY?
But if you have many conditions the validation can get complicated and the code gets longer and longer.
If the validation is complicated, it means that the logic underneath is complicated. Expect your code to be as complicated as your logic - it has to be somewhere out there, right? Just think how to write it in a clean way. And the clean way is not always the shortest way.
I recommend this variant:
if (value == true)
{
    DoSomething();
}
Why? Because:
the code calling DoSomething is then more clear (*), as it explicitly shows when the logic of DoSomething should be executed and when not,
DoSomething itself depends on less parameters (which makes it more generic and reusable).
*) Yes, "more clear" actually means "longer" here, but it also means "explicit" and "self-documenting". The shorter variant actually tries to hide some logic, which makes the code less clear.
Check this out for C#; not sure why you need C++ :)
If the method can throw an ArgumentException/ArgumentNullException, then you should validate the input before calling the method. Also, Microsoft's Code Contracts will tell you if you need to validate the input before calling the method, for any code using contracts (basically assertions for static analysis).
The general rule is not to validate the input more than necessary. If something isn't valid, you should throw an exception (C#) or return an error (C++). Silently not executing the code because of invalid input, without saying why, makes it nearly impossible for the next developer to figure out what the problem is.
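For example, here is a sketch of a guard that fails loudly instead of silently skipping the work (the method and parameter names are arbitrary):

public void Publish(object document)
{
    if (document == null)
    {
        // Tell the caller exactly what was wrong instead of doing nothing.
        throw new ArgumentNullException("document");
    }

    // ... the rest of the method runs only with valid input ...
}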
I would recommend the second way, but I have some remarks:
You do not need to check if (value == true), just check if (value) instead.
Return early; what I mean is: if (!value) { return; }
The second way will take more execution time, though the code will look better. Why not use a macro?
#define DOSOMETHING(value) if (value) {DoSomething();}
Replace every occurrence of

if (value == true) { DoSomething(); }

with the macro DOSOMETHING(value). Your purpose will be served and the code will look better.
So I came across some code this morning that looked like this:
try
{
    x = SomeThingDangerous();
    return x;
}
catch (Exception ex)
{
    throw new DangerousException(ex);
}
finally
{
    CleanUpDangerousStuff();
}
Now this code compiles fine and works as it should, but it just doesn't feel right to return from within a try block, especially if there's an associated finally.
My main issue is what happens if the finally block throws an exception of its own? You've got a returned variable but also an exception to deal with... so I'm interested to know what others think about returning from within a try block.
No, it's not a bad practice. Putting return where it makes sense improves readability and maintainability and makes your code simpler to understand. You needn't worry: the finally block will be executed even if a return statement is encountered.
The finally will be executed no matter what, so it doesn't matter.
Personally, I would avoid this kind of coding, as I don't like seeing return statements before finally blocks.
My mind is simple and processes things rather linearly. So when I dry-run the code, I tend to think that once I've reached the return statement, nothing that follows matters, which is obviously wrong in this case (not that it would affect the return value, but there could be side effects).
Thus, I would arrange the code so that the return statement always appears after the finally block.
This may answer your question
What really happens in a try { return x; } finally { x = null; } statement?
From reading that question, it sounds like you can put another try/catch block inside the finally block if you think it might throw an exception. The compiler will figure out when to return the value.
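To illustrate what that linked question describes, here is a small sketch: the return value is evaluated before the finally block runs, so the finally cannot change what the caller receives.

static int GetValue()
{
    int x = 1;
    try
    {
        return x;    // the value 1 is captured here as the return value
    }
    finally
    {
        x = 99;      // runs after the return value was captured; the caller still gets 1
    }
}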
That said, it might be better to restructure your code anyway just so it doesn't confuse you later on or someone else who may be unaware of this as well.
Functionally there is no difference.
However there is one reason for not doing this. Longer methods with several exit points are often more difficult to read and analyze. But that objection has more to do with return statements than catch and finally blocks.
In your example either way is equivalent; I wouldn't even be surprised if the compiler generated the same code. If an exception happens in the finally block, you have the same issues whether you put the return statement inside the block or outside of it.
The real question is which is stylistically best. I like to write my methods so that there is only one return statement; that way it is easier to see the flow out of the method. It follows that I also like to put the return statement last, so it is easy to see that it is the end of the method and this is what it returns.
I think that with the return statement so neatly placed as the last statement, others are less likely to come along and sprinkle multiple return statements into other parts of the method.
When I ran ReSharper on my code, for example:
if (some condition)
{
    Some code...
}
ReSharper gave me the above warning (Invert "if" statement to reduce nesting), and suggested the following correction:
if (!some condition) return;
Some code...
I would like to understand why that's better. I always thought that using "return" in the middle of a method is problematic, somewhat like "goto".
It is not only aesthetic, but it also reduces the maximum nesting level inside the method. This is generally regarded as a plus because it makes methods easier to understand (and indeed, many static analysis tools provide a measure of this as one of the indicators of code quality).
On the other hand, it also makes your method have multiple exit points, something that another group of people believes is a no-no.
Personally, I agree with ReSharper and the first group (in a language that has exceptions I find it silly to discuss "multiple exit points"; almost anything can throw, so there are numerous potential exit points in all methods).
Regarding performance: both versions should be equivalent (if not at the IL level, then certainly after the jitter is through with the code) in every language. Theoretically this depends on the compiler, but practically any widely used compiler of today is capable of handling much more advanced cases of code optimization than this.
A return in the middle of the method is not necessarily bad. It might be better to return immediately if it makes the intent of the code clearer. For example:
double getPayAmount() {
    double result;
    if (_isDead) result = deadAmount();
    else {
        if (_isSeparated) result = separatedAmount();
        else {
            if (_isRetired) result = retiredAmount();
            else result = normalPayAmount();
        }
    }
    return result;
}
In this case, if _isDead is true, we can immediately get out of the method. It might be better to structure it this way instead:
double getPayAmount() {
    if (_isDead) return deadAmount();
    if (_isSeparated) return separatedAmount();
    if (_isRetired) return retiredAmount();
    return normalPayAmount();
}
I've picked this code from the refactoring catalog. This specific refactoring is called: Replace Nested Conditional with Guard Clauses.
This is a bit of a religious argument, but I agree with ReSharper that you should prefer less nesting. I believe that this outweighs the negatives of having multiple return paths from a function.
The key reason for having less nesting is to improve code readability and maintainability. Remember that many other developers will need to read your code in the future, and code with less indentation is generally much easier to read.
Preconditions are a great example of where it is okay to return early at the start of the function. Why should the readability of the rest of the function be affected by the presence of a precondition check?
As for the negatives about returning multiple times from a method - debuggers are pretty powerful now, and it's very easy to find out exactly where and when a particular function is returning.
Having multiple returns in a function is not going to affect the maintenance programmer's job.
Poor code readability will.
As others have mentioned, there shouldn't be a performance hit, but there are other considerations. Aside from those valid concerns, this also can open you up to gotchas in some circumstances. Suppose you were dealing with a double instead:
public void myfunction(double exampleParam) {
    if (exampleParam > 0) {
        // Body will *not* be executed if Double.IsNaN(exampleParam)
    }
}
Contrast that with the seemingly equivalent inversion:
public void myfunction(double exampleParam) {
    if (exampleParam <= 0)
        return;

    // Body *will* be executed if Double.IsNaN(exampleParam)
}
So in certain circumstances what appears to be a correctly inverted if might not be.
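One way to invert safely in that situation (just a sketch) is to negate the whole original condition instead of flipping the comparison operator, which preserves the NaN behaviour:

public void myfunction(double exampleParam) {
    if (!(exampleParam > 0))
        return;    // NaN fails (exampleParam > 0), so we return here, matching the nested version

    // Body is *not* executed if Double.IsNaN(exampleParam), just like the original
}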
The idea of only returning at the end of a function goes back to the days before languages supported exceptions. It let programs rely on being able to put clean-up code at the end of a method and be sure it would be called, so that some other programmer couldn't hide a return in the method that caused the clean-up code to be skipped. Skipped clean-up code could result in a memory or resource leak.
However, in a language that supports exceptions, it provides no such guarantee. The execution of any statement or expression can cause a control flow that ends the method, which means clean-up must be done with finally blocks or using statements.
Anyway, I'm saying I think a lot of people quote the 'only return at the end of a method' guideline without understanding why it was ever a good thing to do, and that reducing nesting to improve readability is probably a better aim.
I'd like to add that there is a name for those inverted ifs: guard clauses. I use them whenever I can.
I hate reading code where there is an if at the beginning, two screens of code, and no else. Just invert the if and return. That way nobody wastes time scrolling.
http://c2.com/cgi/wiki?GuardClause
It doesn't only affect aesthetics, but it also prevents code nesting.
It can actually function as a precondition to ensure that your data is valid as well.
This is of course subjective, but I think it strongly improves on two points:
It is now immediately obvious that your function has nothing left to do if the condition holds.
It keeps the nesting level down. Nesting hurts readability more than you'd think.
Multiple return points were a problem in C (and to a lesser extent C++) because they forced you to duplicate clean-up code before each of the return points. With garbage collection, the try/finally construct and using blocks, there's really no reason why you should be afraid of them.
Ultimately it comes down to what you and your colleagues find easier to read.
Guard clauses or pre-conditions (as you can probably see) check to see if a certain condition is met and then breaks the flow of the program. They're great for places where you're really only interested in one outcome of an if statement. So rather than say:
if (something) {
    // a lot of indented code
}
You reverse the condition and break if that reversed condition is fulfilled
if (!something) return false; // or another value to show your other code the function did not execute
// all the code from before, save a lot of tabs
return is nowhere near as dirty as goto. It allows you to pass a value to show the rest of your code that the function couldn't run.
You'll see the best examples of where this can be applied in nested conditions:
if (something) {
    do-something();

    if (something-else) {
        do-another-thing();
    } else {
        do-something-else();
    }
}
vs:
if (!something) return;
do-something();
if (!something-else) return do-something-else();
do-another-thing();
You'll find few people arguing the first is cleaner but of course, it's completely subjective. Some programmers like to know what conditions something is operating under by indentation, while I'd much rather keep method flow linear.
I won't suggest for one moment that precons will change your life or get you laid but you might find your code just that little bit easier to read.
Performance-wise, there will be no noticeable difference between the two approaches.
But coding is about more than performance. Clarity and maintainability are also very important. And, in cases like this where it doesn't affect performance, it is the only thing that matters.
There are competing schools of thought as to which approach is preferable.
One view is the one others have mentioned: the second approach reduces the nesting level, which improves code clarity. This is natural in an imperative style: when you have nothing left to do, you might as well return early.
Another view, from the perspective of a more functional style, is that a method should have only one exit point. Everything in a functional language is an expression, so if statements must always have else clauses; otherwise the if expression wouldn't always have a value. So in the functional style, the first approach is more natural.
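In C#, that expression-oriented style can be written with the conditional operator, borrowing the hypothetical pay-amount example from an earlier answer: one expression, one exit point.

double getPayAmount()
{
    return _isDead      ? deadAmount()
         : _isSeparated ? separatedAmount()
         : _isRetired   ? retiredAmount()
         :                normalPayAmount();
}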
There are several good points made here, but multiple return points can be unreadable as well, if the method is very lengthy. That being said, if you're going to use multiple return points just make sure that your method is short, otherwise the readability bonus of multiple return points may be lost.
Performance is in two parts. You have performance when the software is in production, but you also want to have performance while developing and debugging. The last thing a developer wants is to "wait" for something trivial. In the end, compiling this with optimization enabled will result in similar code. So it's good to know these little tricks that pay off in both scenarios.
The case in the question is clear, ReSharper is correct. Rather than nesting if statements, and creating new scope in code, you're setting a clear rule at the start of your method. It increases readability, it will be easier to maintain, and it reduces the amount of rules one has to sift through to find where they want to go.
Personally I prefer only 1 exit point. It's easy to accomplish if you keep your methods short and to the point, and it provides a predictable pattern for the next person who works on your code.
For example:
bool PerformDefaultOperation()
{
    bool succeeded = false;

    DataStructure defaultParameters;
    if ((defaultParameters = this.GetApplicationDefaults()) != null)
    {
        succeeded = this.DoSomething(defaultParameters);
    }

    return succeeded;
}
This is also very useful if you just want to check the values of certain local variables within a function before it exits. All you need to do is place a breakpoint on the final return and you are guaranteed to hit it (unless an exception is thrown).
Avoiding multiple exit points can lead to performance gains. I am not sure about C# but in C++ the Named Return Value Optimization (Copy Elision, ISO C++ '03 12.8/15) depends on having a single exit point. This optimization avoids copy constructing your return value (in your specific example it doesn't matter). This could lead to considerable gains in performance in tight loops, as you are saving a constructor and a destructor each time the function is invoked.
But for 99% of the cases saving the additional constructor and destructor calls is not worth the loss of readability nested if blocks introduce (as others have pointed out).
Many good points have been made about how the code looks. But what about the compiled result?
Let's take a look at some C# code and its compiled IL:
using System;

public class Test {
    public static void Main(string[] args) {
        if (args.Length == 0) return;
        if ((args.Length + 2) / 3 == 5) return;

        Console.WriteLine("hey!!!");
    }
}
This simple snippet can be compiled. You can open the generated .exe file with ildasm and inspect the result. I won't post the whole disassembly, but I'll describe the results.
The generated IL code does the following:
If the first condition is false, it jumps to the code where the second one is evaluated.
If it's true, it jumps to the last instruction. (Note: the last instruction is a return.)
For the second condition, the same happens after the result is calculated: compare, then go to the Console.WriteLine if false, or to the end if true.
Print the message and return.
So it seems that the code will jump to the end. What if we do a normal if with nested code?
using System;

public class Test {
    public static void Main(string[] args) {
        if (args.Length != 0 && (args.Length + 2) / 3 != 5)
        {
            Console.WriteLine("hey!!!");
        }
    }
}
The results are quite similar in IL instructions. The difference is that before there were two jumps per condition: if false go to next piece of code, if true go to the end. And now the IL code flows better and has 3 jumps (the compiler optimized this a bit):
First jump: when Length is 0, to a spot where the code jumps again (the third jump) to the end.
Second: in the middle of the second condition, to skip one instruction.
Third: if the second condition is false, jump to the end.
Anyway, the program counter will always jump.
In theory, inverting if could lead to better performance if it increases branch prediction hit rate. In practice, I think it is very hard to know exactly how branch prediction will behave, especially after compiling, so I would not do it in my day-to-day development, except if I am writing assembly code.
More on branch prediction here.
That is simply controversial. There is no "agreement among programmers" on the question of early return. It's always subjective, as far as I know.
It's possible to make a performance argument, since it's better to have conditions that are written so they are most often true; it can also be argued that it is clearer. It does, on the other hand, create nested tests.
I don't think you will get a conclusive answer to this question.
There are a lot of insightful answers here already, but I would still like to point to a slightly different situation: instead of a precondition, which indeed should go at the top of a function, think of step-by-step initialization, where each step must be checked for success before continuing with the next. In that case, you cannot check everything at the top.
I found my code really unreadable when writing an ASIO host application with Steinberg's ASIOSDK, because I followed the nesting paradigm. It went about eight levels deep, and I can't see a design flaw there, as mentioned by Andrew Bullock above. Of course, I could have moved some of the inner code into another function and nested the remaining levels there to make it more readable, but that seems rather arbitrary to me.
By replacing nesting with guard clauses, I even discovered a misconception of mine regarding a portion of clean-up code that should have run much earlier in the function instead of at the end. With nested branches, I would never have seen that; you could even say the nesting led to my misconception.
So this might be another situation where inverted ifs can contribute to a clearer code.
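A rough sketch of what that step-by-step style looks like with guard clauses (the step names are made up; the real ASIO calls would go where these stubs are):

bool InitializeHost()
{
    // Each step must succeed before the next one makes sense.
    if (!LoadDriver())        return false;
    if (!CreateBuffers())     return false;
    if (!RegisterCallbacks()) return false;
    if (!StartProcessing())   return false;

    return true;
}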
It's a matter of opinion.
My normal approach would be to avoid single line ifs, and returns in the middle of a method.
You wouldn't want lines like the suggested one everywhere in your method, but there is something to be said for checking a bunch of assumptions at the top of your method, and only doing your actual work if they all pass.
In my opinion, an early return is fine if you are just returning void (or some useless return code you're never going to check); it can improve readability because you avoid nesting while making it explicit that your function is done.
If you are actually returning a value, nesting is usually the better way to go, because you return your value in just one place (at the end) and it can make your code more maintainable in a whole lot of cases.
I'm not sure, but I think R# tries to avoid far jumps. When you have an IF-ELSE, the compiler does something like this:
Condition false -> far jump to false_condition_label

true_condition_label:
    instruction1
    ...
    instruction_n

false_condition_label:
    instruction1
    ...
    instruction_n

end block
If the condition is true there is no jump and the L1 cache doesn't have to be refilled, but the jump to false_condition_label can be very far away and the processor must load new instructions into its cache. Synchronising the cache is expensive. R# tries to replace far jumps with short jumps, and in that case there is a higher probability that all the instructions are already in the cache.
I think it depends on what you prefer; as mentioned, there's no general agreement AFAIK.
To reduce the annoyance, you can downgrade this kind of warning to "Hint".
My idea is that the return "in the middle of a function" shouldn't be so "subjective".
The reason is quite simple, take this code:
function do_something( data ){

    if (!is_valid_data( data ))
        return false;

    do_something_that_take_an_hour( data );

    instance = new object_with_very_painful_constructor( data );
    if ( instance is not valid ) {
        error_message( );
        return;
    }

    connect_to_database( );
    get_some_other_data( );
    return;
}
Maybe the first "return" isn't so intuitive, but it really saves you.
There are too many "ideas" about clean code; it simply takes more practice to shed the "subjectively" bad ones.
There are several advantages to this sort of coding, but for me the big win is that if you can return quickly, you can improve the speed of your application: I know that, because of precondition X, I can return quickly with an error. This gets rid of the error cases first and reduces the complexity of your code. In a lot of cases, because the CPU pipeline can now be cleaner, it can avoid pipeline stalls or switches.

Secondly, if you are in a loop, breaking or returning out quickly can save you a lot of CPU. Some programmers use loop invariants to do this sort of quick exit, but that can spoil your CPU pipeline and even create memory-seek problems, meaning the CPU needs to load from outside the cache.

But basically I think you should do what you intended: end the loop or function, rather than create a complex code path just to implement some abstract notion of correct code. If the only tool you have is a hammer then everything looks like a nail.