There is some code to execute after validation.
Consider a variable SOQualityStandards = true;
This variable is validated before the code executes.
I have come across two ways of checking SOQualityStandards.
One is:
if(SOQualityStandards)
{
//code to execute
}
and the other is:
if(!SOQualityStandards) return;
//code to execute
Is there any performance difference between the two? Which one should I use?
They have the same semantics (assuming there is no other code in the function after the if-block in the first example).
I find the first to be clearer, but that is a matter of personal preference.
The compiler will treat those two options as the same and could transform one into the other or vice versa, so performance considerations are irrelevant. Even if performance were affected, I think readability and maintainability are the larger issue in questions like these anyway.
I tend to do the return at the beginning in cases like this, because it reduces the indentations and mental burden in reading the rest of the method. The states tested in returns at the beginning become states that no longer need to be considered in understanding the method. Large if blocks, on the other hand, require mentally tracking the state differences throughout the method.
This becomes especially important if there are several tests that need to be done to guard an interior block of code.
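For example, here is a rough sketch of that pattern with several guards up front (the Order type and the extra checks are made up for illustration; SOQualityStandards is the flag from the question):
private void ProcessOrder(Order order)
{
    if (!SOQualityStandards) return;      // the validation flag from the question
    if (order == null) return;            // nothing to process
    if (order.Items.Count == 0) return;   // nothing to do for an empty order

    // From this point on, all of the states above have already been ruled out,
    // so the rest of the method can be read without tracking them.
    // ...code to execute...
}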
There's no difference in the two approaches. It's a matter of personal choice.
Choosing between the two is just personal preference (see Sven's answer and bomslang's answer). Micro-optimization is in most cases completely unnecessary.
You shouldn't optimize execution time until you see that it's a problem. You could spend that valuable time adding other functionality or coming up with improvements to the system architecture.
In the case you actually need to optimize, loops and recursive functions are generally the first place to look.
If you would need further optimization than that, single line variable checks and manipulations would still be some of the last things to optimize.
Remember (as Jeffrey said) readability and maintainability are in most cases the most important factors.
Related
Please ignore code readability in this question.
In terms of performance, should the following code be written like this:
int maxResults = criteria.MaxResults;
if (maxResults > 0)
{
while (accounts.Count > maxResults)
accounts.RemoveAt(maxResults);
}
or like this:
if (criteria.MaxResults > 0)
{
while (accounts.Count > criteria.MaxResults)
accounts.RemoveAt(criteria.MaxResults);
}
?
Edit: criteria is a class, and MaxResults is a simple integer property (i.e., public int MaxResults { get { return _maxResults; } }).
Does the C# compiler treat MaxResults as a black box and evaluate it every time? Or is it smart enough to figure out that I've got 3 calls to the same property with no modification of that property between the calls? What if MaxResults was a field?
One of the laws of optimization is precalculation, so I instinctively wrote this code like the first listing, but I'm curious if this kind of thing is being done for me automatically (again, ignore code readability).
(Note: I'm not interested in hearing the 'micro-optimization' argument, which may be valid in the specific case I've posted. I'd just like some theory behind what's going on or not going on.)
First off, the only way to actually answer performance questions is to actually try it both ways and test the results in realistic conditions.
That said, the other answers which say that "the compiler" does not do this optimization because the property might have side effects are both right and wrong. The problem with the question (aside from the fundamental problem that it simply cannot be answered without actually trying it and measuring the result) is that "the compiler" is actually two compilers: the C# compiler, which compiles to MSIL, and the JIT compiler, which compiles IL to machine code.
The C# compiler never ever does this sort of optimization; as noted, doing so would require that the compiler peer into the code being called and verify that the result it computes does not change over the lifetime of the callee's code. The C# compiler does not do so.
The JIT compiler might. No reason why it couldn't. It has all the code sitting right there. It is completely free to inline the property getter, and if the jitter determines that the inlined property getter returns a value that can be cached in a register and re-used, then it is free to do so. (If you don't want it to do so because the value could be modified on another thread then you already have a race condition bug; fix the bug before you worry about performance.)
Whether the jitter actually does inline the property fetch and then enregister the value, I have no idea. I know practically nothing about the jitter. But it is allowed to do so if it sees fit. If you are curious about whether it does so or not, you can either (1) ask someone who is on the team that wrote the jitter, or (2) examine the jitted code in the debugger.
And finally, let me take this opportunity to note that computing results once, storing the result and re-using it is not always an optimization. This is a surprisingly complicated question. There are all kinds of things to optimize for:
execution time
executable code size -- this has a major effect on execution time because big code takes longer to load, increases the working set size, and puts pressure on processor caches, RAM and the page file. Small slow code is often, in the long run, faster than big fast code in important metrics like startup time and cache locality.
register allocation -- this also has a major effect on execution time, particularly in architectures like x86 which have a small number of available registers. Enregistering a value for fast re-use can mean that there are fewer registers available for other operations that need optimization; perhaps optimizing those operations instead would be a net win.
and so on. It gets real complicated real fast.
In short, you cannot possibly know whether writing the code to cache the result rather than recomputing it is actually (1) faster, or (2) better performing. Better performance does not always mean making execution of a particular routine faster. Better performance is about figuring out what resources are important to the user -- execution time, memory, working set, startup time, and so on -- and optimizing for those things. You cannot do that without (1) talking to your customers to find out what they care about, and (2) actually measuring to see if your changes are having a measurable effect in the desired direction.
If MaxResults is a property then no, it will not optimize it, because the getter may have complex logic, say:
private int _maxResults;
public int MaxResults {
get { return _maxResults++; }
set { _maxResults = value; }
}
See how the behavior would change if the compiler cached that result for you?
If there's no logic...either method you wrote is fine, it's a very minute difference and all about how readable it is TO YOU (or your team)...you're the one looking at it.
Your two code samples are only guaranteed to have the same result in single-threaded environments, which .Net isn't, and if MaxResults is a field (not a property). The compiler can't assume, unless you use the synchronization features, that criteria.MaxResults won't change during the course of your loop. If it's a property, it can't assume that using the property doesn't have side effects.
Eric Lippert points out quite correctly that it depends a lot on what you mean by "the compiler". The C# -> IL compiler? Or the IL -> machine code (JIT) compiler? And he's right to point out that the JIT may well be able to optimize the property getter, since it has all of the information (whereas the C# -> IL compiler doesn't, necessarily). It won't change the situation with multiple threads, but it's a good point nonetheless.
It will be called and evaluated every time. The compiler has no way of determining if a method (or getter) is deterministic and pure (no side effects).
Note that actual evaluation of the property may be inlined by the JIT compiler, making it effectively as fast as a simple field.
It's good practice to make property evaluation an inexpensive operation. If you do some heavy calculation in the getter, consider caching the result manually, or changing it to a method.
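A rough sketch of that kind of manual caching, assuming some expensive ComputeMaxResults method (the name is made up for this example):
private int? _maxResults;   // backing cache, filled on first access

public int MaxResults
{
    get
    {
        if (_maxResults == null)
            _maxResults = ComputeMaxResults();   // hypothetical expensive calculation
        return _maxResults.Value;
    }
}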
Why not test it?
Just set up two console apps, make each loop 10 million times, and compare the results. Remember to run them as properly built, properly installed release apps, or else you cannot guarantee that you are not just measuring unoptimised MSIL.
Really, you are probably going to get about five answers saying 'you shouldn't worry about optimisation'. They clearly do not write routines that need to be as fast as possible before being readable (e.g. games).
If this piece of code is part of a loop that is executed billions of times, then this optimisation could be worthwhile. For instance, MaxResults could be an overridden virtual member, in which case you may also need to consider the cost of virtual method calls.
Really, the ONLY way to answer any of these questions is to figure out whether this is a piece of code that will benefit from optimisation. Then you need to know what kinds of things are increasing the time to execute. Really, us mere mortals cannot do this a priori, so we simply have to try two or three different versions of the code and test them.
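If you do want to try it, a minimal sketch of that kind of test might look like this (Criteria here is a stand-in for the class in the question; build in release mode and run outside the debugger):
using System;
using System.Diagnostics;

class Criteria { public int MaxResults { get; set; } }

class Program
{
    static void Main()
    {
        var criteria = new Criteria { MaxResults = 50 };
        long sum = 0;   // accumulate something so the loops are not optimised away

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < 10000000; i++)
        {
            if (criteria.MaxResults > 0)      // variant A: read the property every time
                sum += criteria.MaxResults;
        }
        sw.Stop();
        Console.WriteLine("Property each time: {0} ms", sw.ElapsedMilliseconds);

        int maxResults = criteria.MaxResults; // variant B: hoist the value into a local
        sw = Stopwatch.StartNew();
        for (int i = 0; i < 10000000; i++)
        {
            if (maxResults > 0)
                sum += maxResults;
        }
        sw.Stop();
        Console.WriteLine("Local copy: {0} ms (sum = {1})", sw.ElapsedMilliseconds, sum);
    }
}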
If criteria is a class type, I doubt it would be optimized, because another thread could always change that value in the meantime. For structs I'm not sure, but my gut feeling is that it won't be optimized, but I think it wouldn't make much difference in performance in that case anyhow.
Consider the following code sample:
private void AddEnvelope(MailMessage mail)
{
if (this.CopyEnvelope)
{
// Perform a few operations
}
}
vs
private void AddEnvelope(MailMessage mail)
{
if (!this.CopyEnvelope) return;
// Perform a few operations
}
Will the bottom code execute any faster? Why would ReSharper make this recommendation?
Update
Having thought about this question the answer might seem obvious to some. But lots of us developers were never in the habit of nesting zounds of if statements in the first place...
It doesn't matter. Stop agonizing over performance issues that don't exist - use a profiler to identify areas in your code that DO exhibit issues, and fix them. Proactive optimization - before you know that there is a problem - is by definition a waste of time.
Updated Answer:
It's a code maintainability suggestion. Easier to read than nesting the rest of the code in an IF statement. Examples/discussion of this can be seen at the following links:
Flattening Arrow Code
Replace Nested Conditional With Guard Clauses
Code Contracts section "Legacy Requires Statements"
Original Answer:
It will actually run (very negligibly) slower, from having to perform a NOT operation.
The difference is so small, in fact, that some people consider the second form the prettier way to code, as it avoids an extra level of indentation for the bulk of the code.
It's a refactor of a conditional that encompasses the entire method contents to a Guard Clause. It has nothing to do with optimization.
I like the comments about optimizing things like this, to add a little more to it...
The only time I can think of that it makes sense to optimize your if statements is when you have the results of TWO or more long-running methods that need to be combined to determine whether to do something else. You would only want to execute the second operation if the first operation yielded a result that passes the condition. Putting the one that is most likely to return false first is generally the smarter choice, because if it is false, the second one will not be evaluated at all. Again, this is only worth worrying about if the operations are significant and you can predict which is more likely to pass or fail. Invert this for OR: if the first is true, only the first will be evaluated, so optimize that way.
i.e.
if (ThisOneUsuallyPasses() && ThisOneUsuallyFails())
isn't so good as
if (ThisOneUsuallyFails() && ThisOneUsuallyPasses())
because it's only in the odd case where the first one actually passes that you have to evaluate the second. There are some other flavors of this you can derive, but I think you get the point.
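For the OR case, the same (hypothetical) methods would be ordered the other way around:
if (ThisOneUsuallyPasses() || ThisOneUsuallyFails())  // when the first returns true, the second is never evaluated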
Better to worry about how you use strings, collections, index your database, and allocate objects than spend a lot of time worrying about single condition if statements if you are worrying about perf.
In general, what the bottom code you give will do is give you an opportunity to avoid a huge block of code inside an if statement, which can lead to silly typo-driven errors. Old-school thinking was that you should have only one point of return from a method, to avoid a different breed of coder error. Current thinking (at least by some of the tool vendors, such as JetBrains ReSharper) seems to be that wrapping the least amount of code inside conditional statements is better. Anything more than that would be subjective, so I'll leave it at that.
This kind of "optimization" is not worth the time spent refactoring your code, because modern compilers already do enough small optimizations to make tips like this trivial.
As mentioned above, performance optimization is done with a profiler: measure how your system is performing and where the potential bottlenecks are before applying a performance fix, and then measure again afterwards to see whether your fix did any good.
Required reading: Cyclomatic_complexity
Cyclomatic Complexity is a quantitative measure of the number of linearly independent paths through a program's source code.
Which means, every time you branch using an if statement, you increase the Cyclomatic Complexity by 1.
To test each linearly independent path through the program, the number of test cases needed will equal the cyclomatic complexity of the program.
Which means, if you want to test your code completely, for each if statement you would have to introduce a new test case.
So, by introducing more if statements the complexity of your code increases, as does the number of test cases required to test it.
By removing if statements, your code complexity decreases, as does the number of test cases required to test it.
I need to get three objects out of a function. My instinct is to create a new type to return the three references, or, if they were all the same type, I could use an array. However, pass-by-ref is easier:
private void Mutate_AddNode_GetGenes(ref NeuronGene newNeuronGene, ref ConnectionGene newConnectionGene1, ref ConnectionGene newConnectionGene2)
{
}
There's obviously nothing wrong with this, but I hesitate to use this approach, mostly, I think, for reasons of aesthetics and psychological bias. Are there actually any good reasons to use one of these approaches over the others? Perhaps a performance issue with creating extra wrapper objects or pushing parameters onto the stack. Note that in my particular case this is CPU-intensive code. CPU cycles matter.
Is there a more elegant C#2 or C#3 approach?
Thanks.
For almost all computing problems, you will not notice the CPU difference. Since your sample code has the word "Gene" in it, you may actually fall into the rare category of code that would notice.
Creating and destroying objects just to wrap other objects would cost a bit of performance (they need to be created and garbage collected after all).
Aesthetically I would not create an object just to group unrelated objects, but if they logically belong together it is perfectly fine to define a containing object.
If you're worried about the performance of a wrapping type (which is a lot cleaner, IMHO), you should use a struct. Current 32-bit implementations of .NET (and the upcoming 64-bit 4.0) support inlining / optimizing away of structs in many cases, so you'd probably see no performance difference whatsoever between a struct and ref arguments.
Worrying about the relative execution speed of those two options is probably a premature optimization. Focus on getting the algorithm correct first, and on having clean, maintainable code. When that's done, you can run a profiler on it and optimize the 20% of the code that takes 80% of the CPU time. Even if this method ends up being in that 20%, the difference between the two calling styles is probably too small to register.
So, performance issues aside, I'd probably use a container class. Since this method takes only those three parameters, and (presumably) modifies each one, it sounds like it would make sense to have it as a method of the container class, with three member variables instead of ref parameters.
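As a rough sketch of that container approach, reusing the gene types from the question (whether to make it a struct or a class is exactly the trade-off discussed above):
// Hypothetical container for the three results; a struct avoids an extra heap allocation,
// while a class would let the mutation logic live on the container itself.
public struct AddNodeGenes
{
    public NeuronGene NewNeuronGene;
    public ConnectionGene NewConnectionGene1;
    public ConnectionGene NewConnectionGene2;
}

private AddNodeGenes Mutate_AddNode_GetGenes()
{
    var result = new AddNodeGenes();
    // ...populate result.NewNeuronGene, result.NewConnectionGene1 and result.NewConnectionGene2...
    return result;
}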
Lately, I have taken to the pattern of having a lot of diagnostic logging in parts of my code, making use of lambda expressions/anonymous delegates like so:
MyEventManager.LogVerbose( LogCategory.SomeCategory, () => String.Format(msg_string, GetParam1(), GetParam2(), GetParam3()) );
Notice that the second argument to LogVerbose is a lambda expression which evaluates to a string. The reason for this is that if verbose logging is not actually enabled, LogVerbose should exit having done as little work as possible, in order to minimize performance impact. The construction of the error message string may, in some cases, take time or resources, and if the lambda expression is never evaluated, that performance penalty will not be incurred.
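For context, the LogVerbose side of this presumably looks something like the following (this is my assumption about MyEventManager, not its actual code; IsVerboseEnabled and WriteLog are placeholder names):
public static void LogVerbose(LogCategory category, Func<string> messageBuilder)
{
    // Bail out before doing any work if verbose logging is off.
    if (!IsVerboseEnabled(category))
        return;

    // The message string is only ever built here, when it will actually be used.
    WriteLog(category, messageBuilder());
}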
I'm wondering if littering the type system with so many anonymous delegates like this will have some unforeseen consequence for application performance, or if there are any other strategies I should consider.
It should be fine. In particular, if your anonymous function doesn't capture anything, it is cached as a static field (because it can be). If you capture "this" then you'll end up creating new delegate instances, but they're not expensive.
If you capture local variables, that will involve instantiating a nested type - but I'd only worry about this if you saw it actually becoming a problem. As ever with optimisation, focus on readability first, measure the performance, and then profile it to find out where you need to concentrate your efforts.
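As a small illustration of the cases described above (Log and GetUserName are made-up names):
// No captures: the compiler can cache the delegate in a static field and reuse it.
Log(() => "cache miss");

// Captures a local variable: a small closure object is allocated to hold userName.
string userName = GetUserName();
Log(() => String.Format("cache miss for user {0}", userName));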
Whilst I don't actually know the answer to this for sure, I think it's worth considering that the drive towards a more functional style of programming in C# would be seriously undermined if there were any suggestion of a practical limit on the use of such expressions.
I've got a solution with thousands of anonymous delegates, and it still works. Sometimes Visual Studio is a little clunky, but whether that's because we've got hundreds of projects, or because of this, or some other factor, is unknown. The performance of the applications doesn't seem to be strongly affected (and we've done quite a bit of perf testing).
I am getting two contradicting views on this. One source says there should be fewer, larger methods to reduce the number of method calls, but another source says writing shorter methods is good for letting the JIT do its optimization.
So, which side is correct?
The overhead of actually making the method call is inconsequentially small in almost every case. You never need to worry about it unless you can clearly identify a problem down the road that requires revisiting the issue (you won't).
It's far more important that your code is simple, readable, modular, maintainable, and modifiable. Methods should do one thing, one thing only and delegate sub-things to other routines. This means your methods should be as short as they can possibly be, but not any shorter. You will see far more performance benefits by having code that is less prone to error and bugs because it is simple, than by trying to outsmart the compiler or the runtime.
The source that says methods should be long is wrong, on many levels.
Neither. You should have relatively short methods, for readability.
There is no single simple rule about function size. The guideline should be that a function does 'one thing'. That's a little vague, but it becomes easier with experience. Small functions generally lead to readability. Big ones are occasionally necessary.
Worrying about the overhead of method calls is premature optimization.
As always, it's about finding a good balance. The most important thing is that the method does one thing only. Longer methods tend to do more than one thing.
The best single criterion to guide you in sizing methods is to keep them well-testable. If you can (and actually DO!-) thoroughly unit-test every single method, your code is likely to be quite good; if you skimp on testing, your code is likely to be, at best, mediocre. If a method is difficult to test thoroughly, then that method is likely to be "too big" -- trying to do too many things, and therefore also harder to read and maintain (as well as badly-tested and therefore a likely haven for bugs).
First of all, you should definitely not be micro-optimizing performance at the number-of-methods level. You will most likely not get any measurable performance benefit. Only if some method is being called in a tight loop millions of times might it be worth considering - and even then, don't begin optimizing it before you need to.
You should stick to short, concise methods that do one thing, which makes the intent of the method clear. This will give you easier-to-read code that is easier to understand and promotes code reuse.
The most important cost to consider when writing code is maintainability. You will spend much, much more time maintaining an application and fixing bugs than you ever will fixing performance problems.
In this case, the almost certainly insignificant cost of calling a method is incredibly small compared to the cost of maintaining a large, unwieldy method. Small, concise methods are easier to maintain and comprehend. Additionally, the cost of calling the method will almost certainly not have a significant performance impact on your application; and if it does, you can only ascertain that by using a profiler. Developers are notoriously bad at identifying performance problems beforehand.
Generally speaking, once a performance problem is identified, it is easy to fix. Making a method, or more importantly a code base, maintainable is a much higher cost.
Personally, I am not afraid of long methods as long as the person writing them writes them well (every sub-task separated by two newlines and preceded by a nice comment, etc. Also, indentation is very important.).
In fact, many times I even prefer them (e.g. when writing code that does things in a specific order with sequential logic).
Also, I really don't understand why breaking a long method into 100 pieces will improve readability (as others suggest). Quite the opposite: you will only end up jumping all over the place and holding pieces of code in your memory just to get a complete picture of what is going on. Combine that with a possible lack of comments, bad function names, and many similar function names, and you have the perfect recipe for chaos.
Also, you could go to the other extreme while trying to reduce the size of the methods: creating MANY classes and MANY functions, each of which may take MANY parameters. I don't think this improves readability either (especially for a beginner on a project who has no clue what each class or method does).
And the demand that "a function should do 'one thing'" is very subjective. 'One thing' can range from increasing a variable by one to doing a ton of work supposedly for the 'same thing'.
My only rule is reusability:
The same code should not appear many times in many places. If it does, you need a new function.
All the rest is just philosophical talk.
When asked "why do you make your methods so big?", I reply, "why not, if the code is simple?"