I am getting two contradictory views on this. One source says there should be fewer small methods in order to reduce the number of method calls, but another source says writing shorter methods is good because it lets the JIT do its optimizations.
So, which side is correct?
The overhead of actually making the method call is inconsequentially small in almost every case. You never need to worry about it unless you can clearly identify a problem down the road that requires revisiting the issue (you won't).
It's far more important that your code is simple, readable, modular, maintainable, and modifiable. Methods should do one thing and one thing only, and delegate sub-tasks to other routines. This means your methods should be as short as they can possibly be, but not any shorter. You will see far more performance benefits by having code that is less prone to error and bugs because it is simple, than by trying to outsmart the compiler or the runtime.
The source that says methods should be long is wrong, on many levels.
None. You should have relatively short methods to achieve readability.
There is no one simple rule about function size. The guideline should be a function should do 'one thing'. That's a little vague but becomes easier with experience. Small functions generally lead to readability. Big ones are occasionally necessary.
Worrying about the overhead of method calls is premature optimization.
As always, it's about finding a good balance. The most important thing is that the method does one thing only. Longer methods tend to do more than one thing.
The best single criterion to guide you in sizing methods is to keep them well-testable. If you can (and actually DO!-) thoroughly unit-test every single method, your code is likely to be quite good; if you skimp on testing, your code is likely to be, at best, mediocre. If a method is difficult to test thoroughly, then that method is likely to be "too big" -- trying to do too many things, and therefore also harder to read and maintain (as well as badly-tested and therefore a likely haven for bugs).
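For instance (a rough sketch, with NUnit as one possible test framework and the PriceCalculator method invented for illustration), a method this small is trivial to pin down with a couple of tests:

using System;
using NUnit.Framework;

public static class PriceCalculator
{
    // Small enough to name, read, and test exhaustively.
    public static decimal ApplyDiscount(decimal price, decimal discountPercent)
    {
        if (discountPercent < 0 || discountPercent > 100)
            throw new ArgumentOutOfRangeException(nameof(discountPercent));
        return price - (price * discountPercent / 100m);
    }
}

[TestFixture]
public class PriceCalculatorTests
{
    [Test]
    public void ApplyDiscount_TenPercent_ReducesPrice()
    {
        Assert.AreEqual(90m, PriceCalculator.ApplyDiscount(100m, 10m));
    }

    [Test]
    public void ApplyDiscount_NegativePercent_Throws()
    {
        Assert.Throws<ArgumentOutOfRangeException>(
            () => PriceCalculator.ApplyDiscount(100m, -1m));
    }
}

If the same method also parsed input and wrote to a log, those tests would be far harder to write - which is exactly the signal that it is "too big".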
First of all, you should definitely not be micro-optimizing performance at the number-of-methods level. You will most likely not get any measurable performance benefit. Only if you have some method that is being called in a tight loop millions of times might it be worth considering - and don't begin optimizing for that before you need it.
You should stick to short, concise methods that do one thing and make the intent of the method clear. This will give you code that is easier to read and understand, and it promotes code reuse.
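As a rough illustration (OrderProcessor and its helpers are invented names, not anyone's actual code), the same work split into short methods reads like a summary of its steps:

using System;

// Hypothetical Order type, defined only so the sketch compiles.
public class Order { public decimal Amount; }

public class OrderProcessor
{
    // The method now states *what* happens; each step says *how*.
    public void ProcessOrder(Order order)
    {
        Validate(order);
        decimal total = CalculateTotal(order);
        SendConfirmation(order, total);
    }

    private void Validate(Order order)
    {
        if (order == null) throw new ArgumentNullException(nameof(order));
    }

    private decimal CalculateTotal(Order order)
    {
        return order.Amount; // placeholder calculation
    }

    private void SendConfirmation(Order order, decimal total)
    {
        Console.WriteLine($"Order confirmed, total {total}");
    }
}

Each small method is also independently reusable and testable, which is where the real gains are - not in the call overhead.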
The most important cost to consider when writing code is maintainability. You will spend much, much more time maintaining an application and fixing bugs than you ever will fixing performance problems.
In this case the cost of calling a method is almost certainly insignificant compared to the cost of maintaining a large, unwieldy method. Small, concise methods are easier to maintain and comprehend. Additionally, the cost of calling the method will almost certainly not have a significant performance impact on your application, and if it does, you can only ascertain that by using a profiler. Developers are notoriously bad at identifying performance problems beforehand.
Generally speaking, once a performance problem is identified, it is easy to fix. Making a method, or more importantly a code base, maintainable is a much higher cost.
Personally, I am not afraid of long methods as long as the person writing them writes them well (every sub-task separated by two newlines and preceded by a helpful comment, etc. Also, indentation is very important).
In fact, many times I even prefer them (e.g. when writing code that does things in a specific order with sequential logic).
Also, I really don't understand why breaking a long method into 100 pieces would improve readability (as others suggest). Quite the opposite: you end up jumping all over the place and holding pieces of code in your memory just to get a complete picture of what is going on. Combine that with a possible lack of comments, bad function names, and many similar function names, and you have the perfect recipe for chaos.
Also, you could go to the other extreme while trying to reduce the size of your methods: creating MANY classes and MANY functions, each of which may take MANY parameters. I don't think this improves readability either (especially for a beginner to a project who has no clue what each class/method does).
And the demand that a function should do 'one thing' is very subjective. 'One thing' can range from incrementing a variable by one to doing a ton of work supposedly for the 'same thing'.
My only rule is reusability:
The same code should not appear many times in many places. If this is the case you need a new function.
All the rest is just philosophical talk.
In a question of "why do you make your methods so big" I reply, "why not if the code is simple?".
My teacher has always said that we shouldn't write the same piece of code more than once. But if the priority of the code is to be robust and quick, should it still use classes and methods instead of writing the same code over and over? Does calling a class take a little more time than inlining the code directly?
For example, if I want to do this:
Program1.Action1();
Program1.Action2();
Program1.Action3();
&
Program2.Action1();
etc etc etc
and I want these actions to be performed as quickly as possible, should I call the Action() methods or write out the full code in place?
And this question leads to another one:
For a project we need to make the code easily readable by the teacher, so we have a lot of class tabs open in Visual Studio, we make everything public, and we call our classes or methods from our main form.
OK, it's quite organized and very easy to read, but doesn't it slow the code down?
Is a public class slower than a private class in our main form?
I didn't find anything conclusive anywhere. Thank you.
You could always consider profiling the performance.
But really, you ought to trust that the C# compiler will be better than you at making such choices when it compiles your code.
The things you state in your question seem like unnecessary micro-optimisations to me that will probably not make a scrap of difference.
Readability and the ability to scale your program are more important considerations: computers are tending to double in speed every year but programmers are getting more and more expensive.
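If you do want numbers instead of guesses, even a crude measurement with System.Diagnostics.Stopwatch will usually show that the differences you are worried about are invisible. A minimal sketch (the work being timed is just a stand-in):

using System;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        Stopwatch sw = Stopwatch.StartNew();
        long sum = 0;
        for (int i = 0; i < 10000000; i++)
        {
            sum += DoWork(i); // stand-in for the call you are worried about
        }
        sw.Stop();
        Console.WriteLine($"Elapsed: {sw.ElapsedMilliseconds} ms (result {sum})");
    }

    // A tiny method; the JIT will typically inline calls like this anyway.
    static long DoWork(int i)
    {
        return i;
    }
}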
You have one main question and some concerns.
Let me address them separately; first: public and private are not, per se, faster or slower. The compiler could, in theory, optimize more when private methods are involved, but I don't think there are many cases when that could make a difference. So, the short answer is NO, public does not slow down your code.
A simple function call has negligible cost. Unless you're writing number-crunching code that loops over and over millions and millions of times, the cost of a few function calls is no concern.
So, if you don't have performance problems, you should not care about them. Do yourself a favor, while learning, and write this down 10 times: if you don't have performance problems, you should not care about them.
You should concentrate on code readability and algorithmic complexity, not on micro-optimizations which may or may not improve "performance" but can easily complicate the code and create bugs.
Easy to read and test is paramount in (dare I say it?) 98% of the software developed.
Quite an academic question this - I've had it remarked that a class I've written in a WCF service is very long (~3000 lines) and it should be broken down into smaller classes.
The scope of the service has grown over time and the methods it contains perform many similar functions, hence me not creating multiple smaller classes up until now so I've no problem with doing so (other than the time it'll take to do so!), but it got me thinking - is there a significant performance overhead in using a single large class instead of multiple smaller classes? If so, why?
It won't make any noticeable difference. Before even thinking about such extreme micro-optimization, you should think about maintainability, which is quite endangered with a class of about 3000 LOC.
Write your code first such that it is correct and maintainable. Only if you then really run into performance problems, you should first profile your application before making any decisions about optimizations. Usually performance bottlenecks will be found somewhere else (lack of parallelization, bad algorithms etc.).
No, having one large class should not affect performance. Splitting a large class into smaller classes could even reduce performance slightly, as you will have more indirection. However, the impact is negligible in almost all cases.
The purpose of splitting a class into smaller parts is not to improve performance but to make it easier to read, modify and maintain the code. But this alone is enough reason to do it.
Performance considerations are the last of your worries when it comes to the decision to add a handful of well designed classes over a single source file. Think more of:
Maintainability... It's hard to make point fixes in so much code.
Readability... If you have to page up and down like a fiend to get anywhere, it's not readable.
Reusability... No decomposition makes things difficult to reuse.
Cohesion... If you're doing too many things in a single class, it's probably not cohesive in any way.
Testability... Good luck unit testing a 3,000 LoC bunch of spaghetti code to any sensible level of coverage.
I could go on, but the mentality of large single source files seems to hark back to the VB/procedural programming era. Nowadays, I start to get the fear if a method has a cyclomatic complexity of more than 15 or a class has more than a couple of hundred lines in it.
Usually I find that if I refactor one of these 10k line of code behemoths, the sum total of the lines of code of the new classes ends up being 40% of the original if not less. More classes and decomposition (within reason) lead to less code. Counterintuitive at first, but it really works.
The real issue is not performance overhead. The overhead is in maintainability and reuse. You may have heard of the SOLID principles of object-oriented design, a number of which imply smaller classes are better. In particular, I'd look at the Single Responsibility Principle, the Open/Closed Principle and the Liskov Substitution Principle, and... actually, come to think of it, they all pretty much imply smaller classes are better, albeit indirectly.
This stuff is not easy to 'get'. If you've been programming with an OO language a while you look at SOLID and it suddenly makes so much sense. But until those lightbulbs come on it can seem a bit obscure.
On a far simpler note, having several classes, with one file per class, each one sensibly named to describe the behaviour, where each class has a single job, has to be easier to manage from a pure sanity perspective than a long page of 3,000 lines.
And then consider if one part of your 3,000 line class might be useful in another part of your program... putting that functionality in a dedicated class is an excellent way of encapsulating it for reuse.
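A tiny sketch of that idea (the class names are invented): pull the reusable piece into its own class and let the big service delegate to it.

// Extracted, single-purpose class that anything else can now reuse.
public class AddressFormatter
{
    public string Format(string street, string city, string postcode)
    {
        return $"{street}, {postcode} {city}";
    }
}

public class InvoiceService
{
    private readonly AddressFormatter _formatter = new AddressFormatter();

    public string BuildShippingLabel(string street, string city, string postcode)
    {
        // The service delegates instead of carrying the formatting logic
        // among its other few thousand lines.
        return _formatter.Format(street, city, postcode);
    }
}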
In essence, as I write, I'm finding I'm just teasing out aspects of SOLID anyway. You'd probably be best to read straight from the horse's mouth on this.
I wouldn't say there are performance issues, but rather maintenance and readability issues. It's far easier to modify several classes that each serve a single purpose than to work with one monstrous class. That's just ridiculous. You're breaking all OOP principles by doing so.
hence me not creating multiple smaller classes up until now so I've no problem with doing so
Precisely the case I've warned about multiple times on SO already... People are afraid of premature optimization, but they are not afraid of writing bad code with the idea "I'll fix it later when it becomes an issue". Let me tell you something - a 3000+ LOC class IS already an issue, no matter the performance impact, if any.
It depends on how the class is used and how often it is instantiated. When a class is instantiated only once, e.g. a contract service class, the performance overhead is typically not significant.
When a class is instantiated often, it could reduce performance.
But in this case, think not about performance but about design. Better to think about support, further development, and testability. Classes of 3K LOC are huge and are typically textbooks of anti-patterns. Such classes lead to code duplication and bugs, further development becomes painful, already-fixed bugs keep reappearing, and the code is fragile.
So the class definitely should be refactored.
There is a common belief that reflection is slow and should be avoided as much as possible. But is that belief still true today? There have been a lot of changes in current .NET versions, like the use of IL weaving (i.e. IL Emit) etc., as opposed to the traditional PropertyInfo and MethodInfo ways of performing reflection.
Is there any convincing proof that the new reflection is not that slow any more and can be used? Is there a better way to read attribute data?
When you think about it, reflection is pretty darn impressive in how fast it is.
A cached delegate from a ConstructorInfo or MethodInfo can be called with speed comparable to any other delegate.
A delegate created from ILGenerator.Emit (which incidentally isn't new, it's been in .NET since version 1) can likewise be called just as fast.
An object obtained through emitting or calling a ConstructorInfo's delegate will be just as fast as any other object.
If you obtain an object by loading an assembly dynamically and using reflection to find and call its constructor, and that object implements a defined interface through which you call it from that point on, then it will be just as fast in use as any other implementation of that interface.
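A sketch of that pattern (the IPlugin interface and the assembly path are invented for illustration): reflection is paid for once at load time, and every call after that goes through an ordinary interface.

using System;
using System.Linq;
using System.Reflection;

// Interface known to the host application; the dynamically loaded assembly implements it.
public interface IPlugin
{
    void Execute();
}

public static class PluginLoader
{
    public static IPlugin Load(string assemblyPath)
    {
        // Reflection happens only here, once.
        Assembly assembly = Assembly.LoadFrom(assemblyPath);
        Type pluginType = assembly.GetTypes()
            .First(t => typeof(IPlugin).IsAssignableFrom(t) && !t.IsAbstract);
        return (IPlugin)Activator.CreateInstance(pluginType);
    }
}

// Usage (hypothetical path): after loading, calls go through the interface at normal speed.
// IPlugin plugin = PluginLoader.Load("MyPlugin.dll");
// plugin.Execute();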
In all, reflection gives us ways of doing things that without it we would - if we could do them at all - have to use techniques that are slower both to code and to execute.
It also gives us means of doing things that are more complicated, more brittle, less type-safe and less performant than other means too. Pretty much every line of C# code can be replaced by a big chunk of code that uses reflection. This code will almost certainly be worse than the original line in a whole bunch of ways, and performance is the least of them.
Quite possibly the "avoid reflection because it's slow" advice stems from the belief that the sort of developer who would go nuts for any new technique just because it seemed cool would be the sort more likely to be warned off by "it'll be slower" than by "it'll be less idiomatic, more error-prone and harder to maintain". Quite possibly this belief is completely correct.
For the most part though, when the most natural and obvious approach is to use reflection, then it also won't be less performant than a really convoluted attempt to avoid it.
If performance concerns apply to anything in reflection, it's really the uses that are hidden:
Using dynamic can seem sensible in a case where only a little work could avoid it. Here the performance difference may be worth considering.
In ASP.NET, using <%#DataBinder.Eval(Container.DataItem, "SomeProperty")%> is easier but generally less performant than <%#((SomeType)Container.DataItem).SomeProperty%> or <%#SomeCodeBehindProvidedCallWithTheSameResult%>. I'll still use the former 90% of the time, and the latter only if I really care about a given page's performance, or more likely because doing many operations on the same object makes the latter the more natural choice anyway.
So in all, everything remains "slow" in computers while they are bounded by the requirement to work in a single universe in a way that consumes energy and takes time, for some value of "slow". Different reflection techniques have different costs, but so do the alternatives. Beware not so much of reflection in cases where it's the obvious approach, as of hidden reflection where another, just slightly less obvious, approach may serve well.
And of course, code wisely with whatever technique you use. If you're going to call the same delegate a hundred times in a row, you should be storing it rather than obtaining it each call, and that would go whether the way to obtain it was through reflection or not.
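For example, a minimal sketch of that caching idea (int.Parse is used purely as a convenient static method to bind to):

using System;
using System.Reflection;

class CachedDelegateDemo
{
    static void Main()
    {
        MethodInfo parseMethod = typeof(int).GetMethod("Parse", new[] { typeof(string) });

        // Pay the reflection cost once: create the delegate and keep it.
        var parse = (Func<string, int>)Delegate.CreateDelegate(typeof(Func<string, int>), parseMethod);

        // From here on, each call is an ordinary delegate call,
        // not a MethodInfo.Invoke per iteration.
        long sum = 0;
        for (int i = 0; i < 100; i++)
        {
            sum += parse(i.ToString());
        }
        Console.WriteLine(sum);
    }
}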
It all depends.
Yes, using reflection is without any doubt slower than not using reflection, but you have to look at the big picture:
For example, imagine that your code does some reflection and then loads some data from a database over the network.
In this case, you can completely neglect the additional cost for reflection because the database request over the network will likely take much, much longer.
I wouldn't worry about the performance cost that much. Not saying performance doesn't matter, but in a lot of cases, a millisecond or two isn't noticeable or worth choosing a more complicated coding alternative. Take a look here.
I generally prefer this approach:
write simple code
measure performance
optimize the biggest hitter, if needed
What do you mean by mentioning a common belief? Reflection is not a single solid block; you should consider each method separately. For instance, creating an object through a default constructor via reflection is a few times slower than a direct call, whilst invoking a parameterized constructor is tens of times slower. So if you want to study the speed, write a benchmark and benchmark the concrete functions you need.
PS. In C#, you can always create and compile expressions on the fly, which, if you manage to do it, will be much faster than reflection.
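A minimal sketch of that approach using System.Linq.Expressions (the Person type is invented): build the expression once, compile it to a delegate, and reuse the delegate.

using System;
using System.Linq.Expressions;

public class Person
{
    public string Name { get; set; }
}

class ExpressionDemo
{
    static void Main()
    {
        // Build p => p.Name and compile it once.
        ParameterExpression p = Expression.Parameter(typeof(Person), "p");
        Func<Person, string> getName =
            Expression.Lambda<Func<Person, string>>(Expression.Property(p, "Name"), p)
                      .Compile();

        // Subsequent calls avoid PropertyInfo.GetValue entirely.
        Console.WriteLine(getName(new Person { Name = "Ada" }));
    }
}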
Yes, reflection is slow when we use it for bulky operations or inside a loop.
You can try the dynamic keyword as well. With dynamic, the call site is cached, which can end up being faster than repeated raw reflection calls.
If you need to look at attributes applied to source code, then you pretty much have to use reflection.
Some reflection operations are fast, such as Object.GetType(), but some operations are relatively slow, such as Type.FindMethod("MyMethod").
In general, I would say that if your application makes occasional use of Reflection, there should be no performance concern. On the other hand, if your application uses Reflection extensively, then you might observe some slowness.
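For attribute data specifically, the read itself is a one-off reflection call that you can do at startup and cache if needed. A small sketch (the attribute and the Customer class are invented):

using System;
using System.Reflection;

[AttributeUsage(AttributeTargets.Class)]
public sealed class TableNameAttribute : Attribute
{
    public string Name { get; }
    public TableNameAttribute(string name) { Name = name; }
}

[TableName("Customers")]
public class Customer { }

class AttributeDemo
{
    static void Main()
    {
        // One reflection call; cache the result if you read it often.
        TableNameAttribute attr = typeof(Customer).GetCustomAttribute<TableNameAttribute>();
        Console.WriteLine(attr.Name); // prints "Customers"
    }
}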
Introspection is a heavy job. "Slow" is relative to a lot of things. Invoking a method or constructor through reflection is slow, but using reflection only to retrieve metadata is not.
Keep in mind that reflection should only be used to retrieve metadata.
If you need to invoke methods or run something, emit dynamic types/methods at initialization time and invoke them through interfaces/delegates.
There is some code to execute after validation.
Consider a variable SOQualityStandards = true;
This variable is validated before the execution of the code.
I have come across two ways of checking SOQualityStandards.
One is:
if(SOQualityStandards)
{
//code to execute
}
and the other is
if(!SOQualityStandards) return;
//code to execute
Is there any performance difference between the two? Which one should I use?
They have the same semantics (assuming there is no other code in the function after the if-block in the first example).
I find the first to be clearer, but that is a matter of personal preference.
The compiler will consider those two options to be the same and could transform one into the other or vice versa, so performance considerations are irrelevant. Even if performance were affected, I think readability/maintainability is the larger issue in these questions anyway.
I tend to do the return at the beginning in cases like this, because it reduces the indentations and mental burden in reading the rest of the method. The states tested in returns at the beginning become states that no longer need to be considered in understanding the method. Large if blocks, on the other hand, require mentally tracking the state differences throughout the method.
This becomes especially important if there are several tests that need to be done to guard an interior block of code.
There's no difference in the two approaches. It's a matter of personal choice.
Choosing between the two is just personal preference (see Sven's answer and bomslang's answer). Micro-optimization is in most cases completely unnecessary.
You shouldn't optimize execution time until you see that it's a problem. You could be spending that valuable time adding other functionality or coming up with improvements to the system architecture.
In the case you actually need to optimize, loops and recursive functions are generally the first place to look.
If you would need further optimization than that, single line variable checks and manipulations would still be some of the last things to optimize.
Remember (as Jeffrey said) readability and maintainability are in most cases the most important factors.
Consider the following code sample:
private void AddEnvelope(MailMessage mail)
{
if (this.CopyEnvelope)
{
// Perform a few operations
}
}
vs
private void AddEnvelope(MailMessage mail)
{
if (!this.CopyEnvelope) return;
// Perform a few operations
}
Will the bottom code execute any faster? Why would ReSharper make this recommendation?
Update
Having thought about this question the answer might seem obvious to some. But lots of us developers were never in the habit of nesting zounds of if statements in the first place...
It doesn't matter. Stop agonizing over performance issues that don't exist - use a profiler to identify areas in your code that DO exhibit issues, and fix them. Proactive optimization - before you know that there is a problem - is by definition a waste of time.
Updated Answer:
It's a code maintainability suggestion. Easier to read than nesting the rest of the code in an IF statement. Examples/discussion of this can be seen at the following links:
Flattening Arrow Code
Replace Nested Conditional With Guard Clauses
Code Contracts section "Legacy Requires Statements"
Original Answer:
It will actually run (very negligibly) slower from having to perform a NOT operation. The difference is so negligible that, in fact, some people actually consider the second form the prettier way to code, as it avoids an extra level of indentation for the bulk of the code.
It's a refactor of a conditional that encompasses the entire method contents to a Guard Clause. It has nothing to do with optimization.
I like the comments about optimizing things like this, to add a little more to it...
The only time I can think of that it makes sense to optimize your if statements is when you have the results of TWO or more longish running methods that need to be combined to determine to do something else. You would only want to execute the second operation if the first operation yielded results that would pass the condition. Putting the one that is most likely to return false first will generally be a smarter choice. This is because if it is false, the second one will not be evaluated at all. Again, only worth worrying about if the operations are significant and you can predict which is more likely to pass or fail. Invert this for OR... if true, it will only evaluate the first, and so optimize that way.
i.e.
if (ThisOneUsuallyPasses() && ThisOneUsuallyFails())
isn't so good as
if (ThisOneUsuallyFails() && ThisOneUsuallyPasses())
because it's only on the odd case that the first one actually works that you have to look at the second. There's some other flavors of this you can derive, but I think you should get the point.
Better to worry about how you use strings, collections, index your database, and allocate objects than spend a lot of time worrying about single condition if statements if you are worrying about perf.
In general, what the bottom code you give will do is give you an opportunity to avoid a huge block of code inside an if statement which can lead to silly typo driven errors. Old school thinking was that you should only have one point that you return from a method to avoid a different breed of coder error. Current thinking (at least by some of the tool vendors such as jetbrains resharper, etc) seems to be that wrapping the least amount of code inside of conditional statements is better. Anything more than that would be subjective so I'll leave it at that.
This kind of "optimization" is not worth the time spent refactoring your code, because all modern compilers already do enough small optimizations to make this kind of tip irrelevant.
As mentioned above, performance optimization is done with profilers, to measure how your system is performing and find the potential bottlenecks before applying a performance fix, and then again after the fix to see whether your fix did any good.
Required reading: Cyclomatic_complexity
Cyclomatic Complexity is a quantitative measure of the number of linearly independent paths through a program's source code.
Which means, every time you branch using an if statement, you increase the Cyclomatic Complexity by 1.
To test each linearly independent path through the program, the number of test cases required will equal the cyclomatic complexity of the program.
Which means, if you want to test your code completely, for each if statement you would have to introduce a new test case.
So, by introducing more if statements the complexity of your code increases, as does the number of test cases required to test it.
By removing if statements, your code complexity decreases as does the number of test cases required to test.
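To make the counting concrete, here is a small sketch (the Invoice type and the rules are invented): two if statements give a cyclomatic complexity of 3, so complete path coverage needs three test cases.

// Hypothetical type for illustration.
public class Invoice { public decimal Total; }

public static class InvoiceRules
{
    // Cyclomatic complexity = 1 (the method) + 2 (one per if) = 3.
    // Full path coverage therefore needs three test cases:
    //   1. invoice == null        -> "missing"
    //   2. invoice.Total < 0      -> "invalid"
    //   3. any other invoice      -> "ok"
    public static string Classify(Invoice invoice)
    {
        if (invoice == null)
            return "missing";
        if (invoice.Total < 0)
            return "invalid";
        return "ok";
    }
}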