Performance overhead of large class size in C#

Quite an academic question this - I've had it remarked that a class I've written in a WCF service is very long (~3000 lines) and it should be broken down into smaller classes.
The scope of the service has grown over time, and its methods share many similar functions, which is why I haven't split it into multiple smaller classes until now. I've no problem with doing so (other than the time it'll take!), but it got me thinking: is there a significant performance overhead in using a single large class instead of multiple smaller classes? If so, why?

It won't make any noticeable difference. Before even thinking about such extreme micro-optimization, you should think about maintainability, which is quite endangered with a class of about 3000 LOC.
Write your code first such that it is correct and maintainable. Only if you then really run into performance problems, you should first profile your application before making any decisions about optimizations. Usually performance bottlenecks will be found somewhere else (lack of parallelization, bad algorithms etc.).

No, having one large class should not affect performance. Splitting a large class into smaller classes could even reduce performance slightly, as you will have more indirection (extra objects and calls between them). However, the impact is negligible in almost all cases.
The purpose of splitting a class into smaller parts is not to improve performance but to make it easier to read, modify and maintain the code. But this alone is enough reason to do it.

Performance considerations are the last of your worries when it comes to the decision to add a handful of well designed classes over a single source file. Think more of:
Maintainability... It's hard to make point fixes in so much code.
Readability... If you have to page up and down like a fiend to get anywhere, it's not readable.
Reusability... No decomposition makes things difficult to reuse.
Cohesion... If you're doing too many things in a single class, it's probably not cohesive in any way.
Testability... Good luck unit testing a 3,000 LoC bunch of spaghetti code to any sensible level of coverage.
I could go on, but the mentality of large single source files seems to hark back to the VB/procedural programming era. Nowadays, I start to get the fear if a method has a cyclomatic complexity of more than 15 and a class has more than a couple of hundred lines in it.
Usually I find that if I refactor one of these 10k line of code behemoths, the sum total of the lines of code of the new classes ends up being 40% of the original if not less. More classes and decomposition (within reason) lead to less code. Counterintuitive at first, but it really works.

The real issue is not performance overhead. The overhead is in maintainability and reuse. You may have heard of the SOLID principles of Object Oriented design, a number of which imply smaller classes are better. In particular, I'd look at the Single Responsibility Principle, the Open/Closed Principle and the Liskov Substitution Principle, and... actually, come to think of it they all pretty much imply smaller classes are better, albeit indirectly.
This stuff is not easy to 'get'. If you've been programming with an OO language a while you look at SOLID and it suddenly makes so much sense. But until those lightbulbs come on it can seem a bit obscure.
On a far simpler note, having several classes, with one file per class, each one sensibly named to describe the behaviour, where each class has a single job, has to be easier to manage from a pure sanity perspective than a long page of 3,000 lines.
And then consider if one part of your 3,000 line class might be useful in another part of your program... putting that functionality in a dedicated class is an excellent way of encapsulating it for reuse.
In essence, as I write, I'm finding I'm just teasing out aspects of SOLID anyway. You'd probably be best to read straight from the horse's mouth on this.
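As a rough illustration of the "one class, one job" idea above, here is a minimal sketch of a service that keeps its public method but delegates the work to two small classes. All names (OrderService, OrderValidator, PriceCalculator) are hypothetical and not taken from the original WCF service; the point is only the shape of the split.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical example: every name here is illustrative only.
public sealed class OrderValidator
{
    // Single job: decide whether an order is acceptable.
    public bool IsValid(IReadOnlyCollection<decimal> linePrices) =>
        linePrices.Count > 0 && linePrices.All(p => p >= 0m);
}

public sealed class PriceCalculator
{
    // Single job: compute a total including tax.
    public decimal Total(IEnumerable<decimal> linePrices, decimal taxRate) =>
        linePrices.Sum() * (1m + taxRate);
}

// The service keeps its public surface but delegates the actual work.
public sealed class OrderService
{
    private readonly OrderValidator _validator = new OrderValidator();
    private readonly PriceCalculator _calculator = new PriceCalculator();

    public decimal PlaceOrder(IReadOnlyCollection<decimal> linePrices, decimal taxRate)
    {
        if (!_validator.IsValid(linePrices))
            throw new ArgumentException("Order has no lines or a negative price.", nameof(linePrices));
        return _calculator.Total(linePrices, taxRate);
    }
}

public static class OrderDemo
{
    public static void Main()
    {
        var service = new OrderService();
        Console.WriteLine(service.PlaceOrder(new[] { 10m, 20m }, 0.5m));
    }
}
```

The extra cost compared to one big class is a couple of small objects and one more call per operation. In exchange, the validator and calculator can be unit tested and reused without touching the service.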

I wouldn't say there are performance issues, but rather maintenance and readability issues. It's far easier to modify several classes that each serve one purpose than to work with a single monstrous class. That's just ridiculous. You're breaking all OOP principles by doing so.
"hence me not creating multiple smaller classes up until now so I've no problem with doing so"
Precisely the case I've been warning about multiple times on SO already... People are afraid of premature optimization, but they are not afraid of writing bad code with an idea like "I'll fix it later when it becomes an issue". Let me tell you something - a 3000+ LOC class IS already an issue, no matter the performance impact, if any.

It depends on how the class is used and how often it is instantiated. When the class is instantiated once, e.g. a service contract class, then the performance overhead is typically not significant.
When the class is instantiated often, it could reduce performance.
But in this case, don't think about performance; think about design. It is better to think about support, further development and testability. Classes of 3K LOC are huge and are typically catalogs of anti-patterns. Such classes lead to code duplication and bugs; further development will be painful, already-fixed bugs will reappear again and again, and the code is fragile.
So the class definitely should be refactored.

Related

Do public class & methods slow down your code?

My teacher has always said that we shouldn't write the same piece of code more than once while programming. But should code whose priority is to be robust and quick use classes and methods instead of writing out the same code over and over?! Does calling a class take a little more time than inline code?!
For example, if I want to do this:
Program1.Action1();
Program1.Action2();
Program1.Action3();
&
Program2.Action1();
etc etc etc
and I want these actions to be performed as quickly as possible, should I call the actions or write out the full code?!
And this question leads to another one:
For a project we need to make the code easily readable by the teacher, so we have a lot of class tabs in Visual Studio, we make everything public, and we call our classes or methods from our main form.
OK, it's quite organized and very easy to read, BUT doesn't it slow the code down?!
Are public classes in separate tabs slower than a private class in our main form?!
I didn't find anything conclusive anywhere. Thank you.
You could always consider profiling the performance.
But really, you ought to trust that C# will be better than you at making such choices when compiling your code.
The things you state in your question seem like unnecessary micro-optimisations to me that will probably not make a scrap of difference.
Readability and the ability to scale your program are more important considerations: hardware keeps getting faster and cheaper, but programmers are getting more and more expensive.
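If you do want to measure rather than guess, a crude sketch along these lines is one way to start. It uses System.Diagnostics.Stopwatch to time the same loop with the logic written inline and behind a method call. AddOne is a hypothetical stand-in for something like Program1.Action1(), and a Stopwatch loop is only a rough check, not a replacement for a real profiler: the JIT may well inline the call, in which case both versions compile to much the same thing.

```csharp
using System;
using System.Diagnostics;

public static class CallOverheadDemo
{
    // Hypothetical helper standing in for a call such as Program1.Action1().
    public static int AddOne(int x) => x + 1;

    public static long SumInline(int iterations)
    {
        long total = 0;
        for (int i = 0; i < iterations; i++)
            total += i + 1;          // logic written out in place
        return total;
    }

    public static long SumViaMethod(int iterations)
    {
        long total = 0;
        for (int i = 0; i < iterations; i++)
            total += AddOne(i);      // same logic behind a method call
        return total;
    }

    public static void Main()
    {
        const int iterations = 10_000_000;

        var sw = Stopwatch.StartNew();
        long a = SumInline(iterations);
        sw.Stop();
        Console.WriteLine($"inline:      {sw.ElapsedMilliseconds} ms (result {a})");

        sw.Restart();
        long b = SumViaMethod(iterations);
        sw.Stop();
        Console.WriteLine($"method call: {sw.ElapsedMilliseconds} ms (result {b})");
    }
}
```

Run it in a Release build and repeat it a few times; the first measurement includes JIT compilation time.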
You have one main question and some concerns.
Let me address them separately; first: public and private are not, per se, faster or slower. The compiler could, in theory, optimize more when private methods are involved, but I don't think there are many cases when that could make a difference. So, the short answer is NO, public does not slow down your code.
A simple function call has negligible cost. Unless you're writing number-crunching code that loops millions and millions of times, the cost of a function call is no concern.
So, if you don't have performance problems, you should not care about them. Do yourself a favor, while learning, and write this down 10 times: if you don't have performance problems, you should not care about them.
You should concentrate on code readability and algorithmic complexity, not on micro-optimizations which may or may not improve "performance", but can easily complicate the code and create bugs.
Easy to read and test is paramount in (dare I say it?) 98% of the software developed.

Specification Pattern and Performance

I've been playing around w/ the specification pattern to handle and contain the business logic in our c#/mvc application. So far so good. I do have a question though - since we'll be creating a number of specification objects on the heap, will that affect performance in any way versus, say creating helper methods to handle the business logic? Thanks!
"I do have a question though - since we'll be creating a number of specification objects on the heap, will that affect performance in any way versus, say creating helper methods to handle the business logic?"
Of course it will affect performance; every line of code you write and every design choice you make affects performance in one way or another. This one is unlikely to be meaningful, to be a bottleneck in your application, or to be worth caring about, as this is almost surely a case of premature optimization. These days you should just focus on modeling your domain properly and writing extremely clear and maintainable code. Focus more on developer productivity than on machine productivity. CPU cycles are cheap, and in nearly limitless supply. Developer cycles are not cheap, and are not limitless in supply.
But only you can know if it will impact the real-world use of your application on real-world data by profiling. We don't, and can't know, because we don't know your domain, don't know your users, don't know what performance you expect, etc. And even if we knew those things, we still couldn't give you as powerful of an answer as you can give yourself by dusting a profiler off the shelf and seeing what your application actually does.
"since we'll be creating a number of specification objects on the heap, will that affect performance in any way"
Most design patterns trade off some overhead for cleanliness of design - this is no exception. In general, the amount of memory that the specifications add is very minimal (typically a couple of references, and that's it). In addition, they tend to add a couple of extra method calls vs. custom logic.
That being said, I would not try to prematurely optimize this. The overhead here is incredibly small, so I would highly doubt it would be noticeable in any real world application.
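To show how small that overhead typically is, here is a minimal sketch of a specification next to the equivalent helper method. ISpecification, PredicateSpec, Customer and CustomerRules are hypothetical names made up for this illustration; they are not the API of any particular library.

```csharp
using System;

// Hypothetical minimal specification types; not the API of any library.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

public sealed class PredicateSpec<T> : ISpecification<T>
{
    private readonly Func<T, bool> _predicate;

    public PredicateSpec(Func<T, bool> predicate) { _predicate = predicate; }

    public bool IsSatisfiedBy(T candidate) => _predicate(candidate);

    // Combining allocates one small object that holds a delegate reference.
    public PredicateSpec<T> And(ISpecification<T> other) =>
        new PredicateSpec<T>(c => IsSatisfiedBy(c) && other.IsSatisfiedBy(c));
}

public sealed class Customer
{
    public int Age { get; set; }
    public bool IsActive { get; set; }
}

public static class CustomerRules
{
    // Created once and kept in static fields, so nothing is allocated per check.
    public static readonly PredicateSpec<Customer> IsAdult =
        new PredicateSpec<Customer>(c => c.Age >= 18);
    public static readonly PredicateSpec<Customer> IsActive =
        new PredicateSpec<Customer>(c => c.IsActive);
    public static readonly PredicateSpec<Customer> CanOrder = IsAdult.And(IsActive);

    // The equivalent helper method, for comparison.
    public static bool CanOrderHelper(Customer c) => c.Age >= 18 && c.IsActive;
}

public static class SpecDemo
{
    public static void Main()
    {
        var customer = new Customer { Age = 30, IsActive = true };
        Console.WriteLine(CustomerRules.CanOrder.IsSatisfiedBy(customer));
        Console.WriteLine(CustomerRules.CanOrderHelper(customer));
    }
}
```

Each check through the specification costs a few extra delegate invocations compared to the helper, which is the kind of difference only a profiler on real data can show to matter.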
If you use NSpecifications lib just as the examples in its GitHub page, you'll get the benefits from both worlds:
Most of these specifications are simply stored in static members therefore it doesn't take much from the heap
These specifications also use compiled expressions so that they can be reused many times with better performance
If you are using an ORM to query the database with lambda expressions, that also uses the heap. The difference here is that NSpecifications stores those expressions inside a Spec object so that they can be reused for both business logic and querying.
Check here
https://github.com/jnicolau/NSpecifications

Ways to improve efficiency of C# code [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
Like most of us, I am a big fan of improving efficiency of code. So much so that I would rather choose fast-executing dirty code over something which might be more elegant or clean, but slower.
Fortunately for all of us, in most cases, the faster and more efficient solutions are also the cleaner and the most elegant ones. I used to be just a dabbler in programming but I am into full-time development now, and just started with C# and web development. I have been reading some good books on these subjects but sadly, books rarely cover the finer aspects. Like say, which one of two pieces of code that do the same thing will run faster. This kind of knowledge comes mostly through experience only. I request all fellow programmers to share any such knowledge here.
Here, I'll start off with these two blog posts I came across. This is exactly the kind of stuff I am looking for in this post:
Stringbuilder vs String performance analysis
The cost of throwing an exception
P.S: Do let me know if this kind of thing already exists somewhere on this site. I searched but couldn't find, surprisingly. Also please post any book you know of that covers such things.
P.P.S: If you got to know of something from some blog post or some online source to which we all have access, then it would be better to post the link itself imo.
There are some things you should do like use generics instead of objects to avoid boxing/unboxing and also improve the code safety, but the best way to optimize your code is to use a profiler to determine which parts of your code are slow. There are many great profilers for .NET code available and they can help determine the bottlenecks in your programs.
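For what the generics point looks like in practice, here is a small sketch comparing the non-generic ArrayList with List<int>. The method names are made up for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    public static int SumBoxed(int count)
    {
        var list = new ArrayList();          // stores object references
        for (int i = 0; i < count; i++)
            list.Add(i);                     // each int is boxed into a heap object
        int sum = 0;
        foreach (object o in list)
            sum += (int)o;                   // unboxing cast, checked at run time
        return sum;
    }

    public static int SumGeneric(int count)
    {
        var list = new List<int>();          // stores ints directly
        for (int i = 0; i < count; i++)
            list.Add(i);                     // no boxing
        int sum = 0;
        foreach (int value in list)
            sum += value;                    // no cast needed
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumBoxed(100));
        Console.WriteLine(SumGeneric(100));
    }
}
```

Besides avoiding the allocations, the generic version fails at compile time if you try to add a string, whereas the ArrayList version would only fail at the cast.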
Generally you shouldn't concern yourself with small ways to improve code efficiency, but instead when you are done coding, then profile it to find the bottlenecks.
A good profiler will tell you stats like how many times a function was executed, what the average running time was for a function, what the peak running time was for a function, what the total running time was for a function, etc. Some profilers will even draw graphs for you so you can visually see which parts of the program are the biggest bottleneck and you can drill down into the sub function calls.
Without profiling you will most likely be wrong about which part of your program is slow.
An example of a great and free profiler for .NET is the EQATEC Profiler.
The single most important thing regarding this question is: Don't optimize prematurely!
There is only one good time to optimize and that is when there are performance constraints that your current working implementation cannot fulfill. Then you should get out a profiler and check which parts of your code are slow and how you can fix them.
Thinking about optimization while coding the first version is mostly wasted time and effort.
"I would rather choose fast-executing dirty code over something which might be more elegant or clean, but slower."
If I were writing a pixel renderer for a game, perhaps I'd consider doing this - however, when responding to a user's click on a button, for example, I'd always favour the slower, elegant approach over quick-and-dirty (unless slow > a few seconds, when I might reconsider).
I have to agree with the other posts - profile to determine where your slow points are and then deal with those. Writing optimal code from the outset is more trouble than it's worth; you'll usually find that what you think will be slow will be just fine and the real slow areas will surprise you.
One good resource for .net related performance info is Rico Mariani's Blog
IMO it's the same for all programming platforms / languages: you have to use a profiler to see which parts of the code are slow, and then optimize those parts.
While the links you provided are valuable insights, don't do such things in advance; measure first and then optimize.
edit:
http://www.codinghorror.com/blog/2009/01/the-sad-tragedy-of-micro-optimization-theater.html
When to use StringBuilder?
At what point does using a StringBuilder become insignificant or an overhead?
There are lots of tricks, but if that's what you're thinking you need, you need to start over. The secret of performance in any language is not in coding techniques, it is in finding what to optimize.
To make an analogy, if you're a police detective, and you want to put robbers in jail, the heart of your business is not about different kinds of jails. It is about finding the robbers.
I rely on a purely manual method of profiling. This is an example of finding a series of points to optimize, resulting in a speedup multiple of 43 times.
If you do this on an existing application, you are likely to discover that the main cause of slow performance is overblown data structure design, resulting in an excess of notification-style consistency maintenance, characterized by an excessively bushy call tree. You need to find the calls in the call tree that cost a lot and that you can prune.
Having done that, you may realize that a way of designing software that uses the bare minimum of data structure and abstractions will run faster to begin with.
If you've profiled your code, and found it to be lacking swiftness, then there are some micro-optimizations you can sometimes use. Here's a short list.
Micro-optimize judiciously - it's like the mirror from Harry Potter: if you're not careful you'll spend all your time there and get nothing else done without getting a lot in return.
The StringBuilder and exception throwing examples are good ones - those are mistakes I used to make which sometimes added seconds to a function execution. When profiling, I find I personally use up a lot of cycles simply finding things. In that case, I cache frequently accessed objects using a hashtable (or a dictionary).
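The caching idea mentioned above might look roughly like this. CachedLookup and its SlowCalls counter are hypothetical names for the sketch, and the slow lookup is passed in as a delegate so the example stays self-contained.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper that remembers the results of an expensive lookup.
public sealed class CachedLookup
{
    private readonly Dictionary<int, string> _cache = new Dictionary<int, string>();
    private readonly Func<int, string> _slowLookup;

    // Counts how often the expensive path actually ran.
    public int SlowCalls { get; private set; }

    public CachedLookup(Func<int, string> slowLookup) { _slowLookup = slowLookup; }

    public string Get(int id)
    {
        // Fast path: a hash lookup instead of repeating the search.
        if (_cache.TryGetValue(id, out string cached))
            return cached;

        SlowCalls++;
        string value = _slowLookup(id);
        _cache[id] = value;
        return value;
    }
}

public static class CacheDemo
{
    public static void Main()
    {
        var lookup = new CachedLookup(id => "item" + id);
        lookup.Get(1);
        lookup.Get(1);
        Console.WriteLine(lookup.SlowCalls);
    }
}
```

Note that Dictionary is not safe for concurrent writes, so this sketch assumes single-threaded use, and a cache only helps when the profiler shows the lookup is actually hot.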
Good program architecture gives you far better optimization than an optimized function.
The biggest optimization is to avoid if/else branching in runtime code; move those decisions to initialization time.
Overall, optimization is a bad idea, because the most valuable thing is a readable program, not a fast program.
http://www.techgalaxy.net/Docs/Dev/5ways.htm has some very good points... just came across it today.

C#, Writing longer method or shorter method?

I am getting two contradictory views on this. Some sources say there should be fewer small methods to reduce method calls, but other sources say writing shorter methods is good because it lets the JIT do its optimizations.
So, which side is correct?
The overhead of actually making the method call is inconsequentially small in most every case. You never need to worry about it unless you can clearly identify a problem down the road that requires revisiting the issue (you won't).
It's far more important that your code is simple, readable, modular, maintainable, and modifiable. Methods should do one thing, one thing only and delegate sub-things to other routines. This means your methods should be as short as they can possibly be, but not any shorter. You will see far more performance benefits by having code that is less prone to error and bugs because it is simple, than by trying to outsmart the compiler or the runtime.
The source that says methods should be long is wrong, on many levels.
Neither; you should write relatively short methods for the sake of readability.
There is no one simple rule about function size. The guideline should be a function should do 'one thing'. That's a little vague but becomes easier with experience. Small functions generally lead to readability. Big ones are occasionally necessary.
Worrying about the overhead of method calls is premature optimization.
As always, it's about finding a good balance. The most important thing is that the method does one thing only. Longer methods tend to do more than one thing.
The best single criterion to guide you in sizing methods is to keep them well-testable. If you can (and actually DO!-) thoroughly unit-test every single method, your code is likely to be quite good; if you skimp on testing, your code is likely to be, at best, mediocre. If a method is difficult to test thoroughly, then that method is likely to be "too big" -- trying to do too many things, and therefore also harder to read and maintain (as well as badly-tested and therefore a likely haven for bugs).
First of all, you should definitely not be micro-optimizing performance at the number-of-methods level. You will most likely not get any measurable performance benefit. Only if you have some methods that are called in a tight loop millions of times might it be an idea - but don't begin optimizing that before you need it.
You should stick to short, concise methods that do one thing and make the intent of the method clear. This will give you code that is easier to read and understand, and that promotes code reuse.
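As a sketch of what "short methods that each do one thing" can look like, here is a made-up report builder (ReportBuilder and its helpers are hypothetical names). The top-level method reads as a summary of the steps, and each step is small enough to test and reuse on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class ReportBuilder
{
    // Reads as a summary of the steps; each step is one small method.
    public static string BuildReport(IEnumerable<string> rawLines)
    {
        List<int> values = ParseValues(rawLines);
        double average = AverageOrZero(values);
        return Format(values.Count, average);
    }

    // Keeps only the lines that parse as integers.
    private static List<int> ParseValues(IEnumerable<string> rawLines)
    {
        var values = new List<int>();
        foreach (string line in rawLines)
        {
            if (int.TryParse(line.Trim(), out int value))
                values.Add(value);
        }
        return values;
    }

    // Avoids the exception Average() throws on an empty sequence.
    private static double AverageOrZero(List<int> values) =>
        values.Count == 0 ? 0.0 : values.Average();

    private static string Format(int count, double average) =>
        string.Format(CultureInfo.InvariantCulture, "{0} values, average {1:F2}", count, average);
}

public static class ReportDemo
{
    public static void Main()
    {
        Console.WriteLine(ReportBuilder.BuildReport(new[] { "1", " 2 ", "x", "3" }));
    }
}
```

The calls between these methods are cheap, and the JIT is free to inline the small ones, so the decomposition is about readability rather than speed.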
The most important cost to consider when writing code is maintainability. You will spend much, much more time maintaining an application and fixing bugs than you ever will fixing performance problems.
In this case, the almost certainly insignificant cost of calling a method is tiny compared to the cost of maintaining a large, unwieldy method. Small, concise methods are easier to maintain and comprehend. Additionally, the cost of calling the method almost certainly will not have a significant performance impact on your application. And if it does, you can only ascertain that by using a profiler. Developers are notoriously bad at identifying performance problems beforehand.
Generally speaking, once a performance problem is identified, it is easy to fix. Making a method, or more importantly a code base, maintainable is a much higher cost.
Personally, I am not afraid of long methods as long as the person writing them writes them well (every piece of sub-task separated by 2 newlines and a nice comment preceding it, etc. Also, indentation is very important.).
In fact, many times I even prefer them (e.g. when writing code that does things in a specific order with sequential logic).
Also, I really don't understand why breaking a long method into 100 pieces will improve readability (as others suggest). Quite the opposite. You will only end up jumping all over the place and holding pieces of code in your memory just to get a complete picture of what is going on in your code. Combine that with a possible lack of comments, bad function names and many similar function names, and you have the perfect recipe for chaos.
Also, you could go to the other extreme while trying to reduce the size of the methods: creating MANY classes and MANY functions, each of which may take MANY parameters. I don't think this improves readability either (especially for a beginner on a project who has no clue what each class/method does).
And the demand that "a function should do 'one thing'" is very subjective. 'One thing' may be increasing a variable by one up to doing a ton of work supposedly for the 'same thing'.
My only rule is reusability:
The same code should not appear many times in many places. If this is the case you need a new function.
All the rest is just philosophical talk.
To the question "why do you make your methods so big?" I reply, "why not, if the code is simple?".

Atomic or Gigantic

If you are planning to write a very parallel application in C#, is it better to build things very small, like
1. 20 small classes, making 40 larger classes, and together making 60 more, for a total of 120
or gigantic like:
2. making these 60 classes individually (still with reusability in mind).
So in #2 these 60 classes can contain methods to do things instead of other classes.
Abstractly, neither one of those approaches would make a difference.
Concretely, minimizing mutable state will make your application more parallelizable. Every time you change the state of an instance of your object, you create the potential for thread safety issues (either complexity, or bugs; choose at least one). If you look at Parallel LINQ or functional languages emphasizing parallelism, you'll notice that class design matters less than the discipline of avoiding changes in state.
Class design is for your sanity. Loosely coupled code makes you more sane. Immutable objects make you more parallel. Combine as needed.
Smaller pieces are easier to test, easier to refactor, and easier to maintain.
It's not the size of the classes, but the scope of the coupling that matters.
For parallel applications, you should favor immutable objects (sometimes called "value objects") rather than objects with a lot of mutable properties. If you need to apply operations that result in new values, just create new objects as the result.
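A minimal sketch of that idea, assuming a hypothetical Money value object: its state is fixed at construction, Add returns a new instance, and so instances can be combined from several threads (here via PLINQ's AsParallel and Aggregate) without any locking.

```csharp
using System;
using System.Linq;

// Hypothetical immutable value object: state is fixed at construction.
public sealed class Money
{
    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    // Operations return a new instance instead of mutating this one.
    public Money Add(Money other)
    {
        if (other.Currency != Currency)
            throw new InvalidOperationException("Currency mismatch.");
        return new Money(Amount + other.Amount, Currency);
    }
}

public static class ImmutableDemo
{
    // Safe to run in parallel because Add never changes shared state.
    // Throws InvalidOperationException if items is empty.
    public static Money SumInParallel(Money[] items) =>
        items.AsParallel().Aggregate((a, b) => a.Add(b));

    public static void Main()
    {
        Money[] items = Enumerable.Range(1, 100)
                                  .Select(i => new Money(i, "EUR"))
                                  .ToArray();
        Console.WriteLine(SumInParallel(items).Amount);
    }
}
```

The trade-off is some extra allocation for the intermediate objects, which is usually a small price compared to debugging shared mutable state.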
Observe good separation of concerns, and let that lead you to the natural number of classes to represent the concepts in your program. I recommend the SOLID principles, cataloged and popularized by Robert Martin from ObjectMentor. (That should be enough Google-fodder to locate the list!)
Finally, I also recommend that you get intimate with both System.Threading and System.Collections. Most of the collections are not inherently thread safe, and synchronization is notoriously difficult to get right. So, you're better off using widely-used, tested, reliable synchronization primitives.
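As one example of leaning on tested primitives instead of hand-rolled locking, here is a small sketch using ConcurrentDictionary from System.Collections.Concurrent together with Parallel.For. Reaching for that particular namespace is my assumption (the advice above only names System.Threading and System.Collections), and CountByRemainder is a made-up method for the illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ConcurrentCountDemo
{
    // Counts how many numbers in [0, count) fall into each remainder bucket.
    public static ConcurrentDictionary<int, int> CountByRemainder(int count, int buckets)
    {
        var counts = new ConcurrentDictionary<int, int>();

        Parallel.For(0, count, i =>
        {
            // AddOrUpdate applies the read-modify-write atomically per key,
            // retrying internally if another thread got there first.
            counts.AddOrUpdate(i % buckets, 1, (key, current) => current + 1);
        });

        return counts;
    }

    public static void Main()
    {
        var counts = CountByRemainder(1000, 4);
        foreach (var pair in counts)
            Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}
```

Doing the same with a plain Dictionary<int, int> inside Parallel.For would be a data race; the concurrent collection takes care of the synchronization.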
