Techniques to implement polymorphism but save on the 4 bytes - C#

I came across a technique in template metaprogramming that allows one to implement polymorphism without the virtual function mechanism.
Hence I am wondering if there are other tricks one can use to get polymorphic behavior in C++ or C#.
EDIT: Also, some time ago I read that the visitor design pattern is an alternative to the virtual mechanism, but I cannot recall the article. Can someone confirm whether it can be used too?
EDIT2: I understand it is not an ideal programming practice, but a hack is what I am looking for, since optimization is of primary concern. The class hierarchy is fixed at compile time (the pointers are not assigned to the classes at run time by if-else logic, etc.).

Do you ever wonder why polymorphism tends to add around 4 bytes of overhead per object? It's because that's the simplest, most practical way to implement polymorphism. There are grotesque hacks possible that can simulate C++/C#/Objective-C polymorphism with less overhead, but they come with great tradeoffs--several times the CPU usage per call, for instance, or statically-stored class hierarchies with limited extensibility.
Polymorphism is implemented the way it is implemented because the way it is implemented is already optimal.

(EDIT: This answer is in the context of C#.)
You're not going to save 4 bytes per object anyway. The type will still have a single vtable to look up function member implementations. You might save one entry in that table by avoiding a virtual method, but that's a single entry in a single per-type table - it's not going to affect you on a per-instance basis.
Either you're missing something, or I am. It would help if you could edit the question to show what you're trying to do in the most natural implementation - then explain where you're trying to save space.

Templates can only be used to implement compile-time polymorphism. I'm not aware of any mechanism to implement runtime polymorphism without a space overhead, but why does it matter? If you need runtime polymorphism to solve your problem, you should use it.
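To show what compile-time polymorphism can look like in C# terms, here is a minimal sketch (the C# analogue of the C++ CRTP is sometimes called the curiously recurring generic pattern; all names here are illustrative). Because the concrete type arrives as a generic argument, and the JIT specializes generic code over struct type arguments, the constrained call below needs no vtable lookup:

using System;

interface IShape
{
    double Area();
}

struct Circle : IShape
{
    public double Radius;
    public double Area() => Math.PI * Radius * Radius;
}

struct Square : IShape
{
    public double Side;
    public double Area() => Side * Side;
}

static class Report
{
    // Because T is constrained and (here) a struct, the JIT compiles a
    // specialized body per instantiation and can call Area() directly,
    // without virtual dispatch and without boxing.
    public static double TotalArea<T>(T[] shapes) where T : IShape
    {
        double total = 0;
        foreach (var s in shapes) total += s.Area();
        return total;
    }
}

Note the tradeoff: the dispatch decision is made at compile time, so you cannot mix Circles and Squares in a single array.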

If you really want to save memory, you could implement your own memory handling. Block headers and footers are larger than 4 bytes each (I think they're 8 bytes each), so putting everything in a huge memory block and doing your own indexing would be the way to go. Don't even use objects, just binary indices.
My point is, unless you're designing low-level databases, OS kernels, or an SoC, you really shouldn't be worrying about this. Especially with C#. Do you know how much overhead garbage collection has?

There are lots of tricks, ranging from well-known techniques (e.g., the CRTP in C++) to rather ugly hacks (I once roughly quintupled the capacity of a particular program by eliminating vtable pointers; instead, I implemented a separate allocator for each of the classes, and found the correct vtable based on the object's address).
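To give a flavor of that address-based trick in C# terms (an illustrative analogue, not the original C++ hack; all names are made up): segregate instances by concrete type, so that which pool an object lives in encodes its type, and dispatch becomes a set of direct calls.

using System;
using System.Collections.Generic;

// Each concrete type gets its own pool; no per-object type
// information is stored, and no call below is virtual.
struct Dog { public void Speak() => Console.WriteLine("Woof"); }
struct Cat { public void Speak() => Console.WriteLine("Meow"); }

class Zoo
{
    readonly List<Dog> dogs = new List<Dog>();
    readonly List<Cat> cats = new List<Cat>();

    public void Add(Dog d) => dogs.Add(d);
    public void Add(Cat c) => cats.Add(c);

    public void SpeakAll()
    {
        // One loop per pool replaces per-object dynamic dispatch.
        foreach (var d in dogs) d.Speak();
        foreach (var c in cats) c.Speak();
    }
}

The price is exactly the one mentioned above: the set of types is fixed at compile time, so extensibility suffers.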
The visitor pattern really just displaces polymorphism rather than replacing it. In other words, what could/would have been polymorphic behavior in the class being visited is moved into the visitor class. To get dynamic binding, however, you still have a hierarchy of visitor classes, typically using the usual vtable mechanism to implement polymorphism. Under the right circumstances, this can still save a fair amount of memory though, as the number of visitor objects is often much smaller than the number of objects they visit.
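A minimal visitor sketch to illustrate the displacement (names are illustrative): the visited hierarchy keeps a single virtual Accept slot, while each new behavior becomes a new visitor class instead of a new virtual method.

interface IShapeVisitor
{
    void Visit(Line line);
    void Visit(Arc arc);
}

abstract class Figure
{
    public abstract void Accept(IShapeVisitor visitor);
}

class Line : Figure
{
    public override void Accept(IShapeVisitor visitor) => visitor.Visit(this);
}

class Arc : Figure
{
    public override void Accept(IShapeVisitor visitor) => visitor.Visit(this);
}

// The polymorphic behavior lives here, in the (few) visitor objects,
// rather than in the (many) visited objects.
class DrawVisitor : IShapeVisitor
{
    public void Visit(Line line) { /* draw a line */ }
    public void Visit(Arc arc) { /* draw an arc */ }
}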

Boost.Variant is a C++ library implementation of the visitor design pattern that avoids dynamic polymorphism.
And for your edification, the C++ static polymorphism approach you mentioned has a name: CRTP.

Is object casting an inevitability of reality when there is a need to design modular architecture?

It is common to read that object casting is a bad practice and should be avoided; for instance, the question Why should casting be avoided? has received some answers with great arguments:
By Jerry Coffin:
Looking at things more generally, the situation's pretty simple (at least IMO): a cast (obviously enough) means you're converting something from one type to another. When/if you do that, it raises the question "Why?" If you really want something to be a particular type, why didn't you define it to be that type to start with? That's not to say there's never a reason to do such a conversion, but anytime it happens, it should prompt the question of whether you could re-design the code so the correct type was used throughout.
By Eric Lippert:
Both kinds of casts are red flags. The first kind of cast raises the question "why exactly is it that the developer knows something that the compiler doesn't?" If you are in that situation then the better thing to do is usually to change the program so that the compiler does have a handle on reality. Then you don't need the cast; the analysis is done at compile time.
The second kind of cast raises the question "why isn't the operation being done in the target data type in the first place?" If you need a result in ints then why are you holding a double in the first place? Shouldn't you be holding an int?
Moving on to my question: recently I have started to look into the source code of the well-known open source project AutoFixture, originally developed by Mark Seemann, which I really appreciate.
One of the main components of the library is the interface ISpecimenBuilder, which defines a rather abstract method:
object Create(object request, ISpecimenContext context);
As you can see, the request parameter is of type object, so it accepts completely different types; different implementations of the interface treat different requests by their runtime type, checking whether each is something they are capable of dealing with and otherwise returning some kind of no-response representation.
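For example, a typical implementation looks roughly like this (a hedged sketch; the builder class and its string-property rule are made up, though NoSpecimen is AutoFixture's actual no-response type, found in its Kernel namespace alongside ISpecimenBuilder and ISpecimenContext):

using System;
using System.Reflection;

class StringPropertyBuilder : ISpecimenBuilder
{
    public object Create(object request, ISpecimenContext context)
    {
        // Handle only the requests this builder understands...
        if (request is PropertyInfo pi && pi.PropertyType == typeof(string))
            return pi.Name + Guid.NewGuid();

        // ...and signal "no response" otherwise, so the next builder
        // in the chain gets a chance.
        return new NoSpecimen();
    }
}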
It seems that the design of the interface does not adhere to the "good practice" that object casting should be used sparingly.
I was thinking to myself whether there is a better way to design this contract so that all the casting is avoided, but I couldn't find any solution.
Obviously the object parameter could be replaced with some marker interface, but that would not spare us the casting problem. I have also thought that it might be possible to use some variation of the visitor pattern as described here, but it does not seem very scalable: the visitor would have to have dozens of different methods, since there are so many implementations of the interface, each capable of dealing with different types of requests.
Although I basically agree with the arguments against using casting as part of a good design, in this specific scenario it seems not only the best option but also the only realistic one.
To sum up, are object casting and very general contracts an inevitability when there is a need to design a modular and extensible architecture?
I don't think that I can answer this question generally, for any type of application or framework, but I can offer an answer that specifically talks about AutoFixture, as well as offer some speculation about other usage scenarios.
If I had to write AutoFixture from scratch today, there are certainly things I'd do differently. Particularly, I wouldn't design the day-to-day API around something like ISpecimenBuilder. Rather, I'd design the data manipulation API around the concept of functors and monads, as outlined here.
This design is based entirely on generics, but it does require statically typed building blocks (also described in the article) known at compile time.
This is closely related to how something like QuickCheck works. When you write QuickCheck-based tests, you must supply generators for all of your own custom types. Haskell doesn't support run-time casting of values, but instead relies exclusively on generics and some compile-time automation. Granted, Haskell's generics are more powerful than C#'s, so you can't necessarily transfer the knowledge gained from Haskell to C#. It does suggest, however, that it's possible to write code entirely without relying on run-time casting.
AutoFixture does, however, support user-defined types without the need for the user to write custom generators. It does this via .NET Reflection. In .NET, the Reflection API is untyped; all the methods for generating objects and invoking members take object as input and return object as output.
Any application, library, or framework based on Reflection will have to perform some run-time casting. I don't see how to get around that.
Would it be possible to write data generators without Reflection? I haven't tried the following, but perhaps one could adopt a strategy where one would write 'the code' for a data generator directly in IL and use Reflection.Emit to dynamically compile an in-memory assembly that contains the generators.
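A minimal sketch of that idea, using DynamicMethod as a lighter-weight variant of a full in-memory assembly (names are illustrative; this emits the IL for "() => new T()" once, after which the generator runs without per-call reflection; as written it handles classes with a parameterless constructor only):

using System;
using System.Reflection;
using System.Reflection.Emit;

static class GeneratorFactory
{
    public static Func<object> ForDefaultCtor(Type type)
    {
        ConstructorInfo ctor = type.GetConstructor(Type.EmptyTypes)
            ?? throw new ArgumentException("Type needs a parameterless constructor.");

        var method = new DynamicMethod("Create_" + type.Name,
                                       typeof(object), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Newobj, ctor); // construct the instance
        il.Emit(OpCodes.Ret);          // return it (as object)
        return (Func<object>)method.CreateDelegate(typeof(Func<object>));
    }
}

// Usage: the reflection cost is paid once, at emit time.
// Func<object> makeUser = GeneratorFactory.ForDefaultCtor(typeof(User));
// object user = makeUser();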
This is a bit like how the Hiro container works, IIRC. I suppose that one could design other types of general-purpose frameworks around this concept, but I rarely see it done in .NET.

Dynamic Lang. Runtime vs Reflection

I am planning to use the dynamic keyword for my new project. But before stepping in, I would like to know about the pros and cons of using the dynamic keyword over Reflection.
Following are the pros I could find with respect to the dynamic keyword:
Readable\Maintainable code.
Fewer lines of code.
And the cons I have heard associated with using the dynamic keyword:
Affects application performance.
Dynamic keyword is internally a wrapper of Reflection.
Dynamic typing might turn into breeding ground for hard to find bugs.
Affects interoperability with previous .NET versions.
Please help me judge whether the pros and cons I came across are sensible or not.
Please help me judge whether the pros and cons I came across are sensible or not.
The concern I have with your pros and cons is that some of them do not address differences between using reflection and using dynamic. That dynamic typing makes for bugs that are not caught until runtime is true of any dynamic typing system. Reflection code is just as likely to have a bug as code that uses the dynamic type.
Rather than thinking of it in terms of pros and cons, think about it in more neutral terms. The question I'd ask is "What are the differences between using Reflection and using the dynamic type?"
First: with Reflection you get exactly what you asked for. With dynamic, you get what the C# compiler would have done had it been given the type information at compile time. Those are potentially two completely different things. If you have a MethodInfo to a particular method, and you invoke that method with a particular argument, then that is the method that gets invoked, period. If you use "dynamic", then you are asking the DLR to work out at runtime what the C# compiler's opinion is about which is the right method to call. The C# compiler might pick a method different than the one you actually wanted.
Second: with Reflection you can (if your code is granted suitably high levels of trust) do private reflection. You can invoke private methods, read private fields, and so on. Whether doing so is a good idea, I don't know. It certainly seems dangerous and foolish to me, but I don't know what your application is. With dynamic, you get the behaviour that you'd get from the C# compiler; private methods and fields are not visible.
Third: with Reflection, the code you write looks like a mechanism. It looks like you are loading a metadata source, extracting some types, extracting some method infos, and invoking methods on receiver objects through the method info. Every step of the way looks like the operation of a mechanism. With dynamic, every step of the way looks like business logic. You invoke a method on a receiver the same way as you'd do it in any other code. What is important? In some code, the mechanism is actually the most important thing. In some code, the business logic that the mechanism implements is the most important thing. Choose the technique that emphasises the right level of abstraction.
Fourth: the performance costs are different. With Reflection you do not get any cached behaviour, which means that operations are generally slower, but there is no memory cost for maintaining the cache and every operation is roughly the same cost. With the DLR, the first operation is very slow indeed as it does a huge amount of analysis, but the analysis is cached and reused. That consumes memory, in exchange for increased speed in subsequent calls in some scenarios. What the right balance of speed and memory usage is for your application, I don't know.
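To make the first and fourth differences concrete, here is a small side-by-side sketch (the Calculator type is made up):

using System;
using System.Reflection;

class Calculator
{
    public int Add(int a, int b) => a + b;
}

class Demo
{
    static void Main()
    {
        var calc = new Calculator();

        // Reflection: you name the exact method, and that method is
        // invoked, period. No caching; arguments travel as object[].
        MethodInfo add = typeof(Calculator).GetMethod("Add");
        object viaReflection = add.Invoke(calc, new object[] { 2, 3 });

        // dynamic: the DLR replays C# overload resolution at runtime
        // and caches the resulting call site for subsequent calls.
        dynamic d = calc;
        int viaDynamic = d.Add(2, 3);

        Console.WriteLine($"{viaReflection} {viaDynamic}"); // 5 5
    }
}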
Readable\Maintainable code
Certainly true in my experience.
Fewer lines of code.
Not significantly, but it will help.
Affects application performance.
Very slightly. But not even close to the way reflection does.
Dynamic keyword is internally a wrapper of Reflection.
Completely untrue. The dynamic keyword leverages the Dynamic Language Runtime.
[Edit: correction as per comment below]
It would seem that the Dynamic Language Runtime does use Reflection and the performance improvements are only due to caching techniques.
Dynamic typing might turn into breeding ground for hard to find bugs.
This may be true; it depends how you write your code. You are effectively removing compiler checking from your code. If your test coverage is good, this probably won't matter; if not then I suspect you will run into problems.
Affects interoperability with previous .NET versions
Not true. I mean you won't be able to compile your code against older versions, but if you want to do that then you should use the old versions as a base and up-compile it rather than the other way around. But if you want to use a .NET 2 library then you shouldn't run into too many problems, as long as you include the declaration in app.config / web.config.
One significant pro that you're missing is the improved interoperability with COM/ATL components.
There are 4 big differences between dynamic and Reflection. Below is a detailed explanation of each. Reference: http://www.codeproject.com/Articles/593881/What-is-the-difference-between-Reflection-and-Dyna
Point 1. Inspect vs. Invoke
Reflection can do two things: it can inspect metadata, and it can invoke methods at runtime. With dynamic we can only invoke methods. So if I am creating software like the Visual Studio IDE, then Reflection is the way to go. If I just want dynamic invocation from my C# code, dynamic is the best option.
Point 2. Private vs. Public Invoke
You cannot invoke private methods using dynamic. With Reflection it is possible to invoke private methods.
Point 3. Caching
dynamic uses Reflection internally and adds caching benefits on top. So if you just want to invoke an object dynamically, dynamic is the best choice, as you get the performance benefits.
Point 4. Static classes
Dynamic is instance specific: You don't have access to static members; you have to use Reflection in those scenarios.
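A short illustration of the private-invocation difference (the Vault type is made up):

using System;
using System.Reflection;

class Vault
{
    private string Combination() => "12-34-56";
}

static class PrivateInvokeDemo
{
    public static void Run()
    {
        var vault = new Vault();

        // Reflection can reach the private method (given enough trust):
        MethodInfo m = typeof(Vault).GetMethod(
            "Combination", BindingFlags.Instance | BindingFlags.NonPublic);
        Console.WriteLine(m.Invoke(vault, null)); // 12-34-56

        // dynamic sees only what the C# compiler would see, so this
        // fails at runtime with a RuntimeBinderException:
        dynamic d = vault;
        try { d.Combination(); }
        catch (Microsoft.CSharp.RuntimeBinder.RuntimeBinderException)
        {
            Console.WriteLine("private members are invisible to dynamic");
        }
    }
}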
In most cases, using the dynamic keyword will not result in meaningfully shorter code. In some cases it will; that depends on the provider and as such it's an important distinction. You should probably never use the dynamic keyword to access plain CLR objects; the benefit there is too small.
The dynamic keyword undermines automatic refactoring tools and makes high-coverage unit tests more important; after all, the compiler isn't checking much of anything when you use it. That's not as much of an issue when you're interoperating with a very stable or inherently dynamically typed API, but it's particularly nasty if you use keyword dynamic to access a library whose API might change in the future (such as any code you yourself write).
Use the keyword sparingly, where it makes sense, and make sure such code has ample unit tests. Don't use it where it's not needed or where type inference (e.g. var) can do the same.
Edit: You mention below that you're doing this for plug-ins. The Managed Extensibility Framework was designed with this in mind - it may be a better option than keyword dynamic and reflection.
If you are using dynamic specifically to do reflection your only concern is compatibility with previous versions. Otherwise it wins over reflection because it is more readable and shorter. You will lose strong typing and (some) performance from the very use of reflection anyway.
The way I see it, all your cons for using dynamic, except interoperability with older .NET versions, are also present when using Reflection:
Affects application performance
While it does affect performance, so does using Reflection. From what I remember, the DLR more or less uses Reflection the first time you access a method/property of your dynamic object for a given type, and caches the type/access-target pair so that later access is just a lookup in the cache, making it faster than Reflection.
Dynamic keyword is internally a wrapper of Reflection
Even if it were true (see above), how would that be a negative point? Whether or not it wraps Reflection shouldn't influence your application in any significant manner.
Dynamic typing might turn into breeding ground for hard to find bugs
While this is true, as long as you use it sparingly it shouldn't be that much of a problem. Furthermore, if you basically use it as a replacement for Reflection (that is, you use dynamic only for the briefest possible durations, when you want to access something via reflection), the risk of such bugs shouldn't be significantly higher than if you used reflection to access your methods/properties (of course, if you make everything dynamic it can be more of a problem).
Affects interoperability with previous .NET versions
For that you have to decide yourself how much of a concern it is for you.

Dependency injection in .Net without virtual method calls?

I've been thinking about whether it's possible to apply the DI pattern without incurring the cost of virtual method calls (which according to experiments I did can be up to 4 times slower than non-virtual calls). The first idea I had was to do dependency injection via generics:
sealed class ComponentA<TComponentB, TComponentC> : IComponentA
where TComponentB : IComponentB
where TComponentC : IComponentC
{ ... }
Unfortunately the CLR still does method calls via the interfaces even when concrete implementations of TComponentB and TComponentC are specified as generic type parameters and all of the classes are declared as sealed. The only way to get the CLR to do non-virtual calls is by changing all of the classes to structs (which implement the interfaces). Using struct doesn't really make sense for DI though and makes the issue below even more unsolvable.
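For reference, a hedged sketch of the struct variant (illustrative names): because the JIT generates specialized native code for each struct instantiation of a generic, the constrained interface call below can be compiled as a direct call.

interface ILogger
{
    void Log(string message);
}

struct ConsoleLogger : ILogger
{
    public void Log(string message) => System.Console.WriteLine(message);
}

// The dependency arrives as a generic type argument. With a struct
// constraint, the JIT specializes ComponentA<ConsoleLogger> and can
// invoke Log without an interface (virtual) dispatch.
sealed class ComponentA<TLogger> where TLogger : struct, ILogger
{
    private TLogger logger; // injected by value

    public ComponentA(TLogger logger)
    {
        this.logger = logger;
    }

    public void DoWork() => logger.Log("working");
}

// Usage: var a = new ComponentA<ConsoleLogger>(new ConsoleLogger());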
The second issue with the above solution is that it can't handle circular references. I can't think of any way, either by C# code or by constructing expression trees, to handle circular references, because that would entail infinitely recursing generic types. (.NET does support a generic type referencing itself, but that doesn't seem to generalize to this case.) Since only structs can cause the CLR to bypass the interfaces, I don't think this problem is solvable at all, because circular references between structs would cause a paradox.
There's only one other solution I can think of and it's guaranteed to work - emit all of the classes from scratch at runtime, maybe basing them on compiled classes as templates. Not really an ideal solution though.
Anyone have better ideas?
Edit: In regards to most of the comments, I guess I should say that this is filed under "pure intellectual curiosity". I debated whether to ask this because I do realize that I don't have any concrete case in which it's necessary. I was just thinking about it for fun and was wondering whether anyone else had come across this before.
A typical example of trying to completely over-engineer something, in my opinion. Just don't compromise your design because you can save a few tens of milliseconds - if it is even that.
Are you seriously suggesting that because of the callvirt instructions, your app ends up being so significantly slower that users (those people you write the app for) will notice any difference - at all? I doubt that very much.
This blog post explains why you can't optimize the virtual call.
While a callvirt instruction does take longer, it is usually used because it provides a cheap null check for the CLR prior to making the call to the method. A callvirt shouldn't take significantly longer than a call instruction, especially considering the null check.
Have you found that you could significantly improve the performance of your application by creating types (either structs or classes with static methods) that allow you to guarantee that the C# compiler will emit call instructions rather than callvirt instructions?
The reason I ask is that I am wondering if you are going to create an unmaintainable code base that is brittle and hard to use simply to solve a problem that may or may not exist.

Converting all parameters, return types, class definitions to use Interfaces

I am in the process of converting all my parameters, return types, and classes to use interfaces instead, i.e. IUser instead of User.
Besides the extra code required to maintain this, are there any negatives to this approach?
This isn't an uncommon approach, especially if you do a lot of mocking; however, it has issues with:
data-binding support (especially when adding rows to tables)
XML serialization (including comms: WCF, asmx, etc.), or contract-based serialization in general
You need to figure out whether the advantages of mocking etc outweigh these issues. It might be that you use IUser in most scenarios, but (for example) at the comms layer it may be simpler to use raw DTOs rather than interfaces.
Note that I'm applying the above to classes. If you involve structs, then remember that in most cases this will involve boxing too.
Overall, this will give you all the advantages associated with loose coupling, so in general I consider this a huge win. However, since you asked about disadvantages, here are some minor ones I can think of:
There's more code involved because you have both the interface declaration and at least one implementation (you already mentioned this, so I'm just including it for completeness' sake).
It becomes more difficult to navigate the code because you can't just Go to Definition to review what another method does (although I'm told that ReSharper helps in this regard).
In some cases there may be more in a contract than mere semantics, and an interface can't capture that. The classic example from the BCL is that if you implement IList, you must increment the Count property every time you add an item. However, nothing in the interface forces you, as a developer, to do this. In such cases, abstract base classes may be worth considering instead (see the sketch after this list).
Interfaces are brittle when it comes to adding new members. Once again, you may consider abstract base classes instead.
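A hedged sketch of how an abstract base class can enforce the kind of contract an interface can't (names are illustrative; the base class owns the Count bookkeeping so implementers cannot forget it):

using System.Collections.Generic;

abstract class CountedCollection<T>
{
    public int Count { get; private set; }

    // Non-virtual: every Add goes through here, so the Count contract
    // is honored no matter what the subclass does.
    public void Add(T item)
    {
        AddCore(item);
        Count++;
    }

    // Subclasses only supply the storage step.
    protected abstract void AddCore(T item);
}

class ListBackedCollection<T> : CountedCollection<T>
{
    private readonly List<T> items = new List<T>();
    protected override void AddCore(T item) => items.Add(item);
}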
I once went through a phase of this, but in practice found that for anemic data objects (i.e. POCOs) the interfaces weren't required.
In practice it can be useful to have interfaces to define a contract for behaviour, but not necessarily for attributes.
In general I'd suggest you let your unit testing guide you. If you have rich objects throughout your application then you'll most likely need interfaces. If you have POCOs, you most likely will only need them for controller-style classes.
Interfaces are a very good thing, but applying them to all artifacts is overkill. Especially in Java you would end up with two distinct files (interface + implementation). So (as always), it really depends :)
Regarding 'interface or not-to-interface'
I would not give domain objects interfaces (e.g. User). In my view it rather hinders code comprehension (for domain objects the interface would often just define getter methods). Recently it was a real pain to get unit tests in place while having domain-object interfaces: I had to mock all these getters, and getting test data into the domain-object mocks was rather cumbersome.
The opposite is true for coarse-grained services and API interfaces. For them I always have an interface from the start.
In more internal, module-encapsulated scenarios I start without an interface and, if it becomes necessary as the code evolves, react with an extract-interface refactoring.
'I' prefix for interface artifact names
I also used to work with the Ixxx prefix, but I have (happily) gotten rid of it nowadays, for the following reasons:
It is difficult to keep up all this 'I' naming convention in an evolving codebase, especially if you refactor a lot (extract-interface, collapse-interface, etc.)
For client code it should be transparent whether the type is interface- or class-based.
The 'I' can make you lazy about good interface and class names.
Not so much a disadvantage as a warning: you will find that to achieve good component decoupling you will need to use a Dependency Injection framework (that is, if you want to keep your sanity and have some sort of idea what your interfaces map to).
What also tends to happen is that non-trivial classes sometimes naturally split into more than one interface. This is especially true when you have public static methods (e.g. User.CreateAdminUser). You will also find that it's harder for one interface to both hold state AND do stuff. It's frequently more natural for an interface to be either one or the other.
If you get stuck in the process, do take a minute a do some research into what's an appropriate paradigm that you are trying to implement. Chances are someone has solved this particular design pattern before. Considering this is a big refactoring job, you might as well take extra time and do it properly.
avoid the ISuck prefix.
edit: the ISuck convention is a manifestation of the Systems Hungarian notation applied to type names.

What should be on a checklist that would help someone develop good OO software?

I have used OO programming languages and techniques years ago (primarily on C++) but in the intervening time haven't done much with OO.
I'm starting to make a small utility in C#. I could simply program it all without using good OO practice, but it would be a good refresher for me to apply OO techniques.
Like database normalization levels, I'm looking for a checklist that will remind me of the various rules of thumb for a 'good' object oriented program - a concise yes/no list that I can read through occasionally during design and implementation to prevent me from thinking and working procedurally. Would be even more useful if it contained the proper OO terms and concepts so that any check item is easily searchable for further information.
What should be on a checklist that would help someone develop good OO software?
Conversely, what 'tests' could be applied that would show software is not OO?
Objects do things. (Most important point in the whole of OOP!) Don't think about them as "data holders" - send them a message to do something. What verbs should my class have? The "Responsibility-Driven Design" school of thinking is brilliant for this. (See Object Design: Roles, Responsibilities, and Collaborations, Rebecca Wirfs-Brock and Alan McKean, Addison-Wesley 2003, ISBN 0201379430.)
For each thing the system must do, come up with a bunch of concrete scenarios describing how the objects talk to each other to get the job done. This means thinking in terms of interaction diagrams and acting out the method calls. - Don't start with a class diagram - that's SQL-thinking not OO-thinking.
Learn Test-Driven Development. Nobody gets their object model right up front but if you do TDD you're putting in the groundwork to make sure your object model does what it needs to and making it safe to refactor when things change later.
Only build for the requirements you have now - don't obsess about "re-use" or stuff that will be "useful later". If you only build what you need right now, you're keeping the design space of things you could do later much more open.
Forget about inheritance when you're modelling objects. It's just one way of implementing common code. When you're modelling objects just pretend you're looking at each object through an interface that describes what it can be asked to do.
If a method takes loads of parameters or if you need to repeatedly call a bunch of objects to get lots of data, the method might be in the wrong class. The best place for a method is right next to most of the fields it uses in the same class (or superclass ...)
Read a Design Patterns book for your language. If it's C#, try "Design Patterns in C#" by Steve Metsker. This will teach you a series of tricks you can use to divide work up between objects.
Don't test an object to see what type it is and then take action based on that type - that's a code smell that the object should probably be doing the work. It's a hint that you should call the object and ask it to do the work. (If only some kinds of objects do the work, you can simply have "do nothing" implementations in some objects - that's legitimate OOP; see the sketch after this list.)
Putting the methods and data in the right classes makes OO code run faster (and gives virtual machines a chance to optimise better) - it's not just aesthetic or theoretical. The Sharble and Cohen study points this out - see http://portal.acm.org/citation.cfm?doid=159420.155839 (See the graph of metrics on "number of instructions executed per scenario")
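A sketch of such a "do nothing" (null object) implementation, with made-up names:

interface INotifier
{
    void Notify(string message);
}

class EmailNotifier : INotifier
{
    public void Notify(string message)
    {
        /* send an email here */
    }
}

// Some objects legitimately opt out; callers never test the type.
class NullNotifier : INotifier
{
    public void Notify(string message)
    {
        // intentionally does nothing
    }
}

// Caller: no "if (n is EmailNotifier)" anywhere.
// foreach (INotifier n in notifiers) n.Notify("build finished");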
Sounds like you want some basic yes/no questions to ask yourself along your way. Everyone has given some great "do this" and "think like that" lists, so here is my crack at some simple yes/no's.
Can I answer yes to all of these?
Do my classes represent the nouns I am concerned with?
Do my classes provide methods for actions/verbs that it can perform?
Can I answer no to all of these?
Do I have global state data that could either be put into a singleton or stored in class implementations that work with it?
Can I remove any public methods on a class and add them to an interface or make them private/protected to better encapsulate the behavior?
Should I use an interface to separate a behavior away from other interfaces or the implementing class?
Do I have code that is repeated between related classes that I can move into a base class for better code reuse and abstraction?
Am I testing for the type of something to decide what action to take? If so, can this behavior be included in the base type or interface that the code in question is using, to allow more effective use of the abstraction? Or should the code in question be refactored to use a better base class or interface?
Am I repeatedly checking some context data to decide what type to instantiate? If so, can this be abstracted out into a factory design pattern for better abstraction of logic and code reuse?
Is a class very large with multiple focuses of functionality? If so can I divide it up into multiple classes, each with their own single purpose?
Do I have unrelated classes inheriting from the same base class? If so can I divide the base class into better abstractions or can I use composition to gain access to functionality?
Has my inheritance hierarchy become fearfully deep? If so can I flatten it or separate things via interfaces or splitting functionality?
Have I worried way too much about my inheritance hierarchy?
When I explain the design to a rubber ducky do I feel stupid about the design or stupid about talking to a duck?
Just some quick ones off the top of my head. I hope it helps; OOP can get pretty crazy. I didn't include any yes/no's for the more advanced stuff that's usually a concern with larger apps, like dependency injection or whether you should split something out into different service/logic layers/assemblies... of course I hope you at least separate your UI from your logic.
Gathered from various books, famous C# programmers, and general advice (not much, if any, of this is mine; it is mine only in the sense that these are various questions I ask myself during development):
Structs or classes? If the item you're creating is a value of its own, make it a struct. If it's an "object" with attributes and sub-values, methods, and possibly state, then make it a class.
Sealed classes: if you're going to create a class and you don't explicitly need to be able to inherit from it, make it sealed. (I do it for the supposed performance gain.)
Don't Repeat Yourself: if you find yourself copy-past(a/e)ing code around then you should probably (but not always) rethink your design to minimize code duplication.
If you don't need to provide a base implementation for a given abstract class turn it into an interface.
The specialization principle: Each object you have should only do one thing. This helps avoid the "God-object".
Use properties for public access: this has been debated over and over again, but it's really the best thing to do. Properties allow you to do things you normally can't with fields, and they also give you more control over how the value is read and written.
Singletons: another controversial topic, and here's the idea: only use them when you absolutely need to. Most of the time a bunch of static methods can serve the purpose of a singleton. (Though if you absolutely need a singleton pattern, use Jon Skeet's excellent one; a sketch follows this list.)
Loose coupling: Make sure that your classes depend on each other as little as possible; make sure that it's easy for your library users to swap out parts of your library with others (or custom built portions). This includes using interfaces where necessary, Encapsulation (others have mentioned it), and most of the rest of the principles in this answer.
Design with simplicity in mind: Unlike cake frosting, it's easier to design something simple now and add later than it is to design complex now and remove later.
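A sketch of one well-known thread-safe singleton variant (in the spirit of Jon Skeet's singleton article, using Lazy<T>; the AppSettings name is made up):

using System;

sealed class AppSettings
{
    // Lazy<T> gives thread-safe, lazy, one-time construction.
    private static readonly Lazy<AppSettings> instance =
        new Lazy<AppSettings>(() => new AppSettings());

    public static AppSettings Instance => instance.Value;

    private AppSettings() { } // no construction from outside
}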
I might chuck some or all of this out the door if I'm:
Writing a personal project
Really in a hurry to get something done (but I'll come back to it later... sometime... ;) )
These principles help guide my everyday coding and have improved my coding quality vastly in some areas! hope that helps!
Steve McConnell's Code Complete 2 contains a lot of ready to use checklists for good software construction.
Robert C. Martin's Agile Principles, Patterns, and Practices in C# contains a lot of principles for good OO design.
Both will give you a solid foundation to start with.
Data belongs with the code that operates on it (i.e. into the same class). This improves maintainability because many fields and methods can be private (encapsulation) and are thus to some degree removed from consideration when looking at the interaction between components.
Use polymorphism instead of conditions - whenever you have to do different things based on what class an object is, try to put that behaviour into a method that different classes implement differently, so that all you have to do is call that method (see the sketch below).
Use inheritance sparingly, prefer composition - Inheritance is a distinctive feature of OO programming and often seen as the "essence" of OOP. It is in fact gravely overused and should be classified as the least important feature
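A minimal before/after sketch of replacing a type-testing conditional with polymorphism (names are made up):

// Before: the caller chooses behavior by inspecting the type.
// decimal bonus = employee is Manager ? 2000m
//               : employee is Engineer ? 1500m
//               : 0m;

abstract class Employee
{
    public abstract decimal Bonus();
}

class Manager : Employee
{
    public override decimal Bonus() => 2000m;
}

class Engineer : Employee
{
    public override decimal Bonus() => 1500m;
}

// After: the caller just asks the object.
// decimal bonus = employee.Bonus();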
Have I clearly defined the requirements? Formal requirements documentation may not be necessary, but you should have a clear vision before you begin coding. Mind-mapping tools and prototypes or design sketches may be good alternatives if you don't need formal documentation. Work with end-users and stakeholders as early as possible in the software process, to make sure you are implementing what they need.
Am I reinventing the wheel? If you are coding to solve a common problem, look for a robust library that already solves this problem. If you think you might already have solved the problem elsewhere in your code (or a co-worker might have), look first for an existing solution.
Does my object have a clear, single purpose? Following the principle of Encapsulation, an object should have behavior together with the data that it operates on. An object should only have one major responsibility.
Can I code to an interface? Design By Contract is a great way to enable unit testing, document detailed, class-level requirements, and encourage code reuse.
Can I put my code under test? Test-Driven Development (TDD) is not always easy; but unit tests are invaluable for refactoring, and verifying regression behavior after making changes. Goes hand-in-hand with Design By Contract.
Am I overdesigning? Don't try to code a reusable component. Don't try to anticipate future requirements. These things may seem counterintuitive, but they lead to better design. The first time you code something, just implement it as straightforwardly as possible, and make it work. The second time you use the same logic, copy and paste. Once you have two working sections of code with common logic, you can easily refactor without trying to anticipate future requirements.
Am I introducing redundant code? Don't Repeat Yourself (DRY) is the biggest driving principle of refactoring. Use copy-and-paste only as a first step toward refactoring. Don't code the same thing in different places; it's a maintenance nightmare.
Is this a common design pattern, anti-pattern, or code smell? Be familiar with common solutions to OO design problems, and look for them as you code - but don't try to force a problem to fit a certain pattern. Watch out for code that falls into a common "bad practice" pattern.
Is my code too tightly coupled? Loose Coupling is a principle that tries to reduce the inter-dependencies between two or more classes. Some dependencies are necessary; but the more you are dependent on another class, the more you have to fix when that class changes. Don't let code in one class depend on the implementation details of another class - use an object only according to its contract.
Am I exposing too much information? Practice information hiding. If a method or field doesn't need to be public, make it private. Expose only the minimum API necessary for an object to fulfill its contract. Don't make implementation details accessible to client objects.
Am I coding defensively? Check for error conditions, and Fail Fast. Don't be afraid to use exceptions, and let them propagate. In the event your program reaches an unexpected state, it's much, much better to abort an operation, log a stack trace for you to work with, and avoid unpredictable behavior in your downstream code. Follow best practices for cleaning up resources, such as the using() {} statement.
Will I be able to read this code in six months? Good code is readable with minimal documentation. Put comments where necessary; but also write code that's intuitive, and use meaningful class, method, and variable names. Practice good coding style; if you're working on a team project, each member of the team should write code that looks the same.
Does it still work? Test early, test often. After introducing new functionality, go back and touch any existing behavior that might have been affected. Get other team members to peer review and test your code. Rerun unit tests after making changes, and keep them up to date.
One of the best sources would be Martin Fowler's "Refactoring" book which contains a list (and supporting detail) of object oriented code smells that you might want to consider refactoring.
I would also recommend the checklists in Robert Martin's "Clean Code".
SOLID
DRY
TDD
Composition over inheritance
Make sure you read up and understand the following
Encapsulation
(Making sure you only expose the minimal state and functionality to get the job done)
Polymorphism
(Ability for derived objects to behave like their parents)
The difference between an interface and an abstract class (an abstract class allows functionality and state to be shared with its descendants; an interface is only the promise that the functionality will be implemented)
I like this list, although it might be a little dense to be used as a checklist.
UML - Unified Modeling Language, for object modeling and defining the structure of and relationships between classes
http://en.wikipedia.org/wiki/Unified_Modeling_Language
Then of course the programming techniques for OO (most already mentioned)
Information Hiding
Abstraction
Interfaces
Encapsulation
Inheritance / Polymorphism
Some of the rules are language agnostic, some of the rules differ from language to language.
Here are a few rules and comments that contradict some of the previously posted rules:
OO has 4 principles:
Abstraction, Encapsulation, Polymorphism and Inheritance.
Read about them and keep them in mind.
Modelling - Your classes are supposed to model entities in the problem domain:
Divide the problem into sub-domains (packages/namespaces/assemblies),
then divide the entities in each sub-domain into classes.
Classes should contain methods that model what objects of that type do.
Use UML; think about the requirements and use cases, then class diagrams, then sequences.
(Applicable mainly for high-level design - main classes and processes.)
Design Patterns - good concept for any language, implementation differs between languages.
Struct vs. class - in C# this is a matter of passing data by value or by reference.
Interface vs. base class - "is a" suggests a base class, "does" suggests an interface.
TDD - this is not OO; in fact, it can cause a lack of design and lead to a lot of rewriting. (Scrum, for instance, recommends against it; for XP it is a must.)
C#'s Reflection in some ways overrides OO (for example, reflection-based serialization), but it is necessary for advanced frameworks.
Make sure classes do not "know of" other classes unless they have to; dividing into multiple assemblies and being sparing with references helps.
AOP (Aspect Oriented Programming) enhances OOP, (see PostSharp for instance) in a way that is so revolutionary that you should at least look it up and watch their clip.
For C#, read the MS guidelines (look up "guidelines" in the index of VS's MSDN help); they have a lot of useful guidelines and conventions there.
Recommended books:
Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries
C# 3.0 in a Nutshell
