Can you ever have too many "protected virtual" methods? - c#

Here's a question for those of you with experience in larger projects and API/framework design.
I am working on a framework that will be used by many other projects in the future, so I want to make it nice and extensible, but at the same time it needs to be simple and easy to understand.
I know that a lot of people complain that the .NET framework contains too many sealed classes and private members. Should I avoid this criticism and open up all my classes with plenty of protected virtual members?
Is it a good idea to make as many of my methods and properties protected virtual as possible? Under what circumstances would you avoid protected virtual and make members private?

Your class includes data members. Methods that perform basic internal operations on those data members, where the functionality should never change, should always be private. That covers basic operations such as initialization and allocation. Otherwise, you run the risk of "second order" derived classes getting an incomplete set of behaviors, because a first-level derived class could redefine the behavior of the base class.
That said, I would use great caution in defining methods as "protected virtual", because doing so not only declares the possibility of overriding the functionality, but in some ways defines an expectation of overridden functionality. Opening up everything sounds to me like an underdefined set of behaviors to override; I would rather have a well-defined set of behaviors to override. If you want a very large set of overridable behaviors, I would look into Aspect Oriented Programming, which allows for that sort of thing in a very structured way.

When you mark a method with the word virtual, you're allowing the users to change the way that piece of logic is executed. For many purposes, that is exactly what you want. I believe you already know that.
However, types should be designed for this sort of extension. You have to actively pick out the methods where it makes sense to let the user change the behavior. If you just slap virtual on all over the place, you risk ruining the integrity of the type. It doesn't really help the user to understand the type, and you may introduce a number of bugs, including security-related issues.
I prefer the conservative approach. I mark all my classes as sealed unless I specifically want to enable inheritance, and in those (few) cases I only make the required methods virtual.
It is easy to remove the sealed modifier if the class needs to change to allow inheritance in the future. However, if you want to change a class that is already being used as a base class for some other type, you risk breaking the subclass when you change the base class.
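As a rough sketch of that conservative approach (the names Importer, HexImporter and ParseCore are made up for illustration, not taken from any real library): the base class exposes exactly one deliberate extension point, keeps its public entry point non-virtual so validation always runs, and the leaf class is sealed.

```csharp
using System;

// Designed for inheritance: exactly one documented extension point.
public class Importer
{
    // Non-virtual entry point, so the validation below cannot be bypassed.
    public int Import(string line)
    {
        if (string.IsNullOrWhiteSpace(line))
            throw new ArgumentException("Line must not be empty.", nameof(line));
        return ParseCore(line.Trim());
    }

    // The only member subclasses are expected to override.
    protected virtual int ParseCore(string line) => int.Parse(line);
}

// Not designed for further inheritance, so it is sealed.
public sealed class HexImporter : Importer
{
    protected override int ParseCore(string line) => Convert.ToInt32(line, 16);
}
```

A subclass can change how a line is parsed, but it cannot skip the empty-line check, which is the kind of integrity the answer is talking about.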

My point of view is:
If you can use events, they are preferred to protected methods.
Try to avoid protected methods as much as possible; if that is not possible, then you have to use them ;-).
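A minimal sketch of the event-based alternative, assuming a hypothetical Downloader class (the class, the event name and the string payload are all invented for this example). Callers hook in by subscribing, so the class can stay sealed and needs no protected members.

```csharp
using System;

public sealed class Downloader
{
    // Subscribers react to the event; no subclass is needed.
    public event EventHandler<string> Completed;

    public void Download(string url)
    {
        // ... the real work would happen here ...
        Completed?.Invoke(this, url);
    }
}
```

Usage would look like `downloader.Completed += (sender, url) => Console.WriteLine(url);`.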

Choosing protected over private is a deliberate design decision. You are stating that your class explicitly supports having that function used, with all the overhead (design and implementation effort) that comes with that. I would only use protected in those situations where I know that it is necessary, largely because I am doing it myself. (You'll also find comments from BCL developers along the same lines as what I have said.)
The virtual/non-virtual performance difference is irrelevant on any machine that is powerful enough to run the .NET Framework.

No, you can't have "too many." However, the idea that we should just make everything protected instead of private, or avoid "sealed" at all costs, is just silly. I would keep "helper methods" and internal data structures private.

Is it a good idea to make as many of my methods and properties protected virtual as possible?
Not a good idea.
Protected virtual methods provide extensibility points in the framework while adding coupling.
There are more promising techniques to provide extensibility: Composition and Delegation.
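To make composition and delegation concrete, here is a small sketch under assumed names (IFormatter, UpperCaseFormatter and ReportGenerator are hypothetical). The class is sealed; it is extended by handing it collaborators rather than by overriding protected virtual members.

```csharp
using System;

public interface IFormatter
{
    string Format(string text);
}

public sealed class UpperCaseFormatter : IFormatter
{
    public string Format(string text) => text.ToUpperInvariant();
}

public sealed class ReportGenerator
{
    private readonly IFormatter _formatter;          // composition
    private readonly Func<string, string> _decorate; // delegation

    public ReportGenerator(IFormatter formatter, Func<string, string> decorate = null)
    {
        _formatter = formatter ?? throw new ArgumentNullException(nameof(formatter));
        _decorate = decorate ?? (s => s); // default: leave the text unchanged
    }

    public string Generate(string body) => _decorate(_formatter.Format(body));
}
```

The coupling is limited to the IFormatter interface and the delegate signature, instead of the internals of a base class.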

Related

How can I prevent methods from being added to a class?

I'm trying to find out if there's a way to stop functions/methods from being added (EDIT: by other developers) to a class for the case where the object is a Model or DTO which should not contain methods (to prevent 'abuse' of the Models/DTOs by others, who may try and add 'helper' methods etc).
Is there any way to achieve this?
Use reflection and write a unit test that fails if a model-class has methods.
Mark all your model classes with a custom attribute. Then write a unit test that uses reflection to load a given assembly, iterates over all classes in that assembly, and checks that classes marked with the model attribute do not have methods. This should be fairly straightforward using reflection.
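A sketch of what such a check might look like. ModelAttribute, the two DTO classes and ModelConventions are all hypothetical names invented here; a real test would call FindViolations from whatever unit test framework you use and assert that the result is empty.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical marker attribute for model/DTO classes.
[AttributeUsage(AttributeTargets.Class)]
public sealed class ModelAttribute : Attribute { }

[Model]
public class CustomerDto
{
    public string Name { get; set; }
}

[Model]
public class OrderDto
{
    public decimal Total { get; set; }
    public decimal WithTax() => Total * 1.2m; // a "helper" that should be flagged
}

public static class ModelConventions
{
    // Returns "Type.Method" for every method declared directly on a [Model] class.
    // Property accessors are skipped because they are marked as special names.
    public static string[] FindViolations(Assembly assembly)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.NonPublic |
                                   BindingFlags.Instance | BindingFlags.Static |
                                   BindingFlags.DeclaredOnly;
        return assembly.GetTypes()
            .Where(t => t.IsClass && t.GetCustomAttribute<ModelAttribute>() != null)
            .SelectMany(t => t.GetMethods(flags)
                .Where(m => !m.IsSpecialName)
                .Select(m => t.Name + "." + m.Name))
            .ToArray();
    }
}
```

Note that this simple version would also flag overrides such as Equals and GetHashCode, so you may want an allow-list if your DTOs need them.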
I believe you are trying to solve a procedural issue with code where you should be using communication.
Your colleagues (I assume) are operating on the code files with 'full trust' privileges. If they abuse that privilege you should open a dialogue. Use the change as an opportunity to educate them on the intended design. Perhaps they are correct and you will be educated!
I suggest simply making the intended design obvious in the class name and with a comment stating the intended nature. Perhaps quote the design document(s) that informed the class.
You cannot stop anyone with full write access to your code base from doing so. The only two things you can do are to create a code analysis rule for FxCop, as mentioned by Christian.K in the comments, or to write your DTO class so that it is undoubtedly a DTO that should not have any methods, by giving the class an unambiguous name and, if that is not enough, adding code comments that tell the coder not to do so.
However, you may need some methods after all, for example when using collections, where you need some way to compare whether two instances of your DTO are equal. In that case you have to provide at least an Equals and a GetHashCode method.
You don't need to use a struct to prevent additions to a class. You can use the sealed keyword:
public sealed class MyDTOObject { ... }
Now, you cannot both inherit from a class and prevent inheritance (which is essentially what you're asking). The very act of inheriting from MyDTOObject creates a new class that is based on MyDTOObject but is not equal to it, nor restricted or defined in any way by its implementation.
You can use an abstract class, to force derived classes to implement certain methods, but not the other way around.
If you want to prevent others from deriving from your class and implementing helper methods, you must use the sealed keyword, or mark the class internal.
You may prevent the class from being extended or inherited by marking it final; that way nobody would be able to extend your class and hence nobody could add any behavior. But stop and ask yourself whether you want to do that or not, because then you'd be signing an invisible contract that everything ever required by the class is written in the class and this class needs no further addition.
To be clear, I was talking in Java context.

When and why would you seal a class?

In C# and C++/CLI the keyword sealed (or NotInheritable in VB) is used to protect a class from being inherited (the class will be non-inheritable). I know that one feature of object-oriented programming is inheritance, and I feel that the use of sealed goes against this feature: it stops inheritance.
Is there an example that shows the benefit of sealed and when it is important to use it?
On a class that implements security features, so that the original object cannot be "impersonated".
More generally, I recently spoke with a person at Microsoft, who told me they tried to limit inheritance to the places where it really made full sense, because it becomes expensive performance-wise if left untreated. The sealed keyword tells the CLR that there is no class further down to look for methods, and that speeds things up.
In most performance-enhancing tools on the market nowadays, you will find a checkbox that will seal all your classes that aren't inherited.
Be careful though, because if you want to allow plugins or assembly discovery through MEF, you will run into problems.
An addendum to Louis Kottmann's excellent answer:
If a class isn't designed for inheritance, subclasses might break class invariants. This really only applies if you're creating a public API, of course, but as a rule of thumb I seal any class not explicitly designed to be subclassed.
On a related note, applicable to unsealed classes only: any method created virtual is an extension point, or at least looks like it should be an extension point. Declaring methods virtual should be a conscious decision as well. (In C# this is a conscious decision; in Java it isn't.)
And then there's this:
Sealing can make unit testing more difficult, as it prohibits mocking.
Some relevant links:
Effective Java, 2nd Edition by Joshua Bloch. See item 17 (requires Safari subscription)
Effective Java Item 17: Design and document for inheritance or else prohibit it (discussion of same item)
Also note that Kotlin seals classes by default; its open keyword is the opposite of Java's final or the sealed of C#. (To be sure, there is no universal agreement that this is a good thing.)
Marking a class as sealed prevents tampering with important classes in ways that could compromise security or affect performance.
Many times, sealing a class also makes sense when one is designing a utility class with fixed behaviour, which we don't want to change.
For example, System namespace in C# provides many classes which are sealed, such as String. If not sealed, it would be possible to extend its functionality, which might be undesirable, as it's a fundamental type with given functionality.
Similarly, structures in C# are always implicitly sealed. Hence one cannot derive one structure/class from another structure. The reasoning for this is that structures are used to model only stand-alone, atomic, user-defined data types, which we don't want to modify.
Sometimes, when you are building class hierarchies, you might want to cap off a certain branch in the inheritance chain, based on your domain model or business rules.
For example, a Manager and PartTimeEmployee are both Employees, but you don't have any role after part-time employees in your organization. In this case, you might want to seal PartTimeEmployee to prevent further branching. On the other hand, if you have hourly or weekly part-time employees, it might make sense to inherit them from PartTimeEmployee.
I think this post makes some good points. The specific case was casting a class to an arbitrary interface that it does not implement: with a non-sealed class the compiler doesn't report an error, but with a sealed class the compiler reports that it can't convert. A sealed class brings additional code access security.
https://www.codeproject.com/Articles/239939/Csharp-Tweaks-Why-to-use-the-sealed-keyword-on-cla
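The casting behaviour described there can be reproduced with a short sketch (all type names below are invented for illustration). For a non-sealed class the compiler has to allow the cast, because some subclass might implement the interface; for a sealed class no such subclass can exist, so the cast is rejected at compile time.

```csharp
using System;

public interface IMarker { }

public class OpenThing { }
public sealed class ClosedThing { }
public class OpenThingWithMarker : OpenThing, IMarker { }

public static class CastDemo
{
    // Compiles: the check is deferred to run time.
    public static bool IsMarker(OpenThing thing) => thing is IMarker;

    // Compiles too, but throws InvalidCastException at run time
    // if the actual object does not implement IMarker.
    public static IMarker Cast(OpenThing thing) => (IMarker)thing;

    // The same cast on the sealed class does not compile (error CS0030),
    // so it is left commented out:
    // public static IMarker Cast(ClosedThing thing) => (IMarker)thing;
}
```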
Sealing is a conscious decision that should be considered only when you want to clearly reveal your intent about the structural characteristics of your class. It is a structural choice about your object model. It should never be a decision about performance, or security(**). But more importantly, never about arbitrary limits to your inheritance tree.
I am putting forward this rule of thumb:
A class should never be sealed if you have to think whether it is a good idea to seal it. A decision to seal a class should be obvious to you and will be made even before you write the class's first line of code.
As an example, since we can't derive from them but they look so much like a regular class, we often think of structs as sealed classes. Which is what they are. It is this limitation that allows them to implement value-type semantics since inheritance and polymorphism can only work with reference types. So the "struct class" is sealed because any class that implements value-type semantics must give away inheritance and have its memory managed differently. (Note that this is true of any value-type object in C#, not just structs).
Another example: A code generator may write a sealed class representing a window and all its elements for the user to define the behavior on, because the UI engine expects this class, and no other, in order to be able to render the window.
And last example: A math utility class may be sealed because it is built around truisms, and any extended behavior can never be correct or "work as intended". This is one example that doesn't exactly fall under the rule of thumb above. Never blindly trust rules of thumb.
(**) If performance is an issue in your application, you can be sure unsealed classes are not the reason. Similarly, if you depend on sealed classes to enforce security in your application, the problem must be on your base classes -- what they expose or allow to be extended.

In C#, is it ok to virtualize every method?

This may make a lot of C# programmers cringe, but is it ok to virtual-ize every method in a base class -- even if certain methods are never overridden?
The reason I need to do this is that I have a special case where I need to get C# to act like Java. It's actually an automatic program transformation of a Java program.
I'm thinking to mark any Java method that has no base method as virtual, and any that do have an associated base method as override.
Aside from a lack of flexibility, are there any other issues with doing it this way?
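A minimal sketch of the mapping described above, using made-up classes: a Java method with no base method becomes virtual, and one that overrides a base method becomes override. This only covers ordinary instance methods; how the transformation treats Java's static, private or final methods is not addressed here.

```csharp
using System;

public class Animal
{
    // Java methods with no base method: emitted as virtual.
    public virtual string Speak() => "...";
    public virtual string Name() => "animal";
}

public class Dog : Animal
{
    // Java method that overrides a base method: emitted as override.
    public override string Speak() => "Woof";
}
```

With this mapping, a call through a base-typed reference dispatches to the most derived implementation, as it would in Java.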
Yes it is okay, but not necessarily a good practice. Virtualization helps with two things: inheritance and decoupling (for things like unit testing or replacing out other classes with new ones).
Prefer composition over inheritance in your OO design and you can use interfaces instead of virtuals with your classes. That will give you what you need for both unit testing and composition.
But, with the speed of today's CPUs, I'd not worry terribly about the extra V-table lookup if everything is virtual.
So I suggest, if you can solve your problem by providing an interface for the java program, do that. Otherwise, don't lose any sleep over having the methods virtual.
Not only is this OK, but some applications need this.
NHibernate requires you to mark properties as virtual for mapping, for example.
This is ok. There will be no unwanted behaviors caused by this, and as far as I know, virtual methods do not cause the classes to slow down. So, why not?

Are protected members/fields really that bad?

Now if you read the naming conventions in the MSDN for C# you will notice that it states that properties are always preferred over public and protected fields. I have even been told by some people that you should never use public or protected fields. Now I will agree I have yet to find a reason in which I need to have a public field but are protected fields really that bad?
I can see it if you need to make sure that certain validation checks are performed when getting/setting the value; however, a lot of the time it seems like just extra overhead, in my opinion. Let's say I have a class GameItem with fields for baseName, prefixName, and suffixName. Why should I take on the overhead of creating the properties (C#) or accessor methods, plus the performance hit I would incur? If I do this for every single field in an application, I am sure that it would add up at least a little, especially in certain languages like PHP or in applications where performance is critical, like games.
Are protected members/fields really that bad?
No. They are way, way worse.
As soon as a member is more accessible than private, you are making guarantees to other classes about how that member will behave. Since a field is totally uncontrolled, putting it "out in the wild" opens your class and classes that inherit from or interact with your class to higher bug risk. There is no way to know when a field changes, no way to control who or what changes it.
If now, or at some point in the future, any of your code ever depends on a field having a certain value, you now have to add validity checks and fallback logic in case it's not the expected value, in every place you use it. That's a huge amount of wasted effort when you could've just made it a damn property instead ;)
The best way to share information with deriving classes is the read-only property:
protected object MyProperty { get; }
If you absolutely have to make it read/write, don't. If you really, really have to make it read-write, rethink your design. If you still need it to be read-write, apologize to your colleagues and don't do it again :)
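A small sketch of the read-only approach, with hypothetical Account and SavingsAccount classes: the derived class can read the value through a protected get-only property, while every write goes through a single validated method on the base class.

```csharp
using System;

public class Account
{
    private decimal _balance;

    // Derived classes can read the balance but cannot assign it.
    protected decimal Balance => _balance;

    // All writes go through one validated path.
    public void Deposit(decimal amount)
    {
        if (amount <= 0)
            throw new ArgumentOutOfRangeException(nameof(amount), "Deposit must be positive.");
        _balance += amount;
    }
}

public sealed class SavingsAccount : Account
{
    public decimal ProjectedInterest(decimal rate) => Balance * rate;
    // Balance = 0m; // would not compile: the property has no setter
}
```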
A lot of developers believe - and will tell you - that this is overly strict. And it's true that you can get by just fine without being this strict. But taking this approach will help you go from just getting by to remarkably robust software. You'll spend far less time fixing bugs.
And regarding any concerns about performance - don't. I guarantee you will never, in your entire career, write code so fast that the bottleneck is the call stack itself.
OK, downvote time.
First of all, properties will never hurt performance (provided they don't do much). That's what everyone else says, and I agree.
Another point is that properties are good in that you can place breakpoints in them to capture getting/setting events and find out where they come from.
The rest of the arguments bother me in this way:
They sound like "argument by prestige". If MSDN says it, or some famous developer or author whom everybody likes says it, it must be so.
They are based on the idea that data structures have lots of inconsistent states, and must be protected against wandering or being placed into those states. Since (it seems to me) data structures are way over-emphasized in current teaching, then typically they do need those protections. Far more preferable is to minimize data structure so that it tends to be normalized and not to have inconsistent states. Then, if a member of a class is changed, it is simply changed, rather than damaged. After all, somehow lots of good software was/is written in C, and that didn't suffer massively from lack of protections.
They are based on defensive coding carried to extremes. It is based on the idea that your classes will be used in a world where nobody else's code can be trusted not to goose your stuff. I'm sure there are situations where this is true, but I've never seen them. What I have seen is situations where things were made horribly complicated to get around protections for which there was no need, and to try to guard the consistency of data structures that were horribly over-complicated and un-normalized.
Regarding fields vs. properties, I can think of two reasons for preferring properties in the public interface (protected is also public in the sense that someone other than just your class can see it).
Exposing properties gives you a way to hide the implementation. It also allows you to change the implementation without changing the code that uses it (e.g. if you decide to change the way data are stored in the class)
Many tools that work with classes using reflection only focus on properties (for example, I think that some libraries for serialization work this way). Using properties consistently makes it easier to use these standard .NET tools.
Regarding overheads:
If the getter/setter is the usual one-line piece of code that simply reads/sets the value of a field, then the JIT should be able to inline the call, so there is no performance overhead.
Syntactical overhead is largely reduced when you're using automatically implemented properties (C# 3.0 and newer), so I don't think this is an issue:
protected int SomeProperty { get; set; }
In fact, this allows you to make for example set protected and get public very easily, so this can be even more elegant than using fields.
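For instance, a public getter with a protected setter could look like the sketch below. GameItem comes from the question; the BaseName property shape and the EnchantedItem subclass are assumptions made for this example.

```csharp
using System;

public class GameItem
{
    // Anyone can read the name; only this class and subclasses can change it.
    public string BaseName { get; protected set; }

    public GameItem(string baseName)
    {
        BaseName = baseName;
    }
}

public sealed class EnchantedItem : GameItem
{
    public EnchantedItem(string baseName, string prefix) : base(baseName)
    {
        BaseName = prefix + " " + baseName; // allowed: the setter is protected
    }
}
```

Code outside the hierarchy can read `item.BaseName` but gets a compile error if it tries to assign it.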
Public and/or protected fields are bad because they can be manipulated from outside the declaring class without validation; thus they can be said to break the encapsulation principle of object oriented programming.
When you lose encapsulation, you lose the contract of the declaring class; you cannot guarantee that the class behaves as intended or expected.
Using a property or a method to access the field enables you to maintain encapsulation, and fulfill the contract of the declaring class.
I agree with the read-only property answer. But to play devil's advocate here, it really depends on what you're doing. I'll be happy to admit I write code with public members all the time (I also don't comment, follow guidelines, or observe any of the formalities).
But when i'm at work that's a different story.
It actually depends on if your class is a data class or a behaviour class.
If you keep your behaviour and data separate, it is fine to expose the data of your data classes, as long as they have no behaviour.
If the class is a behaviour class, then it should not expose any data.

Should I use internal or public visibility by default?

I'm a pretty new C# and .NET developer. I recently created an MMC snapin using C# and was gratified by how easy it was to do, especially after hearing a lot of horror stories by some other developers in my organisation about how hard it is to do in C++.
I pretty much went through the whole project at some point and changed every instance of the "public" keyword to "internal", except where required by the runtime in order to run the snapin. What is your feeling on this: should you generally make classes and methods public or internal?
I believe in blackboxes where possible. As a programmer, I want a well defined blackbox which I can easily drop into my systems, and have it work. I give it values, call the appropriate methods, and then get my results back out of it.
To that end, give me only the functionality that the class needs to expose to work.
Consider an elevator. To get it to go to a floor, I push a button. That's the public interface to the black box which activates all the functions needed to get the elevator to the desired floor.
What you did is exactly what you should do; give your classes the most minimal visibility you can. Heck, if you want to really go whole hog, you can make everything internal (at most) and use the InternalsVisibleTo attribute, so that you can separate your functionality but still not expose it to the unknown outside world.
The only reason to make things public is that you're packaging your project in several DLLs and/or EXEs and (for whatever reason) you don't care to use InternalsVisibleTo, or you're creating a library for use by third parties. But even in a library for use by third parties, you should try to reduce the "surface area" wherever possible; the more classes you have available, the more confusing your library will be.
In C#, one good way to ensure you're using the minimum visibility possible is to leave off the visibility modifiers until you need them. Everything in C# defaults to the least visibility possible: internal for classes, and private for class members and inner classes.
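Those defaults can be checked with reflection. In the sketch below, Widget and VisibilityDemo are hypothetical names; the reflection members used (IsNotPublic, IsPrivate, IsNestedPrivate) are standard System.Reflection properties.

```csharp
using System;
using System.Reflection;

class Widget                 // no modifier on a top-level type: internal
{
    int _count;              // no modifier on a member: private

    void Reset() { _count = 0; }

    class Part { }           // no modifier on a nested type: private
}

public static class VisibilityDemo
{
    public static bool TypeIsNotPublic() => typeof(Widget).IsNotPublic;

    public static bool MethodIsPrivate() =>
        typeof(Widget).GetMethod("Reset", BindingFlags.Instance | BindingFlags.NonPublic).IsPrivate;

    public static bool NestedTypeIsPrivate() =>
        typeof(Widget).GetNestedType("Part", BindingFlags.NonPublic).IsNestedPrivate;
}
```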
I think you should err on the side of internal classes and members. You can always increase an item's visibility but decreasing it can cause problems. This is especially true if you are building a framework for others.
You do need to be careful though not to hide useful functionality from your users. There are many useful methods in the .NET BCL that cannot be used without resorting to reflection. However, by hiding these methods, the surface area of what has to be tested and maintained is reduced.
I prefer to avoid marking classes as public unless I explicitly want my customer to consume them, and I am prepared to support them.
Instead of marking a class as internal, I leave the accessibility blank. This way, public stands out to the eye as something notable. (The exception, of course, is nested classes, which have to be marked if they are to be visible even in the same assembly.)
Most classes should be internal, but most non-private members should be public.
The question you should ask about a member is "if the class were made public, would I want the member to be exposed?". The answer is usually "yes (so public)" because classes without any accessible members are not much use!
internal members do have a role; they are 'back-door access' meant only for close relatives that live in the same assembly.
Even if your class remains internal, it is nice to see which are front-door members and which are back-door. And if you ever change it to public you are not going to have to go back and think about which are which.
Is there any reason you need to use Internal instead of Private? You do realise that Internal has assembly level scope. In other words Internal classes/members are accessible to all classes in a multi-class assembly.
As some other answers have said, in general go for the highest level of encapsulation as possible (ie private) unless you actually need internal/protected/public.
I found a problem with using internal classes as much as possible. You cannot have methods, properties, fields, etc. of that type (or with that parameter type or return type) that are more visible than internal. This leads to constructors and properties that are internal. This shouldn't be a problem, but in practice, when using Visual Studio and the XAML designer, there are problems. The designer reports false-positive errors because the methods are not public, and user control properties seem not to be visible to the designer. I don't know whether others have already run into such issues...
You should try to make them only as visible as possible, but as stated by Mike above, this causes problems with UserControls and using the VS Designer with those controls on forms or other UserControls.
So as a general rule, keep all classes and UserControls that you aren't adding using the Designer only as visible as they need to be. But if you are creating a UserControl that you want to use in the Designer (even if that's within the same assembly), you will need to make sure that the UserControl class, its default constructor, and any properties and events, are made public for the designer to work with it.
I had a problem recently where the designer would keep removing the this.myControl = new MyControl() line from the InitializeComponent() method because the UserControl MyControl was marked as internal along with its constructor.
I think it's really a bug, because even if they are marked as internal they still show up in the Toolbox to add in the Designer. Either Microsoft needs to show only public controls with public constructors, or they need to make it work with internal controls as well.
You should tend toward exposing as little as possible to other classes, and think carefully about what you do expose and why.
It depends on how much control you have over code that consumes it. In my Java development, I make all my stuff public final by default because getters are annoying. However, I also have the luxury of being able to change anything in my codebase whenever I want. In the past, when I've had to release code to consumers, I've always used private variables and getters.
I like to expose things as little as possible. Private, protected, internal, public: give classes, variables, properties, and functions the least amount of visibility they need for everything to still work.
I'll bump something's visibility up that chain toward public only when there's a good reason to.
I completely disagree with the answers so far. I feel that internal is a horrid idea, preventing another assembly from inheriting your types, or even using your internal types should the need for a workaround come about.
Today, I had to use reflection to get to the internals of a System.Data.DataTable (I have to build a datatable lightning fast, without all of its checks), since not a single type was available to me; they were all marked as internal.
By default, a class is created as internal in C#.
internal means: access is limited to the current assembly.
See:
http://msdn.microsoft.com/en-us/library/0b0thckt.aspx
A good article on the default scope being internal:
http://www.c-sharpcorner.com/UploadFile/84c85b/default-scope-of-a-C-Sharp-class/
Do not choose a "default". Pick what best fits the visibility needs of that particular class. When you add a new class in Visual Studio, the template creates it as:
class Class1
{
}
Which is internal (since no access modifier is specified on a top-level class). It is up to you to specify the scope for the class (or leave it as internal). There should be a reason to expose the class.
