Is Reflection breaking the encapsulation principle? - c#

Okay, let's say we have a class defined like
public class TestClass
{
private string MyPrivateProperty { get; set; }
// This is for testing purposes
public string GetMyProperty()
{
return MyPrivateProperty;
}
}
then we try:
TestClass t = new TestClass { MyPrivateProperty = "test" };
Compilation fails with TestClass.MyPrivateProperty is inaccessible due to its protection level, as expected.
Try
TestClass t = new TestClass();
t.MyPrivateProperty = "test";
and compilation fails again, with the same message.
All good until now, we were expecting this.
But then one write:
PropertyInfo aProp = t.GetType().GetProperty(
"MyPrivateProperty",
BindingFlags.NonPublic | BindingFlags.Instance);
// This works:
aProp.SetValue(t, "test", null);
// Check
Console.WriteLine(t.GetMyProperty());
and here we are, we managed to change a private field.
Isn't it abnormal to be able to alter some object's internal state just by using reflection?
Edit:
Thanks for the replies so far. For those saying "you don't have to use it": what about a class designer, it looks like he can't assume internal state safety anymore?

Reflection breaks encapsulation principles by giving access to private fields and methods, but it's not the first or only way in which encapsulation can be circumvented; one could argue that serialization exposes all the internal data of a class, information which would normally be private.
It's important to understand that encapsulation is only a technique, one that makes designing behaviour easier, provided consumers agree use an API you have defined. If somebody chooses to circumvent your API using reflection or any other technique, they no longer have the assurance that your object will behave as you designed it. If somebody assigns a value of null to a private field, they'd better be ready to catch a NullReferenceException the next time they try to use your class!
In my experience, programming is all about assertions and assumptions. The language asserts constraints (classes, interfaces, enumerations) which make creating isolated behaviour much easier to produce, on the assumption that a consumer agrees to not violate those boundaries.
This is a fair assertion to make given it makes a divide-and-conquer approach to software development more easy than any technique before it.

Reflection is a tool. You may use it to break encapsulation, when it gives you more than it takes away.
Reflection has a certain "pain" (or cost -- in performance, in readability, in reliability of code) associated with it, so you won't use it for a common problem. It's just easier to follow object-oriented principles for common problems, which is one of the goals of the language, commonly referred to as the pit of success.
On the other hand, there are some tasks that wouldn't be solvable without that mechanism, e.g. working with run-time generate types (though, it is going to be much-much easier starting from .NET 4.0 with it's DLR and the "dynamic" variables in C# 4.0).

You are right that reflection can be opposed to any number of good design principles, but it can also be an essential building block that you can use to support good design principles - e.g. software that can be extended by plugins, inversion of control, etc.
If you're worried that it represents a capability that should be discouraged, you may have a point. But it's not as convenient to use as true language features, so it is easier to do things the right way.
If you think reflection ought to be impossible, you're dreaming!
In C++ there is no reflection as such. But there is an underlying "object model" that the compiler-generated machine code uses to access the structure of objects (and virtual functions). So a C++ programmer can break encapsulation in the same way.
class RealClass
{
private:
int m_secret;
};
class FakeClass
{
public:
int m_notSecret;
};
We can take a pointer to an object of type RealClass and simply cast it to FakeClass and access the "private" member.
Any restricted system has to be implemented on top of a more flexible system, so it is always possible to circumvent it. If reflection wasn't provided as a BCL feature, someone could add it with a library using unsafe code.
In some languages there are ways to encapsulate data so that it is not possible within the language to get at the data except in certain prescribed ways. But it will always be possible to cheat if you can find a way to escape out of the language. An extreme example would be scoped variables in JavaScript:
(function() {
var x = 5;
myGetter = function() { return x; };
mySetter = function(v) { x = v; };
})();
After that executes, the global namespace contains two functions, myGetter and mySetter, which are the only way to access the value of x. Javascript has no reflective ability to get at x any other way. But it has to run in some kind of host interpreter (e.g. in the browser), and so there is certainly some horrible way to manipulate x. A memory corruption bug in a plugin could do it by accident!

I suppose you could say that. You could also say that the CodeDom, Reflection Emit, and Expression Tree APIs break encapsulation by allowing you to write code on the fly. They are merely facilities provided by .NET. You don't have to use them.
Don't forget that you have to (a) be running in full trust, and (b) explicitly specify BindingFlags.NonPublic, in order to access private data. That should be enough of a safety net.

One can say that reflection breaks inheritance and polymorphism as well. When instead of supplying each object type with its own overloaded method version you have a single method that checks object type at run time and switches to a particular behavior for each.
Reflection is a nice tool nevertheless. Sometimes you need to instantiate an object with a private constructor because it's an optimal course of action and you cannot change the class implementation (closed library, can't get in touch with the author any longer). Or you wish to perform some auxiliary operation on a variety of objects in one place. Or maybe you wish to perform logging for certain types. These are the situation when reflection does help without causing any harm.

That's part of the functionality provided by Reflection, but not, I would say, the best use of it.
It only works under Full Trust.

The encapsulation principle is hold by your API, but reflection gives you a way to work around the API.
Whoever uses reflection knows that he is not using your API correctly!

No, it is not abnormal.
It is something reflection allows you to do, in some situation what you did might be of some use, in many other that simply will make your code a time bomb.
Reflection is a tool, it can be used or abused.
Regarding the question after your edit: the answer is simply no.
An API designer could do his best, putting a lot of effort into exposing a clean interface and see his API overly misused by using reflection.
An engineer can model a perfect washing machine that is secure, doesn't shock you with electricity and so on, but if you smash it with an hammer until you see the cables no one is going to blame the engineer if you get a shock :D

Reflection can help you to keep your code clean. E.g. when you use hibernate you can directly bind the private variables to DB values and you don't have to write unnecessary setter methods that you would not need otherwise. Seen from this point of view, reflection can help you to keep encapsulation.

Related

C# Encapsulation (OOP) [duplicate]

What's the advantage of using getters and setters - that only get and set - instead of simply using public fields for those variables?
If getters and setters are ever doing more than just the simple get/set, I can figure this one out very quickly, but I'm not 100% clear on how:
public String foo;
is any worse than:
private String foo;
public void setFoo(String foo) { this.foo = foo; }
public String getFoo() { return foo; }
Whereas the former takes a lot less boilerplate code.
There are actually many good reasons to consider using accessors rather than directly exposing fields of a class - beyond just the argument of encapsulation and making future changes easier.
Here are the some of the reasons I am aware of:
Encapsulation of behavior associated with getting or setting the property - this allows additional functionality (like validation) to be added more easily later.
Hiding the internal representation of the property while exposing a property using an alternative representation.
Insulating your public interface from change - allowing the public interface to remain constant while the implementation changes without affecting existing consumers.
Controlling the lifetime and memory management (disposal) semantics of the property - particularly important in non-managed memory environments (like C++ or Objective-C).
Providing a debugging interception point for when a property changes at runtime - debugging when and where a property changed to a particular value can be quite difficult without this in some languages.
Improved interoperability with libraries that are designed to operate against property getter/setters - Mocking, Serialization, and WPF come to mind.
Allowing inheritors to change the semantics of how the property behaves and is exposed by overriding the getter/setter methods.
Allowing the getter/setter to be passed around as lambda expressions rather than values.
Getters and setters can allow different access levels - for example the get may be public, but the set could be protected.
Because 2 weeks (months, years) from now when you realize that your setter needs to do more than just set the value, you'll also realize that the property has been used directly in 238 other classes :-)
A public field is not worse than a getter/setter pair that does nothing except returning the field and assigning to it. First, it's clear that (in most languages) there is no functional difference. Any difference must be in other factors, like maintainability or readability.
An oft-mentioned advantage of getter/setter pairs, isn't. There's this claim that you can change the implementation and your clients don't have to be recompiled. Supposedly, setters let you add functionality like validation later on and your clients don't even need to know about it. However, adding validation to a setter is a change to its preconditions, a violation of the previous contract, which was, quite simply, "you can put anything in here, and you can get that same thing later from the getter".
So, now that you broke the contract, changing every file in the codebase is something you should want to do, not avoid. If you avoid it you're making the assumption that all the code assumed the contract for those methods was different.
If that should not have been the contract, then the interface was allowing clients to put the object in invalid states. That's the exact opposite of encapsulation If that field could not really be set to anything from the start, why wasn't the validation there from the start?
This same argument applies to other supposed advantages of these pass-through getter/setter pairs: if you later decide to change the value being set, you're breaking the contract. If you override the default functionality in a derived class, in a way beyond a few harmless modifications (like logging or other non-observable behaviour), you're breaking the contract of the base class. That is a violation of the Liskov Substitutability Principle, which is seen as one of the tenets of OO.
If a class has these dumb getters and setters for every field, then it is a class that has no invariants whatsoever, no contract. Is that really object-oriented design? If all the class has is those getters and setters, it's just a dumb data holder, and dumb data holders should look like dumb data holders:
class Foo {
public:
int DaysLeft;
int ContestantNumber;
};
Adding pass-through getter/setter pairs to such a class adds no value. Other classes should provide meaningful operations, not just operations that fields already provide. That's how you can define and maintain useful invariants.
Client: "What can I do with an object of this class?"
Designer: "You can read and write several variables."
Client: "Oh... cool, I guess?"
There are reasons to use getters and setters, but if those reasons don't exist, making getter/setter pairs in the name of false encapsulation gods is not a good thing. Valid reasons to make getters or setters include the things often mentioned as the potential changes you can make later, like validation or different internal representations. Or maybe the value should be readable by clients but not writable (for example, reading the size of a dictionary), so a simple getter is a nice choice. But those reasons should be there when you make the choice, and not just as a potential thing you may want later. This is an instance of YAGNI (You Ain't Gonna Need It).
Lots of people talk about the advantages of getters and setters but I want to play devil's advocate. Right now I'm debugging a very large program where the programmers decided to make everything getters and setters. That might seem nice, but its a reverse-engineering nightmare.
Say you're looking through hundreds of lines of code and you come across this:
person.name = "Joe";
It's a beautifully simply piece of code until you realize its a setter. Now, you follow that setter and find that it also sets person.firstName, person.lastName, person.isHuman, person.hasReallyCommonFirstName, and calls person.update(), which sends a query out to the database, etc. Oh, that's where your memory leak was occurring.
Understanding a local piece of code at first glance is an important property of good readability that getters and setters tend to break. That is why I try to avoid them when I can, and minimize what they do when I use them.
In a pure object-oriented world getters and setters is a terrible anti-pattern. Read this article: Getters/Setters. Evil. Period. In a nutshell, they encourage programmers to think about objects as of data structures, and this type of thinking is pure procedural (like in COBOL or C). In an object-oriented language there are no data structures, but only objects that expose behavior (not attributes/properties!)
You may find more about them in Section 3.5 of Elegant Objects (my book about object-oriented programming).
There are many reasons. My favorite one is when you need to change the behavior or regulate what you can set on a variable. For instance, lets say you had a setSpeed(int speed) method. But you want that you can only set a maximum speed of 100. You would do something like:
public void setSpeed(int speed) {
if ( speed > 100 ) {
this.speed = 100;
} else {
this.speed = speed;
}
}
Now what if EVERYWHERE in your code you were using the public field and then you realized you need the above requirement? Have fun hunting down every usage of the public field instead of just modifying your setter.
My 2 cents :)
One advantage of accessors and mutators is that you can perform validation.
For example, if foo was public, I could easily set it to null and then someone else could try to call a method on the object. But it's not there anymore! With a setFoo method, I could ensure that foo was never set to null.
Accessors and mutators also allow for encapsulation - if you aren't supposed to see the value once its set (perhaps it's set in the constructor and then used by methods, but never supposed to be changed), it will never been seen by anyone. But if you can allow other classes to see or change it, you can provide the proper accessor and/or mutator.
Thanks, that really clarified my thinking. Now here is (almost) 10 (almost) good reasons NOT to use getters and setters:
When you realize you need to do more than just set and get the value, you can just make the field private, which will instantly tell you where you've directly accessed it.
Any validation you perform in there can only be context free, which validation rarely is in practice.
You can change the value being set - this is an absolute nightmare when the caller passes you a value that they [shock horror] want you to store AS IS.
You can hide the internal representation - fantastic, so you're making sure that all these operations are symmetrical right?
You've insulated your public interface from changes under the sheets - if you were designing an interface and weren't sure whether direct access to something was OK, then you should have kept designing.
Some libraries expect this, but not many - reflection, serialization, mock objects all work just fine with public fields.
Inheriting this class, you can override default functionality - in other words you can REALLY confuse callers by not only hiding the implementation but making it inconsistent.
The last three I'm just leaving (N/A or D/C)...
Depends on your language. You've tagged this "object-oriented" rather than "Java", so I'd like to point out that ChssPly76's answer is language-dependent. In Python, for instance, there is no reason to use getters and setters. If you need to change the behavior, you can use a property, which wraps a getter and setter around basic attribute access. Something like this:
class Simple(object):
def _get_value(self):
return self._value -1
def _set_value(self, new_value):
self._value = new_value + 1
def _del_value(self):
self.old_values.append(self._value)
del self._value
value = property(_get_value, _set_value, _del_value)
Well i just want to add that even if sometimes they are necessary for the encapsulation and security of your variables/objects, if we want to code a real Object Oriented Program, then we need to STOP OVERUSING THE ACCESSORS, cause sometimes we depend a lot on them when is not really necessary and that makes almost the same as if we put the variables public.
EDIT: I answered this question because there are a bunch of people learning programming asking this, and most of the answers are very technically competent, but they're not as easy to understand if you're a newbie. We were all newbies, so I thought I'd try my hand at a more newbie friendly answer.
The two main ones are polymorphism, and validation. Even if it's just a stupid data structure.
Let's say we have this simple class:
public class Bottle {
public int amountOfWaterMl;
public int capacityMl;
}
A very simple class that holds how much liquid is in it, and what its capacity is (in milliliters).
What happens when I do:
Bottle bot = new Bottle();
bot.amountOfWaterMl = 1500;
bot.capacityMl = 1000;
Well, you wouldn't expect that to work, right?
You want there to be some kind of sanity check. And worse, what if I never specified the maximum capacity? Oh dear, we have a problem.
But there's another problem too. What if bottles were just one type of container? What if we had several containers, all with capacities and amounts of liquid filled? If we could just make an interface, we could let the rest of our program accept that interface, and bottles, jerrycans and all sorts of stuff would just work interchangably. Wouldn't that be better? Since interfaces demand methods, this is also a good thing.
We'd end up with something like:
public interface LiquidContainer {
public int getAmountMl();
public void setAmountMl(int amountMl);
public int getCapacityMl();
}
Great! And now we just change Bottle to this:
public class Bottle implements LiquidContainer {
private int capacityMl;
private int amountFilledMl;
public Bottle(int capacityMl, int amountFilledMl) {
this.capacityMl = capacityMl;
this.amountFilledMl = amountFilledMl;
checkNotOverFlow();
}
public int getAmountMl() {
return amountFilledMl;
}
public void setAmountMl(int amountMl) {
this.amountFilled = amountMl;
checkNotOverFlow();
}
public int getCapacityMl() {
return capacityMl;
}
private void checkNotOverFlow() {
if(amountOfWaterMl > capacityMl) {
throw new BottleOverflowException();
}
}
I'll leave the definition of the BottleOverflowException as an exercise to the reader.
Now notice how much more robust this is. We can deal with any type of container in our code now by accepting LiquidContainer instead of Bottle. And how these bottles deal with this sort of stuff can all differ. You can have bottles that write their state to disk when it changes, or bottles that save on SQL databases or GNU knows what else.
And all these can have different ways to handle various whoopsies. The Bottle just checks and if it's overflowing it throws a RuntimeException. But that might be the wrong thing to do.
(There is a useful discussion to be had about error handling, but I'm keeping it very simple here on purpose. People in comments will likely point out the flaws of this simplistic approach. ;) )
And yes, it seems like we go from a very simple idea to getting much better answers quickly.
Please note also that you can't change the capacity of a bottle. It's now set in stone. You could do this with an int by declaring it final. But if this was a list, you could empty it, add new things to it, and so on. You can't limit the access to touching the innards.
There's also the third thing that not everyone has addressed: getters and setters use method calls. That means that they look like normal methods everywhere else does. Instead of having weird specific syntax for DTOs and stuff, you have the same thing everywhere.
I know it's a bit late, but I think there are some people who are interested in performance.
I've done a little performance test. I wrote a class "NumberHolder" which, well, holds an Integer. You can either read that Integer by using the getter method
anInstance.getNumber() or by directly accessing the number by using anInstance.number. My programm reads the number 1,000,000,000 times, via both ways. That process is repeated five times and the time is printed. I've got the following result:
Time 1: 953ms, Time 2: 741ms
Time 1: 655ms, Time 2: 743ms
Time 1: 656ms, Time 2: 634ms
Time 1: 637ms, Time 2: 629ms
Time 1: 633ms, Time 2: 625ms
(Time 1 is the direct way, Time 2 is the getter)
You see, the getter is (almost) always a bit faster. Then I tried with different numbers of cycles. Instead of 1 million, I used 10 million and 0.1 million.
The results:
10 million cycles:
Time 1: 6382ms, Time 2: 6351ms
Time 1: 6363ms, Time 2: 6351ms
Time 1: 6350ms, Time 2: 6363ms
Time 1: 6353ms, Time 2: 6357ms
Time 1: 6348ms, Time 2: 6354ms
With 10 million cycles, the times are almost the same.
Here are 100 thousand (0.1 million) cycles:
Time 1: 77ms, Time 2: 73ms
Time 1: 94ms, Time 2: 65ms
Time 1: 67ms, Time 2: 63ms
Time 1: 65ms, Time 2: 65ms
Time 1: 66ms, Time 2: 63ms
Also with different amounts of cycles, the getter is a little bit faster than the regular way. I hope this helped you.
Don't use getters setters unless needed for your current delivery I.e. Don't think too much about what would happen in the future, if any thing to be changed its a change request in most of the production applications, systems.
Think simple, easy, add complexity when needed.
I would not take advantage of ignorance of business owners of deep technical know how just because I think it's correct or I like the approach.
I have massive system written without getters setters only with access modifiers and some methods to validate n perform biz logic. If you absolutely needed the. Use anything.
We use getters and setters:
for reusability
to perform validation in later stages of programming
Getter and setter methods are public interfaces to access private class members.
Encapsulation mantra
The encapsulation mantra is to make fields private and methods public.
Getter Methods: We can get access to private variables.
Setter Methods: We can modify private fields.
Even though the getter and setter methods do not add new functionality, we can change our mind come back later to make that method
better;
safer; and
faster.
Anywhere a value can be used, a method that returns that value can be added. Instead of:
int x = 1000 - 500
use
int x = 1000 - class_name.getValue();
In layman's terms
Suppose we need to store the details of this Person. This Person has the fields name, age and sex. Doing this involves creating methods for name, age and sex. Now if we need create another person, it becomes necessary to create the methods for name, age, sex all over again.
Instead of doing this, we can create a bean class(Person) with getter and setter methods. So tomorrow we can just create objects of this Bean class(Person class) whenever we need to add a new person (see the figure). Thus we are reusing the fields and methods of bean class, which is much better.
I spent quite a while thinking this over for the Java case, and I believe the real reasons are:
Code to the interface, not the implementation
Interfaces only specify methods, not fields
In other words, the only way you can specify a field in an interface is by providing a method for writing a new value and a method for reading the current value.
Those methods are the infamous getter and setter....
It can be useful for lazy-loading. Say the object in question is stored in a database, and you don't want to go get it unless you need it. If the object is retrieved by a getter, then the internal object can be null until somebody asks for it, then you can go get it on the first call to the getter.
I had a base page class in a project that was handed to me that was loading some data from a couple different web service calls, but the data in those web service calls wasn't always used in all child pages. Web services, for all of the benefits, pioneer new definitions of "slow", so you don't want to make a web service call if you don't have to.
I moved from public fields to getters, and now the getters check the cache, and if it's not there call the web service. So with a little wrapping, a lot of web service calls were prevented.
So the getter saves me from trying to figure out, on each child page, what I will need. If I need it, I call the getter, and it goes to find it for me if I don't already have it.
protected YourType _yourName = null;
public YourType YourName{
get
{
if (_yourName == null)
{
_yourName = new YourType();
return _yourName;
}
}
}
One aspect I missed in the answers so far, the access specification:
for members you have only one access specification for both setting and getting
for setters and getters you can fine tune it and define it separately
In languages which don't support "properties" (C++, Java) or require recompilation of clients when changing fields to properties (C#), using get/set methods is easier to modify. For example, adding validation logic to a setFoo method will not require changing the public interface of a class.
In languages which support "real" properties (Python, Ruby, maybe Smalltalk?) there is no point to get/set methods.
One of the basic principals of OO design: Encapsulation!
It gives you many benefits, one of which being that you can change the implementation of the getter/setter behind the scenes but any consumer of that value will continue to work as long as the data type remains the same.
You should use getters and setters when:
You're dealing with something that is conceptually an attribute, but:
Your language doesn't have properties (or some similar mechanism, like Tcl's variable traces), or
Your language's property support isn't sufficient for this use case, or
Your language's (or sometimes your framework's) idiomatic conventions encourage getters or setters for this use case.
So this is very rarely a general OO question; it's a language-specific question, with different answers for different languages (and different use cases).
From an OO theory point of view, getters and setters are useless. The interface of your class is what it does, not what its state is. (If not, you've written the wrong class.) In very simple cases, where what a class does is just, e.g., represent a point in rectangular coordinates,* the attributes are part of the interface; getters and setters just cloud that. But in anything but very simple cases, neither the attributes nor getters and setters are part of the interface.
Put another way: If you believe that consumers of your class shouldn't even know that you have a spam attribute, much less be able to change it willy-nilly, then giving them a set_spam method is the last thing you want to do.
* Even for that simple class, you may not necessarily want to allow setting the x and y values. If this is really a class, shouldn't it have methods like translate, rotate, etc.? If it's only a class because your language doesn't have records/structs/named tuples, then this isn't really a question of OO…
But nobody is ever doing general OO design. They're doing design, and implementation, in a specific language. And in some languages, getters and setters are far from useless.
If your language doesn't have properties, then the only way to represent something that's conceptually an attribute, but is actually computed, or validated, etc., is through getters and setters.
Even if your language does have properties, there may be cases where they're insufficient or inappropriate. For example, if you want to allow subclasses to control the semantics of an attribute, in languages without dynamic access, a subclass can't substitute a computed property for an attribute.
As for the "what if I want to change my implementation later?" question (which is repeated multiple times in different wording in both the OP's question and the accepted answer): If it really is a pure implementation change, and you started with an attribute, you can change it to a property without affecting the interface. Unless, of course, your language doesn't support that. So this is really just the same case again.
Also, it's important to follow the idioms of the language (or framework) you're using. If you write beautiful Ruby-style code in C#, any experienced C# developer other than you is going to have trouble reading it, and that's bad. Some languages have stronger cultures around their conventions than others.—and it may not be a coincidence that Java and Python, which are on opposite ends of the spectrum for how idiomatic getters are, happen to have two of the strongest cultures.
Beyond human readers, there will be libraries and tools that expect you to follow the conventions, and make your life harder if you don't. Hooking Interface Builder widgets to anything but ObjC properties, or using certain Java mocking libraries without getters, is just making your life more difficult. If the tools are important to you, don't fight them.
From a object orientation design standpoint both alternatives can be damaging to the maintenance of the code by weakening the encapsulation of the classes. For a discussion you can look into this excellent article: http://typicalprogrammer.com/?p=23
Code evolves. private is great for when you need data member protection. Eventually all classes should be sort of "miniprograms" that have a well-defined interface that you can't just screw with the internals of.
That said, software development isn't about setting down that final version of the class as if you're pressing some cast iron statue on the first try. While you're working with it, code is more like clay. It evolves as you develop it and learn more about the problem domain you are solving. During development classes may interact with each other than they should (dependency you plan to factor out), merge together, or split apart. So I think the debate boils down to people not wanting to religiously write
int getVar() const { return var ; }
So you have:
doSomething( obj->getVar() ) ;
Instead of
doSomething( obj->var ) ;
Not only is getVar() visually noisy, it gives this illusion that gettingVar() is somehow a more complex process than it really is. How you (as the class writer) regard the sanctity of var is particularly confusing to a user of your class if it has a passthru setter -- then it looks like you're putting up these gates to "protect" something you insist is valuable, (the sanctity of var) but yet even you concede var's protection isn't worth much by the ability for anyone to just come in and set var to whatever value they want, without you even peeking at what they are doing.
So I program as follows (assuming an "agile" type approach -- ie when I write code not knowing exactly what it will be doing/don't have time or experience to plan an elaborate waterfall style interface set):
1) Start with all public members for basic objects with data and behavior. This is why in all my C++ "example" code you'll notice me using struct instead of class everywhere.
2) When an object's internal behavior for a data member becomes complex enough, (for example, it likes to keep an internal std::list in some kind of order), accessor type functions are written. Because I'm programming by myself, I don't always set the member private right away, but somewhere down the evolution of the class the member will be "promoted" to either protected or private.
3) Classes that are fully fleshed out and have strict rules about their internals (ie they know exactly what they are doing, and you are not to "fuck" (technical term) with its internals) are given the class designation, default private members, and only a select few members are allowed to be public.
I find this approach allows me to avoid sitting there and religiously writing getter/setters when a lot of data members get migrated out, shifted around, etc. during the early stages of a class's evolution.
There is a good reason to consider using accessors is there is no property inheritance. See next example:
public class TestPropertyOverride {
public static class A {
public int i = 0;
public void add() {
i++;
}
public int getI() {
return i;
}
}
public static class B extends A {
public int i = 2;
#Override
public void add() {
i = i + 2;
}
#Override
public int getI() {
return i;
}
}
public static void main(String[] args) {
A a = new B();
System.out.println(a.i);
a.add();
System.out.println(a.i);
System.out.println(a.getI());
}
}
Output:
0
0
4
Getters and setters are used to implement two of the fundamental aspects of Object Oriented Programming which are:
Abstraction
Encapsulation
Suppose we have an Employee class:
package com.highmark.productConfig.types;
public class Employee {
private String firstName;
private String middleName;
private String lastName;
public String getFirstName() {
return firstName;
}
public void setFirstName(String firstName) {
this.firstName = firstName;
}
public String getMiddleName() {
return middleName;
}
public void setMiddleName(String middleName) {
this.middleName = middleName;
}
public String getLastName() {
return lastName;
}
public void setLastName(String lastName) {
this.lastName = lastName;
}
public String getFullName(){
return this.getFirstName() + this.getMiddleName() + this.getLastName();
}
}
Here the implementation details of Full Name is hidden from the user and is not accessible directly to the user, unlike a public attribute.
There is a difference between DataStructure and Object.
Datastructure should expose its innards and not behavior.
An Object should not expose its innards but it should expose its behavior, which is also known as the Law of Demeter
Mostly DTOs are considered more of a datastructure and not Object. They should only expose their data and not behavior. Having Setter/Getter in DataStructure will expose behavior instead of data inside it. This further increases the chance of violation of Law of Demeter.
Uncle Bob in his book Clean code explained the Law of Demeter.
There is a well-known heuristic called the Law of Demeter that says a
module should not know about the innards of the objects it
manipulates. As we saw in the last section, objects hide their data
and expose operations. This means that an object should not expose its
internal structure through accessors because to do so is to expose,
rather than to hide, its internal structure.
More precisely, the Law of Demeter says that a method f of a class C
should only call the methods of these:
C
An object created by f
An object passed as an argument to f
An object held in an instance variable of C
The method should not invoke methods on objects that are returned by any of the allowed functions.
In other words, talk to friends, not to strangers.
So according this, example of LoD violation is:
final String outputDir = ctxt.getOptions().getScratchDir().getAbsolutePath();
Here, the function should call the method of its immediate friend which is ctxt here, It should not call the method of its immediate friend's friend. but this rule doesn't apply to data structure. so here if ctxt, option, scratchDir are datastructure then why to wrap their internal data with some behavior and doing a violation of LoD.
Instead, we can do something like this.
final String outputDir = ctxt.options.scratchDir.absolutePath;
This fulfills our needs and doesn't even violate LoD.
Inspired by Clean Code by Robert C. Martin(Uncle Bob)
If you don't require any validations and not even need to maintain state i.e. one property depends on another so we need to maintain the state when one is change. You can keep it simple by making field public and not using getter and setters.
I think OOPs complicates things as the program grows it becomes nightmare for developer to scale.
A simple example; we generate c++ headers from xml. The header contains simple field which does not require any validations. But still as in OOPS accessor are fashion we generates them as following.
const Filed& getfield() const
Field& getField()
void setfield(const Field& field){...}
which is very verbose and is not required. a simple
struct
{
Field field;
};
is enough and readable.
Functional programming don't have the concept of data hiding they even don't require it as they do not mutate the data.
Additionally, this is to "future-proof" your class. In particular, changing from a field to a property is an ABI break, so if you do later decide that you need more logic than just "set/get the field", then you need to break ABI, which of course creates problems for anything else already compiled against your class.
One other use (in languages that support properties) is that setters and getters can imply that an operation is non-trivial. Typically, you want to avoid doing anything that's computationally expensive in a property.
One relatively modern advantage of getters/setters is that is makes it easier to browse code in tagged (indexed) code editors. E.g. If you want to see who sets a member, you can open the call hierarchy of the setter.
On the other hand, if the member is public, the tools don't make it possible to filter read/write access to the member. So you have to trudge though all uses of the member.
Getters and setters coming from data hiding. Data Hiding means We
are hiding data from outsiders or outside person/thing cannot access
our data.This is a useful feature in OOP.
As a example:
If you create a public variable, you can access that variable and change value in anywhere(any class). But if you create as private that variable cannot see/access in any class except declared class.
public and private are access modifiers.
So how can we access that variable outside:
This is the place getters and setters coming from. You can declare variable as private then you can implement getter and setter for that variable.
Example(Java):
private String name;
public String getName(){
return this.name;
}
public void setName(String name){
this.name= name;
}
Advantage:
When anyone want to access or change/set value to balance variable, he/she must have permision.
//assume we have person1 object
//to give permission to check balance
person1.getName()
//to give permission to set balance
person1.setName()
You can set value in constructor also but when later on when you want
to update/change value, you have to implement setter method.

When should we use public and when private?

Should the Access Modifier be Public or Private when we are implementing method that is for now not using by other classes in our team solution?
I believe that "A public member says this member represents the key, documented functionality provided by this object.". We make private only "implementation details" methods and all methods that can be useful in future we should make public even for now there is no consumers of our methods in others classes. But my opponent says that such methods should be private. How do you think?
Added:
Let's be more specific. For example there is a class SqlHelper.
In it there is useful functionality for operation with the SQL Server.
In particular there is used connection to the SQL server. But not only in that class.
And for example I need to implement the public static HandleSqlExeption method(now only for class SqlHelper) which will process SqlExeptions. But I want that in all classes where there is operations with SQL connection in exception handling will be used this method (instead of it is simple, for example:
catch (Exception) { MsgBox {"SqlError"};
as somewhere happens now. So i consider that public access modifier will say to other colleagues that they can use this method. And private will hide that method. And i will need ti change code and rebuid assembly if some one will ask to use tsuch method. Why? There is only negatives.
In general, you should code using the least possible permissive access modifier.
If the method is not used outside the class, make it private.
If the method needs to be used by inheriting types, make it protected.
If the method will only be used within the assembly, make it internal.
Only if the method is to be used outside the assembly should it be public.
This helps with information hiding and lets you change the implementation at will.
I have heard this described as protecting your privates.
In terms of API design, you should have a complete API, exposing all logical functionality as public - this is why you should use interfaces in .NET for API design, as all interface members must be public. In implementing classes, use the above rules of thumb for any members that are not part of the interface - unless they form a logical part of the interface, as far as its consumers are concerned.
So, if you have a Read method being used today, you should have a complementary Write method that has the same accessibility. That's a good design (symmetric and expected), but how you read or write should be hidden behind private methods, if the public ones use them.
Use public and private depending on the nature of the method as name describes.
public when the method can be exposed to other objects and
private when the method should not be accessed by other objects.
If your object needs to be more secure, use private access specifier. You should also know about protected access specifier which is different in that, those methods can be accessed only by the objects that inherit them.
if a class has some thing to expose to outside world then make it public else private. it depends on the functionality and purpose of class not the current or future requirement.
This is a judgment call. Generally speaking, if it's logically a part of the interface the class is presenting to get the functionality it provides, it should be public even if it has no current users. That will reduce the chances that the class will need to be modified every time you want to use it for substantially the same things you're already using it for.
However, this has a cost. Each function you expose is a capability you are committing to provide. If, in the future, you decide to remove it from the public interface, you may break callers.
For example, consider a map class. Say you currently don't have any callers that need to know the number of items in the map, and your current implementation has a size variable you could easily return. You could add a getSize function that returns the size of the map. It's logically part of the function the map class exposes. So one can argue it should be public even if no current callers need to know the size.
However, it's entirely possible a future implementation might wish to get rid of the size variable. Perhaps the overhead of incrementing and decrementing the size is significant and implementation is changed to not need to know the size. If you exposed getSize function and it was called, you would either have to keep tracking the size even though you didn't need it or make the function very expensive, having to actually count the size.
Say lots of code needs to know if the table is empty and you provide an isEmpty function for this purpose. Experience shows that if you also provide a getCount function, people will do getCount() == 0 instead of isEmpty. That may not make any difference today, but it may constrain your future implementation choices to keep getCount as cheap as isEmpty.
Where possible, I would strongly suggest making it public if you can easily imagine a non-broken user of the class that would benefit from having access to that functionality and it forms a logical part of the functionality the class provides. Otherwise, you can easily add it later anyway. So don't stress over it too much. It is largely a style issue mixed with trying to predict the future.
A basic principle of object oriented development; an instance of a class should only expose what needs to be accessed by it's clients. See link and search for "encapsulation".

Is there anything wrong with a class with all static methods?

I'm doing code review and came across a class that uses all static methods. The entrance method takes several arguments and then starts calling the other static methods passing along all or some of the arguments the entrance method received.
It isn't like a Math class with largely unrelated utility functions. In my own normal programming, I rarely write methods where Resharper pops and says "this could be a static method", when I do, they tend to be mindless utility methods.
Is there anything wrong with this pattern? Is this just a matter of personal choice if the state of a class is held in fields and properties or passed around amongst static methods using arguments?
UPDATE: the particular state that is being passed around is the result set from the database. The class's responsibility is to populate an excel spreadsheet template from a result set from the DB. I don't know if this makes any difference.
Is there anything wrong with this
pattern? Is this just a matter of
personal choice if the state of a
class is held in fields and properties
or passed around amongst static
methods using arguments?
Speaking from my own personal experience, I've worked on 100 KLOC applications which have very very deep object hiearchies, everything inherits and overrides everything else, everything implements half a dozen interfaces, even the interfaces inherit half a dozen interfaces, the system implements every design pattern in the book, etc.
End result: a truly OOP-tastic architecture with so many levels of indirection that it takes hours to debug anything. I recently started a job with a system like this, where the learning curve was described to me as "a brick wall, followed by a mountain".
Sometimes overzealous OOP results in classes so granular that it actually a net harm.
By contrast, many functional programming languages, even the OO ones like F# and OCaml (and C#!), encourage flat and shallow hiearchy. Libraries in these languages tend to have the following properties:
Most objects are POCOs, or have at most one or two levels of inheritance, where the objects aren't much more than containers for logically related data.
Instead of classes calling into each other, you have modules (equivalent to static classes) controlling the interactions between objects.
Modules tend to act on a very limited number of data types, and so have a narrow scope. For example, the OCaml List module represents operations on lists, a Customer modules facilitates operations on customers. While modules have more or less the same functionality as instance methods on a class, the key difference with module-based libraries is that modules are much more self-contained, much less granular, and tend to have few if any dependencies on other modules.
There's usually no need to subclass objects override methods since you can pass around functions as first-class objects for specialization.
Although C# doesn't support this functionality, functors provide a means to subclass an specialize modules.
Most big libraries tend to be more wide than deep, for example the Win32 API, PHP libraries, Erlang BIFs, OCaml and Haskell libraries, stored procedures in a database, etc. So this style of programming is battle testing and seems to work well in the real world.
In my opinion, the best designed module-based APIs tend to be easier to work with than the best designed OOP APIs. However, coding style is just as important in API design, so if everyone else on your team is using OOP and someone goes off and implements something in a completely different style, then you should probably ask for a rewrite to more closely match your teams coding standards.
What you describe is simply structured programming, as could be done in C, Pascal or Algol. There is nothing intrinsically wrong with that. There are situations were OOP is more appropriate, but OOP is not the ultimate answer and if the problem at hand is best served by structured programming then a class full of static methods is the way to go.
Does it help to rephrase the question:
Can you describe the data that the static methods operates on as an entity having:
a clear meaning
responsibility for keeping it's internal state consistent.
In that case it should be an instantiated object, otherwise it may just be a bunch of related functions, much like a math library.
Here's a refactor workflow that I frequently encounter that involves static methods. It may lend some insight into your problem.
I'll start with a class that has reasonably good encapsulation. As I start to add features I run into a piece of functionality that doesn't really need access to the private fields in my class but seems to contain related functionality. After this happens a few times (sometimes just once) I start to see the outlines of a new class in the static methods I've implemented and how that new class relates to the old class in which I first implemented the static methods.
The benefit that I see of turning these static methods into one or more classes is, when you do this, it frequently becomes easier to understand and maintain your software.
I feel that if the class is required to maintain some form of state (e.g. properties) then it should be instantiated (i.e. a "normal" class.)
If there should only be one instance of this class (hence all the static methods) then there should be a singleton property/method or a factory method that creates an instance of the class the first time it's called, and then just provides that instance when anyone else asks for it.
Having said that, this is just my personal opinion and the way I'd implement it. I'm sure others would disagree with me. Without knowing anything more it's hard to give reasons for/against each method, to be honest.
The biggest problem IMO is that if you want to unit test classes that are calling the class you mention, there is no way to replace that dependency. So you are forced to test both the client class, and the staticly called class at once.
If we are talking about a class with utility methods like Math.floor() this is not really a problem. But if the class is a real dependency, for instance a data access object, then it ties all its clients in to its implementation.
EDIT: I don't agree with the people saying there is 'nothing wrong' with this type of 'structured programming'. I would say a class like this is at least a code smell when encountered within a normal Java project, and probably indicates misunderstanding of object-oriented design on the part of the creator.
There is nothing wrong with this pattern. C# in fact has a construct called static classes which is used to support this notion by enforcing the requirement that all methods be static. Additionally there are many classes in the framework which have this feature: Enumerable, Math, etc ...
Nothing is wrong with it. It is a more "functional" way to code. It can be easier to test (because no internal state) and better performance at runtime (because no overhead to instance an otherwise useless object).
But you immediately lose some OO capabilities
Static methods don't respond well (at all) to inheritance.
A static class cannot participate in many design patterns such as factory/ service locator.
No, many people tend to create completely static classes for utility functions that they wish to group under a related namespace. There are many valid reasons for having completely static classes.
One thing to consider in C# is that many classes previously written completely static are now eligible to be considered as .net extension classes which are also at their heart still static classes. A lot of the Linq extensions are based on this.
An example:
namespace Utils {
public static class IntUtils {
public static bool IsLessThanZero(this int source)
{
return (source < 0);
}
}
}
Which then allows you to simply do the following:
var intTest = 0;
var blNegative = intTest.IsLessThanZero();
One of the disadvantages of using a static class is that its clients cannot replace it by a test double in order to be unit tested.
In the same way, it's harder to unit test a static class because its collaborators cannot be replaced by test doubles (actually,this happens with all the classes that are not dependency-injected).
It depends on whether the passed arguments can really be classified as state.
Having static methods calling each other is OK in case it's all utility functionality split up in multiple methods to avoid duplication. For example:
public static File loadConfiguration(String name, Enum type) {
String fileName = (form file name based on name and type);
return loadFile(fileName); // static method in the same class
}
Well, personnally, I tend to think that a method modifying the state of an object should be an instance method of that object's class. In fact, i consider it a rule a thumb : a method modifying an object is an instance method of that object's class.
There however are a few exceptions :
methods that process strings (like uppercasing their first letters, or that kind of feature)
method that are stateless and simply assemble some things to produce a new one, without any internal state. They obviously are rare, but it is generally useful to make them static.
In fact, I consider the static keyword as what it is : an option that should be used with care since it breaks some of OOP principles.
Passing all state as method parameters can be a useful design pattern. It ensures that there is no shared mutable state, and so the class is intrinsicly thread-safe. Services are commonly implemented using this pattern.
However, passing all state via method parameters doesn't mean the methods have to be static - you can still use the same pattern with non-static methods. The advantages of making the methods static is that calling code can just use the class by referencing it by name. There's no need for injection, or lookup or any other middleman. The disadvantage is maintanability - static methods are not dynamic dispatch, and cannot be easily subclassed, nor refactored to an interface. I recommend using static methods when there is intrinsicly only one possible implementation of the class, and when there is a strong reason not to use non-static methods.
"state of a class is ...passed around amongst static methods using arguments?"
This is how procedual programming works.
A class with all static methods, and no instance variables (except static final constants) is normally a utility class, eg Math.
There is nothing wrong with making a unility class, (not in an of itself)
BTW: If making a utility class, you chould prevent the class aver being used to crteate an object. in java you would do this by explictily defining the constructor, but making the constructor private.
While as i said there is nothing wrong with creating a utility class,
If the bulk of the work is being done by a utiulity class (wich esc. isn't a class in the usual sense - it's more of a collection of functions)
then this is prob as sign the problem hasn't been solved using the object orientated paradim.
this may or maynot be a good thing
The entrance method takes several arguments and then starts calling the other static methods passing along all or some of the arguments the entrance method received.
from the sound of this, the whole class is just effectivly one method (this would definatly be the case is al lthe other static methods are private (and are just helper functions), and there are no instance variables (baring constants))
This may be and Ok thing,
It's esc. structured/procedual progamming, rather neat having them (the function and it's helper)all bundled in one class. (in C you'ld just put them all in one file, and declare the helper's static (meaning can't be accesses from out side this file))
if there's no need of creating an object of a class, then there's no issue in creating all method as static of that class, but i wanna know what you are doing with a class fullof static methods.
I'm not quite sure what you meant by entrance method but if you're talking about something like this:
MyMethod myMethod = new MyMethod();
myMethod.doSomething(1);
public class MyMethod {
public String doSomething(int a) {
String p1 = MyMethod.functionA(a);
String p2 = MyMethod.functionB(p1);
return p1 + P2;
}
public static String functionA(...) {...}
public static String functionB(...) {...}
}
That's not advisable.
I think using all static methods/singletons a good way to code your business logic when you don't have to persist anything in the class. I tend to use it over singletons but that's simply a preference.
MyClass.myStaticMethod(....);
as opposed to:
MyClass.getInstance().mySingletonMethod(...);
All static methods/singletons tend to use less memory as well but depending on how many users you have you may not even notice it.

Fields vs Properties for private class variables [duplicate]

This question already has answers here:
Are there any reasons to use private properties in C#?
(19 answers)
Closed 9 years ago.
For private class variables, which one is preferred?
If you have a property like int limit, you want it to be:
int Limit {get; set;}
and use it inside the class, like so:
this.Limit
Is there a reason to use it or not use it? Maybe for performance reasons?
I wonder if this is a good practice.
For a private member, I only make it a property when getting and/or setting the value should cause something else to occur, like:
private int Limit
{
get
{
EnsureValue();
return this._limit;
}
}
Otherwise, fields are fine. If you need to increase their accessibility, it's already a big enough change that making it a property at that point isn't a huge deal.
Edit: as Scott reminds us in the comments, side effects in properties can often cause more pain than anything else. Don't violate Single Responsibility and limit property logic to consistent, logical operations on the value only that must be done at the gate - such as lazy loading (as in the example above), transforming an internal structure into a publicly-useful format, etc.
The only real benefit an auto-property has over a field when the accessibility is private is that you can set a breakpoint on accesses and updates of the variable. If that is important to your scenario then definitely use an auto-property. Otherwise, given there is no substantial advantage, I choose to go with the simplest construct which is a field.
I would say its good practice to use a property. If ever you had to expose the limit value and used a local member it will require more coding while if its a property it would only require a change of its modifier.
I think it's cleaner also.
Granted, since it's a private API, its an implementation detail - you can do whatever you want here. However, there is very little reason to not use a property, even for private classes. The properties get inlined away by the JIT, unless there is extra code in place, so there isn't really a performance impact.
The biggest reasons to prefer properties, IMO, are:
Consistency in your API - You'll want properties in publicly exposed APIs, so making them in the private API will make your programming exprience more consistent, which leads to less bugs due to better maintainability
Easier to convert private class to public
From my perspective, using properties in lieu of variables boils down to:
Pros
Can set a break point for debugging, as Jared mentioned,
Can cause side-effects, like Rex's EnsureValue(),
The get and set can have different access restrictions (public get, protected set),
Can be utilized in Property Editors,
Cons
Slower access, uses method calls.
Code bulk, harder to read (IMO).
More difficult to initialize, like requiring EnsureValue();
Not all of these apply to int Limit {get; set;} style properties.
The point of automatic properties is they are very quick at creating a public access to some field in your class. Now, they offer no benefit over exposing straight up fields to the outside world, other than one big one.
Your class' interface is how it communicates with the outside world. Using automatic properties over fields allows you to change the internals of your class down the road in case you need to make setting the value of that property do something or check authorization rules or something similar on the read.
The fact that you already have a property means you can change your implementation without breaking your public interface.
Therefore, if this is just a private field, an automatic property isn't really that useful, not only that, but you can't initialize public properties at declaration like you can with fields.
I generally follow the following principle: If it's for strictly private use, use a field as it is faster.
If you decide that it should become public, protected or internal some day, it's not difficult to refactor to a property anyway, and with tools like ReSharper, it takes about 3 seconds to do so... :)
There's nothing wrong with having private or protected properties; this is mostly useful when there is some rule or side effect associated with the underlying variable.
The reason why properties seem more natural for public variables is that in the public case, it is a way to hedge one's bet against future implementation changes, whereby the property will remain intact but the implementation details somehow move around (and/or some additional business rule will be needed).
On performance, this is typically insignificant, or indeed identical for straight-assignment properties.
I personally dislike (but often use) plain assignment properties because they just clutter the code. I wish C# would allow for "after the fact refactoring".
Properties provide some very good automatic features (like Json and Xml Serialization)
Fields do not.
Properties can also be a part of an Interface. If you decide to refactor later on... this might be something to consider too.
Properties are just syntactic sugar, C# will compile them into get_PropertyName and set_PropertyName, so performance differences are not a consideration.
If your data member need only set and get logic then properties are very good and fast solution in C#

C# and Data Hiding

I'm new to C# and .Net in general so this may be a naive thing to ask. But anyway, consider this C# code:
class A {
public int Data {get; set;}
}
class B {
public A Aval {get; set;}
}
The B.Aval property above is returning a reference to its internal A object. As a former C++ programmer, I find this dangerous because by exposing reference to an member object class B violates the principle of data hiding. Nonetheless, this seems to be the normal practice in the C# world.
My question is, if it is at all, why is such a design the usual approach as opposed to returning copies of internal members, which will be much safer in so many ways (including the case of thread safety)?
I understand that even in C++ sometimes good design demands that you do expose class members directly (a Car class with Engine and Stereo objects inside it comes to mind), but it is not the norm as seems to be in C#.
You're absolutely right - you should only return objects from properties where either the object is immutable, or you're happy for the caller to modify it to whatever extent they can. A classic example of this is returning collections - often it's much better to return a read-only wrapper round a collection than to return the "real" collection directly.
On the other hand, pragmatism sometimes calls for just documenting this as "please don't change the returned object" - particularly when it's an API which is only used within a company.
Hopefully there'll be more support for immutability in future versions of C# and .NET, which will make this easier to cope with - but it's likely to remain a knotty problem in many cases.
This isn't encapsulation - it's an act of abstraction through object composition or aggregation depending on how the internal object lifetimes are created/managed.
In composition patterns it is perfectly acceptable to access composite state e.g. the instance of A in the instance of B.
http://en.wikipedia.org/wiki/Object_composition
As you point out the semantics of encapsulation are very different - to completely hide the internal implementation of A e.g. by inheriting B from A.
This is maybe related to the Law of Demeter... are you talking only about getters and setters that have no extra logic (and thus effectively directly expose the member), or any get-set pairs? In any case, if this portion of the outer object's state doesn't participate in meaninful invariants on the outer object, I don't think it's necessarily unreasonable to do this.

Categories

Resources