C# and Data Hiding

C# and Data Hiding - c#

I'm new to C# and .Net in general so this may be a naive thing to ask. But anyway, consider this C# code:
class A {
public int Data {get; set;}
}
class B {
public A Aval {get; set;}
}
The B.Aval property above is returning a reference to its internal A object. As a former C++ programmer, I find this dangerous because by exposing reference to an member object class B violates the principle of data hiding. Nonetheless, this seems to be the normal practice in the C# world.
My question is, if it is at all, why is such a design the usual approach as opposed to returning copies of internal members, which will be much safer in so many ways (including the case of thread safety)?
I understand that even in C++ sometimes good design demands that you do expose class members directly (a Car class with Engine and Stereo objects inside it comes to mind), but it is not the norm as seems to be in C#.

You're absolutely right - you should only return objects from properties where either the object is immutable, or you're happy for the caller to modify it to whatever extent they can. A classic example of this is returning collections - often it's much better to return a read-only wrapper round a collection than to return the "real" collection directly.
On the other hand, pragmatism sometimes calls for just documenting this as "please don't change the returned object" - particularly when it's an API which is only used within a company.
Hopefully there'll be more support for immutability in future versions of C# and .NET, which will make this easier to cope with - but it's likely to remain a knotty problem in many cases.

This isn't encapsulation - it's an act of abstraction through object composition or aggregation depending on how the internal object lifetimes are created/managed.
In composition patterns it is perfectly acceptable to access composite state e.g. the instance of A in the instance of B.
http://en.wikipedia.org/wiki/Object_composition
As you point out the semantics of encapsulation are very different - to completely hide the internal implementation of A e.g. by inheriting B from A.

This is maybe related to the Law of Demeter... are you talking only about getters and setters that have no extra logic (and thus effectively directly expose the member), or any get-set pairs? In any case, if this portion of the outer object's state doesn't participate in meaninful invariants on the outer object, I don't think it's necessarily unreasonable to do this.

Related

C# Encapsulation (OOP) [duplicate]

What's the advantage of using getters and setters - that only get and set - instead of simply using public fields for those variables?
If getters and setters are ever doing more than just the simple get/set, I can figure this one out very quickly, but I'm not 100% clear on how:
public String foo;
is any worse than:
private String foo;
public void setFoo(String foo) { this.foo = foo; }
public String getFoo() { return foo; }
Whereas the former takes a lot less boilerplate code.

There are actually many good reasons to consider using accessors rather than directly exposing fields of a class - beyond just the argument of encapsulation and making future changes easier.
Here are the some of the reasons I am aware of:
Encapsulation of behavior associated with getting or setting the property - this allows additional functionality (like validation) to be added more easily later.
Hiding the internal representation of the property while exposing a property using an alternative representation.
Insulating your public interface from change - allowing the public interface to remain constant while the implementation changes without affecting existing consumers.
Controlling the lifetime and memory management (disposal) semantics of the property - particularly important in non-managed memory environments (like C++ or Objective-C).
Providing a debugging interception point for when a property changes at runtime - debugging when and where a property changed to a particular value can be quite difficult without this in some languages.
Improved interoperability with libraries that are designed to operate against property getter/setters - Mocking, Serialization, and WPF come to mind.
Allowing inheritors to change the semantics of how the property behaves and is exposed by overriding the getter/setter methods.
Allowing the getter/setter to be passed around as lambda expressions rather than values.
Getters and setters can allow different access levels - for example the get may be public, but the set could be protected.

Because 2 weeks (months, years) from now when you realize that your setter needs to do more than just set the value, you'll also realize that the property has been used directly in 238 other classes :-)

A public field is not worse than a getter/setter pair that does nothing except returning the field and assigning to it. First, it's clear that (in most languages) there is no functional difference. Any difference must be in other factors, like maintainability or readability.
An oft-mentioned advantage of getter/setter pairs, isn't. There's this claim that you can change the implementation and your clients don't have to be recompiled. Supposedly, setters let you add functionality like validation later on and your clients don't even need to know about it. However, adding validation to a setter is a change to its preconditions, a violation of the previous contract, which was, quite simply, "you can put anything in here, and you can get that same thing later from the getter".
So, now that you broke the contract, changing every file in the codebase is something you should want to do, not avoid. If you avoid it you're making the assumption that all the code assumed the contract for those methods was different.
If that should not have been the contract, then the interface was allowing clients to put the object in invalid states. That's the exact opposite of encapsulation If that field could not really be set to anything from the start, why wasn't the validation there from the start?
This same argument applies to other supposed advantages of these pass-through getter/setter pairs: if you later decide to change the value being set, you're breaking the contract. If you override the default functionality in a derived class, in a way beyond a few harmless modifications (like logging or other non-observable behaviour), you're breaking the contract of the base class. That is a violation of the Liskov Substitutability Principle, which is seen as one of the tenets of OO.
If a class has these dumb getters and setters for every field, then it is a class that has no invariants whatsoever, no contract. Is that really object-oriented design? If all the class has is those getters and setters, it's just a dumb data holder, and dumb data holders should look like dumb data holders:
class Foo {
public:
int DaysLeft;
int ContestantNumber;
};
Adding pass-through getter/setter pairs to such a class adds no value. Other classes should provide meaningful operations, not just operations that fields already provide. That's how you can define and maintain useful invariants.
Client: "What can I do with an object of this class?"
Designer: "You can read and write several variables."
Client: "Oh... cool, I guess?"
There are reasons to use getters and setters, but if those reasons don't exist, making getter/setter pairs in the name of false encapsulation gods is not a good thing. Valid reasons to make getters or setters include the things often mentioned as the potential changes you can make later, like validation or different internal representations. Or maybe the value should be readable by clients but not writable (for example, reading the size of a dictionary), so a simple getter is a nice choice. But those reasons should be there when you make the choice, and not just as a potential thing you may want later. This is an instance of YAGNI (You Ain't Gonna Need It).

Lots of people talk about the advantages of getters and setters but I want to play devil's advocate. Right now I'm debugging a very large program where the programmers decided to make everything getters and setters. That might seem nice, but its a reverse-engineering nightmare.
Say you're looking through hundreds of lines of code and you come across this:
person.name = "Joe";
It's a beautifully simply piece of code until you realize its a setter. Now, you follow that setter and find that it also sets person.firstName, person.lastName, person.isHuman, person.hasReallyCommonFirstName, and calls person.update(), which sends a query out to the database, etc. Oh, that's where your memory leak was occurring.
Understanding a local piece of code at first glance is an important property of good readability that getters and setters tend to break. That is why I try to avoid them when I can, and minimize what they do when I use them.

In a pure object-oriented world getters and setters is a terrible anti-pattern. Read this article: Getters/Setters. Evil. Period. In a nutshell, they encourage programmers to think about objects as of data structures, and this type of thinking is pure procedural (like in COBOL or C). In an object-oriented language there are no data structures, but only objects that expose behavior (not attributes/properties!)
You may find more about them in Section 3.5 of Elegant Objects (my book about object-oriented programming).

There are many reasons. My favorite one is when you need to change the behavior or regulate what you can set on a variable. For instance, lets say you had a setSpeed(int speed) method. But you want that you can only set a maximum speed of 100. You would do something like:
public void setSpeed(int speed) {
if ( speed > 100 ) {
this.speed = 100;
} else {
this.speed = speed;
}
}
Now what if EVERYWHERE in your code you were using the public field and then you realized you need the above requirement? Have fun hunting down every usage of the public field instead of just modifying your setter.
My 2 cents :)

One advantage of accessors and mutators is that you can perform validation.
For example, if foo was public, I could easily set it to null and then someone else could try to call a method on the object. But it's not there anymore! With a setFoo method, I could ensure that foo was never set to null.
Accessors and mutators also allow for encapsulation - if you aren't supposed to see the value once its set (perhaps it's set in the constructor and then used by methods, but never supposed to be changed), it will never been seen by anyone. But if you can allow other classes to see or change it, you can provide the proper accessor and/or mutator.

Thanks, that really clarified my thinking. Now here is (almost) 10 (almost) good reasons NOT to use getters and setters:
When you realize you need to do more than just set and get the value, you can just make the field private, which will instantly tell you where you've directly accessed it.
Any validation you perform in there can only be context free, which validation rarely is in practice.
You can change the value being set - this is an absolute nightmare when the caller passes you a value that they [shock horror] want you to store AS IS.
You can hide the internal representation - fantastic, so you're making sure that all these operations are symmetrical right?
You've insulated your public interface from changes under the sheets - if you were designing an interface and weren't sure whether direct access to something was OK, then you should have kept designing.
Some libraries expect this, but not many - reflection, serialization, mock objects all work just fine with public fields.
Inheriting this class, you can override default functionality - in other words you can REALLY confuse callers by not only hiding the implementation but making it inconsistent.
The last three I'm just leaving (N/A or D/C)...

Depends on your language. You've tagged this "object-oriented" rather than "Java", so I'd like to point out that ChssPly76's answer is language-dependent. In Python, for instance, there is no reason to use getters and setters. If you need to change the behavior, you can use a property, which wraps a getter and setter around basic attribute access. Something like this:
class Simple(object):
def _get_value(self):
return self._value -1
def _set_value(self, new_value):
self._value = new_value + 1
def _del_value(self):
self.old_values.append(self._value)
del self._value
value = property(_get_value, _set_value, _del_value)

Well i just want to add that even if sometimes they are necessary for the encapsulation and security of your variables/objects, if we want to code a real Object Oriented Program, then we need to STOP OVERUSING THE ACCESSORS, cause sometimes we depend a lot on them when is not really necessary and that makes almost the same as if we put the variables public.

EDIT: I answered this question because there are a bunch of people learning programming asking this, and most of the answers are very technically competent, but they're not as easy to understand if you're a newbie. We were all newbies, so I thought I'd try my hand at a more newbie friendly answer.
The two main ones are polymorphism, and validation. Even if it's just a stupid data structure.
Let's say we have this simple class:
public class Bottle {
public int amountOfWaterMl;
public int capacityMl;
}
A very simple class that holds how much liquid is in it, and what its capacity is (in milliliters).
What happens when I do:
Bottle bot = new Bottle();
bot.amountOfWaterMl = 1500;
bot.capacityMl = 1000;
Well, you wouldn't expect that to work, right?
You want there to be some kind of sanity check. And worse, what if I never specified the maximum capacity? Oh dear, we have a problem.
But there's another problem too. What if bottles were just one type of container? What if we had several containers, all with capacities and amounts of liquid filled? If we could just make an interface, we could let the rest of our program accept that interface, and bottles, jerrycans and all sorts of stuff would just work interchangably. Wouldn't that be better? Since interfaces demand methods, this is also a good thing.
We'd end up with something like:
public interface LiquidContainer {
public int getAmountMl();
public void setAmountMl(int amountMl);
public int getCapacityMl();
}
Great! And now we just change Bottle to this:
public class Bottle implements LiquidContainer {
private int capacityMl;
private int amountFilledMl;
public Bottle(int capacityMl, int amountFilledMl) {
this.capacityMl = capacityMl;
this.amountFilledMl = amountFilledMl;
checkNotOverFlow();
}
public int getAmountMl() {
return amountFilledMl;
}
public void setAmountMl(int amountMl) {
this.amountFilled = amountMl;
checkNotOverFlow();
}
public int getCapacityMl() {
return capacityMl;
}
private void checkNotOverFlow() {
if(amountOfWaterMl > capacityMl) {
throw new BottleOverflowException();
}
}
I'll leave the definition of the BottleOverflowException as an exercise to the reader.
Now notice how much more robust this is. We can deal with any type of container in our code now by accepting LiquidContainer instead of Bottle. And how these bottles deal with this sort of stuff can all differ. You can have bottles that write their state to disk when it changes, or bottles that save on SQL databases or GNU knows what else.
And all these can have different ways to handle various whoopsies. The Bottle just checks and if it's overflowing it throws a RuntimeException. But that might be the wrong thing to do.
(There is a useful discussion to be had about error handling, but I'm keeping it very simple here on purpose. People in comments will likely point out the flaws of this simplistic approach. ;) )
And yes, it seems like we go from a very simple idea to getting much better answers quickly.
Please note also that you can't change the capacity of a bottle. It's now set in stone. You could do this with an int by declaring it final. But if this was a list, you could empty it, add new things to it, and so on. You can't limit the access to touching the innards.
There's also the third thing that not everyone has addressed: getters and setters use method calls. That means that they look like normal methods everywhere else does. Instead of having weird specific syntax for DTOs and stuff, you have the same thing everywhere.

I know it's a bit late, but I think there are some people who are interested in performance.
I've done a little performance test. I wrote a class "NumberHolder" which, well, holds an Integer. You can either read that Integer by using the getter method
anInstance.getNumber() or by directly accessing the number by using anInstance.number. My programm reads the number 1,000,000,000 times, via both ways. That process is repeated five times and the time is printed. I've got the following result:
Time 1: 953ms, Time 2: 741ms
Time 1: 655ms, Time 2: 743ms
Time 1: 656ms, Time 2: 634ms
Time 1: 637ms, Time 2: 629ms
Time 1: 633ms, Time 2: 625ms
(Time 1 is the direct way, Time 2 is the getter)
You see, the getter is (almost) always a bit faster. Then I tried with different numbers of cycles. Instead of 1 million, I used 10 million and 0.1 million.
The results:
10 million cycles:
Time 1: 6382ms, Time 2: 6351ms
Time 1: 6363ms, Time 2: 6351ms
Time 1: 6350ms, Time 2: 6363ms
Time 1: 6353ms, Time 2: 6357ms
Time 1: 6348ms, Time 2: 6354ms
With 10 million cycles, the times are almost the same.
Here are 100 thousand (0.1 million) cycles:
Time 1: 77ms, Time 2: 73ms
Time 1: 94ms, Time 2: 65ms
Time 1: 67ms, Time 2: 63ms
Time 1: 65ms, Time 2: 65ms
Time 1: 66ms, Time 2: 63ms
Also with different amounts of cycles, the getter is a little bit faster than the regular way. I hope this helped you.

Don't use getters setters unless needed for your current delivery I.e. Don't think too much about what would happen in the future, if any thing to be changed its a change request in most of the production applications, systems.
Think simple, easy, add complexity when needed.
I would not take advantage of ignorance of business owners of deep technical know how just because I think it's correct or I like the approach.
I have massive system written without getters setters only with access modifiers and some methods to validate n perform biz logic. If you absolutely needed the. Use anything.

We use getters and setters:
for reusability
to perform validation in later stages of programming
Getter and setter methods are public interfaces to access private class members.
Encapsulation mantra
The encapsulation mantra is to make fields private and methods public.
Getter Methods: We can get access to private variables.
Setter Methods: We can modify private fields.
Even though the getter and setter methods do not add new functionality, we can change our mind come back later to make that method
better;
safer; and
faster.
Anywhere a value can be used, a method that returns that value can be added. Instead of:
int x = 1000 - 500
use
int x = 1000 - class_name.getValue();
In layman's terms
Suppose we need to store the details of this Person. This Person has the fields name, age and sex. Doing this involves creating methods for name, age and sex. Now if we need create another person, it becomes necessary to create the methods for name, age, sex all over again.
Instead of doing this, we can create a bean class(Person) with getter and setter methods. So tomorrow we can just create objects of this Bean class(Person class) whenever we need to add a new person (see the figure). Thus we are reusing the fields and methods of bean class, which is much better.

I spent quite a while thinking this over for the Java case, and I believe the real reasons are:
Code to the interface, not the implementation
Interfaces only specify methods, not fields
In other words, the only way you can specify a field in an interface is by providing a method for writing a new value and a method for reading the current value.
Those methods are the infamous getter and setter....

It can be useful for lazy-loading. Say the object in question is stored in a database, and you don't want to go get it unless you need it. If the object is retrieved by a getter, then the internal object can be null until somebody asks for it, then you can go get it on the first call to the getter.
I had a base page class in a project that was handed to me that was loading some data from a couple different web service calls, but the data in those web service calls wasn't always used in all child pages. Web services, for all of the benefits, pioneer new definitions of "slow", so you don't want to make a web service call if you don't have to.
I moved from public fields to getters, and now the getters check the cache, and if it's not there call the web service. So with a little wrapping, a lot of web service calls were prevented.
So the getter saves me from trying to figure out, on each child page, what I will need. If I need it, I call the getter, and it goes to find it for me if I don't already have it.
protected YourType _yourName = null;
public YourType YourName{
get
{
if (_yourName == null)
{
_yourName = new YourType();
return _yourName;
}
}
}

One aspect I missed in the answers so far, the access specification:
for members you have only one access specification for both setting and getting
for setters and getters you can fine tune it and define it separately

In languages which don't support "properties" (C++, Java) or require recompilation of clients when changing fields to properties (C#), using get/set methods is easier to modify. For example, adding validation logic to a setFoo method will not require changing the public interface of a class.
In languages which support "real" properties (Python, Ruby, maybe Smalltalk?) there is no point to get/set methods.

One of the basic principals of OO design: Encapsulation!
It gives you many benefits, one of which being that you can change the implementation of the getter/setter behind the scenes but any consumer of that value will continue to work as long as the data type remains the same.

You should use getters and setters when:
You're dealing with something that is conceptually an attribute, but:
Your language doesn't have properties (or some similar mechanism, like Tcl's variable traces), or
Your language's property support isn't sufficient for this use case, or
Your language's (or sometimes your framework's) idiomatic conventions encourage getters or setters for this use case.
So this is very rarely a general OO question; it's a language-specific question, with different answers for different languages (and different use cases).
From an OO theory point of view, getters and setters are useless. The interface of your class is what it does, not what its state is. (If not, you've written the wrong class.) In very simple cases, where what a class does is just, e.g., represent a point in rectangular coordinates,* the attributes are part of the interface; getters and setters just cloud that. But in anything but very simple cases, neither the attributes nor getters and setters are part of the interface.
Put another way: If you believe that consumers of your class shouldn't even know that you have a spam attribute, much less be able to change it willy-nilly, then giving them a set_spam method is the last thing you want to do.
* Even for that simple class, you may not necessarily want to allow setting the x and y values. If this is really a class, shouldn't it have methods like translate, rotate, etc.? If it's only a class because your language doesn't have records/structs/named tuples, then this isn't really a question of OO…
But nobody is ever doing general OO design. They're doing design, and implementation, in a specific language. And in some languages, getters and setters are far from useless.
If your language doesn't have properties, then the only way to represent something that's conceptually an attribute, but is actually computed, or validated, etc., is through getters and setters.
Even if your language does have properties, there may be cases where they're insufficient or inappropriate. For example, if you want to allow subclasses to control the semantics of an attribute, in languages without dynamic access, a subclass can't substitute a computed property for an attribute.
As for the "what if I want to change my implementation later?" question (which is repeated multiple times in different wording in both the OP's question and the accepted answer): If it really is a pure implementation change, and you started with an attribute, you can change it to a property without affecting the interface. Unless, of course, your language doesn't support that. So this is really just the same case again.
Also, it's important to follow the idioms of the language (or framework) you're using. If you write beautiful Ruby-style code in C#, any experienced C# developer other than you is going to have trouble reading it, and that's bad. Some languages have stronger cultures around their conventions than others.—and it may not be a coincidence that Java and Python, which are on opposite ends of the spectrum for how idiomatic getters are, happen to have two of the strongest cultures.
Beyond human readers, there will be libraries and tools that expect you to follow the conventions, and make your life harder if you don't. Hooking Interface Builder widgets to anything but ObjC properties, or using certain Java mocking libraries without getters, is just making your life more difficult. If the tools are important to you, don't fight them.

From a object orientation design standpoint both alternatives can be damaging to the maintenance of the code by weakening the encapsulation of the classes. For a discussion you can look into this excellent article: http://typicalprogrammer.com/?p=23

Code evolves. private is great for when you need data member protection. Eventually all classes should be sort of "miniprograms" that have a well-defined interface that you can't just screw with the internals of.
That said, software development isn't about setting down that final version of the class as if you're pressing some cast iron statue on the first try. While you're working with it, code is more like clay. It evolves as you develop it and learn more about the problem domain you are solving. During development classes may interact with each other than they should (dependency you plan to factor out), merge together, or split apart. So I think the debate boils down to people not wanting to religiously write
int getVar() const { return var ; }
So you have:
doSomething( obj->getVar() ) ;
Instead of
doSomething( obj->var ) ;
Not only is getVar() visually noisy, it gives this illusion that gettingVar() is somehow a more complex process than it really is. How you (as the class writer) regard the sanctity of var is particularly confusing to a user of your class if it has a passthru setter -- then it looks like you're putting up these gates to "protect" something you insist is valuable, (the sanctity of var) but yet even you concede var's protection isn't worth much by the ability for anyone to just come in and set var to whatever value they want, without you even peeking at what they are doing.
So I program as follows (assuming an "agile" type approach -- ie when I write code not knowing exactly what it will be doing/don't have time or experience to plan an elaborate waterfall style interface set):
1) Start with all public members for basic objects with data and behavior. This is why in all my C++ "example" code you'll notice me using struct instead of class everywhere.
2) When an object's internal behavior for a data member becomes complex enough, (for example, it likes to keep an internal std::list in some kind of order), accessor type functions are written. Because I'm programming by myself, I don't always set the member private right away, but somewhere down the evolution of the class the member will be "promoted" to either protected or private.
3) Classes that are fully fleshed out and have strict rules about their internals (ie they know exactly what they are doing, and you are not to "fuck" (technical term) with its internals) are given the class designation, default private members, and only a select few members are allowed to be public.
I find this approach allows me to avoid sitting there and religiously writing getter/setters when a lot of data members get migrated out, shifted around, etc. during the early stages of a class's evolution.

There is a good reason to consider using accessors is there is no property inheritance. See next example:
public class TestPropertyOverride {
public static class A {
public int i = 0;
public void add() {
i++;
}
public int getI() {
return i;
}
}
public static class B extends A {
public int i = 2;
#Override
public void add() {
i = i + 2;
}
#Override
public int getI() {
return i;
}
}
public static void main(String[] args) {
A a = new B();
System.out.println(a.i);
a.add();
System.out.println(a.i);
System.out.println(a.getI());
}
}
Output:
0
0
4

Getters and setters are used to implement two of the fundamental aspects of Object Oriented Programming which are:
Abstraction
Encapsulation
Suppose we have an Employee class:
package com.highmark.productConfig.types;
public class Employee {
private String firstName;
private String middleName;
private String lastName;
public String getFirstName() {
return firstName;
}
public void setFirstName(String firstName) {
this.firstName = firstName;
}
public String getMiddleName() {
return middleName;
}
public void setMiddleName(String middleName) {
this.middleName = middleName;
}
public String getLastName() {
return lastName;
}
public void setLastName(String lastName) {
this.lastName = lastName;
}
public String getFullName(){
return this.getFirstName() + this.getMiddleName() + this.getLastName();
}
}
Here the implementation details of Full Name is hidden from the user and is not accessible directly to the user, unlike a public attribute.

There is a difference between DataStructure and Object.
Datastructure should expose its innards and not behavior.
An Object should not expose its innards but it should expose its behavior, which is also known as the Law of Demeter
Mostly DTOs are considered more of a datastructure and not Object. They should only expose their data and not behavior. Having Setter/Getter in DataStructure will expose behavior instead of data inside it. This further increases the chance of violation of Law of Demeter.
Uncle Bob in his book Clean code explained the Law of Demeter.
There is a well-known heuristic called the Law of Demeter that says a
module should not know about the innards of the objects it
manipulates. As we saw in the last section, objects hide their data
and expose operations. This means that an object should not expose its
internal structure through accessors because to do so is to expose,
rather than to hide, its internal structure.
More precisely, the Law of Demeter says that a method f of a class C
should only call the methods of these:
C
An object created by f
An object passed as an argument to f
An object held in an instance variable of C
The method should not invoke methods on objects that are returned by any of the allowed functions.
In other words, talk to friends, not to strangers.
So according this, example of LoD violation is:
final String outputDir = ctxt.getOptions().getScratchDir().getAbsolutePath();
Here, the function should call the method of its immediate friend which is ctxt here, It should not call the method of its immediate friend's friend. but this rule doesn't apply to data structure. so here if ctxt, option, scratchDir are datastructure then why to wrap their internal data with some behavior and doing a violation of LoD.
Instead, we can do something like this.
final String outputDir = ctxt.options.scratchDir.absolutePath;
This fulfills our needs and doesn't even violate LoD.
Inspired by Clean Code by Robert C. Martin(Uncle Bob)

If you don't require any validations and not even need to maintain state i.e. one property depends on another so we need to maintain the state when one is change. You can keep it simple by making field public and not using getter and setters.
I think OOPs complicates things as the program grows it becomes nightmare for developer to scale.
A simple example; we generate c++ headers from xml. The header contains simple field which does not require any validations. But still as in OOPS accessor are fashion we generates them as following.
const Filed& getfield() const
Field& getField()
void setfield(const Field& field){...}
which is very verbose and is not required. a simple
struct
{
Field field;
};
is enough and readable.
Functional programming don't have the concept of data hiding they even don't require it as they do not mutate the data.

Additionally, this is to "future-proof" your class. In particular, changing from a field to a property is an ABI break, so if you do later decide that you need more logic than just "set/get the field", then you need to break ABI, which of course creates problems for anything else already compiled against your class.

One other use (in languages that support properties) is that setters and getters can imply that an operation is non-trivial. Typically, you want to avoid doing anything that's computationally expensive in a property.

One relatively modern advantage of getters/setters is that is makes it easier to browse code in tagged (indexed) code editors. E.g. If you want to see who sets a member, you can open the call hierarchy of the setter.
On the other hand, if the member is public, the tools don't make it possible to filter read/write access to the member. So you have to trudge though all uses of the member.

Getters and setters coming from data hiding. Data Hiding means We
are hiding data from outsiders or outside person/thing cannot access
our data.This is a useful feature in OOP.
As a example:
If you create a public variable, you can access that variable and change value in anywhere(any class). But if you create as private that variable cannot see/access in any class except declared class.
public and private are access modifiers.
So how can we access that variable outside:
This is the place getters and setters coming from. You can declare variable as private then you can implement getter and setter for that variable.
Example(Java):
private String name;
public String getName(){
return this.name;
}
public void setName(String name){
this.name= name;
}
Advantage:
When anyone want to access or change/set value to balance variable, he/she must have permision.
//assume we have person1 object
//to give permission to check balance
person1.getName()
//to give permission to set balance
person1.setName()
You can set value in constructor also but when later on when you want
to update/change value, you have to implement setter method.

Is there any advantage in disallowing interface implementation for existing classes?

In static OOP languages, interfaces are used in order to declare that several classes share some logical property - they are disposable, they can be compared to an int, they can be serialized, etc.
Let's say .net didn't have a standard IDisposable interface, and I've just came up with this beautiful idea:
interface IDiscardable { void Discard(); }
My app uses a lot of System.Windows.Forms, and I think that a Form satisfies the logical requirements for being an IDiscardable. The problem is, Form is defined outside of my project, so C# (and Java, C++...) won't allow me to implement IDiscardable for it. C# doesn't allow me to formally represent the fact that a Form can be discarded ( and I'll probably end up with a MyForm wrapper class or something.
In contrast, Haskell has typeclasses, which are logically similar to interfaces. A Show instance can be presented (or serialized) as a string, Eq allows comparisons, etc. But there's one crucial difference: you can write a typeclass instance (which is similar to implementing an interface) without accessing the source code of a type. So if Haskell supplies me with some Form type, writing an Discardable instance for it is trivial.
My question is: from a language designer perspective, is there any advantage to the first approach? Haskell is not an object oriented language - does the second approach violates OOP in any way?
Thanks!

This is a difficult question, which stems from a common misunderstanding. Haskell type classes (TC), are said to be "logically similar" to the interfaces or abstract classes (IAC) from object-oriented programming languages. They are not. They represent different concepts about types and programming languages: IAC are a case of subtyping, while TC is a form of parametric polymorphism.
Nevertheless, since your questions are methodological, here I answer from a methodological side. To start with the second question:
does the second approach [that of extending the implementation of a class outside the class] violate OOP in any way
Object oriented programming is a set of ideas to describe the execution of a program, the main elements of an execution, how to specify these elements in the program's code, and how to structure a program so as to separate the specification of different elements. In particular, OOP is based in these ideas:
At any state of its execution, a process (executing program) consists of a set of objects. This set is dynamic: it may contain different objects at different states, via object creation and destruction.
Every object has an internal state represented by a set of fields, which may include references to other related objects. Relations are dynamic: the same field of the same object a may at different states point to different objects.
Every object can receive some messages from another object. Upon receiving a message, the object may alter its state and may send messages to objects in its fields.
Every object is an instance of a class: the class describes what fields the object has, what messages it can receive, and what it does upon receiving a message.
In an object a, the same field a.f may at different states point to
different objects, which may belong to different classes. Thus, a needs not to know to what class those objects b belong; it only needs to know what messages do those objects accept. For this reason, the type of those fields can be an interface.
The interface declares a set of messages that an object can receive. The class specifies explicitly what interfaces are satisfied by the objects of that class.
My answer to the question: in my opinion yes.
Implementing an interface (as suggested in the example) outside a class breaks one of these ideas: that the class of the object describes the complete set of messages that objects in that class can receive.
You may like to know, though, that this is (in part) what "Aspects", as in AspectJ, are about. An Aspect describes the implementation of a certain "method" in several classes, and these implementations are incorportated (weaved) into the class.
To answer back the first question, "is there any advantage to the first approach", the answer would be also yes: that all the behaviour of an object (what messages it answers to) is only described in one place, in the class.

Well, the Haskell approach does have one disadvantage, which is when you write, for example, two different libraries that each provides its own implementation of interface Foo for the same external type (provided by yet a third library). In this case now these two libraries can't be used at the same time in the same program. So if you call lack of a disadvantage an advantage, then I guess that would be one advantage for the OOP language way of doing this—but it's a pretty weak advantage.
What I would add to this, however, is that Haskell type classes are a bit like OOP interfaces, but not entirely like them. But type classes are also a bit like the Strategy and Template Method patterns; a type class can be simulated by explicitly passing around a "dictionary" object that provides implementations for the type class operations. So the following Haskell type class:
class Monoid m where
mempty :: m
mappend :: m -> m -> m
...can be simulated with this explicit dictionary type:
data Monoid_ m = Monoid_ { _mempty :: m, _mappend :: m -> m -> m }
...or an OOP interface like this:
interface Monoid<M> {
M empty();
M append(M a, M b);
}
What type classes add on top of this is that the compiler will maintain and pass around your dictionaries implicitly. Sometimes in the Haskell community you get arguments about when and whether type classes are superior to explicit dictionary passing; see for example Gabriel Gonzalez's "Scrap your type classes" blog entry (and keep in mind that he doesn't 100% agree with what he says there!). So the OOP counterpart to this idea would be instead of extending the language to allow external implements declarations, what are the drawbacks to just explicitly using Strategies or Template Methods?

What you are describing is the adapter pattern. The act of composing an object in a new type that provides some additional behavior to the underlying type, in this case the implementation of another interface.
As with so many design patterns, different languages choose different design patterns to incorporate directly into the language itself and provide special language support, often in the form of a more concise syntax, while other patterns are need to be implemented through the use of other mechanisms without their own special syntax.
C# doesn't have special language support for the adapter pattern, you need to create a new explicit type that composes your other type, implements the interface, and uses the composed type to fulfill the interface's contract. Is it possible for them to add such a feature to the language, sure. Like any other feature request in existence it needs to be designed, implemented, tested, documented, and all sorts of other expenses accounted for. This feature has (thus far) not made the cut.

What you are describing is called duck typing, after the phrase "If it walks like a duck, swims like a duck, and quacks like a duck, then it's a duck".
C# actually does allow dynamic (run-time) duck typing through the dynamic keyword. What it doesn't allow is static (compile-time) duck typing.
You'd probably need somebody from Microsoft to come along and provide the exact reasons this doesn't exist in C#, but here are some likely candidates:
The "minus 100 points" philosophy to adding features. It's not just enough for a feature to have no drawbacks, to justify the effort put into implementing, testing, maintaining and supporting a language feature, it has to provide a clear benefit. Between the dynamic keyword and the adapter pattern, there's not many situations where this is useful. Reflection is also powerful enough that it would be possible to effectively provide duck typing, for example I believe it'd be relatively straightforward to use Castle's DynamicProxy for this.
There are situations where you want a class to be able to specify how it is accessed. For example, fluent APIs often control the valid orderings and combinations of chained methods on a class through the use of interfaces. See, for example, this article. If my fluent class was designed around a grammar which stated that once method A was called, no other methods except B could be called, I could control this with interfaces like:
public class FluentExample : ICanCallAB
{
public ICanCallB A()
{
return this;
}
public ICanCallAB B()
{
return this;
}
}
public interface ICanCallA
{
void A();
}
public interface ICanCallAB : ICanCallA
{
void B();
}
Of course, a consumer could get around this using casting or dynamic, but at least in this case the class can state its own intent.
Related to the above point, an interface implementation is a declaration of meaning. For example, Tree and Poodle might both have a Bark() member, but I would want to be able to use Tree as an IDog.

When should we use public and when private?

Should the Access Modifier be Public or Private when we are implementing method that is for now not using by other classes in our team solution?
I believe that "A public member says this member represents the key, documented functionality provided by this object.". We make private only "implementation details" methods and all methods that can be useful in future we should make public even for now there is no consumers of our methods in others classes. But my opponent says that such methods should be private. How do you think?
Added:
Let's be more specific. For example there is a class SqlHelper.
In it there is useful functionality for operation with the SQL Server.
In particular there is used connection to the SQL server. But not only in that class.
And for example I need to implement the public static HandleSqlExeption method(now only for class SqlHelper) which will process SqlExeptions. But I want that in all classes where there is operations with SQL connection in exception handling will be used this method (instead of it is simple, for example:
catch (Exception) { MsgBox {"SqlError"};
as somewhere happens now. So i consider that public access modifier will say to other colleagues that they can use this method. And private will hide that method. And i will need ti change code and rebuid assembly if some one will ask to use tsuch method. Why? There is only negatives.

In general, you should code using the least possible permissive access modifier.
If the method is not used outside the class, make it private.
If the method needs to be used by inheriting types, make it protected.
If the method will only be used within the assembly, make it internal.
Only if the method is to be used outside the assembly should it be public.
This helps with information hiding and lets you change the implementation at will.
I have heard this described as protecting your privates.
In terms of API design, you should have a complete API, exposing all logical functionality as public - this is why you should use interfaces in .NET for API design, as all interface members must be public. In implementing classes, use the above rules of thumb for any members that are not part of the interface - unless they form a logical part of the interface, as far as its consumers are concerned.
So, if you have a Read method being used today, you should have a complementary Write method that has the same accessibility. That's a good design (symmetric and expected), but how you read or write should be hidden behind private methods, if the public ones use them.

Use public and private depending on the nature of the method as name describes.
public when the method can be exposed to other objects and
private when the method should not be accessed by other objects.
If your object needs to be more secure, use private access specifier. You should also know about protected access specifier which is different in that, those methods can be accessed only by the objects that inherit them.

if a class has some thing to expose to outside world then make it public else private. it depends on the functionality and purpose of class not the current or future requirement.

This is a judgment call. Generally speaking, if it's logically a part of the interface the class is presenting to get the functionality it provides, it should be public even if it has no current users. That will reduce the chances that the class will need to be modified every time you want to use it for substantially the same things you're already using it for.
However, this has a cost. Each function you expose is a capability you are committing to provide. If, in the future, you decide to remove it from the public interface, you may break callers.
For example, consider a map class. Say you currently don't have any callers that need to know the number of items in the map, and your current implementation has a size variable you could easily return. You could add a getSize function that returns the size of the map. It's logically part of the function the map class exposes. So one can argue it should be public even if no current callers need to know the size.
However, it's entirely possible a future implementation might wish to get rid of the size variable. Perhaps the overhead of incrementing and decrementing the size is significant and implementation is changed to not need to know the size. If you exposed getSize function and it was called, you would either have to keep tracking the size even though you didn't need it or make the function very expensive, having to actually count the size.
Say lots of code needs to know if the table is empty and you provide an isEmpty function for this purpose. Experience shows that if you also provide a getCount function, people will do getCount() == 0 instead of isEmpty. That may not make any difference today, but it may constrain your future implementation choices to keep getCount as cheap as isEmpty.
Where possible, I would strongly suggest making it public if you can easily imagine a non-broken user of the class that would benefit from having access to that functionality and it forms a logical part of the functionality the class provides. Otherwise, you can easily add it later anyway. So don't stress over it too much. It is largely a style issue mixed with trying to predict the future.

A basic principle of object oriented development; an instance of a class should only expose what needs to be accessed by it's clients. See link and search for "encapsulation".

Fields vs Properties for private class variables [duplicate]

This question already has answers here:
Are there any reasons to use private properties in C#?
(19 answers)
Closed 9 years ago.
For private class variables, which one is preferred?
If you have a property like int limit, you want it to be:
int Limit {get; set;}
and use it inside the class, like so:
this.Limit
Is there a reason to use it or not use it? Maybe for performance reasons?
I wonder if this is a good practice.

For a private member, I only make it a property when getting and/or setting the value should cause something else to occur, like:
private int Limit
{
get
{
EnsureValue();
return this._limit;
}
}
Otherwise, fields are fine. If you need to increase their accessibility, it's already a big enough change that making it a property at that point isn't a huge deal.
Edit: as Scott reminds us in the comments, side effects in properties can often cause more pain than anything else. Don't violate Single Responsibility and limit property logic to consistent, logical operations on the value only that must be done at the gate - such as lazy loading (as in the example above), transforming an internal structure into a publicly-useful format, etc.

The only real benefit an auto-property has over a field when the accessibility is private is that you can set a breakpoint on accesses and updates of the variable. If that is important to your scenario then definitely use an auto-property. Otherwise, given there is no substantial advantage, I choose to go with the simplest construct which is a field.

I would say its good practice to use a property. If ever you had to expose the limit value and used a local member it will require more coding while if its a property it would only require a change of its modifier.
I think it's cleaner also.

Granted, since it's a private API, its an implementation detail - you can do whatever you want here. However, there is very little reason to not use a property, even for private classes. The properties get inlined away by the JIT, unless there is extra code in place, so there isn't really a performance impact.
The biggest reasons to prefer properties, IMO, are:
Consistency in your API - You'll want properties in publicly exposed APIs, so making them in the private API will make your programming exprience more consistent, which leads to less bugs due to better maintainability
Easier to convert private class to public

From my perspective, using properties in lieu of variables boils down to:
Pros
Can set a break point for debugging, as Jared mentioned,
Can cause side-effects, like Rex's EnsureValue(),
The get and set can have different access restrictions (public get, protected set),
Can be utilized in Property Editors,
Cons
Slower access, uses method calls.
Code bulk, harder to read (IMO).
More difficult to initialize, like requiring EnsureValue();
Not all of these apply to int Limit {get; set;} style properties.

The point of automatic properties is they are very quick at creating a public access to some field in your class. Now, they offer no benefit over exposing straight up fields to the outside world, other than one big one.
Your class' interface is how it communicates with the outside world. Using automatic properties over fields allows you to change the internals of your class down the road in case you need to make setting the value of that property do something or check authorization rules or something similar on the read.
The fact that you already have a property means you can change your implementation without breaking your public interface.
Therefore, if this is just a private field, an automatic property isn't really that useful, not only that, but you can't initialize public properties at declaration like you can with fields.

I generally follow the following principle: If it's for strictly private use, use a field as it is faster.
If you decide that it should become public, protected or internal some day, it's not difficult to refactor to a property anyway, and with tools like ReSharper, it takes about 3 seconds to do so... :)

There's nothing wrong with having private or protected properties; this is mostly useful when there is some rule or side effect associated with the underlying variable.
The reason why properties seem more natural for public variables is that in the public case, it is a way to hedge one's bet against future implementation changes, whereby the property will remain intact but the implementation details somehow move around (and/or some additional business rule will be needed).
On performance, this is typically insignificant, or indeed identical for straight-assignment properties.
I personally dislike (but often use) plain assignment properties because they just clutter the code. I wish C# would allow for "after the fact refactoring".

Properties provide some very good automatic features (like Json and Xml Serialization)
Fields do not.
Properties can also be a part of an Interface. If you decide to refactor later on... this might be something to consider too.

Properties are just syntactic sugar, C# will compile them into get_PropertyName and set_PropertyName, so performance differences are not a consideration.

If your data member need only set and get logic then properties are very good and fast solution in C#

Is Reflection breaking the encapsulation principle?

Okay, let's say we have a class defined like
public class TestClass
{
private string MyPrivateProperty { get; set; }
// This is for testing purposes
public string GetMyProperty()
{
return MyPrivateProperty;
}
}
then we try:
TestClass t = new TestClass { MyPrivateProperty = "test" };
Compilation fails with TestClass.MyPrivateProperty is inaccessible due to its protection level, as expected.
Try
TestClass t = new TestClass();
t.MyPrivateProperty = "test";
and compilation fails again, with the same message.
All good until now, we were expecting this.
But then one write:
PropertyInfo aProp = t.GetType().GetProperty(
"MyPrivateProperty",
BindingFlags.NonPublic | BindingFlags.Instance);
// This works:
aProp.SetValue(t, "test", null);
// Check
Console.WriteLine(t.GetMyProperty());
and here we are, we managed to change a private field.
Isn't it abnormal to be able to alter some object's internal state just by using reflection?
Edit:
Thanks for the replies so far. For those saying "you don't have to use it": what about a class designer, it looks like he can't assume internal state safety anymore?

Reflection breaks encapsulation principles by giving access to private fields and methods, but it's not the first or only way in which encapsulation can be circumvented; one could argue that serialization exposes all the internal data of a class, information which would normally be private.
It's important to understand that encapsulation is only a technique, one that makes designing behaviour easier, provided consumers agree use an API you have defined. If somebody chooses to circumvent your API using reflection or any other technique, they no longer have the assurance that your object will behave as you designed it. If somebody assigns a value of null to a private field, they'd better be ready to catch a NullReferenceException the next time they try to use your class!
In my experience, programming is all about assertions and assumptions. The language asserts constraints (classes, interfaces, enumerations) which make creating isolated behaviour much easier to produce, on the assumption that a consumer agrees to not violate those boundaries.
This is a fair assertion to make given it makes a divide-and-conquer approach to software development more easy than any technique before it.

Reflection is a tool. You may use it to break encapsulation, when it gives you more than it takes away.
Reflection has a certain "pain" (or cost -- in performance, in readability, in reliability of code) associated with it, so you won't use it for a common problem. It's just easier to follow object-oriented principles for common problems, which is one of the goals of the language, commonly referred to as the pit of success.
On the other hand, there are some tasks that wouldn't be solvable without that mechanism, e.g. working with run-time generate types (though, it is going to be much-much easier starting from .NET 4.0 with it's DLR and the "dynamic" variables in C# 4.0).

You are right that reflection can be opposed to any number of good design principles, but it can also be an essential building block that you can use to support good design principles - e.g. software that can be extended by plugins, inversion of control, etc.
If you're worried that it represents a capability that should be discouraged, you may have a point. But it's not as convenient to use as true language features, so it is easier to do things the right way.
If you think reflection ought to be impossible, you're dreaming!
In C++ there is no reflection as such. But there is an underlying "object model" that the compiler-generated machine code uses to access the structure of objects (and virtual functions). So a C++ programmer can break encapsulation in the same way.
class RealClass
{
private:
int m_secret;
};
class FakeClass
{
public:
int m_notSecret;
};
We can take a pointer to an object of type RealClass and simply cast it to FakeClass and access the "private" member.
Any restricted system has to be implemented on top of a more flexible system, so it is always possible to circumvent it. If reflection wasn't provided as a BCL feature, someone could add it with a library using unsafe code.
In some languages there are ways to encapsulate data so that it is not possible within the language to get at the data except in certain prescribed ways. But it will always be possible to cheat if you can find a way to escape out of the language. An extreme example would be scoped variables in JavaScript:
(function() {
var x = 5;
myGetter = function() { return x; };
mySetter = function(v) { x = v; };
})();
After that executes, the global namespace contains two functions, myGetter and mySetter, which are the only way to access the value of x. Javascript has no reflective ability to get at x any other way. But it has to run in some kind of host interpreter (e.g. in the browser), and so there is certainly some horrible way to manipulate x. A memory corruption bug in a plugin could do it by accident!

I suppose you could say that. You could also say that the CodeDom, Reflection Emit, and Expression Tree APIs break encapsulation by allowing you to write code on the fly. They are merely facilities provided by .NET. You don't have to use them.
Don't forget that you have to (a) be running in full trust, and (b) explicitly specify BindingFlags.NonPublic, in order to access private data. That should be enough of a safety net.

One can say that reflection breaks inheritance and polymorphism as well. When instead of supplying each object type with its own overloaded method version you have a single method that checks object type at run time and switches to a particular behavior for each.
Reflection is a nice tool nevertheless. Sometimes you need to instantiate an object with a private constructor because it's an optimal course of action and you cannot change the class implementation (closed library, can't get in touch with the author any longer). Or you wish to perform some auxiliary operation on a variety of objects in one place. Or maybe you wish to perform logging for certain types. These are the situation when reflection does help without causing any harm.

That's part of the functionality provided by Reflection, but not, I would say, the best use of it.
It only works under Full Trust.

The encapsulation principle is hold by your API, but reflection gives you a way to work around the API.
Whoever uses reflection knows that he is not using your API correctly!

No, it is not abnormal.
It is something reflection allows you to do, in some situation what you did might be of some use, in many other that simply will make your code a time bomb.
Reflection is a tool, it can be used or abused.
Regarding the question after your edit: the answer is simply no.
An API designer could do his best, putting a lot of effort into exposing a clean interface and see his API overly misused by using reflection.
An engineer can model a perfect washing machine that is secure, doesn't shock you with electricity and so on, but if you smash it with an hammer until you see the cables no one is going to blame the engineer if you get a shock :D

Reflection can help you to keep your code clean. E.g. when you use hibernate you can directly bind the private variables to DB values and you don't have to write unnecessary setter methods that you would not need otherwise. Seen from this point of view, reflection can help you to keep encapsulation.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

C# and Data Hiding - c#

Related

C# Encapsulation (OOP) [duplicate]

Is there any advantage in disallowing interface implementation for existing classes?

When should we use public and when private?

Fields vs Properties for private class variables [duplicate]

Is Reflection breaking the encapsulation principle?

Categories

Resources