I need to derive an important value given 7 potential inputs. Uncle Bob urges me to avoid functions with that many parameters, so I've extracted the class. All parameters now being properties, I'm left with a calculation method with no arguments.
“That”, I think, “could be a property, but I'm not sure if that's idiomatic C#.”
Should I expose the final result as a property, or as a method with no arguments? Would the average C# programmer find properties confusing or offensive? What about the Alt.Net crowd?
decimal consumption = calculator.GetConsumption(); // obviously derived
decimal consumption = calculator.Consumption; // not so obvious
If the latter: should I declare interim results as [private] properties, also? Thanks to heavy method extraction, I have several interim results. Many of these shouldn't be part of the public API. Some of them could be interesting, though, and my expressions would look cleaner if I could access them as properties:
decimal interim2 = this.ImportantInterimValue * otherval;
Happy Experiment Dept.:
While debugging my code in VS2008, I noticed that I kept hovering my mouse over the method calls that compute interim results, expecting a hover-over with their return value. After turning all methods into properties, I found that exposing interim results as properties greatly assisted debugging. I'm well pleased with that, but have lingering concerns about readability.
The interim value declarations look messier. The expressions, however, are easier to read without the brackets. I no longer feel compelled to start the method name with a verb. To contrast:
// Clean method declaration; compulsive verby name; callers need
// parenthesis despite lack of any arguments.
decimal DetermineImportantInterimValue() {
return this.DetermineOtherInterimValue() * this.SomeProperty;
}
// Messier property declaration; clean name; clean access syntax
decimal ImportantInterimValue {
get {
return this.OtherInterimValue * this.SomeProperty;
}
}
I should perhaps explain that I've been coding in Python for a decade. I've been left with a tendency to spend extra time making my code easier to call than to write. I'm not sure the Python community would regard this property-oriented style as acceptably “Pythonic”, however:
def determineImportantInterimValue(self):
"The usual way of doing it."
return self.determineOtherInterimValue() * self.someAttribute
importantInterimValue = property(
lambda self => self.otherInterimValue * self.someAttribute,
doc = "I'm not sure if this is Pythonic...")
The important question here seems to be this:
Which one produces more legible, maintainable code for you in the long run?
In my personal opinion, isolating the individual calculations as properties has a couple of distinct advantages over a single monolothic method call:
You can see the calculations as they're performed in the debugger, regardless of the class method you're in. This is a boon to productivity while you're debugging the class.
If the calculations are discrete, the properties will execute very quickly, which means (in my opinion), they observe the rules for property design. It's absurd to think that a guideline for design should be treated as a straightjacket. Remember: There is no silver bullet.
If the calculations are marked private or internal, they do not add unnecessary complexity to consumers of the class.
If all of the properties are discrete enough, compiler inlining may resolve the performance issues for you.
Finally, if the final method that returns your final calculation is far and away easier to maintain and understand because you can read it, that is an utterly compelling argument in and of itself.
One of the best things you can do is think for yourself and dare to challenge the preconceived One Size Fits All notions of our peers and predecessors. There are exceptions to every rule. This case may very well be one of them.
Postscript:
I do not believe that we should abandon standard property design in the vast majority of cases. But there are cases where deviating from The Standard(TM) is called for, because it makes sense to do so.
Personally, I would prefer if you make your public API as a method instead of property. Properties are supposed to be as 'fast' as possible in C#. More details on this discussion: Properties vs Methods
Internally, GetConsumption can use any number of private properties to arrive at the result, choice is yours.
I usually go by what the method or property will do. If it is something that is going to take a little time, I'll use a method. If it's very quick or has a very small number of operations going on behind the scenes, I'll make it a property.
I use to use methods to denote any action on the object or which changes the state of an object. so, in this case I would name the function as CalculateConsumption() which computes the values from other properties.
You say you are deriving a value from seven inputs, you have implemented seven properties, one for each input, and you have a property getter for the result. Some things you might want to consider are:
What happens if the caller fails to set one or more of the seven "input" properties? Does the result still make sense? Will an exception be thrown (e.g. divide by zero)?
In some cases the API may be less discoverable. If I must call a method that takes seven parameters, I know that I must supply all seven parameters to get the result. And if some of the parameters are optional, different overloads of the method make it clear which ones.
In contrast, it may not be so clear that I have to set seven properties before accessing the "result" property, and could be easy to forget one.
When you have a method with several parameters, you can more easily have richer validation. For example, you could throw an ArgumentException if "parameter A and parameter B are both null".
If you use properties for your inputs, each property will be set independently, so you can't perform the validation when the inputs are being set - only when the result property is being dereferenced, which may be less intuitive.
Related
I'm writing a library that has several public classes and methods, as well as several private or internal classes and methods that the library itself uses.
In the public methods I have a null check and a throw like this:
public int DoSomething(int number)
{
if (number == null)
{
throw new ArgumentNullException(nameof(number));
}
}
But then this got me thinking, to what level should I be adding parameter null checks to methods? Do I also start adding them to private methods? Should I only do it for public methods?
Ultimately, there isn't a uniform consensus on this. So instead of giving a yes or no answer, I'll try to list the considerations for making this decision:
Null checks bloat your code. If your procedures are concise, the null guards at the beginning of them may form a significant part of the overall size of the procedure, without expressing the purpose or behaviour of that procedure.
Null checks expressively state a precondition. If a method is going to fail when one of the values is null, having a null check at the top is a good way to demonstrate this to a casual reader without them having to hunt for where it's dereferenced. To improve this, people often use helper methods with names like Guard.AgainstNull, instead of having to write the check each time.
Checks in private methods are untestable. By introducing a branch in your code which you have no way of fully traversing, you make it impossible to fully test that method. This conflicts with the point of view that tests document the behaviour of a class, and that that class's code exists to provide that behaviour.
The severity of letting a null through depends on the situation. Often, if a null does get into the method, it'll be dereferenced a few lines later and you'll get a NullReferenceException. This really isn't much less clear than throwing an ArgumentNullException. On the other hand, if that reference is passed around quite a bit before being dereferenced, or if throwing an NRE will leave things in a messy state, then throwing early is much more important.
Some libraries, like .NET's Code Contracts, allow a degree of static analysis, which can add an extra benefit to your checks.
If you're working on a project with others, there may be existing team or project standards covering this.
If you're not a library developer, don't be defensive in your code
Write unit tests instead
In fact, even if you're developing a library, throwing is most of the time: BAD
1. Testing null on int must never be done in c# :
It raises a warning CS4072, because it's always false.
2. Throwing an Exception means it's exceptional: abnormal and rare.
It should never raise in production code. Especially because exception stack trace traversal can be a cpu intensive task. And you'll never be sure where the exception will be caught, if it's caught and logged or just simply silently ignored (after killing one of your background thread) because you don't control the user code. There is no "checked exception" in c# (like in java) which means you never know - if it's not well documented - what exceptions a given method could raise. By the way, that kind of documentation must be kept in sync with the code which is not always easy to do (increase maintenance costs).
3. Exceptions increases maintenance costs.
As exceptions are thrown at runtime and under certain conditions, they could be detected really late in the development process. As you may already know, the later an error is detected in the development process, the more expensive the fix will be. I've even seen exception raising code made its way to production code and not raise for a week, only for raising every day hereafter (killing the production. oops!).
4. Throwing on invalid input means you don't control input.
It's the case for public methods of libraries. However if you can check it at compile time with another type (for example a non nullable type like int) then it's the way to go. And of course, as they are public, it's their responsibility to check for input.
Imagine the user who uses what he thinks as valid data and then by a side effect, a method deep in the stack trace trows a ArgumentNullException.
What will be his reaction?
How can he cope with that?
Will it be easy for you to provide an explanation message ?
5. Private and internal methods should never ever throw exceptions related to their input.
You may throw exceptions in your code because an external component (maybe Database, a file or else) is misbehaving and you can't guarantee that your library will continue to run correctly in its current state.
Making a method public doesn't mean that it should (only that it can) be called from outside of your library (Look at Public versus Published from Martin Fowler). Use IOC, interfaces, factories and publish only what's needed by the user, while making the whole library classes available for unit testing. (Or you can use the InternalsVisibleTo mechanism).
6. Throwing exceptions without any explanation message is making fun of the user
No need to remind what feelings one can have when a tool is broken, without having any clue on how to fix it. Yes, I know. You comes to SO and ask a question...
7. Invalid input means it breaks your code
If your code can produce a valid output with the value then it's not invalid and your code should manage it. Add a unit test to test this value.
8. Think in user terms:
Do you like when a library you use throws exceptions for smashing your face ? Like: "Hey, it's invalid, you should have known that!"
Even if from your point of view - with your knowledge of the library internals, the input is invalid, how you can explain it to the user (be kind and polite):
Clear documentation (in Xml doc and an architecture summary may help).
Publish the xml doc with the library.
Clear error explanation in the exception if any.
Give the choice :
Look at Dictionary class, what do you prefer? what call do you think is the fastest ? What call can raises exception ?
Dictionary<string, string> dictionary = new Dictionary<string, string>();
string res;
dictionary.TryGetValue("key", out res);
or
var other = dictionary["key"];
9. Why not using Code Contracts ?
It's an elegant way to avoid the ugly if then throw and isolate the contract from the implementation, permitting to reuse the contract for different implementations at the same time. You can even publish the contract to your library user to further explain him how to use the library.
As a conclusion, even if you can easily use throw, even if you can experience exceptions raising when you use .Net Framework, that doesn't mean it could be used without caution.
Here are my opinions:
General Cases
Generally speaking, it is better to check for any invalid inputs before you process them in a method for robustness reason - be it private, protected, internal, protected internal, or public methods. Although there are some performance costs paid for this approach, in most cases, this is worth doing rather than paying more time to debug and to patch the codes later.
Strictly Speaking, however...
Strictly speaking, however, it is not always needed to do so. Some methods, usually private ones, can be left without any input checking provided that you have full guarantee that there isn't single call for the method with invalid inputs. This may give you some performance benefit, especially if the method is called frequently to do some basic computation/action. For such cases, doing checking for input validity may impair the performance significantly.
Public Methods
Now the public method is trickier. This is because, more strictly speaking, although the access modifier alone can tell who can use the methods, it cannot tell who will use the methods. More over, it also cannot tell how the methods are going to be used (that is, whether the methods are going to be called with invalid inputs in the given scopes or not).
The Ultimate Determining Factor
Although access modifiers for methods in the code can hint on how to use the methods, ultimately, it is humans who will use the methods, and it is up to the humans how they are going to use them and with what inputs. Thus, in some rare cases, it is possible to have a public method which is only called in some private scope and in that private scope, the inputs for the public methods are guaranteed to be valid before the public method is called.
In such cases then, even the access modifier is public, there isn't any real need to check for invalid inputs, except for robust design reason. And why is this so? Because there are humans who know completely when and how the methods shall be called!
Here we can see, there is no guarantee either that public method always require checking for invalid inputs. And if this is true for public methods, it must also be true for protected, internal, protected internal, and private methods as well.
Conclusions
So, in conclusion, we can say a couple of things to help us making decisions:
Generally, it is better to have checks for any invalid inputs for robust design reason, provided that performance is not at stake. This is true for any type of access modifiers.
The invalid inputs check could be skipped if performance gain could be significantly improved by doing so, provided that it can also be guaranteed that the scope where the methods are called are always giving the methods valid inputs.
private method is usually where we skip such checking, but there is no guarantee that we cannot do that for public method as well
Humans are the ones who ultimately use the methods. Regardless of how the access modifiers can hint the use of the methods, how the methods are actually used and called depend on the coders. Thus, we can only say about general/good practice, without restricting it to be the only way of doing it.
The public interface of your library deserves tight checking of preconditions, because you should expect the users of your library to make mistakes and violate the preconditions by accident. Help them understand what is going on in your library.
The private methods in your library do not require such runtime checking because you call them yourself. You are in full control of what you are passing. If you want to add checks because you are afraid to mess up, then use asserts. They will catch your own mistakes, but do not impede performance during runtime.
Though you tagged language-agnostic, it seems to me that it probably doesn't exist a general response.
Notably, in your example you hinted the argument: so with a language accepting hinting it'll fire an error as soon as entering the function, before you can take any action.
In such a case, the only solution is to have checked the argument before calling your function... but since you're writing a library, that cannot have sense!
In the other hand, with no hinting, it remains realistic to check inside the function.
So at this step of the reflexion, I'd already suggest to give up hinting.
Now let's go back to your precise question: to what level should it be checked?
For a given data piece it'd happen only at the highest level where it can "enter" (may be several occurrences for the same data), so logically it'd concern only public methods.
That's for the theory. But maybe you plan a huge, complex, library so it might be not easy to ensure having certainty about registering all "entry points".
In this case, I'd suggest the opposite: consider to merely apply your controls everywhere, then only omit it where you clearly see it's duplicate.
Hope this helps.
In my opinion you should ALWAYS check for "invalid" data - independent whether it is a private or public method.
Looked from the other way... why should you be able to work with something invalid just because the method is private? Doesn't make sense, right? Always try to use defensive programming and you will be happier in life ;-)
This is a question of preference. But consider instead why are you checking for null or rather checking for valid input. It's probably because you want to let the consumer of your library to know when he/she is using it incorrectly.
Let's imagine that we have implemented a class PersonList in a library. This list can only contain objects of the type Person. We have also on our PersonList implemented some operations and therefore we do not want it to contain any null values.
Consider the two following implementations of the Add method for this list:
Implementation 1
public void Add(Person item)
{
if(_size == _items.Length)
{
EnsureCapacity(_size + 1);
}
_items[_size++] = item;
}
Implementation 2
public void Add(Person item)
{
if(item == null)
{
throw new ArgumentNullException("Cannot add null to PersonList");
}
if(_size == _items.Length)
{
EnsureCapacity(_size + 1);
}
_items[_size++] = item;
}
Let's say we go with implementation 1
Null values can now be added in the list
All opoerations implemented on the list will have to handle theese null values
If we should check for and throw a exception in our operation, consumer will be notified about the exception when he/she is calling one of the operations and it will at this state be very unclear what he/she has done wrong (it just wouldn't make any sense to go for this approach).
If we instead choose to go with implementation 2, we make sure input to our library has the quality that we require for our class to operate on it. This means we only need to handle this here and then we can forget about it while we are implementing our other operations.
It will also become more clear for the consumer that he/she is using the library in the wrong way when he/she gets a ArgumentNullException on .Add instead of in .Sort or similair.
To sum it up my preference is to check for valid argument when it is being supplied by the consumer and it's not being handled by the private/internal methods of the library. This basically means we have to check arguments in constructors/methods that are public and takes parameters. Our private/internal methods can only be called from our public ones and they have allready checked the input which means we are good to go!
Using Code Contracts should also be considered when verifying input.
So I didn't find any elegant solution for this, either googling or throughout stackoverflow. I guess that I have a very specific situation in my hands, anyway here it goes:
I have a object structure, which I don't have control of, because I receive this structure from an external WS. This is quite a huge object, with various levels of fields and properties, and this fields and properties can or can't be null, in any level. You can think of this object as an anemic model, it doesn't have behaviour, just state.
For the purpose of this question, I'll give you a simplified sample that simulates my situation:
Class A
PropB1
PropC11
PropLeaf111
PropC12
PropLeaf112
PropB2
PropC21
PropLeaf211
PropC22
PropLeaf221
So, throughout my code I have to access a number of these properties, in different levels, to do some math in order to calculate what I need. Basically for each type of calculation that I have to do, I have to test each level of the properties that I need, to check if it's not null, in which case I would return (decimal) 0, or any other default value depending on the business logic.
Sample of a math that I have to do with it:
var value = 0;
if (objClassA.PropB1 != null && objClassA.PropB1.PropC11 != null) {
var leaf = objClassA.PropB1.PropC11.PropLeaf111;
value = leaf.HasValue ? leaf.Value : value;
}
Just to be very, the leaf properties of this structure would always be primitives, or nullable primitives in which case I give the proper treatment. This is "the logic" that I have to do for each property that I need, and sometimes I have to use quite some of them. Also the real structure is quite bigger, so the number of verifications that I would need to do, would also be bigger for each necessary property.
Now, I came up with some ideas, none of them I think is ideal:
Create methods to gather the properties, where it would abstract any necessary verification, or the logic to get default values. The drawback is that it would have, in my opinion, quite some duplicated code, since the verifications and the default values would be similar for some groups of fields.
Create a single generic method, where it receives a object, and a lamba function that access the required field. This method would try to execute the function and return it's result, and in case of an NullReferenceException, it would return a default value. The bright side of this one, is that it is realy generic, I just have to pass lambdas to access the properties, and the method would handle any problem. The drawback of it, is that I am using try -> catch to control logic, which is not the purpose of it, and the code might look confusing for other programmers that would eventually give maintenance to it.
Null Object Pattern, this would be the most elegant solution, I guess. It would have all the good points if it was a normal case. But the thing is the impact of providing Null Objects for this structure. Just to give a bit more of context, the software that I am working on, integrates with government's services, and the structure that I am working with, which is in the government's specifications, have some fields where null have some meaning which is different from a default value like "0". Also this specification changes from time to time, and the classes are generated again, and the post processing that I would have to do to create Null Objects, would also need maintenance, which seems a bit dangerous for me.
I hope that I made myself clear enough.
Thanks in advance.
Solution
This is a response as to how I solved my problem, based on the accepted answer.
I'm quite new to C#, and this kind of discution that was linked really helped me to come up with a elegant solution in many aspects. I still have the problem that depending where the code is executed, it uses .NET 2.0, but I also found a solution for this problem, where I can somewhat define extension methods: https://stackoverflow.com/a/707160/649790
And for the solution itself, I found this one the best:
http://www.codeproject.com/Articles/109026/Chained-null-checks-and-the-Maybe-monad
I can basically access the properties this way, and just do the math:
objClassA.With(o => o.PropB1).With(o => PropC11).Return(o => PropLeaf111, 0);
For each property that I need. It still isn't just:
objClassA.PropB1.PropC11.PropLeaf111
ofcourse, but it is far better that any solution that I found so far, since I was unfamiliar with Extension Methods, I really learned a lot.
Thanks again.
There is a strategy for dealing with this, involving the "Maybe" Monad.
Basically it works by providing a "fluent" interface where the chain of properties is interrupted by a null somewhere along the chain.
See here for an example: http://smellegantcode.wordpress.com/2008/12/11/the-maybe-monad-in-c/
And also here:
http://www.codeproject.com/Articles/109026/Chained-null-checks-and-the-Maybe-monad
http://mikehadlow.blogspot.co.uk/2011/01/monads-in-c-5-maybe.html
It's related to but not quite the same as what you seem to need; however, perhaps it can be adapted to your needs. The concepts are fairly fundamental.
I have read in several book and articles about TDD and BDD that one should avoid multiple assertions or expectations in a single unit test or specification. And I can understand the reasons for doing so. Still I am not sure what would be a good way to verify a complex result.
Assuming a method under test returns a complex object as a result (e.g. deserialization or database read) how do I verify the result correctly?
1.Asserting on each property:
Assert.AreEqual(result.Property1, 1);
Assert.AreEqual(result.Property2, "2");
Assert.AreEqual(result.Property3, null);
Assert.AreEqual(result.Property4, 4.0);
2.Relying on a correctly implemented .Equals():
Assert.AreEqual(result, expectedResult);
The disadvantage of 1. is that if the first assert fails all the following asserts are not run, which might have contained valuable information to find the problem. Maintainability might also be a problem as Properties come and go.
The disatvantage of 2. is that I seem to be testing more than one thing with this test. I might get false positives or negatives if .Equals() is not implemented correctly. Also with 2. I do not see, what properties are actually different if the test fails but I assume that can often be addressed with a decent .ToString() override. In any case I think I should avoid to be forced to throw the debugger at the failing tests to see the difference. I should see it right away.
The next problem with 2. is that it compares the whole object even though for some tests only some properties might be significant.
What would be a decent way or best practise for this in TDD and BDD.
Don't take TDD advice literally. What the "good guys" mean is that you should test one thing per test (to avoid a test failing for multiple reasons and subsequently having to debug the test to find the cause).
Now test "one thing" means "one behavior" ; NOT one assert per test IMHO.
It's a guideline not a rule.
So options:
For comparing whole data value objects
If the object already exposes a usable production Equals, use it.
Else do not add a Equals just for testing (See also Equality Pollution). Use a helper/extension method (or you could find one in an assertion library) obj1.HasSamePropertiesAs(obj2)
For comparing unstructured parts of objects (set of arbitrary properties - which should be rare),
create a well named private method AssertCustomerDetailsInOrderEquals(params) so that the test is clear in what part you're actually testing. Move the set of assertions into the private method.
With the context present in the question I'd go for option 1.
It likely depends on context. If I'm using some sort of built in object serialization within the .NET framework, I can be reasonably assured that if no errors were encountered then the entire object was appropriately marshaled. In that case, asserting a single field in the object is probably fine. I trust MS libraries to do the right thing.
If you are using SQL and manually mapping results to domain objects I feel that option 1 makes it quicker to diagnose when something breaks than option 2. Option 2 likely relies on toString methods in order to render the assertion failure:
Expected <1 2 null 4.0> but was <1 2 null null>
Now I am stuck trying to figure out what field 4.0/null was. Of course I could put the field name into the method:
Expected <Property1: 1, Property2: 2, Property3: null, Property4: 4.0>
but was <Property1: 1, Property2: 2, Property3: null, Property4: null>
This is fine for small numbers of properties, but begins to break down larger numbers of properties due to wrapping, etc. Also, the toString maintenance could become an issue as it needs to change at the same rate as the equals method.
Of course there is no correct answer, at the end of the day, it really boils down to your team's (or your own) personal preference.
Hope that helps!
Brandon
I would use the second approach by default. You're right, this fails if Equals() is not implemented correctly, but if you've implemented a custom Equals(), you should have unit-tested it too.
The second approach is in fact more abstract and consise and allows you to modify the code easier later, allowing in the same way to reduce code duplication. Let's say you choose the first approach:
You'll have to compare all properties in several places,
If you add a new property to a class, you'll have to add a new assertion in unit tests; modifying your Equals() would be much easier. Of course you have still to add a value of the property in expected result (if it's not the default value), but it would be shorter to do than adding a new assertion.
Also, it is much easier to see what properties are actually different with the second approach. You just run your tests in debug mode, and compare the properties on break.
By the way, you should never use ToString() for this. I suppose you wanted to say [DebuggerDisplay] attribute?
The next problem with 2. is that it compares the whole object even though for some tests only some properties might be significant.
If you have to compare only some properties, than:
ether you refactor your code by implementing a base class which contains only those properties. Example: if you want to compare a Cat to another Cat but only considering the properties common to Dog and other animals, implement Cat : Animal and compare the base class.
or you do what you've done in your first approach. Example: if you do care only about the quantity of milk the expected and the actual cats have drunk and their respective names, you'll have two assertions in your unit test.
Try "one aspect of behavior per test" rather than "one assertion per test". If you need more than one assertion to illustrate the behavior you're interested in, do that.
For instance, your example might be ShouldHaveSensibleDefaults. Splitting that up into ShouldHaveADefaultNameAsEmptyString, ShouldHaveNullAddress, ShouldHaveAQuantityOfZero etc. won't read as clearly. Nor will it help to hide the sensible defaults in another object then do a comparison.
However, I would separate examples where the values had defaults with any properties derived from some logic somewhere, for instance, ShouldCalculateTheTotalQuantity. Moving small examples like this into their own method makes it more readable.
You might also find that different properties on your object are changed by different contexts. Calling out each of these contexts and looking at those properties separately helps me to see how the context relates to the outcome.
Dave Astels, who came up with the "one assertion per test", now uses the phrase "one aspect of behavior" too, though he still finds it useful to separate that behavior. I tend to err on the side of readability and maintainability, so if it makes pragmatic sense to have more than one assertion, I'll do that.
I tend to assume that getters are little more than an access control wrapper around an otherwise fairly lightweight set of instructions to return a value (or set of values).
As a result, when I find myself writing longer and more CPU-hungry setters, I feel Perhaps this is not the smartest move. In calling a getter in my own code (in particular let's refer to C# where there is a syntactical difference between method vs. getter calls), then I make an implicit assumption that these are lightweight -- when in fact that may well not be the case.
What's the general consensus on this? Use of other people's libraries aside, do you write heavy getters? Or do you tend to treat heavier getters as "full methods"?
PS. Due to language differences, I expect there'll be quite a number of different thoughts on this...
Property getters are intended to retrieve a value. So when developers call them, there is an expectation that the call will return (almost) immediately with a value. If that expectation cannot be met, it is better to use a method instead of a property.
From MSDN:
Property Usage Guidelines
Use a method when:
[...]
The operation is expensive enough that you want to communicate to the
user that they should consider caching
the result.the result.
And also:
Choosing Between Properties and Methods
Do use a method, rather than a
property, in the following situations.
The operation is orders of magnitude slower than a field set would be. If
you are even considering providing an
asynchronous version of an operation
to avoid blocking the thread, it is
very likely that the operation is too
expensive to be a property. In
particular, operations that access the
network or the file system (other than
once for initialization) should most
likely be methods, not properties.
True. Getters should either access a simple member, or should compute and cache a derived value and then return the cached value (subsequent gets without interleaved sets should merely return that value). If I have a function that is going to do a lot of computation, then I name it computeX, not getX.
All in all, very few of my methods are so expensive in terms of time that it would matter based on the guidelines as posted by Thomas. But the thing is that generally calls to a getter should not affect that state of the class. I have no problem writing a getter that actually runs a calculation when called though.
In general, I write short, efficient ones. But you might have complex ones -- you need to consider how the getter will be used. And if it is an external API, you don't have any control how it is used - so shoot for efficiency.
I would agree with this. It is useful to have calculated properties for example for things like Age based on DateOfBirth. But I would avoid complex logic like having to go to a database just to calculate the value of an object's property. Use method in that case.
My opinion is that getter should be lightweight, but again as you say there is a broad definition of "lightweight", adding a logger is fine for tracing purpose, and probably some cache logic too and database/web service retrieval .. ouch. your getter is already considered heavy.
Getter are syntaxic sugar like setters, I consider that method are more flexible because of the simplicity of using them asynchronously.
But there is no expectation set for your getter performance (maybe try to mention it in the cough documentation ), as it could be trying to retrieve fresh values from slow source.
Others are certainly considering getter for simple objects, but as your object could be a proxy for your backend object, I really see not point too set performance expectations as it helps you makes the code more readable and more maintainable.
So my answer would be, "it depends", mainly on the level of abstraction of your object ( short logic for low level object as the value should probably be calculated on the setter level, long ones for hight level ).
I would like to get your opinion on as how far to go with side-effect-free setters.
Consider the following example:
Activity activity;
activity.Start = "2010-01-01";
activity.Duration = "10 days"; // sets Finish property to "2010-01-10"
Note that values for date and duration are shown only for indicative purposes.
So using setter for any of the properties Start, Finish and Duration will consequently change other properties and thus cannot be considered side-effect-free.
Same applies for instances of the Rectangle class, where setter for X is changing the values of Top and Bottom and so on.
The question is where would you draw a line between using setters, which have side-effects of changing values of logically related properties, and using methods, which couldn't be much more descriptive anyway. For example, defining a method called SetDurationTo(Duration duration) also doesn't reflect that either Start or Finish will be changed.
I think you're misunderstanding the term "side-effect" as it applies to program design. Setting a property is a side effect, no matter how much or how little internal state it changes, as long as it changes some sort of state. A "side-effect-free setter" would not be very useful.
Side-effects are something you want to avoid on property getters. Reading the value of a property is something that the caller does not expect to change any state (i.e. cause side-effects), so if it does, it's usually wrong or at least questionable (there are exceptions, such as lazy loading). But getters and setters alike are just wrappers for methods anyway. The Duration property, as far as the CLR is concerned, is just syntactic sugar for a set_Duration method.
This is exactly what abstractions such as classes are meant for - providing coarse-grained operations while keeping a consistent internal state. If you deliberately try to avoid having multiple side-effects in a single property assignment then your classes end up being not much more than dumb data containers.
So, answering the question directly: Where do I draw the line? Nowhere, as long as the method/property actually does what its name implies. If setting the Duration also changed the ActivityName, that might be a problem. If it changes the Finish property, that ought to be obvious; it should be impossible to change the Duration and have both the Start and Finish stay the same. The basic premise of OOP is that objects are intelligent enough to manage these operations by themselves.
If this bothers you at a conceptual level then don't have mutator properties at all - use an immutable data structure with read-only properties where all of the necessary arguments are supplied in the constructor. Then have two overloads, one that takes a Start/Duration and another that takes a Start/Finish. Or make only one of the properties writable - let's say Finish to keep it consistent with Start - and then make Duration read-only. Use the appropriate combination of mutable and immutable properties to ensure that there is only one way to change a certain state.
Otherwise, don't worry so much about this. Properties (and methods) shouldn't have unintended or undocumented side effects, but that's about the only guideline I would use.
Personally, I think it makes sense to have a side-effect to maintain a consistent state. Like you said, it makes sense to change logically-related values. In a sense, the side-effect is expected. But the important thing is to make that point clear. That is, it should be evident that the task the method is performing has some sort of side-effect. So instead of SetDurationTo you could call your function ChangeDurationTo, which implies something else is going on. You could also do this another way by having a function/method that adjusts the duration AdjustDurationTo and pass in a delta value. It would help if you document the function as having a side-effect.
I think another way to look at it is to see if a side-effect is expected. In your example of a Rectangle, I would expect it to change the values of top or bottom to maintain an internally-consistent state. I don't know if this is subjective; it just seems to make sense to me. As always, I think documentation wins out. If there is a side-effect, document it really well. Preferably by the name of the method and through supporting documentation.
One option is to make your class immutable and have methods create and return new instances of the class which have all appropriate values changed. Then there are no side effects or setters. Think of something like DateTime where you can call things like AddDays and AddHours which will return a new DateTime instance with the change applied.
I have always worked with the general rule of not allowing public setters on properties that are not side-effect free since callers of your public setters can't be certain of what might happen, but of course, people that modify the assembly itself should have a pretty good idea as they can see the code.
Of course, there are always times where you have to break the rule for the sake of either readability, to make your object model logical, or just to make things work right. Like you said, really a matter of preference in general.
I think it's mostly a matter of common-sense.
In this particular example, my problem is not so much that you've got properties that adjust "related" properties, it's that you've got properties taking string values that you're then internaly parsing into DateTime (or whatever) values.
I would much rather see something like this:
Activity activity;
activity.Start = DateTime.Parse("2010-01-01");
activity.Duration = Duration.Parse("10 days");
That is, you explicity note that you're doing parsing of strings. Allow the programmer to specify strong-typed objects when that is appropriate as well.