Good or bad practice? Initializing objects in getter - c#

I have a strange habit it seems... according to my co-worker at least. We've been working on a small project together. The way I wrote the classes is (simplified example):
[Serializable()]
public class Foo
{
    public Foo()
    { }

    private Bar _bar;

    public Bar Bar
    {
        get
        {
            if (_bar == null)
                _bar = new Bar();
            return _bar;
        }
        set { _bar = value; }
    }
}
So, basically, I only initialize a field when its getter is called and the field is still null. I figured this would reduce overhead by not initializing any properties that aren't used anywhere.
ETA: The reason I did this is that my class has several properties that return an instance of another class, which in turn also have properties with yet more classes, and so on. Calling the constructor for the top class would subsequently call all constructors for all these classes, when they are not always all needed.
Are there any objections against this practice, other than personal preference?
UPDATE: I have considered the many differing opinions in regards to this question and I will stand by my accepted answer. However, I have now come to a much better understanding of the concept and I'm able to decide when to use it and when not.
Cons:
Thread safety issues
Not obeying a "setter" request when the value passed is null
Micro-optimizations
Exception handling should take place in a constructor
Need to check for null in class' code
Pros:
Micro-optimizations
Properties never return null
Delay or avoid loading "heavy" objects
Most of the cons are not applicable to my current library, however I would have to test to see if the "micro-optimizations" are actually optimizing anything at all.
LAST UPDATE:
Okay, I changed my answer. My original question was whether or not this is a good habit. And I'm now convinced that it's not. Maybe I will still use it in some parts of my current code, but not unconditionally and definitely not all the time. So I'm going to lose my habit and think about it before using it. Thanks everyone!

What you have here is a - naive - implementation of "lazy initialization".
Short answer:
Using lazy initialization unconditionally is not a good idea. It has its places but one has to take into consideration the impacts this solution has.
Background and explanation:
Concrete implementation:
Let's first look at your concrete sample and why I consider its implementation naive:
It violates the Principle of Least Surprise (POLS). When a value is assigned to a property, it is expected that this value is returned. In your implementation this is not the case for null:
foo.Bar = null;
Assert.Null(foo.Bar); // This will fail
It introduces quite some threading issues: Two callers of foo.Bar on different threads can potentially get two different instances of Bar and one of them will be without a connection to the Foo instance. Any changes made to that Bar instance are silently lost.
This is another case of a violation of POLS. When only the stored value of a property is accessed it is expected to be thread-safe. While you could argue that the class simply isn't thread-safe - including the getter of your property - you would have to document this properly as that's not the normal case. Furthermore the introduction of this issue is unnecessary as we will see shortly.
In general:
It's now time to look at lazy initialization in general:
Lazy initialization is usually used to delay the construction of objects that take a long time to be constructed or that take a lot of memory once fully constructed.
That is a very valid reason for using lazy initialization.
However, such properties normally don't have setters, which gets rid of the first issue pointed out above.
Furthermore, a thread-safe implementation would be used - like Lazy<T> - to avoid the second issue.
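The two points above (no setter, thread-safe creation) can be sketched as follows. This is a minimal illustration that reuses the Foo and Bar names from the question; the IsBarCreated property is a hypothetical addition, included only to make the deferred creation observable.

```csharp
using System;

public class Bar { }

public class Foo
{
    // Lazy<T> constructed with a factory delegate defaults to
    // LazyThreadSafetyMode.ExecutionAndPublication, so the factory runs
    // at most once even when several threads read Bar concurrently.
    private readonly Lazy<Bar> _bar = new Lazy<Bar>(() => new Bar());

    // No setter: a caller cannot assign null and then read back a non-null value.
    public Bar Bar => _bar.Value;

    // Hypothetical helper, only here to show that creation is deferred.
    public bool IsBarCreated => _bar.IsValueCreated;
}
```

Every read of foo.Bar returns the same instance, and nothing is constructed until the first read.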
Even when considering these two points in the implementation of a lazy property, the following points are general problems of this pattern:
Construction of the object could be unsuccessful, resulting in an exception from a property getter. This is yet another violation of POLS and therefore should be avoided. Even the section on properties in the "Design Guidelines for Developing Class Libraries" explicitly states that property getters shouldn't throw exceptions:
Avoid throwing exceptions from property getters.
Property getters should be simple operations without any preconditions. If a getter might throw an exception, consider redesigning the property to be a method.
Automatic optimizations by the compiler are hurt, namely inlining and branch prediction. Please see Bill K's answer for a detailed explanation.
The conclusion of these points is the following:
For each single property that is implemented lazily, you should have considered these points.
That means, that it is a per-case decision and can't be taken as a general best practice.
This pattern has its place, but it is not a general best practice when implementing classes. It should not be used unconditionally, because of the reasons stated above.
In this section I want to discuss some of the points others have brought forward as arguments for using lazy initialization unconditionally:
Serialization:
EricJ states in one comment:
An object that may be serialized will not have its constructor invoked when it is deserialized (depends on the serializer, but many common ones behave like this). Putting initialization code in the constructor means that you have to provide additional support for deserialization. This pattern avoids that special coding.
There are several problems with this argument:
Most objects never will be serialized. Adding some sort of support for it when it is not needed violates YAGNI.
When a class needs to support serialization there exist ways to enable it without a workaround that doesn't have anything to do with serialization at first glance.
Micro-optimization:
Your main argument is that you want to construct the objects only when someone actually accesses them. So you are actually talking about optimizing the memory usage.
I don't agree with this argument for the following reasons:
In most cases, a few more objects in memory have no impact whatsoever on anything. Modern computers have more than enough memory. Without a case of actual problems confirmed by a profiler, this is premature optimization and there are good reasons against it.
I acknowledge the fact that sometimes this kind of optimization is justified. But even in these cases lazy initialization doesn't seem to be the correct solution. There are two reasons speaking against it:
Lazy initialization potentially hurts performance. Maybe only marginally, but as Bill's answer showed, the impact is greater than one might think at first glance. So this approach basically trades performance versus memory.
If you have a design where it is a common use case to use only parts of the class, this hints at a problem with the design itself: The class in question most likely has more than one responsibility. The solution would be to split the class into several more focused classes.

It is a good design choice. Strongly recommended for library code or core classes.
It is called by some "lazy initialization" or "delayed initialization" and it is generally considered by all to be a good design choice.
First, if you initialize in the declaration of class level variables or constructor, then when your object is constructed, you have the overhead of creating a resource that may never be used.
Second, the resource only gets created if needed.
Third, you avoid garbage collecting an object that was not used.
Lastly, it is easier to handle initialization exceptions that may occur in the property than exceptions that occur during initialization of class level variables or the constructor.
There are exceptions to this rule.
Regarding the performance argument of the additional check for initialization in the "get" property, it is insignificant. Initializing and disposing an object is a more significant performance hit than a simple null pointer check with a jump.
Design Guidelines for Developing Class Libraries at http://msdn.microsoft.com/en-US/library/vstudio/ms229042.aspx
Regarding Lazy<T>
The generic Lazy<T> class was created exactly for what the poster wants, see Lazy Initialization at http://msdn.microsoft.com/en-us/library/dd997286(v=vs.100).aspx. If you have older versions of .NET, you have to use the code pattern illustrated in the question. This code pattern has become so common that Microsoft saw fit to include a class in the latest .NET libraries to make it easier to implement the pattern. In addition, if your implementation needs thread safety, then you have to add it.
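For versions of .NET without Lazy<T>, adding thread safety to the hand-written pattern might look like the sketch below. It is one possible approach (a plain lock around the check), not code from the question or from this answer; the _barLock field is a name introduced for illustration.

```csharp
public class Bar { }

public class Foo
{
    // Hypothetical private lock object guarding the lazy field.
    private readonly object _barLock = new object();
    private Bar _bar;

    public Bar Bar
    {
        get
        {
            // Locking on every read is the simplest correct option: it costs
            // a little on each access, but two threads can never end up
            // with two different Bar instances.
            lock (_barLock)
            {
                if (_bar == null)
                    _bar = new Bar();
                return _bar;
            }
        }
    }
}
```

The property deliberately has no setter, in line with the "Setters" remark further down.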
Primitive Data Types and Simple Classes
Obviously, you are not going to use lazy initialization for a primitive data type or a simple class like List<string>.
Before Commenting about Lazy
Lazy<T> was introduced in .NET 4.0, so please don't add yet another comment regarding this class.
Before Commenting about Micro-Optimizations
When you are building libraries, you must consider all optimizations. For instance, in the .NET classes you will see bit arrays used for Boolean class variables throughout the code to reduce memory consumption and memory fragmentation, just to name two "micro-optimizations".
Regarding User-Interfaces
You are not going to use lazy initialization for classes that are directly used by the user-interface. Last week I spent the better part of a day removing lazy loading of eight collections used in a view-model for combo-boxes. I have a LookupManager that handles lazy loading and caching of collections needed by any user-interface element.
"Setters"
I have never used a set-property ("setters") for any lazy loaded property. Therefore, you would never allow foo.Bar = null;. If you need to set Bar then I would create a method called SetBar(Bar value) and not use lazy-initialization
Collections
Class collection properties are always initialized when declared because they should never be null.
Complex Classes
Let me repeat this differently, you use lazy-initialization for complex classes. Which are usually, poorly designed classes.
Lastly
I never said to do this for all classes or in all cases. It is a bad habit.

Do you consider implementing such pattern using Lazy<T>?
In addition to easy creation of lazy-loaded objects, you get thread safety while the object is initialized:
http://msdn.microsoft.com/en-us/library/dd642331.aspx
As others said, you lazily-load objects if they're really resource-heavy or it takes some time to load them during object construction-time.

I think it depends on what you are initialising. I probably wouldn't do it for a list as the construction cost is quite small, so it can go in the constructor. But if it was a pre-populated list then I probably wouldn't until it was needed for the first time.
Basically, if the cost of construction outweighs the cost of doing a conditional check on each access, then create it lazily. If not, do it in the constructor.

Lazy instantiation/initialization is a perfectly viable pattern. Keep in mind, though, that as a general rule consumers of your API do not expect getters and setters to take discernable time from the end user POV (or to fail).

The downside that I can see is that if you want to ask if Bars is null, it would never be, and you would be creating the list there.

I was just going to put a comment on Daniel's answer but I honestly don't think it goes far enough.
Although this is a very good pattern to use in certain situations (for instance, when the object is initialized from the database), it's a HORRIBLE habit to get into.
One of the best things about an object is that it offers a secure, trusted environment. The very best case is if you make as many fields as possible "Final", filling them all in with the constructor. This makes your class quite bulletproof. Allowing fields to be changed through setters is a little less so, but not terrible. For instance:
class SafeClass
{
    String name = "";
    Integer age = 0;

    public void setName(String newName)
    {
        assert(newName != null);
        name = newName;
    } // follow this pattern for age
    ...
    public String toString() {
        return "Safe Class has name:" + name + " and age:" + age;
    }
}
With your pattern, the toString method would look like this:
public String toString() {
    if (name == null)
        throw new IllegalStateException("SafeClass got into an illegal state! name is null");
    if (age == null)
        throw new IllegalStateException("SafeClass got into an illegal state! age is null");
    return "Safe Class has name:" + name + " and age:" + age;
}
Not only this, but you need null checks everywhere you might possibly use that object in your class. (Outside your class is safe because of the null check in the getter, but you should mostly be using your class's members inside the class.)
Also your class is perpetually in an uncertain state--for instance if you decided to make that class a hibernate class by adding a few annotations, how would you do it?
If you make any decision based on some micro-optimization without requirements and testing, it's almost certainly the wrong decision. In fact, there is a really good chance that your pattern is actually slowing down the system even under the most ideal of circumstances, because the if statement can cause a branch prediction failure on the CPU, which will slow things down many more times than just assigning a value in the constructor, unless the object you are creating is fairly complex or coming from a remote data source.
For an example of the branch prediction problem (which you are incurring repeatedly, not just once), see the first answer to this awesome question: Why is it faster to process a sorted array than an unsorted array?

Let me just add one more point to many good points made by others...
The debugger will (by default) evaluate the properties when stepping through the code, which could potentially instantiate the Bar sooner than would normally happen by just executing the code. In other words, the mere act of debugging is changing the execution of the program.
This may or may not be a problem (depending on side-effects), but is something to be aware of.

Are you sure Foo should be instantiating anything at all?
To me it seems smelly (though not necessarily wrong) to let Foo instantiate anything at all. Unless it is Foo's express purpose to be a factory, it should not instantiate its own collaborators, but instead get them injected in its constructor.
If however Foo's purpose of being is to create instances of type Bar, then I don't see anything wrong with doing it lazily.
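As a rough sketch of the injection alternative described above, reusing the question's Foo and Bar names (the null check in the constructor is my own assumption about how one might guard the dependency):

```csharp
using System;

public class Bar { }

public class Foo
{
    private readonly Bar _bar;

    // The caller (or a DI container) decides how and when Bar is created;
    // Foo only stores the collaborator it is handed.
    public Foo(Bar bar)
    {
        _bar = bar ?? throw new ArgumentNullException(nameof(bar));
    }

    // Read-only and never null, without any check in the getter.
    public Bar Bar => _bar;
}
```

With this shape the question of lazy versus eager creation moves out of Foo entirely and into whoever constructs it.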

Related

Is it best practice to create a variable if accessing a property of an object more than once in a routine?

When I first began as a junior C# dev, I was always told during code reviews that if I was accessing an object's property more than once in a given scope, then I should create a local variable within the routine, as it was cheaper than having to retrieve it from the object each time. I never really questioned it, as it came from people I perceived to be quite knowledgeable at the time.
Below is a rudimentary example
Example 1: storing an object's identifier in a local variable
public void DoWork(MyDataType item)
{
    long id = item.Id;
    if (ObjectLookup.TryAdd(id, item))
    {
        DoSomeOtherWork(id);
    }
}
Example 2: retrieving the identifier from the object's Id property each time it is needed
public void DoWork(MyDataType item)
{
    if (ObjectLookup.TryAdd(item.Id, item))
    {
        DoSomeOtherWork(item.Id);
    }
}
Does it actually matter or was it more a preference of coding style where I was working? Or perhaps a situational design time choice for the developer to make?
As explained in this answer, if the property is a basic getter/setter then the CLR "will inline the property access and generate code that’s as efficient as accessing a field directly". However, if your property, for example, does some calculations every time the property is accessed, then storing the value of the property in a local variable will avoid the overhead of additional calculations being done.
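To make the second case concrete, here is a small sketch with a getter that does work on every access. The GetterCalls counter, the seed constructor argument and the Worker class are hypothetical names added only so the difference can be observed; they are not part of the question's code.

```csharp
public class MyDataType
{
    private readonly long _seed;

    public MyDataType(long seed) { _seed = seed; }

    // Counts how often the getter body runs (for illustration only).
    public int GetterCalls { get; private set; }

    // Not a trivial getter: some work happens on every access.
    public long Id
    {
        get
        {
            GetterCalls++;
            return _seed * 2;
        }
    }
}

public static class Worker
{
    // Reads the property twice, so the getter body runs twice.
    public static long UseTwiceDirect(MyDataType item) => item.Id + item.Id;

    // Reads the property once and reuses the local copy.
    public static long UseTwiceCached(MyDataType item)
    {
        long id = item.Id;
        return id + id;
    }
}
```

Both methods return the same value; the only difference is how many times the getter's work is repeated.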
All the memory allocation stuff aside, there is the principle of DRY(don't repeat yourself). When you can deal with one variable with a short name rather than repeating the object nesting to access the external property, why not do that?
Apart from that, by creating that local variable you are respecting the single responsibility principle by isolating the methods from the external entity they don't need to know about.
And lastly, if the so-called reusing leads to unwanted instantiation of reference types or any repetitive calculation, then it is a must to create the local variable and reuse it throughout the class/method.
Any way you look at it, this practice helps with readability and more maintainable code, and possibly safer too.
I don't know if it is faster or not (though I would say that the difference is negligible and thus unimportant), but I'll cook up some benchmark for you.
What IS important though will be made evident to you with an example;
public class MyDataType
{
    public int id {
        get {
            // Some actual code
            return this.GetHashCode() * 2;
        }
    }
}
Does this make more sense? The first time I will access the id Getter, some code will be executed. The second time, the same code will be executed costing twice as much with no need.
It is very probable, that the reviewers had some such case in mind and instead of going into every single one property and check what you are doing and if it is safe to access, they created a new rule.
Another reason to store the value would be usability.
Imagine the following example
object.subObject.someOtherSubObject.id
In this case I ask in reviews to store to a variable even if they use it just once. That is because if this is used in a complicated if statement, it will reduce the readability and maintainability of the code in the future.
A local variable is essentially guaranteed to be fast, whereas there is an unknown amount of overhead involved in accessing the property.
It's almost always a good idea to avoid repeating code whenever possible. Storing the value once means that there is only one thing to change if it needs changing, rather than two or more.
Using a variable allows you to provide a name, which gives you an opportunity to describe your intent.
I would also point out that if you're referring to other members of an object a lot in one place, that can often be a strong indication that the code you're writing actually belongs in that other type instead.
You should consider that the value behind a getter or method may be calculated by an I/O-bound or CPU-bound process, in which case fetching it repeatedly is wasteful. It is therefore better to define a local variable and store the result, to avoid doing the same processing several times.
When you are using a value like object.Id, copying it into a local variable also guarantees that the value you work with will not change within the scope, even if the property itself does. (Note that a C# const local requires a compile-time constant, so const cannot be used for a value read from a property.)
Finally, as a general rule it's better to use a local variable within your methods.

Should I throw on null parameters in private/internal methods?

I'm writing a library that has several public classes and methods, as well as several private or internal classes and methods that the library itself uses.
In the public methods I have a null check and a throw like this:
public int DoSomething(int number)
{
    if (number == null)
    {
        throw new ArgumentNullException(nameof(number));
    }
}
But then this got me thinking, to what level should I be adding parameter null checks to methods? Do I also start adding them to private methods? Should I only do it for public methods?
Ultimately, there isn't a uniform consensus on this. So instead of giving a yes or no answer, I'll try to list the considerations for making this decision:
Null checks bloat your code. If your procedures are concise, the null guards at the beginning of them may form a significant part of the overall size of the procedure, without expressing the purpose or behaviour of that procedure.
Null checks expressively state a precondition. If a method is going to fail when one of the values is null, having a null check at the top is a good way to demonstrate this to a casual reader without them having to hunt for where it's dereferenced. To improve this, people often use helper methods with names like Guard.AgainstNull, instead of having to write the check each time.
Checks in private methods are untestable. By introducing a branch in your code which you have no way of fully traversing, you make it impossible to fully test that method. This conflicts with the point of view that tests document the behaviour of a class, and that that class's code exists to provide that behaviour.
The severity of letting a null through depends on the situation. Often, if a null does get into the method, it'll be dereferenced a few lines later and you'll get a NullReferenceException. This really isn't much less clear than throwing an ArgumentNullException. On the other hand, if that reference is passed around quite a bit before being dereferenced, or if throwing an NRE will leave things in a messy state, then throwing early is much more important.
Some libraries, like .NET's Code Contracts, allow a degree of static analysis, which can add an extra benefit to your checks.
If you're working on a project with others, there may be existing team or project standards covering this.
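The guard-helper idea from the list above could look roughly like this. Guard.AgainstNull is not a built-in API; it is a hypothetical helper written here for illustration, and the Greeter class exists only to show it in use.

```csharp
using System;

public static class Guard
{
    // Hypothetical helper, named after the Guard.AgainstNull convention:
    // throws if the value is null, otherwise hands it back to the caller.
    public static T AgainstNull<T>(T value, string paramName) where T : class
    {
        if (value == null)
            throw new ArgumentNullException(paramName);
        return value;
    }
}

public class Greeter
{
    private readonly string _greeting;

    public Greeter(string greeting)
    {
        // One line states the precondition and stores the value.
        _greeting = Guard.AgainstNull(greeting, nameof(greeting));
    }

    public string Greet(string name)
    {
        Guard.AgainstNull(name, nameof(name));
        return _greeting + ", " + name;
    }
}
```

The precondition stays visible at the top of each method, while the if/throw boilerplate lives in one place.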
If you're not a library developer, don't be defensive in your code
Write unit tests instead
In fact, even if you're developing a library, throwing is most of the time: BAD
1. Testing null on int must never be done in C#:
It raises warning CS0472, because the result of the comparison is always false.
2. Throwing an Exception means it's exceptional: abnormal and rare.
It should never raise in production code. Especially because exception stack trace traversal can be a cpu intensive task. And you'll never be sure where the exception will be caught, if it's caught and logged or just simply silently ignored (after killing one of your background thread) because you don't control the user code. There is no "checked exception" in c# (like in java) which means you never know - if it's not well documented - what exceptions a given method could raise. By the way, that kind of documentation must be kept in sync with the code which is not always easy to do (increase maintenance costs).
3. Exceptions increase maintenance costs.
As exceptions are thrown at runtime and only under certain conditions, they may be detected really late in the development process. As you may already know, the later an error is detected in the development process, the more expensive the fix will be. I've even seen exception-raising code make its way into production and not throw for a week, only to throw every day thereafter (killing production. Oops!).
4. Throwing on invalid input means you don't control input.
It's the case for public methods of libraries. However if you can check it at compile time with another type (for example a non nullable type like int) then it's the way to go. And of course, as they are public, it's their responsibility to check for input.
Imagine the user who uses what he thinks is valid data and then, by a side effect, a method deep in the stack trace throws an ArgumentNullException.
What will be his reaction?
How can he cope with that?
Will it be easy for you to provide an explanation message ?
5. Private and internal methods should never ever throw exceptions related to their input.
You may throw exceptions in your code because an external component (maybe Database, a file or else) is misbehaving and you can't guarantee that your library will continue to run correctly in its current state.
Making a method public doesn't mean that it should (only that it can) be called from outside of your library (Look at Public versus Published from Martin Fowler). Use IOC, interfaces, factories and publish only what's needed by the user, while making the whole library classes available for unit testing. (Or you can use the InternalsVisibleTo mechanism).
6. Throwing exceptions without any explanation message is making fun of the user
No need to remind you what feelings one can have when a tool is broken, without having any clue on how to fix it. Yes, I know. You come to SO and ask a question...
7. Invalid input means it breaks your code
If your code can produce a valid output with the value then it's not invalid and your code should manage it. Add a unit test to test this value.
8. Think in user terms:
Do you like when a library you use throws exceptions for smashing your face ? Like: "Hey, it's invalid, you should have known that!"
Even if from your point of view - with your knowledge of the library internals, the input is invalid, how you can explain it to the user (be kind and polite):
Clear documentation (in Xml doc and an architecture summary may help).
Publish the xml doc with the library.
Clear error explanation in the exception if any.
Give the choice :
Look at Dictionary class, what do you prefer? what call do you think is the fastest ? What call can raises exception ?
Dictionary<string, string> dictionary = new Dictionary<string, string>();
string res;
dictionary.TryGetValue("key", out res);
or
var other = dictionary["key"];
9. Why not using Code Contracts ?
It's an elegant way to avoid the ugly if then throw and isolate the contract from the implementation, permitting to reuse the contract for different implementations at the same time. You can even publish the contract to your library user to further explain him how to use the library.
As a conclusion, even if you can easily use throw, even if you can experience exceptions raising when you use .Net Framework, that doesn't mean it could be used without caution.
Here are my opinions:
General Cases
Generally speaking, it is better to check for any invalid inputs before you process them in a method for robustness reason - be it private, protected, internal, protected internal, or public methods. Although there are some performance costs paid for this approach, in most cases, this is worth doing rather than paying more time to debug and to patch the codes later.
Strictly Speaking, however...
Strictly speaking, however, it is not always needed to do so. Some methods, usually private ones, can be left without any input checking, provided that you can fully guarantee that the method is never called with invalid inputs. This may give you some performance benefit, especially if the method is called frequently to do some basic computation/action. In such cases, checking input validity may impair performance significantly.
Public Methods
Now the public method is trickier. This is because, more strictly speaking, although the access modifier alone can tell who can use the methods, it cannot tell who will use the methods. More over, it also cannot tell how the methods are going to be used (that is, whether the methods are going to be called with invalid inputs in the given scopes or not).
The Ultimate Determining Factor
Although access modifiers for methods in the code can hint on how to use the methods, ultimately, it is humans who will use the methods, and it is up to the humans how they are going to use them and with what inputs. Thus, in some rare cases, it is possible to have a public method which is only called in some private scope and in that private scope, the inputs for the public methods are guaranteed to be valid before the public method is called.
In such cases then, even the access modifier is public, there isn't any real need to check for invalid inputs, except for robust design reason. And why is this so? Because there are humans who know completely when and how the methods shall be called!
Here we can see, there is no guarantee either that public method always require checking for invalid inputs. And if this is true for public methods, it must also be true for protected, internal, protected internal, and private methods as well.
Conclusions
So, in conclusion, we can say a couple of things to help us make decisions:
Generally, it is better to have checks for any invalid inputs for robust design reason, provided that performance is not at stake. This is true for any type of access modifiers.
The invalid inputs check could be skipped if performance gain could be significantly improved by doing so, provided that it can also be guaranteed that the scope where the methods are called are always giving the methods valid inputs.
private method is usually where we skip such checking, but there is no guarantee that we cannot do that for public method as well
Humans are the ones who ultimately use the methods. Regardless of how the access modifiers can hint the use of the methods, how the methods are actually used and called depend on the coders. Thus, we can only say about general/good practice, without restricting it to be the only way of doing it.
The public interface of your library deserves tight checking of preconditions, because you should expect the users of your library to make mistakes and violate the preconditions by accident. Help them understand what is going on in your library.
The private methods in your library do not require such runtime checking because you call them yourself. You are in full control of what you are passing. If you want to add checks because you are afraid to mess up, then use asserts. They will catch your own mistakes, but do not impede performance during runtime.
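A minimal sketch of that split, assuming a hypothetical TextProcessor class: the public method throws on bad input, while the private helper only asserts. Debug.Assert calls are compiled out when the DEBUG symbol is not defined, so they cost nothing in a release build.

```csharp
using System;
using System.Diagnostics;

public class TextProcessor
{
    // Public entry point: validate, because callers are outside our control.
    public int CountWords(string text)
    {
        if (text == null)
            throw new ArgumentNullException(nameof(text));
        return CountWordsCore(text);
    }

    // Private helper: only called from inside this class, so an assertion
    // documents the precondition and catches our own mistakes in debug builds.
    private static int CountWordsCore(string text)
    {
        Debug.Assert(text != null, "CountWordsCore requires non-null text");
        return text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}
```

If a null ever reached CountWordsCore in a release build, it would surface as a NullReferenceException at the Split call rather than as an argument exception.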
Though you tagged this language-agnostic, it seems to me that there probably isn't a general answer.
Notably, in your example you type-hinted the argument, so in a language that supports hinting an error is raised as soon as the function is entered, before you can take any action.
In such a case, the only solution is to check the argument before calling your function... but since you're writing a library, that makes no sense!
On the other hand, without hinting, it remains realistic to check inside the function.
So at this stage of the reflection, I'd already suggest giving up hinting.
Now let's go back to your precise question: to what level should it be checked?
For a given piece of data, the check would happen only at the highest level where it can "enter" (there may be several such places for the same data), so logically it would concern only public methods.
That's the theory. But maybe you plan a huge, complex library, so it might not be easy to be certain that you have covered all "entry points".
In this case, I'd suggest the opposite: apply your checks everywhere, then omit them only where you clearly see they are duplicates.
Hope this helps.
In my opinion you should ALWAYS check for "invalid" data - independent whether it is a private or public method.
Looked from the other way... why should you be able to work with something invalid just because the method is private? Doesn't make sense, right? Always try to use defensive programming and you will be happier in life ;-)
This is a question of preference. But consider instead why are you checking for null or rather checking for valid input. It's probably because you want to let the consumer of your library to know when he/she is using it incorrectly.
Let's imagine that we have implemented a class PersonList in a library. This list can only contain objects of the type Person. We have also on our PersonList implemented some operations and therefore we do not want it to contain any null values.
Consider the two following implementations of the Add method for this list:
Implementation 1
public void Add(Person item)
{
    if (_size == _items.Length)
    {
        EnsureCapacity(_size + 1);
    }
    _items[_size++] = item;
}
Implementation 2
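public void Add(Person item)
{
    if (item == null)
    {
        // The single-string constructor takes the parameter name, not a message,
        // so pass both the parameter name and the message.
        throw new ArgumentNullException(nameof(item), "Cannot add null to PersonList");
    }
    if (_size == _items.Length)
    {
        EnsureCapacity(_size + 1);
    }
    _items[_size++] = item;
}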
Let's say we go with implementation 1
Null values can now be added in the list
All operations implemented on the list will have to handle these null values
If we check for null and throw an exception in each operation instead, the consumer will be notified about the exception when he/she calls one of the operations, and at that stage it will be very unclear what he/she has done wrong (it just wouldn't make any sense to go for this approach).
If we instead choose to go with implementation 2, we make sure input to our library has the quality that we require for our class to operate on it. This means we only need to handle this here and then we can forget about it while we are implementing our other operations.
It will also become clearer to the consumer that he/she is using the library in the wrong way when he/she gets an ArgumentNullException on .Add instead of in .Sort or similar.
To sum it up, my preference is to check for valid arguments where they are supplied by the consumer, rather than in the private/internal methods of the library. This basically means we have to check arguments in constructors/methods that are public and take parameters. Our private/internal methods can only be called from our public ones, which have already checked the input, so we are good to go!
Using Code Contracts should also be considered when verifying input.
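As a rough sketch of the summary above (not part of the original answer): the public Add validates its argument, and a private helper, here given the hypothetical name AddCore, trusts whatever reaches it. The Person class and the Array.Resize growth strategy are assumptions made only to keep the example self-contained.

```csharp
using System;

public class Person
{
    public string Name { get; }
    public Person(string name) { Name = name; }
}

public class PersonList
{
    private Person[] _items = new Person[4];
    private int _size;

    public int Count => _size;

    // Public entry point: the only place a consumer can hand us a Person,
    // so this is where null is rejected.
    public void Add(Person item)
    {
        if (item == null)
        {
            throw new ArgumentNullException(nameof(item), "Cannot add null to PersonList");
        }
        AddCore(item);
    }

    // Private helper (hypothetical name): only reachable through Add,
    // so it does not repeat the null check.
    private void AddCore(Person item)
    {
        if (_size == _items.Length)
        {
            Array.Resize(ref _items, _items.Length * 2);
        }
        _items[_size++] = item;
    }
}
```

With this layout a consumer who passes null gets the exception at the call to Add, and every other operation on the list can assume its elements are non-null.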

"Getters should not include large amounts of logic." True or false?

I tend to assume that getters are little more than an access control wrapper around an otherwise fairly lightweight set of instructions to return a value (or set of values).
As a result, when I find myself writing longer and more CPU-hungry getters, I feel that perhaps this is not the smartest move. When calling a getter in my own code (in particular let's refer to C#, where there is a syntactic difference between method and getter calls), I make an implicit assumption that it is lightweight, when in fact that may well not be the case.
What's the general consensus on this? Use of other people's libraries aside, do you write heavy getters? Or do you tend to treat heavier getters as "full methods"?
PS. Due to language differences, I expect there'll be quite a number of different thoughts on this...
Property getters are intended to retrieve a value. So when developers call them, there is an expectation that the call will return (almost) immediately with a value. If that expectation cannot be met, it is better to use a method instead of a property.
From MSDN:
Property Usage Guidelines
Use a method when:
[...]
The operation is expensive enough that you want to communicate to the
user that they should consider caching the result.
And also:
Choosing Between Properties and Methods
Do use a method, rather than a
property, in the following situations.
The operation is orders of magnitude slower than a field set would be. If
you are even considering providing an
asynchronous version of an operation
to avoid blocking the thread, it is
very likely that the operation is too
expensive to be a property. In
particular, operations that access the
network or the file system (other than
once for initialization) should most
likely be methods, not properties.
True. Getters should either access a simple member, or should compute and cache a derived value and then return the cached value (subsequent gets without interleaved sets should merely return that value). If I have a function that is going to do a lot of computation, then I name it computeX, not getX.
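A minimal sketch of the "compute and cache" getter described above, assuming a hypothetical Circle class. The ComputeCount property is not something you would normally ship; it is there only to make the caching visible.

```csharp
using System;

public class Circle
{
    private double _radius;
    private double? _area;   // cached derived value; null means "not computed yet"

    // Illustration only: counts how often the area was actually computed.
    public int ComputeCount { get; private set; }

    public double Radius
    {
        get { return _radius; }
        set
        {
            _radius = value;
            _area = null;    // a set invalidates the cached value
        }
    }

    public double Area
    {
        get
        {
            if (_area == null)
            {
                _area = Math.PI * _radius * _radius;
                ComputeCount++;
            }
            return _area.Value;
        }
    }
}
```

Reading Area twice in a row computes it once; changing Radius in between forces one recomputation on the next read.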
All in all, very few of my methods are so expensive in terms of time that it would matter based on the guidelines as posted by Thomas. But the thing is that, generally, calls to a getter should not affect the state of the class. I have no problem writing a getter that actually runs a calculation when called, though.
In general, I write short, efficient ones. But you might have complex ones; you need to consider how the getter will be used. And if it is an external API, you don't have any control over how it is used, so shoot for efficiency.
I would agree with this. It is useful to have calculated properties, for example for things like Age based on DateOfBirth. But I would avoid complex logic like having to go to a database just to calculate the value of an object's property. Use a method in that case.
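For what it's worth, a calculated property of the Age/DateOfBirth kind could look roughly like this. The Employee class and the AgeOn helper are hypothetical; AgeOn exists so the arithmetic can be checked against a fixed date instead of DateTime.Today.

```csharp
using System;

public class Employee
{
    public DateTime DateOfBirth { get; }

    public Employee(DateTime dateOfBirth) { DateOfBirth = dateOfBirth; }

    // Cheap calculated property: pure arithmetic on data the object already holds.
    public int Age => AgeOn(DateTime.Today);

    // Same calculation against an explicit date (hypothetical helper).
    public int AgeOn(DateTime date)
    {
        int age = date.Year - DateOfBirth.Year;
        // Subtract one if the birthday has not yet occurred in that year.
        if (date.Date < DateOfBirth.Date.AddYears(age))
        {
            age--;
        }
        return age;
    }
}
```

Nothing here touches a database, the network or the file system, which is what keeps it reasonable as a property.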
My opinion is that a getter should be lightweight, but as you say, "lightweight" has a broad definition. Adding a logger for tracing purposes is fine, and probably some cache logic too; add database or web service retrieval and... ouch, your getter is already considered heavy.
Getters, like setters, are syntactic sugar. I consider methods more flexible because they are simpler to use asynchronously.
But there is no expectation set for your getter's performance (maybe try to mention it in the *cough* documentation), as it could be trying to retrieve fresh values from a slow source.
Others certainly think of getters only for simple objects, but since your object could be a proxy for a backend object, I really see no point in setting performance expectations, as the getter helps make the code more readable and more maintainable.
So my answer would be "it depends", mainly on the level of abstraction of your object (short logic for low-level objects, as the value should probably be calculated at the setter level; longer logic for high-level ones).

Approach to side-effect-free setters

I would like to get your opinion on as how far to go with side-effect-free setters.
Consider the following example:
Activity activity;
activity.Start = "2010-01-01";
activity.Duration = "10 days"; // sets Finish property to "2010-01-10"
Note that values for date and duration are shown only for indicative purposes.
So using setter for any of the properties Start, Finish and Duration will consequently change other properties and thus cannot be considered side-effect-free.
Same applies for instances of the Rectangle class, where setter for X is changing the values of Top and Bottom and so on.
The question is where would you draw a line between using setters, which have side-effects of changing values of logically related properties, and using methods, which couldn't be much more descriptive anyway. For example, defining a method called SetDurationTo(Duration duration) also doesn't reflect that either Start or Finish will be changed.
I think you're misunderstanding the term "side-effect" as it applies to program design. Setting a property is a side effect, no matter how much or how little internal state it changes, as long as it changes some sort of state. A "side-effect-free setter" would not be very useful.
Side-effects are something you want to avoid on property getters. Reading the value of a property is something that the caller does not expect to change any state (i.e. cause side-effects), so if it does, it's usually wrong or at least questionable (there are exceptions, such as lazy loading). But getters and setters alike are just wrappers for methods anyway. The Duration property, as far as the CLR is concerned, is just syntactic sugar for a set_Duration method.
This is exactly what abstractions such as classes are meant for - providing coarse-grained operations while keeping a consistent internal state. If you deliberately try to avoid having multiple side-effects in a single property assignment then your classes end up being not much more than dumb data containers.
So, answering the question directly: Where do I draw the line? Nowhere, as long as the method/property actually does what its name implies. If setting the Duration also changed the ActivityName, that might be a problem. If it changes the Finish property, that ought to be obvious; it should be impossible to change the Duration and have both the Start and Finish stay the same. The basic premise of OOP is that objects are intelligent enough to manage these operations by themselves.
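One possible way to get that behaviour, sketched under my own assumptions rather than taken from the question: store only Start and Duration and derive Finish, so the three values cannot drift apart. Here Finish is simply Start + Duration (an exclusive end), and setting Finish keeps Start fixed; both are choices made for the example.

```csharp
using System;

public class Activity
{
    // Only Start and Duration are stored; Finish is always derived from them.
    public DateTime Start { get; set; }
    public TimeSpan Duration { get; set; }

    public DateTime Finish
    {
        get { return Start + Duration; }
        // Assumption: moving Finish keeps Start fixed and changes Duration.
        set
        {
            if (value < Start)
            {
                throw new ArgumentOutOfRangeException(nameof(value), "Finish cannot precede Start.");
            }
            Duration = value - Start;
        }
    }
}
```

Setting Duration moves Finish, setting Finish changes Duration, and setting Start shifts Finish while keeping Duration; whichever rules you pick, they should be documented.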
If this bothers you at a conceptual level then don't have mutator properties at all - use an immutable data structure with read-only properties where all of the necessary arguments are supplied in the constructor. Then have two overloads, one that takes a Start/Duration and another that takes a Start/Finish. Or make only one of the properties writable - let's say Finish to keep it consistent with Start - and then make Duration read-only. Use the appropriate combination of mutable and immutable properties to ensure that there is only one way to change a certain state.
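The immutable variant with two overloads might be sketched like this. The class name ActivityPeriod is made up for the example, and Finish is again taken to be Start + Duration.

```csharp
using System;

public sealed class ActivityPeriod
{
    public DateTime Start { get; }
    public TimeSpan Duration { get; }
    public DateTime Finish => Start + Duration;

    // Overload 1: Start and Duration.
    public ActivityPeriod(DateTime start, TimeSpan duration)
    {
        if (duration < TimeSpan.Zero)
        {
            throw new ArgumentOutOfRangeException(nameof(duration));
        }
        Start = start;
        Duration = duration;
    }

    // Overload 2: Start and Finish; delegates to the first overload.
    public ActivityPeriod(DateTime start, DateTime finish)
        : this(start, finish - start)
    {
    }
}
```

Since there are no setters, the question of which property "drags along" which other property never comes up; a changed schedule is a new object.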
Otherwise, don't worry so much about this. Properties (and methods) shouldn't have unintended or undocumented side effects, but that's about the only guideline I would use.
Personally, I think it makes sense to have a side-effect to maintain a consistent state. Like you said, it makes sense to change logically-related values. In a sense, the side-effect is expected. But the important thing is to make that point clear. That is, it should be evident that the task the method is performing has some sort of side-effect. So instead of SetDurationTo you could call your function ChangeDurationTo, which implies something else is going on. You could also do this another way by having a function/method that adjusts the duration AdjustDurationTo and pass in a delta value. It would help if you document the function as having a side-effect.
I think another way to look at it is to see if a side-effect is expected. In your example of a Rectangle, I would expect it to change the values of top or bottom to maintain an internally-consistent state. I don't know if this is subjective; it just seems to make sense to me. As always, I think documentation wins out. If there is a side-effect, document it really well. Preferably by the name of the method and through supporting documentation.
One option is to make your class immutable and have methods create and return new instances of the class which have all appropriate values changed. Then there are no side effects or setters. Think of something like DateTime where you can call things like AddDays and AddHours which will return a new DateTime instance with the change applied.
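In the spirit of DateTime.AddDays, here is a small sketch for the Rectangle case. The member names (MoveTo, Resize, Right, Bottom) are my own assumptions, not an existing API.

```csharp
public sealed class Rectangle
{
    public int X { get; }
    public int Y { get; }
    public int Width { get; }
    public int Height { get; }

    // Derived edges: always consistent because nothing can change after construction.
    public int Right => X + Width;
    public int Bottom => Y + Height;

    public Rectangle(int x, int y, int width, int height)
    {
        X = x;
        Y = y;
        Width = width;
        Height = height;
    }

    // Each "mutation" returns a new instance and leaves the original untouched.
    public Rectangle MoveTo(int x, int y) => new Rectangle(x, y, Width, Height);

    public Rectangle Resize(int width, int height) => new Rectangle(X, Y, width, height);
}
```

Callers write something like `var moved = rect.MoveTo(3, 4);` and keep using `rect` as it was, exactly as with `date.AddDays(10)`.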
I have always worked with the general rule of not allowing public setters on properties that are not side-effect free since callers of your public setters can't be certain of what might happen, but of course, people that modify the assembly itself should have a pretty good idea as they can see the code.
Of course, there are always times where you have to break the rule for the sake of either readability, to make your object model logical, or just to make things work right. Like you said, really a matter of preference in general.
I think it's mostly a matter of common-sense.
In this particular example, my problem is not so much that you've got properties that adjust "related" properties, it's that you've got properties taking string values that you're then internally parsing into DateTime (or whatever) values.
I would much rather see something like this:
Activity activity;
activity.Start = DateTime.Parse("2010-01-01");
activity.Duration = Duration.Parse("10 days");
That is, you explicitly note that you're parsing strings. Allow the programmer to specify strongly typed objects when that is appropriate as well.
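A self-contained sketch of the same idea: the Duration type in the snippet above is not a framework type, so TimeSpan stands in for it here, and the ScheduledActivity and ActivityInput names are hypothetical. The properties are strongly typed, and string parsing is a separate, visible step.

```csharp
using System;
using System.Globalization;

public class ScheduledActivity
{
    // Strongly typed properties: no hidden string parsing in the setters.
    public DateTime Start { get; set; }
    public TimeSpan Duration { get; set; }
}

public static class ActivityInput
{
    // Parsing is explicit and lives at the boundary where text enters the program.
    // Assumed formats: "yyyy-MM-dd" for the date, a whole number of days for the duration.
    public static ScheduledActivity FromText(string startText, string durationDaysText)
    {
        return new ScheduledActivity
        {
            Start = DateTime.ParseExact(startText, "yyyy-MM-dd", CultureInfo.InvariantCulture),
            Duration = TimeSpan.FromDays(int.Parse(durationDaysText, CultureInfo.InvariantCulture)),
        };
    }
}
```

Code that already has a DateTime and a TimeSpan assigns them directly; only code that starts from text goes through FromText.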

Should you use accessor properties from within the class, or just from outside of the class? [duplicate]

I have a class 'Data' that uses a getter to access some array. If the array is null, then I want Data to access the file, fill up the array, and then return the specific value.
Now here's my question:
When creating getters and setters should you also use those same accessor properties as your way of accessing that array (in this case)? Or should you just access the array directly?
The problem I am having using the accessors from within the class is that I get infinite loops as the calling class looks for some info in Data.array, the getter finds the array null so goes to get it from the file, and that function ends up calling the getter again from within Data, array is once again null, and we're stuck in an infinite loop.
EDIT:
So is there no official stance on this? I see the wisdom in not using accessors with file access in them, but some of you are saying to always use accessors from within a class, and others are saying to never use accessors from within the class...
I agree with krosenvold, and want to generalize his advice a bit:
Do not use Property getters and setters for expensive operations, like reading a file or accessing the network. Use explicit function calls for the expensive operations.
Generally, users of the class will not expect that a simple property retrieval or assignment may take a lot of time.
This is also recommended in Microsoft's Framework Design Guidelines:
Do use a method, rather than a
property, in the following situations.
The operation is orders of magnitude
slower than a field set would be. If
you are even considering providing an
asynchronous version of an operation
to avoid blocking the thread, it is
very likely that the operation is too
expensive to be a property. In
particular, operations that access the
network or the file system (other than
once for initialization) should most
likely be methods, not properties.
I think it's a good idea to always use the accessors. Then if you need any special logic when getting or setting the property, you know that everything is performing that logic.
Can you post the getter and setter for one of these properties? Maybe we can help debug it.
I have written a getter that opens a file and always regretted it later. Nowadays I would never solve that problem by lazy-constructing through the getter, period. There's the issue of getters with side effects, where people don't expect all kinds of crazy activity to be going on behind the getter. Furthermore, you probably have to ensure thread safety, which can further pollute this code. Unit testing can also become slightly harder each time you do this.
Explicit construction is a much better solution than all sorts of lazy-init getters. It may be because I'm using DI frameworks that give me all of this as part of the standard usage patterns. I really try to treat construction logic as distinctly as possible and not hide too much, it makes code easier to understand.
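Without any DI framework, explicit construction could be sketched as follows. This is my own illustration of the idea: the constructor taking a string[] and the LoadFromFile factory are assumptions about what the asker's Data class might look like.

```csharp
using System;
using System.IO;

public class Data
{
    private readonly string[] _lines;

    // The constructor receives ready-made data; it never touches the file system,
    // so tests can build a Data instance from an in-memory array.
    public Data(string[] lines)
    {
        _lines = lines ?? throw new ArgumentNullException(nameof(lines));
    }

    public int Count => _lines.Length;

    // Plain lookup: no lazy loading and no I/O behind the accessor.
    public string this[int index] => _lines[index];

    // Explicit, clearly named loading step: callers can see that I/O happens here.
    public static Data LoadFromFile(string path)
    {
        return new Data(File.ReadAllLines(path));
    }
}
```

Because the array is filled before the object exists, there is no null state for a getter to repair, and the infinite-loop problem from the question cannot arise.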
No. I don't believe you should, the reason: maintainable code.
I've seen people use properties within the defining class and at first all looks well. Then someone else comes along and adds features to the properties, then someone else comes along and tries to change the class, they don't fully understand the class and all hell breaks loose.
It shouldn't, because maintenance teams should fully understand what they are trying to change, but they are often looking at a different problem or error and the encapsulated property often escapes them. I've seen this a lot and so never use properties internally.
They can also be a performance hog, what should be a simple lookup can turn nasty if someone puts database code in the properties - and I have seen people do that too!
The KISS principle is still valid after all these years...!
Aside from the points made by others, whether to use an accessor or a field directly may need to be informed by semantics. Sometimes the semantics of an external consumer accessing a property are different from the mechanical necessity of accessing its value from internal code.
Eric Lippert recently blogged on this subject in a couple of posts:
automatic-vs-explicit-properties
future-proofing-a-design
If using a Get method leads to this kind of error, you should access the value directly. Otherwise, it is good practice to use your accessors. If you should modify either the getter or setter to take specific actions in the future, you'll break your object if you fail to use that path.
I guess what you are trying to implement is some sort of a lazy-loading property, where you load the data only when it is accessed for the first time.
In such a case I would use the following approach to prevent the infinite loop:
private MyData _data = null;
public MyData Data
{
get
{
if (_data == null)
_data = LoadDataFromFile();
return _data;
}
}
private MyData LoadDataFromFile()
{
// ...
}
In other words:
don't implement a setter
always use the property to access the data (never use the field directly)
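If thread safety matters, the same lazy-loading pattern can be expressed with Lazy<T>. This is a variation on the code above rather than what the answer proposed: the DataHolder class and the injected loader delegate are assumptions that stand in for LoadDataFromFile so the sketch stays self-contained.

```csharp
using System;

public class MyData
{
    public string[] Values { get; }
    public MyData(string[] values) { Values = values; }
}

public class DataHolder
{
    private readonly Lazy<MyData> _data;

    // The loader is injected here; in the answer above it would be LoadDataFromFile.
    public DataHolder(Func<MyData> loader)
    {
        // Lazy<T>(Func<T>) defaults to LazyThreadSafetyMode.ExecutionAndPublication,
        // so the loader runs at most once even under concurrent access.
        _data = new Lazy<MyData>(loader);
    }

    // Still read-only, and still the single way to reach the data.
    public MyData Data => _data.Value;
}
```

The loader must not read the Data property itself, otherwise you are back to the recursion described in the question.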
You should always use the accessors, but the function that reads the value from the file (which should be private, and called something like getValueFromFile) should only be called when the value has to be read from the file, and should just read the file and return the value(s). That function might even be better off in another class, dedicated to reading values from your data file.
If I am understanding it right, you are trying to access a property from within its implementation (by using a method that calls the same property in the property's implementation code). I am not sure if there are any official standards regarding this, but I would consider it a bad practice, unless there is a specific need to do it.
I always prefer using private members within a class instead of properties, unless I need the functionality property implementation provides.
