I'm programming a board game in C# where the two most important classes are Piece and Square. Typically every instance of Piece has a Square (as a property) and every instance of Square may have a Piece (also as a property) or may be empty.
I placed code in the set methods of Square.Piece and Piece.Square to ensure that this relationship was maintained (e.g. when a piece is moved from one square to another) and to avoid the obvious danger of the linked properties calling each other's set methods in an endless loop.
But when a piece is removed from the board and its square set to 'null' I seem to have too many if statements to avoid null exceptions and what seemed a very simple pattern conceptually becomes far too complex and error-prone in practice.
I'm wondering whether my whole approach is wrong. Should a piece have a square as a property when Square also has Piece as a property? Have I in fact started coding an anti-pattern? A Piece's Square may be null on creation, and a Square's Piece is frequently null (representing empty). Should I use another way to represent empty Squares and Pieces which are not on the board?
I'm assuming there are preferred, robust solutions to the more general case of when one class is linked to another in a two-way relationship such as this.
Many thanks for any ideas.
Use an higher level abstraction, with methods like
move(piece, originalSquare, destinationSqueare)
place(piece, square)
remove(piece)
promote(originalPiece, finalPiece)
...
these methods will use your basic methods from Piece and Square, and will be the ones used by your main logic
It seems to me that you're overthinking the process of removing a Piece from the Board.
The game is going to be centered on the Board class. Once you remove a Piece from the Board, that piece is gone; you're not going to do anything with it anymore. You won't be calling any of its methods or anything. That object is now dead -- unreachable. So there's not actually any reason to change the Piece's state to null out its Board reference.
This is indeed a very common pattern, and the usual practice is to just let it go. Once nobody references the Piece anymore, you're done; it no longer matters that the Piece still happens to reference the Board. Let it keep its reference; it won't cause any harm (since nobody will be doing anything with that Piece anymore), and it'll be garbage collected soon anyway.
Related
There is a simulator. In this simulator we have to pass a corridor. In the corridor there is a door and a puzzle, the solution of which opens this door. As soon as we solve the puzzle, the value of the boolean attribute of the class (something like isOpen) changes to true
This corridor needs to be traversed several times. The corridor itself doesn't change, but the puzzle is random each time.
So, I decided to create a macro application that reaches the puzzle and waits until I solve it
And since the simulation has the boolean variable I need, I was wondering: can I get it, in order to then create a delay in the macro until it is true?
The main problem here is that the two programs are not connected in any way.
I also want to note that I have an understanding that all variables lose their names after compilation, and that variable values subsequently occupy a random place in memory
Also, I have experience with programs like CheatEngine, which is to find the address of a value by its value
But I may just not know all the details, thinking that it's impossible, even if in reality there are ways to do it.
For this reason, I would appreciate it if you could explain to me how this can be done, or, if it is not possible, explain why.
Also, I wouldn't mind a response like "Read this "
I understand that you want to inspect one or more properties of an instance of an object at runtime and this can be achieved by using the so-called Reflection.
The latter provides functionalities that allow you to examine objects at runtime, get their Type, read their properties and invoke their methods. It should be used carefully.
Using reflection you can do
// retrieves the value of the property "NameOfProperty" for the instance of object myobj
bool myFlag = myobj.GetType().GetProperty("NameOfProperty").GetValue(myobj, null);
I'm making a tetris-like game.
There are two classes in the game: simulator and block (i.e. a tetris block)
Right now, block only holds information about itself that is independent of any other parts of the game. So it's information like "shape", "color", "rotation" etc. (Notably excluding position)
simulator on the other hand, contains an data structure of block's and is responsible for their behaviours (falling, etc.)
Currently, to track the position of each block, simulator actually has an array called array_block_position, which is an array containing sub-arrays in the form of [block-instance, position], e.g. [block#3, (10,12)]
Now, I have this dilemma:
I am tempted to add a "position" variable inside the block class itself instead of tracking their positions in simulator, so simulator can have an array of visible_blocks that is simply an array of block instances rather than an array of arrays. (Visible is emphasized because there may be blocks that are not a part of the playing field, such as the block that is about to be deployed but not yet)
I am uncertain whether I should add "position" as a variable to the block class itself because of some vague memory of learning that this being a bad idea in college, but I no longer remember why it's supposed to be bad, if at all.
I want to know, in general, whether (and when) it's acceptable to store information about an object's relationship with the external world inside the object itself.
(My instincts tells me "no", because a variable like "position" is not meaningful to an object like block without some reference to the external world, but my reasoning sees no immediate problem with something like this in practice)
So I didn't find any elegant solution for this, either googling or throughout stackoverflow. I guess that I have a very specific situation in my hands, anyway here it goes:
I have a object structure, which I don't have control of, because I receive this structure from an external WS. This is quite a huge object, with various levels of fields and properties, and this fields and properties can or can't be null, in any level. You can think of this object as an anemic model, it doesn't have behaviour, just state.
For the purpose of this question, I'll give you a simplified sample that simulates my situation:
Class A
PropB1
PropC11
PropLeaf111
PropC12
PropLeaf112
PropB2
PropC21
PropLeaf211
PropC22
PropLeaf221
So, throughout my code I have to access a number of these properties, in different levels, to do some math in order to calculate what I need. Basically for each type of calculation that I have to do, I have to test each level of the properties that I need, to check if it's not null, in which case I would return (decimal) 0, or any other default value depending on the business logic.
Sample of a math that I have to do with it:
var value = 0;
if (objClassA.PropB1 != null && objClassA.PropB1.PropC11 != null) {
var leaf = objClassA.PropB1.PropC11.PropLeaf111;
value = leaf.HasValue ? leaf.Value : value;
}
Just to be very, the leaf properties of this structure would always be primitives, or nullable primitives in which case I give the proper treatment. This is "the logic" that I have to do for each property that I need, and sometimes I have to use quite some of them. Also the real structure is quite bigger, so the number of verifications that I would need to do, would also be bigger for each necessary property.
Now, I came up with some ideas, none of them I think is ideal:
Create methods to gather the properties, where it would abstract any necessary verification, or the logic to get default values. The drawback is that it would have, in my opinion, quite some duplicated code, since the verifications and the default values would be similar for some groups of fields.
Create a single generic method, where it receives a object, and a lamba function that access the required field. This method would try to execute the function and return it's result, and in case of an NullReferenceException, it would return a default value. The bright side of this one, is that it is realy generic, I just have to pass lambdas to access the properties, and the method would handle any problem. The drawback of it, is that I am using try -> catch to control logic, which is not the purpose of it, and the code might look confusing for other programmers that would eventually give maintenance to it.
Null Object Pattern, this would be the most elegant solution, I guess. It would have all the good points if it was a normal case. But the thing is the impact of providing Null Objects for this structure. Just to give a bit more of context, the software that I am working on, integrates with government's services, and the structure that I am working with, which is in the government's specifications, have some fields where null have some meaning which is different from a default value like "0". Also this specification changes from time to time, and the classes are generated again, and the post processing that I would have to do to create Null Objects, would also need maintenance, which seems a bit dangerous for me.
I hope that I made myself clear enough.
Thanks in advance.
Solution
This is a response as to how I solved my problem, based on the accepted answer.
I'm quite new to C#, and this kind of discution that was linked really helped me to come up with a elegant solution in many aspects. I still have the problem that depending where the code is executed, it uses .NET 2.0, but I also found a solution for this problem, where I can somewhat define extension methods: https://stackoverflow.com/a/707160/649790
And for the solution itself, I found this one the best:
http://www.codeproject.com/Articles/109026/Chained-null-checks-and-the-Maybe-monad
I can basically access the properties this way, and just do the math:
objClassA.With(o => o.PropB1).With(o => PropC11).Return(o => PropLeaf111, 0);
For each property that I need. It still isn't just:
objClassA.PropB1.PropC11.PropLeaf111
ofcourse, but it is far better that any solution that I found so far, since I was unfamiliar with Extension Methods, I really learned a lot.
Thanks again.
There is a strategy for dealing with this, involving the "Maybe" Monad.
Basically it works by providing a "fluent" interface where the chain of properties is interrupted by a null somewhere along the chain.
See here for an example: http://smellegantcode.wordpress.com/2008/12/11/the-maybe-monad-in-c/
And also here:
http://www.codeproject.com/Articles/109026/Chained-null-checks-and-the-Maybe-monad
http://mikehadlow.blogspot.co.uk/2011/01/monads-in-c-5-maybe.html
It's related to but not quite the same as what you seem to need; however, perhaps it can be adapted to your needs. The concepts are fairly fundamental.
I have a strange habit it seems... according to my co-worker at least. We've been working on a small project together. The way I wrote the classes is (simplified example):
[Serializable()]
public class Foo
{
public Foo()
{ }
private Bar _bar;
public Bar Bar
{
get
{
if (_bar == null)
_bar = new Bar();
return _bar;
}
set { _bar = value; }
}
}
So, basically, I only initialize any field when a getter is called and the field is still null. I figured this would reduce overload by not initializing any properties that aren't used anywhere.
ETA: The reason I did this is that my class has several properties that return an instance of another class, which in turn also have properties with yet more classes, and so on. Calling the constructor for the top class would subsequently call all constructors for all these classes, when they are not always all needed.
Are there any objections against this practice, other than personal preference?
UPDATE: I have considered the many differing opinions in regards to this question and I will stand by my accepted answer. However, I have now come to a much better understanding of the concept and I'm able to decide when to use it and when not.
Cons:
Thread safety issues
Not obeying a "setter" request when the value passed is null
Micro-optimizations
Exception handling should take place in a constructor
Need to check for null in class' code
Pros:
Micro-optimizations
Properties never return null
Delay or avoid loading "heavy" objects
Most of the cons are not applicable to my current library, however I would have to test to see if the "micro-optimizations" are actually optimizing anything at all.
LAST UPDATE:
Okay, I changed my answer. My original question was whether or not this is a good habit. And I'm now convinced that it's not. Maybe I will still use it in some parts of my current code, but not unconditionally and definitely not all the time. So I'm going to lose my habit and think about it before using it. Thanks everyone!
What you have here is a - naive - implementation of "lazy initialization".
Short answer:
Using lazy initialization unconditionally is not a good idea. It has its places but one has to take into consideration the impacts this solution has.
Background and explanation:
Concrete implementation:
Let's first look at your concrete sample and why I consider its implementation naive:
It violates the Principle of Least Surprise (POLS). When a value is assigned to a property, it is expected that this value is returned. In your implementation this is not the case for null:
foo.Bar = null;
Assert.Null(foo.Bar); // This will fail
It introduces quite some threading issues: Two callers of foo.Bar on different threads can potentially get two different instances of Bar and one of them will be without a connection to the Foo instance. Any changes made to that Bar instance are silently lost.
This is another case of a violation of POLS. When only the stored value of a property is accessed it is expected to be thread-safe. While you could argue that the class simply isn't thread-safe - including the getter of your property - you would have to document this properly as that's not the normal case. Furthermore the introduction of this issue is unnecessary as we will see shortly.
In general:
It's now time to look at lazy initialization in general:
Lazy initialization is usually used to delay the construction of objects that take a long time to be constructed or that take a lot of memory once fully constructed.
That is a very valid reason for using lazy initialization.
However, such properties normally don't have setters, which gets rid of the first issue pointed out above.
Furthermore, a thread-safe implementation would be used - like Lazy<T> - to avoid the second issue.
Even when considering these two points in the implementation of a lazy property, the following points are general problems of this pattern:
Construction of the object could be unsuccessful, resulting in an exception from a property getter. This is yet another violation of POLS and therefore should be avoided. Even the section on properties in the "Design Guidelines for Developing Class Libraries" explicitly states that property getters shouldn't throw exceptions:
Avoid throwing exceptions from property getters.
Property getters should be simple operations without any preconditions. If a getter might throw an exception, consider redesigning the property to be a method.
Automatic optimizations by the compiler are hurt, namely inlining and branch prediction. Please see Bill K's answer for a detailed explanation.
The conclusion of these points is the following:
For each single property that is implemented lazily, you should have considered these points.
That means, that it is a per-case decision and can't be taken as a general best practice.
This pattern has its place, but it is not a general best practice when implementing classes. It should not be used unconditionally, because of the reasons stated above.
In this section I want to discuss some of the points others have brought forward as arguments for using lazy initialization unconditionally:
Serialization:
EricJ states in one comment:
An object that may be serialized will not have it's contructor invoked when it is deserialized (depends on the serializer, but many common ones behave like this). Putting initialization code in the constructor means that you have to provide additional support for deserialization. This pattern avoids that special coding.
There are several problems with this argument:
Most objects never will be serialized. Adding some sort of support for it when it is not needed violates YAGNI.
When a class needs to support serialization there exist ways to enable it without a workaround that doesn't have anything to do with serialization at first glance.
Micro-optimization:
Your main argument is that you want to construct the objects only when someone actually accesses them. So you are actually talking about optimizing the memory usage.
I don't agree with this argument for the following reasons:
In most cases, a few more objects in memory have no impact whatsoever on anything. Modern computers have way enough memory. Without a case of actual problems confirmed by a profiler, this is pre-mature optimization and there are good reasons against it.
I acknowledge the fact that sometimes this kind of optimization is justified. But even in these cases lazy initialization doesn't seem to be the correct solution. There are two reasons speaking against it:
Lazy initialization potentially hurts performance. Maybe only marginally, but as Bill's answer showed, the impact is greater than one might think at first glance. So this approach basically trades performance versus memory.
If you have a design where it is a common use case to use only parts of the class, this hints at a problem with the design itself: The class in question most likely has more than one responsibility. The solution would be to split the class into several more focused classes.
It is a good design choice. Strongly recommended for library code or core classes.
It is called by some "lazy initialization" or "delayed initialization" and it is generally considered by all to be a good design choice.
First, if you initialize in the declaration of class level variables or constructor, then when your object is constructed, you have the overhead of creating a resource that may never be used.
Second, the resource only gets created if needed.
Third, you avoid garbage collecting an object that was not used.
Lastly, it is easier to handle initialization exceptions that may occur in the property then exceptions that occur during initialization of class level variables or the constructor.
There are exceptions to this rule.
Regarding the performance argument of the additional check for initialization in the "get" property, it is insignificant. Initializing and disposing an object is a more significant performance hit than a simple null pointer check with a jump.
Design Guidelines for Developing Class Libraries at http://msdn.microsoft.com/en-US/library/vstudio/ms229042.aspx
Regarding Lazy<T>
The generic Lazy<T> class was created exactly for what the poster wants, see Lazy Initialization at http://msdn.microsoft.com/en-us/library/dd997286(v=vs.100).aspx. If you have older versions of .NET, you have to use the code pattern illustrated in the question. This code pattern has become so common that Microsoft saw fit to include a class in the latest .NET libraries to make it easier to implement the pattern. In addition, if your implementation needs thread safety, then you have to add it.
Primitive Data Types and Simple Classes
Obvioulsy, you are not going to use lazy-initialization for primitive data type or simple class use like List<string>.
Before Commenting about Lazy
Lazy<T> was introduced in .NET 4.0, so please don't add yet another comment regarding this class.
Before Commenting about Micro-Optimizations
When you are building libraries, you must consider all optimizations. For instance, in the .NET classes you will see bit arrays used for Boolean class variables throughout the code to reduce memory consumption and memory fragmentation, just to name two "micro-optimizations".
Regarding User-Interfaces
You are not going to use lazy initialization for classes that are directly used by the user-interface. Last week I spent the better part of a day removing lazy loading of eight collections used in a view-model for combo-boxes. I have a LookupManager that handles lazy loading and caching of collections needed by any user-interface element.
"Setters"
I have never used a set-property ("setters") for any lazy loaded property. Therefore, you would never allow foo.Bar = null;. If you need to set Bar then I would create a method called SetBar(Bar value) and not use lazy-initialization
Collections
Class collection properties are always initialized when declared because they should never be null.
Complex Classes
Let me repeat this differently, you use lazy-initialization for complex classes. Which are usually, poorly designed classes.
Lastly
I never said to do this for all classes or in all cases. It is a bad habit.
Do you consider implementing such pattern using Lazy<T>?
In addition to easy creation of lazy-loaded objects, you get thread safety while the object is initialized:
http://msdn.microsoft.com/en-us/library/dd642331.aspx
As others said, you lazily-load objects if they're really resource-heavy or it takes some time to load them during object construction-time.
I think it depends on what you are initialising. I probably wouldn't do it for a list as the construction cost is quite small, so it can go in the constructor. But if it was a pre-populated list then I probably wouldn't until it was needed for the first time.
Basically, if the cost of construction outweighs the cost of doing an conditional check on each access then lazy create it. If not, do it in the constructor.
Lazy instantiation/initialization is a perfectly viable pattern. Keep in mind, though, that as a general rule consumers of your API do not expect getters and setters to take discernable time from the end user POV (or to fail).
The downside that I can see is that if you want to ask if Bars is null, it would never be, and you would be creating the list there.
I was just going to put a comment on Daniel's answer but I honestly don't think it goes far enough.
Although this is a very good pattern to use in certain situations (for instance, when the object is initialized from the database), it's a HORRIBLE habit to get into.
One of the best things about an object is that it offeres a secure, trusted environment. The very best case is if you make as many fields as possible "Final", filling them all in with the constructor. This makes your class quite bulletproof. Allowing fields to be changed through setters is a little less so, but not terrible. For instance:
class SafeClass
{
String name="";
Integer age=0;
public void setName(String newName)
{
assert(newName != null)
name=newName;
}// follow this pattern for age
...
public String toString() {
String s="Safe Class has name:"+name+" and age:"+age
}
}
With your pattern, the toString method would look like this:
if(name == null)
throw new IllegalStateException("SafeClass got into an illegal state! name is null")
if(age == null)
throw new IllegalStateException("SafeClass got into an illegal state! age is null")
public String toString() {
String s="Safe Class has name:"+name+" and age:"+age
}
Not only this, but you need null checks everywhere you might possibly use that object in your class (Outside your class is safe because of the null check in the getter, but you should be mostly using your classes members inside the class)
Also your class is perpetually in an uncertain state--for instance if you decided to make that class a hibernate class by adding a few annotations, how would you do it?
If you make any decision based on some micro-optomization without requirements and testing, it's almost certainly the wrong decision. In fact, there is a really really good chance that your pattern is actually slowing down the system even under the most ideal of circumstances because the if statement can cause a branch prediction failure on the CPU which will slow things down many many many more times than just assigning a value in the constructor unless the object you are creating is fairly complex or coming from a remote data source.
For an example of the brance prediction problem (which you are incurring repeatedly, nost just once), see the first answer to this awesome question: Why is it faster to process a sorted array than an unsorted array?
Let me just add one more point to many good points made by others...
The debugger will (by default) evaluate the properties when stepping through the code, which could potentially instantiate the Bar sooner than would normally happen by just executing the code. In other words, the mere act of debugging is changing the execution of the program.
This may or may not be a problem (depending on side-effects), but is something to be aware of.
Are you sure Foo should be instantiating anything at all?
To me it seems smelly (though not necessarily wrong) to let Foo instantiate anything at all. Unless it is Foo's express purpose to be a factory, it should not instantiate it's own collaborators, but instead get them injected in its constructor.
If however Foo's purpose of being is to create instances of type Bar, then I don't see anything wrong with doing it lazily.
I would like to get your opinion on as how far to go with side-effect-free setters.
Consider the following example:
Activity activity;
activity.Start = "2010-01-01";
activity.Duration = "10 days"; // sets Finish property to "2010-01-10"
Note that values for date and duration are shown only for indicative purposes.
So using setter for any of the properties Start, Finish and Duration will consequently change other properties and thus cannot be considered side-effect-free.
Same applies for instances of the Rectangle class, where setter for X is changing the values of Top and Bottom and so on.
The question is where would you draw a line between using setters, which have side-effects of changing values of logically related properties, and using methods, which couldn't be much more descriptive anyway. For example, defining a method called SetDurationTo(Duration duration) also doesn't reflect that either Start or Finish will be changed.
I think you're misunderstanding the term "side-effect" as it applies to program design. Setting a property is a side effect, no matter how much or how little internal state it changes, as long as it changes some sort of state. A "side-effect-free setter" would not be very useful.
Side-effects are something you want to avoid on property getters. Reading the value of a property is something that the caller does not expect to change any state (i.e. cause side-effects), so if it does, it's usually wrong or at least questionable (there are exceptions, such as lazy loading). But getters and setters alike are just wrappers for methods anyway. The Duration property, as far as the CLR is concerned, is just syntactic sugar for a set_Duration method.
This is exactly what abstractions such as classes are meant for - providing coarse-grained operations while keeping a consistent internal state. If you deliberately try to avoid having multiple side-effects in a single property assignment then your classes end up being not much more than dumb data containers.
So, answering the question directly: Where do I draw the line? Nowhere, as long as the method/property actually does what its name implies. If setting the Duration also changed the ActivityName, that might be a problem. If it changes the Finish property, that ought to be obvious; it should be impossible to change the Duration and have both the Start and Finish stay the same. The basic premise of OOP is that objects are intelligent enough to manage these operations by themselves.
If this bothers you at a conceptual level then don't have mutator properties at all - use an immutable data structure with read-only properties where all of the necessary arguments are supplied in the constructor. Then have two overloads, one that takes a Start/Duration and another that takes a Start/Finish. Or make only one of the properties writable - let's say Finish to keep it consistent with Start - and then make Duration read-only. Use the appropriate combination of mutable and immutable properties to ensure that there is only one way to change a certain state.
Otherwise, don't worry so much about this. Properties (and methods) shouldn't have unintended or undocumented side effects, but that's about the only guideline I would use.
Personally, I think it makes sense to have a side-effect to maintain a consistent state. Like you said, it makes sense to change logically-related values. In a sense, the side-effect is expected. But the important thing is to make that point clear. That is, it should be evident that the task the method is performing has some sort of side-effect. So instead of SetDurationTo you could call your function ChangeDurationTo, which implies something else is going on. You could also do this another way by having a function/method that adjusts the duration AdjustDurationTo and pass in a delta value. It would help if you document the function as having a side-effect.
I think another way to look at it is to see if a side-effect is expected. In your example of a Rectangle, I would expect it to change the values of top or bottom to maintain an internally-consistent state. I don't know if this is subjective; it just seems to make sense to me. As always, I think documentation wins out. If there is a side-effect, document it really well. Preferably by the name of the method and through supporting documentation.
One option is to make your class immutable and have methods create and return new instances of the class which have all appropriate values changed. Then there are no side effects or setters. Think of something like DateTime where you can call things like AddDays and AddHours which will return a new DateTime instance with the change applied.
I have always worked with the general rule of not allowing public setters on properties that are not side-effect free since callers of your public setters can't be certain of what might happen, but of course, people that modify the assembly itself should have a pretty good idea as they can see the code.
Of course, there are always times where you have to break the rule for the sake of either readability, to make your object model logical, or just to make things work right. Like you said, really a matter of preference in general.
I think it's mostly a matter of common-sense.
In this particular example, my problem is not so much that you've got properties that adjust "related" properties, it's that you've got properties taking string values that you're then internaly parsing into DateTime (or whatever) values.
I would much rather see something like this:
Activity activity;
activity.Start = DateTime.Parse("2010-01-01");
activity.Duration = Duration.Parse("10 days");
That is, you explicity note that you're doing parsing of strings. Allow the programmer to specify strong-typed objects when that is appropriate as well.