I started writing a fluent interface and took a look at an older piece Martin Fowler wrote on fluent interfaces (I hadn't realized that he and Eric Evans coined the term). In the piece, Martin mentions that setters usually return an instance of the object being configured or worked on, which he says is a violation of CQS.
The common convention in the curly brace world is that modifier
methods are void, which I like because it follows the principle of
CommandQuerySeparation. This convention does get in the way of a
fluent interface, so I'm inclined to suspend the convention for this
case.
So if my fluent interface does something like:
myObject
.useRepository("Stuff")
.withTransactionSupport()
.retries(3)
.logWarnings()
.logErrors();
Is this truly a violation of CQS?
UPDATE I broke out my sample to show logging warnings and errors as separate behaviors.
Yes, it is. All those methods are obviously returning something, and equally obviously they have side effects (judging from the fact that you don't do anything with the return value, yet you do bother to call them). Since the definition of CQS states that mutators should not return a value, we have a clear-cut violation on our hands.
But does it matter to you that CQS is violated? If the fluent interface makes you more productive all things considered, and if you consider it a well-known pattern with equally well-known benefits and drawbacks, why should it matter that it violates principle X on paper?
It violates this principle when it changes objects but not when it only returns a new object.
var newObject = myObject
.useRepository("Stuff")
.withTransactionSupport()
.retries(3)
.logWarningsAndErrors();
If myObject is unchanged after this statement, everything is OK. Generally speaking, a fluent interface violates the CQS principle if, and only if, it has side effects.
However, the question is whether your example represents a query at all. Does "fluent" necessarily mean "query"? It could just as well be perceived as an action-oriented fluent interface where the same object is passed from one action to the next.
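To make that distinction concrete, here is a minimal sketch in Python (picked only for brevity). The class, field, and method names are hypothetical renderings of the calls in the question, not anyone's real API. Each fluent call returns a modified copy, so the object the chain started from is never changed:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RepositoryConfig:
    """Hypothetical immutable configuration; every fluent call returns a copy."""
    repository: str = ""
    transactional: bool = False
    retry_count: int = 0

    def use_repository(self, name: str) -> "RepositoryConfig":
        # replace() builds a new instance; self is left untouched
        return replace(self, repository=name)

    def with_transaction_support(self) -> "RepositoryConfig":
        return replace(self, transactional=True)

    def retries(self, count: int) -> "RepositoryConfig":
        return replace(self, retry_count=count)


base = RepositoryConfig()
configured = base.use_repository("Stuff").with_transaction_support().retries(3)
```

After the chain, base still holds its defaults and configured is a different object, which is the "myObject is unchanged" case.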
No. The pattern here is "Configuration". Such configuration commands return the configuration object itself, as opposed to something unrelated to the command. A violation of Command/Query Separation would occur if the commands that serve the configuration purpose returned some unrelated data, for example:
if (myObject.UseRepository("Stuff") > 1 && myObject.UseRepository("Bla") < 5) {
// oh, good, some invisible stuff internal to myObject is in right interval...
}
I think it depends on what those methods are doing. If each one is its own command, then yes, it could be breaking CQS.
However, you could fix this easily in two different ways.
Just don't chain the commands. Just do myObject.useRepository(".."). Then call the next one, etc. But if the next item in the chain requires information from the previous one you would be in trouble.
Instead of making each of these its own command, have the chained calls simply update data on a DTO directly. Then, at the end, run a method called .Configure() that sends this DTO to a single command that does all of the processing.
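A rough sketch of the second option, in Python for brevity. Settings, SettingsBuilder, and configure are made-up names for illustration, and the "single command" is modelled as any callable that accepts the finished DTO:

```python
from dataclasses import dataclass


@dataclass
class Settings:
    """Plain data object (DTO) that the fluent calls fill in."""
    repository: str = ""
    transactional: bool = False
    retries: int = 0


class SettingsBuilder:
    """Hypothetical builder: chained calls only record data on the DTO."""

    def __init__(self):
        self._settings = Settings()

    def use_repository(self, name):
        self._settings.repository = name
        return self

    def with_transaction_support(self):
        self._settings.transactional = True
        return self

    def retries(self, count):
        self._settings.retries = count
        return self

    def configure(self, command):
        # The only step that does real work: hand the DTO to one command.
        command(self._settings)


applied = []  # stand-in for the real command's effect
SettingsBuilder().use_repository("Stuff").with_transaction_support().retries(3).configure(applied.append)
```

The builder methods still return the builder, so whether this satisfies a strict reading of CQS is debatable. What it does achieve is that all real processing happens in one place, at the end.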
If we ignore the type system and DSL design aspects, a fluent interface is exactly the same as method chaining.
o.A().B() is equivalent to a = o.A(); a.B()
It does violate command-query separation in a mutable data structure: we have to explicitly add a superfluous return this to each method implementation (here a and o refer to the same object). By the way, I prefer method cascading in such cases.
However, we also often see it with immutable data structures, because a pure function has to return its result (here a and o refer to different objects). In this case it does not violate command-query separation.
I have a strange habit it seems... according to my co-worker at least. We've been working on a small project together. The way I wrote the classes is (simplified example):
[Serializable()]
public class Foo
{
public Foo()
{ }
private Bar _bar;
public Bar Bar
{
get
{
if (_bar == null)
_bar = new Bar();
return _bar;
}
set { _bar = value; }
}
}
So, basically, I only initialize any field when a getter is called and the field is still null. I figured this would reduce overhead by not initializing any properties that aren't used anywhere.
ETA: The reason I did this is that my class has several properties that return an instance of another class, which in turn also have properties with yet more classes, and so on. Calling the constructor for the top class would subsequently call all constructors for all these classes, when they are not always all needed.
Are there any objections against this practice, other than personal preference?
UPDATE: I have considered the many differing opinions in regards to this question and I will stand by my accepted answer. However, I have now come to a much better understanding of the concept and I'm able to decide when to use it and when not.
Cons:
Thread safety issues
Not obeying a "setter" request when the value passed is null
Micro-optimizations
Exception handling should take place in a constructor
Need to check for null in class' code
Pros:
Micro-optimizations
Properties never return null
Delay or avoid loading "heavy" objects
Most of the cons are not applicable to my current library, however I would have to test to see if the "micro-optimizations" are actually optimizing anything at all.
LAST UPDATE:
Okay, I changed my answer. My original question was whether or not this is a good habit. And I'm now convinced that it's not. Maybe I will still use it in some parts of my current code, but not unconditionally and definitely not all the time. So I'm going to lose my habit and think about it before using it. Thanks everyone!
What you have here is a - naive - implementation of "lazy initialization".
Short answer:
Using lazy initialization unconditionally is not a good idea. It has its places but one has to take into consideration the impacts this solution has.
Background and explanation:
Concrete implementation:
Let's first look at your concrete sample and why I consider its implementation naive:
It violates the Principle of Least Surprise (POLS). When a value is assigned to a property, it is expected that this value is returned. In your implementation this is not the case for null:
foo.Bar = null;
Assert.Null(foo.Bar); // This will fail
It introduces quite some threading issues: Two callers of foo.Bar on different threads can potentially get two different instances of Bar and one of them will be without a connection to the Foo instance. Any changes made to that Bar instance are silently lost.
This is another case of a violation of POLS. When only the stored value of a property is accessed it is expected to be thread-safe. While you could argue that the class simply isn't thread-safe - including the getter of your property - you would have to document this properly as that's not the normal case. Furthermore the introduction of this issue is unnecessary as we will see shortly.
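The first surprise is easy to reproduce. Below is a small Python rendering of the property from the question (the lower-case names are just the Python spelling of Foo and Bar). The race between two threads is not demonstrated here, only pointed out in a comment:

```python
class Bar:
    pass


class Foo:
    """Python rendering of the naive lazy property from the question."""

    def __init__(self):
        self._bar = None

    @property
    def bar(self):
        # Check-then-create is not atomic: two threads could both see None
        # here and each end up with its own Bar.
        if self._bar is None:
            self._bar = Bar()
        return self._bar

    @bar.setter
    def bar(self, value):
        self._bar = value


foo = Foo()
foo.bar = None
surprising = foo.bar  # a fresh Bar, not the None that was just assigned
```

Assigning None and reading the property back yields a new Bar instance, which is exactly the failed assertion described above.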
In general:
It's now time to look at lazy initialization in general:
Lazy initialization is usually used to delay the construction of objects that take a long time to be constructed or that take a lot of memory once fully constructed.
That is a very valid reason for using lazy initialization.
However, such properties normally don't have setters, which gets rid of the first issue pointed out above.
Furthermore, a thread-safe implementation would be used - like Lazy<T> - to avoid the second issue.
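As an illustration of those two adjustments (no setter, thread-safe creation), here is a minimal Python sketch. The Lazy class below is a hand-written stand-in for the idea behind Lazy<T>, not a port of the .NET class, and it assumes the factory takes no arguments:

```python
import threading


class Lazy:
    """Minimal stand-in for the idea behind .NET's Lazy<T>; not a port of it."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._created = False
        self._value = None

    @property
    def value(self):
        if not self._created:          # fast path once the value exists
            with self._lock:
                if not self._created:  # re-check while holding the lock
                    self._value = self._factory()
                    self._created = True
        return self._value


class Bar:
    pass


class Foo:
    def __init__(self):
        self._bar = Lazy(Bar)

    @property
    def bar(self):
        # Read-only: there is no setter, so `foo.bar = None` raises AttributeError.
        return self._bar.value
```

Every caller gets the same Bar regardless of which thread triggers its creation, and the null-assignment surprise cannot occur because the property cannot be assigned at all.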
Even when considering these two points in the implementation of a lazy property, the following points are general problems of this pattern:
Construction of the object could be unsuccessful, resulting in an exception from a property getter. This is yet another violation of POLS and therefore should be avoided. Even the section on properties in the "Design Guidelines for Developing Class Libraries" explicitly states that property getters shouldn't throw exceptions:
Avoid throwing exceptions from property getters.
Property getters should be simple operations without any preconditions. If a getter might throw an exception, consider redesigning the property to be a method.
Automatic optimizations by the compiler are hurt, namely inlining and branch prediction. Please see Bill K's answer for a detailed explanation.
The conclusion of these points is the following:
For each single property that is implemented lazily, you should have considered these points.
That means, that it is a per-case decision and can't be taken as a general best practice.
This pattern has its place, but it is not a general best practice when implementing classes. It should not be used unconditionally, because of the reasons stated above.
In this section I want to discuss some of the points others have brought forward as arguments for using lazy initialization unconditionally:
Serialization:
EricJ states in one comment:
An object that may be serialized will not have its constructor invoked when it is deserialized (depends on the serializer, but many common ones behave like this). Putting initialization code in the constructor means that you have to provide additional support for deserialization. This pattern avoids that special coding.
There are several problems with this argument:
Most objects never will be serialized. Adding some sort of support for it when it is not needed violates YAGNI.
When a class needs to support serialization there exist ways to enable it without a workaround that doesn't have anything to do with serialization at first glance.
Micro-optimization:
Your main argument is that you want to construct the objects only when someone actually accesses them. So you are actually talking about optimizing the memory usage.
I don't agree with this argument for the following reasons:
In most cases, a few more objects in memory have no impact whatsoever on anything. Modern computers have more than enough memory. Without a case of actual problems confirmed by a profiler, this is premature optimization and there are good reasons against it.
I acknowledge the fact that sometimes this kind of optimization is justified. But even in these cases lazy initialization doesn't seem to be the correct solution. There are two reasons speaking against it:
Lazy initialization potentially hurts performance. Maybe only marginally, but as Bill's answer showed, the impact is greater than one might think at first glance. So this approach basically trades performance for memory.
If you have a design where it is a common use case to use only parts of the class, this hints at a problem with the design itself: The class in question most likely has more than one responsibility. The solution would be to split the class into several more focused classes.
It is a good design choice. Strongly recommended for library code or core classes.
It is called by some "lazy initialization" or "delayed initialization" and it is generally considered by all to be a good design choice.
First, if you initialize in the declaration of class level variables or constructor, then when your object is constructed, you have the overhead of creating a resource that may never be used.
Second, the resource only gets created if needed.
Third, you avoid garbage collecting an object that was not used.
Lastly, it is easier to handle initialization exceptions that may occur in the property than exceptions that occur during initialization of class level variables or the constructor.
There are exceptions to this rule.
Regarding the performance argument of the additional check for initialization in the "get" property, it is insignificant. Initializing and disposing an object is a more significant performance hit than a simple null pointer check with a jump.
Design Guidelines for Developing Class Libraries at http://msdn.microsoft.com/en-US/library/vstudio/ms229042.aspx
Regarding Lazy<T>
The generic Lazy<T> class was created exactly for what the poster wants, see Lazy Initialization at http://msdn.microsoft.com/en-us/library/dd997286(v=vs.100).aspx. If you have older versions of .NET, you have to use the code pattern illustrated in the question. This code pattern has become so common that Microsoft saw fit to include a class in the latest .NET libraries to make it easier to implement the pattern. In addition, if your implementation needs thread safety, then you have to add it.
Primitive Data Types and Simple Classes
Obviously, you are not going to use lazy initialization for a primitive data type or a simple class like List<string>.
Before Commenting about Lazy
Lazy<T> was introduced in .NET 4.0, so please don't add yet another comment regarding this class.
Before Commenting about Micro-Optimizations
When you are building libraries, you must consider all optimizations. For instance, in the .NET classes you will see bit arrays used for Boolean class variables throughout the code to reduce memory consumption and memory fragmentation, just to name two "micro-optimizations".
Regarding User-Interfaces
You are not going to use lazy initialization for classes that are directly used by the user-interface. Last week I spent the better part of a day removing lazy loading of eight collections used in a view-model for combo-boxes. I have a LookupManager that handles lazy loading and caching of collections needed by any user-interface element.
"Setters"
I have never used a set-property ("setters") for any lazy loaded property. Therefore, you would never allow foo.Bar = null;. If you need to set Bar then I would create a method called SetBar(Bar value) and not use lazy-initialization
Collections
Class collection properties are always initialized when declared because they should never be null.
Complex Classes
Let me repeat this differently: you use lazy initialization for complex classes, which are usually poorly designed classes.
Lastly
I never said to do this for all classes or in all cases. It is a bad habit.
Have you considered implementing this pattern using Lazy<T>?
In addition to easy creation of lazy-loaded objects, you get thread safety while the object is initialized:
http://msdn.microsoft.com/en-us/library/dd642331.aspx
As others said, you lazily-load objects if they're really resource-heavy or it takes some time to load them during object construction-time.
I think it depends on what you are initialising. I probably wouldn't do it for a list as the construction cost is quite small, so it can go in the constructor. But if it was a pre-populated list then I probably wouldn't until it was needed for the first time.
Basically, if the cost of construction outweighs the cost of doing a conditional check on each access, then lazily create it. If not, do it in the constructor.
Lazy instantiation/initialization is a perfectly viable pattern. Keep in mind, though, that as a general rule consumers of your API do not expect getters and setters to take discernible time from the end user's point of view (or to fail).
The downside that I can see is that if you want to ask if Bars is null, it would never be, and you would be creating the list there.
I was just going to put a comment on Daniel's answer but I honestly don't think it goes far enough.
Although this is a very good pattern to use in certain situations (for instance, when the object is initialized from the database), it's a HORRIBLE habit to get into.
One of the best things about an object is that it offers a secure, trusted environment. The very best case is if you make as many fields as possible "final", filling them all in with the constructor. This makes your class quite bulletproof. Allowing fields to be changed through setters is a little less so, but not terrible. For instance:
class SafeClass
{
    String name="";
    Integer age=0;
    public void setName(String newName)
    {
        assert newName != null;
        name=newName;
    }// follow this pattern for age
    // ...
    public String toString() {
        // name and age can never be null here, so no checks are needed
        return "Safe Class has name:"+name+" and age:"+age;
    }
}
With your pattern, the toString method would look like this:
public String toString() {
    // With lazily initialized fields, every use needs its own null check
    if(name == null)
        throw new IllegalStateException("SafeClass got into an illegal state! name is null");
    if(age == null)
        throw new IllegalStateException("SafeClass got into an illegal state! age is null");
    return "Safe Class has name:"+name+" and age:"+age;
}
Not only this, but you need null checks everywhere you might possibly use that object in your class. (Outside your class is safe because of the null check in the getter, but you should mostly be using your class's members inside the class.)
Also your class is perpetually in an uncertain state--for instance if you decided to make that class a hibernate class by adding a few annotations, how would you do it?
If you make any decision based on some micro-optimization without requirements and testing, it's almost certainly the wrong decision. In fact, there is a really good chance that your pattern is actually slowing down the system even under the most ideal of circumstances, because the if statement can cause a branch prediction failure on the CPU. That can slow things down many times more than just assigning a value in the constructor, unless the object you are creating is fairly complex or comes from a remote data source.
For an example of the branch prediction problem (which you are incurring repeatedly, not just once), see the first answer to this awesome question: Why is it faster to process a sorted array than an unsorted array?
Let me just add one more point to many good points made by others...
The debugger will (by default) evaluate the properties when stepping through the code, which could potentially instantiate the Bar sooner than would normally happen by just executing the code. In other words, the mere act of debugging is changing the execution of the program.
This may or may not be a problem (depending on side-effects), but is something to be aware of.
Are you sure Foo should be instantiating anything at all?
To me it seems smelly (though not necessarily wrong) to let Foo instantiate anything at all. Unless it is Foo's express purpose to be a factory, it should not instantiate its own collaborators, but instead get them injected in its constructor.
If however Foo's purpose of being is to create instances of type Bar, then I don't see anything wrong with doing it lazily.
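For comparison, constructor injection in its simplest form might look like the following Python sketch. The names mirror Foo and Bar from the question, and who creates the Bar is deliberately left to the caller:

```python
class Bar:
    pass


class Foo:
    """Receives its collaborator instead of constructing it."""

    def __init__(self, bar):
        self.bar = bar


shared_bar = Bar()     # created (eagerly or lazily) by whoever wires things up
foo = Foo(shared_bar)
```

With this shape the question of when Bar gets created moves out of Foo entirely, so Foo needs neither a null check nor a lazy getter.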
As the question shows, string functions like IsNullOrEmpty or IsNullOrWhiteSpace do more than one job, as their names indicate. Is this not a violation of the SRP?
Should it not rather be string.isValid(Enum typeofValidation), using the strategy pattern to choose the correct validation strategy?
Or is it perfectly OK to violate the SRP in utility classes or static classes?
The SRP says that a function or class should have only one reason to change. What is a reason to change? A reason to change is a user who requests changes. So a class or function should have only one user who requests changes.
Now a function that does some calculations and then some formatting has two different users that could request a change. One would request changes to the calculations and the other would request changes to the formatting. Since these users have different needs and will make their requests at different times, we'd like them to be served by different functions.
IsNullOrEmpty(String) is not likely to be serving two different users. The user who cares about null is likely the same user who cares about empty, so isNullOrEmpty does not violate the SRP.
In object-oriented programming, the single responsibility principle states that every object should have a single responsibility
You're describing methods: IsNullOrEmpty or IsNullOrWhiteSpace, which are also self-describing in what they do, they're not objects. string has a single responsibility - to be responsible for text strings!
Static helpers can perform many tasks if you choose: the whole point of the Single Responsibility principle is to ultimately make your code more maintainable and readable for future teams and yourself. As a comment says, don't overthink it. You're not designing the framework here but just consuming some parts of it that will clean your strings for you, and validate incoming data.
The SRP applies to classes, not methods. Still, it's a good idea to have methods that do one thing only. But you can't take that to extremes. For example, a console application would be fairly useless if its Main method could contain only one statement (and, if the statement is a method call, that method could also contain only one statement, etc., recursively).
Think about the implementation of IsNullOrEmpty:
static bool IsNullOrEmpty(string s)
{
return ReferenceEquals(s, null) || Equals(s, string.Empty);
}
So, yes, it's doing two things, but they're done in a single expression. If you go to the level of expressions, any boolean expression involving binary boolean operators could be said to be "doing more than one thing" because it is evaluating the truth of more than one condition.
If the names of the methods bother you because they imply too much activity for a single method, wrap them in your own methods with names that imply the evaluation of a single condition. For example:
static bool HasNoVisibleCharacters(string s) { return string.IsNullOrWhiteSpace(s); }
static bool HasNoCharacters(string s) { return string.IsNullOrEmpty(s); }
In response to your comment:
Say I wrote a function like SerializeAndValidate(ObjectToSerializeAndValidate). Clearly this method/class is doing two things, serialization and validation, which is clearly a violation. Sometimes methods in a class lead to a maintenance nightmare, like the serialize-and-validate example above.
Yes, you are right to be concerned about this, but again, you cannot literally have methods that do one thing only. Remember that different methods will deal with different levels of abstraction. You might have a very high-level method that calls SerializeAndValidate as part of a long sequence of actions. At that level of abstraction, it might be very reasonable to think of SerializeAndValidate as a single action.
Imagine writing a set of step-by-step instructions for an experienced user to open a file's "properties" dialogue:
Right-click the file
Choose "Properties"
Now imagine writing the same instructions for someone who's never used a mouse before:
Position the mouse pointer over the file's icon
Press and release the right mouse button
A menu appears. Position the mouse pointer over the word "Properties"
Press and release the left mouse button
When we write computer programs, we need to operate at both levels of abstraction. Or, rather, at any given time, we're operating at one level of abstraction or another, so as not to confuse ourselves. Furthermore, we rely on library code that operates at lower levels of abstraction still.
Methods also allow you to comply with the "do not repeat yourself" principle (often known as "DRY"). If you need to both serialize and validate objects in many parts of your application, you'd want to have a SerializeAndValidate method to reduce duplicative code. You'd be very well advised to implement the method as a simple convenience method:
void SerializeAndValidate(SomeClass obj)
{
Serialize(obj);
Validate(obj);
}
This allows you the convenience of calling one method, while preserving the separation of serialization logic from validation logic, which should make the program easier to maintain.
I don't see this as doing more than one thing. It is just making sure your string passes a required condition.
This pattern pops up a lot. It looks like a very verbose way to move what would otherwise be separate named methods into a single method and then distinguished by a parameter.
Is there any good reason to have this pattern over just having two methods Method1() and Method2() ? The real kicker is that this pattern tends to be invoked only with constants at runtime-- i.e. the arguments are all known before compiling is done.
public enum Commands
{
Method1,
Method2
}
public void ClientCode()
{
//Always invoked with constants! Never user input.
RunCommands(Commands.Method1);
RunCommands(Commands.Method2);
}
public void RunCommands(Commands currentCommand)
{
switch (currentCommand)
{
case Commands.Method1:
// Stuff happens
break;
case Commands.Method2:
// Other stuff happens
break;
default:
throw new ArgumentOutOfRangeException("currentCommand");
}
}
To an OO programmer, this looks horrible.
The switch and enum would need synchronised maintenance and the default case seems like make-work.
The OO programmer would substitute an object with named methods: Then the names like method1 would only appear once in the library. Also all the default cases would be obviated.
Yes, your clients still need to be synchronised with the methods you supply - a static language always insists on method names being known at compile time.
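The substitution could look like the sketch below, written in Python for brevity, with the log list standing in for the unspecified "stuff happens" bodies. One difference from the static-language case described above: in Python a misspelled method name fails at run time with AttributeError rather than at compile time.

```python
class Commands:
    """Hypothetical replacement for the enum-plus-switch version."""

    def __init__(self):
        self.log = []

    def method1(self):
        self.log.append("stuff happens")

    def method2(self):
        self.log.append("other stuff happens")


commands = Commands()
commands.method1()
commands.method2()
```

Each name appears once in the library, and there is no default branch to maintain because there is no switch.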
You could argue that this pattern allows you to put shared logging (or other) code for method entry and exit in a single place. But I wouldn't. AOP is a better approach for this sort of thing.
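Python has no built-in AOP, but a decorator is one lightweight way to get a similar effect for entry and exit logging. This is only a sketch: the logged decorator and the calls list (a stand-in for a real logger) are made up for illustration.

```python
import functools

calls = []  # stand-in for a real logger


def logged(func):
    """Record entry and exit around the wrapped function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append(f"enter {func.__name__}")
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so exit is always recorded.
            calls.append(f"exit {func.__name__}")
    return wrapper


@logged
def method1():
    return "stuff"


@logged
def method2():
    return "other stuff"


results = [method1(), method2()]
```

The shared code lives in one place, yet method1 and method2 remain separate named methods, so nothing forces them back into a single switch.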
That pattern could be valid if you needed the coupling to be very loose. For example you might have an interface
interface CommandProcessor{
void process(Command c);
}
If you have a method per command, then each time you add a new command you need to add a new method, and if you have multiple implementations you need to add that method to each processor. This could be resolved by having some base class, but if the needs diverge you could end up with a very deep class hierarchy as you add new abstraction layers (or you may already be extending another class with the processor). If it is based on a switch over the constant, your default case can handle new commands appropriately by default (exceptions, or whatever may be appropriate).
I have used a pattern similar to this in my code with the addition of a factory. The operations started as a small set, but I knew they would be increasing, so I had a mechanism to describe the command and then a factory that produced CommandProcessors. The factory would generate the appropriate processor and then the single method of that processor would accept the command and perform its processing.
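The original code isn't shown, so the following is only a guess at the general shape, in Python: commands are plain dictionaries with a "kind" key, and ProcessorFactory, PrintProcessor, and SaveProcessor are invented names.

```python
class PrintProcessor:
    def process(self, command):
        return f"printing {command['payload']}"


class SaveProcessor:
    def process(self, command):
        return f"saving {command['payload']}"


class ProcessorFactory:
    """Maps a command's kind to the processor class that handles it."""

    def __init__(self):
        self._registry = {}

    def register(self, kind, processor_class):
        self._registry[kind] = processor_class

    def create(self, command):
        kind = command["kind"]
        if kind not in self._registry:
            raise ValueError(f"no processor registered for {kind!r}")
        return self._registry[kind]()


factory = ProcessorFactory()
factory.register("print", PrintProcessor)
factory.register("save", SaveProcessor)

command = {"kind": "save", "payload": "report"}
result = factory.create(command).process(command)
```

Adding a new command then means registering one more processor class. The single process method on the interface never changes.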
That said, if your list of commands is fairly static and you don't need to worry about how tightly things are coupled, then the one-method-per-command approach certainly lends itself to much more readable code.
I can't see any obvious advantages. Quite the opposite; by splitting the blocks into separate methods, each method will be smaller, easier to read and easier to test.
If needed, you could still have the same "entry point" method, where each case would just branch out and call another method. Whether that would be a good or bad idea is impossible to say without knowing more about specific cases. Either way, I would definitely avoid implementing the code for each case in the RunCommands method.
If RunCommands is only ever invoked with the named constants, then I don't see any advantage in this pattern at all.
The only advantage I see (and it could be a big one) would be that the decision between Method1 and Method2 and the code that actually executes the choice could be entirely unrelated. Of course, that advantage is lost when only constants are ever used to invoke RunCommands.
If the code being run inside each case block is completely separate, there is no value added. However, if there is any common code to be executed before or after the parameter-specific code, this allows it to not be repeated.
It's still not really the best pattern, though. Each separate method could just have calls to helper methods to handle the common code. And if there needs to be another call that doesn't need the common code in front or at the end, the whole model is broken (or you surround that code with an if). At this point, all value is lost.
So, really, the answer is no.
I need to derive an important value given 7 potential inputs. Uncle Bob urges me to avoid functions with that many parameters, so I've extracted the class. All parameters now being properties, I'm left with a calculation method with no arguments.
“That”, I think, “could be a property, but I'm not sure if that's idiomatic C#.”
Should I expose the final result as a property, or as a method with no arguments? Would the average C# programmer find properties confusing or offensive? What about the Alt.Net crowd?
decimal consumption = calculator.GetConsumption(); // obviously derived
decimal consumption = calculator.Consumption; // not so obvious
If the latter: should I declare interim results as [private] properties, also? Thanks to heavy method extraction, I have several interim results. Many of these shouldn't be part of the public API. Some of them could be interesting, though, and my expressions would look cleaner if I could access them as properties:
decimal interim2 = this.ImportantInterimValue * otherval;
Happy Experiment Dept.:
While debugging my code in VS2008, I noticed that I kept hovering my mouse over the method calls that compute interim results, expecting a hover-over with their return value. After turning all methods into properties, I found that exposing interim results as properties greatly assisted debugging. I'm well pleased with that, but have lingering concerns about readability.
The interim value declarations look messier. The expressions, however, are easier to read without the brackets. I no longer feel compelled to start the method name with a verb. To contrast:
// Clean method declaration; compulsive verby name; callers need
// parentheses despite lack of any arguments.
decimal DetermineImportantInterimValue() {
return this.DetermineOtherInterimValue() * this.SomeProperty;
}
// Messier property declaration; clean name; clean access syntax
decimal ImportantInterimValue {
get {
return this.OtherInterimValue * this.SomeProperty;
}
}
I should perhaps explain that I've been coding in Python for a decade. I've been left with a tendency to spend extra time making my code easier to call than to write. I'm not sure the Python community would regard this property-oriented style as acceptably “Pythonic”, however:
def determineImportantInterimValue(self):
    "The usual way of doing it."
    return self.determineOtherInterimValue() * self.someAttribute

importantInterimValue = property(
    lambda self: self.otherInterimValue * self.someAttribute,
    doc = "I'm not sure if this is Pythonic...")
The important question here seems to be this:
Which one produces more legible, maintainable code for you in the long run?
In my personal opinion, isolating the individual calculations as properties has a couple of distinct advantages over a single monolithic method call:
You can see the calculations as they're performed in the debugger, regardless of the class method you're in. This is a boon to productivity while you're debugging the class.
If the calculations are discrete, the properties will execute very quickly, which means (in my opinion) they observe the rules for property design. It's absurd to think that a guideline for design should be treated as a straitjacket. Remember: There is no silver bullet.
If the calculations are marked private or internal, they do not add unnecessary complexity to consumers of the class.
If all of the properties are discrete enough, compiler inlining may resolve the performance issues for you.
Finally, if the final method that returns your final calculation is far and away easier to maintain and understand because you can read it, that is an utterly compelling argument in and of itself.
One of the best things you can do is think for yourself and dare to challenge the preconceived One Size Fits All notions of our peers and predecessors. There are exceptions to every rule. This case may very well be one of them.
Postscript:
I do not believe that we should abandon standard property design in the vast majority of cases. But there are cases where deviating from The Standard(TM) is called for, because it makes sense to do so.
Personally, I would prefer that you expose your public API as a method instead of a property. Properties are supposed to be as 'fast' as possible in C#. More details on this discussion: Properties vs Methods
Internally, GetConsumption can use any number of private properties to arrive at the result; the choice is yours.
I usually go by what the method or property will do. If it is something that is going to take a little time, I'll use a method. If it's very quick or has a very small number of operations going on behind the scenes, I'll make it a property.
I tend to use methods to denote any action on the object, or anything that changes the state of the object. So, in this case I would name the function CalculateConsumption(), which computes the value from other properties.
You say you are deriving a value from seven inputs, you have implemented seven properties, one for each input, and you have a property getter for the result. Some things you might want to consider are:
What happens if the caller fails to set one or more of the seven "input" properties? Does the result still make sense? Will an exception be thrown (e.g. divide by zero)?
In some cases the API may be less discoverable. If I must call a method that takes seven parameters, I know that I must supply all seven parameters to get the result. And if some of the parameters are optional, different overloads of the method make it clear which ones.
In contrast, it may not be so clear that I have to set seven properties before accessing the "result" property, and it could be easy to forget one.
When you have a method with several parameters, you can more easily have richer validation. For example, you could throw an ArgumentException if "parameter A and parameter B are both null".
If you use properties for your inputs, each property will be set independently, so you can't perform the validation when the inputs are being set - only when the result property is being dereferenced, which may be less intuitive.
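The difference in where validation can happen might look like this in a rough Python sketch. The two inputs (`distance_km`, `fuel_litres`) and the formula are assumptions standing in for the seven inputs mentioned above; they are not taken from the original class.

```python
def get_consumption(distance_km=None, fuel_litres=None):
    """Method style: all inputs arrive together, so cross-checks run up front."""
    if distance_km is None and fuel_litres is None:
        raise ValueError("distance_km and fuel_litres are both missing")
    if distance_km is None or fuel_litres is None:
        raise ValueError("both inputs are required")
    if distance_km == 0:
        raise ValueError("distance_km must be non-zero")
    return 100 * fuel_litres / distance_km  # litres per 100 km


class ConsumptionByProperties:
    """Property style: inputs are set one at a time, so checks wait for the read."""

    def __init__(self):
        self.distance_km = None
        self.fuel_litres = None

    @property
    def consumption(self):
        # The earliest point at which a forgotten input can be detected.
        if self.distance_km is None or self.fuel_litres is None:
            raise ValueError("set distance_km and fuel_litres first")
        return 100 * self.fuel_litres / self.distance_km


print(get_consumption(distance_km=200, fuel_litres=14))

trip = ConsumptionByProperties()
trip.distance_km = 200          # fuel_litres was forgotten
try:
    trip.consumption
except ValueError as err:
    print("only detected on read:", err)
```

With the method, a missing input fails at the call site; with properties, the failure surfaces later, when the result is read.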
I was recently watching a webcast about how to create a fluent DSL and I have to admit, I don't understand the reasons why one would use such an approach (at least for the given example).
The webcast presented an image resizing class, that allows you to specify an input-image, resize it and save it to an output-file using the following syntax (using C#):
Sizer sizer = new Sizer();
sizer.FromImage(inputImage)
.ToLocation(outputImage)
.ReduceByPercent(50)
.OutputImageFormat(ImageFormat.Jpeg)
.Save();
I don't understand how this is better than a "conventional" method that takes some parameters:
sizer.ResizeImage(inputImage, outputImage, 0.5, ImageFormat.Jpeg);
From a usability point of view, this seems a lot easier to use, since it clearly tells you what the method expects as input. In contrast, with the fluent interface, nothing stops you from omitting/forgetting a parameter/method-call, for example:
sizer.ToLocation(outputImage).Save();
So on to my questions:
1 - Is there some way to improve the usability of a fluent interface (i.e. tell the user what he is expected to do)?
2 - Is this fluent interface approach just a replacement for the non-existent named method parameters in C#? Would named parameters make fluent interfaces obsolete, e.g. something similar to what Objective-C offers:
sizer.Resize(from:input, to:output, resizeBy:0.5, ..)
3 - Are fluent interfaces over-used simply because they are currently popular?
4 - Or was it just a bad example that was chosen for the webcast? In that case, tell me what the advantages of such an approach are, where does it make sense to use it.
BTW: I know about jQuery, and see how easy it makes things, so I'm not looking for comments about that or other existing examples.
I'm more looking for some (general) comments to help me understand (for example) when to implement a fluent interface (instead of a classical class-library), and what to watch out for when implementing one.
2 - Is this fluent interface approach just a replacement for the non existing named method parameters in C#? Would named parameters make fluent interfaces obsolete, e.g. something similar objective-C offers:
Well yes and no. The fluent interface gives you a larger amount of flexibility. Something that could not be achieved with named params is:
sizer.FromImage(i)
.ReduceByPercent(x)
.Pixalize()
.ReduceByPercent(x)
.OutputImageFormat(ImageFormat.Jpeg)
.ToLocation(o)
.Save();
The FromImage, ToLocation and OutputImageFormat methods in the fluent interface smell a bit to me. Instead I would have done something along these lines, which I think is much clearer.
new Sizer("bob.jpeg")
.ReduceByPercent(x)
.Pixalize()
.ReduceByPercent(x)
.Save("file.jpeg",ImageFormat.Jpeg);
Fluent interfaces have the same problem many programming techniques have: they can be misused, overused or underused. I think that when this technique is used effectively it can create a richer and more concise programming model. Even StringBuilder supports it.
var sb = new StringBuilder();
sb.AppendLine("Hello")
.AppendLine("World");
I would say that fluent interfaces are slightly overdone and I would think that you have picked just one such example.
I find fluent interfaces particularly strong when you are constructing a complex model with them. By model I mean, for example, a complex relationship of instantiated objects. The fluent interface is then a way to guide the developer to correctly construct instances of the semantic model. Such a fluent interface is then an excellent way to separate the mechanics and relationships of a model from the "grammar" that you use to construct the model, essentially shielding details from the end user and reducing the available verbs to maybe just those relevant in a particular scenario.
Your example seems a bit like overkill.
I recently built a small fluent interface on top of the SplitContainer from Windows Forms. Arguably, the semantic model of a hierarchy of controls is somewhat complex to construct correctly. By providing a small fluent API, a developer can now declaratively express how their SplitContainer should work. Usage goes like this:
var s = new SplitBoxSetup();
s.AddVerticalSplit()
.PanelOne().PlaceControl(()=> new Label())
.PanelTwo()
.AddHorizontalSplit()
.PanelOne().PlaceControl(()=> new Label())
.PanelTwo().PlaceControl(()=> new Panel());
form.Controls.Add(s.TopControl);
I have now reduced the complex mechanics of the control hierarchy to a couple of verbs that are relevant for the issue at hand.
Hope this helps
Consider:
sizer.ResizeImage(inputImage, outputImage, 0.5, ImageFormat.Jpeg);
What if you used less clear variable names:
sizer.ResizeImage(i, o, x, ImageFormat.Jpeg);
Imagine you've printed this code out. It's harder to infer what these arguments are, as you don't have access to the method signature.
With the fluent interface, this is clearer:
sizer.FromImage(i)
.ToLocation(o)
.ReduceByPercent(x)
.OutputImageFormat(ImageFormat.Jpeg)
.Save();
Also, the order of methods is not important. This is equivalent:
sizer.FromImage(i)
.ReduceByPercent(x)
.OutputImageFormat(ImageFormat.Jpeg)
.ToLocation(o)
.Save();
In addition, perhaps you might have defaults for the output image format, and the reduction, so this could become:
sizer.FromImage(i)
.ToLocation(o)
.Save();
This would require overloaded constructors to achieve the same effect.
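As a rough Python sketch of the points above (order independence and defaults), the class below only records settings rather than touching image files. The method names mirror the C# example in snake_case; the default values and the check inside `save` are assumptions added for illustration.

```python
class Sizer:
    """Hypothetical stand-in that records settings instead of resizing files."""

    def __init__(self):
        # Assumed defaults: no reduction, JPEG output.
        self._settings = {"percent": 100, "format": "jpeg"}

    def from_image(self, path):
        self._settings["source"] = path
        return self  # returning self is what makes chaining possible

    def to_location(self, path):
        self._settings["target"] = path
        return self

    def reduce_by_percent(self, percent):
        self._settings["percent"] = percent
        return self

    def output_image_format(self, fmt):
        self._settings["format"] = fmt
        return self

    def save(self):
        # A real implementation would resize and write the file here.
        missing = [k for k in ("source", "target") if k not in self._settings]
        if missing:
            raise ValueError(f"missing required settings: {missing}")
        return dict(self._settings)


a = Sizer().from_image("in.png").to_location("out.jpg").reduce_by_percent(50).save()
b = Sizer().reduce_by_percent(50).to_location("out.jpg").from_image("in.png").save()
c = Sizer().from_image("in.png").to_location("out.jpg").save()
print(a == b)        # call order does not matter for these setters
print(c["percent"])  # falls back to the assumed default
```

The check in `save` is one simple way to catch a forgotten call such as a missing `from_image`, although only at run time.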
It's one way to implement things.
For objects that do nothing but manipulate the same item over and over again, there's nothing really wrong with it. Consider C++ Streams: they're the ultimate in this interface. Every operation returns the stream again, so you can chain together another stream operation.
If you're doing LINQ, and doing manipulation of an object over and over, this makes some sense.
However, in your design, you have to be careful. What should the behavior be if you want to deviate halfway through? For example:
var obj1 = original.Shrink(0.50); // obj1 is now 50% of the original
var obj2 = original.Shrink(0.75); // is obj2 now 75% of obj1, or is it 75% of the original?
If obj2 is 75% of the original object, then you're making a full copy of the object every time (which has its advantages in many cases, such as when you want two slightly different instances of the same thing).
If the methods simply manipulate the original object, then this kind of syntax is somewhat disingenuous. Those are manipulations on the object instead of manipulations to create a changed object.
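The copy-returning reading can be sketched in Python with a frozen dataclass. `Picture` and its fields are hypothetical; the point is only that `shrink` returns a new value and leaves the receiver alone, so both questions above have an unambiguous answer.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Picture:
    """Hypothetical value object; `shrink` returns a copy and never mutates."""
    width: int
    height: int

    def shrink(self, factor):
        return replace(self,
                       width=int(self.width * factor),
                       height=int(self.height * factor))


original = Picture(800, 600)
obj1 = original.shrink(0.50)                  # 50% of the original
obj2 = original.shrink(0.75)                  # 75% of the original, not of obj1
chained = original.shrink(0.50).shrink(0.50)  # each step builds on the previous copy

print(original, obj1, obj2, chained)
```

If `shrink` mutated `self` and returned it instead, `obj1`, `obj2` and `original` would all be the same object, which is the ambiguity described above.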
Not all classes work like this, nor does it make sense to do this kind of design. For example, this style of design would have little to no usefulness in the design of a hardware driver or the core of a GUI application. As long as the design involves nothing but manipulating some data, this pattern isn't a bad one.
You should read Domain Driven Design by Eric Evans to get some idea of why a DSL is considered a good design choice.
The book is full of good examples, best-practice advice, and design patterns. Highly recommended.
It's possible to use a variation on a fluent interface to enforce certain combinations of optional parameters (e.g. require that at least one parameter from a group is present, or require that if a certain parameter is specified, some other parameter must be omitted). For example, one could provide functionality similar to Enumerable.Range, but with a syntax like IntRange.From(5).Upto(19) or IntRange.From(5).LessThan(10).StepBy(2) or IntRange.From(3).Count(19).StepBy(17). Compile-time enforcement of overly complex parameter requirements may require defining an annoying number of intermediate-value structures or classes, but the approach can prove useful in simpler cases.
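A minimal Python sketch of the intermediate-type idea follows. The stage classes `_NeedsBound` and `_Bounded` are invented names, and the method names are snake_case approximations of the ones above. Python has no compile-time enforcement, so misuse here fails at run time with an AttributeError (a static type checker may flag it earlier); in C# the same structure would be rejected by the compiler.

```python
class IntRange:
    """Entry point: only `from_` is available here (trailing underscore
    because `from` is a Python keyword)."""

    @staticmethod
    def from_(start):
        return _NeedsBound(start)


class _NeedsBound:
    """Intermediate stage: an end condition must be chosen before iterating."""

    def __init__(self, start):
        self._start = start

    def up_to(self, last):
        return _Bounded(self._start, last + 1)   # inclusive upper bound

    def less_than(self, stop):
        return _Bounded(self._start, stop)       # exclusive upper bound


class _Bounded:
    """Final stage: iterable, with an optional step."""

    def __init__(self, start, stop, step=1):
        self._start, self._stop, self._step = start, stop, step

    def step_by(self, step):
        return _Bounded(self._start, self._stop, step)

    def __iter__(self):
        return iter(range(self._start, self._stop, self._step))


print(list(IntRange.from_(5).less_than(10).step_by(2)))
print(list(IntRange.from_(5).up_to(8)))
```

Because `step_by` only exists on the final stage, a chain such as `IntRange.from_(5).step_by(2)` cannot be written without first choosing a bound.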
Further to @sam-saffron's suggestion regarding the flexibility of a Fluent Interface when adding a new operation:
If we needed to add a new operation, such as Pixalize(), then, in the 'method with multiple parameters' scenario, this would require a new parameter to be added to the method signature. This may then require a modification to every invocation of this method throughout the codebase in order to add a value for this new parameter (unless the language in use would allow an optional parameter).
Hence, one possible benefit of a Fluent Interface is limiting the impact of future change.
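A small Python sketch of that point, under the assumption that each fluent call simply appends an operation to an ordered list. `ImagePipeline` and its methods are hypothetical, and `pixelize` stands in for the Pixalize() operation discussed above.

```python
class ImagePipeline:
    """Hypothetical sketch: each call appends one operation to an ordered list."""

    def __init__(self, source):
        self.source = source
        self.steps = []

    def reduce_by_percent(self, percent):
        self.steps.append(("reduce", percent))
        return self

    # Added later: existing call chains keep working unchanged, because no
    # existing method signature had to change to make room for it.
    def pixelize(self):
        self.steps.append(("pixelize", None))
        return self


# A call site written before pixelize existed still works as-is.
old_call = ImagePipeline("bob.jpeg").reduce_by_percent(50)

# New call sites opt in, and can repeat or reorder operations freely.
new_call = (ImagePipeline("bob.jpeg")
            .reduce_by_percent(50)
            .pixelize()
            .reduce_by_percent(50))

print(old_call.steps)
print(new_call.steps)
```

With a single multi-parameter method, the same addition would mean a new parameter (or a new overload) that every caller has to account for.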