Both in SQL and C#, I've never really liked output parameters. I never passed parameters ByRef in VB6, either. Something about counting on side effects to get something done just bothers me.
I know they're a way around not being able to return multiple results from a function, but a rowset in SQL or a complex datatype in C# and VB work just as well, and seem more self-documenting to me.
Is there something wrong with my thinking, or are there resources from authoritative sources that back me up? What's your personal take on this and why? What can I say to colleagues that want to design with output parameters that might convince them to use different structures?
EDIT: interesting turn- the output parameter I was asking this question about was used in place of a return value. When the return value is "ERROR", the caller is supposed to handle it as an exception. I was doing that but not pleased with the idea. A coworker wasn't informed of the need to handle this condition and as a result, a great deal of money was lost as the procedure failed silently!
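The failure mode described in the edit can be sketched in C#. All names here (`Payments`, `TransferWithStatus`, `Transfer`, the positive-amount rule) are hypothetical and are not the procedure from the question. The sketch only shows that a status value can be ignored without anyone noticing, while an exception cannot.

```csharp
using System;

static class Payments
{
    // Sentinel style: the caller must remember to inspect 'status'.
    public static void TransferWithStatus(decimal amount, out string status)
    {
        status = amount > 0 ? "OK" : "ERROR";
    }

    // Exception style: a failure cannot be skipped by accident.
    public static void Transfer(decimal amount)
    {
        if (amount <= 0)
            throw new ArgumentOutOfRangeException("amount", "Amount must be positive.");
    }
}

static class Program
{
    static void Main()
    {
        string status;
        Payments.TransferWithStatus(-5m, out status);
        // Nothing forces a check of 'status', so the failure can pass silently.
        Console.WriteLine("Sentinel call finished, status was: " + status);

        try
        {
            Payments.Transfer(-5m);
        }
        catch (ArgumentOutOfRangeException ex)
        {
            Console.WriteLine("Exception call failed loudly for parameter: " + ex.ParamName);
        }
    }
}
```

A caller who forgets the `try`/`catch` still finds out about the failure, because the exception propagates instead of disappearing.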
Output parameters can be a code smell indicating that your method is doing too much. If you need to return more than one value, the method is likely doing more than one thing. If the data is tightly related, then it would probably benefit from a class that holds both values.
Of course, this is not ALWAYS the case, but I have found that it is usually the case.
In other words, I think you are right to avoid them.
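As a rough illustration of the "class that holds both values" idea, here is a minimal sketch. The `ValueRange` type and the `Stats.FindRange` methods are made up for this example, and the input array is assumed to be non-empty.

```csharp
using System;

// Hypothetical result type: groups two tightly related values.
sealed class ValueRange
{
    public int Min { get; private set; }
    public int Max { get; private set; }

    public ValueRange(int min, int max)
    {
        Min = min;
        Max = max;
    }
}

static class Stats
{
    // out-parameter version: two results come back through the argument list.
    // Assumes 'values' has at least one element.
    public static void FindRange(int[] values, out int min, out int max)
    {
        min = values[0];
        max = values[0];
        foreach (int v in values)
        {
            if (v < min) min = v;
            if (v > max) max = v;
        }
    }

    // Return-value version: same work, but the result is one named object.
    public static ValueRange FindRange(int[] values)
    {
        int min, max;
        FindRange(values, out min, out max);
        return new ValueRange(min, max);
    }
}

static class Program
{
    static void Main()
    {
        int[] data = { 4, 9, 1, 7 };
        ValueRange range = Stats.FindRange(data);
        Console.WriteLine("{0}..{1}", range.Min, range.Max); // prints 1..9
    }
}
```

The call site of the second overload names what it gets back, which is the self-documenting quality the question is asking about.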
They have their place. The Int32.TryParse method is a good example of an effective use of an out parameter.
string value = "1234"; // sample input, chosen for illustration
int number;
bool result = Int32.TryParse(value, out number);
if (result)
{
    Console.WriteLine("Converted '{0}' to {1}.", value, number);
}
Bob Martin wrote about this in Clean Code. Output params break the fundamental idea of a function.
output = someMethod(input)
I think they're useful for getting IDs of newly-inserted rows in the same SQL command, but I don't think I've used them for much else.
I too see very little use of out/ref parameters, although in SQL it sometimes is easier to pass a value back by a parameter than by a resultset (which would then require the use of a DataReader, etc.)
Though, as luck would have it, I just created one such rare function in C# today. It validated a table-like data structure and returned the number of rows and columns in it (which was tricky to calculate because the table could have rowspans/colspans like in HTML). In this case both values were calculated at the same time. Separating the work into two functions would have doubled the code, memory and CPU time requirements. Creating a custom type just for this one function to return also seems like overkill to me.
So - there are times when they are the best thing, but mostly you can do just fine without them.
The OUTPUT clause in SQL Server 2005 onwards is a great step forward for getting any field values for rows affected by your DML statements. I think that there are a lot of situations where this does away with output parameters.
In VB6, ByRef parameters are good for passing ADO objects around.
Other than those two specific cases that come to mind, I tend to avoid using them.
In SQL only...
Stored procedure output parameters are useful.
Say you need one value back. Do you "create #table, insert... exec, select @var = "? Or use an output parameter?
For client calls, an output parameter is far quicker than processing a recordset.
RETURN values are limited to signed integers.
Easier to re-use (e.g. a security check helper procedure).
When using both: recordsets = data, output parameters = status/messages/rowcount etc.
A stored procedure's recordset output cannot be strongly typed like UDFs or client code.
You can't always use a UDF (e.g. logging during the security check above).
However, as long as you don't generally use the same parameter for input and output, then until SQL changes completely your options are limited. That said, I have one case where I use a parameter for both in and out values, but I have a good reason.
My Two Cents:
I agree that output parameters are a concerning practice. VBA is often maintained by people very new to programming and if someone maintaining your code fails to notice that a parameter is ByRef they could introduce some serious logical errors. Also it does tend to break the Property/Function/Sub paradigm.
Another reason that using out parameters is bad practice is that if you really do need to be returning more than one value, chances are that you should have those values in a data structure such as a class or a User Defined Type.
They can however solve some problems. VB5 (and therefore VBA for Office 97) did not allow for a function to return an array. This meant anything returning or altering an array would have to do so via an "out" parameter. In VB6 this ability has been added, but VB6 still forces array parameters to be by reference (to prevent excessive copying in memory). Now you can return a value from a function that alters an array. But it will be just a hair slow (due to the acrobatics going on behind the scenes); it can also confuse newbies into thinking that the array input will not be altered (which will only be true if someone specifically structured it that way). So I find that if I have a function that alters an array it reduces confusion to just use a sub instead of a function (and it will be a tiny bit faster too).
Another possible scenario would be if you are maintaining code and you want to add an out value without breaking the interface you can add an optional out parameter and be confident you won't be breaking any old code. It's not good practice, but if someone wants something fixed right now and you don't have time to do it the "right way" and restructure everything, this can be a handy addition to your tool box.
However if you are developing things from the ground up and you need to return multiple values you should consider:
1. Breaking up the function.
2. Returning a UDT.
3. Returning a Class.
I generally never use them; I think they are confusing and too easy to abuse. We do occasionally use ref parameters, but that has more to do with passing in structures than with getting them back.
Your opinion sounds reasonable to me.
Another drawback of output parameters is the extra code needed to pass results from one function to another. You have to declare the variable(s), call the function to get their values, and then pass the values to another function. You can't just nest function calls. This makes code read very imperatively, rather than declaratively.
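The difference in composability can be shown with a small sketch. The `MathOps` helpers are hypothetical and exist only to contrast the two call styles.

```csharp
using System;

static class MathOps
{
    // Result delivered through an out parameter.
    public static void SquareOut(int x, out int result) { result = x * x; }

    // Result delivered as a return value.
    public static int Square(int x) { return x * x; }

    public static int AddOne(int x) { return x + 1; }
}

static class Program
{
    static void Main()
    {
        // With an out parameter: declare, call, then pass the value along.
        int squared;
        MathOps.SquareOut(3, out squared);
        int viaOut = MathOps.AddOne(squared);

        // With a return value: the calls nest in a single expression.
        int viaReturn = MathOps.AddOne(MathOps.Square(3));

        Console.WriteLine(viaOut == viaReturn); // prints True
    }
}
```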
C++0x is getting tuples, an anonymous struct-like thing, whose members you access by index. C++ programmers will be able to pack multiple values into one of those and return it. Does C# have something like that? Can it return an array, perhaps, instead? But yeah output parameters are a bit awkward and unclear.
I am a student and I am currently preparing for my OOP Basics Exam.
When the controller has some methods that return a value and others that are void, how do you invoke them without using an if-else statement?
In my code, "status" is the only command that should return a string to be printed on the Console; the others are void. So I put an if-else and two methods in the CommandHandler.
Since I know "if-else" is a code smell, is there a more High Quality approach to deal with the situation?
if (commandName == "status")
{
    this.Writer.WriteLine(this.CommandHandler.ExecuteStatusCommand(commandName));
}
else
{
    this.CommandHandler.ExecuteCommand(commandName, commandParameters);
}
This is the project.
Thank you very much.
First, don't worry about if/else. If anybody tells you if/else is a code smell, put it through the Translator: What comes out is he's telling you he's too crazy, clueless, and/or fanatical to be taken seriously.
If by ill chance you get an instructor who requires you to say the Earth is flat to get an A, sure, tell him the Earth is flat. But if you're planning on a career or even a hobby as a navigator, don't ever forget that it's actually round.
So. It sounds to me like CommandHandler.ExecuteStatusCommand() executes the named command, which is implemented as a method somewhere. If the command method is void, ExecuteStatusCommand() returns null. Otherwise, the command method may return a string, in which case you want to write it to what looks like a stream.
OK, so one approach here is to say "A command is implemented via a method that takes a parameter and returns either null or a string representing a status. If it returns anything but null, write that to the stream".
This is standard stuff: You're defining a "contract". It's not at all inappropriate for command methods which actually return nothing to have a String return type, because they're fulfilling the terms of contract. "Return a string" is an option that's open to all commands; some take advantage, some don't.
This allows knowledge of the command's internals to be limited to the command method itself, which is a huge advantage. You don't need to worry about special cases at the point where you call the methods. The code below doesn't need to know which commands return a status and which don't. The commands themselves are given a means to communicate that information back to the caller, so only they need to know. It's incredibly beneficial to have a design which allows different parts of your code not to care about the details of other parts. Clean "interfaces" like this make that possible. The calling code gets simpler and stays simpler. Less code, with less need to change it over time, means less effort and fewer bugs.
As you noted, if you've got a "status" command that prints a result, and then later on you add a "print" command that also prints a result, you've got to not only implement the print command itself, but you've also got to remember to return to this part of your code and add a special case branch to the if/else.
That kind of tedious error-prone PITA is exactly the kind of nonsense OOP is meant to eliminate. If a new feature can be added without making a single edit to existing code, that's a sort of Platonic ideal of OOP.
So if ExecuteCommand() returns void, we'll want to be calling ExecuteStatusCommand() instead. I'm guessing at some things here. It would have been helpful if you had sketched out the semantics of those two methods.
var result = this.CommandHandler.ExecuteCommand(commandName, commandParameters);
if (result != null)
{
    this.Writer.WriteLine(result);
}
If my assumptions about your design are accurate, that's the whole deal. commandParameters, like the status result, are an optional part of the contract. There's nothing inherently wrong with if/else, but sometimes you don't need one.
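To make the contract concrete, here is one possible sketch. It is not the asker's actual `CommandHandler`; the `Commands` class, its dictionary, and the "increment" command are assumptions made for illustration. Every command has the same shape (parameters in, string or null out), so the caller needs no special cases.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical command table illustrating the contract described above.
static class Commands
{
    static int counter;

    // Every command takes its parameters and returns a status string,
    // or null when it has nothing to report.
    static readonly Dictionary<string, Func<string[], string>> Table =
        new Dictionary<string, Func<string[], string>>
        {
            { "increment", args => { counter++; return null; } },
            { "status", args => "counter = " + counter }
        };

    public static string Execute(string name, string[] parameters)
    {
        return Table[name](parameters);
    }
}

static class Program
{
    static void Main()
    {
        foreach (string name in new[] { "increment", "increment", "status" })
        {
            string result = Commands.Execute(name, new string[0]);
            if (result != null)
            {
                Console.WriteLine(result); // only "status" prints: counter = 2
            }
        }
    }
}
```

Adding a "print" command under this design means adding one entry to the table; the loop that calls `Execute` does not change.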
So I didn't find any elegant solution for this, either by googling or by searching Stack Overflow. I guess that I have a very specific situation on my hands; anyway, here it goes:
I have an object structure that I don't control, because I receive it from an external WS. It is quite a large object, with various levels of fields and properties, and these fields and properties may or may not be null at any level. You can think of this object as an anemic model: it has no behaviour, just state.
For the purpose of this question, I'll give you a simplified sample that simulates my situation:
Class A
    PropB1
        PropC11
            PropLeaf111
        PropC12
            PropLeaf112
    PropB2
        PropC21
            PropLeaf211
        PropC22
            PropLeaf221
So, throughout my code I have to access a number of these properties, in different levels, to do some math in order to calculate what I need. Basically for each type of calculation that I have to do, I have to test each level of the properties that I need, to check if it's not null, in which case I would return (decimal) 0, or any other default value depending on the business logic.
Sample of a math that I have to do with it:
decimal value = 0;
if (objClassA.PropB1 != null && objClassA.PropB1.PropC11 != null) {
    var leaf = objClassA.PropB1.PropC11.PropLeaf111;
    value = leaf.HasValue ? leaf.Value : value;
}
Just to be very clear, the leaf properties of this structure are always primitives or nullable primitives, in which case I give them the proper treatment. This is "the logic" that I have to apply for each property that I need, and sometimes I have to use quite a few of them. The real structure is also much bigger, so the number of verifications needed for each property is bigger too.
Now, I came up with some ideas, none of them I think is ideal:
Create methods to gather the properties, which would abstract any necessary verification and the logic to get default values. The drawback is that it would produce, in my opinion, quite a lot of duplicated code, since the verifications and the default values would be similar for some groups of fields.
Create a single generic method that receives an object and a lambda function that accesses the required field. This method would try to execute the function and return its result, and in case of a NullReferenceException it would return a default value. The bright side of this one is that it is really generic: I just have to pass lambdas to access the properties, and the method would handle any problem. The drawback is that I would be using try/catch to control logic, which is not its purpose, and the code might look confusing to other programmers who eventually have to maintain it.
Null Object Pattern: this would be the most elegant solution, I guess. It would have all the good points if this were a normal case. But the problem is the impact of providing Null Objects for this structure. To give a bit more context, the software that I am working on integrates with government services, and the structure that I am working with, which is in the government's specifications, has some fields where null has a meaning that is different from a default value like "0". Also, this specification changes from time to time and the classes are generated again, so the post-processing that I would have to do to create Null Objects would also need maintenance, which seems a bit dangerous to me.
I hope that I made myself clear enough.
Thanks in advance.
Solution
This is a response as to how I solved my problem, based on the accepted answer.
I'm quite new to C#, and the kind of discussion that was linked really helped me come up with an elegant solution in many respects. I still have the problem that, depending on where the code is executed, it uses .NET 2.0, but I also found a solution for this, where I can more or less define extension methods: https://stackoverflow.com/a/707160/649790
And for the solution itself, I found this one the best:
http://www.codeproject.com/Articles/109026/Chained-null-checks-and-the-Maybe-monad
I can basically access the properties this way, and just do the math:
objClassA.With(o => o.PropB1).With(o => o.PropC11).Return(o => o.PropLeaf111, 0);
For each property that I need. It still isn't just:
objClassA.PropB1.PropC11.PropLeaf111
of course, but it is far better than any solution that I had found so far. Since I was unfamiliar with Extension Methods, I really learned a lot.
Thanks again.
There is a strategy for dealing with this, involving the "Maybe" Monad.
Basically it works by providing a "fluent" interface where the chain of properties is interrupted by a null somewhere along the chain.
See here for an example: http://smellegantcode.wordpress.com/2008/12/11/the-maybe-monad-in-c/
And also here:
http://www.codeproject.com/Articles/109026/Chained-null-checks-and-the-Maybe-monad
http://mikehadlow.blogspot.co.uk/2011/01/monads-in-c-5-maybe.html
It's related to but not quite the same as what you seem to need; however, perhaps it can be adapted to your needs. The concepts are fairly fundamental.
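For readers who do not want to follow the links, here is a minimal sketch of what such chaining helpers could look like. It is written from the general idea, not copied from the linked articles, so the exact signatures of `With` and `Return` are assumptions, and the classes `A`, `B1` and `C11` are hypothetical stand-ins for the generated classes in the question.

```csharp
using System;

static class MaybeExtensions
{
    // Follows one step of the chain; a null input short-circuits to null.
    public static TResult With<TInput, TResult>(this TInput input, Func<TInput, TResult> selector)
        where TInput : class
        where TResult : class
    {
        return input == null ? null : selector(input);
    }

    // Ends the chain: evaluates the leaf, or yields the default if the chain broke.
    public static TResult Return<TInput, TResult>(this TInput input, Func<TInput, TResult> selector, TResult defaultValue)
        where TInput : class
    {
        return input == null ? defaultValue : selector(input);
    }
}

// Hypothetical stand-ins for the generated classes.
class C11 { public decimal? PropLeaf111; }
class B1 { public C11 PropC11; }
class A { public B1 PropB1; }

static class Program
{
    static void Main()
    {
        var full = new A { PropB1 = new B1 { PropC11 = new C11 { PropLeaf111 = 12.5m } } };
        var broken = new A(); // PropB1 is null, so the chain stops early

        decimal fromFull = full.With(o => o.PropB1).With(o => o.PropC11).Return(o => o.PropLeaf111 ?? 0m, 0m);
        decimal fromBroken = broken.With(o => o.PropB1).With(o => o.PropC11).Return(o => o.PropLeaf111 ?? 0m, 0m);

        Console.WriteLine(fromFull == 12.5m);  // prints True
        Console.WriteLine(fromBroken == 0m);   // prints True
    }
}
```

Because the default is supplied per call, fields where null carries its own meaning can be handled differently from fields where 0 is a safe fallback. On compilers that support C# 6 or later, the null-conditional operator expresses the same chain directly, as in `objClassA?.PropB1?.PropC11?.PropLeaf111 ?? 0m`.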
When writing an API or reusable object, is there any technical reason why all method calls that return 'void' shouldn't just return 'this' (*this in C++)?
For example, using the string class, we can do this kind of thing:
string input= ...;
string.Join("-", input.TrimStart().TrimEnd().Split('|'));
but we can't do this:
string.Join("-", input.TrimStart().TrimEnd().Split('|').Reverse());
..because Array.Reverse() returns void.
There are many other examples where an API has lots of void-returning operations, so code ends up looking like:
api.Method1();
api.Method2();
api.Method3();
..but it would be perfectly possible to write:
api.Method1().Method2().Method3()
..if the API designer had allowed this.
Is there a technical reason for following this route? Or is it just a style thing, to indicate mutability/new object?
(x-ref Stylistic question concerning returning void)
EPILOGUE
I've accepted Luvieere's answer as I think this best represents the intention/design, but it seems there are popular API examples out there that deviate from this :
In C++ cout << setprecision(..) << number << setw(..) << othernumber; seems to alter the cout object in order to modify the next datum inserted.
In .NET, Stack.Pop() and Queue.Dequeue() both return an item but change the collection too.
Props to ChrisW and others for getting detailed on the actual performance costs.
Methods that return void state more clearly that they have side effects. The ones that return the modified result are supposed to have no side effects, including modifying the original input. Making a method return void implies that it changes its input or some other internal state of the API.
If you had Reverse() return a string, then it wouldn't be obvious to a user of the API whether it returned a new string or the same-one, reversed in-place.
string my_string = "hello";
string your_string = my_string.reverse(); // is my_string reversed or not?
That is why, for instance, in Python, list.sort() returns None; it distinguishes the in-place sort from sorted(my_list).
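.NET draws the same line. As a small sketch: `Array.Reverse` returns void and mutates the array, while `Enumerable.Reverse` returns a new sequence and leaves the source alone. The sample array is made up for the example.

```csharp
using System;
using System.Linq;

static class Program
{
    static void Main()
    {
        string[] parts = { "a", "b", "c" };

        // Query: Enumerable.Reverse returns a new sequence; 'parts' is untouched.
        string reversedCopy = string.Join("-", Enumerable.Reverse(parts));
        Console.WriteLine(reversedCopy);            // prints c-b-a
        Console.WriteLine(string.Join("-", parts)); // prints a-b-c

        // Command: Array.Reverse returns void and reverses 'parts' in place.
        Array.Reverse(parts);
        Console.WriteLine(string.Join("-", parts)); // prints c-b-a
    }
}
```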
Is there a technical reason for following this route?
One of the C++ design guidelines is "don't pay for features you don't use"; returning this would have some (slight) performance penalty, for a feature which many people (I, for one) wouldn't be inclined to make use of.
The technical principle that many others have mentioned (that void emphasizes the fact the function has a side-effect) is known as Command-Query Separation.
While there are pros and cons to this principle, e.g., (subjectively) clearer intent vs. more concise API, the most important part is to be consistent.
I'd imagine one reason might be simplicity. Quite simply, an API should generally be as minimal as possible. It should be clear with every aspect of it, what it is for.
If I see a function that returns void, I know that the return type is not important. Whatever the function does, it doesn't return anything for me to work with.
If a function returns something non-void, I have to stop and wonder why. What is this object that might be returned? Why is it returned? Can I assume that this is always returned, or will it sometimes be null? Or an entirely different object? And so on.
In a third-party API, I'd prefer that those kinds of questions never arise.
If the function doesn't need to return anything, it shouldn't return anything.
If you intend your API to be called from F#, please do return void unless you're convinced that this particular method call is going to be chained with another nearly every time it's used.
If you don't care about making your API easy to use from F#, you can stop reading here.
F# is more strict than C# in certain areas - it wants you to be explicit about whether you're calling a method in order to get a value, or purely for its side-effects. As a result, calling a method for its side-effects when that method also returns a value becomes awkward, because the returned value has to be explicitly ignored in order to avoid compiler errors. This makes "fluent interfaces" somewhat awkward to use from F#, which has its own superior syntax for chaining a series of calls together.
For example, suppose we have a logging system with a Log method that returns this to allow for some sort of method chaining:
let add x y =
    Logger.Log(String.Format("Adding {0} and {1}", x, y)) // #1
    x + y // #2
In F#, because line #1 is a method call that returns a value, and we're not doing anything with that value, the add function is considered to take two values and return that Logger instance. However, line #2 not only also returns a value, it appears after what F# considers to be the "return" statement for the function, effectively making two "return" statements. This will cause a compiler error, and we need to explicitly ignore the return value of the Log method to avoid this error, so that our add method has only a single return statement.
let add x y =
    Logger.Log(String.Format("Adding {0} and {1}", x, y)) |> ignore
    x + y
As you might guess, making lots of "Fluent API" calls that are mainly about side-effects becomes a somewhat frustrating exercise in sprinkling lots of ignore statements all over the place.
You can, of course, have the best of both worlds and please both C# and F# developers by providing both a fluent API and an F# module for working with your code. But if you're not going to do that, and you intend your API for public consumption, please think twice before returning this from every single method.
Besides the design reasons, there is also a slight performance cost (both in speed and space) for returning this.
The example below may not be problematic as is, but it should be enough to illustrate a point. Imagine that there is a lot more work than trimming going on.
public string Thingy
{
    set
    {
        // I guess we can throw a null reference exception here on null.
        value = value.Trim(); // Well, imagine that there is so much processing to do
        this.thingy = value; // That this.thingy = value.Trim() would not fit on one line
        ...
So, if the assignment has to take two lines, then I either have to reuse (some would say abuse) the parameter or create a temporary variable. I am not a big fan of temporary variables. On the other hand, I am not a fan of convoluted code. I did not include an example where a function is involved, but I am sure you can imagine it. One concern I have is what happens if a function accepted a string, the parameter was "abused", and then someone changed the signature to ref in both places. That ought to mess things up, but who would knowingly make such a change if it already worked without a ref? It seems like it would be their responsibility in that case. If I mess with the value of value, am I doing something non-trivial under the hood? If you think that both approaches are acceptable, then which do you prefer and why?
Thanks.
Edit: Here is what I mean when I say I am not a fan of temp variables. I do not like code like this:
string userName = userBox.Text;
if (userName.Length < 5) {
    MessageBox.Show("The user name " + userName + " that you entered is too short.");
    ....
Again, this may not be the best way to communicate a problem to the user, but it is just an illustration. The variable userName is unnecessary in my strong opinion in this case. I am not always against temporary variables, but when their use is very limited and they do not save that much typing, I strongly prefer not to use them.
First off, it's not a big deal.
But I would introduce a temp variable here. It costs nothing and is less prone to errors. Imagine someone has to maintain the code later. Better if value only has 1 meaning and purpose.
And don't call it temp, call it cleanedValue or something.
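A minimal sketch of that suggestion follows. The `Widget` class is hypothetical, and a single `Trim()` stands in for the longer processing the question asks us to imagine.

```csharp
using System;

class Widget
{
    private string thingy;

    public string Thingy
    {
        get { return thingy; }
        set
        {
            // 'value' keeps its single meaning (what the caller passed in);
            // the processed form gets its own descriptive name.
            string cleanedValue = value.Trim();
            thingy = cleanedValue;
        }
    }
}

static class Program
{
    static void Main()
    {
        var widget = new Widget();
        widget.Thingy = "  hello  ";
        Console.WriteLine("[" + widget.Thingy + "]"); // prints [hello]
    }
}
```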
It is a good practice not to change the values of incoming parameters, even if you technically can. Don't touch the value.
I am not a big fan of temporary variables.
Well, programming is largely about creating temporary variables all over the place, reading and assigning values. You'd better start to love them. :)
One more remark regarding properties. Although you could technically put a lot of logic there, it is recommended to keep properties simple and to avoid code that could throw exceptions. A need to call other functions may indicate that this property would be better as a method, or that some initialization code is needed somewhere. Just rethink what you're doing and whether it really looks like a property.
I need to derive an important value given 7 potential inputs. Uncle Bob urges me to avoid functions with that many parameters, so I've extracted the class. All parameters now being properties, I'm left with a calculation method with no arguments.
“That”, I think, “could be a property, but I'm not sure if that's idiomatic C#.”
Should I expose the final result as a property, or as a method with no arguments? Would the average C# programmer find properties confusing or offensive? What about the Alt.Net crowd?
decimal consumption = calculator.GetConsumption(); // obviously derived
decimal consumption = calculator.Consumption; // not so obvious
If the latter: should I declare interim results as [private] properties, also? Thanks to heavy method extraction, I have several interim results. Many of these shouldn't be part of the public API. Some of them could be interesting, though, and my expressions would look cleaner if I could access them as properties:
decimal interim2 = this.ImportantInterimValue * otherval;
Happy Experiment Dept.:
While debugging my code in VS2008, I noticed that I kept hovering my mouse over the method calls that compute interim results, expecting a hover-over with their return value. After turning all methods into properties, I found that exposing interim results as properties greatly assisted debugging. I'm well pleased with that, but have lingering concerns about readability.
The interim value declarations look messier. The expressions, however, are easier to read without the brackets. I no longer feel compelled to start the method name with a verb. To contrast:
// Clean method declaration; compulsive verby name; callers need
// parentheses despite lack of any arguments.
decimal DetermineImportantInterimValue() {
    return this.DetermineOtherInterimValue() * this.SomeProperty;
}
// Messier property declaration; clean name; clean access syntax
decimal ImportantInterimValue {
    get {
        return this.OtherInterimValue * this.SomeProperty;
    }
}
I should perhaps explain that I've been coding in Python for a decade. I've been left with a tendency to spend extra time making my code easier to call than to write. I'm not sure the Python community would regard this property-oriented style as acceptably “Pythonic”, however:
def determineImportantInterimValue(self):
    "The usual way of doing it."
    return self.determineOtherInterimValue() * self.someAttribute
importantInterimValue = property(
    lambda self: self.otherInterimValue * self.someAttribute,
    doc = "I'm not sure if this is Pythonic...")
The important question here seems to be this:
Which one produces more legible, maintainable code for you in the long run?
In my personal opinion, isolating the individual calculations as properties has a couple of distinct advantages over a single monolithic method call:
You can see the calculations as they're performed in the debugger, regardless of the class method you're in. This is a boon to productivity while you're debugging the class.
If the calculations are discrete, the properties will execute very quickly, which means (in my opinion) they observe the rules for property design. It's absurd to think that a guideline for design should be treated as a straitjacket. Remember: There is no silver bullet.
If the calculations are marked private or internal, they do not add unnecessary complexity to consumers of the class.
If all of the properties are discrete enough, compiler inlining may resolve the performance issues for you.
Finally, if the final method that returns your final calculation is far and away easier to maintain and understand because you can read it, that is an utterly compelling argument in and of itself.
One of the best things you can do is think for yourself and dare to challenge the preconceived One Size Fits All notions of our peers and predecessors. There are exceptions to every rule. This case may very well be one of them.
Postscript:
I do not believe that we should abandon standard property design in the vast majority of cases. But there are cases where deviating from The Standard(TM) is called for, because it makes sense to do so.
Personally, I would prefer that you expose your public API as a method instead of a property. Properties are supposed to be as 'fast' as possible in C#. More details on this discussion: Properties vs Methods
Internally, GetConsumption can use any number of private properties to arrive at the result, choice is yours.
I usually go by what the method or property will do. If it is something that is going to take a little time, I'll use a method. If it's very quick or has a very small number of operations going on behind the scenes, I'll make it a property.
I tend to use methods to denote any action on the object, or anything that changes the state of an object. So, in this case I would name the function CalculateConsumption(), which computes the value from other properties.
You say you are deriving a value from seven inputs, you have implemented seven properties, one for each input, and you have a property getter for the result. Some things you might want to consider are:
What happens if the caller fails to set one or more of the seven "input" properties? Does the result still make sense? Will an exception be thrown (e.g. divide by zero)?
In some cases the API may be less discoverable. If I must call a method that takes seven parameters, I know that I must supply all seven parameters to get the result. And if some of the parameters are optional, different overloads of the method make it clear which ones.
In contrast, it may not be so clear that I have to set seven properties before accessing the "result" property, and could be easy to forget one.
When you have a method with several parameters, you can more easily have richer validation. For example, you could throw an ArgumentException if "parameter A and parameter B are both null".
If you use properties for your inputs, each property will be set independently, so you can't perform the validation when the inputs are being set - only when the result property is being dereferenced, which may be less intuitive.
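The point about richer validation can be sketched as follows. The `Pricing.TotalCost` method, its parameters and its formula are invented for the example and have nothing to do with the asker's consumption calculation. What matters is that a rule spanning two arguments is checked at the moment of the call.

```csharp
using System;

static class Pricing
{
    // Hypothetical calculation: either a fixed price or an hourly rate must be supplied.
    public static decimal TotalCost(decimal? fixedPrice, decimal? hourlyRate, decimal hours)
    {
        // Cross-parameter rule, checked as soon as the method is called.
        if (fixedPrice == null && hourlyRate == null)
            throw new ArgumentException("fixedPrice and hourlyRate cannot both be null.");

        // A fixed price wins when present; otherwise bill by the hour.
        return fixedPrice ?? hourlyRate.Value * hours;
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(Pricing.TotalCost(null, 20m, 3m) == 60m); // prints True

        try
        {
            Pricing.TotalCost(null, null, 3m);
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine("Rejected: " + ex.Message);
        }
    }
}
```

With independent input properties, the same rule could only be enforced when the result is read, which is the less intuitive behaviour the answer warns about.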