When writing an API or reusable object, is there any technical reason why all method calls that return 'void' shouldn't just return 'this' (*this in C++)?
For example, using the string class, we can do this kind of thing:
string input= ...;
string.Join("-", input.TrimStart().TrimEnd().Split('|'));
but we can't do this:
string.Join("-", input.TrimStart().TrimEnd().Split('|').Reverse());
..because Array.Reverse() returns void.
There are many other examples where an API has lots of void-returning operations, so code ends up looking like:
api.Method1();
api.Method2();
api.Method3();
..but it would be perfectly possible to write:
api.Method1().Method2().Method3()
..if the API designer had allowed this.
Is there a technical reason for following this route? Or is it just a style thing, to indicate mutability/new object?
(x-ref Stylistic question concerning returning void)
EPILOGUE
I've accepted Luvieere's answer as I think this best represents the intention/design, but it seems there are popular API examples out there that deviate from this :
In C++, cout << setprecision(..) << number << setw(..) << othernumber; seems to alter the cout object in order to modify the next datum inserted.
In .NET, Stack.Pop() and Queue.Dequeue() both return an item but change the collection too.
Props to ChrisW and others for getting detailed on the actual performance costs.
Methods that return void state more clearly that they have side effects. The ones that return the modified result are supposed to have no side effects, including modifying the original input. Making a method return void implies that it changes its input or some other internal state of the API.
If you had Reverse() return a string, then it wouldn't be obvious to a user of the API whether it returned a new string or the same one, reversed in place.
string my_string = "hello";
string your_string = my_string.reverse(); // is my_string reversed or not?
That is why, for instance, in Python, list.sort() returns None; it distinguishes the in-place sort from sorted(my_list).
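The Python convention mentioned above is easy to check. A minimal sketch (the variable names are only illustrative):

```python
numbers = [3, 1, 2]

# sorted() is a query: it builds a new list and leaves its input alone.
ordered_copy = sorted(numbers)
input_untouched = numbers == [3, 1, 2]

# list.sort() is a command: it reorders the list in place and returns None,
# so it cannot be mistaken for a call that produced a new list.
result_of_sort = numbers.sort()
```

Because sort() returns None, an accidental `numbers = numbers.sort()` fails quickly instead of silently aliasing the original list.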
Is there a technical reason for following this route?
One of the C++ design guidelines is "don't pay for features you don't use"; returning this would have some (slight) performance penalty, for a feature which many people (I, for one) wouldn't be inclined to make use of.
The technical principle that many others have mentioned (that void emphasizes the fact that the function has a side effect) is known as Command-Query Separation.
While there are pros and cons to this principle, e.g., (subjectively) clearer intent vs. more concise API, the most important part is to be consistent.
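To make the trade-off concrete, here is a small sketch of both styles, written in Python for brevity. Counter and FluentCounter are made-up classes, not taken from any library:

```python
class Counter:
    """Follows Command-Query Separation."""

    def __init__(self) -> None:
        self._count = 0

    def increment(self) -> None:
        # Command: changes state and returns nothing.
        self._count += 1

    def value(self) -> int:
        # Query: returns a value and changes nothing.
        return self._count


class FluentCounter:
    """Same behaviour, but commands return self so calls can be chained."""

    def __init__(self) -> None:
        self._count = 0

    def increment(self) -> "FluentCounter":
        # Still a side effect; the return value is only there for chaining.
        self._count += 1
        return self

    def value(self) -> int:
        return self._count
```

With the first class the call site reads `c.increment(); c.increment(); c.value()`, with the second `f.increment().increment().value()`. Either works; mixing the two conventions inside one API is what tends to confuse callers.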
I'd imagine one reason might be simplicity. Quite simply, an API should generally be as minimal as possible. It should be clear with every aspect of it, what it is for.
If I see a function that returns void, I know that the return type is not important. Whatever the function does, it doesn't return anything for me to work with.
If a function returns something non-void, I have to stop and wonder why. What is this object that might be returned? Why is it returned? Can I assume that this is always returned, or will it sometimes be null? Or an entirely different object? And so on.
In a third-party API, I'd prefer that those kinds of questions never arise.
If the function doesn't need to return anything, it shouldn't return anything.
If you intend your API to be called from F#, please do return void unless you're convinced that this particular method call is going to be chained with another nearly every time it's used.
If you don't care about making your API easy to use from F#, you can stop reading here.
F# is more strict than C# in certain areas - it wants you to be explicit about whether you're calling a method in order to get a value, or purely for its side-effects. As a result, calling a method for its side-effects when that method also returns a value becomes awkward, because the returned value has to be explicitly ignored in order to avoid compiler errors. This makes "fluent interfaces" somewhat awkward to use from F#, which has its own superior syntax for chaining a series of calls together.
For example, suppose we have a logging system with a Log method that returns this to allow for some sort of method chaining:
let add x y =
    Logger.Log(String.Format("Adding {0} and {1}", x, y)) // #1
    x + y // #2
In F#, every expression in a function body except the last one is expected to have type unit, and the last expression is the function's result. Line #1 is a method call that returns the Logger instance, and we're not doing anything with that value, so the compiler complains that the result is being implicitly discarded (warning FS0020, which is often promoted to an error). Line #2 is the actual result of the add function. To get rid of the diagnostic, we need to explicitly ignore the return value of the Log method:
let add x y =
    Logger.Log(String.Format("Adding {0} and {1}", x, y)) |> ignore
    x + y
As you might guess, making lots of "Fluent API" calls that are mainly about side-effects becomes a somewhat frustrating exercise in sprinkling lots of ignore statements all over the place.
You can, of course, have the best of both worlds and please both C# and F# developers by providing both a fluent API and an F# module for working with your code. But if you're not going to do that, and you intend your API for public consumption, please think twice before returning this from every single method.
Besides the design reasons, there is also a slight performance cost (both in speed and space) for returning this.
Oftentimes a developer on my team writes code in a loop that makes a relatively slow call (e.g. database access, a web service call, or some other slow method). This is a super common mistake.
Yes, we practice code reviews, and we try to catch these and fix them before merging. However, failing early is better, right?
So is there a way to catch this mistake via the compiler?
Example:
Imagine this method
public ReturnObject SlowMethod(Something thing)
{
    // method work
}
Below the method is called in a loop, which is a mistake.
public ReturnObject Call(IEnumerable<Something> things)
{
    foreach (var thing in things)
        SlowMethod(thing); // Should throw compiler error or warning in a loop
}
Is there any way to decorate the above SlowMethod() with an attribute or compiler statement so that it would complain if used in a loop?
No, there is nothing in regular C# to prevent a method being used in a loop.
Your options:
discourage usage in a loop by providing easier-to-use alternatives. Providing a second (or the only) method that deals with collections will likely discourage callers from writing calls in a loop enough that it is no longer a major concern.
try to write your own code analysis rule (starting tutorial - https://learn.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/tutorials/how-to-write-csharp-analyzer-code-fix)
add run-time protection to the method if it is called more often than you'd like.
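The last option, run-time protection, could look roughly like this. It is a sketch in Python rather than C#, and warn_if_called_in_bursts, max_calls and per_seconds are invented names. It only warns and is not thread-safe:

```python
import functools
import time
import warnings


def warn_if_called_in_bursts(max_calls: int, per_seconds: float):
    """Warn when the wrapped function is called more than max_calls
    times within a sliding window of per_seconds."""

    def decorator(func):
        recent_calls = []  # timestamps of calls inside the current window

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            # Drop timestamps that have fallen out of the window.
            recent_calls[:] = [t for t in recent_calls if now - t < per_seconds]
            recent_calls.append(now)
            if len(recent_calls) > max_calls:
                warnings.warn(
                    f"{func.__name__} called {len(recent_calls)} times in "
                    f"{per_seconds}s; consider a batch call instead",
                    RuntimeWarning,
                    stacklevel=2,
                )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@warn_if_called_in_bursts(max_calls=3, per_seconds=1.0)
def slow_method(thing):
    # Stand-in for a database or web-service call.
    return thing * 2
```

A guard like this catches the problem only when the code actually runs, so it complements a code review or an analyzer rather than replacing it.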
Obviously it makes sense to invoke those slow methods in a loop - you're trying to put work into preventing that, but that's putting work into something fundamentally negative. Why not do something positive instead? Obviously, you've provided an API that's convenient to use in a loop. So, provide some alternatives that are easier to use correctly where formerly an incorrect use in a loop would take place, like:
an iterable-based API that would make the loop implicit, to remove some of the latency since you'd have a full view of what will be iterated, and can hide the latency appropriately,
an async API that won't block the thread, with example code showing how to use it in the typical situations you've encountered thus far; remember that an API that's too hard to use correctly won't get used!
a lowest-common-denominator API: split the methods into a requester and a result provider, so that there'd naturally be two loops: one to submit all the requests, another to collect and process the results (I dislike this approach, since it doesn't make the code any nicer)
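As a rough illustration of the first alternative (a collection-based call that makes the loop implicit), here is a Python sketch. BatchClient and its in-memory rows are hypothetical stand-ins for a real database or service client:

```python
from typing import Dict, Iterable, List


class BatchClient:
    """Toy client whose primary entry point takes a whole collection."""

    def __init__(self, rows: Dict[int, str]) -> None:
        self._rows = dict(rows)
        self.round_trips = 0  # simulated round trips, for illustration only

    def fetch_many(self, keys: Iterable[int]) -> List[str]:
        # One simulated round trip, no matter how many keys are requested.
        self.round_trips += 1
        return [self._rows[key] for key in keys]

    def fetch_one(self, key: int) -> str:
        # Convenience wrapper; calling it in a loop costs one trip per key.
        return self.fetch_many([key])[0]
```

If fetch_many is the obvious, well-documented entry point, most callers will hand it the whole collection rather than loop over fetch_one.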
I am a student and I am currently preparing for my OOP Basics Exam.
When the controller has some methods that return a value and others that are void, how do you invoke them without using an if-else statement?
In my code "status" is the only one which should return a string to be printed on the Console - the others are void. So I put an if-else and 2 methods in the CommandHandler.
Since I know "if-else" is a code smell, is there a higher-quality approach to deal with the situation?
if (commandName == "status")
{
    this.Writer.WriteLine(this.CommandHandler.ExecuteStatusCommand(commandName));
}
else
{
    this.CommandHandler.ExecuteCommand(commandName, commandParameters);
}
This is the project.
Thank you very much.
First, don't worry about if/else. If anybody tells you if/else is a code smell, put it through the Translator: What comes out is he's telling you he's too crazy, clueless, and/or fanatical to be taken seriously.
If by ill chance you get an instructor who requires you to say the Earth is flat to get an A, sure, tell him the Earth is flat. But if you're planning on a career or even a hobby as a navigator, don't ever forget that it's actually round.
So. It sounds to me like CommandHandler.ExecuteStatusCommand() executes the named command, which is implemented as a method somewhere. If the command method is void, ExecuteStatusCommand() returns null. Otherwise, the command method may return a string, in which case you want to write it to what looks like a stream.
OK, so one approach here is to say "A command is implemented via a method that takes a parameter and returns either null or a string representing a status. If it returns anything but null, write that to the stream".
This is standard stuff: You're defining a "contract". It's not at all inappropriate for command methods which actually return nothing to have a String return type, because they're fulfilling the terms of the contract. "Return a string" is an option that's open to all commands; some take advantage, some don't.
This allows knowledge of the command's internals to be limited to the command method itself, which is a huge advantage. You don't need to worry about special cases at the point where you call the methods. The code below doesn't need to know which commands return a status and which don't. The commands themselves are given a means to communicate that information back to the caller, so only they need to know. It's incredibly beneficial to have a design which allows different parts of your code not to care about the details of other parts. Clean "interfaces" like this make that possible. The calling code gets simpler and stays simpler. Less code, with less need to change it over time, means less effort and fewer bugs.
As you noted, if you've got a "status" command that prints a result, and then later on you add a "print" command that also prints a result, you've got to not only implement the print command itself, but you've also got to remember to return to this part of your code and add a special case branch to the if/else.
That kind of tedious error-prone PITA is exactly the kind of nonsense OOP is meant to eliminate. If a new feature can be added without making a single edit to existing code, that's a sort of Platonic ideal of OOP.
So if ExecuteCommand() returns void, we'll want to be calling ExecuteStatusCommand() instead. I'm guessing at some things here. It would have been helpful if you had sketched out the semantics of those two methods.
var result = this.CommandHandler.ExecuteCommand(commandName, commandParameters);
if (result != null)
{
    this.Writer.WriteLine(result);
}
If my assumptions about your design are accurate, that's the whole deal. commandParameters, like the status result, are an optional part of the contract. There's nothing inherently wrong with if/else, but sometimes you don't need one.
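The same contract can be sketched end to end. This is Python rather than C#, and every name in it (Command, make_dispatcher, add_command, status_command) is invented for the illustration, since the real project's classes aren't shown:

```python
from typing import Callable, Dict, List, Optional

# Assumed contract: a command takes a list of parameters and returns
# either a status string to print or None.
Command = Callable[[List[str]], Optional[str]]

items: List[str] = []  # toy application state


def add_command(params: List[str]) -> Optional[str]:
    # A "void" command: does its work and has nothing to report.
    items.extend(params)
    return None


def status_command(params: List[str]) -> Optional[str]:
    # A command that takes the option of returning a status.
    return f"{len(items)} item(s)"


def make_dispatcher(commands: Dict[str, Command],
                    write: Callable[[str], None]) -> Callable[[str, List[str]], None]:
    def execute(name: str, params: List[str]) -> None:
        result = commands[name](params)
        # The caller never needs to know which commands produce output.
        if result is not None:
            write(result)

    return execute
```

Adding a new printing command later means registering one more function in the dictionary; the execute function stays untouched.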
I often find myself using lambdas as some sort of "local functions" to make my life easier with repetitive operations like these:
Func<string, string> GetText = (resource) => this.resourceManager.GetString(resource);
Func<float, object, string> FormatF1 = (f, o) => String.Format("{0:F1} {1}", f, o);
Func<float, object, string> FormatF2 = (f, o) => String.Format("{0:F2} {1}", f, o);
Instead of writing the String.Format call over and over, I can simply use FormatF2, for example, which saves time, and when I need to change something about the formatting there is only one place to edit.
Especially when I need the functionality in the given function exclusively, I'm very reluctant to turn them into a real function. While the lambdas above were relatively small... sometimes I have larger ones like (the following is supposed to add data to a table for print output):
Action<string, string, string> AddSurfaceData = (resource, col, unit) => {
    // GetText and FormatF2 are the local lambdas defined above, so no "this." prefix
    renderTable.Cells[tableRowIndex, 0].Text = "\t" + GetText(resource);
    renderTable.Cells[tableRowIndex, 1].Text = FormatF2(paraHydReader.GetFloat(paraHydReader.GetOrdinal(col)), "");
    renderTable.Cells[tableRowIndex, 1].Style.TextAlignHorz = C1.C1Preview.AlignHorzEnum.Right;
    renderTable.Cells[tableRowIndex, 2].Text = " " + GetText(unit);
    renderTable.Cells[tableRowIndex, 2].Style.TextAlignHorz = C1.C1Preview.AlignHorzEnum.Left;
    ++tableRowIndex;
};
Again, I need this often, and all the benefits above apply, too. However, as you can see, this one is quite long for a lambda expression. The question is: When do you draw the line? Is my last lambda too much? What other ways (other than using real functions or trying to stuff the data in containers and loop over them) exist to avoid writing the same code over and over again?
Thanks in advance
Christian
It is something you use potentially many times within a method, and only inside that method. I like this idea. Do it if it doesn't make your code hard to read. I would say that you should reconsider if you find it difficult to see what is the content of the lambda function vs. what is the real content of the method. In that case it might be cleaner to pull it out into a separate private method.
At the end, this is really a matter of taste...
I agree with awe: for small scale reuse inside a method (or even a class) this is perfect. Like the string.Format examples. I use this quite often. It's basically the same thing as using a local variable for an intermediate value that you use more than once, but then for code.
Your second example seems to be pushing it a bit. Somehow this gives me the feeling a private method AddSurfaceData (possibly static, depending on its use?) would be a better fit. That is of course outside of the context that you have, so use your own good judgement.
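For readers who want to see the "small helper that lives only inside one method" idea in isolation, here is a sketch in Python, where nested functions play the role of the named lambdas. build_report, format_f2 and the row shape are made up for the example:

```python
def build_report(rows):
    """Format (label, value, unit) rows; the row shape is hypothetical."""

    def format_f2(value, unit):
        # Local helper: visible only inside build_report, so the
        # formatting rule has exactly one place to change.
        return f"{value:.2f} {unit}".rstrip()

    return [f"{label}: {format_f2(value, unit)}" for label, value, unit in rows]
```

Once a helper like this grows past a few lines or is wanted elsewhere, moving it out to a private method costs very little.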
A Lambda method is an anonymous method.
This means that you should not give it a name.
If you are doing that (in your case, you are assigning a name through your reference), it's just another way to declare a function.
C# already has a way to declare functions, and it's not the lambda way, which was added solely to pass functions as parameters and return them as return values.
Think, as an example, in javascript:
function f(var1,var2,...,varX)
{
    // some code
}
or
var f = function() {
    // some code
}
Different syntax (almost) same thing.
For more information on why it's not the same thing: Javascript: var functionName = function() {} vs function functionName() {}
Another example: in Haskell You can define two functions:
function1 :: Int -> Int
function1 x = x + 2
or
function2 :: Int -> Int
function2 = \x -> x + 2
Same thing (this time I think it's the very same), different syntax. I prefer the first one, it's more clear.
C# 3.0 (.NET 3.5), like JavaScript, has a lot of functional influences. Some of them should be used wisely, IMHO.
Someone said that a local lambda function assigned to a reference is a good substitute for a method defined within another method, similar to a "let" or a "where" clause in Haskell.
I say "similar" because the two have very different semantics. For instance, in Haskell I can use a function name which is not declared yet and define it later with "where", while in C#, with function/reference assignment, I can't do this.
By the way, I think it's a good idea. I'm not banning this use of lambda functions, I just want to make people think about it.
Every language has its own abstraction mechanisms; use them wisely.
I like the idea. I don't see a better way to maintain code locality without violating the DRY principle. And I think it's only harder to read if you're not accustomed to lambdas.
+1 on nikie re DRY being good in general.
I wouldn't use PascalCase naming for them though.
Be careful though - in most cases the stuff you have in there is just an Extract Method in a dress or a potential helper or extension function. e.g., GetText is a Method and FormatF* is probably a helper method...
I have no problem with the long example. I see that you are repackaging compound data very elegantly.
It's the short ones that will drive your colleagues to investigate the advantages of voluntary institutionalization. Please declare some kind of constant to hold your format string and use "that String.Format-thing over and over". As a C# programmer, I know what that does without looking elsewhere for home-spun functions. That way, when I need to know what the formatted string will look like, I can just examine the constant.
I agree with volothamp in general, but in addition ...
Think of the other people that have to maintain your code. I think this is easier to understand than your first example and still offers the maintenance benefits you mention:
String.Format(this.resourceManager.GetString("BasicFormat"), f, o);
String.Format(this.resourceManager.GetString("AdvancedFormat"), f, o);
And your second example appears to be just a different way to declare a function. I don't see any useful benefit over declaring a helper method. And declaring a helper method will be more understandable to the majority of coders.
I personally think it's not in good taste to use lambda functions when there is no need for them. I won't use a lambda function to replace a few simple lines of procedural code. Lambda functions offer many enhancements, but they make the code slightly more complicated to read.
I wouldn't use one to replace string.Format.
I need to derive an important value given 7 potential inputs. Uncle Bob urges me to avoid functions with that many parameters, so I've extracted the class. All parameters now being properties, I'm left with a calculation method with no arguments.
“That”, I think, “could be a property, but I'm not sure if that's idiomatic C#.”
Should I expose the final result as a property, or as a method with no arguments? Would the average C# programmer find properties confusing or offensive? What about the Alt.Net crowd?
decimal consumption = calculator.GetConsumption(); // obviously derived
decimal consumption = calculator.Consumption; // not so obvious
If the latter: should I declare interim results as [private] properties, also? Thanks to heavy method extraction, I have several interim results. Many of these shouldn't be part of the public API. Some of them could be interesting, though, and my expressions would look cleaner if I could access them as properties:
decimal interim2 = this.ImportantInterimValue * otherval;
Happy Experiment Dept.:
While debugging my code in VS2008, I noticed that I kept hovering my mouse over the method calls that compute interim results, expecting a hover-over with their return value. After turning all methods into properties, I found that exposing interim results as properties greatly assisted debugging. I'm well pleased with that, but have lingering concerns about readability.
The interim value declarations look messier. The expressions, however, are easier to read without the brackets. I no longer feel compelled to start the method name with a verb. To contrast:
// Clean method declaration; compulsive verby name; callers need
// parentheses despite lack of any arguments.
decimal DetermineImportantInterimValue() {
    return this.DetermineOtherInterimValue() * this.SomeProperty;
}
// Messier property declaration; clean name; clean access syntax
decimal ImportantInterimValue {
    get {
        return this.OtherInterimValue * this.SomeProperty;
    }
}
I should perhaps explain that I've been coding in Python for a decade. I've been left with a tendency to spend extra time making my code easier to call than to write. I'm not sure the Python community would regard this property-oriented style as acceptably “Pythonic”, however:
def determineImportantInterimValue(self):
    "The usual way of doing it."
    return self.determineOtherInterimValue() * self.someAttribute

importantInterimValue = property(
    lambda self: self.otherInterimValue * self.someAttribute,
    doc="I'm not sure if this is Pythonic...")
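For comparison, the decorator form of property is the more common spelling in current Python. A minimal sketch follows; ConsumptionCalculator, base_value and the arithmetic are placeholders, and snake_case replaces the camelCase used above:

```python
class ConsumptionCalculator:
    def __init__(self, some_attribute, base_value):
        self.some_attribute = some_attribute
        self.base_value = base_value

    @property
    def other_interim_value(self):
        """Derived on every access; no parentheses at the call site."""
        return self.base_value + 1

    @property
    def important_interim_value(self):
        # Reads like attribute access, which keeps the expression clean.
        return self.other_interim_value * self.some_attribute
```

Callers write `calculator.important_interim_value`, much like the C# property version.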
The important question here seems to be this:
Which one produces more legible, maintainable code for you in the long run?
In my personal opinion, isolating the individual calculations as properties has a couple of distinct advantages over a single monolithic method call:
You can see the calculations as they're performed in the debugger, regardless of the class method you're in. This is a boon to productivity while you're debugging the class.
If the calculations are discrete, the properties will execute very quickly, which means (in my opinion) they observe the rules for property design. It's absurd to think that a guideline for design should be treated as a straitjacket. Remember: There is no silver bullet.
If the calculations are marked private or internal, they do not add unnecessary complexity to consumers of the class.
If all of the properties are discrete enough, compiler inlining may resolve the performance issues for you.
Finally, if the final method that returns your final calculation is far and away easier to maintain and understand because you can read it, that is an utterly compelling argument in and of itself.
One of the best things you can do is think for yourself and dare to challenge the preconceived One Size Fits All notions of our peers and predecessors. There are exceptions to every rule. This case may very well be one of them.
Postscript:
I do not believe that we should abandon standard property design in the vast majority of cases. But there are cases where deviating from The Standard(TM) is called for, because it makes sense to do so.
Personally, I would prefer if you make your public API as a method instead of property. Properties are supposed to be as 'fast' as possible in C#. More details on this discussion: Properties vs Methods
Internally, GetConsumption can use any number of private properties to arrive at the result, choice is yours.
I usually go by what the method or property will do. If it is something that is going to take a little time, I'll use a method. If it's very quick or has a very small number of operations going on behind the scenes, I'll make it a property.
I usually use methods to denote any action on the object, or anything which changes the state of the object. So in this case I would name the function CalculateConsumption(), which computes the value from the other properties.
You say you are deriving a value from seven inputs, you have implemented seven properties, one for each input, and you have a property getter for the result. Some things you might want to consider are:
What happens if the caller fails to set one or more of the seven "input" properties? Does the result still make sense? Will an exception be thrown (e.g. divide by zero)?
In some cases the API may be less discoverable. If I must call a method that takes seven parameters, I know that I must supply all seven parameters to get the result. And if some of the parameters are optional, different overloads of the method make it clear which ones.
In contrast, it may not be so clear that I have to set seven properties before accessing the "result" property, and could be easy to forget one.
When you have a method with several parameters, you can more easily have richer validation. For example, you could throw an ArgumentException if "parameter A and parameter B are both null".
If you use properties for your inputs, each property will be set independently, so you can't perform the validation when the inputs are being set - only when the result property is being dereferenced, which may be less intuitive.
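The difference in where validation can happen is easier to see side by side. This is a Python sketch with only two inputs instead of seven, and the names (ConsumptionInputs, distance_km, fuel_litres) and the formula are invented for the illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConsumptionInputs:
    # Property-style inputs; None means "not set yet".
    distance_km: Optional[float] = None
    fuel_litres: Optional[float] = None

    @property
    def consumption(self) -> float:
        # Validation can only happen here, when the result is read.
        missing = [name for name, value in vars(self).items() if value is None]
        if missing:
            raise ValueError("inputs not set: " + ", ".join(missing))
        if self.distance_km <= 0:
            raise ValueError("distance_km must be positive")
        return 100.0 * self.fuel_litres / self.distance_km


def consumption(distance_km: float, fuel_litres: float) -> float:
    # All inputs arrive together, so they can be validated together,
    # and the signature itself says what is required.
    if distance_km <= 0:
        raise ValueError("distance_km must be positive")
    return 100.0 * fuel_litres / distance_km
```

With the function, forgetting an argument fails at the call itself; with the property version the mistake surfaces only when the result is read.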
This is related to conventions used in C#.
I've got a method that has two parameters (X and Y coordinates). These coordinates represent the position at which a "tile" may reside. If a tile resides at these coordinates, the method returns its number. If no tile resides at these coordinates, I'm wondering how the method should behave.
I see three options:
Use exceptions. I may raise an exception every time Method finds no tile. However, as this situation is not rare, this option is the worst one.
Do it the old fashioned C++ way and return -1 if there is no tile.
Make the tile number a reference parameter and change the return type of method to boolean to show whether there is a tile or not. But this seems a bit complicated to me.
So, what should I do?
You can return null, and check for this on the calling code.
Of course you'd have to use a nullable type:
int? i = YourMethodHere(x, y);
Return -1.
This is not just a C++ convention, it's also common in the .NET Framework - e.g. methods like String.IndexOf or properties like SelectedIndex for controls that represent lists.
EDIT
Just to elaborate, of the three options in your question (Exception, return -1, out parameter), returning -1 is the way to go. Exceptions are for exceptional situations, and the Microsoft coding guidelines recommend avoiding out parameters where possible.
In my view returning -1 (provided it's always going to be an invalid value), returning a nullable int, or returning a Tile object are all acceptable solutions, and you should choose whichever is most consistent with the rest of your app. I can't imagine any developer would have the slightest difficulty with any of the following:
int tileNumber = GetTile(x,y);
if (tileNumber != -1)
{
    ... use tileNumber ...
}

int? result = GetTile(x,y);
if (result.HasValue)
{
    int tileNumber = result.Value;
    ... use tileNumber ...
}

Tile tile = GetTile(x,y);
if (tile != null)
{
    ... use tile ...
}
I'm not sure I understand Peter Ruderman's comment about using an int being "much more efficient than returning a nullable type". I'd have thought any difference would be negligible.
Exceptions are for exceptional cases, so using exceptions on a known and expected error situation is "bad". You also are more likely, now, to have try-catches everywhere to handle this error specifically because you expect this error situation to happen.
Making your return value a parameter is acceptable if your only error condition (say -1) is confusable with a real value. If you can have a negative tile number then this is a better way to go.
A nullable int is a possible alternative to a reference parameter, but you are creating objects with this, so if an "error" is routine then you may be making more work this way than with a reference parameter. As Roman pointed out in a comment elsewhere, you will have C# vs. VB issues, with the nullable type being introduced too late for VB to provide nice syntactic sugar like C# has.
If your tiles can only be non-negative then returning -1 is an acceptable and traditional way to indicate an error. It would also be the least expensive in terms of performance and memory.
Something else to consider is self-documentation. Using -1 and an exception are conventions: you'd have to write documentation to make sure the developer is aware of them. Using an int? return or a reference parameter would be more self-describing and wouldn't require documentation for a developer to know how to handle the error situation. Of course :) you should always write the documentation, just like how you should floss your teeth daily.
Use a nullable return value.
int? GetTile(int x, int y) {
    if (...)
        return SomeValue;
    else
        return null;
}
This is the clearest solution.
If your method has access to the underlying tile objects, another possibility would be to return the tile object itself, or null if there is no such tile.
I would go with option 2. You're right, throwing an exception in such a common case may be bad for performance, and using an out parameter and returning a true or false is useful but screwy to read.
Also, think of the string.IndexOf() method. If nothing is found, it returns -1. I'd follow that example.
You could return -1, as that is a fairly common C# approach. However, it might be better to actually return the tile that was clicked, and in the event that no tile was clicked, return a reference to a singleton NullTile instance. The benefit of doing it this way is that you give a concrete meaning to each value returned, rather than it just being a number that has no intrinsic meaning beyond its numeric value. A type 'NullTile' is very specific as to its meaning, leaving little to doubt for other readers of your code.
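A rough sketch of the NullTile idea, in Python rather than C#. Tile, Board, NULL_TILE and the describe method are invented here just to show the shape of the pattern:

```python
from typing import Dict, Tuple


class Tile:
    def __init__(self, number: int) -> None:
        self.number = number

    def describe(self) -> str:
        return f"tile {self.number}"


class _NullTile(Tile):
    """Stands in for 'no tile here' so callers never have to test for null."""

    def __init__(self) -> None:
        super().__init__(-1)

    def describe(self) -> str:
        return "no tile"


NULL_TILE = _NullTile()  # the single shared instance


class Board:
    def __init__(self, tiles: Dict[Tuple[int, int], Tile]) -> None:
        self._tiles = dict(tiles)

    def get_tile(self, x: int, y: int) -> Tile:
        # Always returns a Tile; empty positions yield the shared NULL_TILE.
        return self._tiles.get((x, y), NULL_TILE)
```

Callers that care can compare against NULL_TILE by identity, and callers that don't can call describe() on whatever comes back.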
The best options are to return a boolean as well or return null.
e.g.
bool TryGetTile(int x, int y, out int tile);
or,
int? GetTile(int x, int y);
There are several reasons to prefer the "TryGetValue" pattern. For one, it returns a boolean, so client code is incredibly straightforward, e.g.: if (TryGetValue(out someVal)) { /* some code */ }. Compare this to client code which requires hard-coded sentinel value comparisons (to -1, 0, null, catching a particular set of exceptions, etc.). "Magic numbers" crop up quickly with those designs and factoring out the tight coupling becomes a chore.
When sentinel values, null, or exceptions are expected, it's absolutely vital that you check the documentation on which mechanism is used. If documentation doesn't exist or isn't accessible, a common scenario, then you have to infer based on other evidence, and if you make the wrong choice you are simply setting yourself up for a null-reference exception or other bad defects. The TryGetValue() pattern, by contrast, is pretty close to self-documenting by its name and method signature alone.
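To show what the calling code looks like, here is the Try pattern sketched in Python. Python has no out parameters, so a (found, value) tuple stands in for the bool return plus out argument; try_get_tile and the dictionary of tiles are assumptions made for the example:

```python
from typing import Dict, Tuple


def try_get_tile(tiles: Dict[Tuple[int, int], int], x: int, y: int) -> Tuple[bool, int]:
    """Return (True, tile_number) if a tile exists at (x, y), else (False, 0)."""
    key = (x, y)
    if key in tiles:
        return True, tiles[key]
    # The second element is a placeholder and must not be used when found is False.
    return False, 0
```

A caller writes `found, tile = try_get_tile(tiles, x, y)` followed by `if found:`, so no sentinel value ever has to be compared against.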
I have my own opinion on the question that you asked, but it's stated above and I've voted accordingly.
As to the question that you didn't ask, or at least as an extension to all of the answers above: I would be sure to keep the solution to similar situations consistent across the app. In other words, whatever answer you settle on, keep it the same within the app.
If the method is part of a low level library, then your standard .NET design probably dictates that you throw exceptions from your method.
This is how the .NET framework generally works. Your higher level callers should catch your exceptions.
However since you seem to be doing this from a UI thread, which has performance implications since you are responding to UI events - I do what Jay Riggs already suggested, return null, and make sure your callers check for a null return value.
I'd break it into two methods. Have something like CheckTileExists(x,y) and GetTile(x,y). The former returns a boolean that indicates whether or not there is a tile at the given coordinates. The second method is essentially the one you're talking about in your original post, except it should throw an exception when given invalid coordinates (since that indicates the caller didn't first call CheckTileExists(), so it is legitimately an exceptional situation). For the sake of speed, you'd probably want these two methods to share a cache, so that in the event they're called one after the other, the overhead on the GetTile() function would be negligible. I don't know if you already have a suitable object to put these methods on or if perhaps you should make them two methods on a new class. IMHO, the performance penalty of this approach is negligible and the increase in code clarity far outweighs it.
Is it possible you have created (or could create) a Tile object that is referenced at the coordinates? If so, you can return a reference to that tile or null if there is no tile at the given coordinates:
public Tile GetTile(int x, int y) {
    if (!TileExists(x, y))
        return null;
    // ... tile lookup here...
}