In C# it's conventional to write in a fairly objective manner, like so:
MyObj obj = new MyObj();
MyReturn ret = obj.DoSomething();
AnotherReturn rett = ret.DoSomethingElse();
I could just write the above like this:
AnotherReturn rett = new MyObj().DoSomething().DoSomethingElse();
However, how does the stack frame work when you have a bunch of function calls in a sequence like this? The example is fairly straightforward, but imagine I've got 50+ function calls chained (this can happen in the likes of JavaScript with jQuery).
My assumption was that, for each function call, a return address is made (to the "dot"?) and the return value (a new object with other methods) is then immediately pumped into the next function call at that return address. How does this work w.r.t. getting to the overall return value (in this example the return address will assign the final function value to rett)? If I kept chaining calls, would I eventually overflow? In that case, is it considered wiser to take the objective route (at the cost of "needless" memory assignment)?
It's exactly the same as if you called each method on a separate line, assigning the return value to a variable each time and then using that variable to call the next method.
So your two samples are the same, effectively.
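To make that concrete, here is a minimal sketch. The class bodies and the CallDepth helper are made up for illustration, since the question doesn't show them. It counts how many of the chained methods are active at the same time. Because each call returns before the next one starts, the count never goes above one, no matter how long the chain is.

```csharp
using System;

// Hypothetical stand-ins for the types in the question.
class AnotherReturn { }

class MyReturn
{
    public AnotherReturn DoSomethingElse()
    {
        CallDepth.Enter();
        try { return new AnotherReturn(); }
        finally { CallDepth.Exit(); }
    }
}

class MyObj
{
    public MyReturn DoSomething()
    {
        CallDepth.Enter();
        try { return new MyReturn(); }
        finally { CallDepth.Exit(); }
    }
}

// Illustration-only helper: tracks how many of our methods are active at once.
static class CallDepth
{
    public static int Current;
    public static int Max;

    public static void Enter()
    {
        Current++;
        if (Current > Max) Max = Current;
    }

    public static void Exit()
    {
        Current--;
    }
}

static class Program
{
    static void Main()
    {
        AnotherReturn rett = new MyObj().DoSomething().DoSomethingElse();

        // DoSomething had already returned when DoSomethingElse started,
        // so the two methods were never on the stack together.
        Console.WriteLine(CallDepth.Max); // prints 1
    }
}
```

So a chain of 50 calls uses the stack one frame at a time. That is different from 50 nested calls, where each frame stays alive until the inner call returns.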
Do you have Reflector? You could try the two methods and inspect the generated IL code to see exactly what differences there are.
Although the two forms are equivalent, a long run of "dots" is often a code smell (see the Law of Demeter).
I tried to search for this but couldn't quite find an answer.
I have a method, and inside it there's a code block that gets called very often, so I refactored it into a local Func.
Now because I don't use that code block anywhere else, it makes sense to have this instead of another method.
But is it better, performance-wise, to use another method? Does the Func get allocated or in some other way use extra processing time or memory because it's declared inside the function, or does it get cached or even actually made into a method behind the scenes by the compiler?
I know it sounds like a micro-optimization thing, but in my case, the method gets called very often. So maybe that changes the consideration.
So, basically:
public T CalledVeryOften(...)
{
    Func<...> block = () => ...;
    //code that calls 'block' several times
}
or
public T CalledVeryOften(...)
{
    //code that calls 'block()' several times
}
private ... block()
{
    ...
}
Nah, there shouldn't be a huge difference in performance. A Func either compiles to a static or instance method depending on whether you use closures.
However, if the Func's code can be inlined, that might improve performance, though I'm not sure how to do that.
By inline, I'm referring to the inline keyword in C++, which asks the compiler to embed the function's instructions at the call site. I'm not sure whether C# offers that benefit.
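For what it's worth, C# has no inline keyword; the JIT compiler decides on its own. From .NET 4.5 onward you can give it a hint with MethodImplOptions.AggressiveInlining, as in the sketch below (MathHelpers and Square are made-up names). It is only a request, and it applies to ordinary methods: calls that go through a Func delegate are typically not inlined.

```csharp
using System;
using System.Runtime.CompilerServices;

static class MathHelpers
{
    // Asks the JIT to inline this method where it can. This is only a hint;
    // the runtime is still free to emit an ordinary call.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int Square(int x)
    {
        return x * x;
    }
}

static class Program
{
    static void Main()
    {
        int total = 0;
        for (int i = 1; i <= 3; i++)
        {
            total += MathHelpers.Square(i); // 1 + 4 + 9
        }
        Console.WriteLine(total); // prints 14
    }
}
```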
Btw, if the private method really belongs to a method block that can be reusable and you are using Func for the sake of performance increase, I'd refactor it back to the way it was.
It is a micro optimisation :) Unless your program is noticeably slowing down to an unacceptable level and profiling determines that the root cause is the fact you're making the function call, then you can consider alternatives.
The overhead really is negligible in the grand scheme of things. I would definitely file this under "Things I need not be concerned about".
Besides, you've probably made your code more readable in the process.
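If you want to keep the block private to the method and still avoid the delegate, a local function (C# 7.0 or later) is a middle ground. The sketch below is an illustration under assumptions: Calculator and both method names are invented, and the comments describe what current compilers do rather than anything the language guarantees.

```csharp
using System;

static class Calculator
{
    // Variant 1: a lambda that captures 'factor'. With current compilers,
    // each call to this method creates a closure object and a delegate.
    public static int SumWithFunc(int[] values, int factor)
    {
        Func<int, int> block = v => v * factor;
        int sum = 0;
        foreach (int v in values) sum += block(v);
        return sum;
    }

    // Variant 2: a local function. Called directly, it is compiled as an
    // ordinary method, so no delegate is needed to invoke it.
    public static int SumWithLocalFunction(int[] values, int factor)
    {
        int Block(int v) => v * factor;
        int sum = 0;
        foreach (int v in values) sum += Block(v);
        return sum;
    }
}

static class Program
{
    static void Main()
    {
        int[] values = { 1, 2, 3 };
        Console.WriteLine(Calculator.SumWithFunc(values, 2));          // prints 12
        Console.WriteLine(Calculator.SumWithLocalFunction(values, 2)); // prints 12
    }
}
```

Both variants give the same result; whether the difference matters is something only a profiler can tell you.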
I am currently working on converting some VB source code over to C#. While I understand there are converters to automate this, and that I could actually use this particular dll without rewriting it, I'm doing it partially so I can understand VB better. Not so much as to expect to write it, but it's at least helping me to be able to read it.
In doing this though, I've come across something that is quite confusing. The following code snippets are examples, but I've seen it throughout the program.
VB Source Code:
Friend Function AllocateObjectNumber() As Long
    AllocateObjectNumber = _nextFreeObjectNumber
    _nextFreeObjectNumber += 1
    _objectAllocatedCount += 1
End Function
My translated C# Code:
internal long AllocateObjectNumber()
{
    cvNextFreeObjectNumber += 1;
    cvObjectAllocatedCount += 1;
    return cvNextFreeObjectNumber;
}
What I'm not understanding is the flow control that VB uses. I understand that AllocateObjectNumber = _nextFreeObjectNumber is used in place of return cvNextFreeObjectNumber, but if this line comes before the incrementing of the two variables, then how is that code not considered unreachable? Based on my C# understanding, the first line in this method would immediately return to the calling method, and this whole method would basically act as a pseudo-Property.
Any helpful explanations?
The VB approach is more similar to storing the value in a temporary variable:
internal long AllocateObjectNumber()
{
    // Remember the value before it is incremented, as the VB code does.
    var nextNumber = cvNextFreeObjectNumber;
    cvNextFreeObjectNumber += 1;
    cvObjectAllocatedCount += 1;
    return nextNumber;
}
In VB the function = value syntax doesn't do a return - so the code after can keep running. When that method reaches the end, then the value you used becomes the 'return' value for whatever called it in the first place.
You can use the function = value syntax multiple times in the same method as a way of returning a different result in different conditions without needing the temporary variable I used in my example.
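Here is a rough C# sketch of both points. The ObjectAllocator class, the starting value of 1, and the Describe method are assumptions for illustration and are not taken from the original DLL. AllocateObjectNumber uses post-increment to hand back the old value, and Describe shows how repeated "function = value" assignments map onto a local result variable that is returned once at the end.

```csharp
using System;

class ObjectAllocator
{
    // Assumed starting values; the question does not show the fields.
    private long cvNextFreeObjectNumber = 1;
    private long cvObjectAllocatedCount = 0;

    internal long AllocateObjectNumber()
    {
        cvObjectAllocatedCount += 1;
        // Post-increment yields the old value and then adds one,
        // which matches what the VB function hands back.
        return cvNextFreeObjectNumber++;
    }

    // Hypothetical method: in VB each assignment to the function name
    // would overwrite the pending return value, like 'result' here.
    internal string Describe()
    {
        string result = "empty";
        if (cvObjectAllocatedCount > 0) result = "in use";
        if (cvObjectAllocatedCount > 1000) result = "busy";
        return result;
    }
}

static class Program
{
    static void Main()
    {
        var allocator = new ObjectAllocator();
        Console.WriteLine(allocator.Describe());             // prints empty
        Console.WriteLine(allocator.AllocateObjectNumber()); // prints 1
        Console.WriteLine(allocator.AllocateObjectNumber()); // prints 2
        Console.WriteLine(allocator.Describe());             // prints in use
    }
}
```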
Based on my C# understanding, the first line in this method would immediately return to the calling method
But it’s not C# code, it’s VB code. AllocateObjectNumber = _nextFreeObjectNumber does not return, it just assigns a return value. The actual return is at the end of the method.
Most people would actually write the VB code identical to the C# code, i.e. using Return explicitly. The assign-to-method-name style is a remnant of older VB versions where it was the only way of returning a value from a function. In VB.NET, you can use both.
I would like to be able to mark a function somehow (attributes maybe?) so that when it's called from anywhere, some other code gets to process the parameters and return a value instead of the called function, or can let the function execute normally.
I would use it for easy caching.
For example, if I had a function called Add10 and it would look like this:
int Add10 (int n)
{
return n + 10;
}
If the function got called repeatedly with the same value (Add10(7)) it would always give the same result (17), so it makes no sense to recalculate it every time. Naturally, I wouldn't do it with functions as simple as this, but I'm sure you can understand what I mean.
Does C# provide any way of doing what I want?
I need a way to mark a function as cached, so that when someone calls Add10(16), some code somewhere is run first to check in a dictionary whether we already know the Add10 value of 16. If we do, it returns that value; if we don't, it calculates, stores and returns it.
You want to memoize the function. Here's one way:
http://blogs.msdn.com/b/wesdyer/archive/2007/01/26/function-memoization.aspx
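In case the link goes away, the general idea can be sketched like this. This is my own minimal version, not necessarily the code from that post; Memoizer and cachedAdd10 are made-up names. It handles one-argument functions, is not thread-safe, and never evicts anything from the cache.

```csharp
using System;
using System.Collections.Generic;

static class Memoizer
{
    // Wraps 'f' in a delegate that remembers every result it has computed.
    public static Func<TArg, TResult> Memoize<TArg, TResult>(Func<TArg, TResult> f)
    {
        var cache = new Dictionary<TArg, TResult>();
        return arg =>
        {
            TResult value;
            if (!cache.TryGetValue(arg, out value))
            {
                value = f(arg);     // first time we see this argument
                cache[arg] = value; // remember it for later calls
            }
            return value;
        };
    }
}

static class Program
{
    static int calls;

    static int Add10(int n)
    {
        calls++; // counts how often the real function runs
        return n + 10;
    }

    static void Main()
    {
        Func<int, int> cachedAdd10 = Memoizer.Memoize<int, int>(Add10);

        Console.WriteLine(cachedAdd10(7)); // prints 17 (computed)
        Console.WriteLine(cachedAdd10(7)); // prints 17 (from the cache)
        Console.WriteLine(calls);          // prints 1
    }
}
```

Callers have to go through cachedAdd10 for this to work; calling Add10 directly bypasses the cache.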
Instead of the function itself, I would expose a Func<> delegate:
Func<int, int> add10 = (n) =>
{
    // do some other work
    ...
    int result = Add10(n); // calling actual function
    // do some more perhaps even change result
    ...
    return result;
};
And then:
int res = add10(5); // invoking the delegate rather than calling function
Like Jason mentioned, you probably want something like function memoization.
Here's another SO thread about it:
Whose responsibility is it to cache / memoize function results?
You could also achieve this sort of functionality using principles related to Aspect Oriented Programming.
http://msdn.microsoft.com/en-us/magazine/gg490353.aspx
Aspect Oriented Programming in C#
MbCache might be what you're looking for.
At the top of my program I have a code segment that looks like this
var XXXAssembler = new XXXAssembler(ctx);
XXXAssembler.LoadXXX();
var YYYAssembler = new YYYAssembler(ctx);
YYYAssembler.LoadYYY();
var ZZZAssembler = new ZZZAssembler(ctx);
ZZZAssembler.LoadZZZ();
In the above logic I use each variable once to call the respective loader, and I don't use the variables anywhere else.
I can change the code to this
new XXXAssembler(ctx).LoadXXX();
new YYYAssembler(ctx).LoadYYY();
new ZZZAssembler(ctx).LoadZZZ();
This reduces the size of the code, but I'd like to think it simplifies it as well. I could see the usefulness of variables for debugging, but I don't think that's necessarily a good reason. Others may disagree.
Is the non-variable version considered bad coding style?
Unless you're going to use the object assigned to the Assembler variable, then there's no need for it.
I'd say get rid of it, clean up the code, and then if you need it later you can bring it back.
new XXXAssembler(ctx).LoadXXX(); is absolutely fine as long as you don't have to use the reference returned by new XXXAssembler(ctx) elsewhere.
If you ask me, the size of the code doesn't matter. What matters is that when you come back to the code a year later, you can still tell how it does what it needs to do, and how to rewrite or reuse it.
As you mention, the only technical reason to assign the created object to a variable is if you need to use it or look at it somewhere. If you're confident that you'll never need to do this, you don't need to create a new variable, and you can shorten up your code a bit.
But I'll offer up two caveats:
(1) I often find that I need to look at the output of a method before it returns, or at the instance of the object created by the new statement when I'm debugging. So sometimes instead of doing this:
public MyObject ReturnSomeObject()
{
    return new MyObject();
}
I'll do this instead:
public MyObject ReturnSomeObject()
{
    var myObject = new MyObject();
    return myObject;
}
Just so I can look at it in the debugger. It clutters up my code a bit, but it can be very helpful when I'm trying to figure out why something else went wrong.
(2) If you find that you can do the sort of thing you're describing very often, you may want to take a harder look at how your classes are structured. What's the point of a class that has a method that returns nothing and which doesn't modify the internal state of the class in any fashion that you're interested in? To take your example above, presumably your various LoadXXX() methods should return some sort of status code, or modify some status property of the object, or return a pointer to the file that they loaded, or, well, something. If they do, but you're not bothering to look at it - well, that's another problem. If the methods really don't need to modify any aspect of the object's internal state, then you should look strongly at making them static: it allows you to avoid running the class constructor each time you call them, it expresses their intent more clearly, and it allows the compiler to notify you of a possible inconsistency if you do decide that they need to modify the object state at some point in the future.
Nothing hard-and-fast here, just some guidelines.
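As a rough illustration of caveat (2), here is what a static loader that reports a result might look like. Everything in it is assumed: the question doesn't show what ctx is or what LoadXXX does, so the Context type, its LoadedCount field, and the bool return value are placeholders.

```csharp
using System;

// Hypothetical context type; the question does not show what 'ctx' is.
class Context
{
    public int LoadedCount;
}

static class XXXAssembler
{
    // Static: no instance is constructed just to make one call,
    // and the outcome is reported back to the caller.
    public static bool LoadXXX(Context ctx)
    {
        if (ctx == null) return false;
        ctx.LoadedCount++; // stand-in for the real loading work
        return true;
    }
}

static class Program
{
    static void Main()
    {
        var ctx = new Context();
        bool ok = XXXAssembler.LoadXXX(ctx);
        Console.WriteLine(ok);              // prints True
        Console.WriteLine(ctx.LoadedCount); // prints 1
    }
}
```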
If you are never going to use the objects again, as in this case, I don't see the point in giving them names. It adds needless lines of clutter to your code.
I think not assigning to a variable is fine. I do this in many cases, e.g. for some unit-test mocks, new Mock<IInterfaceToMock>().Object, or for callback functors, SomeFunctionAcceptingCallback(args, new CallbackHandler()).
When writing an API or reusable object, is there any technical reason why all method calls that return 'void' shouldn't just return 'this' (*this in C++)?
For example, using the string class, we can do this kind of thing:
string input = ...;
string.Join("-", input.TrimStart().TrimEnd().Split('|'));
but we can't do this:
string.Join("-", input.TrimStart().TrimEnd().Split('|').Reverse());
..because Array.Reverse() returns void.
There are many other examples where an API has lots of void-returning operations, so code ends up looking like:
api.Method1();
api.Method2();
api.Method3();
..but it would be perfectly possible to write:
api.Method1().Method2().Method3()
..if the API designer had allowed this.
Is there a technical reason for following this route? Or is it just a style thing, to indicate mutability/new object?
(x-ref Stylistic question concerning returning void)
EPILOGUE
I've accepted Luvieere's answer as I think this best represents the intention/design, but it seems there are popular API examples out there that deviate from this:
In C++ cout << setprecision(..) << number << setw(..) << othernumber; seems to alter the cout object in order to modify the next datum inserted.
In .NET, Stack.Pop() and Queue.Dequeue() both return an item but change the collection too.
Props to ChrisW and others for getting detailed on the actual performance costs.
Methods that return void state more clearly that they have side effects. The ones that return the modified result are supposed to have no side effects, including modifying the original input. Making a method return void implies that it changes its input or some other internal state of the API.
If you had Reverse() return a string, then it wouldn't be obvious to a user of the API whether it returned a new string or the same-one, reversed in-place.
string my_string = "hello";
string your_string = my_string.reverse(); // is my_string reversed or not?
That is why, for instance, in Python, list.sort() returns None; it distinguishes the in-place sort from sorted(my_list).
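.NET draws the same line for arrays, which a short sketch can show (the array contents are arbitrary). Array.Reverse returns void and changes the array it is given, while Enumerable.Reverse from LINQ returns a new sequence and leaves the original alone.

```csharp
using System;
using System.Linq;

static class Program
{
    static void Main()
    {
        string[] parts = { "a", "b", "c" };

        // Query style: returns a new sequence and leaves 'parts' untouched.
        string[] copy = Enumerable.Reverse(parts).ToArray();
        Console.WriteLine(string.Join("-", copy));  // prints c-b-a
        Console.WriteLine(string.Join("-", parts)); // prints a-b-c

        // Command style: returns void and reverses 'parts' in place.
        Array.Reverse(parts);
        Console.WriteLine(string.Join("-", parts)); // prints c-b-a
    }
}
```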
Is there a technical reason for following this route?
One of the C++ design guidelines is "don't pay for features you don't use"; returning this would have some (slight) performance penalty, for a feature which many people (I, for one) wouldn't be inclined to make use of.
The technical principle that many others have mentioned (that void emphasizes the fact the function has a side-effect) is known as Command-Query Separation.
While there are pros and cons to this principle, e.g., (subjectively) clearer intent vs. more concise API, the most important part is to be consistent.
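To put the two styles side by side, here is a small sketch with two made-up classes. Counter follows Command-Query Separation: Increment is a command that returns nothing and Value is a query that changes nothing. FluentCounter returns this from its command so that calls can be chained.

```csharp
using System;

// Command-Query Separation: commands return void, queries have no side effects.
class Counter
{
    private int count;

    public void Increment() { count++; }        // command
    public int Value { get { return count; } }  // query
}

// Fluent variant: the command returns the same instance to allow chaining.
class FluentCounter
{
    private int count;

    public FluentCounter Increment()
    {
        count++;
        return this; // same object, modified in place
    }

    public int Value { get { return count; } }
}

static class Program
{
    static void Main()
    {
        var counter = new Counter();
        counter.Increment();
        counter.Increment();
        Console.WriteLine(counter.Value); // prints 2

        var fluent = new FluentCounter().Increment().Increment();
        Console.WriteLine(fluent.Value);  // prints 2
    }
}
```

The fluent version is shorter at the call site, but its signature alone no longer tells you that Increment modifies the object rather than returning a modified copy.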
I'd imagine one reason might be simplicity. Quite simply, an API should generally be as minimal as possible. It should be clear with every aspect of it, what it is for.
If I see a function that returns void, I know that the return type is not important. Whatever the function does, it doesn't return anything for me to work with.
If a function returns something non-void, I have to stop and wonder why. What is this object that might be returned? Why is it returned? Can I assume that this is always returned, or will it sometimes be null? Or an entirely different object? And so on.
In a third-party API, I'd prefer that those kinds of questions never arise.
If the function doesn't need to return anything, it shouldn't return anything.
If you intend your API to be called from F#, please do return void unless you're convinced that this particular method call is going to be chained with another nearly every time it's used.
If you don't care about making your API easy to use from F#, you can stop reading here.
F# is more strict than C# in certain areas: it wants you to be explicit about whether you're calling a method in order to get a value, or purely for its side-effects. As a result, calling a method for its side-effects when that method also returns a value becomes awkward, because the returned value has to be explicitly ignored to keep the compiler from complaining. This makes "fluent interfaces" somewhat awkward to use from F#, which has its own superior syntax for chaining a series of calls together.
For example, suppose we have a logging system with a Log method that returns this to allow for some sort of method chaining:
let add x y =
    Logger.Log(String.Format("Adding {0} and {1}", x, y)) // #1
    x + y // #2
In F#, line #1 is a method call that returns a value, and we're not doing anything with that value. Only the last expression in a function body (line #2 here) is its result, so F# expects every earlier expression to have type unit. Line #1 evaluates to the Logger instance instead, and the compiler flags the silently discarded value (warning FS0020, which becomes an error if warnings are treated as errors). We need to explicitly ignore the return value of the Log method to make that go away:
let add x y =
    Logger.Log(String.Format("Adding {0} and {1}", x, y)) |> ignore
    x + y
As you might guess, making lots of "Fluent API" calls that are mainly about side-effects becomes a somewhat frustrating exercise in sprinkling lots of ignore statements all over the place.
You can, of course, have the best of both worlds and please both C# and F# developers by providing both a fluent API and an F# module for working with your code. But if you're not going to do that, and you intend your API for public consumption, please think twice before returning this from every single method.
Besides the design reasons, there is also a slight performance cost (both in speed and space) for returning this.