Good morning, afternoon or night,
Have you ever written anything in which Code Analysis popped up this kind of warning? If so, did you pay attention to it and implement the friendly named alternates? And if so, did you duplicate the code to avoid a performance hit, or did you just have the alternates call the operators?
Thank you very much.
If a method only contains a call to another method, then the outer method will most likely be inlined into its caller, which means there is no performance loss (in a Release build, without a debugger attached).
So I wouldn't duplicate the code and call the operators instead.
Personally, I don't really get why the rule exists at all. Shouldn't languages without operator-overloading support be able to just call the generated op_SomeThing public static method manually, like any other method?
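For reference, the pattern the rule asks for can be as thin as this (a minimal sketch; the Money type is made up for illustration):

public struct Money
{
    public decimal Amount { get; }
    public Money(decimal amount) { Amount = amount; }

    // Compiled to a public static method named op_Addition behind the scenes.
    public static Money operator +(Money a, Money b) => new Money(a.Amount + b.Amount);

    // The "friendly named alternate" Code Analysis asks for; it just
    // forwards to the operator, so there is no duplicated logic.
    public static Money Add(Money a, Money b) => a + b;
}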
I do it on public classes of assemblies that are expected to see more than just private use, and sometimes beyond that. Still, with one method calling into the other, the overhead is negligible, if there is any overhead at all (I would expect inlining to mean there is none).
Related
I have been reading up on virtual methods and how they are called. As discussed here and here, I have reached the conclusion that they should not really be that different.
The C# compiler emits IL that calls static methods with the call instruction and virtual/non-virtual instance members with callvirt. It seems it is the JIT's job to figure out whether the object the method is being called on is actually null or not. So the null check is the same for both.
Also, as discussed in the first article, it seems that vtables, the tables that hold metadata on method definitions, are flattened at compile time. In other words, each table contains exactly which method the object should call, with no need for a recursive search up the inheritance chain.
With all of the above, why are virtual methods considered slower? Is one level of indirection (if any) really that big of a deal? Please explain...
You're looking at the difference between a function call instruction with direct vs indirect addressing. But most of the "cost" of an indirect function call is not the call itself, but the lost opportunity to perform optimizations which require static knowledge of the target. Inlining, cross-procedure aliasing analysis, and so on.
Figuring out which actual method implementation to execute is going to have some cost, as opposed to just knowing. That cost can be very small, and it is quite likely that the cost is entirely negligible for any particular context because it really doesn't take that long. But the cost is non-zero, so in particularly performance sensitive applications it will make some difference.
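A small sketch of where that static knowledge pays off (the type names are invented, and GetShape() just stands in for code the JIT cannot see through): when the JIT can prove the exact target, it can call it directly and even inline it; through a base-class reference it generally cannot.

public class Shape
{
    public virtual double Area() { return 0; }
}

public sealed class Square : Shape
{
    private readonly double _side;
    public Square(double side) { _side = side; }
    public override double Area() { return _side * _side; }
}

// ...
Shape s = GetShape();        // static type Shape: exact target unknown
double a = s.Area();         // indirect call through the method table; hard to inline

Square q = new Square(2.0);  // exact type known (and Square is sealed)
double b = q.Area();         // direct call; a prime candidate for inlining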
I have two questions, stemming from observed behavior of C# static methods (which I may be misinterpreting):
First:
Would a recursive static method be tail call optimized in a sense by the way the static method is implemented under the covers?
Second:
Would it be equivalent to functional programming to write an entire application with static methods and no variables beyond local scope? I am wondering because I still haven't wrapped my head around the "no side effects" term I keep hearing about functional programming.
Edit:
Let me mention that I do use and understand why and when to use static methods in normal C# OO methodology, and I do understand that tail call optimization will not be explicitly done for a recursive static method. That said, I understand tail call optimization to be an attempt to stop the creation of a new stack frame with each pass, and at a couple of points I observed what appeared to be a static method executing within the frame of its calling method, though I may have misinterpreted my observation.
Would a recursive static method be tail call optimized in a sense by the way the static method is implemented under the covers?
Static methods have nothing to do with tail recursion optimization. All the rules apply equally to instance and static methods, but personally I would never rely on the JIT optimizing away my tail calls. Moreover, the C# compiler doesn't emit the tail-call IL prefix, yet the optimization is sometimes performed anyway. In short, you never know.
F# compiler supports tail recursion optimization and, when possible, compiles recursion to loops.
See more details on C# vs F# behavior in this question.
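To make the difference concrete, here is roughly what is at stake (a sketch): a tail-recursive method, and the loop a tail-call-optimizing compiler such as F#'s would effectively turn it into. In C#, the recursive version can overflow the stack for large n, because the optimization is not guaranteed.

// Tail-recursive: the recursive call is the very last thing the method does.
static long SumTo(long n, long acc)
{
    return n == 0 ? acc : SumTo(n - 1, acc + n);
}

// What tail call optimization effectively produces: a loop, constant stack space.
static long SumToLoop(long n, long acc)
{
    while (n != 0)
    {
        acc += n;
        n--;
    }
    return acc;
}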
Would it be equivalent to functional programming to write an entire application with static methods and no variables beyond local scope?
It's both no and yes.
Technically, nothing prevents you from calling Console.WriteLine from a static method (Console.WriteLine is a static method itself!), and that obviously has side effects. Nothing prevents you from writing a class (with instance methods) that does not change any state either (i.e. whose instance methods don't access instance fields). However, from a design point of view, such methods don't really make sense as instance methods, right?
If you Add an item to a .NET Framework List<T> (which has side effects), you will modify its state.
If you append an item to an F# list, you will get another list, and the original will not be modified.
Note that append is indeed a static method in the List module. Writing "transformation" methods in separate modules encourages side-effect-free design, as no internal storage is available by definition, even if the language allows it (F# does, LISP doesn't). However, nothing really prevents you from writing a side-effect-free non-static method.
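In C# terms, the contrast looks something like this (a sketch; Append is a hypothetical helper, not a BCL method):

using System.Collections.Generic;

var numbers = new List<int> { 1, 2, 3 };
numbers.Add(4);                  // mutates 'numbers' in place: a side effect

var more = Append(numbers, 5);   // 'numbers' itself is left untouched

// The F#-style alternative: return a new list, never modify the input.
static List<int> Append(List<int> source, int item)
{
    var result = new List<int>(source); // copy, so 'source' is never modified
    result.Add(item);
    return result;
}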
Finally, if you want to grok functional language concepts, use one! It's so much more natural to write F# modules that operate on immutable F# data structures than to imitate the same in C#, with or without static methods.
The CLR does do some tail call optimisations but only in 64-bit CLR processes. See the following for where it is done: David Broman's CLR Profiling API Blog: Tail call JIT conditions.
As for building software with just static methods and local scope, I've done this a lot and it's actually fine. It's just another way of doing things, as valid as OO is. In fact, because there is no state outside the function/closure, it's safer and easier to test.
However, I first read the entire SICP book from cover to cover: http://mitpress.mit.edu/sicp/
No side effects simply means that the function can be called with the same arguments as many times as you like and will always return the same value. That means the result of the function is always consistent and does not depend on any external state. Because of this, it's trivial to parallelize the function, cache it, test it, modify it, decorate it, etc.
However, a system without side effects is typically useless, so things that do IO will always have side effects. It does, though, allow you to neatly encapsulate everything else, which is the point.
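A tiny illustration of the difference (the names are invented):

using System.Collections.Generic;

static class PureExample
{
    // Pure: same argument, same result, nothing outside the function touched.
    public static int Square(int x) => x * x;

    // Purity makes caching safe: a cached value can never go stale.
    private static readonly Dictionary<int, int> Cache = new Dictionary<int, int>();
    public static int SquareCached(int x) =>
        Cache.TryGetValue(x, out var v) ? v : Cache[x] = Square(x);
}

static class ImpureExample
{
    private static int _counter;

    // Impure: the result depends on, and changes, external state.
    public static int NextId() => ++_counter;
}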
Objects are not always the best way, despite what people say. In fact, if you've ever used a LISP variant, you will no doubt determine that typical OO does sometimes get in the way.
There's a pretty good book written on this subject, http://www.amazon.com/Real-World-Functional-Programming-Examples/dp/1933988924.
And in the real world, using F# unfortunately isn't always an option due to team skills or existing codebases, which is another reason I love this book: it shows many ways to implement F# features in the code you use day to day. And to me at least, the vast reduction in state bugs, which take far longer to debug than simple logic errors, is worth the slight reduction in OOP orthodoxy.
For the most part, having no static state and operating in a static method only on the parameters given will eliminate side effects, since you're limiting yourself to pure functions. One point to watch out for, though, is retrieving data to be acted on, or saving data to a database, inside such a function. Combining OOP and static methods can help here, by having your static methods delegate commands to lower-level objects that manipulate state.
Also a great help in enforcing function purity is to keep objects immutable whenever possible. Any object acted on should return a new, modified instance, with the original copy discarded.
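For instance, a small immutable type might look like this (a hypothetical example):

public sealed class Order
{
    public string Item { get; }
    public int Quantity { get; }

    public Order(string item, int quantity)
    {
        Item = item;
        Quantity = quantity;
    }

    // "Modifying" the quantity yields a new instance; the original is untouched.
    public Order WithQuantity(int quantity) => new Order(Item, quantity);
}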
Regarding the second question: I believe you mean the "side effects" of mutable data structures, and obviously this is not a problem for (I believe) most functional languages. For instance, Haskell mostly (or even exclusively!?) uses immutable data structures. So this has nothing to do with "static" behaviour.
I assume that public or private static methods must have reduced memory usage, due to the fact that there is only one copy of the static method in memory.
It seems like, because a method is static, it might be a potential point for further optimization by the CLR's JIT compiler, beyond what is possible with a non-static method. Just a flimsy theory though, so I've come to ask you all.
Do static public or private methods provide any increased performance benefit beyond reduced memory usage?
(Note: I'm not interested in responses that talk about the problems of premature optimization. Certainly that's sound advice I follow every day, but that does not mean optimization is not necessary at times (double negative!). Allow me to indulge my curiosity, at the least)
From Static Classes and Static Class Members (C# Programming Guide)
A call to a static method generates a call instruction in Microsoft intermediate language (MSIL), whereas a call to an instance method generates a callvirt instruction, which also checks for null object references. However, most of the time the performance difference between the two is not significant.
Aside from what astander said, your question suggests a misunderstanding of what instance methods do. Regardless of whether the function is static or not, there is only one copy of the function code in memory. A non-static method has to be called through an object, but the object does not carry its own private copy of the method. So the memory usage of static and non-static methods is in fact identical, and as others have pointed out, the performance characteristics are nearly identical.
Non-static member variables, however, do exist separately for every object that you create. But it is nearly always a waste of time to worry about that memory usage, unless you actually have a memory-related problem in your program.
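To see the quoted difference in concrete form (a sketch; the null check is the only extra work, and the JIT performs it cheaply):

public class Greeter
{
    public static string StaticHello() => "hello";  // call sites compile to: call
    public string InstanceHello() => "hello";       // call sites typically compile to: callvirt,
                                                    // even though the method is not virtual,
                                                    // purely to obtain the null check
}

// var g = new Greeter();
// Greeter.StaticHello();   // call: no object, so no null check needed
// g.InstanceHello();       // callvirt: verifies g is not null first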
This is a little bit off-topic, but nonetheless important.
The choice of making methods static or instance should not be based on execution time (which doesn't seem to matter anyway). It should be based on whether the method operates on an object. For instance, all the Math.* methods are static, while (most) String.* methods are instance methods, since they operate on a String instance. My personal philosophy: a good design should make up for the few cycles that may be saved elsewhere.
Another view on the subject: I recently worked with a guy who had been told that static methods are evil because they take us back to the dark age of procedural programming and thus shall be avoided at all costs. This resulted in weird examples of classes that required instances for access to methods that had absolutely no interest in the internals of the object.
Phew, it felt good to get that off my chest.
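To make that anti-pattern concrete (an invented example):

// Forces callers to create an object that carries no state at all:
public class TemperatureConverter
{
    public double CelsiusToFahrenheit(double c) { return c * 9.0 / 5.0 + 32.0; }
}
// ... new TemperatureConverter().CelsiusToFahrenheit(20.0);  // pointless instance

// The method never reads or writes instance state, so make it static:
public static class Temperature
{
    public static double CelsiusToFahrenheit(double c) { return c * 9.0 / 5.0 + 32.0; }
}
// ... Temperature.CelsiusToFahrenheit(20.0);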
Good answers - basically it doesn't matter, which is the answer to nearly every question of this sort. Even if it did make a difference: if the execution time of your program cost a dollar, this sort of issue would likely cost a fraction of a cent, and it is very likely that there are other things costing a great deal more.
Measure it to be certain, but you'll find that unless you're creating a globe-spanning, ultra-high-volume transaction processing supercomputing cluster, it's not going to have an appreciable difference.
Lately, I have taken to a pattern of having a lot of diagnostic logging in parts of my code that makes use of lambda expressions/anonymous delegates, like so:
MyEventManager.LogVerbose( LogCategory.SomeCategory, () => String.Format(msg_string, GetParam1(), GetParam2(), GetParam3() ) );
Notice that the second argument to LogVerbose is a lambda expression which evaluates to a string. The reason for this is that if verbose logging is not actually enabled, LogVerbose should exit having done as little work as possible, in order to minimize performance impact. The construction of the error message string may, in some cases, take time or resources, and if the lambda expression is never evaluated, that performance penalty will not be incurred.
I'm wondering if littering the type system with so many anonymous delegates like this will have some unforeseen consequence for application performance, or if there are any other strategies I should consider.
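For context, a LogVerbose with that signature might be implemented along these lines (an assumption; the question doesn't show MyEventManager, and LogCategory is sketched here as an enum):

using System;

public enum LogCategory { SomeCategory }

public static class MyEventManager
{
    public static bool VerboseEnabled { get; set; }

    public static void LogVerbose(LogCategory category, Func<string> messageFactory)
    {
        if (!VerboseEnabled)
            return;                      // cheap early exit: the lambda is never invoked

        Console.WriteLine($"[{category}] {messageFactory()}");
    }
}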
It should be fine. In particular, if your anonymous function doesn't capture anything, it is cached as a static field (because it can be). If you capture "this" then you'll end up creating new delegate instances, but they're not expensive.
If you capture local variables, that will involve instantiating a nested type - but I'd only worry about this if you saw it actually becoming a problem. As ever with optimisation, focus on readability first, measure the performance, and then profile it to find out where you need to concentrate your efforts.
Whilst I don't actually know the answer to the question for sure, I think it's worth considering that a drive toward a more functional style of programming in C# would be seriously undermined if there were any suggestion that there would be some kind of limit on the use of such expressions.
I've got a solution that's got thousands of anonymous delegates, and it still works. Sometimes Visual Studio is a little clunky, but whether that's because we've got hundreds of projects, or this, or some other factor is unknown. The performance of the applications doesn't seem to be strongly affected (based on quite a bit of perf testing).
I have a function that I use to add vectors, like this:
public static Vector AddVector(Vector v1, Vector v2)
{
return new Vector(
v1.X + v2.X,
v1.Y + v2.Y,
v1.Z + v2.Z);
}
Not very interesting. However, I overload the '+' operator for vectors, and in the overload I call the AddVector function to avoid code duplication. I was curious whether this would result in two method calls, or whether it would be optimized away at compile or JIT time. I found out that it did result in two method calls, because I managed to gain 10% in total performance by duplicating the code of AddVector (and of the dot product method) in the '+' and '*' operator overloads. Of course, this is a niche case, because they get called tens of thousands of times per second, but I didn't expect this. I guess I expected one method to be inlined into the other, or something. And I suppose it's not just the overhead of the method call, but also the copying of the method arguments into the other method (they're structs).
It's no big deal, I can just duplicate the code (or perhaps just remove the AddVector method since I never call it directly) but it will nag me a lot in the future when I decide to create a method for something, like splitting up a large method into several smaller ones.
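For reference, the overload in question presumably looks something like this (reconstructed from the description above):

public static Vector operator +(Vector v1, Vector v2)
{
    // Delegates to AddVector to avoid duplication: one extra call
    // (plus struct copies) unless the JIT inlines it.
    return AddVector(v1, v2);
}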
If you compile in Debug mode or start the process with a debugger attached (though you can attach one later), then a large class of JIT optimisations, including inlining, won't happen.
Try re-running your tests by compiling in Release mode and then running without a debugger attached (Ctrl+F5 in VS) and see if you get the optimisations you expected.
"And I suppose it's not just the overhead of the method call, but also the copying of the method arguments into the other method (they're structs)."
Why don't you test this out? Write a version of AddVector that takes a reference to two vector structs, instead of the structs themselves.
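Something along these lines (a sketch, assuming the X, Y, and Z members are accessible; ref avoids copying the structs at the call site):

public static void AddVector(ref Vector v1, ref Vector v2, out Vector result)
{
    result = new Vector(
        v1.X + v2.X,
        v1.Y + v2.Y,
        v1.Z + v2.Z);
}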
Don't assume that struct is the right choice for performance. The copying cost can be significant in some scenarios. Until you measure you don't know. Furthermore, structs have spooky behaviors, especially if they're mutable, but even if they're not.
In addition, what others have said is correct:
Running under a debugger will disable JIT optimizations, making your performance measurements invalid.
Compiling in Debug mode also makes performance measurements invalid.
I had VS in Release mode and I ran without debugging so that can't be to blame. Running the .exe in the Release folder yields the same result. I have .NET 3.5 SP1 installed.
And whether or not I use structs depends on how many of something I create, and on how large it is when copying versus referencing.
You say Vector is a struct. According to a blog post from 2004, value-type parameters are a reason for the JIT not to inline a method. I don't know whether the rules have changed about that in the meantime.
There's only one optimization I can think of: maybe you want to have a vOut parameter, so you avoid the call to new() and hence reduce garbage collection. Of course, this depends entirely on what you are doing with the returned vector, whether you need to persist it or not, and whether you're running into garbage collection problems. (Note, though, that if Vector is a struct, new() does not allocate on the GC heap, so this mostly matters if Vector is a class.)