I am using C# to go through a loop and do something (this loop is massive, sometimes iterating over as many as 1,000,000 records). I wanted to replace the inline code with code that does the exact same thing, except in a function.
I am guessing there is a slight decrease in performance, but will it actually be noticeable?
If I have a loop:
public void main()
{
    int x = 0;
    for (int i = 0; i < 1000; i++)
    {
        x += 1;
    }
}
Would my loop slow down if I did the same thing except this time making use of a function?
public void main()
{
    int x = 0;
    for (int i = 0; i < 1000; i++)
    {
        x = incrementInt(x);
    }
}

public int incrementInt(int x)
{
    return x + 1;
}
EDIT:
Fixed logic bug, sorry for that.
A method call will always slow you down. But the JIT compiler can inline your method if a set of conditions is fulfilled, which results in assembly code equivalent to your first example (once the logic bug in the example is fixed).
The question you are indirectly asking is: under which circumstances is my method inlined? There are many different rules, but the surest way to know whether inlining works is to measure it.
You can also use PerfView to find out, for each method, why it was not inlined. Since .NET 4.5 you can also give the JIT compiler a hint to relax some of the rules and inline a method.
See http://blogs.microsoft.co.il/sasha/2012/01/20/aggressive-inlining-in-the-clr-45-jit/
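The hint mentioned above is the MethodImplOptions.AggressiveInlining flag from System.Runtime.CompilerServices, applied to the method you want inlined. A minimal sketch using the question's increment method (the containing class name is illustrative; the attribute itself is the real API):

using System.Runtime.CompilerServices;

public static class MathHelpers
{
    // Asks the JIT to relax its inlining heuristics for this method.
    // This is a hint, not a guarantee: the JIT may still decline to inline.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int IncrementInt(int x)
    {
        return x + 1;
    }
}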
There are some conditions described which prevent inlining:
Methods marked with MethodImplOptions.NoInlining
Methods larger than 32 bytes of IL
Virtual methods
Methods that take a large value type as a parameter
Methods on MarshalByRef classes
Methods with complicated flowgraphs
Methods meeting other, more exotic criteria
If you follow the rules and measure carefully, you can write highly performant code while keeping it readable and maintainable.
I have written a test application and run the performance analyzer on the code, and the function call is slower than the loop (although, as mentioned above, before the question was edited the two examples did different things).
It is very simple to analyze these things in VS2012. Just click the "ANALYZE" menu item and select "Start Performance Analysis".
Calling a function is slower than not calling it, but you can really ignore this.
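If you want to measure it yourself, here is a minimal sketch using System.Diagnostics.Stopwatch to time both variants (the iteration count is an arbitrary placeholder; run a Release build without the debugger attached for meaningful numbers):

using System;
using System.Diagnostics;

class Program
{
    static int IncrementInt(int x) { return x + 1; }

    static void Main()
    {
        const int N = 100000000; // large enough to make timings visible

        var sw = Stopwatch.StartNew();
        int a = 0;
        for (int i = 0; i < N; i++) a += 1;               // inline version
        sw.Stop();
        Console.WriteLine("inline: {0} ms (a={1})", sw.ElapsedMilliseconds, a);

        sw.Restart();
        int b = 0;
        for (int i = 0; i < N; i++) b = IncrementInt(b);  // method-call version
        sw.Stop();
        Console.WriteLine("method: {0} ms (b={1})", sw.ElapsedMilliseconds, b);
    }
}

Printing a and b at the end keeps the JIT from eliminating the loops as dead code.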
In Ruby, I have seen the following.
10.times {
    # do this block 10 times
}
With this (as far as I know, which is not much where Ruby is concerned), there is no loop iterator. In C-based languages, the closest replica I can come up with is a simple for loop.
for (int i = 0; i < 10; i++) {
    // do this block 10 times
}
But this utilizes the loop iterator, i. Is there any way in C-based languages (including those listed in the tags) to execute a block of code a number of times without the use of an iterator?
For example: If I wanted to execute a block a certain number of times, but did not care which iteration I was on.
No, this is not possible. i exists because you need to keep track of how many iterations you have performed. Even Ruby must do this under the hood, or else how would it know how many iterations had been performed?
All it really is is a difference in syntax: Ruby hides the counter from you, whereas C derivatives do not.
In Ruby, all types (including integers) are essentially complex types, while in other languages it is common for integers to be plain old primitive types. In C#, for instance, I don't think I would ever want to do this, but if the goal is purely to mimic Ruby syntax then...
Define an extension method on int
public static class Helpers
{
    public static void Times(this int value, Action action)
    {
        for (int i = 0; i < value; i++)
        {
            action.Invoke();
        }
    }
}
Do something...
10.Times(() => { ... });
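If you later decide you do care about the iteration index after all, a hypothetical overload taking an Action<int> keeps the same call style (a sketch, not part of the original answer; it would sit alongside Times in the Helpers class above):

public static void Times(this int value, Action<int> action)
{
    for (int i = 0; i < value; i++)
    {
        action(i); // hand the current iteration to the caller
    }
}

// usage: 10.Times(i => Console.WriteLine(i));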
I use both JavaScript and C# on a daily basis, and I sometimes have to consider hoisting when using JavaScript. However, C# doesn't seem to implement hoisting (that I know of) and I can't figure out why. Is it more of a design choice, or is it more akin to a security or language constraint that applies to all statically typed languages?
For the record, I'm not saying I WANT it to exist in C#. I just want to understand why it doesn't.
EDIT: I noticed the issue when I declared a variable after a LINQ query, but the LINQ query was deferred until after the variable declaration.
var results = from c in db.LoanPricingNoFee
              where c.LoanTerm == LoanTerm && c.LoanAdvance <= UpperLimit
              orderby c.LoanInstalment ascending
              select c;
int LoanTerm = 12;
Throws an error whereas:
int LoanTerm = 12;
var results = from c in db.LoanPricingNoFee
              where c.LoanTerm == LoanTerm && c.LoanAdvance <= UpperLimit
              orderby c.LoanInstalment ascending
              select c;
Does not.
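The same error in isolation, as a minimal sketch:

int y = x + 1; // error CS0841: Cannot use local variable 'x' before it is declared
int x = 10;

Note that although the query's execution is deferred, the compiler resolves names at compile time, so the declaration must still textually precede the use.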
Of all the programming languages I have used, Javascript has the most confusing scope system and hoisting is a part of that. The outcome is that it is easy to write unpredictable code in JavaScript and you have to be careful with how you write it to make it into the powerful and expressive language it can be.
C#, in common with almost every other language, assumes that you will not use a variable until you have declared it. Because it has a compiler it can enforce that by simply refusing to compile if you try to use an undeclared variable. The other approach to this, more often seen in scripting languages, is that if a variable is used without having been declared it is instantiated at first use. This can make it somewhat hard to follow the flow of code and is often used as a criticism of languages that behave that way. Most people who have used languages with block level scope ( where variables only exist at the level where they were declared ) find it a particularly weird feature of Javascript.
A couple of big reasons that hoisting can cause problems:
It is absolutely counter-intuitive: it makes code harder to read and its behaviour harder to predict unless you are aware of it. Hard-to-read and hard-to-predict code is far more likely to include bugs.
In terms of limiting the number of bugs in your code, limiting the lifetime of your variables can be really helpful. If you can declare the variable and use it in two lines of code, then having ten lines of code in between those two lines gives a lot of opportunities to accidentally affect the behaviour of the variable. There is a lot of information on this in Code Complete - if you haven't read that, I heartily recommend it.
There is a classic UX concept of the Principle of Least Astonishment; features like hoisting (or like the way JavaScript handles equality) tend to break it. People don't often think of user experience when developing programming languages, but actually programmers tend to be quite discerning users, and more than a little grumpy when they find themselves routinely caught out by odd features. JavaScript is very lucky that its unique ubiquity in the browser has created a kind of enforced popularity that means we have to tolerate its many quirks and problematic design decisions.
Finally, I cannot imagine a reason why it would be a useful addition to a language like C#: what possible benefit could it confer?
"Is it more of a design choice or is it more akin to a security or language constraint that applies to all statically typed languages?"
It's not a constraint of static typing. It would be trivial for the compiler to move all variable declarations to the top of the scope (in Javascript this is the top of the function, in C# the top of the current block) and to error if a name was declared with different types.
So the reason hoisting doesn't exist in C# is purely a design decision. Why it was designed that way I can't say; I wasn't on the team. But it was probably due to the ease of parsing (both for human programmers and the compiler) when variables are always declared before use.
There is a form of hoisting that exists in C# (and Java), in the context of loop-invariant code motion: the JIT compiler optimization which "hoists" (pulls up) expressions out of loop statements when they don't affect the actual loop.
You can learn more about it here.
Quote:
“Hoisting” is a compiler optimization that moves loop-invariant code out of loops. “Loop-invariant code” is code that is referentially transparent to the loop and can be replaced with its values, so that it doesn’t change the semantic of the loop. This optimization improves runtime performance by executing the code only once rather than at each iteration.
So this written code
public void Update(int[] arr, int x, int y)
{
    for (var i = 0; i < arr.Length; i++)
    {
        arr[i] = x + y;
    }
}
is actually optimized to be somewhat like this:
public void Update(int[] arr, int x, int y)
{
    var temp = x + y;
    var length = arr.Length;
    for (var i = 0; i < length; i++)
    {
        arr[i] = temp;
    }
}
This happens in the JIT - i.e. when translating the IL into native machine instructions - so it's not so easy to view (you can check here, and here).
I'm not an expert in reading assembly, but here is what I got from running this snippet with BenchmarkDotNet, and my comments on it showing that the optimization actually took place:
int[] arr = new int[10];
int x = 11;
int y = 19;

public void Update()
{
    for (var i = 0; i < arr.Length; i++)
    {
        arr[i] = x + y;
    }
}
Generated: (the assembly listing from the original post is not reproduced here)
Because it is a faulty concept, most probably existing because of the rushed implementation of JavaScript. It is a bad approach to coding, and one that can mislead even an experienced JavaScript coder about the scope of a variable.
Hoisting also has a potentially unnecessary runtime cost. For example, if a variable declaration is never even reached because various code-control decisions returned from the function first, then the processor does not need to waste time pushing an undefined null-reference variable onto the stack and then popping it off as part of the method's clean-up operations when it was never reached.
Also, remember that JavaScript has "variable hoisting" and "function hoisting" (among others) which are treated differently. Function hoisting wouldn't make sense in C# since it is not a top-down interpreted language. Once the code is compiled, the method might not ever be called. In JavaScript, however, the "self-invoking" functions are evaluated immediately as the interpreter parses them.
I doubt that it was an arbitrary design decision: Not only is hoisting inefficient for C#, but it just wouldn't make sense for the way that C# works.
I quite often write code that copies member variables to a local stack variable in the belief that it will improve performance by removing the pointer dereference that has to take place whenever accessing member variables.
Is this valid?
For example
public class Manager
{
    private readonly Constraint[] mConstraints;

    public void DoSomethingPossiblyFaster()
    {
        var constraints = mConstraints;
        for (var i = 0; i < constraints.Length; i++)
        {
            var constraint = constraints[i];
            // Do something with it
        }
    }

    public void DoSomethingPossiblySlower()
    {
        for (var i = 0; i < mConstraints.Length; i++)
        {
            var constraint = mConstraints[i];
            // Do something with it
        }
    }
}
My thinking is that DoSomethingPossiblyFaster is actually faster than DoSomethingPossiblySlower.
I know this is pretty much a micro optimisation, but it would be useful to have a definitive answer.
Edit
Just to add a little bit of background around this. Our application has to process a lot of data coming from telecom networks, and this method is likely to be called about 1 billion times a day for some of our servers. My view is that every little helps, and sometimes all I am trying to do is give the compiler a few hints.
Which is more readable? That should usually be your primary motivating factor. Do you even need to use a for loop instead of foreach?
As mConstraints is readonly, I'd potentially expect the JIT compiler to do this for you - but really, what are you doing in the loop? The chances of this being significant are pretty small. I'd almost always pick the second approach simply for readability - and I'd prefer foreach where possible. Whether the JIT compiler optimizes this case will very much depend on the JIT itself, which may vary between versions and architectures, and may even depend on how large the method is, among other factors. There can be no "definitive" answer here, as it's always possible that an alternative JIT will optimize differently.
If you think you're in a corner case where this really matters, you should benchmark it - thoroughly, with as realistic data as possible. Only then should you change your code away from the most readable form. If you're "quite often" writing code like this, it seems unlikely that you're doing yourself any favours.
Even if the readability difference is relatively small, I'd say it's still present and significant - whereas I'd certainly expect the performance difference to be negligible.
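For reference, the foreach form recommended above would look like this over the question's Manager class (a sketch):

public void DoSomethingReadable()
{
    foreach (var constraint in mConstraints)
    {
        // Do something with it
    }
}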
If the compiler/JIT isn't already doing this or a similar optimization for you (this is a big if), then DoSomethingPossiblyFaster should be faster than DoSomethingPossiblySlower. The best way to explain why is to look at a rough translation of the C# code to straight C.
When a non-static member function is called, a hidden pointer to this is passed into the function. You'd have roughly the following, ignoring virtual function dispatch since it's irrelevant to the question (or equivalently making Manager sealed for simplicity):
typedef struct {
    Constraint* mConstraints;
    int mLength;
} Manager;

void DoSomethingPossiblyFaster(Manager* this) {
    Constraint* constraints = this->mConstraints;
    int length = this->mLength;
    for (int i = 0; i < length; i++)
    {
        Constraint constraint = constraints[i];
        // Do something with it
    }
}

void DoSomethingPossiblySlower(Manager* this) {
    for (int i = 0; i < this->mLength; i++)
    {
        Constraint constraint = (this->mConstraints)[i];
        // Do something with it
    }
}
The difference is that in DoSomethingPossiblyFaster, the local copy constraints lives on the stack, and accessing it requires only one layer of pointer indirection, since it's at a fixed offset from the stack pointer. In DoSomethingPossiblySlower, if the compiler misses the optimization opportunity, there's an extra pointer indirection: the compiler has to read a fixed offset from the stack pointer to access this, and then read a fixed offset from this to get mConstraints.
There are two possible optimizations that could negate this hit:
The compiler could do exactly what you did manually and cache mConstraints on the stack.
The compiler could store this in a register so that it doesn't need to fetch it from the stack on every loop iteration before dereferencing it. This means that fetching mConstraints from this or from the stack is basically the same operation: A single dereference of a fixed offset from a pointer that's already in a register.
You know the response you will get, right? "Time it."
There is probably not a definitive answer. First, the compiler might do the optimization for you. Second, even if it doesn't, indirect addressing at the assembly level may not be significantly slower. Third, it depends on the cost of making the local copy, compared to the number of loop iterations. Then there are caching effects to consider.
I love to optimize, but this is one place I would definitely say wait until you have a problem, then experiment. This is a possible optimization that can be added when needed, not one of those optimizations that needs to be planned up front to avoid a massive ripple effect later.
Edit: (towards a definitive answer)
Compiling both functions in release mode and examining the IL with ILDASM shows that in both places where the "PossiblyFaster" function uses the local variable, it needs one less instruction:
ldloc.0
versus
ldarg.0
ldfld class Constraint[] Manager::mConstraints
Of course, this is still one level removed from the machine code - you don't know what the JIT compiler will do for you. But it is likely that "PossiblyFaster" is marginally faster.
However, I still don't recommend adding the extra variable until you are sure this function is the most expensive thing in your system.
I've profiled this and came up with a bunch of interesting results that are probably only valid for my specific example, but I thought would be worth while noting here.
The fastest is x86 release mode. That runs one iteration of my test in 7.1 seconds, whereas the equivalent x64 code takes 8.6 seconds. This was running 5 iterations, each iteration processing the loop 19.2 million times.
The fastest approach for the loop was:
foreach (var constraint in mConstraints)
{
    ... do stuff ...
}
The second fastest approach, which massively surprised me, was the following:
for (var i = 0; i < mConstraints.Length; i++)
{
    var constraint = mConstraints[i];
    ... do stuff ...
}
I guess this was because mConstraints was stored in a register for the loop.
This slowed down when I removed the readonly modifier from mConstraints.
So, my summary from this is that in this situation the readable form gives you the performance as well.
List<int> list = ...
for (int i = 0; i < list.Count; ++i)
{
    ...
}
So does the compiler know the list.Count does not have to be called each iteration?
Are you sure about that?
List<int> list = new List<int> { 0 };
for (int i = 0; i < list.Count; ++i)
{
    if (i < 100)
    {
        list.Add(i + 1);
    }
}
If the compiler cached the Count property above, the contents of list would be 0 and 1. If it did not, the contents would be the integers from 0 to 100.
Now, that might seem like a contrived example to you; but what about this one?
List<int> list = new List<int>();
int i = 0;
while (list.Count <= 100)
{
    list.Add(i++);
}
It may seem as if these two code snippets are completely different, but that's only because of the way we tend to think about for loops versus while loops. In either case, the value of a variable is checked on every iteration. And in either case, that value very well could change.
Typically it's not safe to assume the compiler optimizes something when the behavior between "optimized" and "non-optimized" versions of the same code is actually different.
The C# compiler does not do any optimizations like this. The JIT compiler, however, optimizes this for arrays, I believe (which are not resizable), but not for lists.
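For arrays, the shape the JIT recognizes is a loop whose bound is exactly the array's Length; in that case it can typically hoist the length read and also elide the per-element bounds checks. A minimal sketch:

int[] arr = new int[1000];
for (int i = 0; i < arr.Length; i++) // the JIT recognizes this exact pattern
{
    arr[i] = i;                      // bounds check can usually be eliminated
}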
A List's count property can change within the loop structure, so it would be an incorrect optimization.
It's worth noting, as nobody else has mentioned it, that there is no way of knowing, just from looking at a loop like this, what the Count property will actually do, or what side effects it may have.
Consider the following cases:
A third-party implementation of a property called "Count" could execute any code it wished to, e.g. return a random number for all we know. With List we can be a bit more confident about how it will operate, but how is the JIT to tell these implementations apart?
Any method call within the loop could potentially alter the return value of Count (not just a straight "Add" directly on the collection, but a user method that is called in the loop might also party on the collection)
Any other thread that happens to be executing concurrently could also change the Count value.
The JIT just can't "know" that Count is constant.
However, the JIT compiler can make the code run much more efficiently by inlining the implementation of the Count property (as long as it is a trivial implementation). In your example it may well be inlined down to a simple test of a variable value, avoiding the overhead of a function call on each iteration, and thus making the final code nice and fast. (Note: I don't know if the JIT will do this, just that it could. I don't really care - see the last sentence of my answer to find out why)
But even with inlining, the value may still be changed between iterations of the loop, so it would still need to be read from RAM for each comparison. If you were to copy Count into a local variable, and the JIT could determine by looking at the code in the loop that the local variable will remain constant for the loop's lifetime, then it may be able to optimise further (e.g. by holding the constant value in a register rather than having to read it from RAM on each iteration).
So if you (as a programmer) know that Count will be constant for the lifetime of the loop, you may be able to help the JIT by caching Count in a local variable. This gives the JIT the best chance of optimising the loop. (But there are no guarantees that the JIT will actually apply this optimisation, so it may make no difference to the execution times to manually "optimise" this way. You also risk things going wrong if your assumption (that Count is constant) is incorrect. Or your code may break if another programmer edits the contents of the loop so that Count is no longer constant, and he doesn't spot your cleverness.)
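A minimal sketch of that manual caching, assuming you know the loop body never changes the list:

int count = list.Count;         // read once, before the loop
for (int i = 0; i < count; i++) // compare against the local copy
{
    // the body must not Add or Remove items, or 'count' goes stale
}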
So the moral of the story is: the JIT can make a pretty good stab at optimising this case by inlining. Even if it doesn't do this now, it may do so in the next version. You might not gain any advantage by manually "optimising" the code, and you risk changing its behaviour and thus breaking it, or at least making future maintenance of your code more risky, or possibly losing out on future JIT enhancements. So the best approach is to just write it the way you have, and optimise it when your profiler tells you that the loop is your performance bottleneck.
Hence, IMHO it's interesting to consider/understand cases like this, but ultimately you don't actually need to know. A little bit of knowledge can be a dangerous thing. Just let the JIT do its thing, and then profile the result to see if it needs improving.
If you take a look at the IL generated for Dan Tao's example you'll see a line like this at the condition of the loop:
callvirt instance int32 [mscorlib]System.Collections.Generic.List`1<int32>::get_Count()
This is undeniable proof that Count (i.e. get_Count()) is called for every iteration of the loop.
For all the other commenters who say that the 'Count' property could change in a loop body: JIT optimizations let you take advantage of the actual code that's running, not the worst-case of what might happen. In general, the Count could change. But it doesn't in all code.
So in the poster's example (which might not have any Count-changing), is it unreasonable for the JIT to detect that the code in the loop doesn't change whatever internal variable List uses to hold its length? If it detects that list.Count is constant, wouldn't it lift that variable access out of the loop body?
I don't know if the JIT does this or not. But I am not so quick to brush this problem off as trivially "never."
No, it doesn't, because the condition is evaluated on each step. It can be more complex than just a comparison with Count; any boolean expression is allowed:
for (int i = 0; new Random().NextDouble() < .5d; i++)
    Console.WriteLine(i);
http://msdn.microsoft.com/en-us/library/aa664753(VS.71).aspx
It depends on the particular implementation of Count; I've never noticed any performance issues with using the Count property on a List so I assume it's ok.
In this case you can save yourself some typing with a foreach.
List<int> list = new List<int>() { 0 };
foreach (int item in list)
{
    // ...
}
The question field is a bit too short to pose my real question. If anyone can recapitulate it better, please feel free.
My real question is this: I'm reading a lot of other people's code in C# these days, and I have noticed that one particular form of iteration is widely spread, (see in code).
My first question is:
Are all these iterations equivalent?
And my second is: why prefer the first? Has it something to do with readability? I don't believe the first form is more readable than the for-form once you get used to it, and readability is far too subjective a matter in these constructs; of course, what you use the most will seem more readable, but I can assure everyone that the for-form is at least as readable, since it has everything in one line and you can even read the initialization in the construct.
Hence the follow-up question: why is the third form seen much less in code?
// the 'widespread' construct
int nr = getNumber();
while (NotZero(nr))
{
Console.Write(1/nr);
nr = getNumber();
}
// the somewhat shorter form
int nr;
while (NotZero(nr = getNumber()))
    Console.Write(1 / nr);
// the for - form
for (int nr = getNumber(); NotZero(nr); nr = getNumber())
    Console.Write(1 / nr);
The first and third forms you've shown repeat the call to getNumber. I prefer the second form, although it has the disadvantage of using a side effect within a condition, of course. However, I pretty much only do that with a while loop. Usually I don't end up passing the result as an argument though - the common situations I find myself in are:
string line;
while ( (line = reader.ReadLine()) != null)
...
and
int bytesRead;
while ( (bytesRead = stream.Read(buffer, 0, buffer.Length)) > 0)
...
Both of these are now so idiomatic to me that they don't cause me any problems - and as I say, they allow me to only state each piece of logic once.
If you don't like the variable having too much scope, you can just introduce an extra block:
{
    int bytesRead;
    while ((bytesRead = stream.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Body
    }
}
Personally I don't tend to do this - the "too-wide" scope doesn't bother me that much.
I suspect it wouldn't be too hard to write a method to encapsulate all of this. Something like:
ForEach(() => reader.ReadLine(), // Way to obtain a value
        line => line != null,    // Condition
        line =>
        {
            // body
        });
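A minimal sketch of what such a helper might look like (the signature is my guess, not from the original answer):

static void ForEach<T>(Func<T> source, Func<T, bool> condition, Action<T> body)
{
    // Obtain a fresh value before each test, exactly like the while-loop idiom.
    for (T item = source(); condition(item); item = source())
    {
        body(item);
    }
}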
Mind you, for line reading I have a class which helps:
foreach (string line in new LineReader(file))
{
    // body
}
(It doesn't just work with files - it's pretty flexible.)
Are all these iterations equivalent?
Yes.
Why prefer the first? Has it something to do with readability?
Because you may want to extend the scope of the nr variable beyond the while loop.
Why is the third form seen much less in code?
It is equivalent - the same statements!
You may prefer the latter because you don't want to extend the scope of the nr variable.
I think that the third form (for-loop) is the best of these alternatives, because it puts things into the right scope. On the other hand, having to repeat the call to getNumber() is a bit awkward, too.
Generally, I think that explicit looping is widely overused. High-level languages should provide mapping, filtering, and reducing. When these high level constructs are applicable and available, looping instead is like using goto instead of looping.
If mapping, filtering, or reducing is not applicable, I would perhaps write a little macro for this kind of loop (C# doesn't have those, though, does it?).
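C# doesn't have macros, but the mapping, filtering, and reducing mentioned above are available through LINQ. A quick sketch over the question's NotZero, assuming the numbers have been collected into some IEnumerable<int> called numbers:

using System.Linq;

double sum = numbers.Where(n => NotZero(n))               // filter
                    .Select(n => 1.0 / n)                 // map
                    .Aggregate(0.0, (acc, v) => acc + v); // reduce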
I offer another alternative:
foreach (var x in InitInfinite(() => GetNumber()).TakeWhile(NotZero))
{
    Console.WriteLine(1.0 / x);
}
where InitInfinite is a trivial helper function. Whole program:
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static IEnumerable<T> InitInfinite<T>(Func<T> f)
    {
        while (true)
        {
            yield return f();
        }
    }

    static int N = 5;

    static int GetNumber()
    {
        N--;
        return N;
    }

    static bool NotZero(int n) { return n != 0; }

    static void Main(string[] args)
    {
        foreach (var x in InitInfinite(() => GetNumber()).TakeWhile(NotZero))
        {
            Console.WriteLine(1.0 / x);
        }
    }
}
I think people use the while() loop often because it best represents the way you would visualize the task in your head. I don't think there are any performance benefits to using it over any other loop structure.
Here is a random speculation:
When I write C# code, the only two looping constructs I write are while() and foreach(). That is, no one uses 'for' any more, since 'foreach' often works and is often superior. (This is an overgeneralization, but it has a core of truth.) As a result, my brain has to strain to read any 'for' loop because it's unfamiliar.
As for why (1) and (2) are "preferred" over (3), my feeling is that most people think of the latter as a way to iterate over a range, using the condition to define the range, rather than continuing to iterate over a block while some condition still holds. The keyword semantics lend themselves to this interpretation and I suspect that, partly because of that, people find that the expressions are most readable in that context. For instance, I would never use (1) or (2) to iterate over a range, though I could.
Between (1) and (2), I'm torn. I used to use (2) (in C) most often due to the compactness, but now (in C#) I generally write (1). I suppose that I've come to value readability over compactness and (1) seems easier to parse quickly and thus more readable to my mind even though I do end up repeating a small amount of logic.
Honestly, I rarely write while statements anymore, typically using foreach - or LINQ - in the cases where while statements would previously have been used. Come to think of it, I'm not sure I use many for statements either, except in unit tests where I'm generating some fixed number of test objects.