Lately, I have taken to the pattern of having a lot of diagnostic logging in parts of my code that makes use of lambda expressions/anonymous delegates, like so:
MyEventManager.LogVerbose( LogCategory.SomeCategory, () => String.Format(msg_string, GetParam1(), GetParam2(), GetParam3()) );
Notice that the second argument to LogVerbose is a lambda expression which evaluates to a string. The reason for this is that if verbose logging is not actually enabled, LogVerbose should exit having done as little work as possible, in order to minimize performance impact. The construction of the error message string may, in some cases, take time or resources, and if the lambda expression is never evaluated, that performance penalty will not be incurred.
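For context, the guard inside LogVerbose might look roughly like the sketch below; the VerboseEnabled flag and Write helper are hypothetical stand-ins, and only the Func<string> pattern matters.

using System;

public enum LogCategory { SomeCategory }

public static class MyEventManager
{
    public static bool VerboseEnabled { get; set; }

    public static void LogVerbose(LogCategory category, Func<string> messageFactory)
    {
        if (!VerboseEnabled)
            return;                          // cheap early exit: the lambda is never invoked

        Write(category, messageFactory());   // the message string is built only when needed
    }

    private static void Write(LogCategory category, string message)
    {
        Console.WriteLine("[{0}] {1}", category, message);
    }
}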
I'm wondering if littering the type system with so many anonymous delegates like this will have some unforeseen consequence for application performance, or if there are any other strategies I should consider.
It should be fine. In particular, if your anonymous function doesn't capture anything, it is cached as a static field (because it can be). If you capture "this" then you'll end up creating new delegate instances, but they're not expensive.
If you capture local variables, that will involve instantiating a nested type - but I'd only worry about this if you saw it actually becoming a problem. As ever with optimisation, focus on readability first, measure the performance, and then profile it to find out where you need to concentrate your efforts.
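As a rough illustration of the three capture cases described above (a sketch, not an exact picture of the compiler's output):

using System;

class CaptureExamples
{
    private int instanceField;

    void Demo(int local)
    {
        // No captures: the compiler can cache this delegate in a static field,
        // so repeated executions reuse a single instance.
        Func<string> noCapture = () => "constant message";

        // Captures "this": a new delegate instance is created, but no extra class;
        // the delegate's target is simply the current object.
        Func<string> capturesThis = () => instanceField.ToString();

        // Captures a local: the compiler generates a hidden closure class to hold
        // "local", so both the closure object and the delegate are allocated.
        Func<string> capturesLocal = () => local.ToString();
    }
}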
Whilst I don't actually know the answer to the question for sure, I think it's worth considering that the drive towards a more functional style of programming in C# would be seriously undermined if there were any suggestion of a limit on the use of such expressions.
I've got a solution with thousands of anonymous delegates, and it still works. Sometimes Visual Studio is a little clunky, but whether that's because we have hundreds of projects, because of this, or because of some other factor is unknown. The performance of the applications doesn't seem to be strongly affected (and we do quite a bit of performance testing).
Is there a way to use a custom memory allocator for LINQ?
For example when I call:
someCollection.Where(x).SelectMany(y).ToList();
Methods like ToList() or OrderBy() always allocate a new backing array, so a lot of garbage is produced and the GC runs frequently.
With a custom allocator, I could always reuse the same List, clearing and refilling it every time. I am aware that reusing buffers can lead to problems with reentrancy.
The background is, my application is a game and GC means stuttering.
Please don't tell me "Use C++ instead" or "Do not use LINQ", I know that :)
(Although you asked not to be given this suggestion, I think this answer could help the community.)
LINQ is a facility built on top of the CLR, so it uses the CLR allocator, and that cannot be changed.
You can tune it a little bit, for example by configuring whether or not the GC cycle should be offloaded to a background thread, but you can't go any further.
The aim of LINQ is to simplify writing code for a certain class of problems, sacrificing the freedom to choose the implementation of every building block (that's why we usually choose LINQ).
However, depending on the scenario, LINQ might not be your best friend, because its design choices may play against yours.
If, after profiling your code, you identify a serious performance problem, first try to determine whether you can isolate the bottleneck in some of the LINQ methods and whether you can roll your own implementation via extension methods.
Of course this option is only viable when you are the main caller, unless you manage to roll something that is IEnumerable-compliant. You would need to be very lucky, because your implementation has to abide by the LINQ rules. In particular, since you are not in control of how the objects are manipulated, you cannot perform the optimizations you would in your own code.
Closures and deferred execution work against you.
Otherwise, as suggested in the comments, the only viable option is to avoid using LINQ for that specific task.
The reason for stepping away from LINQ is that it is not the right tool to solve your problem under the performance constraints you require.
Additionally, as stated in the comments, the (ab)use of lambda expressions significantly increases memory pressure, because backing objects are created to implement the closures.
We had performance issues similar to yours, where we had to rewrite certain slow paths. In other (rare) cases, preallocating the lists and loading the results via AddRange helped.
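As an illustration of that kind of rewrite, here is a minimal sketch of a hand-rolled replacement for a Where/Select/ToList chain that refills a caller-owned buffer instead of allocating a new list; the FillFiltered name and the buffer parameter are my own, not an existing API.

using System;
using System.Collections.Generic;

static class NonAllocatingQueries
{
    // Roughly equivalent to source.Where(predicate).Select(selector).ToList(),
    // but reuses the caller's list (and its backing array) instead of allocating.
    public static void FillFiltered<TSource, TResult>(
        IReadOnlyList<TSource> source,
        Func<TSource, bool> predicate,
        Func<TSource, TResult> selector,
        List<TResult> buffer)
    {
        buffer.Clear();                        // keeps the existing backing array
        for (int i = 0; i < source.Count; i++)
        {
            if (predicate(source[i]))
                buffer.Add(selector(source[i]));
        }
    }
}

Passing cached, non-capturing delegates for predicate and selector also avoids the closure allocations mentioned above.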
There is a common belief that reflection is slow and should be avoided as much as possible. But is that belief still true today? There have been a lot of changes in the current .NET versions, such as the use of IL weaving (i.e. IL emit), as opposed to the traditional PropertyInfo and MethodInfo ways of performing reflection.
Is there any convincing proof that the new reflection is no longer that slow and can be used? Is there a better way to read attribute data?
Thanks,
Bhaskar
When you think about it, reflection is pretty darn impressive in how fast it is.
A cached delegate from a ConstructorInfo or MethodInfo can be called with speed comparable to any other delegate.
A delegate created from ILGenerator.Emit (which incidentally isn't new, it's been in .NET since version 1) can likewise be called just as fast.
An object obtained through emitting or calling a ConstructorInfo's delegate will be just as fast as any other object.
If you obtain an object by loading an assembly dynamically, using reflection to find the method to call, and calling it, and it implements a defined interface through which you call it from that point on, then it'll be just as fast in how it's used as another implementation of that interface.
In all, reflection gives us ways of doing things that, if we could do them at all without it, we would have to do with techniques that are slower both to code and to execute.
It also gives us means of doing things that are more complicated, more brittle, less type-safe and less performant than other means too. Pretty much every line of C# code can be replaced by a big chunk of code that uses reflection. This code will almost certainly be worse than the original line in a whole bunch of ways, and performance is the least of them.
Quite possibly the "avoid reflection because it's slow" advice stems from the belief that the sort of developer who would go nuts for any new technique just because it seemed cool is also the sort more likely to be warned off by "it'll be slower" than by "it'll be less idiomatic, more error-prone and harder to maintain". Quite possibly this belief is completely correct.
For the most part, though, when the most natural and obvious approach is to use reflection, it won't be less performant than a really convoluted attempt to avoid it.
If performance concerns apply to anything in reflection, it's really to the uses that are hidden:
Using dynamic can seem sensible in a case where only a little work could avoid it. Here the performance difference may be worth considering.
In ASP.NET, using <%#DataBinder.Eval(Container.DataItem, "SomeProperty")%> is easier but generally less performant than <%#((SomeType)Container.DataItem).SomeProperty%> or <%#SomeCodeBehindProvidedCallWithTheSameResult%>. I'll still use the former 90% of the time, and the latter only if I really care about a given page's performance, or, more likely, because doing many operations on the same object makes the latter actually more natural.
So, in all, everything in computers remains "slow", for some value of "slow", as long as they are bounded by the requirement to work in a single universe in a way that consumes energy and takes time. Different reflection techniques have different costs, but so do the alternatives. Beware not so much of reflection where it's the obvious approach as of hidden reflection, where another approach that is only slightly less obvious may serve well.
And of course, code wisely with whatever technique you use. If you're going to call the same delegate a hundred times in a row, you should be storing it rather than obtaining it each call, and that would go whether the way to obtain it was through reflection or not.
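A minimal sketch of that storing pattern applied to reflection, binding a MethodInfo to a strongly typed delegate once; int.Parse is just a stand-in target.

using System;
using System.Reflection;

static class ReflectionCaching
{
    // Resolve the MethodInfo once and bind it to a typed delegate; calling the
    // delegate afterwards costs about the same as any other delegate call.
    private static readonly Func<string, int> CachedParse =
        (Func<string, int>)Delegate.CreateDelegate(
            typeof(Func<string, int>),
            typeof(int).GetMethod("Parse", new[] { typeof(string) }));

    public static int Parse(string s)
    {
        return CachedParse(s);   // no per-call reflection
    }
}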
It all depends.
Yes, using reflection is without any doubt slower than not using reflection, but you have to look at the big picture:
For example, imagine that your code does some reflection and then loads some data from a database over the network.
In this case, you can completely neglect the additional cost for reflection because the database request over the network will likely take much, much longer.
I wouldn't worry about the performance cost that much. Not saying performance doesn't matter, but in a lot of cases, a millisecond or two isn't noticeable or worth choosing a more complicated coding alternative. Take a look here.
I generally prefer this approach:
write simple code
measure performance
optimize the biggest hitter, if needed
What do you mean by a common belief? Reflection is not a solid block; you should consider each method separately. For instance, creating an object through a default constructor is a few times slower than a direct call, while creating one through a parameterized constructor is tens of times slower. So if you want to study the speed, do a benchmark, and benchmark the concrete functions you need.
PS. Using C#, you can always create and compile expressions on the fly, which, if you manage to do it, will be much faster than reflection.
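A minimal sketch of that idea, building and compiling "() => new T()" once and reusing the resulting delegate; the ExpressionActivator name is illustrative.

using System;
using System.Linq.Expressions;

static class ExpressionActivator
{
    // Compiling the expression once avoids ConstructorInfo.Invoke and
    // Activator.CreateInstance on every subsequent call.
    public static Func<T> CreateFactory<T>() where T : new()
    {
        return Expression.Lambda<Func<T>>(Expression.New(typeof(T))).Compile();
    }
}

// Usage: compile once, cache the factory (e.g. in a static field), then call it freely.
// static readonly Func<List<int>> MakeList = ExpressionActivator.CreateFactory<List<int>>();
// var list = MakeList();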
Yes, reflection is slow when we perform bulky operations with it or use it inside a loop.
You can try the dynamic option as well: with dynamic, the call site can be cached, and it will eventually be faster than reflection.
If you need to look at attributes applied to source code, then you pretty much have to use reflection.
Some reflection operations are fast, such as Object.GetType(), but some operations are relatively slow, such as Type.GetMethod("MyMethod").
In general, I would say that if your application makes occasional use of Reflection, there should be no performance concern. On the other hand, if your application uses Reflection extensively, then you might observe some slowness.
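If the same attribute data is read extensively, one common mitigation is to cache the results so reflection runs only once per type; a minimal sketch (the AttributeCache type is mine, not a framework class):

using System;
using System.Collections.Concurrent;

static class AttributeCache
{
    private static readonly ConcurrentDictionary<Type, Attribute[]> Cache =
        new ConcurrentDictionary<Type, Attribute[]>();

    // Reflects over a type's attributes only the first time; later calls are a
    // simple dictionary lookup.
    public static Attribute[] GetTypeAttributes(Type type)
    {
        return Cache.GetOrAdd(type, t => Attribute.GetCustomAttributes(t));
    }
}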
Introspection is a heavy job. "Slow" is relative to a lot of things: invoking a method or constructor through reflection is slow, but using reflection only to retrieve metadata is not.
Keep in mind that reflection should only be used to retrieve metadata.
If you need to invoke a method or run something, emit dynamic types/methods at initialization time and invoke them through interfaces/delegates.
There is some code to execute after validation.
Consider a variable SOQualityStandards = true;
This variable is validated before the code executes.
I have come across two ways of checking SOQualityStandards.
One is:
if(SOQualityStandards)
{
//code to execute
}
and the other is:
if(!SOQualityStandards) return;
//code to execute
Is there any performance difference between the two? Which one should I use?
They have the same semantics (assuming there is no other code in the function after the if-block in the first example).
I find the first to be clearer, but that is a matter of personal preference.
The compiler will treat those two options as the same and could transform one into the other or vice versa, so performance considerations are irrelevant. Even if performance were affected, I think readability/maintainability is the larger issue in these questions anyway.
I tend to do the return at the beginning in cases like this, because it reduces the indentation and the mental burden of reading the rest of the method. The states tested in returns at the beginning become states that no longer need to be considered in understanding the method. Large if blocks, on the other hand, require mentally tracking the state differences throughout the method.
This becomes especially important if there are several tests that need to be done to guard an interior block of code.
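A small sketch of that guard-clause style; the Order type and Ship method are made up for illustration.

using System.Collections.Generic;

class Order
{
    public bool IsValid { get; set; }
    public List<string> Items { get; } = new List<string>();
}

class OrderProcessor
{
    // Each early return removes one condition the reader has to keep tracking,
    // and the main logic stays at a single indentation level.
    public void Process(Order order)
    {
        if (order == null) return;
        if (!order.IsValid) return;
        if (order.Items.Count == 0) return;

        Ship(order);   // main logic, with nothing left to guard against
    }

    private void Ship(Order order)
    {
        // placeholder for the real work
    }
}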
There's no difference in the two approaches. It's a matter of personal choice.
Choosing between the two is just personal preference (see Sven's answer and bomslang's answer). Micro-optimization is in most cases completely unnecessary.
You shouldn't optimize execution time until you see that it's a problem. You could be spending that valuable time adding other functionality or coming up with improvements to the system architecture.
In the case you actually need to optimize, loops and recursive functions are generally the first place to look.
If you would need further optimization than that, single line variable checks and manipulations would still be some of the last things to optimize.
Remember (as Jeffrey said) readability and maintainability are in most cases the most important factors.
I've got an idea for caching that I'm beginning to implement:
Memoizing functions and storing the return value along with a hash of the function signature in Velocity. Using PostSharp, I want to check the cache and return a rehydrated representation of the return value instead of calling the function again. I want to use attributes to control this behavior.
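Stripped of PostSharp and Velocity, the core of the idea looks roughly like this sketch; the Memoizer helper is mine, not part of the plan above.

using System;
using System.Collections.Concurrent;

static class Memoizer
{
    // Wraps a function so repeated calls with the same argument return the
    // cached result instead of re-running the function.
    public static Func<TArg, TResult> Memoize<TArg, TResult>(Func<TArg, TResult> f)
    {
        var cache = new ConcurrentDictionary<TArg, TResult>();
        return arg => cache.GetOrAdd(arg, f);
    }
}

// Usage (ExpensiveComputation is any hypothetical pure, slow function):
// var cachedLookup = Memoizer.Memoize<int, int>(n => ExpensiveComputation(n));
// cachedLookup(4);   // computed
// cachedLookup(4);   // served from the cache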
Unfortunately, this could prove dangerous to other developers in my organization, if they fall in love with the performance gain and start decorating every method in sight with caching attributes, including some with side effects. I'd like to kick out a compiler warning when the memoization library suspects that a function may cause side effects.
How can I tell that code may cause side effects using CodeDom or Reflection?
This is an extremely hard problem, both in practice and in theory. We're thinking hard about ways to either prevent or isolate side effects for precisely your scenarios -- memoization, automatic parallelization, and so on -- but it's difficult and we are still far from a workable solution for C#. So, no promises. (Consider switching to Haskell if you really want to eliminate side effects.)
Unfortunately, even if a miracle happened and you found a way to prevent memoization of methods with side effects, you've still got some big problems. Consider the following:
1) What if you memoize a function that is itself calling a memoized function? That's a good situation to be in, right? You want to be able to compose memoized functions. But memoization has a side effect: it adds data to a cache! So immediately you have a meta-problem: you want to tame side effects, but only "bad" side effects. The "good" ones you want to encourage, the bad ones you want to prevent, and it is hard to tell them apart.
2) What are you going to do about exceptions? Can you memoize a method which throws an exception? If so, does it always throw the same exception, or does it throw a new exception every time? If the former, how are you going to do it? If the latter, now you have a memoized function which has two different results on two different calls because two different exceptions are thrown. Exceptions can be seen as a side effect; it is hard to tame exceptions.
3) What are you going to do about methods which do not have a side effect but are nevertheless impure methods? Suppose you have a method GetCurrentTime(). That doesn't have a side effect; nothing is mutated by the call. But this is still not a candidate for memoization because any two calls are required to produce different results. You don't need a side-effects detector, you need a purity detector.
I think your best bet is to solve the human problem via education and code reviews, rather than trying to solve the hard technical problem.
Simply speaking, you can't with either CodeDom or Reflection.
To accurately determine whether or not a method causes side effects you must understand what actions it is taking. For .NET that means cracking open the IL and interpreting it in some manner.
Neither Reflection nor CodeDom gives you this capability.
CodeDom is a facility for generating code into an application and only has very limited inspection capabilities. It's essentially limited to the subset of the language understood by the various parsing engines.
Reflection's strength lies in its ability to inspect metadata, not the underlying IL of the method bodies. Metadata can only give you a very limited set of information as to what does and does not cause side effects.
Reflection in itself won't do it, because the metadata doesn't have any such attributes.
CodeDom may not be powerful enough to inspect all IL instructions.
So you'd have to use the very low-level pieces of the reflection API that let you get a byte[] containing the raw IL of each method, and analyze that. So it's possible in principle, but not easy.
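For reference, the entry point for that is MethodBase.GetMethodBody(); this sketch only retrieves the raw IL bytes, and the actual analysis, which is the hard part, is not attempted.

using System.Reflection;

static class IlInspector
{
    // MethodBase.GetMethodBody() exposes a method's raw IL as a byte array;
    // abstract/extern methods have no body, hence the null check.
    public static byte[] GetRawIl(MethodInfo method)
    {
        MethodBody body = method.GetMethodBody();
        return body == null ? new byte[0] : body.GetILAsByteArray();
    }
}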
You'd have to analyze all the instructions and observe what effects they have, and whether those effects are going to survive outside of some significant scope (e.g. do they modify the fields of objects that can leak out through return values or out parameters, or do they just modify transient objects that are guaranteed to be unreachable outside the method?).
Sounds pretty complicated!
I am getting two contradictory views on this. One source says there should be fewer, smaller methods to reduce the number of method calls, but another source says that writing shorter methods is good because it lets the JIT do its optimizations.
So, which side is correct?
The overhead of actually making the method call is inconsequentially small in most every case. You never need to worry about it unless you can clearly identify a problem down the road that requires revisiting the issue (you won't).
It's far more important that your code is simple, readable, modular, maintainable, and modifiable. Methods should do one thing, one thing only and delegate sub-things to other routines. This means your methods should be as short as they can possibly be, but not any shorter. You will see far more performance benefits by having code that is less prone to error and bugs because it is simple, than by trying to outsmart the compiler or the runtime.
The source that says methods should be long is wrong, on many levels.
Neither; you should have relatively short methods to achieve readability.
There is no single simple rule about function size. The guideline is that a function should do 'one thing'. That's a little vague, but it becomes easier with experience. Small functions generally lead to readability; big ones are occasionally necessary.
Worrying about the overhead of method calls is premature optimization.
As always, it's about finding a good balance. The most important thing is that the method does one thing only. Longer methods tend to do more than one thing.
The best single criterion to guide you in sizing methods is to keep them well-testable. If you can (and actually DO!-) thoroughly unit-test every single method, your code is likely to be quite good; if you skimp on testing, your code is likely to be, at best, mediocre. If a method is difficult to test thoroughly, then that method is likely to be "too big" -- trying to do too many things, and therefore also harder to read and maintain (as well as badly-tested and therefore a likely haven for bugs).
First of all, you should definitely not be micro-optimizing performance at the number-of-methods level. You will most likely not get any measurable performance benefit. Only if you have some method that is being called in a tight loop millions of times might it be worth considering - but don't begin optimizing there before you need to.
You should stick to short, concise methods that do one thing and make the intent of the method clear. This will give you easier-to-read code that is easier to understand and promotes code reuse.
The most important cost to consider when writing code is maintainability. You will spend much, much more time maintaining an application and fixing bugs than you ever will fixing performance problems.
In this case, the almost certainly insignificant cost of calling a method is incredibly small compared to the cost of maintaining a large, unwieldy method. Small, concise methods are easier to maintain and comprehend. Additionally, the cost of calling the method will almost certainly not have a significant performance impact on your application, and if it does, you can only ascertain that by using a profiler. Developers are notoriously bad at identifying performance problems beforehand.
Generally speaking, once a performance problem is identified, it is easy to fix. Making a method, or more importantly a code base, maintainable is a much higher cost.
Personally, I am not afraid of long methods as long as the person writing them writes them well (every sub-task separated by two newlines and preceded by a nice comment, etc.; indentation is also very important).
In fact, many times I even prefer them (e.g. when writing code that does things in a specific order with sequential logic).
Also, I really don't understand why breaking a long method into 100 pieces will improve readability (as others suggest). Quite the opposite: you will only end up jumping all over the place and holding pieces of code in your memory just to get a complete picture of what is going on in your code. Combine that with a possible lack of comments, bad function names, and many similar function names, and you have the perfect recipe for chaos.
Also, you could go to the other extreme while trying to reduce the size of the methods: creating MANY classes and MANY functions, each of which may take MANY parameters. I don't think this improves readability either (especially for a beginner on a project who has no clue what each class/method does).
And the demand that "a function should do 'one thing'" is very subjective. 'One thing' may be anything from incrementing a variable to doing a ton of work supposedly for the 'same thing'.
My rule is reusability only:
The same code should not appear many times in many places. If it does, you need a new function.
All the rest is just philosophical talk.
When asked "why do you make your methods so big?", I reply, "why not, if the code is simple?".