Performance of static methods vs instance methods - c#

My question is relating to the performance characteristics of static methods vs instance methods and their scalability. Assume for this scenario that all class definitions are in a single assembly and that multiple discrete pointer types are required.
Consider:
public sealed class InstanceClass
{
    public int DoOperation1(string input)
    {
        // Some operation.
    }
    public int DoOperation2(string input)
    {
        // Some operation.
    }
    // … more instance methods.
}
public static class StaticClass
{
    public static int DoOperation1(string input)
    {
        // Some operation.
    }
    public static int DoOperation2(string input)
    {
        // Some operation.
    }
    // … more static methods.
}
The above classes represent a helper style pattern.
In an instance class, resolving the instance method takes a moment, as opposed to StaticClass.
My questions are:
1. When keeping state is not a concern (no fields or properties are required), is it always better to use a static class?
2. Where there is a considerable number of these static class definitions (say 100 for example, with a number of static methods each) will this affect execution performance or memory consumption negatively as compared with the same number of instance class definitions?
3. When another method within the same instance class is called, does the instance resolution still occur? For example using the [this] keyword like this.DoOperation2("abc") from within DoOperation1 of the same instance.

In theory, a static method should perform slightly better than an instance method, all other things being equal, because of the extra hidden this parameter.
In practice, this makes so little difference that it'll be hidden in the noise of various compiler decisions. (Hence two people could "prove" one better than the other with disagreeing results). Not least since the this is normally passed in a register and is often in that register to begin with.
This last point means that in theory, we should expect a static method that takes an object as a parameter and does something with it to be slightly less good than the equivalent as an instance on that same object. Again though, the difference is so slight that if you tried to measure it you'd probably end up measuring some other compiler decision. (Especially since the likelihood of that reference being in a register the whole time is quite high too).
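To make the hidden this parameter concrete, here is a minimal sketch. The names Helper, StaticHelper and HiddenThisDemo are invented for illustration and do not come from the question. It binds an instance method as an "open instance" delegate via Delegate.CreateDelegate, which exposes the receiver as an ordinary first argument, the same shape as the static version.

```csharp
using System;
using System.Reflection;

public sealed class Helper
{
    // Instance method: the receiver is passed as a hidden first argument.
    public int Length(string input) { return input.Length; }
}

public static class StaticHelper
{
    // Static equivalent with the receiver made explicit.
    public static int Length(Helper self, string input) { return input.Length; }
}

public static class HiddenThisDemo
{
    // Binds Helper.Length as an open instance delegate: the instance
    // becomes the first parameter of the resulting Func.
    public static Func<Helper, string, int> OpenLength()
    {
        MethodInfo method = typeof(Helper).GetMethod("Length");
        return (Func<Helper, string, int>)Delegate.CreateDelegate(
            typeof(Func<Helper, string, int>), method);
    }
}
```

Calling HiddenThisDemo.OpenLength()(new Helper(), "abc") and StaticHelper.Length(new Helper(), "abc") gives the same result; the only difference is whether the first argument is written out or implied.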
The real performance differences will come down to whether you've artificially got objects in memory to do something that should naturally be static, or you're tangling up chains of object-passing in complicated ways to do what should naturally be instance.
Hence for number 1. When keeping state isn't a concern, it's always better to be static, because that's what static is for. It's not a performance concern, though there is an overall rule of playing nicely with compiler optimisations - it's more likely that someone went to the effort of optimising cases that come up with normal use than those which come up with strange use.
Number 2. Makes no difference. There's a certain amount of per-class cost for each member in terms of how much metadata there is, how much code there is in the actual DLL or EXE file, and how much jitted code there'll be. This is the same whether it's instance or static.
With item 3, this is as this does. However note:
The this parameter is passed in a particular register. When calling an instance method within the same class, it'll likely be in that register already (unless it was stashed and the register used for some reason) and hence there is no action required to set the this to what it needs to be set to. This applies to a certain extent to e.g. the first two parameters to the method being the first two parameters of a call it makes.
Since it'll be clear that this isn't null, this may be used to optimise calls in some cases.
Since it'll be clear that this isn't null, this may make inlined method calls more efficient again, as the code produced to fake the method call can omit some null-checks it might need anyway.
That said, null checks are cheap!
It is worth noting that generic static methods acting on an object, rather than instance methods, can reduce some of the costs discussed at http://joeduffyblog.com/2011/10/23/on-generics-and-some-of-the-associated-overheads/ in the case where that given static isn't called for a given type. As he puts it "As an aside, it turns out that extension methods are a great way to make generic abstractions more pay-for-play."
However, note that this relates only to the instantiation of other types used by the method, that don't otherwise exist. As such, it really doesn't apply to a lot of cases (some other instance method used that type, some other code somewhere else used that type).
Summary:
Mostly the performance costs of instance vs static are below negligible.
What costs there are will generally come where you abuse static for instance or vice-versa. If you don't make it part of your decision between static and instance, you are more likely to get the correct result.
There are rare cases where static generic methods in another type result in fewer types being created than instance generic methods would, so it can sometimes be a small benefit to turn rarely used generic instance methods into statics (and "rarely" refers to which types the method is used with in the lifetime of the application, not how often it's called). Once you get what he's talking about in that article you'll see that it's 100% irrelevant to most static-vs-instance decisions anyway. Edit: And it mostly only has that cost with ngen, not with jitted code.
Edit: A note on just how cheap null-checks are (which I claimed above). Most null-checks in .NET don't check for null at all; rather they continue what they were going to do with the assumption that it'll work, and if an access violation happens it gets turned into a NullReferenceException. As such, mostly when conceptually the C# code involves a null-check because it's accessing an instance member, the cost if it succeeds is actually zero. An exception would be some inlined calls (because they want to behave as if they called an instance member): they just hit a field to trigger the same behaviour, so they are also very cheap, and they can still often be left out anyway (e.g. if the first step in the method involved accessing a field as it was).

When keeping state is not a concern (no fields or properties are
required), is it always better to use a static class?
I would say yes. By declaring something static you declare an intent of stateless execution (it's not mandatory, but it is what one would expect).
Where there is a considerable number of these static classes (say 100
for instance, with a number of static methods each) will this affect
execution performance or memory consumption negatively as compared
with the same number of instance classes?
I don't think so, provided you're sure that the static classes really are stateless; if they are not, it's easy to mess up memory allocations and get memory leaks.
When the [this] keyword is used to call another method within the same
instance class, does the instance resolution still occur?
Not sure about this point (it is purely an implementation detail of the CLR), but I think yes.

Static methods are faster but less OOP. If you'll be using design patterns, a static method is likely bad code. Business logic is better written as non-static. Common functions like file reading, WebRequest etc. are better as static. Your questions have no universal answer.

Related

Usage of Instance Variable within the class for Java/C#

Assume that 2 different methods - one static and one non-static - need an instance variable.
The variable is used 3-5 different times within the methods for comparison purposes.
The variable is NOT changed in any manner.
Also, would the type of variable (String, Collection, etc.) make any difference to how it should be coded?
What is the best/right way of using Instance Variable within a private method (static and non-static)?
Pass as method argument
Store locally by using the method to get the value - this.getClaimPropertyVertices();
Store locally by getting the value - this.claimPropertyVertices;
Use the instance variable directly in the method
When creating a local variable to store the value will the "final" keyword provide any advantages, if the variable will not be changed.
Edit 1: Based on a comment, I am adding additional information
The value cannot be created locally in the method. It has to come from the class or some other method accessed by the class.
My Solution Based on the Answers:
Based on the answers by @EricJ. and @Jodrell, I went with option 1 and also created it as a private static method. I also found some details here to support this.
When creating a local variable to store the value will the "final" keyword provide any advantages, if the variable will not be changed
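In Java, final provides an optimization opportunity to the compiler. It states that the contents of the variable will not be changed. The keyword readonly plays a similar role in C#, though only for fields; for ordinary local variables C# offers only const, which is limited to compile-time constants.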
Whether or not that additional opportunity for optimization is meaningful depends on the specific problem. In many cases, the cost of other portions of the algorithm will be vastly larger than optimizations that the compiler is able to make due to final or readonly.
Use of those keywords has another benefit. They create a contract that the value will not change, which helps future maintainers of the code understand that they should not change the value (indeed, the compiler will not let them).
What is the best/right way of using Instance Variable within a private method (static and non-static)?
Pass as method argument
The value is already stored in the instance. Why pass it? Best case, this is no better than using the instance property/field. Worst case, the JITter will not inline the call and will create a larger stack frame, costing a few CPU cycles. Note: if you are calling a static method, then you must pass the variable, as the static method cannot access the object instance.
Store locally by using the method to get the value - this.getClaimPropertyVertices();
This is what I do in general. Getters/setters are there to provide a meaningful wrapper around fields. In some cases, the getter will initialize the backing field (common pattern in C# when using serializers that do not call the object constructor. Don't get me started on that topic...).
Store locally by getting the value - this.claimPropertyVertices;
No, see above.
Use the instance variable directly in the method
Exactly the same as above. Using this or not using this should generate the exact same code.
UPDATE (based on your edit)
If the value is external to the object instance, and should not meaningfully be stored along with the instance, pass it in as a value to the method call.
If you write your functions with the static keyword whenever you can, there are several obvious benefits.
It's obvious from the signature what inputs affect the function.
You know that the function will have no side effects (unless you are passing by reference). This overlooks non-functional side effects, like changes to the GUI.
The function is not programmatically tied to the class; if you decide that logically its behaviour has a better association with another entity, you can just move it. Then adjust any namespace references.
These benefits make the function easy to understand and simpler to reuse. They will also make it simpler to use the function in a multi-threaded context, since you don't have to worry about contention on ever-spreading side effects.
I will caveat this answer. You should write potentially reusable functions with the static keyword. Simple or obviously non-reusable functionality should just access the private member or getter, if implemented.
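As a rough C# sketch of option 1 from the question (pass the value as an argument to a private static method): the Claim class, the int element type and the ContainsVertex method are assumptions made up for this example; only the field name claimPropertyVertices comes from the question.

```csharp
using System;
using System.Collections.Generic;

public sealed class Claim
{
    // readonly: the reference cannot be reassigned after construction.
    private readonly List<int> claimPropertyVertices;

    public Claim(IEnumerable<int> vertices)
    {
        claimPropertyVertices = new List<int>(vertices);
    }

    // Instance method: hands the field to the static helper explicitly.
    public bool ContainsVertex(int vertex)
    {
        return Contains(claimPropertyVertices, vertex);
    }

    // Static helper: everything it depends on is visible in the signature,
    // and it cannot touch any other instance state.
    private static bool Contains(IReadOnlyList<int> vertices, int vertex)
    {
        for (int i = 0; i < vertices.Count; i++)
        {
            if (vertices[i] == vertex) return true;
        }
        return false;
    }
}
```

Because Contains takes a read-only view and no instance, it can be moved to another class later without any change to its body.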

more advantages or disadvantages to delegate members over classic functions?

using System;

class my_class
{
    public int add_1(int a, int b) { return a + b; }
    public Func<int, int, int> add_2 = (a, b) => { return a + b; };
}
add_1 is a function whereas add_2 is a delegate. However, in this context delegates can fulfil a similar role.
Due to precedent and the design of the language, the default choice for C# methods should be functions.
However, both approaches have pros and cons, so I've produced a list. Are there any more advantages or disadvantages to either approach?
Advantages to conventional methods.
more conventional
outside users of the function see named parameters - for the add_2 syntax arg_n and a type is generally not enough information.
works better with intellisense - ty Minitech
works with reflection - ty Minitech
works with inheritance - ty Eric Lippert
has a "this" - ty CodeInChaos
lower overheads, speed and memory - ty Minitech and CodeInChaos
don't need to think about public/private in respect to both changing and using the function. - ty CodeInChaos
less dynamic, less is permitted that is not known at compile time - ty CodeInChaos
Advantages to "field of delegate type" methods.
more consistent: instead of member functions and data members, it's all just data members.
can outwardly look and behave like a variable.
storing it in a container works well.
multiple classes could use the same function as if it were each ones member function, this would be very generic, concise and have good code reuse.
straightforward to use anywhere, for example as a local function.
presumably works well when passed around with garbage collection.
more dynamic, less must be known at compile time, for example there could be functions that configure the behaviour of objects at run time.
as if encapsulating it's code, can be combined and reworked, msdn.microsoft.com/en-us/library/ms173175%28v=vs.80%29.aspx
outside users of the function see unnamed parameters - sometimes this is helpful, although it would be nice to be able to name them.
can be more compact, in this simple example for example the return could be removed, if there were one parameter the brackets could also be removed.
roll your own behaviours like inheritance - ty Eric Lippert
other considerations such as functional, modular, distributed, (code writing, testing or reasoning about code) etc...
Please don't vote to close; that's happened already and it got reopened. It's a valid question even if you don't think the delegates approach has much practical use given how it conflicts with established coding style, or you don't like the advantages of delegates.
First off, the "high order bit" for me with regards to this design decision would be that I would never do this sort of thing with a public field/method. At the very least I would use a property, and probably not even that.
For private fields, I use this pattern fairly frequently, usually like this:
class C
{
private Func<int, int> ActualFunction = (int y)=>{ ... };
private Func<int, int> Function = ActualFunction.Memoize();
and now I can very easily test the performance characteristics of different memoization strategies without having to change the text of ActualFunction at all.
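The answer does not show Memoize itself. One possible shape for it, offered as an assumption rather than the author's implementation, is an extension method on Func<TArg, TResult> backed by a dictionary (the class name MemoizeExtensions is invented here):

```csharp
using System;
using System.Collections.Generic;

public static class MemoizeExtensions
{
    // Wraps f so that each distinct argument is computed only once.
    // Sketch only: not thread-safe, the cache never shrinks, and a null
    // argument is rejected by the underlying Dictionary.
    public static Func<TArg, TResult> Memoize<TArg, TResult>(this Func<TArg, TResult> f)
    {
        var cache = new Dictionary<TArg, TResult>();
        return arg =>
        {
            TResult result;
            if (!cache.TryGetValue(arg, out result))
            {
                result = f(arg);
                cache.Add(arg, result);
            }
            return result;
        };
    }
}
```

Swapping in a different caching strategy then only means changing which extension method is called on the field, which is the flexibility the answer is describing.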
Another advantage of the "methods are fields of delegate type" strategy is that you can implement code sharing techniques that are different than the ones we've "baked in" to the language. A protected field of delegate type is essentially a virtual method, but more flexible. Derived classes can replace it with whatever they want, and you have emulated a regular virtual method. But you could build custom inheritance mechanisms; if you really like prototype inheritance, for example, you could have a convention that if the field is null, then a method on some prototypical instance is called instead, and so on.
A major disadvantage of the methods-are-fields-of-delegate-type approach is that of course, overloading no longer works. Fields must be unique in name; methods merely must be unique in signature. Also, you don't get generic fields the way that we get generic methods, so method type inference stops working.
The second one, in my opinion, offers absolutely no advantage over the first one. It's much less readable, is probably less efficient (given that Invoke has to be implied) and isn't more concise at all. What's more, if you ever use reflection it won't show up as being a method so if you do that to replace your methods in every class, you might break something that seems like it should work. In Visual Studio, the IntelliSense won't include a description of the method since you can't put XML comments on delegates (at least, not in the same way you would put them on normal methods) and you don't know what they point to anyway, unless it's readonly (but what if the constructor changed it?) and it will show up as a field, not a method, which is confusing.
The only time you should really use lambdas is in methods where closures are required, or when it offers a significant convenience advantage. Otherwise, you're just decreasing readability (basically the readability of my first paragraph versus the current one) and breaking compatibility with previous versions of C#.
Why you should avoid delegates as methods by default, and what are alternatives:
Learning curve
Using delegates this way will surprise a lot of people. Not everyone can wrap their head around delegates, or why you'd want to swap out functions. There seems to be a learning curve. Once you get past it, delegates seem simple.
Perf and reliability
There's a performance loss to invoking delegates in this manner. This is another reason I would default to traditional method declaration unless it enabled something special in my pattern.
There's also an execution safety issue. Public fields are nullable. If you're passed an instance of a class with a public field you'll have to check that it isn't null before using it. This hurts perf and is kind of lame.
You can work around this by changing all public fields to properties (which is a rule in all .Net coding standards anyhow). Then in the setter throw an ArgumentNullException if someone tries to assign null.
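A minimal sketch of that workaround follows. The Calculator class and its Add property are hypothetical names chosen for the example:

```csharp
using System;

public class Calculator
{
    private Func<int, int, int> add = (a, b) => a + b;

    // Property instead of a public field: callers can swap the
    // implementation but can never set it to null.
    public Func<int, int, int> Add
    {
        get { return add; }
        set
        {
            if (value == null) throw new ArgumentNullException("value");
            add = value;
        }
    }
}
```

With this in place, consumers can invoke calc.Add(2, 3) without a null check, because the only way to change the delegate goes through the guarded setter.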
Program design
Even if you can deal with all of this, allowing methods to be mutable at all goes against a lot of the design for static OO and functional programming languages.
In static OO, types are always static, and dynamic behavior is enabled through polymorphism. You can know the exact behavior of a type based on its run time type. This is very helpful in debugging an existing program. Allowing your types to be modified at run time harms this.
In both static OO and functional programming paradigms, limiting and isolating side-effects is quite helpful, and using fully immutable structures is one of the primary ways to do this. The only point of exposing methods as delegates is to create mutable structures, which has the exact opposite effect.
Alternatives
If you really wanted to go so far as to always use delegates to replace methods, you should be using a language like IronPython or something else built on top of the DLR. Those languages will be tooled and tuned for the paradigm you're trying to implement. Users and maintainers of your code won't be surprised.
That being said, there are uses that justify using delegates as a substitute for methods. You shouldn't consider this option unless you have a compelling reason to do so that overrides these performance, confusion, reliability, and design issues. You should only do so if you're getting something in return.
Uses
For private members, Eric Lippert's answer describes a good use: (Memoization).
You can use it to implement a Strategy Pattern in a function-based manner rather than requiring a class hierarchy. Again, I'd use private members for this...
...Example code:
public class Context
{
    private Func<int, int, int> executeStrategy;
    public Context(Func<int, int, int> executeStrategy) {
        this.executeStrategy = executeStrategy;
    }
    public int ExecuteStrategy(int a, int b) {
        return executeStrategy(a, b);
    }
}
I have found a particular case where I think public delegate properties are warranted: To implement a Template Method Pattern with instances instead of derived classes...
...This is particularly useful in automated integration tests where you have a lot of setup/tear down. In such cases it often makes sense to keep state in a class designed to encapsulate the pattern rather than rely on the unit test fixture. This way you can easily support sharing the skeleton of the test suite between fixtures, without relying on (sometimes shoddy) test fixture inheritance. It also might be more amenable to parallelization, depending on the implementation of your tests.
var test = new MyFancyUITest
{
// I usually name these things in a more test specific manner...
Setup = () => { /* ... */ },
TearDown = () => { /* ... */ },
};
test.Execute();
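The answer does not show the class behind this snippet. A minimal sketch of what it could look like is below; the Body property and the try/finally ordering are assumptions added for illustration, and only Setup, TearDown and Execute appear in the original:

```csharp
using System;

public class MyFancyUITest
{
    // Replaceable steps; the defaults do nothing, so none of them is ever null
    // unless a caller explicitly assigns null.
    public Action Setup { get; set; } = () => { };
    public Action Body { get; set; } = () => { };      // hypothetical extra step
    public Action TearDown { get; set; } = () => { };

    // The fixed skeleton of the template method:
    // TearDown runs even if the body throws.
    public void Execute()
    {
        Setup();
        try { Body(); }
        finally { TearDown(); }
    }
}
```

The skeleton lives in one place (Execute), while each test instance supplies its own steps through the object initializer, with no fixture inheritance involved.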
Intellisense Support
outside users of the function see unnamed parameters - sometimes this is helpful, although it would be nice to be able to name them.
Use a named delegate - I believe this will get you at least some Intellisense for the parameters (probably just the names, less likely XML docs - please correct me if I'm wrong):
public class MyClass
{
    public delegate int DoSomethingImpl(int foo, int bizBar);
    public DoSomethingImpl DoSomething = (x, y) => { return x + y; };
}
I'd avoid delegate properties/fields as method replacements for public methods. For private methods it's a tool, but not one I use very often.
instance delegate fields have a per instance memory cost. Probably a premature optimization for most classes, but still something to keep in mind.
Your code uses a public mutable field, which can be changed at any time. That hurts encapsulation.
If you use the field initializer syntax, you can't access this. So field initializer syntax is mainly useful for static methods.
Makes static analysis much harder, since the implementation of that method isn't known at compile-time.
There are some cases where delegate properties/fields might be useful:
Handlers of some sort. Especially if multi-casting (and thus the event subscription pattern) doesn't make much sense
Assigning something that can't be easily described by a simple method body. Such as a memoized function.
The delegate is runtime generated or at least its value is only decided at runtime
Using a closure over local variables is an alternative to using a method and private fields. I strongly dislike classes with lots of fields, especially if some of these fields are only used by two methods or less. In these situations, using a delegate in a field can be preferable to conventional methods
class MyClassConventional {
    int? someValue; // When Mark() is called, remember the value so that we can do something with it in Process(). Not used in any other method.
    int X;
    void Mark() {
        someValue = X;
    }
    void Process() {
        // Do something with someValue.Value
    }
}
class MyClassClosure {
    int X;
    Action Process = null;
    void Mark() {
        int someValue = X;
        Process = () => { /* Do something with someValue */ };
    }
}
This question presents a false dichotomy - between functions, and a delegate with an equivalent signature. The main difference is that one of the two you should only use if there are no other choices. Use this in your day to day work, and it will be thrown out of any code review.
The benefits that have been mentioned are far outweighed by the fact that there is almost never a reason to write code that is so obscure; especially when this code makes it look like you don't know how to program C#.
I urge anyone reading this to ignore any of the benefits which have been stated, since they are all overwhelmed by the fact that this is the kind of code that demonstrates that you do not know how to program in C#.
The only exception to that rule is if you have a need for one of the benefits, and that need can't be satisfied in any other way. In that case, you'll need to write more comment than code to explain why you have a good reason to do it. Be prepared to answer as clearly as Eric Lippert did. You'd better be able to explain as well as Eric does that you can't accomplish your requirements and write understandable code at the same time.

What happens when you create an instance of an object containing no state in C#?

I am I think ok at algorithmic programming, if that is the right term? I used to play with turbo pascal and 8086 assembly language back in the 1980s as a hobby. But only very small projects and I haven't really done any programming in the 20ish years since then. So I am struggling for understanding like a drowning swimmer.
So maybe this is a very naive question or I'm just making no sense at all, but say I have an object kind of like this:
class Something : IDoer
{
void Do(ISomethingElse x)
{
x.DoWhatEverYouWant(42);
}
}
And then I do
var Thing1 = new Something();
var Thing2 = new Something();
Thing1.Do(blah);
Thing2.Do(blah);
Does Thing1 = Thing2? Does "new Something()" create anything? Or is it not much different from having a static class, except I can pass it around and swap it out etc.?
Is the "Do" procedure in the same location in memory for both the Thing1(blah) and Thing2(blah) objects? I mean when executing it, does it mean there are two Something.Do procedures or just one?
They are two separate objects; they just don't have state.
Consider this code:
var obj1 = new object();
var obj2 = new object();
Console.WriteLine(object.ReferenceEquals(obj1, obj2));
It will output False.
Just because an object has no state doesn't mean it doesn't get allocated just like any other object. It just takes very little space (just like an object).
In response to the last part of your question: there is only one Do method. Methods are not stored per instance but rather per class. If you think about it, it would be extremely wasteful to store them per instance. Every method call to Do on a Something object is really the same set of instructions; all that differs between calls from different objects is the state of the underlying object (if the Something class had any state to begin with, that is).
What this means is that instance methods on class objects are really behaviorally the same as static methods.
You might think of it as if all instance-level methods were secretly translated as follows (I'm not saying this is strictly true, just that you could think of it this way and it does kind of make sense):
// appears to be instance-specific, so you might think
// it would be stored for every instance
public void Do() {
    Do(this);
}
// is clearly static, so it is much clearer it only needs
// to be stored in one place
private static void Do(Something instance) {
    // do whatever Do does
}
Interesting side note: the above hypothetical "translation" explains pretty much exactly how extension methods work: they are static methods, but by qualifying their first parameter with the this keyword, they suddenly look like instance methods.
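A small sketch of that side note, using made-up names (Greeter, GreeterExtensions, Greet) purely for illustration:

```csharp
using System;

public class Greeter
{
    public string Name = "world";
}

public static class GreeterExtensions
{
    // A static method whose first parameter is marked with 'this'.
    // The compiler lets callers write it as if it were an instance method.
    public static string Greet(this Greeter instance)
    {
        return "hello " + instance.Name;
    }
}
```

Both new Greeter().Greet() and GreeterExtensions.Greet(new Greeter()) compile to the same static call; the instance-style syntax is only a convenience.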
There are most definitely two different objects in memory. Each object will consume 8 bytes on the heap (at least on 32-bit systems); 4 for the syncblock and 4 for the type handle (which includes the method table). Other than the system-defined state data there is no other user-defined state data in your case.
There is a single instance of the code for the Something.Do method. The type handle pointer that each object holds is how the CLR locates the different methods for the class. So even though there are two different objects in memory they both execute the same code. Since Something.Do was declared as an instance method it will have a this pointer passed to it internally so that the code can modify the correct instance members depending on which object was invoking the method. In your case the Something class has no instance members (and thus no user-defined state) and so this is quite irrelevant, but still happens nevertheless.
No they are not the same. They are two separate instances of the class Something. They happen to be identically instantiated, that is all.
You would create 2 "empty" objects, there would be a small allocation on the heap for each object.
But the "Do" method is always in the same place, that has nothing to do with the absence of state. Code is not stored 'in' a class/object. There is only 1 piece of code corresponding to Do() and it has a 'hidden' parameter this that points to the instance of Something it was called on.
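One way to observe this from C# itself, offered as an illustrative sketch (SharedCodeDemo is an invented name, and Do is simplified to take no arguments): a delegate created from an instance method records both the method and the receiver, so two delegates from two objects share the method but differ in the target.

```csharp
using System;

public class Something
{
    public void Do() { }
}

public static class SharedCodeDemo
{
    // True when both delegates refer to the same method
    // but are bound to different objects.
    public static bool SameCodeDifferentObjects(Something a, Something b)
    {
        Action doA = a.Do;
        Action doB = b.Do;
        return doA.Method.Equals(doB.Method)             // one Something.Do
            && !ReferenceEquals(doA.Target, doB.Target); // two receivers
    }
}
```

Passing two separately constructed Something objects yields true; passing the same object twice yields false, because the targets are then identical.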
Conceptually, Thing1 and Thing2 are different objects, but there is only one Something.Do procedure.
The .Net runtime allocates a little bit of memory to each of the objects you create - one chunk to Thing1 and another to Thing2. The purpose of this chunk of memory is to store (1) the state of the object and (2) the address of any procedures that belong to the object. I know you don't have any state, but the runtime doesn't care - it still keeps two separate references to two separate chunks of memory.
Now, your "Do" method is the same for both Thing1 and Thing2, so the runtime only keeps one version of the procedure in memory.
The memory allocated to Thing1 includes the address of the Do method. When you invoke the Do method on Thing1, the runtime looks up the address of the Do method for Thing1 and runs the method. The same thing happens with the other object, Thing2. Although the objects are different, the same Do method is called for both Thing1 and Thing2.
What this boils down to is that Thing1 and Thing2 are different, in that the names "Thing1" and "Thing2" refer to different areas of memory. The contents of this memory are the same in both cases - a single address that points to the "Do" method.
Well, that's the theory, anyway. Under the hood, there might be some kind of optimisation going on (See http://www.wrox.com/WileyCDA/Section/CLR-Method-Call-Internals.id-291453.html if you're interested), but for most practical purposes, what I have said is the way things work.
Thing1 != Thing2
These are two different objects in memory.
The Do method code is in the same place for both objects. There is no need to store two different copies of the method.
Each reference type (Thing1, Thing2) is pointing to a different physical address in main memory, as they have been instantiated separately. The thing pointed to in memory is the bytes used by the object, whether it has a state or not (it always has a state, but whether it has a declared/initialised state).
If you assigned a reference type to another reference type (Thing2 = Thing1;) then it would be the same portion of memory used by two different reference types, and no new instantiation would take place.
A good way to think of new Constructor() is that you are really just calling the method inside your class whose sole responsibility is to produce a new instance of an object that is cookie-cut from your class.
So now you can have multiple instances of the same class running around at runtime handling all sorts of situations :D
As far as the CLR is concerned, you are in fact getting 2 separate instances in memory, each with pointers to it. It is very similar to any other OOP language, but we do not have to actually interact with the pointers; references are used with the same syntax as non-reference types, so we don't have to worry about them!
(there are pointers in C# if you wish to whip out your [unsafe] keyword!)

Variable number of arguments without boxing the value-types?

public void DoSomething(params object[] args)
{
// ...
}
The problem with the above signature is that every value-type that will be passed to that method will be boxed implicitly, and this is serious performance issue for me.
Is there a way to declare a method that accepts a variable number of arguments without boxing the value-types?
Thanks.
You can use generics:
public void DoSomething<T>(params T[] args)
{
}
However, this will only allow a single type of ValueType to be specified. If you need to mix or match value types, you'll have to allow boxing to occur, as you're doing now, or provide specific overloads for different numbers of parameters.
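To illustrate the difference, here is a small sketch (ParamsDemo and its two methods are hypothetical names). With the generic signature the compiler builds a T[] for the inferred T, whereas the object[] version forces each value type into a box:

```csharp
using System;

public static class ParamsDemo
{
    // T is inferred from the arguments; the array is a T[], not an object[].
    public static Type ElementType<T>(params T[] args)
    {
        return args.GetType().GetElementType();
    }

    // Non-generic version for contrast: every int passed here is boxed
    // so that it can be stored in an object[].
    public static Type BoxedElementType(params object[] args)
    {
        return args.GetType().GetElementType();
    }
}
```

ParamsDemo.ElementType(1, 2, 3) reports an element type of int, while ParamsDemo.BoxedElementType(1, 2, 3) reports object.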
Edit: If you need more than one type of parameter, you can use overloads to accomplish this, to some degree.
public void DoSomething<T,U>(T arg1, params U[] args) {}
public void DoSomething<T,U>(T arg1, T arg2, params U[] args) {}
Unfortunately, this requires multiple overloads to exist for your types.
Alternatively, you could pass in arrays directly:
public void DoSomething<T,U>(T[] args1, U[] args2) {}
You lose the nice compiler syntax, but then you can have any number of both parameters passed.
Not presently, no, and I haven't seen anything addressing the issue in the .NET 4 info that's been released.
If it's a huge performance problem for you, you might consider several overloads of commonly seen parameter lists.
I wonder, though: is it really a performance problem, or are you prematurely optimizing?
Let's assume the code calling this method knows the argument types. If so, you can pack them into an appropriate Tuple type from .NET 4 and pass the instance (Tuple is a reference type) to the method as object (since there is no common base class for all the Tuples).
The main problem here is that it isn't easy to process the arguments inside the method without boxing/unboxing, and likely not even without reflection. Think about what must be done to extract, say, the Nth argument without boxing. You'll end up realising that you must deal either with dictionary lookups (involving either a regular Dictionary<K,V> or the internal dictionaries used by the CLR) or with boxing. Obviously, dictionary lookups are much more costly.
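To illustrate the Tuple idea and its limitation, here is a small sketch with a hypothetical Sum method (the name and the two supported shapes are assumptions for the example). The items sit in typed fields of the tuple, so they are not boxed individually, but the receiver has to know every tuple shape it supports in advance:

```csharp
using System;

public static class TupleArgsDemo
{
    // Hypothetical receiver: takes the packed arguments as a plain object
    // and handles only the tuple shapes it has been written to recognise.
    public static double Sum(object packed)
    {
        var pair = packed as Tuple<int, double>;
        if (pair != null)
        {
            return pair.Item1 + pair.Item2;
        }

        var triple = packed as Tuple<int, int, int>;
        if (triple != null)
        {
            return triple.Item1 + triple.Item2 + triple.Item3;
        }

        throw new ArgumentException("Unsupported tuple shape.", "packed");
    }

    public static void Main()
    {
        double a = Sum(Tuple.Create(1, 2.5));   // Tuple<int, double>
        double b = Sum(Tuple.Create(1, 2, 3));  // Tuple<int, int, int>
        Console.WriteLine(a);
        Console.WriteLine(b);
    }
}
```

The tuple itself is still a heap allocation, and a fully general "give me the Nth argument" accessor is exactly where the boxing or lookup cost described above comes back.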
I'm writing this because we actually developed a solution for a very similar problem: we must be able to operate on our own Tuples without boxing, mainly to compare and deserialize them (Tuples are used by the database engine we develop, so the performance of any basic operation is essential in our case).
But:
We ended up with a pretty complex solution. Take a look at, e.g., TupleComparer.
The effect of avoiding boxing is actually not as good as we expected: each boxing/unboxing operation is replaced by a single array indexing operation and a few virtual method calls, and the cost of both approaches is almost identical.
The only benefit of the approach we developed is that we don't flood Gen0 with garbage, so Gen0 collections happen much more rarely. Since the cost of a Gen0 collection is proportional to the space occupied by live objects and to their count, this brings a noticeable advantage when other allocations intermix with (or simply happen during) the execution of the algorithm we are trying to optimize this way.
Results: after this optimization our synthetic tests showed a performance increase of anywhere from 0% to 200-300%; on the other hand, a simple performance test of the database engine itself showed a much less impressive improvement (about 5-10%). A lot of time was being spent in the layers above (there is a pretty complex ORM as well), but most likely that's what you'll really see after implementing something similar.
In short, I advise you to focus on something else. If it becomes fully clear that this is a major performance problem in your application and there is no other good way of resolving it, well, go ahead. Otherwise you're simply stealing from your customer or from yourself by doing premature optimization.
For a completely generic implementation, the common workaround is to use a fluent pattern. Something like this:
public class ClassThatDoes
{
    public ClassThatDoes DoSomething<T>(T arg) where T : struct
    {
        // process
        return this;
    }
}
Now you call:
classThatDoes.DoSomething(1).DoSomething(1m).DoSomething(DateTime.Now)//and so on
However that doesn't work with static classes (extension methods are ok since you can return this).
Your question is basically the same as this: Can I have a variable number of generic parameters? asked in a different way.
Or accept an array of items with params keyword:
public ClassThatDoes DoSomething<T>(params T[] arg) where T : struct
{
    // process
    return this;
}
and call:
classThatDoes.DoSomething(1, 2, 3)
.DoSomething(1m, 2m, 3m)
.DoSomething(DateTime.Now) //etc
Whether the array creation overhead is less than the boxing overhead is something you will have to decide yourself.
In C# 4.0 you can use named and optional parameters! More info on this blog post

Overhead of using this on structs

When a struct has automatic properties, the C# compiler asks you to chain every constructor you write to the parameterless this() constructor, to make sure everything is initialized before you access the properties.
If you don't use automatic properties, but simply declare the fields yourself, you can avoid chaining to this().
What's the overhead of using this on constructors in structs? Is it the same as double initializing the values?
Would you recommend not using it, if performance was a top concern for this particular type?
I would recommend not using automatic properties at all for structs, as it means they'll be mutable - if only privately.
Use readonly fields, and public properties to provide access to them where appropriate. Mutable structures are almost always a bad idea, and have all kinds of nasty little niggles.
Do you definitely need to create your own value type in the first place though? In my experience it's very rare to find a good reason to create a struct rather than a class. It may be that you've got one, but it's worth checking.
Back to your original question: if you care about performance, measure it. Always. In this case it's really easy - you can write the struct using an automatic property and then reimplement it without. You could use a #if block to keep both options available. Then you can measure typical situations and see whether the difference is significant. Of course, I think the design implications are likely to be more important anyway :)
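One way the #if approach could look, as a sketch only: the Point struct and the USE_AUTO_PROPERTY symbol are hypothetical names chosen for this example. Compiling with the symbol defined (for instance via the compiler's define option) selects the automatic-property version; otherwise the readonly-field version is used.

```csharp
using System;
using System.Diagnostics;

public struct Point
{
#if USE_AUTO_PROPERTY
    // Variant A: automatic properties, constructor chained to this().
    public int X { get; private set; }
    public int Y { get; private set; }

    public Point(int x, int y) : this()
    {
        X = x;
        Y = y;
    }
#else
    // Variant B: readonly fields with explicit read-only properties.
    private readonly int x;
    private readonly int y;

    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    public int X { get { return x; } }
    public int Y { get { return y; } }
#endif
}

public static class StructBenchmark
{
    public static void Main()
    {
        const int iterations = 100000000;
        long checksum = 0;

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            Point p = new Point(i, i + 1);
            checksum += p.X + p.Y; // use the values so the work is not discarded
        }
        watch.Stop();

        Console.WriteLine("{0} ms (checksum {1})", watch.ElapsedMilliseconds, checksum);
    }
}
```

Build it once per variant in release mode and compare the timings; a loop this simple is only a rough guide, so measuring the struct in its typical real usage remains the more meaningful test.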
Yes, the values will be initialized twice, and without profiling it is difficult to say whether this performance hit would be significant.
The default constructor of a struct initializes all members to their default values. After this happens your constructor will run in which you undoubtedly set the values of those properties again.
I would imagine this would be no different than the CLR's practice of initializing all fields of a reference type upon instantiation.
The reason the C# compiler requires you to chain to the default constructor (i.e. append : this() to your constructor declaration) when auto-implemented properties are used is that all variables need to be assigned before exiting the constructor. Now, auto-implemented properties mess this up a bit in that they don't allow you to directly access the variables that back the properties. The method the compiler uses to get around this is to automatically assign all the variables to their default values, and to ensure this, you must chain to the default constructor. It's not a particularly clever method, but it does the job well enough.
So indeed, this will mean that some variables will end up getting initialised twice. However, I don't think this will be a big performance problem. I would be very surprised if the compiler (or at the very least the JIT) didn't simply remove the first initialisation statement for any variable that is set twice in your constructor. A quick benchmark should confirm this for you, though I'm quite sure you will get the suspected results. (If you by chance don't, and you absolutely need the tiny performance boost that avoidance of duplicate initialisation offers, you can just define your properties the normal way, i.e. with backing variables.)
To be honest, my advice would be not even to bother with auto-implemented properties in structures. It's perfectly acceptable just to use public variables in lieu of them, and they offer no less functionality than auto-implemented properties. Classes are a different situation of course, but I really wouldn't hesitate to use public variables in structs. (Any complex properties can be defined normally, if you need them.)
Hope that helps.
Don't use automatic properties with structure types. Simply expose fields directly. If a struct has an exposed public field Foo of type Bar, the fact that Foo is an exposed field of type Bar (information readily available from Intellisense) tells one pretty much everything there is to know about it. By contrast, the fact that a struct Foo has an exposed read-write property of Boz does not say anything about whether writing to Boz will mutate a field in the struct, or whether it will mutate some object to which Boz holds a reference. Exposing fields directly will offer cleaner semantics, and often also result in faster-running code.
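A small sketch of what exposed fields look like in practice, using a hypothetical Size struct (the name and fields are illustrative only). Writing to a field visibly changes that copy of the struct and nothing else:

```csharp
using System;

// Hypothetical struct that exposes its state as plain public fields.
public struct Size
{
    public int Width;
    public int Height;
}

public static class ExposedFieldDemo
{
    public static void Main()
    {
        Size a = new Size { Width = 1, Height = 2 };

        Size b = a;     // value copy: both fields are duplicated
        b.Width = 10;   // mutates only b's own Width field

        Console.WriteLine(a.Width); // 1
        Console.WriteLine(b.Width); // 10
    }
}
```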
