C# Property Access Optimization

In C# (or VB .NET), does the compiler make attempts to optimize property accesses? For example:
public ViewClass View
{
    get
    {
        // ... something is computed here ...
    }
}
if (View != null)
    View.Something = SomethingElse;
I would imagine that if the compiler could somehow detect that View remains constant between the two accesses, it could refrain from computing the value twice. Are these kinds of optimizations performed?
I understand that if View has some intensive computations, it should probably be refactored into a function (GetView()). In my particular case, View involves climbing the visual tree looking for an element of a particular type.
Related: Any references on the workings of the (Microsoft) C# compiler?

Not in general, no. As Steven mentioned, there are numerous factors to consider regarding multithreading. If you truly are computing something that might change, you're correct that it should be refactored away from a property. If it won't change, you should lazy-load it: check whether the private member is null, calculate it if so, then return the value.
If it won't change and depends on a parameter, you can use a Dictionary or Hashtable as a cache - given the parameter (key) you will store the value. You could have each entry as a WeakReference to the value too, so when the value isn't referenced anywhere and garbage collection happens, the memory will be freed.
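As a rough sketch of the cache idea above, assuming the computed value is a reference type and that the cache is only used from one thread: ValueCache and GetOrCompute are hypothetical names, not part of the .NET libraries. Each entry holds only a WeakReference<TValue>, so a value that nothing else references can be collected and is simply recomputed on the next request.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not thread-safe as written.
public sealed class ValueCache<TKey, TValue> where TValue : class
{
    private readonly Dictionary<TKey, WeakReference<TValue>> _entries =
        new Dictionary<TKey, WeakReference<TValue>>();

    // Returns the cached value for the key, recomputing it if it was never
    // computed or if the garbage collector has already reclaimed it.
    public TValue GetOrCompute(TKey key, Func<TKey, TValue> compute)
    {
        WeakReference<TValue> weak;
        TValue cached;
        if (_entries.TryGetValue(key, out weak) && weak.TryGetTarget(out cached))
        {
            return cached;
        }

        TValue value = compute(key);
        _entries[key] = new WeakReference<TValue>(value);
        return value;
    }
}
```

Dead entries are never removed from the dictionary in this sketch, so a long-lived cache with many distinct keys would also need some pruning.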
Hope that helps.

The question is very unclear; it isn't obvious to me how the getter and the snippet below it are related. But yes, property accessors are normally heavily optimized, not by the C# compiler but by the JIT compiler. For one, they are often inlined so you don't pay the cost of a method call.
That will only happen if the getter doesn't contain too much code and doesn't monkey with locks and exception handling. You can help the JIT compiler to optimize the common case with code like this:
get
{
    if (_something == null) {
        _something = createSomething();
    }
    return _something;
}
This will inline the common case and allow the creation method to remain un-inlined. This typically gets compiled to three machine code instructions in the Release build (load + test + jump), about a nanosecond of execution time. It is a micro-optimization; seeing an actual perf improvement would be quite rare.
Do note that the given sample code is not thread-safe. Always write correct code rather than fast code first.
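One way to keep the same shape while making it safe across threads is LazyInitializer.EnsureInitialized from System.Threading. This is only a sketch: ViewHolder and CreateSomething are hypothetical stand-ins for the getter's owner and the creation method. If several threads race, the factory may run more than once, but only one result is stored and returned to everyone.

```csharp
using System;
using System.Threading;

// Hypothetical owner type for illustration.
public class ViewHolder
{
    private object _something;

    public object Something
    {
        // Runs the factory only while _something is still null and
        // publishes a single instance even when threads race.
        get { return LazyInitializer.EnsureInitialized(ref _something, CreateSomething); }
    }

    // Stand-in for the expensive creation method.
    private static object CreateSomething()
    {
        return new object();
    }
}
```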

No, which is why you should use Lazy<T> to implement a just-in-time (lazy) calculation.
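A minimal sketch of that suggestion, assuming the value never needs to be recomputed once it has been calculated: ExpensiveHolder, ComputeView and ComputeCount are hypothetical names used only to make the single evaluation visible.

```csharp
using System;

// Hypothetical type for illustration.
public class ExpensiveHolder
{
    private readonly Lazy<string> _view;

    public ExpensiveHolder()
    {
        // The factory runs on the first access to Value, not here.
        _view = new Lazy<string>(ComputeView);
    }

    // Counts how often the expensive computation actually ran.
    public int ComputeCount { get; private set; }

    public string View
    {
        get { return _view.Value; }
    }

    private string ComputeView()
    {
        ComputeCount++;
        return "computed view";
    }
}
```

Reading View twice leaves ComputeCount at 1. Lazy<T> never refreshes its value, so this does not fit a case where the result can change between accesses.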

From my understanding there is no implicit caching: you have to cache the value of a given property yourself the first time it is calculated.
For example:
private readonly object mCacheLock = new object();
private object mCachedValue = null;
public object MyProperty
{
    get
    {
        if (mCachedValue == null)
        {
            // Lock on a dedicated object: lock(mCachedValue) would throw while the field is still null.
            lock (mCacheLock)
            {
                // After acquiring the lock, check that the value has not been initialized in the meantime, so it is only calculated once.
                if (mCachedValue == null)
                {
                    // Calculate the value the first time and assign it to mCachedValue.
                }
            }
        }
        return mCachedValue;
    }
}

Related

Is it best practice to create a variable if accessing a property of an object more than once in a routine?

When I first began as a junior C# dev, I was always told during code reviews that if I was accessing an object's property more than once in a given scope, I should create a local variable within the routine, as it was cheaper than retrieving it from the object each time. I never really questioned it, as it came from people I perceived to be quite knowledgeable at the time.
Below is a rudimentary example.
Example 1: storing an object's identifier in a local variable
public void DoWork(MyDataType obj)
{
    long id = obj.Id;
    if (ObjectLookup.TryAdd(id, obj))
    {
        DoSomeOtherWork(id);
    }
}
Example 2: retrieving the identifier from the Id property of the object any time it is needed
public void DoWork(MyDataType obj)
{
    if (ObjectLookup.TryAdd(obj.Id, obj))
    {
        DoSomeOtherWork(obj.Id);
    }
}
Does it actually matter or was it more a preference of coding style where I was working? Or perhaps a situational design time choice for the developer to make?

As explained in this answer, if the property is a basic getter/setter then the CLR "will inline the property access and generate code that’s as efficient as accessing a field directly". However, if your property does some calculation every time it is accessed, for example, then storing its value in a local variable avoids the overhead of repeating that calculation.
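To make the second case concrete, here is a small sketch with a property that recalculates on every access. Order, Report and the TotalEvaluations counter are hypothetical names introduced only to show how often the getter runs.

```csharp
// Hypothetical types for illustration.
public class Order
{
    private readonly decimal[] _lines = { 10m, 20m, 12.5m };

    // Counts how many times the Total getter has run.
    public int TotalEvaluations { get; private set; }

    // Recomputed on every access.
    public decimal Total
    {
        get
        {
            TotalEvaluations++;
            decimal sum = 0m;
            foreach (decimal line in _lines)
            {
                sum += line;
            }
            return sum;
        }
    }
}

public static class Report
{
    public static string Describe(Order order)
    {
        // The getter runs once; the local is reused for the test and the text.
        decimal total = order.Total;
        return total > 40m ? "large: " + total : "small: " + total;
    }
}
```

Writing order.Total in both the comparison and the string concatenation would run the loop twice for the same result.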

All the memory allocation stuff aside, there is the principle of DRY (don't repeat yourself). When you can deal with one variable with a short name rather than repeating the object nesting to access the external property, why not do that?
Apart from that, by creating that local variable you are respecting the single responsibility principle by isolating the methods from the external entity they don't need to know about.
And lastly, if the so-called reusing leads to unwanted instantiation of reference types or any repetitive calculation, then you must create the local variable and reuse it throughout the class or method.
Any way you look at it, this practice helps make the code more readable and maintainable, and possibly safer too.

I don't know if it is faster or not (though I would say that the difference is negligible and thus unimportant), but I'll cook up some benchmark for you.
What IS important, though, will be made evident with an example:
public class MyDataType
{
    public int id
    {
        get
        {
            // Some actual code
            return this.GetHashCode() * 2;
        }
    }
}
Does this make more sense? The first time I access the id getter, some code will be executed. The second time, the same code will be executed again, costing twice as much for no reason.
It is very probable that the reviewers had some such case in mind and, instead of going into every single property to check what it does and whether it is safe to access repeatedly, they created a new rule.
Another reason to store the value would be usability.
Imagine the following example:
object.subObject.someOtherSubObject.id
In this case I ask in reviews to store it in a variable even if it is used just once. That is because if this is used in a complicated if statement, it will reduce the readability and maintainability of the code in the future.

A local variable is essentially guaranteed to be fast, whereas there is an unknown amount of overhead involved in accessing the property.
It's almost always a good idea to avoid repeating code whenever possible. Storing the value once means that there is only one thing to change if it needs changing, rather than two or more.
Using a variable allows you to provide a name, which gives you an opportunity to describe your intent.
I would also point out that if you're referring to other members of an object a lot in one place, that can often be a strong indication that the code you're writing actually belongs in that other type instead.

You should consider that repeatedly getting a value that is calculated by an I/O-bound or CPU-bound process can be wasteful. Therefore, it's better to define a variable and store the result, to avoid repeating the same processing.
In a case like the object's Id property, copying the value into a local variable guarantees that the value you work with will not change within the scope. (A const local would not work here, because const requires a compile-time constant.)
Finally, it's generally better to use a local variable within your methods.

C# create List<T> in initialization vs get

I was wondering what is the best method to create a list in a certain object.
1) DefA "always" occupies memory beforehand even if it is never called, right?
2) DefB will "always" have to check for the null condition, or does the compiler optimize this?
3) Is there a better way to implement this?
Thanks
private List<A> _defA = new List<A>();
public List<A> DefA
{
    get { return _defA; }
}
private List<B> _defB;
public List<B> DefB
{
    get
    {
        if (_defB == null)
            _defB = new List<B>();
        return _defB;
    }
}

Because I think neither option will affect the performance of your application, my suggestion is to choose the one that keeps the code cleaner.
Use the Lazy<T> type (see Lazy<T> on MSDN).
From MSDN about lazy initialization:
By default, Lazy<T> objects are thread-safe. That is, if the
constructor does not specify the kind of thread safety, the Lazy<T>
objects it creates are thread-safe. In multi-threaded scenarios, the
first thread to access the Value property of a thread-safe Lazy<T>
object initializes it for all subsequent accesses on all threads, and
all threads share the same data. Therefore, it does not matter which
thread initializes the object, and race conditions are benign.
So in your case:
private Lazy<List<A>> _defA = new Lazy<List<A>>(() => new List<A>());
public List<A> DefA
{
    get
    {
        return _defA.Value;
    }
}
In addition, this approach communicates your intent to other developers who may work with your code.

In this specific example, the delayed (lazy) instantiation might save a few milliseconds on startup; but at the risk of issues in a multi-threaded scenario.
Say two threads call DefB (Get) almost simultaneously - they might end up setting _defB twice, instead of the once that you intend.
_defA will always take the memory of an empty list, as I understand it, yes - so you'll save some memory the second way if it's not called - but it does make the code MUCH harder to understand. Also, what if a local piece of code doesn't call the accessor method, but just does _defB.Add() or whatever? (which might not be deliberate now, but because it's more complex it's easy to forget/miss in the future)

First of all, don't optimize something that doesn't need optimizing.
If you're creating thousands or millions of the object that contains that property, and this property is seldom used and thus seldom needed, then yes, adding lazy on-demand initialization is probably a good idea. I say probably because there may be other performance-related issues as well.
However, to answer your specific questions, other than "what is the best way":
1) The initialization of _defA will construct a List<A> object even if the property is never used, that is correct.
2) The getter method of DefB will always do the null check, that is also correct. The compiler cannot optimize this away.
3) As for "better way"? That part of the question falls into the "primarily opinion-based" close option here on Stack Overflow. It depends largely on what you determine is better:
More expressive syntax (shorter code)
Less memory spent (option B)
Less code in the getter (option A)
I can give you an alternative to the syntax in option A:
public List<A> DefA
{
    get;
} = new List<A>();
This syntax is available in Visual Studio 2015 with C# 6 (even if you compile for older .NET runtime versions) and is called an auto-property initializer.
The compiler will automagically create the backing field for you (the _defA equivalent) and mark it read-only, so feature-wise this is 100% identical to option A, it's just a different syntax.
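For option B there is a similarly compact spelling, assuming a compiler that supports C# 8.0 or later (the ??= operator is newer than the C# 6 feature above). Container and IsDefBCreated are hypothetical names, and string stands in for the element type B.

```csharp
using System.Collections.Generic;

// Hypothetical type for illustration.
public class Container
{
    private List<string> _defB;

    // Creates the list on first access; later reads return the same instance.
    // Like the hand-written null check, this is not thread-safe.
    public List<string> DefB => _defB ??= new List<string>();

    // Lets callers observe whether the list has been allocated yet.
    public bool IsDefBCreated => _defB != null;
}
```

The null check still happens on every access; only the syntax is shorter.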

Compiler optimization of properties that remain static for the duration of a loop

I was reading Improving .NET Application Performance and Scalability. The section titled Avoid Repetitive Field or Property Access contains a guideline:
If you use data that is static for the duration of the loop, obtain it
before the loop instead of repeatedly accessing a field or property.
The following code is given as an example of this:
for (int item = 0; item < Customer.Orders.Count; item++)
{
    CalculateTax(Customer.State, Customer.Zip, Customer.Orders[item]);
}
becomes
string state = Customer.State;
string zip = Customer.Zip;
int count = Customer.Orders.Count;
for (int item = 0; item < count; item++)
{
    CalculateTax(state, zip, Customer.Orders[item]);
}
The article states:
Note that if these are fields, it may be possible for the compiler to
do this optimization automatically. If they are properties, it is much
less likely. If the properties are virtual, it cannot be done
automatically.
Why is it "much less likely" for properties to be optimized by the compiler in this manner, and when can one expect for a particular property to be or not to be optimized? I would assume that properties where additional operations are performed in the accessors are harder for the compiler to optimize, and that those that only modify a backing field are more likely to be optimized, but would like some more concrete rules. Are auto-implemented properties always optimized?

It requires the jitter to apply two optimizations:
First the property getter method must be inlined so it turns into the equivalent of a field access. That tends to work when the getter is small and does not throw exceptions. This is necessary so the optimizer can be sure that the getter does not rely on state that can be affected by other code.
Note how the hand-optimized code would be wrong if, say, the Customer.Orders[] indexer would alter the Customer.State property. Lazy code like this is pretty unlikely of course but it's not like this has never been done :) The optimizer has to be sure.
Secondly, the field access code has to be hoisted out of the loop body. An optimization called "invariant code motion". Works on simple property getter code when the jitter can prove that the statements inside the loop body don't affect the value.
The jitter optimizer implements it but it is not stellar at it. In this particular case it is pretty likely that it will give up when it cannot inline the CalculateTax() method. A native compiler optimizes it much more aggressively, it can afford to burn the memory and analysis time on it. The jitter optimizer must meet a pretty hard deadline to avoid pauses.
Do keep the constraints of the optimizer in mind when you do this yourself. It is a pretty darn ugly bug, of course, if these methods do have side effects that you did not count on. And only do this when the profiler has told you that this code is on the hot path, the typical ~10% of your code that actually affects the execution time. Low odds here: the dbase query to get customer/order data is going to be orders of magnitude more expensive than calculating tax. Luckily, code transforms like this also tend to make code more readable, so you usually get it for free. YMMV.
A backgrounder on jitter optimizations is here.

Why is it "much less likely" for properties to be optimized by the compiler in this manner, and when can one expect for a particular property to be or not to be optimized?
Properties are not always just wrappers for a field. If there is any degree of logic in a property, it becomes significantly more difficult for a compiler to prove that it is correct to re-use the value it first got when the loop began.
As an extreme example, consider
private Random rnd = new Random();
public int MyProperty
{
    get { return rnd.Next(); }
}
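The same point can be shown deterministically with a counter instead of Random. In this sketch, Sequence and HoistingDemo are hypothetical names; the two loops differ only in whether the property read is hoisted, and they give different results because the getter has a side effect.

```csharp
// Hypothetical types for illustration.
public class Sequence
{
    private int _next;

    // Getter with a side effect: every read returns a different value.
    public int Next
    {
        get { return _next++; }
    }
}

public static class HoistingDemo
{
    // Reads the property on each iteration.
    public static int SumReadingEachTime(Sequence s, int count)
    {
        int sum = 0;
        for (int i = 0; i < count; i++)
        {
            sum += s.Next;
        }
        return sum;
    }

    // Reads the property once before the loop, as an optimizer would if it hoisted the access.
    public static int SumHoisted(Sequence s, int count)
    {
        int value = s.Next;
        int sum = 0;
        for (int i = 0; i < count; i++)
        {
            sum += value;
        }
        return sum;
    }
}
```

With a fresh Sequence and a count of 3, the first method adds 0 + 1 + 2 and returns 3, while the hoisted version adds 0 three times and returns 0. An optimizer may only hoist the read when it can prove the two forms are equivalent.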

Why does this code do if(sz != sz2) sz = sz2?

For the first time, I created LINQ to SQL classes. I decided to look at the generated class and found this.
What... why is it doing if(sz != sz2) { sz = sz2; }? I don't understand. Why isn't the setter generated as this._Property1 = value?
private string _Property1;
[Column(Storage="_Property1", CanBeNull=false)]
public string Property1
{
    get
    {
        return this._Property1;
    }
    set
    {
        if ((this._Property1 != value))
        {
            this._Property1 = value;
        }
    }
}

It only updates the property if it has changed. This is probably based on the assumption that a comparison is cheaper than updating the reference (and all the entailed memory management) that might be involved.

Where are you seeing that? The usual LINQ-to-SQL generated properties look like the following:
private string _Property1;
[Column(Storage="_Property1", CanBeNull=false)]
public string Property1 {
    get {
        return this._Property1;
    }
    set {
        if ((this._Property1 != value)) {
            this.OnProperty1Changing(value);
            this.SendPropertyChanging();
            this._Property1 = value;
            this.SendPropertyChanged("Property1");
            this.OnProperty1Changed();
        }
    }
}
And now it's very clear that the device is to avoid sending property changing/changed notifications when the property is not actually changing.
Now, it turns out that OnProperty1Changing and OnProperty1Changed are partial methods so that if you don't declare a body for them elsewhere the calls to those methods will not be compiled into the final assembly (so if, say, you were looking in Reflector you would not see these calls). But SendPropertyChanging and SendPropertyChanged are protected methods that can't be compiled out.
So, did you perhaps change a setting that prevents the property changing/changed notifications from being emitted by the code generator?
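To see the effect of the guard in isolation, here is a generic sketch using INotifyPropertyChanged. It is not the LINQ to SQL generated code; Person and Name are hypothetical names. The handler only fires when the assigned value differs from the current one.

```csharp
using System.ComponentModel;

// Hypothetical type for illustration.
public class Person : INotifyPropertyChanged
{
    private string _name;

    public event PropertyChangedEventHandler PropertyChanged;

    public string Name
    {
        get { return _name; }
        set
        {
            // Skip the assignment and the notification when nothing changes.
            if (_name != value)
            {
                _name = value;
                PropertyChangedEventHandler handler = PropertyChanged;
                if (handler != null)
                {
                    handler(this, new PropertyChangedEventArgs("Name"));
                }
            }
        }
    }
}
```

Assigning "a", then "a" again, then "b" raises the event twice, not three times.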

Setting a field won't cause property change notifications, so that's not the reason.
I would guess that this design choice was driven by something like the following:
That string is an immutable reference type. Therefore the original and new instances are interchangeable. However the original instance may have been around longer and on average may therefore be slightly more expensive to collect (*). So performance may be better if the original instance is retained rather than being replaced by a new identical instance.
(*) The new value has in most cases only just been allocated, and won't be reused after the property is set. So it is very often a Gen0 object that is efficient to collect, whereas the original value's GC generation is unknown.
If this reasoning is correct, I wouldn't expect to see the same pattern for value-type properties (int, double, DateTime, ...).
But of course this is only speculation and I may be completely wrong.

Looks like there's persistence going on here. If something is using reflection (or a pointcut, or something) to create a SQL UPDATE query when _Property1 changes, then it'll be very much more expensive to update the field than to do the comparison.

It comes from Hejlsberg's Object Pascal roots... at least that's how most of the Borland Delphi VCL is implemented... ;)

Logic in get part of property. Good practice?

When databinding my XAML to some data, I often use the "get" part of a property to do some logic, like giving the sum of the totals of a list or checking whether something is positive.
For example:
public List<SomeClass> ListOfSomeClass { get; set; }
public double SumOfSomeClass
{
    get
    {
        return ListOfSomeClass.Sum(s => s.Totals);
    }
}
public bool SumPositive
{
    get
    {
        if (SumOfSomeClass >= 0)
            return true;
        else
            return false;
    }
}
This way I can bind to SumPositive and SumOfSomeClass. Is this considered good practice? Even if it gets more complex than this? Or would it be better call a method and return the outcome? What about calls to another class or even a database?

Property getters are expected to be fast and idempotent (i.e. no destructive actions should be performed there). Though it's perfectly fine to iterate over an in-memory collection of objects, I wouldn't recommend doing any kind of heavy lifting in either the get or set parts. And speaking of iterating, I'd still cache the result to save a few milliseconds.

Yes, unless it is an operation that might have performance implications. In that case you should use a method instead (as it is more intuitive to the end user that a method might be slow whereas a property will be quick)

I like your naming conventions and I agree entirely with using content such as your example in property getters, if you're delivering an API to be used with binding.
I don't agree with the point others have made about moving code into a method just because it is computationally heavy - that's not a distinction I'd ever make nor have I heard other people suggest that being in a method implies slower than a property.
I do believe that properties should be side-effect-free on the object on which they are called. It's vastly more difficult to guarantee they have no effect on the broader environment - even a relatively trivial property might pull data into memory or at least change the processor cache or vm state.

I say yes, but try to store the result of ListOfSomeClass.Sum(s => s.Totals) in a private variable, especially if you use it more than once.
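A rough sketch of what that could look like, assuming the view model is told when the list contents change: TotalsViewModel and InvalidateSum are hypothetical names, and SomeClass is reduced to the one property the example needs.

```csharp
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the question's item type.
public class SomeClass
{
    public double Totals { get; set; }
}

// Hypothetical view model for illustration.
public class TotalsViewModel
{
    private List<SomeClass> _list = new List<SomeClass>();
    private double? _cachedSum; // null means "not computed yet"

    public List<SomeClass> ListOfSomeClass
    {
        get { return _list; }
        set
        {
            _list = value;
            _cachedSum = null; // replacing the list invalidates the cache
        }
    }

    public double SumOfSomeClass
    {
        get
        {
            if (_cachedSum == null)
            {
                _cachedSum = _list.Sum(s => s.Totals);
            }
            return _cachedSum.Value;
        }
    }

    public bool SumPositive
    {
        get { return SumOfSomeClass >= 0; }
    }

    // Call after adding, removing or editing items in place.
    public void InvalidateSum()
    {
        _cachedSum = null;
    }
}
```

The trade-off is that the cached sum goes stale if items are changed without calling InvalidateSum, which is the usual price of caching a derived value.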

I don't see any direct issue (unless the list is quite huge), but I would personally use the myInstance.SomeList.Sum() method if possible (.NET >= 3.5).

For basic calculations based on fields or other properties in the collection, it would be acceptable to do that inside the getter. As everyone else said, true logic should never be done in the getter.

Please change that getter to this:
public bool SumPositive
{
    get
    {
        return SumOfSomeClass >= 0;
    }
}
You are already using a boolean expression, so there is no need to explicitly return true or false.

Having complex logic in getters/setters is not a good practice. I recommend moving complex logic to separate methods (like GetSumOfXYZ()) and using memoization in property accessors.
You can avoid complex properties by using ObjectDataProvider - it allows you to define method to pull some data.

Depends... if this was on a domain entity then I wouldn't be in favor having complex logic in a getter and especially not a setter. Using a method (to me) signals a consumer of the entity that an operation is being performed while a getter signals a simple retrieval.
Now if this logic was in a ViewModel, then I think the getter aspect is a little more forgivable / expected.

I think that there is some level of logic that is expected in Getters and Setters, otherwise you just have a kind of convoluted way to declare your members public.

I would be careful about putting any logic in the Getter of a property. The more expensive it is to do, the more dangerous it is. Other developers expect a getter to return a value immediately just like getting a value from a member variable. I've seen a lot of instances where a developer uses a property on every iteration of a loop, thinking that they are just getting back a value, while the property is actually doing a lot of work. This can be a major slowdown in your code execution.
