Object memory optimization question - c#

Please pardon me for a n00bish question.
Please consider the following code:
public class SampleClass
{
public string sampleString { get; set; }
public int sampleInt { get; set; }
}
class Program
{
SampleClass objSample;
public void SampleMethod()
{
for (int i = 0; i < 10; i++)
{
objSample = new SampleClass();
objSample.sampleInt = i;
objSample.sampleString = "string" + i;
ObjSampleHandler(objSample);
}
}
private void ObjSampleHandler(SampleClass objSample)
{
//Some Code here
}
}
In the given example code, each time SampleMethod() is called, it iterates 10 times, allocates memory for a new instance of SampleClass on each iteration, and assigns it to objSample.
I wonder,
If this is a bad approach, as a lot of memory space is being wasted with it?
If that is the case, is there a better approach to reuse/optimize the allocated memory?
Or am I getting worried for no reason at all and getting into unnecessary micro-optimisation mode? :)
Edit: Also consider the situation when such a method is being used in a multi-threaded environment. Would that change anything?

The technical term for what you are doing is premature optimization.
You're definitely doing well to think about the performance implications of things. But in this case, the .NET Garbage Collector will handle the memory fine. And .NET is very good at creating objects fast.
As long as your class's constructor isn't doing a lot of complex, time-consuming things, this won't be a big problem.

Second option.
You shouldn't be concerned with this kind of optimization unless you're having a performance issue.
And even if you are, it would depend on what you do with the object after you create it. For example, if in ObjSampleHandler() you're storing the objects to use them later, you simply cannot avoid what you're doing.
Remember, "premature optimization is the root of all evil", or so they say ;)

As you are creating a new object (objSample = new SampleClass();), you are not reusing that object. You are only reusing the reference to an instance of SampleClass.
But now you are forcing that reference to be a member-variable of your class Program, where it could have been a local variable of the method SampleMethod.
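As a rough sketch of the point above, the reference can be declared as a local inside the loop instead of as a field. The class name SampleRunner (standing in for Program) and the handledCount counter are assumptions added only so the sketch has something observable; they are not part of the original code. With a local, concurrent calls to SampleMethod also cannot overwrite each other's reference, which a shared field would allow.

```csharp
using System;
using System.Threading;

public class SampleClass
{
    public string sampleString { get; set; }
    public int sampleInt { get; set; }
}

// Hypothetical stand-in for the Program class from the question.
public class SampleRunner
{
    // Hypothetical counter, only here to make the effect observable.
    private int handledCount;
    public int HandledCount { get { return handledCount; } }

    public void SampleMethod()
    {
        for (int i = 0; i < 10; i++)
        {
            // Local variable: each call (and each thread) has its own reference,
            // and nothing keeps the object alive after the iteration ends.
            var sample = new SampleClass
            {
                sampleInt = i,
                sampleString = "string" + i
            };
            ObjSampleHandler(sample);
        }
    }

    private void ObjSampleHandler(SampleClass sample)
    {
        // Placeholder for "Some Code here"; Interlocked keeps the counter
        // correct if SampleMethod is called from several threads.
        Interlocked.Increment(ref handledCount);
    }
}
```

The number of allocations is the same as in the question; only the lifetime and visibility of the reference change.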

Assuming the code in your ObjSampleHandler method doesn't create any non-local references to objSample, the object will become eligible for garbage collection once the method finishes, which will be quite memory efficient and unlikely to be of concern.
However, if you are having problems specifically with the managed heap because of this type of code, then you could change your class to a struct. A struct is a value type, so it is stored inline (on the stack for a local variable, or inside the containing object for a field) rather than as a separate heap allocation, which can be more efficient. Please remember, though, that structs are copied by value rather than by reference, and you need to understand the consequences of this in the remainder of your code.
public struct SampleClass
{
public string sampleString { get; set; }
public int sampleInt { get; set; }
}
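To illustrate the copy-by-value caveat mentioned above, here is a minimal sketch. The names SampleStruct and CopyDemo are hypothetical and exist only for this illustration.

```csharp
using System;

// Hypothetical name; stands in for the struct version of SampleClass.
public struct SampleStruct
{
    public string sampleString { get; set; }
    public int sampleInt { get; set; }
}

public static class CopyDemo
{
    // The parameter is a copy of the caller's struct, so this
    // assignment never reaches the caller's value.
    public static void Modify(SampleStruct s)
    {
        s.sampleInt = 99;
    }

    public static int Run()
    {
        var original = new SampleStruct { sampleInt = 1, sampleString = "string1" };

        SampleStruct copy = original; // assignment copies the whole value
        copy.sampleInt = 2;           // changes only the copy

        Modify(original);             // changes only the callee's copy

        return original.sampleInt;    // unchanged by either operation
    }
}
```

If a handler is expected to modify the value it receives, the struct would have to be passed with ref or returned, which is one of the consequences to weigh before switching from a class.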

Related

C# Struct this() initializer - Memory, Performance, and cleanliness

Resharper recommended a change to my .net struct that I was unaware of. I am having a hard time finding Microsoft information about the this() initializer on a struct.
I have a constructor on my struct where I am passing in the values, but I want the struct properties to be read-only once the struct has been created. The Resharper proposed way makes the code much cleaner looking.
Questions:
Memory: I want to avoid generating any extra garbage if possible. I worry using this() may pre-initialize my value types, prior to setting them.
Performance: I worry that using the this() will first initialize the struct values with defaults, then set the values. An unnecessary operation. It would be nice to avoid that.
Cleanliness: It's obvious that using the :this() makes the struct much cleaner. Any reason why we wouldn't want to use that?
Example:
public struct MyContainer
{
public MyContainer(int myValue) : this()
{
MyValue = myValue;
}
public int MyValue { get; private set; }
}
public struct MyContainer2
{
private readonly int _myValue;
public MyContainer2(int myValue)
{
_myValue = myValue;
}
public int MyValue
{
get { return _myValue; }
}
}
If you are trying to optimize performance and less .net garbage, which is the correct route to go? Is there even a difference when it gets compiled?
I don't want to blindly accept using this, when I am creating millions of structs for data processing. They are short lived container objects so .net garbage and performance matters.
I created a quick benchmark of a struct with the "this()" initializer and one without, like this:
struct Data
{
public Data(long big, long big2, int small)
{
big_data = big;
big_data2 = big2;
small_data = small;
}
public long big_data;
public long big_data2;
public int small_data;
}
I benchmarked by initializing 5 billion structs of each type. I found that in debug mode, the struct test without "this()" initializer was measurably faster. In release mode, they were almost equal. I am assuming that in release mode, the "this()" is being optimized out and in debug it is running the "this()" and possibly even initializing the struct fields to default.
This is a shortcoming of the language concerning auto-implemented properties and structs. It's fixed in C# 6, where the explicit call to this() is not necessary, and you could even do away with the private setter:
public struct MyContainer
{
public int MyValue { get; }
public MyContainer(int value)
{
MyValue = value; //readonly properties can be set in the constructor, similar to how readonly fields behave
}
}
As to performance. I'd be very much surprised if there is a noticeable difference between the two (I can't currently check the differences in the generated IL). (As per comments, the next bit of the answer is irrelevant, calling this() will not generate additional "garbage") Also, if the objects are short lived like you claim, I wouldn't worry about garbage at all as they would all be stored in the stack, not the heap memory.

Do unused properties cause overhead?

Suppose we had a generated class with a lot of redundant accessors. They are just accessors, they are not fields. They are not called anywhere. They just sit there being redundant and ugly.
For example:
public class ContrivedExample
{
public int ThisOneIsUsed { get; set; }
public int ThisOneIsNeverCalled0 { get { /* Large amounts of logic go here, but is never called. */ } }
public int ThisOneIsNeverCalled1 { get { /* Large amounts of logic go here, but is never called. */ } }
public int ThisOneIsNeverCalled2 { get { /* Large amounts of logic go here, but is never called. */ } }
//...
public int ThisOneIsNeverCalled99 { get { /* Large amounts of logic go here, but is never called.*/ } }
}
ContrivedExample c = new ContrivedExample() { ThisOneIsUsed = 5 };
The only overhead I can think of is that it would make the .DLL larger. I expect that there would be zero runtime penalties.
Does this cause any other overhead? Even a tiny overhead of any kind?
It is unlikely to have any measurable run-time overhead. In any case, since it is a performance question, measure your usage and make decisions for your case if in doubt.
Unreferenced methods do not get compiled by JIT nor cause direct run-time overhead.
Metadata for the class will be bigger (along with size of assembly as you've mentioned).
You may get an indirect impact if the class is used in code that involves a lot of reflection, although code that repeatedly reflects over the same class is likely wrong in itself.

Why doesn't C# support const on a class / method level?

I've been wondering for a while why C# doesn't support const on a class or a method level. I know that Jon Skeet has wanted support for immutability for a long time, and I reckon that using the C++ syntax of function const could aid in that. By adding a const keyword on a class level we would have total support.
Now, my question is, what the reason is for the C# team to not have developed this kind of support?
I'd imagine everything could be created with a compile-time check or through attributes, without needing to change the CLR. I don't mind code being able to override the const behavior through reflection.
Imagine this:
const class NumberContainer
{
public int Number { get; }
}
.. Such a class could only be populated at construction time, so we'd need a constructor to take in an int.
Another example is const on a method-level:
public int AddNumbers(NumberContainer n1, NumberContainer n2) const
{
return n1.Number + n2.Number;
}
Const-level methods should not be able to alter state in their own class or instances of reference types passed to them. Also, const-level functions could only invoke other const-level functions while in their scope.
I'm not really sure if lambdas and delegates would make everything too hard (or impossible) to achieve, but I'm sure someone with more experience in language and compiler design could tell me.
As Steve B pointed out in the comments, the existence of readonly makes things a bit more complex, as const and readonly are close to the same during runtime, but readonly values can't be determined during compile-time. I guess we could have const and readonly level but that might be too confusing?
So, what's the reason for not implementing this? Usability concerns (understanding constness in C++ is usually quite hard for new users), language design concerns (can't be done) or simply priority concerns (the days of the immutability-buzz are over)..?
Risking a somewhat circular explanation, C# doesn't support const because the CLR has no support for it whatsoever. The CLR doesn't support it because it is drastically non-CLS compliant.
There are very few languages that have the concept. The C language has support for const, and that is well covered in C# by the readonly keyword. But the big dog is of course C++, which has a much wider applicability for const, no doubt the one you are looking for. I'll avoid pinning down what const should mean, that's a wormhole in itself, and just talk of "const-ness", the property of having const applied.
The trouble with const-ness is that it needs to be enforced. That's a problem in C# when an arbitrary other language can use a C# class and completely ignore const-ness just because the language doesn't support it. Bolting it onto every other CLS language just because C# supports it is of course very impractical.
Enforceability is a problem in C++ as well. Because the language also supports const_cast<>. Any client code can cast the const-ness away swiftly and undiagnosably. You are not supposed to, but then sometimes you have to. Because there are two kinds of const-ness, strict and observable. Roughly analogous to private const-ness and public const-ness. The mutable keyword was added to the language later to try to deal with the need for observable const-ness so at least the inevitable usage of const_cast<> could be avoided. Some people say that C++ is a difficult language. Don't hear that of C# much.
You say the CLR wouldn't need to be changed, but consider that there's no standard way to express this "const"ness within compiled assemblies - and that these assemblies might not be consumed by C# code anyway. It's not something you can just do for C# - you'd have to do it for all .NET languages.
As I believe the case to be, const means different things in C# compared to C++.
In C# you can use the readonly keyword to get the level of functionality you're wanting from const.
I was once surprised by the following situation:
class Vector
{
private double[] m_data;
public int Dimension {get;set;}
public double this[int i]
{
get {return m_data[i];}
set {m_data[i] = value;}
}
public Vector(int n)
{
this.Dimension = n;
this.m_data = new double[n];
}
public static Vector Zero(int n)
{
Vector v = new Vector(n);
for (int i = 0; i < n; i++)
{
v[i] = 0.0;
}
return v;
}
public static readonly Vector Zero3 = Zero(3);
}
Though Vector.Zero3 is readonly and you cannot assign to it, you can still access its components, and then the following stupid thing happens:
Vector a = Vector.Zero3;
a[0] = 2.87;
and now, since a is nothing but a reference to Vector.Zero3, the latter also has Vector.Zero3[0] == 2.87!
After I fell into this pit once, I invented a very simple hack which, though not elegant, fulfills its function.
Namely, into a class that is supposed to produce static readonly "constants", I introduce a Boolean flag:
class Vector
{
private double[] m_data;
public int Dimension {get;set;}
private bool m_bIsConstant = false;
...
public double this[int i]
{
get {return m_data[i];}
set
{
if (!m_bIsConstant)
{
m_data[i] = value;
}
}
}
...
public static Vector Zero(int n)
{
Vector v = new Vector(n);
for (int i = 0; i < n; i++)
{
v[i] = 0.0;
}
v.m_bIsConstant = true;
return v;
}
...
}
This hack guarantees that your static readonly variable will never be modified.
In the case of your proposal for a const-class, you say:
Such a class could only be populated at construction time, so we'd need a constructor to take in an int
But by making all properties read-only anyway you have already achieved what you've said.
I cannot speak for the C# language designers but maybe the reason of not having const applied to lots of other constructs is because adding it was simply not worth the effort and you can get around the issue in other ways (as described above and in other answers/comments).
I can't tell from your question, how this overloading of the const keyword would be especially beneficial.
Your first example could be rewritten legally as
public class NumberContainer
{
private readonly int number;
public NumberContainer(int number)
{
this.number = number;
}
public int Number
{
get { return number; }
}
}
Perhaps, if the compiler is unable to discern the immutability of this class (I don't know), some attribute could be useful?
In your second example, I do not understand what you are driving at. If a function returns a constant value then it can be replaced with a constant field.

CA1819: Properties shouldn't return arrays - What is the right alternative?

I encountered this FxCop rule before and wasn't really content with how to solve violations (thread1, thread2). I now have another case where I need to correct violations of the CA1819 kind.
Specifically, I have an algorithm-library that performs some analytic calculations on a curve (x,y), with a public "input object" like this:
public class InputObject
{
public double[] X { get; set; }
public double[] Y { get; set; }
// + lots of other things as well
}
This object's X and Y properties are used in hundreds of locations within the library, typically using indexes. The input object is never altered by the algorithms, but actually it shouldn't matter if it were. Also, .Length is called pretty frequently. It's a mathematical library, and double[] is kind of the standard data type in there. In any case, fixing CA1819 will require quite some work.
I thought about using List<double>, since Lists support indexing and are quite similar to arrays but I'm not sure whether this may slow down the algorithms or whether FxCop will be happy with those Lists.
What is the best option to replace these double[] properties?
If it is read-only to the external consumer and the consumer does not need to access it by index, then the best option is to have a public read-only property of type IEnumerable<> with method accessors to add and remove. This way you will not have to expose your array for someone to mess with.
If you need to access the indexers, then expose it as a read-only property of type IList<> and probably return a ReadOnly instance, with methods to add and remove.
This way you keep encapsulation of the internal list and allow the consumer to access it in a read-only way.
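One possible shape for the indexable, read-only option described above is sketched here. It assumes .NET 4.5 or later (for IReadOnlyList<T>) and swaps IList<> for IReadOnlyList<double>; the single-array constructor is a simplification for illustration and not the original InputObject.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class InputObject
{
    // Read-only wrapper created once, so the getter does not allocate.
    private readonly ReadOnlyCollection<double> _x;

    public InputObject(double[] x)
    {
        // Defensive copy: later changes to the caller's array are not seen here.
        _x = Array.AsReadOnly((double[])x.Clone());
    }

    // Supports indexing and Count, but exposes no setter.
    public IReadOnlyList<double> X
    {
        get { return _x; }
    }
}
```

Call sites would use X[i] as before and X.Count instead of .Length. Whether the extra indirection matters for the algorithms is something to measure rather than assume.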
Sometimes FxCop, from my point of view, exaggerates.
It all depends on what you have to do. If you are writing a complex system where security and very clean code are required, you should return a read-only version of that array.
That is, cast the array to IEnumerable, as devdigital suggests, or use Mohamed Abed's good ImmutableArray idea, which I prefer.
If you are writing software that requires high performance... there is nothing better than an array for performance in C#.
Arrays can be a lot more performant for iterating and reading.
If performance is really important, I suggest you ignore that warning.
It is still legal, even if not very clean, to return a readonly array.
for (int i = 0; i < array.Length; ++i) { k = array[i] + 1; }
This is very fast for big arrays in C#: because the loop is bounded by array.Length, the JIT compiler can eliminate the array bounds check.
It will perform very much as a C compiled code would do.
I always wished a "readonly array" type in C# :) but there is no hope to see it.
As your link suggests:
To fix a violation of this rule, either make the property a method or
change the property to return a collection.
Using a collection such as a List should not have a significant impact on performance.
The big problem here isn't really what your library does with the values (which is a potential problem, albeit a much more manageable one), but rather what callers might do with the values. If you need to treat them as immutable, then you need to ensure that a library consumer cannot change the contents after their original assignment. The easy fix here would be to create an interface that exposes all the array members that your library uses, then create an immutable wrapper class for an array that implements this interface to use in your InputObject class. e.g.:
public interface IArray<T>
{
int Length { get; }
T this[int index] { get; }
}
internal sealed class ImmutableArray<T> : IArray<T>
where T : struct
{
private readonly T[] _wrappedArray;
internal ImmutableArray(IEnumerable<T> data)
{
this._wrappedArray = data.ToArray();
}
public int Length
{
get { return this._wrappedArray.Length; }
}
public T this[int index]
{
get { return this._wrappedArray[index]; }
}
}
public class InputObject
{
private readonly IArray<double> _x;
private readonly IArray<double> _y;
public InputObject(double[] x, double[] y)
{
this._x = new ImmutableArray<double>(x);
this._y = new ImmutableArray<double>(y);
}
public IArray<double> X
{
get { return this._x; }
}
public IArray<double> Y
{
get { return this._y; }
}
//...
}
The elements in your "immutable" array contents would still be mutable if T is mutable, but at least you're safe for the double type.
Change array [] to IEnumerable:
public class InputObject
{
public IEnumerable<double> X { get; set; }
public IEnumerable<double> Y { get; set; }
// + lots of other things as well
}

Holding out on object creation

Is there ever a case where holding the necessary data to create an object and only creating it when is absolutely necessary, is better/more efficient than holding the object itself?
A trivial example:
class Bar
{
public string Data { get; set; }
}
class Foo
{
Bar bar;
readonly string barData;
public Foo(string barData)
{
this.barData = barData;
}
public void MaybeCreate(bool create)
{
if (create)
{
bar = new Bar { Data = barData };
}
}
public Bar Bar { get { return bar; } }
}
It makes sense if the object performs some complex operation on construction, such as allocate system resources.
You have Lazy<T> to help you delay an object's instantiation. Among other things, it has thread safety built in, if you need it.
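As a minimal sketch, the Foo/Bar example from the question could be restructured around Lazy<T> roughly like this. The IsBarCreated property is an addition for illustration only.

```csharp
using System;

public class Bar
{
    public string Data { get; set; }
}

public class Foo
{
    private readonly Lazy<Bar> bar;

    public Foo(string barData)
    {
        // The factory runs at most once, on the first access to bar.Value.
        // This constructor overload is thread-safe by default.
        bar = new Lazy<Bar>(() => new Bar { Data = barData });
    }

    // Hypothetical helper showing whether the Bar has been built yet.
    public bool IsBarCreated
    {
        get { return bar.IsValueCreated; }
    }

    public Bar Bar
    {
        get { return bar.Value; }
    }
}
```

Compared with MaybeCreate(bool), the caller no longer decides when to construct the object; the first read of the Bar property does, and later reads return the same instance.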
In general, no. (If I understand your question correctly.)
Allocations/constructions are cheap in terms of performance. Unless you are doing something crazy, construct your objects when it feels natural for the design - don't optimize prematurely.
Yes if creating the object means populating it, and to populate it you need to do a slow operation.
For example,
List<int> ll = returnDataFromDBVeryVerySlowly();
or
Lazy<List<int>> ll = new Lazy<List<int>>(() =>
{
return returnDataFromDBVeryVerySlowly();
});
In the first example, returnDataFromDBVeryVerySlowly will always be called, even if you don't need it. In the second one, it will be called only if it's necessary. This is quite common, for example, in ASP.NET, where you want to have many "standard" datasets "ready", but you don't want them to be populated unless they are needed, and you want to put them as members of your Page so that multiple methods can access them (otherwise a method could call returnDataFromDBVeryVerySlowly directly).
