C# Struct this() initializer - Memory, Performance, and cleanliness


ReSharper recommended a change to my .NET struct that I was unaware of. I am having a hard time finding Microsoft information about the this() initializer on a struct.
I have a constructor on my struct that takes the values, but I want the struct's properties to be read-only once the struct has been created. The ReSharper-proposed way makes the code look much cleaner.
Questions:
Memory: I want to avoid generating any extra garbage if possible. I worry using this() may pre-initialize my value types, prior to setting them.
Performance: I worry that using the this() will first initialize the struct values with defaults, then set the values. An unnecessary operation. It would be nice to avoid that.
Cleanliness: It's obvious that using : this() makes the struct much cleaner. Is there any reason not to use it?
Example:
public struct MyContainer
{
    public MyContainer(int myValue) : this()
    {
        MyValue = myValue;
    }
    public int MyValue { get; private set; }
}
public struct MyContainer2
{
    private readonly int _myValue;
    public MyContainer2(int myValue)
    {
        _myValue = myValue;
    }
    public int MyValue
    {
        get { return _myValue; }
    }
}
If you are trying to optimize performance and minimize .NET garbage, which is the correct route to go? Is there even a difference once the code is compiled?
I don't want to blindly accept using this() when I am creating millions of structs for data processing. They are short-lived container objects, so .NET garbage and performance matter.

I created a quick benchmark comparing a struct with the "this()" initializer against one without it. The version without looked like this:
struct Data
{
    public Data(long big, long big2, int small)
    {
        big_data = big;
        big_data2 = big2;
        small_data = small;
    }
    public long big_data;
    public long big_data2;
    public int small_data;
}
I benchmarked by initializing 5 billion structs of each type. I found that in debug mode, the struct test without "this()" initializer was measurably faster. In release mode, they were almost equal. I am assuming that in release mode, the "this()" is being optimized out and in debug it is running the "this()" and possibly even initializing the struct fields to default.
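A comparison along these lines could be sketched as follows. This is not the benchmark code used above (which was not posted in full); the names DataWithThis, DataWithoutThis, and Benchmark are hypothetical, and a Stopwatch loop like this is only a rough measurement, not a rigorous one.

```csharp
using System;
using System.Diagnostics;

// Hypothetical variant that chains to this(), which zeroes all fields first.
public struct DataWithThis
{
    public long Big;
    public long Big2;
    public int Small;

    public DataWithThis(long big, long big2, int small) : this()
    {
        Big = big;
        Big2 = big2;
        Small = small;
    }
}

// Hypothetical variant that assigns every field directly.
public struct DataWithoutThis
{
    public long Big;
    public long Big2;
    public int Small;

    public DataWithoutThis(long big, long big2, int small)
    {
        Big = big;
        Big2 = big2;
        Small = small;
    }
}

public static class Benchmark
{
    public static long SumWithThis(int iterations)
    {
        long sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            var d = new DataWithThis(i, i + 1, i);
            sum += d.Big + d.Big2 + d.Small; // use the value so the loop has an observable result
        }
        return sum;
    }

    public static long SumWithoutThis(int iterations)
    {
        long sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            var d = new DataWithoutThis(i, i + 1, i);
            sum += d.Big + d.Big2 + d.Small;
        }
        return sum;
    }

    // Times one run of the given loop body.
    public static TimeSpan Time(Func<int, long> body, int iterations)
    {
        var sw = Stopwatch.StartNew();
        body(iterations);
        sw.Stop();
        return sw.Elapsed;
    }
}
```

Calling Benchmark.Time(Benchmark.SumWithThis, n) and Benchmark.Time(Benchmark.SumWithoutThis, n) in both Debug and Release builds gives a rough comparison; run each several times, since the first call includes JIT compilation.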

This is a shortcoming of the language concerning auto-implemented properties and structs. It's fixed in C# 6, where the explicit call to this() is not necessary, and you can even do away with the private setter:
public struct MyContainer
{
    public int MyValue { get; }
    public MyContainer(int value)
    {
        MyValue = value; // read-only properties can be set in the constructor, similar to how readonly fields behave
    }
}
As for performance, I'd be very surprised if there were a noticeable difference between the two (I can't currently check the generated IL). (As per the comments, the next part of this answer is beside the point: calling this() does not generate additional "garbage".) Also, if the objects are as short-lived as you say, I wouldn't worry about garbage at all: struct locals live on the stack (or in registers), not on the managed heap.

Related

C# fixed size array

I have started to learn C#; I usually use C++.
There are a bunch of things I'm trying to adapt to, but an equivalent of std::array seems impossible...
I just want to write this kind of code:
public struct Foo {};
public struct Test
{
public Foo value[20];
};
I don't want to allocate each time I use this struct, and I don't want to use a class at all...
I saw the fixed keyword, but it works only for basic types...
Is there no equivalent to something as simple as std::array?
I can even do that in C.
How would you solve this problem? (Even if it's still dynamically allocated...)

Using a fixed size buffer (fixed) is only possible for primitive types since its use is intended for interop. Array types are reference types, and so they can have dynamic size:
public struct Test
{
public Foo[] value;
}
Note however that copying the struct will only copy the reference, so the arrays will be identical. I suggest you either make the type immutable (by disabling writing to the array), or change struct to class and control cloning explicitly.
There is no such thing as a fixed size by-value array type in C# (although I have proposed it once). The closest thing you can get to it is a value tuple.
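The aliasing caveat above can be made concrete with a small sketch. The Id field on Foo, the constructor, and the DeepCopy helper are hypothetical additions for illustration only.

```csharp
using System;

public struct Foo
{
    public int Id; // hypothetical payload so the effect is observable
}

public struct Test
{
    public Foo[] Value;

    public Test(int size)
    {
        Value = new Foo[size]; // one heap allocation for the array
    }

    // Hypothetical helper: copies the array as well as the struct.
    public Test DeepCopy()
    {
        var copy = new Test();
        copy.Value = (Foo[])Value.Clone();
        return copy;
    }
}
```

With this sketch, after `var a = new Test(20); var b = a;` both a.Value and b.Value refer to the same array, so writing b.Value[0].Id also changes a.Value[0].Id. A value obtained from a.DeepCopy() has its own array and is unaffected.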

So it seems there is no way to avoid something as stupid as dynamically allocating something known at compile time. But that's C#, so I just need to... try to close my eyes.
Anyway, I did something to solve array aliases and fixed arrays at the same time (I didn't ask about array aliases in this question, though).
public abstract class Array<T>
{
    private T[] data;
    protected Array(int size) { data = new T[size]; }
    public T this[int i]
    {
        get { return data[i]; }
        set { data[i] = value; }
    }
}
public class Alias : Array<int>
{
    public const int Length = 10;
    public Alias() : base(Length) {}
}
And some people say it's quicker to write code in C#...
If someone has something better, I'll gladly take it!

Restricting use of a structure in C#

OK, so let's say I have a structure A like this:
struct A
{
    private String _SomeText;
    private int _SomeValue;
    public A(String someText, int SomeValue) { /*.. set the initial values..*/ }
    public String SomeText { get { return _SomeText; } }
    public int SomeValue { get { return _SomeValue; } }
}
Now what I want to be able to do is return that structure A as the result of a method in a class ABC, like this:
class ABC
{
    public A getStructA()
    {
        //creation of Struct A
        return a;
    }
}
I don't want any programmer using my library (which will have Struct A and Class ABC and some more stuff) to ever be able to create an instance of Struct A.
I want the only way for it to be created is as a return from the getStructA() method. Then the values can be accessed through the appropriate getters.
So is there any way to set a restrictions like that? So a Structure can't be instantiated outside of a certain class? Using C#, .Net4.0.
Thanks for your help.
---EDIT:----
To add some details on why am I trying to achieve this:
My class ABC has some "status" a person can query. This status has 2 string values and then a long list of integers.
There never will be a need to create an object/instance of "Status" by the programmer, the status can only be returned by "getStatus()" function of the class.
I do not want to split these 3 fields to different methods, as to obtain them I am calling Windows API (p/invoke) which returns similar struct with all 3 fields.
If I was indeed going to split it to 3 methods and not use the struct, I would have to either cache results or call the method from Windows API every time one of these 3 methods is called...
So I can either make a public struct and programmers can instantiate it if they want, which will be useless for them as there will be no methods which can accept it as a parameter. Or I can construct the library in such a way that this struct (or change it to a class if it makes things easier) can be obtained only as a return from the method.

If the "restricted" type is a struct, then no, there is no way to do that. The struct must be at least as public as the factory method, and if the struct is public then it can be constructed with its default constructor. However, you can do this:
public struct A
{
private string s;
private int i;
internal bool valid;
internal A(string s, int i)
{
this.s = s;
this.i = i;
this.valid = true;
}
...
and now you can have your library code check the "valid" flag. Instances of A can only be made either (1) by a method internal to your library that can call the internal constructor, or (2) by the default constructor. You can tell them apart with the valid flag.
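A compact sketch of how the flag check might look in practice follows. The IsValid property, the Consume method, and the sample values are hypothetical; the answer above only specifies the internal constructor and the valid flag.

```csharp
using System;

public struct A
{
    private readonly string s;
    private readonly int i;
    private readonly bool valid; // false for default(A), true only via the internal constructor

    internal A(string s, int i)
    {
        this.s = s;
        this.i = i;
        this.valid = true;
    }

    public string SomeText { get { return s; } }
    public int SomeValue { get { return i; } }
    internal bool IsValid { get { return valid; } }
}

public class ABC
{
    // The only library path that produces a "valid" A.
    public A GetStructA()
    {
        return new A("status", 42); // placeholder values
    }

    // Hypothetical library method that rejects instances it did not create.
    public void Consume(A a)
    {
        if (!a.IsValid)
            throw new ArgumentException("A was not created by this library.", "a");
    }
}
```

Code outside the assembly can still write `new A()` or `default(A)`, but any such instance has valid == false, so the library can reject it.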
A number of people have suggested using an interface, but that's a bit pointless; the whole point of using a struct is to get value type semantics and then you go boxing it into an interface. You might as well make it a class in the first place. If it is going to be a class then it is certainly possible to make a factory method; just make all the ctors of the class internal.
And of course I hope it goes without saying that none of this gear should be used to implement code that is resistant to attack by a fully-trusted user. Remember, this system is in place to protect good users from bad code, not good code from bad users. There is nothing whatsoever that stops fully trusted user code from calling whatever private methods they want in your library via reflection, or for that matter, altering the bits inside a struct with unsafe code.

Create a public interface and make the struct private to the class that returns it.
public interface ISpecialReturnType
{
    String SomeText { get; }
    int SomeValue { get; }
}
public class ABC
{
    public ISpecialReturnType getStructA()
    {
        A a = new A("text", 1); // placeholder: get a value for a
        return a;
    }
    private struct A : ISpecialReturnType
    {
        private String _SomeText;
        private int _SomeValue;
        public A(String someText, int SomeValue) { /*.. set the initial values..*/ }
        public String SomeText { get { return _SomeText; } }
        public int SomeValue { get { return _SomeValue; } }
    }
}

What exactly are you concerned about? A structure is fundamentally a collection of fields stuck together with duct tape. Since struct assignment copies all of the fields from one struct instance to another, outside the control of the struct type in question, structs have a very limited ability to enforce any sort of invariants, especially in multi-threaded code (unless a struct is exactly 1, 2, or 4 bytes, code that wants to create an instance which contains a mix of data copied from two different instances may do so pretty easily, and there's no way the struct can prevent it).
If you want to ensure that your methods will not accept any instances of a type other than those which your type has produced internally, you should use a class that either has only internal or private constructors. If you do that, you can be certain that you're getting the instances that you yourself produced.
EDIT
Based upon the revisions, I don't think the requested type of restriction is necessary or particularly helpful. It sounds like what's fundamentally desired is to stick a bunch of values together and store them in a stuck-together group of variables held by the caller. If you declare a struct as simply:
public struct QueryResult
{
    public TimeSpan ExecutionDuration;
    public DateTime CompletionTime;
    public string ReturnedMessage;
}
then a declaration:
QueryResult foo;
will effectively create three variables, named foo.ExecutionDuration, foo.CompletionTime, and foo.ReturnedMessage. The statement:
foo = queryPerformer.performQuery(...);
will set the values of those three variables according to the results of the function, essentially equivalent to:
{
    var temp = queryPerformer.performQuery(...);
    foo.ExecutionDuration = temp.ExecutionDuration;
    foo.CompletionTime = temp.CompletionTime;
    foo.ReturnedMessage = temp.ReturnedMessage;
}
Nothing will prevent user code from doing whatever it wants with those three variables, but so what? If user code decides for whatever reason to say foo.ReturnedMessage = "George"; then foo.ReturnedMessage will equal George. The situation is really no different from if code had said:
int functionResult = doSomething();
and then later said functionResult = 43;. The behavior of functionResult, like any other variable, is to hold the last thing written to it. If the last thing written to it is the result of the last call to doSomething(), that's what it will hold. If the last thing written was something else, it will hold something else.
Note that a struct field, unlike a class field or a struct property, can only be changed either by writing to it, or by using a struct assignment statement to write all of the fields in one struct instance with the values in corresponding fields of another. From the consumer's perspective, a read-only struct property carries no such guarantee. A struct may happen to implement a property to behave that way, but without inspecting the code of the property there's no way to know whether the value it returns might be affected by some mutable object.
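The contrast between a struct field and a read-only struct property can be illustrated with a small sketch. FieldResult, PropertyResult, and the shared string array are hypothetical names invented for this example.

```csharp
using System;

public struct FieldResult
{
    // Plain field: its value changes only when this variable is written.
    public string ReturnedMessage;
}

public struct PropertyResult
{
    private readonly string[] slot; // reference to a shared, mutable array

    public PropertyResult(string[] slot)
    {
        this.slot = slot;
    }

    // Read-only property, yet its value follows whatever the shared array holds.
    public string ReturnedMessage
    {
        get { return slot[0]; }
    }
}
```

If a FieldResult and a PropertyResult are both created while the shared array holds "first", and the array element is later overwritten with "second", the field still reads "first" while the property now reads "second".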

Why doesn't C# support const on a class / method level?

I've been wondering for a while why C# doesn't support const on a class or a method level. I know that Jon Skeet has wanted support for immutability for a long time, and I reckon that using the C++ syntax of function const could aid in that. By adding a const keyword on a class level we would have total support.
Now, my question is, what the reason is for the C# team to not have developed this kind of support?
I'd imagine everything could be created with a compile-time check or through attributes, without needing to change the CLR. I don't mind code being able to override the const behavior through reflection.
Imagine this:
const class NumberContainer
{
public int Number { get; }
}
.. Such a class could only be populated at construction time, so we'd need a constructor to take in an int.
Another example is const on a method-level:
public int AddNumbers(NumberContainer n1, NumberContainer n2) const
{
return n1.Number + n2.Number;
}
Const-level methods should not be able to alter state in their own class or instances of reference types passed to them. Also, const-level functions could only invoke other const-level functions while in their scope.
I'm not really sure if lambdas and delegates would make everything too hard (or impossible) to achieve, but I'm sure someone with more experience in language and compiler design could tell me.
As Steve B pointed out in the comments, the existence of readonly makes things a bit more complex, as const and readonly are close to the same during runtime, but readonly values can't be determined during compile-time. I guess we could have const and readonly level but that might be too confusing?
So, what's the reason for not implementing this? Usability concerns (understanding constness in C++ usually quite hard for new users), language design concerns (can't be done) or simply priority concerns (the days of the immutability-buzz are over)..?

Risking a somewhat circular explanation, C# doesn't support const because the CLR has no support for it whatsoever. The CLR doesn't support it because it is drastically non-CLS compliant.
There are very few languages that have the concept. The C language has support for const, and that is well covered in C# by the readonly keyword. But the big dog is of course C++, which has a much wider applicability for const, no doubt the one you are looking for. I'll avoid pinning down what const should mean, that's a wormhole in itself, and just talk of "const-ness", the property of having const applied.
The trouble with const-ness is that it needs to be enforced. That's a problem in C# when an arbitrary other language can use a C# class and completely ignore const-ness just because the language doesn't support it. Bolting it onto every other CLS language just because C# supports it is of course very impractical.
Enforceability is a problem in C++ as well. Because the language also supports const_cast<>. Any client code can cast the const-ness away swiftly and undiagnosably. You are not supposed to, but then sometimes you have to. Because there are two kinds of const-ness, strict and observable. Roughly analogous to private const-ness and public const-ness. The mutable keyword was added to the language later to try to deal with the need for observable const-ness so at least the inevitable usage of const_cast<> could be avoided. Some people say that C++ is a difficult language. Don't hear that of C# much.

You say the CLR wouldn't need to be changed, but consider that there's no standard way to express this "const"ness within compiled assemblies - and that these assemblies might not be consumed by C# code anyway. It's not something you can just do for C# - you'd have to do it for all .NET languages.

As I believe the case to be, const means different things in C# compared to C++.
In C# you can use the readonly keyword to get the level of functionality you're wanting from const.

I was once surprised by the following situation:
class Vector
{
    private double[] m_data;
    public int Dimension { get; set; }
    public double this[int i]
    {
        get { return m_data[i]; }
        set { m_data[i] = value; }
    }
    public Vector(int n)
    {
        this.Dimension = n;
        this.m_data = new double[n];
    }
    public static Vector Zero(int n)
    {
        Vector v = new Vector(n);
        for (int i = 0; i < n; i++)
        {
            v[i] = 0.0;
        }
        return v;
    }
    public static readonly Vector Zero3 = Zero(3);
}
Though Vector.Zero3 is readonly and you cannot assign to it, you can still access its components, and then the following stupid thing happens:
Vector a = Vector.Zero3;
a[0] = 2.87;
and now, since a is nothing but a reference to Vector.Zero3, the latter also has Vector.Zero3[0] == 2.87!
After I fell into this pit once, I invented a very simple hack, though not being elegant, fulfills its function.
Namely, into a class that I suppose to produce static readonly "constants", I introduce a Boolean flag:
class Vector
{
private double[] m_data;
public int Dimension {get;set;}
private bool m_bIsConstant = false;
...
public double this[int i]
{
get {return m_data[i];}
set
{
if (!m_bIsConstant)
{
m_data[i] = value;
}
}
}
...
public static Vector Zero(int n)
{
Vector v = new Vector(n);
for (int i = 0; i < n; i++)
{
v[i] = 0.0;
}
v.m_bIsConstant = true;
return v;
}
...
}
This hack guarantees that your static readonly variable will never be modified.
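A self-contained variation of the flag approach is sketched below. Unlike the code above, which silently ignores writes to a constant vector, this version throws; that change, and the field names, are assumptions made for the example so that accidental writes are noticed.

```csharp
using System;

public class Vector
{
    private readonly double[] data;
    private bool isConstant; // set once by the factory for shared "constants"

    public Vector(int n)
    {
        data = new double[n]; // elements start at 0.0
    }

    public int Dimension
    {
        get { return data.Length; }
    }

    public double this[int i]
    {
        get { return data[i]; }
        set
        {
            if (isConstant)
                throw new InvalidOperationException("This vector is constant.");
            data[i] = value;
        }
    }

    public static Vector Zero(int n)
    {
        var v = new Vector(n);
        v.isConstant = true; // freeze before handing the instance out
        return v;
    }

    public static readonly Vector Zero3 = Zero(3);
}
```

Here `Vector a = Vector.Zero3; a[0] = 2.87;` throws instead of corrupting the shared instance, while vectors created with `new Vector(n)` remain writable.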

In the case of your proposal for a const-class, you say:
Such a class could only be populated at construction time, so we'd need a constructor to take in an int
But by making all properties read-only anyway you have already achieved what you've said.
I cannot speak for the C# language designers but maybe the reason of not having const applied to lots of other constructs is because adding it was simply not worth the effort and you can get around the issue in other ways (as described above and in other answers/comments).

I can't tell from your question, how this overloading of the const keyword would be especially beneficial.
Your first example could be rewritten legally as
public class NumberContainer
{
private readonly int number;
public NumberContainer(int number)
{
this.number = number;
}
public int Number
{
get { return number; }
}
}
Perhaps, if the compiler is unable to discern the immutability of this class (I don't know), some attribute could be useful?
In your second example, I do not understand what you are driving at. If a function returns a constant value then it can be replaced with a constant field.

CA1819: Properties shouldn't return arrays - What is the right alternative?

I encountered this FxCop rule before and wasn't really content with how to solve violations (thread1, thread2). I now have another case where I need to correct violations of the CA1819 kind.
Specifically, I have an algorithm-library that performs some analytic calculations on a curve (x,y), with a public "input object" like this:
public class InputObject
{
    public double[] X { get; set; }
    public double[] Y { get; set; }
    // + lots of other things as well
}
This object's X and Y properties are used in hundreds of locations within library, typically using indexes. The input object is never altered by the algorithms, but actually it shouldn't matter if so. Also, .Length is called pretty frequently. It's a mathematical library, and double[] is kind of the standard data type in there. In any case, fixing CA1819 will require quite some work.
I thought about using List<double>, since Lists support indexing and are quite similar to arrays but I'm not sure whether this may slow down the algorithms or whether FxCop will be happy with those Lists.
What is the best option to replace these double[] properties?

If the data is read-only to external consumers and they do not need to access it by index, the best option is a public read-only property of type IEnumerable<>, with methods to add and remove items. This way you do not have to expose your array for someone to mess with.
If consumers need indexed access, expose it as a read-only property of type IList<> and return a read-only wrapper, again with methods to add and remove.
This way you keep the internal list encapsulated and allow consumers to access it in a read-only way.
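One way to realize the indexed, read-only variant is sketched below, assuming the arrays are supplied at construction time. The constructor and the defensive Clone() calls are illustrative choices; Array.AsReadOnly returns a ReadOnlyCollection<T> that wraps the array without copying it.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class InputObject
{
    private readonly ReadOnlyCollection<double> x;
    private readonly ReadOnlyCollection<double> y;

    public InputObject(double[] x, double[] y)
    {
        // Copy once so later changes to the caller's arrays are not visible here.
        this.x = Array.AsReadOnly((double[])x.Clone());
        this.y = Array.AsReadOnly((double[])y.Clone());
    }

    // Indexable and countable; setting an element throws NotSupportedException.
    public IList<double> X { get { return x; } }
    public IList<double> Y { get { return y; } }
}
```

Library code can keep using input.X[i] and input.X.Count (instead of .Length). On .NET 4.5 and later, IReadOnlyList<double> may be a more descriptive property type.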

Sometimes FxCop exaggerates, in my view.
It all depends on what you have to do. If you are writing a complex system where security and very clean code are required, you should return a read-only version of that array.
That is, expose the array as IEnumerable as devdigital suggests, or use Mohamed Abed's ImmutableArray idea, which I prefer.
If you are writing software that requires high performance, nothing in C# beats an array.
Arrays can be much faster to iterate and read.
If performance is really important, I suggest you ignore that warning.
It is still legal, if not very clean, to return an array that callers are expected to treat as read-only.
for (int i = 0; i < array.Length; ++i) { k = array[i] + 1; }
This is very fast for big arrays in C#: because the loop is bounded by array.Length, the JIT can skip the per-element bounds check.
It will perform much like compiled C code.
I have always wished for a "readonly array" type in C# :) but I don't expect to see one.

As your link suggests:
To fix a violation of this rule, either make the property a method or
change the property to return a collection.
Using a collection such as a List should not have a significant impact on performance.
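The "make the property a method" option quoted above could look like the following sketch. CurveInput and GetX are hypothetical names, and returning a copy on every call is one possible design, not something the rule text prescribes.

```csharp
using System;

public class CurveInput
{
    private readonly double[] x;

    public CurveInput(double[] x)
    {
        this.x = (double[])x.Clone(); // keep a private copy
    }

    // A method signals to callers that each call does work (here, a copy).
    public double[] GetX()
    {
        return (double[])x.Clone();
    }
}
```

Callers that index heavily should call GetX() once and keep the returned array in a local variable, since every call allocates a new copy.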

The big problem here isn't really what your library does with the values (which is a potential problem, albeit a much more manageable one), but rather what callers might do with the values. If you need to treat them as immutable, then you need to ensure that a library consumer cannot change the contents after their original assignment. The easy fix here would be to create an interface that exposes all the array members that your library uses, then create an immutable wrapper class for an array that implements this interface to use in your InputObject class. e.g.:
public interface IArray<T>
{
int Length { get; }
T this[int index] { get; }
}
internal sealed class ImmutableArray<T> : IArray<T>
where T : struct
{
private readonly T[] _wrappedArray;
internal ImmutableArray(IEnumerable<T> data)
{
this._wrappedArray = data.ToArray();
}
public int Length
{
get { return this._wrappedArray.Length; }
}
public T this[int index]
{
get { return this._wrappedArray[index]; }
}
}
public class InputObject
{
private readonly IArray<double> _x;
private readonly IArray<double> _y;
public InputObject(double[] x, double[] y)
{
this._x = new ImmutableArray<double>(x);
this._y = new ImmutableArray<double>(y);
}
public IArray<double> X
{
get { return this._x; }
}
public IArray<double> Y
{
get { return this._y; }
}
//...
}
The elements in your "immutable" array contents would still be mutable if T is mutable, but at least you're safe for the double type.

Change the array type (double[]) to IEnumerable<double>:
public class InputObject
{
    public IEnumerable<double> X { get; set; }
    public IEnumerable<double> Y { get; set; }
    // + lots of other things as well
}

Object memory optimization question

Please pardon me for a n00bish question.
Please consider the following code:
public class SampleClass
{
    public string sampleString { get; set; }
    public int sampleInt { get; set; }
}
class Program
{
    SampleClass objSample;
    public void SampleMethod()
    {
        for (int i = 0; i < 10; i++)
        {
            objSample = new SampleClass();
            objSample.sampleInt = i;
            objSample.sampleString = "string" + i;
            ObjSampleHandler(objSample);
        }
    }
    private void ObjSampleHandler(SampleClass objSample)
    {
        //Some Code here
    }
}
In the example above, each time SampleMethod() is called it iterates 10 times, allocating a new SampleClass instance on each pass and assigning it to objSample.
I wonder:
Is this a bad approach because a lot of memory is being wasted?
If so, is there a better approach to reuse/optimize the allocated memory?
Or am I worrying for no reason and getting into unnecessary micro-optimisation mode? :)
Edit: Also consider the situation where such a method is used in a multi-threaded environment. Would that change anything?

The technical term for what you are doing is premature optimization.
You're definitely doing well to think about the performance implications of things. But in this case, the .NET Garbage Collector will handle the memory fine. And .NET is very good at creating objects fast.
As long as your class's constructor isn't doing a lot of complex, time-consuming things, this won't be a big problem.

Second option.
You shouldn't be concerned with this kind of optimization unless you're having a performance issue.
And even if you are, it depends on what you do with the object after you create it. For example, if ObjSampleHandler() stores the objects to use them later, you simply cannot avoid what you're doing.
Remember, "premature optimization is the root of all evil", or so they say ;)

As you are creating a new object (objSample = new SampleClass();), you are not reusing that object. You are only reusing the reference to an instance of SampleClass.
But now you are forcing that reference to be a member-variable of your class Program, where it could have been a local variable of the method SampleMethod.

Assuming the code in ObjSampleHandler doesn't create any non-local references to objSample, each object becomes eligible for garbage collection once nothing references it any longer (here, as soon as the objSample field is overwritten on the next iteration), which is quite memory efficient and unlikely to be of concern.
However, if you are having problems specifically with the managed heap because of this type of code, you could change your class to a struct. A struct held in a local variable or passed as a parameter is not allocated separately on the heap (it typically lives on the stack), which is cheaper. Please remember, though, that structs are copied by value rather than by reference, and you need to understand the consequences of this in the rest of your code.
public struct SampleClass
{
public string sampleString { get; set; }
public int sampleInt { get; set; }
}
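The copy-by-value consequence mentioned above can be seen in a short sketch. SampleStruct and CopyDemo are hypothetical names used here to keep the example separate from the question's SampleClass.

```csharp
using System;

public struct SampleStruct
{
    public string SampleString { get; set; }
    public int SampleInt { get; set; }
}

public static class CopyDemo
{
    // Receives a copy: the caller's variable is not affected.
    public static void Mutate(SampleStruct s)
    {
        s.SampleInt = 99;
    }

    // Receives a reference to the caller's variable: the change is visible.
    public static void MutateByRef(ref SampleStruct s)
    {
        s.SampleInt = 99;
    }
}
```

A handler written for the class version that modifies its argument would silently stop affecting the caller after the switch to a struct, unless the parameter is passed with ref.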
