Why can't we initialize an instance field at declaration in a C# struct?

In a C# struct, we cannot assign a value to an instance field at its declaration. Can you tell me the reason? Thanks.
A simple example:
struct Test
{
    public int age = 10; // it's not allowed.
}

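I think the answer is simple, but hard to grasp if you do not know the difference between value types and reference types.
One thing to note is that instances of reference types live on the managed heap, which the garbage collector cleans up. A value type is stored wherever its variable lives: a local variable is typically on the stack, while a struct field sits inline inside its containing object or array. Every time you define a scope, like:
{
}
the locals declared inside it stop being accessible once you exit the scope, and the storage of local value types is reclaimed along with the stack frame rather than by the garbage collector.
Because reference types and value types are handled so differently, they are also designed with those differences in mind. In particular, the runtime can create a struct value simply by zeroing its memory (for example default(T), or the elements of a new array) without running any constructor, so a field initializer could not be guaranteed to run. Not being able to declare a parameterless constructor and not being able to assign values at declaration are a logical result of this. (C# 10 and later relax both restrictions.)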
I found a very old Stack Overflow question about the same thing. It also has some short answers saying that it was designed like that for performance reasons:
Why can't I initialize my fields in my structs?
My source for this info was the ref book for 70-483.
Hope this gave you the clarification you are looking for
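As a minimal sketch of the point above (the Person struct and the method names are hypothetical, not from the question): a struct value obtained through default(T) or as an array element is just zeroed memory, so no initializer or constructor has run. A constructor that takes parameters is the usual way to set a starting value, and it works in every C# version.

```csharp
using System;

// Hypothetical struct for illustration; not taken from the question.
struct Person
{
    public int Age;

    // A constructor with parameters is allowed in every C# version.
    public Person(int age)
    {
        Age = age;
    }
}

static class StructInitDemo
{
    // default(T) yields zeroed memory; no constructor runs.
    public static int DefaultAge()
    {
        return default(Person).Age;
    }

    // Elements of a new array are zeroed as well.
    public static int ArrayElementAge()
    {
        var people = new Person[3];
        return people[1].Age;
    }

    // The explicit constructor sets the field.
    public static int ConstructedAge()
    {
        return new Person(10).Age;
    }

    static void Main()
    {
        Console.WriteLine(DefaultAge());      // 0
        Console.WriteLine(ArrayElementAge()); // 0
        Console.WriteLine(ConstructedAge());  // 10
    }
}
```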

Related

What exactly is a reference in C#

From what I understand by now, I can say that a reference in C# is a kind of pointer to an object which has reference count and knows about the type compatibility. My question is not about how a value type is different than a reference type, but more about how a reference is implemented.
I have read this post about the differences between references and pointers, but it does not say much about what a reference is; it mostly describes its properties compared with a pointer in C++. I also understand the difference between passing by reference and passing by value (in C#, arguments are passed by value by default, even references), but it is hard for me to understand what really is a reference when I have tried to explain to my colleagues why a parameter sent by reference can not be stored inside a closure, as in the Eric Lippert blog entry about the stack as an implementation detail.
Can somebody provide me with a complete, but hopefully simple explanation about what references really are in C# and a bit about how they are implemented?
Edit: this is not a duplicate, because the Reference type in C# question explains how a reference works and how it differs from a value, but what I am asking is how a reference is defined at a low level.
From what I understand by now, I can say that a reference in C# is a kind of pointer to an object
If by "kind of" you mean "is conceptually similar to", yes. If you mean "could be implemented by", yes. If you mean "has the is-a-kind-of relationship to", as in "a string is a kind of object" then no. The C# type system does not have a subtyping relationship between reference types and pointer types.
which has reference count
Implementations of the CLR are permitted to use reference counting semantics but are not required to do so, and most do not.
and knows about the type compatibility.
I'm not sure what this means. Objects know their own actual type. References have a static type which is compatible with the actual type in verifiable code. Compatibility checking is implemented by the runtime's verifier when the IL is analyzed.
My question is not about how a value type is different than a
reference type, but more about how a reference is implemented.
How references are implemented is, not surprisingly, an implementation detail.
Can somebody provide me with a complete, but hopefully simple explanation about what references really are in C#
References are things that act as references are specified to act by the C# language specification. That is:
objects (of reference type) have identity independent from the values of their fields
any object may have a reference to it
such a reference is a value which may be passed around like any other value
equality comparison is implemented for those values
two references are equal if and only if they refer to the same object; that is, references reify object identity
there is a unique null reference which refers to no object and is unequal to any valid reference to an object
A static type is always known for any reference value, including the null reference
If the reference is non-null then the static type of the reference is always compatible with the actual type of the referent. So for example, if we have a reference to a string, the static type of the reference could be string or object or IEnumerable, but it cannot be Giraffe. (Obviously if the reference is null then there is no referent to have a type.)
There are probably a few rules that I've missed, but that gets across the idea. References are anything that behaves like a reference. That's what you should be concentrating on. References are a useful abstraction because they are the abstraction which enables object identity independent of object value.
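A few of the rules listed above can be checked directly. This is only a small sketch; the Point class and the method names are hypothetical and exist just for illustration.

```csharp
using System;

// Hypothetical class for illustration only.
class Point
{
    public int X;
    public Point(int x) { X = x; }
}

static class IdentityDemo
{
    // Two references to the same object compare equal.
    public static bool SameObject()
    {
        var a = new Point(1);
        var b = a; // copies the reference, not the object
        return object.ReferenceEquals(a, b);
    }

    // Identity is independent of field values: equal fields, different objects.
    public static bool EqualFieldsDifferentObjects()
    {
        var a = new Point(1);
        var c = new Point(1);
        return object.ReferenceEquals(a, c);
    }

    // The null reference is unequal to any reference to an object.
    public static bool ObjectEqualsNull()
    {
        object o = new Point(1);
        return object.ReferenceEquals(o, null);
    }

    // The static type (object) is compatible with the actual type (string).
    public static bool ActualTypeIsString()
    {
        object o = "hello";
        return o.GetType() == typeof(string);
    }

    static void Main()
    {
        Console.WriteLine(SameObject());                  // True
        Console.WriteLine(EqualFieldsDifferentObjects()); // False
        Console.WriteLine(ObjectEqualsNull());            // False
        Console.WriteLine(ActualTypeIsString());          // True
    }
}
```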
and a bit about how they are implemented?
In practice, objects of reference type in C# are implemented as blocks of memory which begin with a small header that contains information about the object, and references are implemented as pointers to that block. This simple scheme is then made more complicated by the fact that we have a multigenerational mark-and-sweep compacting collector; it must somehow know the graph of references so that it can move objects around in memory when compacting the heap, without losing track of referential identity.
As an exercise you might consider how you would implement such a scheme. It builds character to try to figure out how you would build a system where references are pointers and objects can move in memory. How would you do it?
it is hard for me to understand what really is a reference when I have tried to explain to my colleagues why a parameter sent by reference can not be stored inside a closure
This is tricky. It is important to understand that a reference to a variable -- a ref parameter in C# -- and a reference to an object of reference type are conceptually similar but actually different things.
In C# you can think of a reference to a variable as an alias. That is, when you say
void M()
{
    int x = 123;
    N(ref x);
}
void N(ref int y)
{
    y = 456;
}
Essentially what we are saying is that x and y are different names for the same variable. The ref is an unfortunate choice of syntax because it emphasizes the implementation detail -- that behind the scenes, y is a special "reference to variable" type -- and not the semantics of the operation, which is that logically y is now just another name for x; we have two names for the same variable.
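To make the aliasing visible, here is a small runnable sketch that contrasts a ref parameter with an ordinary by-value parameter. The class name AliasDemo and the helper ByValue are assumptions added for the comparison.

```csharp
using System;

static class AliasDemo
{
    // y is an alias for the caller's variable.
    static void N(ref int y)
    {
        y = 456; // writes into the caller's variable
    }

    // y is a separate variable holding a copy of the argument.
    static void ByValue(int y)
    {
        y = 456; // changes only the local copy
    }

    public static int AfterRefCall()
    {
        int x = 123;
        N(ref x);
        return x;
    }

    public static int AfterValueCall()
    {
        int x = 123;
        ByValue(x);
        return x;
    }

    static void Main()
    {
        Console.WriteLine(AfterRefCall());   // 456
        Console.WriteLine(AfterValueCall()); // 123
    }
}
```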
References to variables and references to objects are not the same thing in C#; you can see this in the fact that they have different semantics. You can compare two references to objects for equality. But there is no way in C# to say:
static bool EqualAliases(ref int y, ref int z)
{
    // return true iff y and z are both aliases for the same variable
}
the way you can with references:
static bool EqualReferences(object x, object y)
{
    return x == y;
}
Behind the scenes both references to variables and references to objects are implemented by pointers. The difference is that a reference to a variable might refer to a variable on the short-term storage pool (aka "the stack"), whereas a reference to an object is a pointer to the heap-allocated object header. That's why the CLR restricts you from storing a reference to a variable into long-term storage; it does not know if you are keeping a long-term reference to something that will be dead soon.
Your best bet to understand how both kinds of references are implemented as pointers is to take a step down from the C# type system into the CLI type system which underlies it. Chapter 8 of the CLI specification should prove interesting reading; it describes different kinds of managed pointers and what each is used for.
References in C# are very similar to C++ references. Yes, indeed, underneath there is garbage collection magic going on, but I would say how that works is a different and larger topic.
C# references are similar to C++ references/immutable pointers: No pointer arithmetic, etc - but you can reassign them (Thanks Ben!).
I'd say in practice, one difference is that since pointers aren't generally available in C# (the unsafe keyword and its associated pointers are again a different and larger topic), you'll find yourself using the "out" keyword to do what pointer-to-pointer used to do.
Also, you are correct in asserting that references carry type information. All reference types in C# derive from the Object class, which itself has a GetType() method.
Be advised, however, structs - which are generally treated as value, not reference - also have GetType().

What does .Net do when you declare an object without an instance?

I'd like to know how the .NET Framework handles an object that is declared but not instantiated.
For example, I declare an object like
DropDownList ddl;
and do nothing about it. I know that I should do something with this variable, and I get a warning about it, but what I don't know is where it will be stored.
Is there a lookup table that stores the data of all declared variables? Or is there a virtual reference for every declaration?
Edit: I just wanted to know how memory is allocated for this object declaration.
Edit 2: Whether it's a local variable or not, I'm just talking about the memory allocation structure. I'd like to know where these references are stored.
If ddl is a field, then the value of ddl will be null, as it is a reference type.
Any attempt to call a member on it will result in a NullReferenceException.
If it is a local variable it will simply be unassigned.
Value types will get the default(T) of their type.
The compiler itself may remove the call completely, depending on where it was declared, but this is an implementation detail.
If you are talking about a local variable, then the compiler can simply optimize it out of existence, since no one can be using it (if you attempted to use it without initializing it, the compiler would have protested with an error). In fact the .NET 4 compiler did this for me when I tested just moments ago.
If you are talking about a field in a class then it is initialized with the default value for its type as part of the object construction.
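The field-versus-local distinction in the answers above can be sketched as follows. The Page class and its members are hypothetical stand-ins; a plain object field is used in place of DropDownList so the sketch has no dependencies.

```csharp
using System;

// Hypothetical class; Ddl stands in for the DropDownList field in the question.
class Page
{
    public object Ddl; // reference-type field: defaults to null
    public int Count;  // value-type field: defaults to default(int), i.e. 0
}

static class DefaultsDemo
{
    public static bool FieldIsNull()
    {
        return new Page().Ddl == null;
    }

    public static int FieldCount()
    {
        return new Page().Count;
    }

    static void Main()
    {
        object local;
        // Console.WriteLine(local); // would not compile: use of unassigned local variable (CS0165)
        local = "assigned";
        Console.WriteLine(local);         // assigned

        Console.WriteLine(FieldIsNull()); // True
        Console.WriteLine(FieldCount());  // 0
    }
}
```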
From your description, it sounds like you're talking about a local variable. When you declare a local variable in usual implementations and without any optimizations, then space is reserved for it on the stack (most probably), with a null reference as its value.
You could look into the StackFrame class if you want to inspect further (I've never used it).
The variable is stored in your assembly. It will always have its default value, null.
In release mode (compiler is set to optimize) it's optimized and it is not stored anywhere.
If you want to know more about IL and how the compiler works, Wikipedia has a good article to start.
All variables are declared in a class or a method. Variables declared in a class can be listed using .NET Reflection:
class Class1 { private int i; public string s; }
typeof(Class1).GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic); // returns all instance fields
typeof(Class1).GetFields(); // returns all public fields
typeof(Class1).GetProperties(); // returns all public properties
Variables declared in a method cannot be inspected with .NET Reflection mechanisms.

What's wrong with this C# struct?

Note: My question has several parts to it. I'd appreciate it if you would please answer each of the questions, instead of simply telling me what to do to get this to compile. :)
I'm not by any means good with C#. In fact, the reason why I don't know much about it is my class is focused on making efficient Algorithms and not really on teaching us .NET. Nevertheless all of our programs must be written in .NET and it hasn't been a problem until just now. I have the following code, but it won't compile and I don't really understand why. I have a gut feeling that this should be rewritten altogether, but before I do that, I want to know WHY this isn't allowed.
The point of the struct is to create a linked list like structure so I can add another node to the end of the "list" and then traverse and recall the nodes in reverse order
private struct BackPointer
{
    public BackPointer previous;
    public string a;
    public string b;
    public BackPointer(BackPointer p, string aa, string bb)
    {
        previous = p;
        a = aa;
        b = bb;
    }
}
then later in my code I have something to the effect of
BackPointer pointer = new BackPointer();
pointer = new BackPointer(pointer, somestring_a, somestring_b);
The compile error I'm getting is Struct member 'MyClass.BackPointer.previous' of type 'MyClass.BackPointer' causes a cycle in the struct layout
This seems to be an obvious error. It doesn't like the fact that I am passing in the struct in the constructor of the same struct. But why is that not allowed? I would imagine this code would just create a new node in the list and return this node with a pointer back to the previous node, but apparently that's not what would happen. So what would actually happen then? Lastly, what is the recommended way to resolve this? I was thinking of just marking it as unmanaged and handling my pointers manually, but I only really know how to do that in C++. I don't really know what could go wrong in C#.
That's not a pointer; it's an actual embedded struct value.
The whole point of structs is that they're (almost) never pointers.
You should use a class instead.
But why is that not allowed?
It's a struct - a value type. That means wherever you've got a variable of that type, that variable contains all the fields within the struct, directly inline. If something contains itself (or creates a more complicated cycle) then you clearly can't allocate enough space for it - because it's got to have enough space for all its fields and another copy of itself.
Lastly what is the recommended way to resolve this?
Write a class instead of a struct. Then the value of the variable will be a reference to an instance, not the data itself. That's how you get something close to "a pointer" in C#. (Pointers and references are different, mind you.)
I suggest you read my article on value types and reference types for more information - this is an absolutely critical topic to understand in C#.
Backpointer HAS to exist before creating a Backpointer, because you can't have a Backpointer without another Backpointer (which would then need another Backpointer and on and on). You simply can't create a Backpointer based on the way you've created it, because, as a struct, Backpointer can never be null.
In other words, it's impossible to create a Backpointer with this code. The compiler knows that, and so it forces you to make something that would work logically.
Structs are stored by value. In this case, your struct stores within itself another instance of the same struct. That struct stores within itself another struct and so on. Therefore this is impossible. It is like saying that every person in the world must have 1 child. There is no way this is possible.
What you need to use is a class. Classes store by reference, which means that it does not store the class within itself, it only stores a reference to that class.
A CLR struct is by definition a value type. What this means in your context is that the compiler needs to know the exact layout of the type. However, it cannot know how to lay out a type which contains an instance of itself - does that sound reasonable? Change the struct to a class (which makes your BackPointer a reference type) and you'll see it works out of the box. The reason is that a field of any reference type always has the same layout - it is basically just a "pointer" to some location on the managed heap. I strongly recommend reading a bit about the basics of the C# or CLI type system.
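The advice in the answers above, switching from a struct to a class, can be sketched like this. The member names follow the question; the sample strings and the WalkBack helper are assumptions for illustration. Because Previous is a reference, it can be null for the first node, which is what ends the chain.

```csharp
using System;
using System.Collections.Generic;

// Same shape as the struct in the question, but declared as a class, so
// Previous holds a reference (possibly null) rather than an embedded copy.
sealed class BackPointer
{
    public readonly BackPointer Previous;
    public readonly string A;
    public readonly string B;

    public BackPointer(BackPointer previous, string a, string b)
    {
        Previous = previous;
        A = a;
        B = b;
    }
}

static class BackPointerDemo
{
    // Appends three nodes, then walks the chain in reverse order of insertion.
    public static List<string> WalkBack()
    {
        BackPointer pointer = null; // empty list
        pointer = new BackPointer(pointer, "a1", "b1");
        pointer = new BackPointer(pointer, "a2", "b2");
        pointer = new BackPointer(pointer, "a3", "b3");

        var seen = new List<string>();
        for (var node = pointer; node != null; node = node.Previous)
        {
            seen.Add(node.A);
        }
        return seen;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", WalkBack())); // a3,a2,a1
    }
}
```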

Why do we need struct? (C#)

To use a struct, we need to instantiate the struct and use it just like a class. Then why don't we just create a class in the first place?
A struct is a value type so if you create a copy, it will actually physically copy the data, whereas with a class it will only copy the reference to the data
A major difference between the semantics of class and struct is that structs have value semantics. What this means is that if you have two variables of the same type, they each have their own copy of the data. Thus if a variable of a given value type is set equal to another (of the same type), operations on one will not affect the other (that is, assignment of value types creates a copy). This is in sharp contrast to reference types.
There are other differences:
Value types are implicitly sealed (it is not possible to derive from a value type).
Value types can not be null.
Value types are given a default constructor that initializes the value type to its default value.
A variable of a value type is always a value of that type. Contrast this with classes where a variable of type A could refer to a instance of type B if B derives from A.
Because of the difference in semantics, it is inappropriate to refer to structs as "lightweight classes."
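The copy-on-assignment behaviour described in this answer can be seen in a few lines. The two types below are hypothetical and differ only in being a struct versus a class.

```csharp
using System;

// Hypothetical types, identical except for struct vs class.
struct PointStruct { public int X; }
class PointClass { public int X; }

static class CopyDemo
{
    // Assigning a struct copies the data, so changing b leaves a alone.
    public static int StructCopy()
    {
        var a = new PointStruct { X = 1 };
        var b = a;
        b.X = 2;
        return a.X;
    }

    // Assigning a class instance copies the reference, so a and b share one object.
    public static int ClassCopy()
    {
        var a = new PointClass { X = 1 };
        var b = a;
        b.X = 2;
        return a.X;
    }

    static void Main()
    {
        Console.WriteLine(StructCopy()); // 1
        Console.WriteLine(ClassCopy());  // 2
    }
}
```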
All of the reasons I see in other answers are interesting and can be useful, but if you want to read about why they are required (at least by the VM) and why it was a mistake for the JVM to not support them (user-defined value types), read Demystifying Magic: High-level Low-level Programming. As it stands, C# shines in talking about the potential to bring safe, managed code to systems programming. This is also one of the reasons I think the CLI is a superior platform [than the JVM] for mobile computing. A few other reasons are listed in the linked paper.
It's important to note that you'll very rarely, if ever, see an observable performance improvement from using a struct. The garbage collector is extremely fast, and in many cases will actually outperform the structs. When you add in the nuances of them, they're certainly not a first-choice tool. However, when you do need them and have profiler results or system-level constructs to prove it, they get the job done.
Edit: If you wanted an answer of why we need them as opposed to what they do, ^^^
In C#, a struct is a value type, unlike classes which are reference types. This leads to a huge difference in how they are handled, or how they are expected to be used.
You should probably read up on structs from a book. Structs in C# aren't close cousins of class like in C++ or Java.
It is a myth that structs are always created on the stack.
It is true that a struct is a value type and a class is a reference type. But remember that:
1. A reference type always goes on the heap.
2. Value types go where they were declared.
I will explain what that second line means with the example below.
Consider the following method
public void DoCalculation()
{
    int num;
    num = 2;
}
Here num is a local variable so it will be created on stack.
Now consider the below example
public class TestClass
{
    public int num;
}
public void DoCalculation()
{
    TestClass myTestClass = new TestClass();
    myTestClass.num = 2;
}
This time num is created on the heap, because it is a field of a class instance. And yes, in some cases value types perform better than reference types, as they don't require garbage collection.
Also remember:
The value of a value type is always a value of that type.
The value of a reference type is always a reference.
Also consider that if you expect a lot of instantiation, that means more heap space to deal with and more work for the garbage collector. In that case you can choose structs.
Structs have different semantics from classes. The differences are many, but the primary reasons for their existence are:
They can be explicitly laid out in memory
this allows certain interop scenarios
They may be allocated on the stack
this makes some sorts of high-performance code possible in a much simpler fashion
the difference is that a struct is a value-type
I've found them useful in 2 situations
1) Interop - you can specify the memory layout of a struct, so you can guarantee that layout when you invoke an unmanaged call.
2) Performance - in some (very limited) cases, structs can be faster than classes. In general, this requires structs to be small (I've heard 16 bytes or less) and not to be changed often.
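As a rough sketch of the interop point: the StructLayout attribute controls how a struct's fields are placed when it is marshalled to unmanaged code. The PackedHeader struct is a hypothetical example, not taken from any real native API.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical header layout for illustration only.
// Pack = 1 asks for no padding between fields when marshalled.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct PackedHeader
{
    public byte Kind;  // 1 byte
    public int Length; // 4 bytes
}

static class LayoutDemo
{
    // Marshalled size of the struct: 1 + 4 bytes with Pack = 1.
    public static int PackedSize()
    {
        return Marshal.SizeOf<PackedHeader>();
    }

    static void Main()
    {
        Console.WriteLine(PackedSize()); // 5
    }
}
```

Without Pack = 1, the runtime is free to insert padding after Kind to align Length, so the marshalled size would usually be larger.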
One of the main reasons is that, when used as local variables during a method call, structs are allocated on the stack.
Stack allocation is cheap, but the big difference is that de-allocation is also very cheap. In this situation, the garbage collector doesn't have to track structs -- they're removed when returning from the method that allocated them when the stack frame is popped.
edit - clarified my post re: Jon Skeet's comment.
A struct is a value type (like Int32), whereas a class is a reference type. Local struct variables are typically created on the stack rather than the heap. Also, when a struct is passed to a method, a copy of the struct is passed, but when a class instance is passed, a reference is passed.
If you need to create your own datatype, say, then a struct is often a better choice than a class as you can use it just like the built-in value types in the .NET framework. There are some good struct examples you can read here.

Are arrays or lists passed by default by reference in c#?

Do they? Or to speed up my program should I pass them by reference?
The reference is passed by value.
Arrays in .NET are object on the heap, so you have a reference. That reference is passed by value, meaning that changes to the contents of the array will be seen by the caller, but reassigning the array won't:
void Foo(int[] data) {
    data[0] = 1; // caller sees this
}
void Bar(int[] data) {
    data = new int[20]; // but not this
}
If you add the ref modifier, the reference is passed by reference - and the caller would see either change above.
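The three cases described above can be put side by side in a runnable sketch. Foo and Bar follow the answer; BarRef and the wrapper methods are assumptions added to show the ref variant.

```csharp
using System;

static class ArrayPassingDemo
{
    // The array reference is passed by value: element writes are visible to the caller.
    static void Foo(int[] data)
    {
        data[0] = 1;
    }

    // Reassigning the parameter only changes the local copy of the reference.
    static void Bar(int[] data)
    {
        data = new int[20];
    }

    // With ref, the parameter aliases the caller's variable, so reassignment is visible.
    static void BarRef(ref int[] data)
    {
        data = new int[20];
    }

    public static int FirstElementAfterFoo()
    {
        var a = new int[3];
        Foo(a);
        return a[0];
    }

    public static int LengthAfterBar()
    {
        var a = new int[3];
        Bar(a);
        return a.Length;
    }

    public static int LengthAfterBarRef()
    {
        var a = new int[3];
        BarRef(ref a);
        return a.Length;
    }

    static void Main()
    {
        Console.WriteLine(FirstElementAfterFoo()); // 1
        Console.WriteLine(LengthAfterBar());       // 3
        Console.WriteLine(LengthAfterBarRef());    // 20
    }
}
```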
They are passed by value (as are all parameters that are neither ref nor out), but the value is a reference to the object, so they are effectively passed by reference.
Yes, they are passed by reference by default in C#. All objects in C# are, except for value types. To be a little bit more precise, they're passed "by reference by value"; that is, the value of the variable that you see in your methods is a reference to the original object passed. This is a small semantic point, but one that can sometimes be important.
(1) No one explicitly answered the OP's question, so here goes:
No. Explicitly passing the array or list as a reference will not affect performance.
What the OP feared might be happening is avoided because the function is already operating on a reference (which was passed by value). The top answer nicely explains what this means, giving an Ikea way to answer the original question.
(2) Good advice for everyone:
Read Eric Lippert's advice on when/how to approach optimization. Premature optimization is the root of much evil.
(3) Important, not already mentioned:
Use cases that require passing anything - values or references - by reference are rare.
Doing so gives you extra ways to shoot yourself in the foot, which is why C# makes you use the "ref" keyword on the method call as well. Older (pre-Java) languages only made you indicate pass-by-reference on the method declaration. And this invited no end of problems. Java touts the fact that it doesn't let you do it at all.
