If we pass an array of structs as a method parameter, do we have, in the method body, a reference to the original array of structs or a new array of structs?
You'll have a reference to an array of structs.
An array is itself a reference type, so an array of structs is a single object with the struct values stored inline.
If you pass an array to a method, you pass a reference to the array object. The reference itself is passed by value.
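A minimal sketch of what that means in practice. The Point struct and the method names are made up for illustration; they are not from the question.

```csharp
using System;

struct Point
{
    public int X;
}

static class ArrayPassingDemo
{
    // The parameter is a copy of the caller's reference, so both refer to
    // the same array object and the write is visible to the caller.
    public static void MutateElement(Point[] points)
    {
        points[0].X = 42;
    }

    // Reassigning the parameter only changes the local copy of the reference.
    public static void Reassign(Point[] points)
    {
        points = new Point[] { new Point { X = -1 } };
    }

    static void Main()
    {
        var points = new Point[1];

        MutateElement(points);
        Console.WriteLine(points[0].X); // 42: the caller sees the change

        Reassign(points);
        Console.WriteLine(points[0].X); // still 42: the caller's reference is untouched
    }
}
```

If Reassign took the parameter as ref Point[] instead, the caller's variable would be replaced as well.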
When you declare an array of value types, .NET allocates its memory on the heap, not the stack, so the array is always accessed through a reference.
The only exception is stackalloc, where a block of memory is allocated on the stack. It has traditionally been used from unsafe code, and it can be faster than going through the heap.
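For what it's worth, here is a small sketch of stackalloc. It assumes C# 7.2 or later, where the result can be assigned to a Span<int> without an unsafe block; older compilers need a pointer and an unsafe context. The class and method names are invented for the example.

```csharp
using System;

static class StackallocDemo
{
    public static int SumOfSquares()
    {
        // The four ints live in the current stack frame, not on the managed heap.
        Span<int> buffer = stackalloc int[4];
        for (int i = 0; i < buffer.Length; i++)
        {
            buffer[i] = i * i;
        }

        int sum = 0;
        foreach (int value in buffer)
        {
            sum += value;
        }
        return sum; // 0 + 1 + 4 + 9
    }

    static void Main()
    {
        Console.WriteLine(SumOfSquares()); // 14
    }
}
```

The buffer disappears when the method returns, so it must not be stored anywhere that outlives the call.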
Array is a class in the .NET Framework, so an array of structs is a reference type. I am not commenting on how and where it is stored, stack or heap, because that is purely an implementation detail, but in Microsoft's implementation reference types go on the heap.
From what I know of C# (I believe I am right), value types are allocated on the stack and reference types are allocated on the heap. But if a field in a class is a value type, it is allocated on the heap rather than on the stack (I'm still right, right?).
With that said, I also know that every C# program is a class and is made up of classes. That should imply that any variable declared in a C# program, value type or reference type, should be allocated on the heap.
What I can infer, then, is that the stack may not be really used in a C# program. I say 'may' because there could be extraordinary cases, not that I know one, though.
You are mostly correct :)
Variables that are local to a method, however, are indeed allocated on the stack. This is the whole truth for value types. For reference types, the actual object (a string, an array, and so on) is allocated on the heap, but the pointer itself is allocated on the stack.
An instance of a reference type is stored on the heap, and so are any value-type fields it contains. The reference to that object, by contrast, is stored on the stack (when it is held in a local variable).
What 500 - Internal Server Error pointed out about variables local to a method also holds: they are allocated on the stack.
Simple code for example:
Object test=new Object();
I understand that the memory for the test object is allocated on the heap.
Quote from MSDN:
Variables of reference types store references to their data (objects)
But I really can't understand where the variable's value (the reference to the heap data) is stored: on the stack, on the heap, or somewhere else?
The test variable is stored on the stack; it holds the address of the object on the heap. The object instance itself is stored on the heap.
I suggest you read the .NET Type Fundamentals article by Jeffrey Richter:
When an object is allocated from the managed heap, the new operator
returns the memory address of the object. You usually store this
address in a variable. This is called a reference type variable
because the variable does not actually contain the object's bits;
instead, the variable refers to the object's bits.
In addition to reference types, the virtual object system supports
lightweight types called value types. Value type objects cannot be
allocated on the garbage-collected heap, and the variable representing
the object does not contain a pointer to an object; the variable
contains the object itself. Since the variable contains the object, a
pointer does not have to be dereferenced in order to manipulate the
object. This, of course, improves performance.
All types are derived from the Object class, but value types aren’t allocated on the heap. Value type variables actually contain their values. So how can these types be stored in arrays and used in methods that expect reference variables? Can somebody please explain to me how these value types are stored on the heap when they are part of an array?
Boxing and unboxing. Also see here for info pertaining to arrays specifically (part way down). Note that this applies to object arrays; a value-type array (e.g. int[]) doesn't involve any (un)boxing.
Have a look at this question:
Arrays, heap and stack and value types
You can pass the instance of a value type to a method expecting an object (ref class). In this case boxing and unboxing happens.
Value type arrays do not require boxing or unboxing!
The CLR handles arrays of value types specially. Of course an array is a reference type which is allocated on the heap, but the value type values are embedded into the heap record (not on the stack).
Similarly, when a reference type class contains a value type field, the value of the field is embedded into the record on the heap.
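To make the difference concrete, here is a rough sketch contrasting a boxed int with an int[] element. All names are illustrative only.

```csharp
using System;

static class BoxingDemo
{
    public static int BoxedCopyIsIndependent()
    {
        int i = 1;
        object boxed = i;   // boxing: a new heap object receives a copy of i
        i = 2;              // changes the local only
        return (int)boxed;  // unboxing copies the value back out: still 1
    }

    public static bool ElementTypesDiffer()
    {
        int[] ints = { 1, 2, 3 };        // values stored inline in the array
        object[] objects = { 1, 2, 3 };  // each element is a reference to a boxed int
        return ints.GetType().GetElementType() == typeof(int)
            && objects.GetType().GetElementType() == typeof(object);
    }

    static void Main()
    {
        Console.WriteLine(BoxedCopyIsIndependent()); // 1
        Console.WriteLine(ElementTypesDiffer());     // True
    }
}
```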
Value types may be allocated on the stack.
This can happen only if they are parameters, local variables, or fields of another value type that is itself allocated on the stack.
Value types in arrays and in class fields are stored inline in the array or object rather than behind a pointer, so value types give more local memory access (a performance improvement).
In an array of value types, value n sits right after value n-1 in memory, which is not guaranteed for the objects referred to by an array of reference types (including boxed values in an object array, which also carry no guarantee of contiguity). In an array of reference types, it is the references that are contiguous.
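One way to observe the contiguity, as a sketch only: it assumes .NET 5 or later, where System.Runtime.CompilerServices.Unsafe is available in the base library. The demo class name is made up.

```csharp
using System;
using System.Runtime.CompilerServices;

static class ContiguityDemo
{
    // Distance in bytes between element 0 and element 1 of an int[].
    public static long StrideOfIntArray()
    {
        int[] values = { 10, 20, 30 };
        return (long)Unsafe.ByteOffset(ref values[0], ref values[1]);
    }

    static void Main()
    {
        // Adjacent ints sit sizeof(int) bytes apart, with nothing in between.
        Console.WriteLine(StrideOfIntArray()); // 4
    }
}
```

For an object[] holding boxed ints, the same measurement would give the size of a reference, and it would say nothing about where the boxes themselves live.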
In .NET, a value type object such as an int is stored directly in memory.
A reference type object requires separate allocations of memory for the reference and for the object, and the object is stored on the .NET object heap.
An array is created on the heap, so how is an array of value types such as int[] stored on the heap? Does it mean a value type object can be stored on the heap without boxing?
Yes, you are right. I suggest you read this:
https://ericlippert.com/2010/09/30/the-truth-about-value-types/
It's very very good, and it explains nearly everything you'll ever want to know.
Yes, an array is one way in which a value type value can be stored on the heap without boxing. Another is just having it in a normal class:
public class Foo
{
    int value1;
    string name;
    // etc
}
All the variables associated with an instance of Foo are stored on the heap. The value of value1 is just the int, whereas the value of name is a string reference.
This is why the claim that "value types are stored on the stack, reference types are stored on the heap" is so obviously incorrect.
However, as Eric Lippert is rightly fond of pointing out, the stack/heap distinction is an implementation detail. For example, a future version of the CLR could store some objects on the stack, if it could work out that they wouldn't be needed after the method terminated.
Yes, it means that no boxing is done for each element, because the entire array as a whole is "boxed" inside an Array object (although that's not what it's called).
There's really no requirement that says a value type has to be boxed before being placed on the heap. You can place a value type on the heap in three ways:
By wrapping it inside a regular object.
By boxing it.
By wrapping it inside an array object.
(There might be more ways but I don't think I've missed any.)
Just think of it this way: an object's location in memory is determined by what kind of type it is and where it was declared. If the object is a value type, its value is stored where you declared the variable. If the object is a reference type, its reference is stored where you declared the variable, while the actual object instance exists on the heap.
When you declare a local variable, you are declaring the variable on the stack. Therefore a value type's value will be on the stack. A reference type's reference will be on the stack, and the object instance is still on the heap.
If you declare an instance variable within a class (a reference type), you are effectively declaring the instance variables in the heap. A value type's value will be in the heap (in the object instance). A reference type's reference will also be in the heap (in the object instance), the object instance will be elsewhere in the heap.
If you declare an instance variable within a struct (a value type), where it resides depends on where the underlying struct was declared.
In the case of an array of ints (int[]), arrays are reference types, and you can think of the int values as "fields" of that type, so your integers are effectively on the heap.
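A quick sketch of the "stored where you declared it" rule. Counter and Holder are hypothetical types invented for this example.

```csharp
using System;

struct Counter
{
    public int Count;
}

class Holder
{
    public Counter Inner; // the struct's value is stored inline in the Holder object
}

static class StorageLocationDemo
{
    public static int CopyIsIndependent()
    {
        var holder = new Holder();
        holder.Inner.Count = 1;       // writes into the Holder object

        Counter local = holder.Inner; // copies the value into a local variable
        local.Count = 99;             // changes only the copy

        return holder.Inner.Count;    // still 1
    }

    public static int ReferenceIsShared()
    {
        var a = new Holder();
        Holder b = a;          // copies the reference, not the object
        b.Inner.Count = 7;     // writes into the one shared object
        return a.Inner.Count;  // 7
    }

    static void Main()
    {
        Console.WriteLine(CopyIsIndependent()); // 1
        Console.WriteLine(ReferenceIsShared()); // 7
    }
}
```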
The .NET 1.0 way of creating collection of integers (for example) was:
ArrayList list = new ArrayList();
list.Add(i); /* boxing */
int j = (int)list[0]; /* unboxing */
The penalty of using this is the lack of type safety and performance due to boxing and unboxing.
The .NET 2.0 way is to use generics:
List<int> list = new List<int>();
list.Add(i);
int j = list[0];
The price of boxing (to my understanding) is the need to create an object on the heap, copy the stack allocated integer to the new object and vice-versa for unboxing.
How do generics overcome this? Does the stack-allocated integer stay on the stack and get pointed to from the heap (I guess this is not the case, because of what would happen when it goes out of scope)? It seems there is still a need to copy it somewhere off the stack.
What is really going on?
When it comes to collections, generics make it possible to avoid boxing/unboxing by utilizing actual T[] arrays internally. List<T> for example uses a T[] array to store its contents.
The array, of course, is a reference type and is therefore (in the current version of the CLR, yada yada) stored on the heap. But since it's a T[] and not an object[], the array's elements can be stored "directly": that is, they're still on the heap, but they're on the heap in the array instead of being boxed and having the array contain references to the boxes.
So for a List<int>, for example, what you'd have in the array would "look" like this:
[ 1 2 3 ]
Compare this to an ArrayList, which uses an object[] and would therefore "look" something like this:
[ *a *b *c ]
...where *a, etc. are references to objects (boxed integers):
*a -> 1
*b -> 2
*c -> 3
Excuse those crude illustrations; hopefully you know what I mean.
Your confusion is a result of misunderstanding what the relationship is between the stack, the heap, and variables. Here's the correct way to think about it.
A variable is a storage location that has a type.
The lifetime of a variable can either be short or long. By "short" we mean "until the current function returns or throws" and by "long" we mean "possibly longer than that".
If the type of a variable is a reference type then the contents of the variable is a reference to a long-lived storage location. If the type of a variable is a value type then the contents of the variable is a value.
As an implementation detail, a storage location which is guaranteed to be short-lived can be allocated on the stack. A storage location which might be long-lived is allocated on the heap. Notice that this says nothing about "value types are always allocated on the stack." Value types are not always allocated on the stack:
int[] x = new int[10];
x[1] = 123;
x[1] is a storage location. It is long-lived; it might live longer than this method. Therefore it must be on the heap. The fact that it contains an int is irrelevant.
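Here is a small sketch of the "might live longer than this method" point, wrapping the two lines above in a hypothetical MakeArray method.

```csharp
using System;

static class LifetimeDemo
{
    public static int[] MakeArray()
    {
        int[] x = new int[10]; // x is a short-lived local holding a reference
        x[1] = 123;            // x[1] is a storage location inside the array object
        return x;              // the array, and x[1] with it, outlives this call
    }

    static void Main()
    {
        int[] result = MakeArray();
        Console.WriteLine(result[1]); // 123
    }
}
```

The local x is gone once MakeArray returns, but the storage location x[1] is still reachable through result, which is why it cannot have lived in MakeArray's stack frame.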
You correctly say why a boxed int is expensive:
The price of boxing is the need to create an object on the heap, copy the stack allocated integer to the new object and vice-versa for unboxing.
Where you go wrong is to say "the stack allocated integer". It doesn't matter where the integer was allocated. What matters was that its storage contained the integer, instead of containing a reference to a heap location. The price is the need to create the object and do the copy; that's the only cost that is relevant.
So why isn't a generic variable costly? If you have a variable of type T, and T is constructed to be int, then you have a variable of type int, period. A variable of type int is a storage location, and it contains an int. Whether that storage location is on the stack or the heap is completely irrelevant. What is relevant is that the storage location contains an int, instead of containing a reference to something on the heap. Since the storage location contains an int, you do not have to take on the costs of boxing and unboxing: allocating new storage on the heap and copying the int to the new storage.
Is that now clear?
Generics allow the list's internal array to be typed int[] instead of effectively object[], which would require boxing.
Here's what happens without generics:
You call Add(1).
The integer 1 is boxed into an object, which requires a new object to be constructed on the heap.
This object is passed to ArrayList.Add().
The boxed object is stuffed into an object[].
There are three levels of indirection here: ArrayList -> object[] -> object -> int.
With generics:
You call Add(1).
The int 1 is passed to List<int>.Add().
The int is stuffed into an int[].
So there are only two levels of indirection: List<int> -> int[] -> int.
A few other differences:
The non-generic method will require a sum of 8 or 12 bytes (one pointer, one int) to store the value: 4 or 8 in one allocation and 4 in the other. In practice it will be more because of the boxed object's header, alignment, and padding. The generic method will require only 4 bytes of space in the array.
The non-generic method requires allocating a boxed int; the generic method does not. Avoiding that allocation is faster and reduces GC churn.
The non-generic method requires casts to extract values. This is not typesafe and it's a bit slower.
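The type-safety difference can be sketched like this (class and method names are made up for the example):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class TypeSafetyDemo
{
    public static bool ArrayListFailsAtRunTime()
    {
        var list = new ArrayList();
        list.Add(1);      // boxed int
        list.Add("two");  // compiles: ArrayList accepts any object
        try
        {
            _ = (int)list[1]; // the cast is only checked at run time
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static int GenericListNeedsNoCast()
    {
        var list = new List<int>();
        list.Add(1);
        // list.Add("two"); // would not compile: the element type is int
        return list[0];     // no cast, no unboxing
    }

    static void Main()
    {
        Console.WriteLine(ArrayListFailsAtRunTime()); // True
        Console.WriteLine(GenericListNeedsNoCast());  // 1
    }
}
```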
An ArrayList only handles the type object, so using this class requires casting to and from object. In the case of value types, this casting involves boxing and unboxing.
When you use a generic list, the runtime (the JIT compiler) generates specialized code for that value type, so that the actual values are stored in the list rather than references to objects that contain the values. Therefore no boxing is required.
The price of boxing (to my understanding) is the need to create an object on the heap, copy the stack allocated integer to the new object and vice-versa for unboxing.
I think you are assuming that value types are always instantiated on the stack. This is not the case - they can be created either on the heap, on the stack or in registers. For more information about this please see Eric Lippert's article: The Truth About Value Types.
In .NET 1, when the Add method is called:
Space is allocated on the heap for a new object (the box), and a reference to it is made
The contents of the i variable are copied into that new object
A copy of the reference is put at the end of the list
In .NET 2:
A copy of the variable i is passed to the Add method
A copy of that copy is put at the end of the list
Yes, the i variable is still copied (after all, it's a value type, and value types are always copied - even if they're just method parameters). But there's no redundant copy made on the heap.
Why are you thinking in terms of WHERE the values/objects are stored? In C#, value types can be stored on the stack as well as on the heap, depending on what the CLR chooses.
Where generics make a difference is WHAT is stored in the collection. In the case of ArrayList, the collection contains references to boxed objects, whereas the List<int> contains the int values themselves.
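If you want to see the effect rather than take it on faith, here is a rough sketch that compares how many bytes each collection allocates while adding ints. It assumes a runtime that provides GC.GetAllocatedBytesForCurrentThread (.NET Core 3.0 or later); the class name and the count of 1000 are arbitrary choices for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class AllocationDemo
{
    const int Count = 1000;

    public static long BytesAllocatedByArrayList()
    {
        var list = new ArrayList(Count); // backing object[] allocated up front
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < Count; i++)
        {
            list.Add(i); // boxes each int into a new heap object
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static long BytesAllocatedByGenericList()
    {
        var list = new List<int>(Count); // backing int[] allocated up front
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < Count; i++)
        {
            list.Add(i); // copies the int straight into the int[]
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    static void Main()
    {
        Console.WriteLine("ArrayList: " + BytesAllocatedByArrayList() + " bytes");
        Console.WriteLine("List<int>: " + BytesAllocatedByGenericList() + " bytes");
    }
}
```

The exact figures depend on the runtime and platform, so treat them as indicative; the point is only that the ArrayList loop allocates one box per element while the List<int> loop does not.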