Boxing and unboxing with generics - c#

The .NET 1.0 way of creating collection of integers (for example) was:
ArrayList list = new ArrayList();
list.Add(i); /* boxing */
int j = (int)list[0]; /* unboxing */
The penalty of using this is the lack of type safety and performance due to boxing and unboxing.
The .NET 2.0 way is to use generics:
List<int> list = new List<int>();
list.Add(i);
int j = list[0];
The price of boxing (to my understanding) is the need to create an object on the heap, copy the stack allocated integer to the new object and vice-versa for unboxing.
How does the use of generics overcome this? Does the stack-allocated integer stays on the stack and being pointed to from the heap (I guess this is not the case because of what will happen when it will get out of scope)? It seems like there is still a need of copying it somewhere else out of the stack.
What is really going on?

When it comes to collections, generics make it possible to avoid boxing/unboxing by utilizing actual T[] arrays internally. List<T> for example uses a T[] array to store its contents.
The array, of course, is a reference type and is therefore (in the current version of the CLR, yada yada) stored on the heap. But since it's a T[] and not an object[], the array's elements can be stored "directly": that is, they're still on the heap, but they're on the heap in the array instead of being boxed and having the array contain references to the boxes.
So for a List<int>, for example, what you'd have in the array would "look" like this:
[ 1 2 3 ]
Compare this to an ArrayList, which uses an object[] and would therefore "look" something like this:
[ *a *b *c ]
...where *a, etc. are references to objects (boxed integers):
*a -> 1
*b -> 2
*c -> 3
Excuse those crude illustrations; hopefully you know what I mean.

Your confusion is a result of misunderstanding what the relationship is between the stack, the heap, and variables. Here's the correct way to think about it.
A variable is a storage location that has a type.
The lifetime of a variable can either be short or long. By "short" we mean "until the current function returns or throws" and by "long" we mean "possibly longer than that".
If the type of a variable is a reference type then the contents of the variable is a reference to a long-lived storage location. If the type of a variable is a value type then the contents of the variable is a value.
As an implementation detail, a storage location which is guaranteed to be short-lived can be allocated on the stack. A storage location which might be long-lived is allocated on the heap. Notice that this says nothing about "value types are always allocated on the stack." Value types are not always allocated on the stack:
int[] x = new int[10];
x[1] = 123;
x[1] is a storage location. It is long-lived; it might live longer than this method. Therefore it must be on the heap. The fact that it contains an int is irrelevant.
You correctly say why a boxed int is expensive:
The price of boxing is the need to create an object on the heap, copy the stack allocated integer to the new object and vice-versa for unboxing.
Where you go wrong is to say "the stack allocated integer". It doesn't matter where the integer was allocated. What matters was that its storage contained the integer, instead of containing a reference to a heap location. The price is the need to create the object and do the copy; that's the only cost that is relevant.
So why isn't a generic variable costly? If you have a variable of type T, and T is constructed to be int, then you have a variable of type int, period. A variable of type int is a storage location, and it contains an int. Whether that storage location is on the stack or the heap is completely irrelevant. What is relevant is that the storage location contains an int, instead of containing a reference to something on the heap. Since the storage location contains an int, you do not have to take on the costs of boxing and unboxing: allocating new storage on the heap and copying the int to the new storage.
Is that now clear?

Generics allows the list's internal array to be typed int[] instead of effectively object[], which would require boxing.
Here's what happens without generics:
You call Add(1).
The integer 1 is boxed into an object, which requires a new object to be constructed on the heap.
This object is passed to ArrayList.Add().
The boxed object is stuffed into an object[].
There are three levels of indirection here: ArrayList -> object[] -> object -> int.
With generics:
You call Add(1).
The int 1 is passed to List<int>.Add().
The int is stuffed into an int[].
So there are only two levels of indirection: List<int> -> int[] -> int.
A few other differences:
The non-generic method will require a sum of 8 or 12 bytes (one pointer, one int) to store the value, 4/8 in one allocation and 4 in the other. And this will probably be more due to alignment and padding. The generic method will require only 4 bytes of space in the array.
The non-generic method requires allocating a boxed int; the generic method does not. This is faster and reduces GC churn.
The non-generic method requires casts to extract values. This is not typesafe and it's a bit slower.

An ArrayList only handles the type object so to use this class requires casting to and from object. In the case of value types, this casting involves boxing and unboxing.
When you use a generic list the compiler outputs specialized code for that value type so that the actual values are stored in the list rather than a reference to objects that contain the values. Therefore no boxing is required.
The price of boxing (to my understanding) is the need to create an object on the heap, copy the stack allocated integer to the new object and vice-versa for unboxing.
I think you are assuming that value types are always instantiated on the stack. This is not the case - they can be created either on the heap, on the stack or in registers. For more information about this please see Eric Lippert's article: The Truth About Value Types.

In .NET 1, when the Add method is called:
Space is allocated on the heap; a new reference is made
The contents of the i variable is copied into the reference
A copy of the reference is put at the end of the list
In .NET 2:
A copy of the variable i is passed to the Add method
A copy of that copy is put at the end of the list
Yes, the i variable is still copied (after all, it's a value type, and value types are always copied - even if they're just method parameters). But there's no redundant copy made on the heap.

Why are you thinking in terms of WHERE the values\objects are stored? In C# value types can be stored on stack as well as heap depending upon what the CLR chooses.
Where generics make a difference is WHAT is stored in the collection. In case of ArrayList the collection contains references to boxed objects where as the List<int> contains int values themselves.

Related

How does C# LINQ change the data source and how do I prevent it? [duplicate]

Some guy asked me this question couple of months ago and I couldn't explain it in detail. What is the difference between a reference type and a value type in C#?
I know that value types are int, bool, float, etc and reference types are delegate, interface, etc. Or is this wrong, too?
Can you explain it to me in a professional way?
Your examples are a little odd because while int, bool and float are specific types, interfaces and delegates are kinds of type - just like struct and enum are kinds of value types.
I've written an explanation of reference types and value types in this article. I'd be happy to expand on any bits which you find confusing.
The "TL;DR" version is to think of what the value of a variable/expression of a particular type is. For a value type, the value is the information itself. For a reference type, the value is a reference which may be null or may be a way of navigating to an object containing the information.
For example, think of a variable as like a piece of paper. It could have the value "5" or "false" written on it, but it couldn't have my house... it would have to have directions to my house. Those directions are the equivalent of a reference. In particular, two people could have different pieces of paper containing the same directions to my house - and if one person followed those directions and painted my house red, then the second person would see that change too. If they both just had separate pictures of my house on the paper, then one person colouring their paper wouldn't change the other person's paper at all.
Value type:
Holds some value not memory addresses
Example:
Struct
Storage:
TL;DR: A variable's value is stored wherever it is decleared. Local variables live on the stack for example, but when declared inside a class as a member it lives on the heap tightly coupled with the class it is declared in.
Longer: Thus value types are stored wherever they are declared.
E.g.: an int's value inside a function as a local variable would be stored on the stack, whilst an in int's value declared as member in a class would be stored on the heap with the class it is declared in. A value type on a class has a lifetype that is exactly the same as the class it is declared in, requiring almost no work by the garbage collector. It's more complicated though, i'd refer to #JonSkeet's book "C# In Depth" or his article "Memory in .NET" for a more concise explenation.
Advantages:
A value type does not need extra garbage collection. It gets garbage collected together with the instance it lives in. Local variables in methods get cleaned up upon method leave.
Drawbacks:
When large set of values are passed to a method the receiving variable actually copies so there are two redundant values in memory.
As classes are missed out.it losses all the oop benifits
Reference type:
Holds a memory address of a value not value
Example:
Class
Storage:
Stored on heap
Advantages:
When you pass a reference variable to a method and it changes it indeed changes the original value whereas in value types a copy of the given variable is taken and that's value is changed.
When the size of variable is bigger reference type is good
As classes come as a reference type variables, they give reusability, thus benefitting Object-oriented programming
Drawbacks:
More work referencing when allocating and dereferences when reading the value.extra overload for garbage collector
I found it easier to understand the difference of the two if you know how computer allocate stuffs in memory and know what a pointer is.
Reference is usually associated with a pointer. Meaning the memory address where your variable reside is actually holding another memory address of the actual object in a different memory location.
The example I am about to give is grossly over simplified, so take it with a grain of salt.
Imagine computer memory is a bunch of PO boxes in a row (starting w/ PO Box 0001 to PO Box n) that can hold something inside it. If PO boxes doesn't do it for you, try a hashtable or dictionary or an array or something similar.
Thus, when you do something like:
var a = "Hello";
the computer will do the following:
allocate memory (say starting at memory location 1000 for 5 bytes) and put H (at 1000), e (at 1001), l (at 1002), l (at 1003) and o (at 1004).
allocate somewhere in memory (say at location 0500) and assigned it as the variable a.
So it's kind of like an alias (0500 is a).
assign the value at that memory location (0500) to 1000 (which is where the string Hello start in memory). Thus the variable a is holding a reference to the actual starting memory location of the "Hello" string.
Value type will hold the actual thing in its memory location.
Thus, when you do something like:
var a = 1;
the computer will do the following:
allocate a memory location say at 0500 and assign it to variable a (the same alias thing)
put the value 1 in it (at memory location 0500).
Notice that we are not allocating extra memory to hold the actual value (1).
Thus a is actually holding the actual value and that's why it's called value type.
This is from a post of mine from a different forum, about two years ago. While the language is vb.net (as opposed to C#), the Value Type vs. Reference type concepts are uniform throughout .net, and the examples still hold.
It is also important to remember that within .net, ALL types technically derive from the base type Object. The value types are designed to behave as such, but in the end they also inherit the functionality of base type Object.
A. Value Types are just that- they represent a distinct area in memory where a discrete VALUE is stored. Value types are of fixed memory size and are stored in the stack, which is a collection of addresses of fixed size.
When you make a statement like such:
Dim A as Integer
DIm B as Integer
A = 3
B = A
You have done the following:
Created 2 spaces in memory sufficient to hold 32 bit integer values.
Placed a value of 3 in the memory allocation assigned to A
Placed a value of 3 in the memory allocation assigned to B by assigning it the same value as the held in A.
The Value of each variable exists discretely in each memory location.
B. Reference Types can be of various sizes. Therefore, they can't be stored in the "Stack" (remember, the stack is a collection of memory allocations of fixed size?). They are stored in the "Managed Heap". Pointers (or "references") to each item on the managed heap are maintained in the stack (Like an Address). Your code uses these pointers in the stack to access objects stored in the managed heap. So when your code uses a reference variable, it is actually using a pointer (or "address" to an memory location in the managed heap).
Say you have created a Class named clsPerson, with a string Property Person.Name
In this case, when you make a statement such as this:
Dim p1 As clsPerson
p1 = New clsPerson
p1.Name = "Jim Morrison"
Dim p2 As Person
p2 = p1
In the case above, the p1.Name Property will Return "Jim Morrison", as you would expect. The p2.Name property will ALSO return "Jim Morrison", as you would Iintuitively expect. I believe that both p1 and p2 represent distinct addresses on the Stack. However, now that you have assigned p2 the value of p1, both p1 and p2 point to the SAME LOCATION on the managed heap.
Now COnsider THIS situation:
Dim p1 As clsPerson
Dim p2 As clsPerson
p1 = New clsPerson
p1.Name = "Jim Morrison"
p2 = p1
p2.Name = "Janis Joplin"
In this situation, You have created one new instance of the person Class on the Managed Heap with a pointer p1 on the Stack which references the object, and assigned the Name Property of the object instance a value of "Jim Morrison" again. Next, you created another pointer p2 in the Stack, and pointed it at the same address on the managed heap as that referenced by p1 (when you made the assignement p2 = p1).
Here comes the twist. When you the Assign the Name property of p2 the value "Janis Joplin" you are changing the Name property for the object REFERENCED by Both p1 and p2, such that, if you ran the following code:
MsgBox(P1.Name)
'Will return "Janis Joplin"
MsgBox(p2.Name)
'will ALSO return "Janis Joplin"Because both variables (Pointers on the Stack) reference the SAME OBJECT in memory (an Address on the Managed Heap).
Did that make sense?
Last. If you do THIS:
DIm p1 As New clsPerson
Dim p2 As New clsPerson
p1.Name = "Jim Morrison"
p2.Name = "Janis Joplin"
You now have two distinct Person Objects. However, the minute you do THIS again:
p2 = p1
You have now pointed both back to "Jim Morrison". (I am not exactly sure what happened to the Object on the Heap referenced by p2 . . . I THINK it has now gone out of scope. This is one of those areas where hopefullly someone can set me straight . . .). -EDIT: I BELIEVE this is why you would Set p2 = Nothing OR p2 = New clsPerson before making the new assignment.
Once again, if you now do THIS:
p2.Name = "Jimi Hendrix"
MsgBox(p1.Name)
MsgBox(p2.Name)
Both msgBoxes will now return "Jimi Hendrix"
This can be pretty confusing for a bit, and I will say one last time, I may have some of the details wrong.
Good Luck, and hopefully others who know better than me will come along to help clarify some of this . . .
value data type and reference data type
1) value( contain the data directly )
but
reference ( refers to the data )
2) in value( every variable has its own copy)
but
in reference (more than variable can refer to some objects)
3) in value (operation variable can`t effect on other variable )
but
in reference (variable can affect other )
4) value types are(int, bool, float)
but
reference type are (array , class objects , string )
Value Type:
Fixed memory size.
Stored in Stack memory.
Holds actual value.
Ex. int, char, bool, etc...
Reference Type:
Not fixed memory.
Stored in Heap memory.
Holds memory address of actual value.
Ex. string, array, class, etc...
"Variables that are based on value types directly contain values. Assigning one value type variable to another copies the contained value. This differs from the assignment of reference type variables, which copies a reference to the object but not the object itself." from Microsoft's library.
You can find a more complete answer here and here.
Sometimes explanations won't help especially for the beginners. You can imagine value type as data file and reference type as a shortcut to a file.
So if you copy a reference variable you only copy the link/pointer to a real data somewhere in memory. If you copy a value type, you really clone the data in memory.
This is probably wrong in esoterical ways, but, to make it simple:
Value types are values that are passed normally "by value" (so copying them). Reference types are passed "by reference" (so giving a pointer to the original value). There isn't any guarantee by the .NET ECMA standard of where these "things" are saved. You could build an implementation of .NET that is stackless, or one that is heapless (the second would be very complex, but you probably could, using fibers and many stacks)
Structs are value type (int, bool... are structs, or at least are simulated as...), classes are reference type.
Value types descend from System.ValueType. Reference type descend from System.Object.
Now.. In the end you have Value Type, "referenced objects" and references (in C++ they would be called pointers to objects. In .NET they are opaque. We don't know what they are. From our point of view they are "handles" to the object). These lasts are similar to Value Types (they are passed by copy). So an object is composed by the object (a reference type) and zero or more references to it (that are similar to value types). When there are zero references the GC will probably collect it.
In general (in the "default" implementation of .NET), Value type can go on the stack (if they are local fields) or on the heap (if they are fields of a class, if they are variables in an iterator function, if they are variables referenced by a closure, if they are variable in an async function (using the newer Async CTP)...). Referenced value can only go to the heap. References use the same rules as Value types.
In the cases of Value Type that go on the heap because they are in an iterator function, an async function, or are referenced by a closure, if you watch the compiled file you'll see that the compiler created a class to put these variables, and the class is built when you call the function.
Now, I don't know how to write long things, and I have better things to do in my life. If you want a "precise" "academic" "correct" version, read THIS:
http://blogs.msdn.com/b/ericlippert/archive/2010/09/30/the-truth-about-value-types.aspx
It's 15 minutes I'm looking for it! It's better than the msdn versions, because it's a condensed "ready to use" article.
The simplest way to think of reference types is to consider them as being "object-IDs"; the only things one can do with an object ID are create one, copy one, inquire or manipulate the type of one, or compare two for equality. An attempt to do anything else with an object-ID will be regarded as shorthand for doing the indicated action with the object referred to by that id.
Suppose I have two variables X and Y of type Car--a reference type. Y happens to hold "object ID #19531". If I say "X=Y", that will cause X to hold "object ID #19531". Note that neither X nor Y holds a car. The car, otherwise known as "object ID #19531", is stored elsewhere. When I copied Y into X, all I did was copy the ID number. Now suppose I say X.Color=Colors.Blue. Such a statement will be regarded as an instruction to go find "object ID#19531" and paint it blue. Note that even though X and Y now refer to a blue car rather than a yellow one, the statement doesn't actually affect X or Y, because both still refer to "object ID #19531", which is still the same car as it always has been.
Variable types and Reference Value are easy to apply and well applied to the domain model, facilitate the development process.
To remove any myth around the amount of "value type", I will comment on how this is handled on the platform. NET, specifically in C # (CSharp) when called APIS and send parameters by value, by reference, in our methods, and functions and how to make the correct treatment of the passages of these values​​.
Read this article Variable Type Value and Reference in C #
Suppose v is a value-type expression/variable, and r is a reference-type expression/variable
x = v
update(v) //x will not change value. x stores the old value of v
x = r
update(r) //x now refers to the updated r. x only stored a link to r,
//and r can change but the link to it doesn't .
So, a value-type variable stores the actual value (5, or "h"). A reference-type varaible only stores a link to a metaphorical box where the value is.
Before explaining the different data types available in C#, it's important to mention that C# is a strongly-typed language. This means that each variable, constant, input parameter, return type and in general every expression that evaluates to a value, has a type.
Each type contains information that will be embedded by the compiler into the executable file as metadata which will be used by the common language runtime (CLR) to guarantee type safety when it allocates and reclaims memory.
If you wanna know how much memory a specific type allocates, you can use the sizeof operator as follows:
static void Main()
{
var size = sizeof(int);
Console.WriteLine($"int size:{size}");
size = sizeof(bool);
Console.WriteLine($"bool size:{size}");
size = sizeof(double);
Console.WriteLine($"double size:{size}");
size = sizeof(char);
Console.WriteLine($"char size:{size}");
}
The output will show the number of bytes allocated by each variable.
int size:4
bool size:1
double size:8
char size:2
The information related to each type are:
The required storage space.
The maximum and minimum values. For example, the type Int32 accepts values between 2147483648 and 2147483647.
The base type it inherits from.
The location where the memory for variables will be allocated at run time.
The kinds of operations that are permitted.
The members (methods, fields, events, etc.) contained by the type. For example, if we check the definition of type int, we will find the following struct and members:
namespace System
{
[ComVisible(true)]
public struct Int32 : IComparable, IFormattable, IConvertible, IComparable<Int32>, IEquatable<Int32>
{
public const Int32 MaxValue = 2147483647;
public const Int32 MinValue = -2147483648;
public static Int32 Parse(string s, NumberStyles style, IFormatProvider provider);
...
}
}
Memory management
When multiple processes are running on an operating system and the amount of RAM isn't enough to hold it all, the operating system maps parts of the hard disk with the RAM and starts storing data in the hard disk. The operating system will use than specific tables where virtual addresses are mapped to their correspondent physical addresses to perform the request. This capability to manage the memory is called virtual memory.
In each process, the virtual memory available is organized in the following 6 sections but for the relevance of this topic, we will focus only on the stack and the heap.
Stack
The stack is a LIFO (last in, first out) data structure, with a size-dependent on the operating system (by default, for ARM, x86 and x64 machines Windows's reserve 1MB, while Linux reserve from 2MB to 8MB depending on the version).
This section of memory is automatically managed by the CPU. Every time a function declares a new variable, the compiler allocates a new memory block as big as its size on the stack, and when the function is over, the memory block for the variable is deallocated.
Heap
This region of memory isn't managed automatically by the CPU and its size is bigger than the stack. When the new keyword is invoked, the compiler starts looking for the first free memory block that fits the size of the request. and when it finds it, it is marked as reserved by using the built-in C function malloc() and a return the pointer to that location. It's also possible to deallocate a block of memory by using the built-in C function free(). This mechanism causes memory fragmentation and has to use pointers to access the right block of memory, it's slower than the stack to perform the read/write operations.
Custom and Built-in types
While C# provides a standard set of built-in types representing integers, boolean, text characters, and so on, You can use constructs like struct, class, interface, and enum to create your own types.
An example of custom type using the struct construct is:
struct Point
{
public int X;
public int Y;
};
Value and reference types
We can categorize the C# type into the following categories:
Value types
Reference types
Value types
Value types derive from the System.ValueType class and variables of this type contain their values within their memory allocation in the stack. The two categories of value types are struct and enum.
The following example shows the member of the type boolean. As you can see there is no explicit reference to System.ValueType class, this happens because this class is inherited by the struct.
namespace System
{
[ComVisible(true)]
public struct Boolean : IComparable, IConvertible, IComparable<Boolean>, IEquatable<Boolean>
{
public static readonly string TrueString;
public static readonly string FalseString;
public static Boolean Parse(string value);
...
}
}
Reference types
On the other hand, the reference types do not contain the actual data stored in a variable, but the memory address of the heap where the value is stored. The categories of reference types are classes, delegates, arrays, and interfaces.
At run time, when a reference type variable is declared, it contains the value null until an object that has been created using the keywords new is assigned to it.
The following example shows the members of the generic type List.
namespace System.Collections.Generic
{
[DebuggerDisplay("Count = {Count}")]
[DebuggerTypeProxy(typeof(Generic.Mscorlib_CollectionDebugView<>))]
[DefaultMember("Item")]
public class List<T> : IList<T>, ICollection<T>, IEnumerable<T>, IEnumerable, IList, ICollection, IReadOnlyList<T>, IReadOnlyCollection<T>
{
...
public T this[int index] { get; set; }
public int Count { get; }
public int Capacity { get; set; }
public void Add(T item);
public void AddRange(IEnumerable<T> collection);
...
}
}
In case you wanna find out the memory address of a specific object, the class System.Runtime.InteropServices provides a way to access to managed objects from unmanaged memory. In the following example, we are gonna use the static method GCHandle.Alloc() to allocate a handle to a string and then the method AddrOfPinnedObject to retrieve its address.
string s1 = "Hello World";
GCHandle gch = GCHandle.Alloc(s1, GCHandleType.Pinned);
IntPtr pObj = gch.AddrOfPinnedObject();
Console.WriteLine($"Memory address:{pObj.ToString()}");
The output will be
Memory address:39723832
References
Official documentation: https://learn.microsoft.com/en-us/cpp/build/reference/stack-stack-allocations?view=vs-2019
I think these two pictures describe it the best. This is the case in languages like C#, Java, JavaScript, and Python. For C++ references mean different, and the equivalent of reference types are pointer types (That's why you see in various documents of different languages that they are used interchangeably). One of the important things is the meaning of "Pass by Value" and "Pass by Reference". I think there are other questions about them on StackOverflow you can seek for.
There are many little details of the differences between value types and reference types that are stated explicitly by the standard and some of them are not easy to understand, especially for beginners.
See ECMA standard 33, Common Language Infrastructure (CLI). The CLI is also standardized by the ISO. I would provide a reference but for ECMA we must download a PDF and that link depends on the version number. ISO standards cost money.
One difference is that value types can be boxed but reference types generally cannot be. There are exceptions but they are quite technical.
Value types cannot have parameter-less instance constructors or finalizers and they cannot refer to themselves. Referring to themselves means for example that if there is a value type Node then a member of Node cannot be a Node. I think there are other requirements/limitations in the specifications but if so then they are not gathered together in one place.

is .net copying data or pointing to it [duplicate]

Some guy asked me this question couple of months ago and I couldn't explain it in detail. What is the difference between a reference type and a value type in C#?
I know that value types are int, bool, float, etc and reference types are delegate, interface, etc. Or is this wrong, too?
Can you explain it to me in a professional way?
Your examples are a little odd because while int, bool and float are specific types, interfaces and delegates are kinds of type - just like struct and enum are kinds of value types.
I've written an explanation of reference types and value types in this article. I'd be happy to expand on any bits which you find confusing.
The "TL;DR" version is to think of what the value of a variable/expression of a particular type is. For a value type, the value is the information itself. For a reference type, the value is a reference which may be null or may be a way of navigating to an object containing the information.
For example, think of a variable as like a piece of paper. It could have the value "5" or "false" written on it, but it couldn't have my house... it would have to have directions to my house. Those directions are the equivalent of a reference. In particular, two people could have different pieces of paper containing the same directions to my house - and if one person followed those directions and painted my house red, then the second person would see that change too. If they both just had separate pictures of my house on the paper, then one person colouring their paper wouldn't change the other person's paper at all.
Value type:
Holds some value not memory addresses
Example:
Struct
Storage:
TL;DR: A variable's value is stored wherever it is decleared. Local variables live on the stack for example, but when declared inside a class as a member it lives on the heap tightly coupled with the class it is declared in.
Longer: Thus value types are stored wherever they are declared.
E.g.: an int's value inside a function as a local variable would be stored on the stack, whilst an in int's value declared as member in a class would be stored on the heap with the class it is declared in. A value type on a class has a lifetype that is exactly the same as the class it is declared in, requiring almost no work by the garbage collector. It's more complicated though, i'd refer to #JonSkeet's book "C# In Depth" or his article "Memory in .NET" for a more concise explenation.
Advantages:
A value type does not need extra garbage collection. It gets garbage collected together with the instance it lives in. Local variables in methods get cleaned up upon method leave.
Drawbacks:
When large set of values are passed to a method the receiving variable actually copies so there are two redundant values in memory.
As classes are missed out.it losses all the oop benifits
Reference type:
Holds a memory address of a value not value
Example:
Class
Storage:
Stored on heap
Advantages:
When you pass a reference variable to a method and it changes it indeed changes the original value whereas in value types a copy of the given variable is taken and that's value is changed.
When the size of variable is bigger reference type is good
As classes come as a reference type variables, they give reusability, thus benefitting Object-oriented programming
Drawbacks:
More work referencing when allocating and dereferences when reading the value.extra overload for garbage collector
I found it easier to understand the difference of the two if you know how computer allocate stuffs in memory and know what a pointer is.
Reference is usually associated with a pointer. Meaning the memory address where your variable reside is actually holding another memory address of the actual object in a different memory location.
The example I am about to give is grossly over simplified, so take it with a grain of salt.
Imagine computer memory is a bunch of PO boxes in a row (starting w/ PO Box 0001 to PO Box n) that can hold something inside it. If PO boxes doesn't do it for you, try a hashtable or dictionary or an array or something similar.
Thus, when you do something like:
var a = "Hello";
the computer will do the following:
allocate memory (say starting at memory location 1000 for 5 bytes) and put H (at 1000), e (at 1001), l (at 1002), l (at 1003) and o (at 1004).
allocate somewhere in memory (say at location 0500) and assigned it as the variable a.
So it's kind of like an alias (0500 is a).
assign the value at that memory location (0500) to 1000 (which is where the string Hello start in memory). Thus the variable a is holding a reference to the actual starting memory location of the "Hello" string.
Value type will hold the actual thing in its memory location.
Thus, when you do something like:
var a = 1;
the computer will do the following:
allocate a memory location say at 0500 and assign it to variable a (the same alias thing)
put the value 1 in it (at memory location 0500).
Notice that we are not allocating extra memory to hold the actual value (1).
Thus a is actually holding the actual value and that's why it's called value type.
This is from a post of mine from a different forum, about two years ago. While the language is vb.net (as opposed to C#), the Value Type vs. Reference type concepts are uniform throughout .net, and the examples still hold.
It is also important to remember that within .net, ALL types technically derive from the base type Object. The value types are designed to behave as such, but in the end they also inherit the functionality of base type Object.
A. Value Types are just that- they represent a distinct area in memory where a discrete VALUE is stored. Value types are of fixed memory size and are stored in the stack, which is a collection of addresses of fixed size.
When you make a statement like such:
Dim A as Integer
DIm B as Integer
A = 3
B = A
You have done the following:
Created 2 spaces in memory sufficient to hold 32 bit integer values.
Placed a value of 3 in the memory allocation assigned to A
Placed a value of 3 in the memory allocation assigned to B by assigning it the same value as the held in A.
The Value of each variable exists discretely in each memory location.
B. Reference Types can be of various sizes. Therefore, they can't be stored in the "Stack" (remember, the stack is a collection of memory allocations of fixed size?). They are stored in the "Managed Heap". Pointers (or "references") to each item on the managed heap are maintained in the stack (Like an Address). Your code uses these pointers in the stack to access objects stored in the managed heap. So when your code uses a reference variable, it is actually using a pointer (or "address" to an memory location in the managed heap).
Say you have created a Class named clsPerson, with a string Property Person.Name
In this case, when you make a statement such as this:
Dim p1 As clsPerson
p1 = New clsPerson
p1.Name = "Jim Morrison"
Dim p2 As Person
p2 = p1
In the case above, the p1.Name Property will Return "Jim Morrison", as you would expect. The p2.Name property will ALSO return "Jim Morrison", as you would Iintuitively expect. I believe that both p1 and p2 represent distinct addresses on the Stack. However, now that you have assigned p2 the value of p1, both p1 and p2 point to the SAME LOCATION on the managed heap.
Now COnsider THIS situation:
Dim p1 As clsPerson
Dim p2 As clsPerson
p1 = New clsPerson
p1.Name = "Jim Morrison"
p2 = p1
p2.Name = "Janis Joplin"
In this situation, You have created one new instance of the person Class on the Managed Heap with a pointer p1 on the Stack which references the object, and assigned the Name Property of the object instance a value of "Jim Morrison" again. Next, you created another pointer p2 in the Stack, and pointed it at the same address on the managed heap as that referenced by p1 (when you made the assignement p2 = p1).
Here comes the twist. When you the Assign the Name property of p2 the value "Janis Joplin" you are changing the Name property for the object REFERENCED by Both p1 and p2, such that, if you ran the following code:
MsgBox(P1.Name)
'Will return "Janis Joplin"
MsgBox(p2.Name)
'will ALSO return "Janis Joplin"Because both variables (Pointers on the Stack) reference the SAME OBJECT in memory (an Address on the Managed Heap).
Did that make sense?
Last. If you do THIS:
DIm p1 As New clsPerson
Dim p2 As New clsPerson
p1.Name = "Jim Morrison"
p2.Name = "Janis Joplin"
You now have two distinct Person Objects. However, the minute you do THIS again:
p2 = p1
You have now pointed both back to "Jim Morrison". (I am not exactly sure what happened to the Object on the Heap referenced by p2 . . . I THINK it has now gone out of scope. This is one of those areas where hopefullly someone can set me straight . . .). -EDIT: I BELIEVE this is why you would Set p2 = Nothing OR p2 = New clsPerson before making the new assignment.
Once again, if you now do THIS:
p2.Name = "Jimi Hendrix"
MsgBox(p1.Name)
MsgBox(p2.Name)
Both msgBoxes will now return "Jimi Hendrix"
This can be pretty confusing for a bit, and I will say one last time, I may have some of the details wrong.
Good Luck, and hopefully others who know better than me will come along to help clarify some of this . . .
value data type and reference data type
1) value( contain the data directly )
but
reference ( refers to the data )
2) in value( every variable has its own copy)
but
in reference (more than variable can refer to some objects)
3) in value (operation variable can`t effect on other variable )
but
in reference (variable can affect other )
4) value types are(int, bool, float)
but
reference type are (array , class objects , string )
Value Type:
Fixed memory size.
Stored in Stack memory.
Holds actual value.
Ex. int, char, bool, etc...
Reference Type:
Not fixed memory.
Stored in Heap memory.
Holds memory address of actual value.
Ex. string, array, class, etc...
"Variables that are based on value types directly contain values. Assigning one value type variable to another copies the contained value. This differs from the assignment of reference type variables, which copies a reference to the object but not the object itself." from Microsoft's library.
You can find a more complete answer here and here.
Sometimes explanations won't help especially for the beginners. You can imagine value type as data file and reference type as a shortcut to a file.
So if you copy a reference variable you only copy the link/pointer to a real data somewhere in memory. If you copy a value type, you really clone the data in memory.
This is probably wrong in esoterical ways, but, to make it simple:
Value types are values that are passed normally "by value" (so copying them). Reference types are passed "by reference" (so giving a pointer to the original value). There isn't any guarantee by the .NET ECMA standard of where these "things" are saved. You could build an implementation of .NET that is stackless, or one that is heapless (the second would be very complex, but you probably could, using fibers and many stacks)
Structs are value type (int, bool... are structs, or at least are simulated as...), classes are reference type.
Value types descend from System.ValueType. Reference type descend from System.Object.
Now.. In the end you have Value Type, "referenced objects" and references (in C++ they would be called pointers to objects. In .NET they are opaque. We don't know what they are. From our point of view they are "handles" to the object). These lasts are similar to Value Types (they are passed by copy). So an object is composed by the object (a reference type) and zero or more references to it (that are similar to value types). When there are zero references the GC will probably collect it.
In general (in the "default" implementation of .NET), Value type can go on the stack (if they are local fields) or on the heap (if they are fields of a class, if they are variables in an iterator function, if they are variables referenced by a closure, if they are variable in an async function (using the newer Async CTP)...). Referenced value can only go to the heap. References use the same rules as Value types.
In the cases of Value Type that go on the heap because they are in an iterator function, an async function, or are referenced by a closure, if you watch the compiled file you'll see that the compiler created a class to put these variables, and the class is built when you call the function.
Now, I don't know how to write long things, and I have better things to do in my life. If you want a "precise" "academic" "correct" version, read THIS:
http://blogs.msdn.com/b/ericlippert/archive/2010/09/30/the-truth-about-value-types.aspx
It's 15 minutes I'm looking for it! It's better than the msdn versions, because it's a condensed "ready to use" article.
The simplest way to think of reference types is to consider them as being "object-IDs"; the only things one can do with an object ID are create one, copy one, inquire or manipulate the type of one, or compare two for equality. An attempt to do anything else with an object-ID will be regarded as shorthand for doing the indicated action with the object referred to by that id.
Suppose I have two variables X and Y of type Car--a reference type. Y happens to hold "object ID #19531". If I say "X=Y", that will cause X to hold "object ID #19531". Note that neither X nor Y holds a car. The car, otherwise known as "object ID #19531", is stored elsewhere. When I copied Y into X, all I did was copy the ID number. Now suppose I say X.Color=Colors.Blue. Such a statement will be regarded as an instruction to go find "object ID#19531" and paint it blue. Note that even though X and Y now refer to a blue car rather than a yellow one, the statement doesn't actually affect X or Y, because both still refer to "object ID #19531", which is still the same car as it always has been.
Variable types and Reference Value are easy to apply and well applied to the domain model, facilitate the development process.
To remove any myth around the amount of "value type", I will comment on how this is handled on the platform. NET, specifically in C # (CSharp) when called APIS and send parameters by value, by reference, in our methods, and functions and how to make the correct treatment of the passages of these values​​.
Read this article Variable Type Value and Reference in C #
Suppose v is a value-type expression/variable, and r is a reference-type expression/variable
x = v
update(v) //x will not change value. x stores the old value of v
x = r
update(r) //x now refers to the updated r. x only stored a link to r,
//and r can change but the link to it doesn't .
So, a value-type variable stores the actual value (5, or "h"). A reference-type varaible only stores a link to a metaphorical box where the value is.
Before explaining the different data types available in C#, it's important to mention that C# is a strongly-typed language. This means that each variable, constant, input parameter, return type and in general every expression that evaluates to a value, has a type.
Each type contains information that will be embedded by the compiler into the executable file as metadata which will be used by the common language runtime (CLR) to guarantee type safety when it allocates and reclaims memory.
If you wanna know how much memory a specific type allocates, you can use the sizeof operator as follows:
static void Main()
{
var size = sizeof(int);
Console.WriteLine($"int size:{size}");
size = sizeof(bool);
Console.WriteLine($"bool size:{size}");
size = sizeof(double);
Console.WriteLine($"double size:{size}");
size = sizeof(char);
Console.WriteLine($"char size:{size}");
}
The output will show the number of bytes allocated by each variable.
int size:4
bool size:1
double size:8
char size:2
The information related to each type are:
The required storage space.
The maximum and minimum values. For example, the type Int32 accepts values between 2147483648 and 2147483647.
The base type it inherits from.
The location where the memory for variables will be allocated at run time.
The kinds of operations that are permitted.
The members (methods, fields, events, etc.) contained by the type. For example, if we check the definition of type int, we will find the following struct and members:
namespace System
{
[ComVisible(true)]
public struct Int32 : IComparable, IFormattable, IConvertible, IComparable<Int32>, IEquatable<Int32>
{
public const Int32 MaxValue = 2147483647;
public const Int32 MinValue = -2147483648;
public static Int32 Parse(string s, NumberStyles style, IFormatProvider provider);
...
}
}
Memory management
When multiple processes are running on an operating system and the amount of RAM isn't enough to hold it all, the operating system maps parts of the hard disk with the RAM and starts storing data in the hard disk. The operating system will use than specific tables where virtual addresses are mapped to their correspondent physical addresses to perform the request. This capability to manage the memory is called virtual memory.
In each process, the virtual memory available is organized in the following 6 sections but for the relevance of this topic, we will focus only on the stack and the heap.
Stack
The stack is a LIFO (last in, first out) data structure, with a size-dependent on the operating system (by default, for ARM, x86 and x64 machines Windows's reserve 1MB, while Linux reserve from 2MB to 8MB depending on the version).
This section of memory is automatically managed by the CPU. Every time a function declares a new variable, the compiler allocates a new memory block as big as its size on the stack, and when the function is over, the memory block for the variable is deallocated.
Heap
This region of memory isn't managed automatically by the CPU and its size is bigger than the stack. When the new keyword is invoked, the compiler starts looking for the first free memory block that fits the size of the request. and when it finds it, it is marked as reserved by using the built-in C function malloc() and a return the pointer to that location. It's also possible to deallocate a block of memory by using the built-in C function free(). This mechanism causes memory fragmentation and has to use pointers to access the right block of memory, it's slower than the stack to perform the read/write operations.
Custom and Built-in types
While C# provides a standard set of built-in types representing integers, boolean, text characters, and so on, You can use constructs like struct, class, interface, and enum to create your own types.
An example of custom type using the struct construct is:
struct Point
{
public int X;
public int Y;
};
Value and reference types
We can categorize the C# type into the following categories:
Value types
Reference types
Value types
Value types derive from the System.ValueType class and variables of this type contain their values within their memory allocation in the stack. The two categories of value types are struct and enum.
The following example shows the member of the type boolean. As you can see there is no explicit reference to System.ValueType class, this happens because this class is inherited by the struct.
namespace System
{
[ComVisible(true)]
public struct Boolean : IComparable, IConvertible, IComparable<Boolean>, IEquatable<Boolean>
{
public static readonly string TrueString;
public static readonly string FalseString;
public static Boolean Parse(string value);
...
}
}
Reference types
On the other hand, the reference types do not contain the actual data stored in a variable, but the memory address of the heap where the value is stored. The categories of reference types are classes, delegates, arrays, and interfaces.
At run time, when a reference type variable is declared, it contains the value null until an object that has been created using the keywords new is assigned to it.
The following example shows the members of the generic type List.
namespace System.Collections.Generic
{
[DebuggerDisplay("Count = {Count}")]
[DebuggerTypeProxy(typeof(Generic.Mscorlib_CollectionDebugView<>))]
[DefaultMember("Item")]
public class List<T> : IList<T>, ICollection<T>, IEnumerable<T>, IEnumerable, IList, ICollection, IReadOnlyList<T>, IReadOnlyCollection<T>
{
...
public T this[int index] { get; set; }
public int Count { get; }
public int Capacity { get; set; }
public void Add(T item);
public void AddRange(IEnumerable<T> collection);
...
}
}
In case you wanna find out the memory address of a specific object, the class System.Runtime.InteropServices provides a way to access to managed objects from unmanaged memory. In the following example, we are gonna use the static method GCHandle.Alloc() to allocate a handle to a string and then the method AddrOfPinnedObject to retrieve its address.
string s1 = "Hello World";
GCHandle gch = GCHandle.Alloc(s1, GCHandleType.Pinned);
IntPtr pObj = gch.AddrOfPinnedObject();
Console.WriteLine($"Memory address:{pObj.ToString()}");
The output will be
Memory address:39723832
References
Official documentation: https://learn.microsoft.com/en-us/cpp/build/reference/stack-stack-allocations?view=vs-2019
I think these two pictures describe it the best. This is the case in languages like C#, Java, JavaScript, and Python. For C++ references mean different, and the equivalent of reference types are pointer types (That's why you see in various documents of different languages that they are used interchangeably). One of the important things is the meaning of "Pass by Value" and "Pass by Reference". I think there are other questions about them on StackOverflow you can seek for.
There are many little details of the differences between value types and reference types that are stated explicitly by the standard and some of them are not easy to understand, especially for beginners.
See ECMA standard 33, Common Language Infrastructure (CLI). The CLI is also standardized by the ISO. I would provide a reference but for ECMA we must download a PDF and that link depends on the version number. ISO standards cost money.
One difference is that value types can be boxed but reference types generally cannot be. There are exceptions but they are quite technical.
Value types cannot have parameter-less instance constructors or finalizers and they cannot refer to themselves. Referring to themselves means for example that if there is a value type Node then a member of Node cannot be a Node. I think there are other requirements/limitations in the specifications but if so then they are not gathered together in one place.

.NET Non/Generic Lists are stored on stack or heap?

What I want to know is where are stored the following lists, on stack or on heap?
Nongeneric list of value type
int[] strArray = { 1, 2, 3 };
Nongeneric list of ref type
Person[] personArray = new Person[1];
personArray[0] = new Person();
Generic list of value type
List<int> moreInts = new List<int>();
moreInts.Add(10);
Generic list of ref type
List<Person> morePeople = new List<Person>();
morePeople.Add(new Person ("Frank", "Black", 50));
String is not a value type
Array is not a value type
List is not a value type (and uses Array internally)
Both can be stored on the stack or the heap theoretically (or something entirely different!), that's not part of the specification. In practice, the stack is only used for in-scope things, and to simplify making sure that the references are in scope, it usually only puts value types on the stack, since those are easy to verify. However, they can also be stored in registers. It's up to the JIT compiler to decide that.
In unsafe code, you can explicitly allocate some pointers on the stack, but that's only in the unsafe code. Other than that, it's up to the implementation of the JIT compiler. However, with the current versions, you can be pretty certain that all that can't be verified to have life time limited enough, will be placed on the heap. The key thing here is scope, not whether it's a reference type or a value type. It's just that value types are generally easy to verify :)
If you have to think whether it's on the heap or the stack, it's probably on the heap. In any case, the point is "you don't care". Only care if you have performance profiling data to back up the fact that you do indeed have an issue related to this.
Remember - stack, heap, registers... all just an implementation detail in the managed memory model of .NET.
Everebody. All arrays stored in heap.
If array type is struct, then values stored in cells of array.
List safe values in array.
For example:
strArray has 3 cells with 3 pointers to strings in heap
personArray has 1 cell with 1 pointer to Person class in heap
moreInts has 1 cell with 1 integer value in cell
morePeople has 1 cell with pointer to Person class in heap

How do you store an int or other "C# value types" on the heap (with C#)?

I'm engaged in educating myself about C# via Troelsen's Pro C# book.
I'm familiar with the stack and heap and how C# stores these sorts of things. In C++, whenever we use new we receive a pointer to something on the heap. However, in C# the behavior of new seems different to me:
when used with value types like an int, using new seems to merely call the int default constructor yet the value of such an int would still be stored on the stack
I understand that all objects/structs and such are stored on the heap, regardless of whether or not new is used.
So my question is: how can I instantiate an int on the heap? (And does this have something to do with 'boxing'?)
You can box any value type to System.Object type so it will be stored on the managed heap:
int number = 1;
object locatedOnTheHeap = number;
An other question is why you need this.
This is a classic example from the must-know MSDN paper: Boxing and Unboxing (C# Programming Guide)
When the CLR boxes a value type, it wraps the value inside a
System.Object and stores it on the managed heap.
Boxing is used to store value types in the garbage-collected heap.
Boxing is an implicit conversion of a value type to the type object or
to any interface type implemented by this value type. Boxing a value
type allocates an object instance on the heap and copies the value
into the new object.
.
I understand that all objects/structs and such are stored on the heap
BTW, IIRC sometimes JIT optimizes code so value type objects like of type like int are stored in the CPU registers rather than stack.
I do not know why you would want to do this, however, in theory you could indeed box your value. You would do this by boxing the int into an object (which is a reference type and will be placed on the stack:
object IAmARefSoIWillBeOnHeap = (object)1;
*As sll stated, you do not need the (object) as it will be an implicit cast. This is here merely for academic reasons, to show what is happening.
Here is a good article about reference versus value types, which is the difference of the heap versus the stack
A value type is "allocated" wherever it is declared:
As a local variable, typically on the stack (but to paraphrase Eric Lippert, the stack is an implementation detail, I suggest you read his excellent piece on his blog: The Truth About Value Types.)
As a field in a class, it expands the size of the instance with the size of the value type, and takes up space inside the instance
As such, this code:
var x = new SomeValueType();
does not allocate something on the heap by itself for that value type. If you close over it with an anonymous method or similar, the local variable will be transformed into the field of a class, and an instance of that class will be allocated on the heap, but in this case, the value type will be embedded into that class as a field.
The heap is for instances of reference types.
However, you've touched up something regarding boxing. You can box a value type value to make a copy of it and place that copy on the heap, wrapped in an object.
So this:
object x = new SomeValueType();
would first allocate the value type, then box it into an object, and store the reference to that object in x.
yet the value of such an int would still be stored on the stack
This is not necessarily true. Even when it is true, it's purely an implementation detail, and not part of the specification of the language. The main issue is that the type system does not necessarily correlate to the storage mechanism used by the runtime.
There are many cases where calling new on a struct will still result in an object that isn't on the stack. Boxing is a good example - when you box an object, you're basically pushing it into an object (effectively "copying" it to the heap), and referencing the object. Also, any time you're closing over a value type with a lambda, you'll end up "allocating on the heap."
That being said, I wouldn't focus on this at all - the issue really shouldn't about stack vs. heap in allocations, but rather about value type vs. reference type semantics and usage. As such, I'd strongly recommend reading Eric Lippert's The Truth About Value Types and Jon Skeet's References and Values. Both of these articles focus on the important aspects of struct vs. class semantics instead of necessarily looking at the storage.
As for ways to force the storage of an int on the heap, here are a couple of simple ones:
object one = 1; // Boxing
int two = 2; // Gets closed over, so ends up "on the heap"
Action closeOverTwo = () => { Console.WriteLine(two); }
// Do stuff with two here...
var three = new { Three = 3 }; // Wrap in a value type...
If you want an int on the heap, you can do this:
object o = 4;
But basically, you shouldn't want that. C# is designed for you not to think about such things. Here's a good place to start on that: http://blogs.msdn.com/b/ericlippert/archive/2009/04/27/the-stack-is-an-implementation-detail.aspx
So my question is: how can I instantiate an int on the heap? (And does
this have something to do with 'boxing'?)
Your understanding about objects and structs are correct. When you intialized either an object or a structure it goes on the heap.

What is the difference between a reference type and value type in c#?

Some guy asked me this question couple of months ago and I couldn't explain it in detail. What is the difference between a reference type and a value type in C#?
I know that value types are int, bool, float, etc and reference types are delegate, interface, etc. Or is this wrong, too?
Can you explain it to me in a professional way?
Your examples are a little odd because while int, bool and float are specific types, interfaces and delegates are kinds of type - just like struct and enum are kinds of value types.
I've written an explanation of reference types and value types in this article. I'd be happy to expand on any bits which you find confusing.
The "TL;DR" version is to think of what the value of a variable/expression of a particular type is. For a value type, the value is the information itself. For a reference type, the value is a reference which may be null or may be a way of navigating to an object containing the information.
For example, think of a variable as like a piece of paper. It could have the value "5" or "false" written on it, but it couldn't have my house... it would have to have directions to my house. Those directions are the equivalent of a reference. In particular, two people could have different pieces of paper containing the same directions to my house - and if one person followed those directions and painted my house red, then the second person would see that change too. If they both just had separate pictures of my house on the paper, then one person colouring their paper wouldn't change the other person's paper at all.
Value type:
Holds some value not memory addresses
Example:
Struct
Storage:
TL;DR: A variable's value is stored wherever it is decleared. Local variables live on the stack for example, but when declared inside a class as a member it lives on the heap tightly coupled with the class it is declared in.
Longer: Thus value types are stored wherever they are declared.
E.g.: an int's value inside a function as a local variable would be stored on the stack, whilst an in int's value declared as member in a class would be stored on the heap with the class it is declared in. A value type on a class has a lifetype that is exactly the same as the class it is declared in, requiring almost no work by the garbage collector. It's more complicated though, i'd refer to #JonSkeet's book "C# In Depth" or his article "Memory in .NET" for a more concise explenation.
Advantages:
A value type does not need extra garbage collection. It gets garbage collected together with the instance it lives in. Local variables in methods get cleaned up upon method leave.
Drawbacks:
When large set of values are passed to a method the receiving variable actually copies so there are two redundant values in memory.
As classes are missed out.it losses all the oop benifits
Reference type:
Holds a memory address of a value not value
Example:
Class
Storage:
Stored on heap
Advantages:
When you pass a reference variable to a method and it changes it indeed changes the original value whereas in value types a copy of the given variable is taken and that's value is changed.
When the size of variable is bigger reference type is good
As classes come as a reference type variables, they give reusability, thus benefitting Object-oriented programming
Drawbacks:
More work referencing when allocating and dereferences when reading the value.extra overload for garbage collector
I found it easier to understand the difference of the two if you know how computer allocate stuffs in memory and know what a pointer is.
Reference is usually associated with a pointer. Meaning the memory address where your variable reside is actually holding another memory address of the actual object in a different memory location.
The example I am about to give is grossly over simplified, so take it with a grain of salt.
Imagine computer memory is a bunch of PO boxes in a row (starting w/ PO Box 0001 to PO Box n) that can hold something inside it. If PO boxes doesn't do it for you, try a hashtable or dictionary or an array or something similar.
Thus, when you do something like:
var a = "Hello";
the computer will do the following:
allocate memory (say starting at memory location 1000 for 5 bytes) and put H (at 1000), e (at 1001), l (at 1002), l (at 1003) and o (at 1004).
allocate somewhere in memory (say at location 0500) and assigned it as the variable a.
So it's kind of like an alias (0500 is a).
assign the value at that memory location (0500) to 1000 (which is where the string Hello start in memory). Thus the variable a is holding a reference to the actual starting memory location of the "Hello" string.
Value type will hold the actual thing in its memory location.
Thus, when you do something like:
var a = 1;
the computer will do the following:
allocate a memory location say at 0500 and assign it to variable a (the same alias thing)
put the value 1 in it (at memory location 0500).
Notice that we are not allocating extra memory to hold the actual value (1).
Thus a is actually holding the actual value and that's why it's called value type.
This is from a post of mine from a different forum, about two years ago. While the language is vb.net (as opposed to C#), the Value Type vs. Reference type concepts are uniform throughout .net, and the examples still hold.
It is also important to remember that within .net, ALL types technically derive from the base type Object. The value types are designed to behave as such, but in the end they also inherit the functionality of base type Object.
A. Value Types are just that- they represent a distinct area in memory where a discrete VALUE is stored. Value types are of fixed memory size and are stored in the stack, which is a collection of addresses of fixed size.
When you make a statement like such:
Dim A as Integer
DIm B as Integer
A = 3
B = A
You have done the following:
Created 2 spaces in memory sufficient to hold 32 bit integer values.
Placed a value of 3 in the memory allocation assigned to A
Placed a value of 3 in the memory allocation assigned to B by assigning it the same value as the held in A.
The Value of each variable exists discretely in each memory location.
B. Reference Types can be of various sizes. Therefore, they can't be stored in the "Stack" (remember, the stack is a collection of memory allocations of fixed size?). They are stored in the "Managed Heap". Pointers (or "references") to each item on the managed heap are maintained in the stack (Like an Address). Your code uses these pointers in the stack to access objects stored in the managed heap. So when your code uses a reference variable, it is actually using a pointer (or "address" to an memory location in the managed heap).
Say you have created a Class named clsPerson, with a string Property Person.Name
In this case, when you make a statement such as this:
Dim p1 As clsPerson
p1 = New clsPerson
p1.Name = "Jim Morrison"
Dim p2 As Person
p2 = p1
In the case above, the p1.Name Property will Return "Jim Morrison", as you would expect. The p2.Name property will ALSO return "Jim Morrison", as you would Iintuitively expect. I believe that both p1 and p2 represent distinct addresses on the Stack. However, now that you have assigned p2 the value of p1, both p1 and p2 point to the SAME LOCATION on the managed heap.
Now COnsider THIS situation:
Dim p1 As clsPerson
Dim p2 As clsPerson
p1 = New clsPerson
p1.Name = "Jim Morrison"
p2 = p1
p2.Name = "Janis Joplin"
In this situation, You have created one new instance of the person Class on the Managed Heap with a pointer p1 on the Stack which references the object, and assigned the Name Property of the object instance a value of "Jim Morrison" again. Next, you created another pointer p2 in the Stack, and pointed it at the same address on the managed heap as that referenced by p1 (when you made the assignement p2 = p1).
Here comes the twist. When you the Assign the Name property of p2 the value "Janis Joplin" you are changing the Name property for the object REFERENCED by Both p1 and p2, such that, if you ran the following code:
MsgBox(P1.Name)
'Will return "Janis Joplin"
MsgBox(p2.Name)
'will ALSO return "Janis Joplin"Because both variables (Pointers on the Stack) reference the SAME OBJECT in memory (an Address on the Managed Heap).
Did that make sense?
Last. If you do THIS:
DIm p1 As New clsPerson
Dim p2 As New clsPerson
p1.Name = "Jim Morrison"
p2.Name = "Janis Joplin"
You now have two distinct Person Objects. However, the minute you do THIS again:
p2 = p1
You have now pointed both back to "Jim Morrison". (I am not exactly sure what happened to the Object on the Heap referenced by p2 . . . I THINK it has now gone out of scope. This is one of those areas where hopefullly someone can set me straight . . .). -EDIT: I BELIEVE this is why you would Set p2 = Nothing OR p2 = New clsPerson before making the new assignment.
Once again, if you now do THIS:
p2.Name = "Jimi Hendrix"
MsgBox(p1.Name)
MsgBox(p2.Name)
Both msgBoxes will now return "Jimi Hendrix"
This can be pretty confusing for a bit, and I will say one last time, I may have some of the details wrong.
Good Luck, and hopefully others who know better than me will come along to help clarify some of this . . .
value data type and reference data type
1) value( contain the data directly )
but
reference ( refers to the data )
2) in value( every variable has its own copy)
but
in reference (more than variable can refer to some objects)
3) in value (operation variable can`t effect on other variable )
but
in reference (variable can affect other )
4) value types are(int, bool, float)
but
reference type are (array , class objects , string )
Value Type:
Fixed memory size.
Stored in Stack memory.
Holds actual value.
Ex. int, char, bool, etc...
Reference Type:
Not fixed memory.
Stored in Heap memory.
Holds memory address of actual value.
Ex. string, array, class, etc...
"Variables that are based on value types directly contain values. Assigning one value type variable to another copies the contained value. This differs from the assignment of reference type variables, which copies a reference to the object but not the object itself." from Microsoft's library.
You can find a more complete answer here and here.
Sometimes explanations won't help especially for the beginners. You can imagine value type as data file and reference type as a shortcut to a file.
So if you copy a reference variable you only copy the link/pointer to a real data somewhere in memory. If you copy a value type, you really clone the data in memory.
This is probably wrong in esoterical ways, but, to make it simple:
Value types are values that are passed normally "by value" (so copying them). Reference types are passed "by reference" (so giving a pointer to the original value). There isn't any guarantee by the .NET ECMA standard of where these "things" are saved. You could build an implementation of .NET that is stackless, or one that is heapless (the second would be very complex, but you probably could, using fibers and many stacks)
Structs are value type (int, bool... are structs, or at least are simulated as...), classes are reference type.
Value types descend from System.ValueType. Reference type descend from System.Object.
Now.. In the end you have Value Type, "referenced objects" and references (in C++ they would be called pointers to objects. In .NET they are opaque. We don't know what they are. From our point of view they are "handles" to the object). These lasts are similar to Value Types (they are passed by copy). So an object is composed by the object (a reference type) and zero or more references to it (that are similar to value types). When there are zero references the GC will probably collect it.
In general (in the "default" implementation of .NET), Value type can go on the stack (if they are local fields) or on the heap (if they are fields of a class, if they are variables in an iterator function, if they are variables referenced by a closure, if they are variable in an async function (using the newer Async CTP)...). Referenced value can only go to the heap. References use the same rules as Value types.
In the cases of Value Type that go on the heap because they are in an iterator function, an async function, or are referenced by a closure, if you watch the compiled file you'll see that the compiler created a class to put these variables, and the class is built when you call the function.
Now, I don't know how to write long things, and I have better things to do in my life. If you want a "precise" "academic" "correct" version, read THIS:
http://blogs.msdn.com/b/ericlippert/archive/2010/09/30/the-truth-about-value-types.aspx
It's 15 minutes I'm looking for it! It's better than the msdn versions, because it's a condensed "ready to use" article.
The simplest way to think of reference types is to consider them as being "object-IDs"; the only things one can do with an object ID are create one, copy one, inquire or manipulate the type of one, or compare two for equality. An attempt to do anything else with an object-ID will be regarded as shorthand for doing the indicated action with the object referred to by that id.
Suppose I have two variables X and Y of type Car--a reference type. Y happens to hold "object ID #19531". If I say "X=Y", that will cause X to hold "object ID #19531". Note that neither X nor Y holds a car. The car, otherwise known as "object ID #19531", is stored elsewhere. When I copied Y into X, all I did was copy the ID number. Now suppose I say X.Color=Colors.Blue. Such a statement will be regarded as an instruction to go find "object ID#19531" and paint it blue. Note that even though X and Y now refer to a blue car rather than a yellow one, the statement doesn't actually affect X or Y, because both still refer to "object ID #19531", which is still the same car as it always has been.
Variable types and Reference Value are easy to apply and well applied to the domain model, facilitate the development process.
To remove any myth around the amount of "value type", I will comment on how this is handled on the platform. NET, specifically in C # (CSharp) when called APIS and send parameters by value, by reference, in our methods, and functions and how to make the correct treatment of the passages of these values​​.
Read this article Variable Type Value and Reference in C #
Suppose v is a value-type expression/variable, and r is a reference-type expression/variable
x = v
update(v) //x will not change value. x stores the old value of v
x = r
update(r) //x now refers to the updated r. x only stored a link to r,
//and r can change but the link to it doesn't .
So, a value-type variable stores the actual value (5, or "h"). A reference-type varaible only stores a link to a metaphorical box where the value is.
Before explaining the different data types available in C#, it's important to mention that C# is a strongly-typed language. This means that each variable, constant, input parameter, return type and in general every expression that evaluates to a value, has a type.
Each type contains information that will be embedded by the compiler into the executable file as metadata which will be used by the common language runtime (CLR) to guarantee type safety when it allocates and reclaims memory.
If you wanna know how much memory a specific type allocates, you can use the sizeof operator as follows:
static void Main()
{
var size = sizeof(int);
Console.WriteLine($"int size:{size}");
size = sizeof(bool);
Console.WriteLine($"bool size:{size}");
size = sizeof(double);
Console.WriteLine($"double size:{size}");
size = sizeof(char);
Console.WriteLine($"char size:{size}");
}
The output will show the number of bytes allocated by each variable.
int size:4
bool size:1
double size:8
char size:2
The information related to each type are:
The required storage space.
The maximum and minimum values. For example, the type Int32 accepts values between 2147483648 and 2147483647.
The base type it inherits from.
The location where the memory for variables will be allocated at run time.
The kinds of operations that are permitted.
The members (methods, fields, events, etc.) contained by the type. For example, if we check the definition of type int, we will find the following struct and members:
namespace System
{
[ComVisible(true)]
public struct Int32 : IComparable, IFormattable, IConvertible, IComparable<Int32>, IEquatable<Int32>
{
public const Int32 MaxValue = 2147483647;
public const Int32 MinValue = -2147483648;
public static Int32 Parse(string s, NumberStyles style, IFormatProvider provider);
...
}
}
Memory management
When multiple processes are running on an operating system and the amount of RAM isn't enough to hold it all, the operating system maps parts of the hard disk with the RAM and starts storing data in the hard disk. The operating system will use than specific tables where virtual addresses are mapped to their correspondent physical addresses to perform the request. This capability to manage the memory is called virtual memory.
In each process, the virtual memory available is organized in the following 6 sections but for the relevance of this topic, we will focus only on the stack and the heap.
Stack
The stack is a LIFO (last in, first out) data structure, with a size-dependent on the operating system (by default, for ARM, x86 and x64 machines Windows's reserve 1MB, while Linux reserve from 2MB to 8MB depending on the version).
This section of memory is automatically managed by the CPU. Every time a function declares a new variable, the compiler allocates a new memory block as big as its size on the stack, and when the function is over, the memory block for the variable is deallocated.
Heap
This region of memory isn't managed automatically by the CPU and its size is bigger than the stack. When the new keyword is invoked, the compiler starts looking for the first free memory block that fits the size of the request. and when it finds it, it is marked as reserved by using the built-in C function malloc() and a return the pointer to that location. It's also possible to deallocate a block of memory by using the built-in C function free(). This mechanism causes memory fragmentation and has to use pointers to access the right block of memory, it's slower than the stack to perform the read/write operations.
Custom and Built-in types
While C# provides a standard set of built-in types representing integers, boolean, text characters, and so on, You can use constructs like struct, class, interface, and enum to create your own types.
An example of custom type using the struct construct is:
struct Point
{
public int X;
public int Y;
};
Value and reference types
We can categorize the C# type into the following categories:
Value types
Reference types
Value types
Value types derive from the System.ValueType class and variables of this type contain their values within their memory allocation in the stack. The two categories of value types are struct and enum.
The following example shows the member of the type boolean. As you can see there is no explicit reference to System.ValueType class, this happens because this class is inherited by the struct.
namespace System
{
[ComVisible(true)]
public struct Boolean : IComparable, IConvertible, IComparable<Boolean>, IEquatable<Boolean>
{
public static readonly string TrueString;
public static readonly string FalseString;
public static Boolean Parse(string value);
...
}
}
Reference types
On the other hand, the reference types do not contain the actual data stored in a variable, but the memory address of the heap where the value is stored. The categories of reference types are classes, delegates, arrays, and interfaces.
At run time, when a reference type variable is declared, it contains the value null until an object that has been created using the keywords new is assigned to it.
The following example shows the members of the generic type List.
namespace System.Collections.Generic
{
[DebuggerDisplay("Count = {Count}")]
[DebuggerTypeProxy(typeof(Generic.Mscorlib_CollectionDebugView<>))]
[DefaultMember("Item")]
public class List<T> : IList<T>, ICollection<T>, IEnumerable<T>, IEnumerable, IList, ICollection, IReadOnlyList<T>, IReadOnlyCollection<T>
{
...
public T this[int index] { get; set; }
public int Count { get; }
public int Capacity { get; set; }
public void Add(T item);
public void AddRange(IEnumerable<T> collection);
...
}
}
In case you wanna find out the memory address of a specific object, the class System.Runtime.InteropServices provides a way to access to managed objects from unmanaged memory. In the following example, we are gonna use the static method GCHandle.Alloc() to allocate a handle to a string and then the method AddrOfPinnedObject to retrieve its address.
string s1 = "Hello World";
GCHandle gch = GCHandle.Alloc(s1, GCHandleType.Pinned);
IntPtr pObj = gch.AddrOfPinnedObject();
Console.WriteLine($"Memory address:{pObj.ToString()}");
The output will be
Memory address:39723832
References
Official documentation: https://learn.microsoft.com/en-us/cpp/build/reference/stack-stack-allocations?view=vs-2019
I think these two pictures describe it the best. This is the case in languages like C#, Java, JavaScript, and Python. For C++ references mean different, and the equivalent of reference types are pointer types (That's why you see in various documents of different languages that they are used interchangeably). One of the important things is the meaning of "Pass by Value" and "Pass by Reference". I think there are other questions about them on StackOverflow you can seek for.
There are many little details of the differences between value types and reference types that are stated explicitly by the standard and some of them are not easy to understand, especially for beginners.
See ECMA standard 33, Common Language Infrastructure (CLI). The CLI is also standardized by the ISO. I would provide a reference but for ECMA we must download a PDF and that link depends on the version number. ISO standards cost money.
One difference is that value types can be boxed but reference types generally cannot be. There are exceptions but they are quite technical.
Value types cannot have parameter-less instance constructors or finalizers and they cannot refer to themselves. Referring to themselves means for example that if there is a value type Node then a member of Node cannot be a Node. I think there are other requirements/limitations in the specifications but if so then they are not gathered together in one place.

Categories

Resources