How does the GC clean up a struct?


I think the GC may treat reference types and value types differently.
The GC will collect an instance of a reference type once nothing holds a reference to it.
When will the GC collect a value type such as a struct? My struct is not small, and I want it to be collected as early as possible. Using a profiler, I saw that the struct accumulates heavily and is the major memory consumer.

A struct will only be in the managed heap (i.e. where it can be garbage collected) if it's either an instance/static field, or part of another object, or boxed, or in an array [1]. It's never "naked" in the managed heap - the closest you can get is a boxed value.
If you have a large struct, that's your first problem. Why have you created such a thing? Structs should almost always be small (the usual rule of thumb is 16 bytes), as otherwise every time you pass one as a method argument or assign it to another variable, you end up copying it.
Have you considered using a class instead?
[1] As Eric Lippert is fond of pointing out, the stack is an implementation detail. Furthermore, in certain cases local variables end up as fields in autogenerated classes... but that's somewhat irrelevant for this question, I believe.

A struct is a value type and inherits from System.ValueType. A local variable of a value type is typically allocated on the current thread's stack rather than on the managed heap, and that memory is freed automatically when the method returns. If you box a value-type variable, however, memory is allocated on the heap for a wrapper object and the variable's fields are copied into it. If the boxed value is 85,000 bytes or larger, its wrapper is placed on the Large Object Heap (LOH). LOH objects tend to be long-lived: they are collected only together with Gen2.
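The boxing step described above can be sketched in a few lines. This is only an illustration: the Sample struct and the BoxingDemo class are hypothetical names, and where the local actually lives (stack slot or register) is up to the runtime.

```csharp
using System;

// Hypothetical struct used only for illustration.
public struct Sample
{
    public int Value;
}

public static class BoxingDemo
{
    // Boxes a copy of the struct, then changes the original local.
    // Returns the value seen through the boxed copy.
    public static int ValueInsideBoxAfterMutation()
    {
        Sample local = new Sample { Value = 1 }; // a plain local, no heap object yet
        object boxed = local;                    // boxing: a heap object holding a copy of the fields
        local.Value = 2;                         // changes only the local, not the boxed copy
        return ((Sample)boxed).Value;            // unboxing copies the fields back out
    }

    public static void Main()
    {
        Console.WriteLine(ValueInsideBoxAfterMutation()); // prints 1
    }
}
```

The boxed copy is an ordinary heap object, so it is the box (not the local) that the GC eventually has to collect.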

Related

How is value type array allocated in a heap?

I have read the answers to some similar questions, but my question is a little different because I do not understand a statement about this in a book.
Because a struct is a value type, each instance does not require
instantiation of an object on the heap; this incurs a useful savings
when creating many instances of a type. For instance, creating an
array of value type requires only a single heap allocation.
How can an array require only a single heap allocation? What does "a single heap allocation" mean?

How can an array require only a single heap allocation? What does "a single heap allocation" mean?
First of all, let's clarify what we mean by "heap" vs "stack".
Most programming environments today are stack-based. As you run a program, each time you call a method a new entry is pushed onto a special stack provided for your program. This stack entry (or frame) tells the system where to look for the method's executable code, what arguments were passed, and exactly where to return to in the calling code after the method exits. When a method finishes, its entry is removed (popped) from the stack, so the program can go back to the previous method. When the stack is empty, the program has finished. [1] There is often support for this special stack directly in the CPU.
The memory for the stack is allocated when the program is first launched, which means the stack itself has a fixed (limited) size. This is where "Stack Overflows" come from; get too deep down too many method calls, and the stack will run out of space. Each frame on the stack also has a certain amount of space for local variables for the method, and this is the memory we're talking about when we say value types live on the stack. The local variables stored on the stack do not require new allocations; the memory is already there. Just remember: this only applies in the context of local variables in a method.
The heap, on the other hand, is memory not automatically granted to the program. It is memory the program must request above and beyond its core allocation. Heap memory has to be managed more carefully (so it doesn't leak, though we have a garbage collector to help with this), but there is (usually) much more of it available. Because it has to be granted by the operating system on request, initial allocations for the heap are also a little slower than memory used from the stack. [2] For reference types, you can think of the new keyword as requesting a new heap memory allocation.
As a broad generalization, we say reference types are allocated on the heap, and value types are allocated on the stack (though there are plenty of exceptions to this [3]).
Now we understand this much, we can start to look at how .Net handles arrays.
The core array type itself is a reference type. What I mean is, for any given type T, the T may (or may not) be a value type, but an array of T (T[]) is always a reference type. In the "stack vs heap" context, this means creating a new array is a heap allocation, even if T is a value type. Arrays in .Net also have a fixed size [4].
An additional attribute of value types is that they also have a known, fixed size, based on their members. Therefore, an array of value types has a fixed number of elements, each with a known fixed size. That's enough information for allocating a new array of value types to get all the space for the array object and its elements in a single heap allocation. The value of each item (not just a reference) is held right there in the array's core memory. Now we have a bunch of value-type objects, but their memory is on the heap, rather than the stack.
This can be further complicated by a value type with one or more reference-type members. In this situation, the space for the value type is allocated as normal, but the part of the value for each reference member is just a reference. It still requires separate allocations or assignments to populate those reference members.
For an array holding reference types, the initial heap allocation for the array still allocates space for all the elements, but space reserved for each element is only enough for the reference. That is, initially each element in the array is still null. To populate the array you must still set those references to objects in memory, either by assigning existing objects or allocating new ones.
Finally, just as we were able to use arrays to get a value-type onto the heap, instead of the stack, there are also ways to force reference types to allocate from the stack. However, generally you should not do this unless you really understand the full implications.
1) There are different conventions on exactly when a frame is pushed/popped for each method call, depending on the platform, compiler configuration, and more, so only look at this paragraph for the general idea; the exact specifics will be incorrect in some particulars on any given platform.
2) For future reading, it is also useful to understand how programs handle addressing for heap memory.
3) Eric Lippert has an excellent write-up of this topic.
4) That is, arrays in .Net are true arrays in the full formal computer science sense, unlike the associative array-like collection types in many other platforms. .Net has these, too, but it calls them what they are: collections rather than arrays.

An array is itself a reference type, which means it is allocated on the managed heap. But if it is an array of a value type, it reserves the memory needed for all its elements in one single step. Let's say you have a struct with 4 Int32 fields in it.
A struct4Int[1000] will allocate 16000 bytes in one step.
An array of a reference type takes only the memory needed for the references (32 or 64 bits per item, depending on the architecture you are compiling for). Let's say you have a class with 4 Int32 fields in it.
A class4Int[1000] will allocate 4000 or 8000 bytes at first.
The items hold references, which are initially null.
After creating the array you have to allocate memory for every instance of the reference type and put its reference into the array (multiple allocations on the heap), adding at least another 16000 bytes on the heap in 1000 small pieces, plus the per-object overhead of each instance.
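A minimal sketch of the difference, assuming two hypothetical types named Struct4Int and Class4Int that mirror the ones described above. It does not measure bytes; it only shows that the struct elements exist as soon as the array does, while the class elements start out as null references and need one allocation each.

```csharp
using System;

// Hypothetical types mirroring the "4 Int32" examples in the text.
public struct Struct4Int { public int A, B, C, D; }
public class Class4Int { public int A, B, C, D; }

public static class ArrayDemo
{
    public static int StructElementIsReady()
    {
        var items = new Struct4Int[1000]; // one allocation; the 1000 values are stored inline
        items[0].A = 42;                  // writes directly into the array's own memory
        return items[0].A;
    }

    public static bool ClassElementStartsNull()
    {
        var items = new Class4Int[1000];  // one allocation, but only for 1000 references
        return items[0] == null;          // no Class4Int instance exists yet
    }

    public static int FillClassArray()
    {
        var items = new Class4Int[1000];
        for (int i = 0; i < items.Length; i++)
        {
            items[i] = new Class4Int();   // one further heap allocation per element
        }
        items[0].A = 42;
        return items[0].A;
    }

    public static void Main()
    {
        Console.WriteLine(StructElementIsReady());   // prints 42
        Console.WriteLine(ClassElementStartsNull()); // prints True
        Console.WriteLine(FillClassArray());         // prints 42
    }
}
```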

Is stack memory only for pointers and heap for objects?

Firstly, I'm currently working in C# and I've been reading up on memory management. So far, I've read through some great answers on Stack Overflow explaining the difference between stack memory and managed heap memory. The majority of the answers state that by declaring:
int x = 5, you're allocating enough memory for the type of x within the stack memory.
I understand how this works as well as the scope of it, however when I read the explanation of heap memory, it confused me.
If you're saying int x = 5, since int is an alias of System.Int32, wouldn't x technically be a pointer to a new instance of the System.Int32 struct? And if so, wouldn't it then be stored in the heap memory since that's used for instance objects.
In this tutorial, it states (for the line class1 cls1 = new class1()):
... creates a pointer on the stack and the actual object is stored in a different type of memory location called ‘Heap’.
By this logic, isn't everything stored on the heap and only pointers stored on the stack? Examples being new instances of System.String, System.Int64, System.Boolean, System.Decimal etc.
I thought I understood it, however clearly I don't, so I would appreciate someone explaining whether the stack is only for pointers or which part I've misinterpreted. Thanks in advance.

You can use the following rule: if it's a struct (including primitive types), it's allocated where it's declared; otherwise a pointer to an object on the heap is allocated there.
Possible locations are:
For local variables, it's the stack. Note that physically the values can be stored in CPU registers rather than on the stack.
For class fields, it's inside the contiguous chunk of memory allocated on the heap for an instance of the class.
For static fields, it's allocated in the loader heap as part of the type metadata (correct me if I am wrong).
Warning: this is just basic, non-comprehensive explanation to have a basic understanding of what's going on. The reality is more complicated. Local variables can be hoisted and moved to heap, optimizer can eliminate them altogether, etc...

You may want to check Classes and structs (MSDN) to understand what is stored where and how:
int x = 1; // 32 bits holding an integer in the stack
System.Object bo = x; // 32+some more bits are on the heap to hold the "boxed" (wrapped to be kept on the heap) integer value
System.Object ho = new Object(); // some bits are created on the heap right from the start
In simple terms, there are two kinds of types: classes and structs. Classes (reference types) are stored on the heap and accessed through a pointer, while structs are typically stored on the stack (a struct can be moved to the heap, with a little overhead, by wrapping ("boxing") it).
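As a rough illustration of the boxing line above: each boxing conversion produces its own heap object, so boxing the same int twice gives two distinct objects that merely hold equal values. The class and method names here are hypothetical.

```csharp
using System;

public static class BoxTwice
{
    // Boxes the same int twice and reports whether both boxes are the same object.
    public static bool SameObject()
    {
        int x = 1;
        object a = x; // first box
        object b = x; // second, separate box
        return object.ReferenceEquals(a, b);
    }

    // Boxes the same int twice and compares the boxed values.
    public static bool SameValue()
    {
        int x = 1;
        object a = x;
        object b = x;
        return a.Equals(b); // Int32.Equals compares the wrapped values
    }

    public static void Main()
    {
        Console.WriteLine(SameObject()); // prints False
        Console.WriteLine(SameValue());  // prints True
    }
}
```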
If you really need/want to understand how CLR works in general, consider reading "CLR via C#" (Richter).

Should the stack be really allocated in a C# program?

From what I know of C# (I believe I am right), value types are allocated on the stack and reference types are allocated on the heap. But if a field in a class is a value type, it is allocated on the heap rather than on the stack (I'm still right, right?).
With that said, I also know that every C# program is a class and is made up of classes. That should imply that any variable declared in a C# program, value type or reference type, should be allocated on the heap.
What I can infer, then, is that the stack may not be really used in a C# program. I say 'may' because there could be extraordinary cases, not that I know one, though.

You are mostly correct :)
Variables that are local to a method, however, are indeed allocated on the stack. For value types, that is the whole story. For reference types, the actual object (a string, an array, and so on) is allocated on the heap, but the pointer to it is allocated on the stack.

A reference type is stored on the heap. This is also true of the value types contained in the reference type. The reference to the object, by contrast, is stored on the stack.
What 500 - Internal Server Error has pointed out about variables that are local to a method holds: they are allocated on the stack.

Reference type in struct in C#

I was going through value types in C# and learned that they don't get allocated on the heap as normal reference types do. How does a struct with a reference-type member get allocated?
e.g.
struct simple
{
    public Employee e;
    public bool topEmployee;

    public void printSomething()
    {
        Console.WriteLine("Progress " + e.GetProgressReport());
        Console.WriteLine("TopEmployee " + topEmployee);
    }
}
Employee is a class. Will e get allocated on the heap when initialized? Does that defeat the point of having a struct?

The "kind" of type (value/reference) has little to do with how instances are allocated. It's all about life time, and there are more ways to allocate than "heap" and "stack". Read The Truth About Value Types.
But insofar as your question makes sense: a struct's member types do not affect how struct instances are allocated, because they do not affect the lifetime of the object. The same goes for classes, by the way.
The member e will be a part of the value type object and allocated wherever that object happens to be. This member is a reference, and hence any actual Employee object e refers to will be allocated somewhere else [1]. Though it sounds like one, this is not a special rule; locals, class members and array items behave the same way. It does not defeat the point of value types; rather, it maintains the benefits of both value and reference types. The value type instances are still separate values instead of being aliased, and they still have a simpler and shorter lifetime, allowing better allocation choices with less effort. The reference type instances are still shared and (potentially) long-lived.
[1] At least conceptually and in the current implementations; in very simple cases optimizations (escape analysis + allocation sinking) could merge these allocations, but no CLR I'm aware of does that.

Struct memory is allocated 'in-line'. Class memory is allocated on the heap, with a reference (pointer) allocated 'in-line'.
If you see a class variable named C in a program, the storage for that variable will be equivalent to a pointer (say 4 bytes), and the actual storage for the class will be on the heap.
But if you see a struct variable named S in a program, the storage for that variable is simply the size of the variable at the point of declaration. There is no heap allocated storage for it.
If C contains an S then S will be located in the heap storage for C.
If S contains a C then the reference to C will be located in the storage for S, and the storage for C is on the heap. This is the answer to your question about simple and e.
So struct storage can actually be in static memory, on the stack or on the heap (inside a class). Class memory is always on the heap.

The reference variable 'e' will be allocated wherever the struct is allocated (e.g. a local variable of unboxed struct type is likely to go on stack). The instance of Employee the 'e' is pointing to will be allocated on heap. This may vary between .NET implementations, but is most likely true for all current implementations.
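A small sketch of what this means in practice, using hypothetical stand-ins for the types in the question (an Employee class with a Name field and a Simple struct). Copying the struct copies the bool and the reference, but not the Employee object the reference points to.

```csharp
using System;

// Hypothetical stand-ins for the types in the question.
public class Employee
{
    public string Name = "";
}

public struct Simple
{
    public Employee E;       // only the reference is stored inside the struct
    public bool TopEmployee; // stored inline in the struct
}

public static class StructWithReferenceDemo
{
    // Copies a struct, changes the copy, and reports what the original sees.
    public static string Run()
    {
        var first = new Simple { E = new Employee { Name = "Ann" }, TopEmployee = true };
        Simple second = first;       // copies the bool and the reference, not the Employee

        second.TopEmployee = false;  // affects only the copy
        second.E.Name = "Bob";       // both structs refer to the same Employee object

        return first.TopEmployee + " " + first.E.Name;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "True Bob"
    }
}
```

The single Employee instance lives on the heap regardless of where either struct copy is stored.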

Is it true that the garbage collector won't collect objects of struct type?

Yesterday, we had a discussion on garbage collection.
It was said that objects created from classes are collected by the garbage collector, but that an object cannot be collected by the GC if it is created from a struct.
I know that structures use the stack and classes use the heap.
But I guess the only thing the GC never collects is unmanaged code. So does that mean that structure types are unmanaged code? (I don't think so.)
Or is it that the GC takes care of the heap only and not the stack?
If yes, then what about the int data type? int is a struct, not a class. So if we define an object of type int, isn't it managed by the GC?

The GC will collect any managed objects (and Structures are managed objects) if they cannot be accessed from a GC root.
but it cannot be collected by GC, if it is being created using struct.
What you were told is incorrect. It doesn't matter how a managed object was created - if there are no references to it anymore, it will end up being collected.
OR is that means the GC take care of Heap only not Stack?
The GC takes care of the object graph - if objects are reachable by any of the GC roots, they will not be collected, if they are not, they will end up being collected. The Stack and Heap are not relevant.
So if we defined the object of type int, isn't it managed by GC?
int (AKA System.Int32) is a managed object - a structure. If you declare an int field in a class and an instance of that class becomes unreachable, the int will end up being collected by the GC along with it, for example.
As @leppie commented, in many situations, structures will get placed on the stack and when the stack is popped they will no longer exist - the GC, in such cases, is not involved (and doesn't need to be).

it cannot be collected by GC if it is being created using struct.
Not true. If it's not referenced it will be collected eventually.
I know that structures use stack and classes use heap.
That is a common misconception. See Eric Lippert's article for details.
I think the idea you're referring to is that most often the GC does not bother collecting data located on the stack, because a stack frame is discarded as soon as program execution leaves its scope. That means any data put directly on the stack (which might mean value types and should mean references to all the other data) is cleared automatically, with no need for the GC. The GC's job is to clear heap data, whatever the type of that data. If it's not referenced, it's collected.

But, I guess GC never collects unmanaged codes only. So is that means the Structure types are unmanaged code. (I don't think so).
I don't understand what you are asking here.
OR is that means the GC take care of Heap only not Stack?
Yes and no. The GC takes care of the memory required for instances of reference types (always created on the managed heap). You can view the "stack" as a piece of memory associated with the current thread of execution. The stack can contain handles to reference types allocated on the managed heap. In this case the GC "cares": it will not reclaim the managed heap memory for these instances as long as those references on the stack exist. The stack can also contain instances (not references to!) of value types, in which case the GC does not care...
If yes, then what about int data type. int is struct not class. So if we defined the object of type int, isn't it managed by GC?
This question is a bit misleading. Let's say you "allocate" an instance of an int on the stack:
void Foo()
{
    // ...
    int tTmp;
    // ...
}
In this case the GC does not care about tTmp. It is placed on the current thread's stack and removed when it goes out of scope. But if you do this:
void Foo()
{
    // ...
    var tTmp = new int[] {
        1, 2, 3, 4
    };
    // ...
}
then the array of 4 integers is created on the managed heap and the GC takes care of the array that tTmp refers to. It also "indirectly" takes care of the memory required for the contents of the array, which happens to be the space required for the four integers...

Let's see what the C# language standard (ECMA-334) says:
Value types differ from reference types in that variables of the value
types directly contain their data, whereas variables of the reference
types store references to their data, the latter being known as
objects. With reference types, it is possible for two variables to
reference the same object, and thus possible for operations on one
variable to affect the object referenced by the other variable. With
value types, the variables each have their own copy of the data, and
it is not possible for operations on one to affect the other.
In other words, there is no reason for the garbage collector to care about value-type variables as such: they contain their own data and are cleaned up together with whatever contains them (for a local, when it goes out of scope). The GC is for cleaning up shared (referenced) data.
Note that not only structures are value types:
A value type is either a struct type or an enumeration type. C#
provides a set of predefined struct types called the simple types. The
simple types are identified through reserved words.
So, for example, the "int" type is a so-called "simple type", which is a value type. It is a bit different from other structures, because operations like +, -, * and / can be compiled into primitive operations rather than function calls.
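One quick way to check which category a type falls into is the Type.IsValueType property. The sketch below is only a demonstration with a hypothetical class name; it covers a simple type, an enumeration, and two reference types.

```csharp
using System;

public static class ValueTypeCheck
{
    public static void Main()
    {
        Console.WriteLine(typeof(int).IsValueType);       // True: a simple type, i.e. a predefined struct
        Console.WriteLine(typeof(DayOfWeek).IsValueType); // True: an enumeration type
        Console.WriteLine(typeof(string).IsValueType);    // False: string is a class
        Console.WriteLine(typeof(int[]).IsValueType);     // False: arrays are always reference types
    }
}
```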
