I come from a managed world, and C++ automatic memory management is quite unclear to me.
If I understand correctly, I encapsulate a pointer within a stack object, and when the auto_ptr goes out of scope, it automatically calls delete on the pointed-to object?
What kind of usage should I make of it, and how should I naturally avoid inherent C++ problems?
auto_ptr is the simplest implementation of RAII in C++. Your understanding is correct: whenever its destructor is called, the underlying pointer gets deleted.
This is one step up from C, where you don't have destructors and any meaningful RAII is impossible.
A next step up towards automagic memory management is shared_ptr. It uses reference counting to keep track of whether or not the object is alive. This allows the programmer to create the objects a bit more freely, but still not as powerful as the garbage collection in Java and C#. One example where this method fails is circular references. If A has a ref counted pointer to B and B has a ref counted pointer to A, they will never get destructed, even though no other object is using either.
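To make the circular-reference problem concrete, here is a minimal sketch. It assumes C++11's std::shared_ptr and std::weak_ptr (boost has equivalents); Node, alive and both function names are made up for the illustration. The first function builds the A <-> B cycle described above and shows that both objects survive their scope; the second replaces one link with a weak_ptr, which is the usual way around the problem.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type, used only for this illustration.
static int alive = 0;  // number of Node objects currently alive

struct Node {
    std::shared_ptr<Node> strong_peer;  // owning link: keeps the peer alive
    std::weak_ptr<Node> weak_peer;      // non-owning link: does not
    Node() { ++alive; }
    ~Node() { --alive; }
};

// A and B own each other. Returns how many nodes are still alive
// after every outside handle to them is gone.
int alive_after_strong_cycle() {
    std::weak_ptr<Node> observer;  // lets us reach A later without owning it
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_peer = b;
        b->strong_peer = a;  // cycle: neither count can drop to zero
        observer = a;
    }
    int still_alive = alive;  // 2: both nodes outlived their scope

    // Break the cycle by hand so the demo itself does not leak.
    if (auto a = observer.lock()) {
        a->strong_peer.reset();
    }
    return still_alive;
}

// Same shape, but the back link is weak, so nothing keeps A alive.
int alive_after_weak_back_link() {
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_peer = b;
        b->weak_peer = a;  // B can find A but does not own it
    }
    return alive;  // 0: both destructors ran
}
```

Called in that order from a fresh program, the first function reports 2 surviving nodes and the second reports 0.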
Garbage-collected languages such as Java and C# use some variation of mark and sweep. This technique can handle circular references and is reliable enough for most programming tasks.
Yes, std::auto_ptr calls delete on its content when it goes out of scope. You use auto_ptr only if no shared ownership takes place.
auto_ptr isn't particularly flexible: it always releases its pointee with plain delete, so you can't use it with arrays created with new[] or with any other kind of resource.
Shared ownership is usually approached with shared pointers, which Boost, for example, provides. The most common approach, implemented in Boost's shared_ptr among others, employs a reference counting scheme and cleans up the pointee when the last smart pointer goes out of scope.
shared_ptr has one big advantage - it lets you specify custom deleters. With that you can basically put every kind of resource in it and just have to specify what deleter it should use.
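A small sketch of the custom-deleter idea, assuming std::shared_ptr from C++11 (boost::shared_ptr takes a deleter the same way). The Connection type and the connection_open / connection_close functions are invented stand-ins for some C-style API that has its own release function.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style resource API, for illustration only.
struct Connection { int id; };
static int open_connections = 0;

Connection* connection_open(int id) {
    ++open_connections;
    return new Connection{id};
}

void connection_close(Connection* c) {
    --open_connections;
    delete c;
}

// The second constructor argument is the deleter: shared_ptr calls it
// instead of plain delete when the last owner goes away.
std::shared_ptr<Connection> make_connection(int id) {
    return std::shared_ptr<Connection>(connection_open(id), connection_close);
}

// Returns the number of connections still open after all owners are gone.
int open_after_scope() {
    {
        std::shared_ptr<Connection> c1 = make_connection(1);
        std::shared_ptr<Connection> c2 = c1;  // second owner, same connection
        assert(open_connections == 1);
    }  // last owner destroyed here, so connection_close runs exactly once
    return open_connections;
}
```

The same pattern works for anything with a matching "release" function, for example a FILE* paired with fclose.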
Here's how you use a smart pointer. For the sake of example, I'll be using a shared_ptr.
{
shared_ptr<Foo> foo(new Foo);
// do things with foo
}
// foo's value is released here
Pretty much all smart pointers aim to achieve something similar to the above, in that the object being held in the smart pointer gets released at the end of the smart pointer's scope. However, there are three types of smart pointers that are widely used, and they have very different semantics on how ownership is handled:
shared_ptr uses "shared ownership": the shared_ptr can be held by more than one scope/object, and they all own a reference to the object. When the last reference falls off, the object is deleted. This is done using reference counting.
auto_ptr uses "transferable ownership": the auto_ptr's value can be held only in one place, and each time the auto_ptr is assigned, the assignee receives ownership of the object, and the assigner loses its reference to the object. If an auto_ptr's scope is exited without the object being transferred to another auto_ptr, the object is deleted. Since there is only one owner of the object at a time, no reference counting is needed.
unique_ptr/scoped_ptr uses "nontransferable ownership": the object is held at the place it's created and cannot be copied elsewhere. (boost::scoped_ptr cannot be transferred at all; std::unique_ptr can hand its object over, but only through an explicit std::move.) When the program leaves the scope where the pointer lives, the object is deleted, no questions asked.
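The ownership models in this list can be tried out in a few lines. This is only a sketch under some assumptions: it uses the C++11 standard library, and because auto_ptr was deprecated in C++11 and removed in C++17, the "transfer" case is shown with std::unique_ptr and an explicit std::move rather than with auto_ptr's implicit transfer on assignment. Foo and the function names are placeholders.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Foo { int value = 42; };

// Shared ownership: copying the pointer adds an owner.
long owners_after_copy() {
    std::shared_ptr<Foo> a = std::make_shared<Foo>();
    std::shared_ptr<Foo> b = a;   // both a and b own the same Foo
    assert(a.get() == b.get());
    return a.use_count();         // 2
}

// Single ownership with explicit transfer: after the move the source
// is empty and the destination is the only owner.
bool source_empty_after_move() {
    std::unique_ptr<Foo> a(new Foo);
    std::unique_ptr<Foo> b = std::move(a);  // ownership moves from a to b
    return a == nullptr && b != nullptr;
}

// Scope-bound ownership: the object dies with the pointer's scope.
int value_read_inside_scope() {
    int seen = 0;
    {
        std::unique_ptr<Foo> local(new Foo);
        seen = local->value;
    }  // Foo deleted here
    return seen;
}
```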
It's a lot to take in, I'll grant, but I hope it'll all sink in soon. Hope it helps!
You should use boost::shared_ptr instead of std::auto_ptr.
auto_ptr and shared_ptr simply hold a pointer, and because they are local stack objects, their destructors run when they go out of scope. The destructor calls delete on the internal pointer (shared_ptr only does so once the last copy is gone).
Here is a simple example. The actual shared_ptr and auto_ptr are more sophisticated (they have methods for assignment, conversion, and access to the internal pointer):
#include <iostream>
using namespace std;

// Minimal owning wrapper: deletes the pointee in its destructor.
// Unlike the real shared_ptr it has no reference count, so it must not be copied.
template <typename T>
struct myshrdptr
{
    T * t;
    myshrdptr(T * p) : t(p) {}
    ~myshrdptr()
    {
        cout << "myshrdptr deallocated" << endl;
        delete t;
    }
    T * operator->() { return t; }
};

struct AB
{
    void dump() { cout << "AB" << endl; }
};

void testShrdptr()
{
    myshrdptr<AB> ab(new AB());
    ab->dump();
    // ab out of scope destructor called
    // which calls delete on the internal pointer
    // which deletes the AB object
}
From somewhere else:
int main()
{
    testShrdptr();
    cout << "done ..." << endl;
}
output something like (you can see that the destructor is called):
AB
myshrdptr deallocated
done ...
Rather than trying to understand auto_ptr and its relation to garbage-collected references, you should really try to see the underlying pattern:
In C++, all local objects have their destructors called when they go out of scope. This can be harnessed to clean up memory. For example, we could write a class which, in its constructor, is given a pointer to heap-allocated memory, and in its destructor, frees this pointer.
That is pretty much what auto_ptr does. (Unfortunately, auto_ptr also has some notoriously quirky semantics for assignment and copying)
It's also what boost::shared_ptr or other smart pointers do. There's no magic to any of those. They are simply classes that are given a pointer in their constructor, and, as they're typically allocated on the stack themselves, they'll automatically go out of scope at some point, and so their destructor is called, which can delete the pointer you originally passed to the constructor. You can write such classes yourself. Again, no magic, just a straightforward application of C++'s lifetime rules: When a local object goes out of scope, its destructor is called.
Many other classes cut out the middleman and simply let the same class do both allocation and deallocation. For example, std::vector calls new as necessary to create its internal array -- and in its destructor, it calls delete to release it.
When the vector is copied, it takes care to allocate a new array, and copy the contents from the original one, so that each object ends up with its own private array.
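A quick way to convince yourself of the copy behaviour (a minimal sketch; the function name is arbitrary): copy a vector, change the copy, and check that the original is untouched and that the two vectors use different internal arrays.

```cpp
#include <cassert>
#include <vector>

// Returns true if a copied vector is fully independent of its source.
bool copies_are_independent() {
    std::vector<int> original{1, 2, 3};
    std::vector<int> copy = original;  // allocates a new array, copies elements

    copy[0] = 99;                      // only the copy changes

    assert(original.size() == 3);
    return original[0] == 1
        && copy[0] == 99
        && original.data() != copy.data();  // two separate arrays
}
```

No new or delete appears in the calling code; both arrays are released by the vectors' destructors when the function returns.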
auto_ptr, or smart pointers in general, aren't the holy grail. They don't "solve" the problem of memory management. They are one useful part of the recipe, but to avoid memory management bugs and headaches, you need to understand the underlying pattern (commonly known as RAII) -- that is, whenever you have a resource allocation, it should be tied to a local variable which is given responsibility for also cleaning it up.
Sometimes, this means calling new yourself to allocate memory, and then passing the result to an auto_ptr, but more often, it means not calling new in the first place -- simply create the object you need on the stack, and let it call new as required internally. Or perhaps, it doesn't even need to call new internally. The trick to memory management is really to just rely on local stack-allocated objects instead of heap allocations. Don't use new by default.
Choose an imperative language (such as C, C++, or ADA) that provides pointer types.
Redesign that language to abolish pointer types, instead allowing programmers to define recursive types directly.
Consider carefully the issue of copy semantics vs reference semantics. Implement an interpreter for the language using DrRacket.
What is the difference between the following two code snippets?
Code 1:
if (RecordCollections != null) {
    RecordCollections.Clear();
    RecordCollections = null;
}
Code 2:
RecordCollections = null;
The code is inside a Dispose method. Is there any advantage to calling Clear before setting the collection to null? Is it needed at all?
Is there any advantage to calling Clear before setting the collection to null?
Impossible to say without a good Minimal, Complete, and Verifiable code example.
That said, neither of those code snippets looks very useful to me. The first one for sure would be pointless if all that the Clear() method does is to empty the collection. Of course, if it actually went through and e.g. called Dispose() on each collection member, that would be different. But that would be a very unusual collection implementation.
Even the second has very little value, and is contrary to normal IDisposable semantics. IDisposable is supposed to be just for managing unmanaged resources. It gets used sometimes for other things, but that's its main purpose. In particular, one typically only calls Dispose() just before discarding an object. If the object itself is going to be discarded, then any references it holds to other objects (such as a collection) will no longer be reachable, and so setting them to null doesn't have any useful effect.
In fact, in some cases setting a variable to null can actually extend the lifetime of an object. The runtime is sophisticated enough to recognize that a variable is no longer used, and if it holds the last remaining reference to an object, the object can become eligible for garbage collection at that point, even if the scope of the variable extends further. By setting the variable to null, the variable itself is used later in the program, and so the runtime can't treat it as unreachable until that point, later than it otherwise would have.
This last point typically applies most commonly to local variables, not fields in an object. But it's theoretically possible for the runtime to optimize more broadly. It's a bad habit to get into, to go around setting to null variables that themselves aren't going to be around much longer.
Dispose is a mechanism for explicitly cleaning up unmanaged resources, since these cannot be cleaned up by the garbage collector. IDisposable is mostly implemented by classes that use an unmanaged API internally, such as a database connection.
Standard practice is:
To also implement a finalizer alongside Dispose. Dispose is an explicit call, and if the caller forgets it, finalization takes care of the cleanup, though it needs two GC cycles.
If a class holds, as a class variable, any object that itself implements Dispose, then the class's Dispose should call Dispose on that variable.
Regarding the code provided:
if(RecordCollections!=null){
RecordCollections.Clear();
RecordCollections=null;
}
Or
RecordCollections = null;
As this is about cleaning up managed memory, it has little use, since the GC does the main job and doesn't need it. In my view it is still an acceptable practice to explicitly set class variables to null, because it makes the developer wary of each and every allocation. In general, prefer method-local variables unless state needs to be shared across method calls; that keeps object allocation misuse much more under control.
As for the difference: whether the collection is explicitly cleared and then set to null, or just set to null, the memory remains intact until the GC runs, which is non-deterministic. It is not very clear to me how the GC maps out the objects that are no longer reachable. For the higher generations (promoted objects) in particular, if a reference is explicitly set to null, the GC may have to spend less or no time tracing from the root, but I know of no documentation that explains this aspect of the implementation.
Structs are value types and thus are fully copied every time there is a manipulation on the struct. Since they are value types, structs are allocated in the stack and not in the heap.
I can see how structs can degrade the performance of methods when structs are passed as parameters, since they will be always copied in the stack, especially if they are big with lots of inner fields.
But I am curious about how C# deals with the return of structs.
In C the return is made by registers, or by reference using the heap if the value to be returned is too big for the registers. And practically all C# struct tutorials say structs live on the stack, never on the heap.
So in the following code:
MyStruct ms = GetMyValue();
Where GetMyValue() is
MyStruct GetMyValue();
How will C# deal with the return of the struct for the ms variable, especially if it is too big for the registers? Will it in fact copy it to the heap and then copy it back again to the caller of the method and assign it to ms?
EDIT:
To address the comments left in the post:
I have read a few tutorials on C# structs before posting this; this tutorial in particular uses the word stack more times than I care to count. And this MSDN tutorial also speaks about the stack; although it's from 2003, I don't think structs have changed since then.
I am aware this might not be related to C# at all, and might in fact be a matter of the JIT compiler itself, the CLR, or something else I am not aware of. That's the purpose of my question: to learn more about the inner workings of C#, even if this is not actually related to the language itself.
There are C calling conventions; the best support for my post is this StackOverflow post. When I first posted here I just wrote what I remembered, but the SO answer says:
As for your specific question, it depends the ABI. Sometimes if the return value is larger than 4 bytes but not larger than 8 bytes, it can be split into EAX and EDX. But most of the time the calling function will just allocate some memory (usually on the stack) and pass a pointer to this area to the called function.
I might be wrong on this one, and I say might, because the answer says usually.
The true reason why I want to understand how structs are handled is because I have a project where I have to read a Serial Port multiple times to poll for data, this data will be returned by a method.
Since the data is just some bytes I thought I could get some performance out of structs instead of using a class to abstract the bytes incoming by the Serial Port, but if the return would pass the struct as a heap allocation my expectations on performance increase could be false.
Yes, I can make a simple test and compare performance, I know, but I wanted to actually learn how it's done behind the curtains, and not only memorize the outcome of my simulation. I like to know how the things that I work with actually work, and not only learn how to use them.
Value types are not only located on a stack. They also live in fields and in arrays. The key distinction to reference types is that value types are copied by value and have no identity. The stack vs. heap idea is false.
In C the return is made by registers, or by reference using the heap if the value to be returned is too big for the registers
The heap is not involved. The caller allocates spaces for the return value to be placed in. It passes a pointer to that space. The callee can fill that space. The .NET CLR does this as well. Of course this is an implementation detail.
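As a rough picture of that caller-allocates convention, here is a hand-written C++ sketch. It is not what the CLR or any particular compiler literally emits; it only mimics the transformation by hand. Reading, get_reading and get_reading_lowered are made-up names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical struct, too large to come back in one or two registers.
struct Reading {
    std::uint8_t bytes[64];
    int length;
};

// What the programmer writes: return by value.
Reading get_reading() {
    Reading r{};  // zero-initialised local
    r.bytes[0] = 0x2A;
    r.length = 1;
    return r;
}

// Roughly what the call is turned into: the caller reserves the storage
// (typically in its own stack frame) and passes its address along.
void get_reading_lowered(Reading* result) {
    *result = Reading{};
    result->bytes[0] = 0x2A;
    result->length = 1;
}

// Both forms leave the same data in the caller's variable; no heap is used.
bool both_forms_agree() {
    Reading a = get_reading();

    Reading b;                // space reserved by the caller
    get_reading_lowered(&b);  // callee fills it in place

    assert(b.length == 1);
    return a.length == b.length && a.bytes[0] == b.bytes[0];
}
```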
but I wanted to actually learn
This is very good. A performance test could not have shown you what I just told you. You need to be a little more critical of what others tell you. Either you had bad tutorials or you read them imprecisely.
I can see how structs can degrade the performance of methods when structs are passed as parameters, since they will be always copied in the stack
This is not always the case I think. I'm not quite sure but I think the JIT can sometimes pass structs in registers. The .NET JITs really do not optimize much but I think this is an optimization that works to a certain degree. Probably driven by the existence of some one-field structs such as DateTime.
structs do not always live on the stack. if you allocate a struct inside of a function, it lives its life on the stack. if it's a field of a reference type (a class, or an array, which implicitly derives from System.Array/Object), it lives its life on the heap. as for how they're returned, that might be up to the ABI for that CPU architecture.
from the sounds of it, you've never dealt with IL/assembly/code generation, so let's build a dynamic method that's equivalent to MyStruct ms = GetMyValue(), i.e. what the compiler would generate, in the context of the word stack. "things" are never actually returned. the thing (or things, in a tuple sense, i'm sure) is pushed onto the stack, and then a return instruction is emitted, leaving the return value(s) for the caller. we're going to assume GetMyValue() allocates a new MyStruct and assigns it to a local variable. the generated code would look something like this (i extend the ILGenerator class so the calls can be chained):
ILGenerator generator = dynamicMethod.GetILGenerator();
generator
.DeclareLocal(typeof(MyStruct))
.EmitCall(OpCodes.Call, typeof(EncapsulatingClass).GetMethod("GetMyValue"))
.Emit(OpCodes.Stloc, 0);
what happens here is(some of this is my assumption on how the CLI runtime works):
the calling function reserves a slot of typeof(MyStruct) at the current local list index.
GetMyValue() is called, reserves a MyStruct local the same way the method we are building does, emits an OpCodes.Newobj, which allocates and adjusts ESP(extended stack pointer) downward in the amount of sizeof(MyStruct), emits OpCodes.Stloc to store ESP minus sizeof(MyStruct) into the reserved local index, does some stuff with its fields, calls Emit(OpCodes.Ldloc, 0) to push the address the local points to onto the evaluation stack for the calling function, and emits an OpCodes.Ret to return.
the calling function emits an OpCodes.Stloc to store(copy) the contents of the MyStruct the top of the evaluation stack points to(how this happens, well i'm sure the answer is it depends, unfortunately), at local index 0.
i'm not an expert on how the CLI runtime is constructed by any means, so a lot of this is an assumption of what happens. take it with a grain of salt, and i'm by no means a CPU engineering expert. how the instruction stream segment of OpCodes.Ldloc, OpCodes.Ret, OpCodes.Stloc -- ms = GetMyValue() -- is treated is probably up to how the JITer translates the IL into actual cpu specific machine instructions, such as X86. whether a struct will be returned in a register is probably limited to one register only, so it depends on whatever the biggest register is, and whether the struct will fit inside of it. i know CPUs can combine registers for memory offsets, but i'm not sure if that applies to returning structs inside of multiple registers.
another thing to keep in mind: GetMyValue() went out of scope, which means the struct GetMyValue() allocated, in a scope sense, doesn't exist anymore, but in a stack sense (where it was allocated), it does. so the JITer could very well have just taken the address OpCodes.Ldloc pushed onto the stack and placed it directly into the caller's local index 0, since nothing can possibly copy it anymore due to the function returning. that makes the caller the new owner of the struct, avoiding any copying and registers altogether in this special case. this might be where calling conventions come into play as well.
the problem is, if you allocated three structs in GetMyValue() for whatever reason, returning any struct after the first struct allocated would break that optimization, which is where the next optimization, returning the struct inside a register (if it fits), comes into play. that leaves the worst case scenario: copying its contents purely onto the stack again for the caller.
i could be wrong, and anyone is more than welcome to chime in and correct me. a good place to start would be github, to see how the runtime handles OpCodes.Ldloc/Stloc for structs. i would imagine that's a good spot to look when it comes to getting the answers you need.
EDIT: any tutorial you've read that says structs are always allocated on the stack, have them all DDoS'd.
What do you think is the best answer to such a question in an interview?
I think I didn't find a copy of this here, if there is one please link it.
Another way of looking at this - rather than just quoting the spec which says that structs can't/don't have destructors - consider what would happen if the spec was changed so that they did - or rather, let's ask the question: can we guess why did the language designers decide to not allow structs to have 'destructors' in the first place?
(Don't get hung up on the word 'destructor' here; we're basically talking about a magic method on structs that gets called automatically when the variable goes out of scope. In other words, a language feature analogous to C++'s destructors.)
The first thing to realize is that we don't care about releasing memory. Whether the object is on the stack or on the heap (eg. a struct in a class), the memory will be taken care of one way or another sooner or later; either by being popped off the stack or by being collected. The real reason for having something that's destructor-like in the first place is for managing external resources - things like file handles, window handles, or other things that need special handling to get them cleaned up that the CLR itself doesn't know about.
Now suppose you allow a struct to have a destructor that can do this cleanup. Fine. Until you realize that when structs are passed as parameters, they get passed by value: they are copied. Now you've got two structs with the same internal fields, and they're both going to attempt to clean up the same object. One will happen first, and so code that is using the other one afterwards will start to fail mysteriously... and then its own cleanup will fail (hopefully! - worst case is it might succeed in cleaning up some other random resource - this can happen in situations where handle values are reused, for example.)
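C++ has exactly this combination (value types with destructors), so the double-cleanup problem can be shown there directly. The sketch below is an illustration with invented names (NaiveHandle, release_handle, use); the "resource" is just a counter so that the second release is observable instead of crashing.

```cpp
#include <cassert>
#include <map>

// Hypothetical resource table: counts how often each handle is released.
static std::map<int, int> release_count;

void release_handle(int handle) { ++release_count[handle]; }

// A value type with a destructor and the default (field-by-field) copy.
struct NaiveHandle {
    int handle;
    explicit NaiveHandle(int h) : handle(h) {}
    ~NaiveHandle() { release_handle(handle); }
};

// Takes its parameter by value, so it receives a copy, and that copy's
// destructor runs when the call ends.
void use(NaiveHandle) {}

// Returns how many times handle 7 was released.
int releases_after_pass_by_value() {
    release_count.clear();
    {
        NaiveHandle original(7);
        use(original);                  // the copy releases handle 7
        assert(release_count[7] == 1);  // original is now dangling
    }                                   // original releases it again
    return release_count[7];
}
```

The function reports two releases for a resource that was acquired once. C++ deals with this through copy constructors, assignment operators and move semantics, which is the complexity the following paragraphs refer to.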
You could conceivably make a special case for structs that are parameters so that their 'destructors' don't run (but be careful - you now need to remember that when calling a function, it's always the outer one that 'owns' the actual resource - so now some structs are subtly different to others...) - but then you still have this problem with regular struct variables, where one can be assigned to another, making a copy.
You could perhaps work around this by adding a special mechanism to assignment operations that somehow allows the new struct to negotiate ownership of the underlying resource with its new copy - perhaps they share it or transfer ownership outright from the old to the new - but now you've essentially headed off into C++-land, where you need copy constructors, assignment operators, and have added a bunch of subtleties waiting to trap the unaware novice programmer. And keep in mind that the entire point of C# is to avoid that type of C++-style complexity as much as possible.
And, just to make things a bit more confusing, as one of the other answers pointed out, structs don't just exist as local objects. With locals, scope is nice and well defined; but structs can also be members of a class object. When should the 'destructor' get called in that case? Sure, you can do it when the container class is finalized; but now you have a mechanism that behaves very differently depending on where the struct lives: if the struct is a local, it gets triggered immediately at end of scope; if the struct is within a class, it gets triggered lazily... So if you really care about ensuring that some resource in one of your structs is cleaned up at a certain time, and if your struct could end up as a member of a class, you'd probably need something explicit like IDisposable/using() anyhow to ensure you've got your bases covered.
So while I can't claim to speak for the language designers, I can make a pretty good guess that one reason they decided not to include such a feature is because it would be a can of worms, and they wanted to keep C# reasonably simple.
From Jon Jagger:
"A struct cannot have a destructor. A destructor is just an override of object.Finalize in disguise, and structs, being value types, are not subject to garbage collection."
Every object other than arrays and strings is stored on the heap in the same way: a header which gives information about the "object-related" properties (its type, whether it's used by any active monitor locks, whether it has a non-suppressed Finalize method, etc.), and its data (meaning the contents of all the type's instance fields: public, private, and protected intermixed, with base-class fields appearing before derived-type fields). Because every heap object has a header, the system can take a reference to any object and know what it is, and what the garbage-collector is supposed to do with it. If the system has a list of all objects which have been created and have a Finalize method, it can examine every object in the list, see if its Finalize method is unsuppressed, and act on it appropriately.
Structs are stored without any header; a struct like Point with two integer fields is simply stored as two integers. While it is possible to have a ref to a struct (such a thing is created when a struct is passed as a ref parameter), the code that uses the ref has to know what type of struct the ref points to, since neither the ref nor the struct itself holds that information. Further, heap objects may only be created by the garbage-collector, which will guarantee that any object which is created will always exist until the next GC cycle. By contrast, user code can create and destroy structs by itself (often on the stack); if code creates a struct along with a ref to it, and passes that ref to a called routine, there's no way that code can destroy the struct (or do anything at all, for that matter) until the called routine returns, so the struct is guaranteed to exist at least until the called routine exits. On the other hand, once the called routine exits, the ref it was given should be presumed invalid, since the caller would be free to destroy the struct at any time thereafter.
Because by definition destructors are used to destruct instances of classes, and structs are value types.
Ref: http://msdn.microsoft.com/en-us/library/66x5fx1b.aspx
By Microsoft's own words: "Destructors are used to destruct instances of classes." so it's a little silly to ask "Why can't you use a destructor on (something that is not a class)?" ^^
I have been trying to figure out the intricacies of the .NET garbage collection system and I have a question related to C# reference parameters. If I understand correctly, variables defined in a method are stored on the stack and are not affected by garbage collection. So, in this example:
public class Test
{
    public Test()
    {
    }
    public int DoIt()
    {
        int t = 7;
        Increment(ref t);
        return t;
    }
    private void Increment(ref int p)
    {
        p++;
    }
}
the return value of DoIt() will be 8. Since the location of t is on the stack, then that memory cannot be garbage collected or compacted and the reference variable in Increment() will always point to the proper contents of t.
However, suppose we have:
public class Test
{
    private int t = 7;
    public Test()
    {
    }
    public int DoIt()
    {
        Increment(ref t);
        return t;
    }
    private void Increment(ref int p)
    {
        p++;
    }
}
Now, t is stored on the heap as it is a value of a specific instance of my class. Isn't this possibly a problem if I pass this value as a reference parameter? If I pass t as a reference parameter, p will point to the current location of t. However, if the garbage collector moves this object during a compact, won't that mess up the reference to t in Increment()? Or does the garbage collector update even references created by passing reference parameters? Do I have to worry about this at all? The only mention of worrying about memory being compacted on MSDN (that I can find) is in relation to passing managed references to unmanaged code. Hopefully that's because I don't have to worry about any managed references in managed code. :)
If I understand correctly, variables defined in a method are stored on the stack and are not affected by garbage collection.
It depends on what you mean by "affected". The variables on the stack are the roots of the garbage collector, so they surely affect garbage collection.
Since the location of t is on the stack, then that memory cannot be garbage collected or compacted and the reference variable in Increment() will always point to the proper contents of t.
"Cannot" is a strange word to use here. The point of using the stack in the first place is because the stack is only used for data which never needs to be compacted and whose lifetime is always known so it never needs to be garbage collected. That's why we use the stack in the first place. You seem to be putting the cart before the horse here. Let me repeat that to make sure it is clear: the reason we store this stuff on the stack is because it does not need to be collected or compacted because its lifetime is known. If its lifetime were not known then it would go on the heap. For example, local variables in iterator blocks go on the heap for that reason.
Now, t is stored on the heap as it is a value of a specific instance of my class.
Correct.
Isn't this possibly a problem if I pass this value as a reference parameter?
Nope. That's fine.
If I pass t as a reference parameter, p will point to the current location of t.
Yep. Though the way I prefer to think of it is that p is an alias for the variable t.
However, if the garbage collector moves this object during a compact, won't that mess up the reference to t in Increment()?
Nope. The garbage collector knows about managed references; that's why they're called managed references. If the gc moves the thing around, the managed reference is still valid.
If you had passed an actual pointer to t using unsafe code then you would be required to pin the container of t in place so that the garbage collector would know to not move it. You can do that using the fixed statement in C#, or by creating a GCHandle to the object you want to pin.
does the garbage collector update even references created by passing reference parameters?
Yep. It would be rather fragile if it didn't.
Do I have to worry about this at all?
Nope. You're thinking about this like an unmanaged C++ programmer -- C++ makes you do this work, but C# does not. Remember, the whole point of the managed memory model is to free you from having to think about this stuff.
Of course, if you enjoy worrying about this stuff you can always use the "unsafe" feature to turn these safety systems off, and then you can write heap and stack corrupting bugs to your heart's content.
No, you don't need to worry about it. Basically the calling method (DoIt) has a "live" reference to the instance of Test, which will prevent it from being garbage collected. I'm not sure whether it can be compacted - but I suspect it can, with the GC able to spot which variable references are part of objects being moved.
In other words - don't worry. Whether it can be compacted or not, it shouldn't cause you a problem.
It is exactly how you mention it in the last sentence. The GC will move all needed references when it compacts the heap (except for references to unmanaged memory).
Note that using the stack or heap is related to an instance variable being of a value or reference type. Value types (structs and 'simple' types like int, double, etc) are always on the stack, classes are always in the heap (what is in the stack is the reference - the pointer - to the allocated memory for the instance).
Edit: as correctly noted below in the comment, the second paragraph was written much too quickly. If a value type instance is a member of a class, it will not be stored in the stack, it will be in the heap like the rest of the members.
I know that one of the differences between classes and structs is that struct instances get stored on the stack and class instances (objects) are stored on the heap.
Since classes and structs are otherwise very similar, does anybody know the reason for this particular distinction?
(edited to cover points in comments)
To emphasise: there are differences and similarities between value-types and reference-types, but those differences have nothing to do with stack vs heap, and everything to do with copy-semantics vs reference-semantics. In particular, if we do:
Foo first = new Foo { Bar = 123 };
Foo second = first;
Then are "first" and "second" talking about the same copy of Foo? or different copies? It just so happens that the stack is a convenient and efficient way of handling value-types as variables. But that is an implementation detail.
(end edit)
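As a rough sketch of that question, the snippet below runs the same assignment with a struct and with a class. The names FooStruct, FooClass and CopySemanticsDemo are made up for illustration; only the struct/class keyword differs between the two types.

```csharp
using System;

// Hypothetical types for illustration: identical except for struct vs class.
struct FooStruct { public int Bar; }
class FooClass { public int Bar; }

static class CopySemanticsDemo
{
    // Value type: assignment copies the whole value.
    public static int StructCase()
    {
        FooStruct first = new FooStruct { Bar = 123 };
        FooStruct second = first;   // second is an independent copy
        second.Bar = 456;
        return first.Bar;           // first is unaffected
    }

    // Reference type: assignment copies only the reference.
    public static int ClassCase()
    {
        FooClass first = new FooClass { Bar = 123 };
        FooClass second = first;    // both variables refer to one object
        second.Bar = 456;
        return first.Bar;           // the change is visible through first
    }

    static void Main()
    {
        Console.WriteLine(StructCase()); // prints 123
        Console.WriteLine(ClassCase());  // prints 456
    }
}
```

Nothing in this behaviour depends on where either instance happens to be stored.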
Re the whole "value types go on the stack" thing: value types don't always go on the stack;
if they are fields on a class
if they are boxed
if they are "captured variables"
if they are in an iterator block
then they go on the heap (the last two are actually just exotic examples of the first)
i.e.
class Foo {
    int i; // on the heap
}

static void Foo() {
    int i = 0; // on the heap due to capture
    // ...
    Action act = delegate { Console.WriteLine(i); };
}

static IEnumerable<int> Foo() {
    int i = 0; // on the heap due to iterator block
    // ...
    yield return i;
}
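The boxed case from the list has no example above, so here is a minimal sketch of it. The class and method names are hypothetical; the point is only that converting an int to object copies the value into a new heap object.

```csharp
using System;

static class BoxingDemo
{
    // Boxing copies the value into a new object on the heap.
    public static int BoxedCopyIsIndependent()
    {
        int i = 42;
        object boxed = i;    // box: the value 42 is copied into a heap object
        i = 99;              // changing the local does not touch the box
        return (int)boxed;   // unbox: copies the stored value back out
    }

    // Each boxing operation allocates its own object.
    public static bool TwoBoxesShareAnObject()
    {
        int i = 42;
        object a = i;
        object b = i;
        return object.ReferenceEquals(a, b);
    }

    static void Main()
    {
        Console.WriteLine(BoxedCopyIsIndependent()); // prints 42
        Console.WriteLine(TwoBoxesShareAnObject());  // prints False
    }
}
```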
Additionally, Eric Lippert (as already noted) has an excellent blog entry on this subject
It's useful in practice to be able to allocate memory on the stack for some purposes, since those allocations are very fast.
However, it's worth noting that there's no fundamental guarantee that all structs will be placed on the stack. Eric Lippert recently wrote an interesting blog entry on this topic.
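When stack storage really is wanted, C# lets you ask for it explicitly. The following is a sketch only, assuming C# 7.2 or later (where stackalloc can be assigned to a Span<int> in safe code); StackAllocDemo and SumOfSquares are invented names.

```csharp
using System;

static class StackAllocDemo
{
    // Sums 1^2 + 2^2 + ... + n^2 using a scratch buffer on the stack.
    // Assumes n is small; a large stackalloc can overflow the stack.
    public static int SumOfSquares(int n)
    {
        Span<int> buffer = stackalloc int[n]; // explicit stack allocation
        for (int i = 0; i < buffer.Length; i++)
        {
            buffer[i] = (i + 1) * (i + 1);
        }

        int sum = 0;
        foreach (int value in buffer)
        {
            sum += value;
        }
        return sum; // the buffer disappears when the method returns
    }

    static void Main()
    {
        Console.WriteLine(SumOfSquares(4)); // prints 30
    }
}
```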
Every process has a data block that consists of two different allocatable memory segments: the stack and the heap. The stack mostly serves as the program flow manager and stores local variables, parameters and return addresses (used when returning from the currently executing function).
Classes are usually more complex and much larger than value types like structs (or basic types -- ints, chars, etc.). Since stack allocation is specialized for the efficiency of program flow, it is not an optimal environment for keeping large objects.
Therefore, to meet both expectations, this separated architecture came along.
How the compiler and run-time environment handle memory management has evolved over a long period of time. The stack vs. heap allocation decision had a lot to do with what could be known at compile time and what could only be known at run time. This was before managed runtimes.
In general, the compiler has very good control of what's on the stack, it gets to decide what is cleaned up and when based on calling conventions. The heap on the other hand, was more like the wild west. The compiler did not have good control of when things came and went. By placing function arguments on the stack, the compiler is able to make a scope -- that scope can be controlled over the lifetime of the call. This is a natural place to put value types, because they are easy to control as opposed to reference types that can hand out memory locations (pointers) to just about anyone they want.
Modern memory management changes a lot of this. The .NET runtime can take control of reference types and the managed heap through complex garbage collection and memory management algorithms. This is also a very, very deep subject.
I recommend you check out some texts on compilers -- I grew up on Aho, so I recommend that. You can also learn a lot about the subject by reading Gosling.
In some languages, like C++, objects are also value types.
To find an example for the opposite is harder, but under classic Pascal union structs could only be instantiated on the heap. (normal structs could be static)
In short: this situation is a choice, not a hard law. Since C# (and Java before it) lacks procedural underpinnings, one can ask why it needs structures at all.
The reason they are there is probably a combination of needing them for external interfaces and wanting a performant, tightly packed complex (container) type, one that is faster than a class. And then it is better to make it a value type.
Marc Gravell already explained wonderfully the difference regarding how value and reference types are copied which is the main differentiation between them.
As to why value types are usually created on the stack, that's because the way they are copied allows it. The stack has some definite advantages over the heap in terms of performance, particularly because the compiler can calculate the exact position of a variable created in a certain block of code, which makes access faster.
When you create a reference type you receive a reference to the actual object which exists in the heap. There is a small level of indirection whenever you interact with the object itself. These reference types cannot be created on the stack because the lifetime of values in the stack is determined, in great part, by the structure of your code. The function frame of a method call will be popped off the stack when the function returns, for example.
With value types, however, their copy semantics allows the compiler, depending on where it was created, to place it in the stack. If you create a local variable that holds an instance of a struct in a method and then return it, a copy of it will be created, as Marc explained above. This means that the value can be safely placed in the stack, since the lifetime of the actual instance is tied to the method's function frame. Anytime you send it somewhere outside the current function a copy of it will be created, so it doesn't matter if you tie the existence of the original instance to the scope of the function. Along these lines, you can also see why value types that are captured by closures need to go in the heap: They outlive their scope because they must also be accessible from within the closure, which can be passed around freely.
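The closure case can be sketched as follows. MakeCounter is a hypothetical helper; the local it captures has to survive after the method returns, so it cannot live in that method's stack frame.

```csharp
using System;

static class ClosureDemo
{
    // Returns a delegate that keeps using 'count' after MakeCounter has returned.
    public static Func<int> MakeCounter()
    {
        int count = 0;          // captured by the lambda below, so it outlives this call
        return () => ++count;
    }

    static void Main()
    {
        Func<int> next = MakeCounter(); // MakeCounter's frame is gone after this line
        Console.WriteLine(next());      // prints 1
        Console.WriteLine(next());      // prints 2
    }
}
```

Each call to MakeCounter produces its own independent count, which is another sign that the variable is stored per closure rather than per stack frame.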
If it were a reference type, then you wouldn't be returning a copy of the object, but rather a reference, which means the actual value must be stored somewhere else, otherwise, if you returned the reference and the object's lifetime was tied to the scope in which it was created, it would end up pointing to an empty space in memory.
The distinction isn't really that "Value types go on the stack, reference types on the heap". The real point is that it's usually more efficient to access objects that live in the stack, so the compiler will try and place those values it can there. It simply turns out that value types, because of their copy semantics, fit the bill better than reference types.
I believe that whether or not to use stack or heap space is the main distinction between the two, perhaps this article will shed some light on your question: Csharp classes vs structs
The main difference is that the heap may hold objects that live forever, while something on the stack is temporary: it disappears when the enclosing call is exited. When a method is entered, the stack grows by a frame that holds the method's local variables and the return information for the caller. When the method exits, normally (e.g. via return) or abnormally (because of an exception), its frame must be popped off the stack. Once the frame in question is popped, everything on it is lost.
The whole point of using the stack is that it automatically implements and honours scope. A variable stored on the stack exists until the function that created it exits and that function's stack frame is popped. Things that have local scope are a natural fit for stack storage; things that have a bigger scope are more difficult to manage on the stack. Objects on the heap can have lifetimes that are controlled in more complex ways.
Compilers always use the stack for variables; value or reference, it makes little difference. What a reference variable refers to doesn't have to be stored on the stack. It can be anywhere, and the heap is the more efficient choice if the referenced object is big or if there are multiple references to it. The point is that the scope of a reference variable isn't the same as the lifetime of the object it references, i.e. a variable may be destroyed by being popped off the stack, but the object (on the heap) it references might live on.
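A small sketch of that last point, with made-up names (Widget, LifetimeDemo, Registry): the local variable disappears when the method returns, but the object it referred to stays alive because something else still references it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reference type for illustration.
class Widget { public string Name = ""; }

static class LifetimeDemo
{
    // A longer-lived reference holder.
    static readonly List<Widget> Registry = new List<Widget>();

    public static void Register(string name)
    {
        Widget local = new Widget { Name = name }; // 'local' is just a reference
        Registry.Add(local);
    } // 'local' is gone here; the Widget is still reachable through Registry

    public static string RegisterAndReadBack(string name)
    {
        Registry.Clear();
        Register(name);
        return Registry[0].Name; // the object outlived the variable that created it
    }

    static void Main()
    {
        Console.WriteLine(RegisterAndReadBack("gear")); // prints gear
    }
}
```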
If a value type is small enough you might as well store it on the stack in place of a reference to it on the heap - its lifetime is tied to the scope of the variable. If the value type is part of a larger reference type then it too could have multiple references to it and hence it is more natural to store it on the heap and dissociate its lifetime from any single reference variable.
Stack and heap are about lifetimes, and the value vs. reference semantics is almost a by-product.
Have a look at Value and Reference
Value types go on the stack, reference types go on the heap. A struct is a value type.
There is no guarantee about this in the specification though, so it might change in future releases :)