What is best answer on interview on such question you think?
I think I didn't find a copy of this here, if there is one please link it.
Another way of looking at this - rather than just quoting the spec which says that structs can't/don't have destructors - consider what would happen if the spec was changed so that they did - or rather, let's ask the question: can we guess why did the language designers decide to not allow structs to have 'destructors' in the first place?
(Don't get hung up on the word 'destructor' here; we're basically talking about a magic method on structs that gets called automatically when the variable goes out of scope. In other words, a language feature analogous to C++'s destructors.)
The first thing to realize is that we don't care about releasing memory. Whether the object is on the stack or on the heap (eg. a struct in a class), the memory will be taken care of one way or another sooner or later; either by being popped off the stack or by being collected. The real reason for having something that's destructor-like in the first place is for managing external resources - things like file handles, window handles, or other things that need special handling to get them cleaned up that the CLR itself doesn't know about.
Now supposed you allow a struct to have a destructor that can do this cleanup. Fine. Until you realize that when structs are passed as parameters, they get passed by value: they are copied. Now you've got two structs with the same internal fields, and they're both going to attempt to clean up the same object. One will happen first, and so code that is using the other one afterwards will start to fail mysteriously... and then its own cleanup will fail (hopefully! - worst case is it might succeed in cleaning up some other random resource - this can happen in situations where handle values are reused, for example.)
You could conceivably make a special case for structs that are parameters so that their 'destructors' don't run (but be careful - you now need to remember that when calling a function, it's always the outer one that 'owns' the actual resource - so now some structs are subtly different to others...) - but then you still have this problem with regular struct variables, where one can be assigned to another, making a copy.
You could perhaps work around this by adding a special mechanism to assignment operations that somehow allows the new struct to negotiate ownership of the underlying resource with its new copy - perhaps they share it or transfer ownership outright from the old to the new - but now you've essentially headed off into C++-land, where you need copy constructors, assignment operators, and have added a bunch of subtleties waiting to trap the unaware novice programmer. And keep in mind that the entire point of C# is to avoid that type of C++-style complexity as much as possible.
And, just to make things a bit more confusing, as one of the other answers pointed out, structs don't just exist as local objects. With locals, scope is nice and well defined; but structs can also be members of a class object. When should the 'destructor' get called in that case? Sure, you can do it when the container class is finalized; but now you have a mechanism that behaves very differently depending on where the struct lives: if the struct is a local, it gets triggered immediately at end of scope; if the struct is within a class, it gets triggered lazily... So if you really care about ensuring that some resource in one of your structs is cleaned up at a certain time, and if your struct could end up as a member of a class, you'd probably need something explicit like IDisposable/using() anyhow to ensure you've got your bases covered.
So while I can't claim to speak for the language designers, I can make a pretty good guess that one reason they decided not to include such a feature is because it would be a can of worms, and they wanted to keep C# reasonably simple.
From Jon Jagger:
"A struct cannot have a destructor. A destructor is just an override of object.Finalize in disguise, and structs, being value types, are not subject to garbage collection."
Every object other than arrays and strings is stored on the heap in the same way: a header which gives information about the "object-related" properties (its type, whether it's used by any active monitor locks, whether it has a non-suppressed Finalize method, etc.), and its data (meaning the contents of all the type's instance fields (public, private, and protected intermixed, with base-class fields appearing before derived-type fields). Because every heap object has a header, the system can take a reference to any object and know what it is, and what the garbage-collector is supposed to do with it. If the system has a list of all objects which have been created and have a Finalize method, it can examine every object in the list, see if its Finalize method is unsuppressed, and act on it appropriately.
Structs are stored without any header; a struct like Point with two integer fields is simply stored as two integers. While it is possible to have a ref to a struct (such a thing is created when a struct is passed as a ref parameter), the code that uses the ref has to know what type of struct the ref points to, since neither the ref nor the struct itself holds that information. Further, heap objects may only be created by the garbage-collector, which will guarantee that any object which is created will always exist until the next GC cycle. By contrast, user code can create and destroy structs by itself (often on the stack); if code creates a struct along with a ref to it, and passes that ref it to a called routine, there's no way that code can destroy the struct (or do anything at all, for that matter) until the called routine returns, so the struct is guaranteed to exist at least until the called routine exits. On the other hand, once the called routine exits, the ref it was given should be presumed invalid, since the caller would be free to destroy the struct at any time thereafter.
Becuase by definition destructors are used to destruct instances of classes, and structs are value types.
Ref: http://msdn.microsoft.com/en-us/library/66x5fx1b.aspx
By Microsoft's own words: "Destructors are used to destruct instances of classes." so it's a little silly to ask "Why can't you use a destructor on (something that is not a class)?" ^^
Related
What is the difference between the following codes
Code 1 :
if(RecordCollections!=null){
RecordCollections.Clear();
RecordCollections=null;
}
Code 2 :
RecordCollections = null;
The code is present inside a Dispose Method , Is there any advantage over using the Clear method before making the Collection to null ? Whether it is needed at all?
Is there any advantage over using the Clear method before making the Collection to null.
Impossible to say without a good Minimal, Complete, and Verifiable code example.
That said, neither of those code snippets look very useful to me. The first one for sure would be pointless if all that the Clear() method does is to empty the collection. Of course, if it actually went through and e.g. called Dispose() on each collection member, that would be different. But that would be a very unusual collection implementation.
Even the second has very little value, and is contrary to normal IDisposable semantics. IDisposable is supposed to be just for managing unmanaged resources. It gets used sometimes for other things, but that's its main purpose. In particular, one typically only calls Dispose() just before discarding an object. If the object itself is going to be discarded, then any references it holds to other objects (such as a collection) will no longer be reachable, and so setting them to null doesn't have any useful effect.
In fact, in some cases setting a variable to null can actually extend the lifetime of an object. The runtime is sophisticated enough to recognize that a variable is no longer used, and if it holds the last remaining reference to an object, the object can become eligible for garbage collection at that point, even if the scope of the variable extends further. By setting the variable to null, the variable itself is used later in the program, and so the runtime can't treat it as unreachable until that point, later than it otherwise would have.
This last point typically applies most commonly to local variables, not fields in an object. But it's theoretically possible for the runtime to optimize more broadly. It's a bad habit to get into, to go around setting to null variables that themselves aren't going to be around much longer.
Dispose refers to a mechanism to explicitly clean up the un-managed memory, since that cannot be cleaned up using standard Garbage Collector, mostly IDisposable will be implemented by the class, which use the unmanaged API internally like Database Connection.
Standard practice is:
To also implement Finalizer along with, since Dispose is an explicit call and if missed out by caller then Finalization does take care of clean up action, though it needs 2 cycles of GC.
If a class use any of the object, as a class variable, which implements a Dispose itself, then the Dispose shall be implemented to call the Dispose on the class variable.
Regarding the code provided:
if(RecordCollections!=null){
RecordCollections.Clear();
RecordCollections=null;
}
Or
RecordCollections = null;
As this is related to cleaning up managed memory, it has little use, since GC does the main job and doesn't need it, but in my view its an acceptable practice, where class variables are explicitly nullified, which makes user vary of each and every allocation and mostly an attempt shall be made to use the method local variables, until and unless state needs to be shared across method calls. Object allocation misuse, can be much more controlled.
As far as difference is concerned, though a collection is explicitly cleared and then nullified or just nullified, the memory remains intact, till the point GC is invoked, which is un-deterministic, but in my view its not very clear, how does the GC explicitly maps the objects for collection, which are no more reachable, but for various generations, especially the higher ones (promoted objects), if an object is explicit marked as null, then GC may have to spend less / no time tracing the root / reference, however there's no explicit documentation, to explain this aspect / implementation.
C# has a unified type system in which all types, including primitive types inherit from object type.
Java also has all of its classes inherited from object type.
Quote from Thinking in java
A singly rooted hierarchy makes it much easier to implement a garbage
collector
Does,unified type system or single rooted hierarchy help for doing Garbage collection and if so how?
I think the main reason is it helps ensure that all objects have a Finalize method, thus enabling automatic compatibility with the GC.
This method is automatically called after an object becomes inaccessible, unless the object has been exempted from finalization by a call to GC.SuppressFinalize. During shutdown of an application domain, Finalize is automatically called on objects that are not exempt from finalization, even those that are still accessible. Finalize is automatically called only once on a given instance, unless the object is re-registered using a mechanism such as GC.ReRegisterForFinalize and GC.SuppressFinalize has not been subsequently called.
Well, you have a Problem here - you assume it helps because it makes the garbage collector better.
IT helps, because it can AVOID (!) garbage collection.
STRUCTS In C# - and that are all primitive types like int etc., but also all you create yourself - are NOT garbage collected UNLESS they are boxed (assigned to an object, then they get a wrapper).
This means I can make elements like a Point (with x and y as ints) has Zero Overhead in the garbage collector because it is not garbage collected.
Basically, in Java primitives are "Compiler hacks", in C# primitives are the "other side of the object hierarchy" that are structs, not classes.
So, the unified type System does not help the gargabe collector by being unified, itdoes so by not creating objects that are garbage collected in the first place.
This also leads - with generic Support - to efficient collections that do not have to Box every item. That is where things get nasty in Java as the runtime has no concept of a generics at Bytecode Level, so everything there is "object", while in C# a List is a separate type of List in Bytecode and optimized.
The only thing which a garbage collector needs is to be able to identify the size of an object and where the references in the object are (e.g. from the type)
Having one way of determining this, can make it much simpler.
What is unified in .NET is the means by which structure types and class types are defined. A structure definition
struct IntPair
{
public Int32 V1,V2;
}
defines two things. It defines a heap-object type IntPair which derives from IntPair, which in turn derives from Object, and will behave like an object because it is one. It also defines a storage-location type IntPair which will behave like a pair of integers stuck together with duct tape because it is a pair of integers stuck together with duct tape. While the heap object type derives from Object, the storage-location type isn't an object and doesn't hold one either. It holds two integers--nothing more; nothing less.
The benefit of the unified type system is that it makes it easier for the garbage-collector to handle value types which contain reference-type fields. It is necessary for the GC to be able to with 100% certainty identify all reachable references that exist throughout the universe, including those that are fields of a value type. If class-type and value-type definitions use the same data structures, the procedure for determining the size of a class object and the offset to each field can also be used for determine size of a structure along with the offset of each nested field thereof. Having value types which don't happen to contain nested references use the same data structures doesn't particularly benefit the GC, but doesn't hurt anything either. Given that value types which do contain references need to be defined in a manner the GC knows about, it's easier to have all value types defined by the same means than to have those which references defined one way, and those which don't contain references defined another.
I'm developing a game using XNA and C# and was attempting to avoid calling new struct() type code each frame as I thought it would freak the GC out. "But wait," I said to myself, "struct is a value type. The GC shouldn't get called then, right?" Well, that's why I'm asking here.
I only have a very vague idea of what happens to value types. If I create a new struct within a function call, is the struct being created on the stack? Will it simply get pushed and popped and performance not take a hit? Further, would there be some memory limit or performance implications if, say, I need to create many instances in a single call?
Take, for instance, this code:
spriteBatch.Draw(tex, new Rectangle(x, y, width, height), Color.White);
Rectangle in this case is a struct. What happens when that new Rectangle is created? What are the implications of having to repeat that line many times (say, thousands of times)? Is this Rectangle created, a copy sent to the Draw method, and then discarded (meaning no memory getting eaten up the more Draw is called in that manner in the same function)?
P.S. I know this may be pre-mature optimization, but I'm mostly curious and wish to have a better understanding of what is happening.
When a new struct is created, it's contents are put straight into the location where you specify - if it's a method variable, it goes on the stack; if it's being assigned to a class variable, it goes inside the class instance being pointed to (on the heap).
When a struct variable is copied (or, in your case, passed to a function), the bytes making up the struct are all copied to the correct place on the stack or inside the class (if you're setting a field or property on an instance of a reference type).
Even though there may be copying of bytes, the JIT compiler will likely optimize all the unneccessary copies away so that it executes as fast as possible. Generally, it's not something you need to worry about - this is very much a micro-optimization :)
Does this answer your question?
While value types go on the stack, there's still performance implications to allocating and deallocating all that memory every frame -- especially on the Xbox 360. On a PC you'll likely not notice the difference, but on the 360 you probably will.
The value types are created on the stack if declared locally or on the heap if part of an object instance (as part of the object instance). In any case, struct instances are not collected by the GC, they are destroyed when their container goes out of scope.
The MSDN struct (C#) article has some more information about this.
This is just to add to thecoops answer. For reference types the new operator allocates a new instance of the type on the heap and calls the specified constructor.
For a struct, the new operator initializes the fields according to the specified constructor. It is however possible to instantiate a struct without using new. In that case all the fields in the struct are uninitialized and cannot be used until they have been explicitly initialized.
For more info see the description on MSDN.
I come from a managed world and c++ automatic memory management is quite unclear to me
If I understand correctly, I encapsulate a pointer within a stack object and when auto_ptr becomes out of scope, it automatically calls delete on the pointed object?
What kind of usage should I make of it and how should I naturally avoid inherent c++ problems?
auto_ptr is the simplest implementation of RAII in C++. Your understanding is correct, whenever its destructor is called, the underlying pointer gets deleted.
This is a one step up from C where you don't have destructors and any meaningful RAII is impossible.
A next step up towards automagic memory management is shared_ptr. It uses reference counting to keep track of whether or not the object is alive. This allows the programmer to create the objects a bit more freely, but still not as powerful as the garbage collection in Java and C#. One example where this method fails is circular references. If A has a ref counted pointer to B and B has a ref counted pointer to A, they will never get destructed, even though no other object is using either.
Modern object orianted languages use some sort of variation of mark and sweep. This technique allows managing circular references and is reliable enough for most programming tasks.
Yes, std::auto_ptr calls delete on its content when it goes out of scope. You use auto_ptr only if no shared ownership takes place.
auto_ptr isn't particularly flexible, you can't use it with objects created with new[] or anything else.
Shared ownership is usually approached with shared pointers, which e.g. boost has implementations of. The most common usage, implemented e.g. in Boosts shared_ptr, employs a reference counting scheme and cleans up the pointee when the last smart pointer goes out of scope.
shared_ptr has one big advantage - it lets you specify custom deleters. With that you can basically put every kind of resource in it and just have to specify what deleter it should use.
Here's how you use a smart pointer. For the sake of example, I'll be using a shared_ptr.
{
shared_ptr<Foo> foo(new Foo);
// do things with foo
}
// foo's value is released here
Pretty much all smart pointers aim to achieve something similar to the above, in that the object being held in the smart pointer gets released at the end of the smart pointer's scope. However, there are three types of smart pointers that are widely used, and they have very different semantics on how ownership is handled:
shared_ptr uses "shared ownership": the shared_ptr can be held by more than one scope/object, and they all own a reference to the object. When the last reference falls off, the object is deleted. This is done using reference counting.
auto_ptr uses "transferable ownership": the auto_ptr's value can be held only in one place, and each time the auto_ptr is assigned, the assignee receives ownership of the object, and the assigner loses its reference to the object. If an auto_ptr's scope is exited without the object being transferred to another auto_ptr, the object is deleted. Since there is only one owner of the object at a time, no reference counting is needed.
unique_ptr/scoped_ptr uses "nontransferable ownership": the object is held only at the place it's created, and cannot be transferred elsewhere. When the program leaves the scope where the unique_ptr is created, the object is deleted, no questions asked.
It's a lot to take in, I'll grant, but I hope it'll all sink in soon. Hope it helps!
You should use boost::shared_ptr instead of std::auto_ptr.
auto_ptr and shared_ptr simply keep an instance of the pointer and because they are local stack objects they get deallocated when they go out of scope. Once they are deallocated they call delete on internal pointer.
Simple example, the actuall shared_ptr and auto_ptr are more sophisticated (they have methods for assignment and conversion/access to internal pointer):
template <typename T>
struct myshrdptr
{
T * t;
myshrdptr(T * p) : t(p) {}
~myshrdptr()
{
cout << "myshrdptr deallocated" << endl;
delete t;
}
T * operator->() { return t; }
};
struct AB
{
void dump() { cout << "AB" << endl; }
};
void testShrdptr()
{
myshrdptr<AB> ab(new AB());
ab->dump();
// ab out of scope destructor called
// which calls delete on the internal pointer
// which deletes the AB object
}
From somewhere else:
int main()
{
testShrdptr();
cout << "done ..." << endl;
}
output something like (you can see that the destructor is called):
AB
myshrdptr deallocated
done ...
Rather than trying to understand auto_ptr and its relation to garbage-collected references, you should really try to see the underlying pattern:
In C++, all local objects have their destructors called when they go out of scope. This can be harnessed to clean up memory. For example, we could write a class which, in its constructor, is given a pointer to heap-allocated memory, and in its destructor, frees this pointer.
That is pretty much what auto_ptr does. (Unfortunately, auto_ptr also has some notoriously quirky semantics for assignment and copying)
It's also what boost::shared_ptr or other smart pointers do. There's no magic to any of those. They are simply classes that are given a pointer in their constructor, and, as they're typically allocated on the stack themselves, they'll automatically go out of scope at some point, and so their destructor is called, which can delete the pointer you originally passed to the constructor. You can write such classes yourself. Again, no magic, just a straightforward application of C++'s lifetime rules: When a local object goes out of scope, its destructor is called.
Many other classes cut out the middleman and simply let the same class do both allocation and deallocation. For example, std::vector calls new as necessary to create its internal array -- and in its destructor, it calls delete to release it.
When the vector is copied, it takes care to allocate a new array, and copy the contents from the original one, so that each object ends up with its own private array.
auto_ptr, or smart pointers in general, aren't the holy grail. They don't "solve" the problem of memory management. They are one useful part of the recipe, but to avoid memory management bugs and headaches, you need to understand the underlying pattern (commonly known as RAII) -- that is, whenever you have a resource allocation, it should be tied to a local variable which is given responsibility for also cleaning it up.
Sometimes, this means calling new yourself to allocate memory, and then passing the result to an auto_ptr, but more often, it means not calling new in the first place -- simply create the object you need on the stack, and let it call new as required internally. Or perhaps, it doesn't even need to call new internally. The trick to memory management is really to just rely on local stack-allocated objects instead of heap allocations. Don't use new by default.
Choose an imperative language (such as C, C++, or ADA) that provides pointer types.
Redesign that language to abolish pointer types, instead allowing programmers to define recursive types directly.
Consider carefully the issue of copy semantics vs reference semantics. Implement an interpreter for the language using DrRacket .
I know that one of the differences between classes and structs is that struct instances get stored on stack and class instances(objects) are stored on the heap.
Since classes and structs are very similar. Does anybody know the difference for this particular distinction?
(edited to cover points in comments)
To emphasise: there are differences and similarities between value-types and reference-types, but those differences have nothing to do with stack vs heap, and everything to do with copy-semantics vs reference-semantics. In particular, if we do:
Foo first = new Foo { Bar = 123 };
Foo second = first;
Then are "first" and "second" talking about the same copy of Foo? or different copies? It just so happens that the stack is a convenient and efficient way of handling value-types as variables. But that is an implementation detail.
(end edit)
Re the whole "value types go on the stack" thing... - value types don't always go on the stack;
if they are fields on a class
if they are boxed
if they are "captured variables"
if they are in an iterator block
then they go on the heap (the last two are actually just exotic examples of the first)
i.e.
class Foo {
int i; // on the heap
}
static void Foo() {
int i = 0; // on the heap due to capture
// ...
Action act = delegate {Console.WriteLine(i);};
}
static IEnumerable<int> Foo() {
int i = 0; // on the heap to do iterator block
//
yield return i;
}
Additionally, Eric Lippert (as already noted) has an excellent blog entry on this subject
It's useful in practice to be able to allocate memory on the stack for some purposes, since those allocations are very fast.
However, it's worth noting that there's no fundamental guarantee that all structs will be placed on the stack. Eric Lippert recently wrote an interesting blog entry on this topic.
Every process has a data block consists of two different allocatable memory segment. These are stack and heap. Stack is mostly serving as the program flow manager and saves local variables, parameters and returning pointers (in a case of returning from the current working function).
Classes are very complex and mostly very large types compared to value types like structs (or basic types -- ints, chars, etc.) Since stack allocation should be specialized on the efficiency of program flow, it is not serving an optimal environment to keep large objects.
Therefore, to greet both of the expectations, this seperated architecture came along.
How the compiler and run-time environment handle memory management has grown up over a long period of time. The stack memory v.s. heap memory allocation decision had a lot to do with what could be known at compile-time and what could be known at runtime. This was before managed run times.
In general, the compiler has very good control of what's on the stack, it gets to decide what is cleaned up and when based on calling conventions. The heap on the other hand, was more like the wild west. The compiler did not have good control of when things came and went. By placing function arguments on the stack, the compiler is able to make a scope -- that scope can be controlled over the lifetime of the call. This is a natural place to put value types, because they are easy to control as opposed to reference types that can hand out memory locations (pointers) to just about anyone they want.
Modern memory management changes a lot of this. The .NET runtime can take control of reference types and the managed heap through complex garbage collection and memory management algorithms. This is also a very, very deep subject.
I recommend you check out some texts on compilers -- I grew up on Aho, so I recommend that. You can also learn a lot about the subject by reading Gosling.
In some languages, like C++, objects are also value types.
To find an example for the opposite is harder, but under classic Pascal union structs could only be instantiated on the heap. (normal structs could be static)
In short: this situation is a choice, not a hard law. Since C# (and Java before it) lack procedural underpinnings, one can ask themselves why it needs structures at all.
The reason it is there, is probably a combination of needing it for external interfaces and to have a performant and tight complex (container-) type. One that is faster than class. And then it is better to make it a value type.
Marc Gravell already explained wonderfully the difference regarding how value and reference types are copied which is the main differentiation between them.
As to why value types are usually created on the stack, that's because the way they are copied allows it. The stack has some definite advantages over the heap in terms of performance, particularly because the compiler can calculate the exact position of a variable created in a certain block of code, which makes access faster.
When you create a reference type you receive a reference to the actual object which exists in the heap. There is a small level of indirection whenever you interact with the object itself. These reference types cannot be created on the stack because the lifetime of values in the stack is determined, in great part, by the structure of your code. The function frame of a method call will be popped off the stack when the function returns, for example.
With value types, however, their copy semantics allows the compiler, depending on where it was created, to place it in the stack. If you create a local variable that holds an instance of a struct in a method and then return it, a copy of it will be created, as Marc explained above. This means that the value can be safely placed in the stack, since the lifetime of the actual instance is tied to the method's function frame. Anytime you send it somewhere outside the current function a copy of it will be created, so it doesn't matter if you tie the existence of the original instance to the scope of the function. Along these lines, you can also see why value types that are captured by closures need to go in the heap: They outlive their scope because they must also be accessible from within the closure, which can be passed around freely.
If it were a reference type, then you wouldn't be returning a copy of the object, but rather a reference, which means the actual value must be stored somewhere else, otherwise, if you returned the reference and the object's lifetime was tied to the scope in which it was created, it would end up pointing to an empty space in memory.
The distinction isn't really that "Value types go on the stack, reference types on the heap". The real point is that it's usually more efficient to access objects that live in the stack, so the compiler will try and place those values it can there. It simply turns out that value types, because of their copy semantics, fit the bill better than reference types.
I believe that whether or not to use stack or heap space is the main distinction between the two, perhaps this article will shed some light on your question: Csharp classes vs structs
The main difference being that the heap may hold objects that live forever while something on the stack is temporary in that it will disappear when the enclosing callsite is exited. This is because when one enters a method it grows to hold local variables as well as the caller method. When the method exits (ab)normally eg return or because of exception each frame must be popped off the stack. Eventually the interested frame is popped and everything on it lost.
The whole point about using the stack is that it automatically implements and honours scope. A variable stored on the stack exists until the functiont that created it exits and that functions stack frame is popped. Things that have local scope are natural for stack storage things that have bigger scope are more difficult to manage on the stack. Objects on the heap can have lifetimes that are controlled in more complex ways.
Compilers always use the stack for variables - value or reference it makes little difference. A reference variable doesn't have to have its value stored on the stack - it can be anywhere and the heap makes a more efficient if the object referenced is big and if there are multiple references to it. The point is that the scope of a reference variable isn't the same as the lifetime of the object it references i.e. a variable may be destroyed by being popped off the stack but the object (on the heap) it references might live on.
If a value type is small enough you might as well store it on the stack in place of a reference to it on the heap - its lifetime is tied to the scope of the variable. If the value type is part of a larger reference type then it too could have multiple references to it and hence it is more natural to store it on the heap and dissociate its lifetime from any single reference variable.
Stack and heap are about lifetimes and the value v reference semantics is almost a by product.
Have a look at Value and Reference
Value types go on the stack, reference types go on the heap. A struct is a value type.
There is no guaruantee about this in the specification though, so it might change in future releases:)