Do Inner or Nested Scopes in C# influence the GC? - c#

Refering to this post as an example, do inner / nested scoping influence the GC or memory management in any way? Comming from a c++ or c standpoint anything declared and instanciated in the function that is not referenced anywhere else or declared outside of the defining scope is usually freed up after the function has dropped out of scope. Does the GC do something similar in this case? Does GC consider these inner scopes? Other than preventing declarations from being accessed outside of the scope, I dont see anything else that this feature provides.

Assuming we're just talking about reference types:
do inner / nested scoping influence the GC or memory management in any way?
In Release: no, they do not. The GC doesn't care about the scope of your local variables: it cares about when the local variable was last referenced. Letting the variable go out of scope, or assigning null to it, does nothing.
This is why GC.KeepAlive exists.
In Debug: local variables will be kept alive for the duration of the method to aid debugging, regardless of scope.
Other than preventing declarations from being accessed outside of the scope, I dont see anything else that this feature provides.
That's pretty much it. It's rare to see inner scopes which are just for scoping in C#.

IIRC, Value Types still are cleaned up.
With reference types it is more complicated. References run out of scope, but the GC is responsible for cleaning up the instances themself - eventually. See, running the GC is costly. While it Collects and Finalizes, every other thread in this Application must pause.
As a result, the GC is lazy with running. If it only runs once - on application closure - that is the ideal case. Beyond that only a special GC strategy, explicit calls of GC.Collect() (wich you should not have in production code) or the danger of a OutOfMemory exception will get it running.
When it runs, all that maters for each and every instance is "do I have a unbroken chain of references to a application root?" How long ago it ran out of scope does not mater (except for Optimsations like Generation Systems).
Do note that sometimes you have to do some cleanup work when a unmanaged resource runs out of scope/right now. That is what Finalizers, iDisposeable and using blocks handle.

In C# as well as C++ the GC has usually not much to do with it.
Scopes usually lay on the stack wich is not managed by the GC at all.
This behavior only differs for lambdas, if you refer to the outer scope from a lambda an object is created for the scopes variables and that object can get cleaned up by the GC when the lambda code is left.
The garbage collector uses the following information to determine
whether objects are live:
Stack roots. Stack variables provided by the just-in-time (JIT)
compiler and stack walker. JIT optimizations can lengthen or shorten
regions of code within which stack variables are reported to the
garbage collector.
Garbage collection handles. Handles that point to managed objects and
that can be allocated by user code or by the common language runtime.
Static data. Static objects in application domains that could be
referencing other objects. Each application domain keeps track of its
static objects.
For more details see: Fundamentals of garbage collection
And for Lambdas: Local functions compared to lambda expressions

Related

.NET GC deleting object in use

I'm running into a problem where it would appear the GC thread is waking up and deleting an object while it's in use.
While processfoo is running, and before it returns, it would appear fooCopy destructor fires in a separate thread. Has anyone seen a problem like this and if so how do you work around it? Surely, this can't be really happening and I must be doing something wrong. Can anyone give me some tips/strategies to debug the garbage collection?
try
{
CFoo fooCopy = new CFoo(someFoo);
someBar.ProcessFoo(fooCopy);
}
CFoo has an IntPtr member variable.
ProcessFoo passes that member variable to an imported function from a C++ DLL that might take a few seconds to run.
In my C++ DLL I log when the IntPtr is created and when it is deleted and I can see a separate thread deleting the IntPtr while ProcessFoo is running.
Surely, this can't be really happening and I must be doing something wrong.
Nope, it can absolutely be happening (this is why writing finalizers is hard, and you should avoid them where you can, and be very careful when forced to write them). An object can be GC-ed as soon as the runtime can prove that no code will ever try to access it again. If you call an instance method (and never access the object instance any time after that invocation) then the object is eligible for collection immediately after the last usage of this from that instance method.
Objects that are in scope (say, for example, the this variable of a method) but that are never used again within that scope are not considered rooted by the GC.
As for how you work around it, if you have a variable that you want to be considered "alive" even though it's never accessed in managed code again, use GC.KeepAlive. In this case, adding GC.KeepAlive(this) to the end of ProcessFoo will ensure that the object in question stays alive until the end of the method (or if it's not that method's responsibility, have the caller call GC.KeepAlive(someBar) right after ProcessFoo).
See this blog post for more information on this topic, and a few related and even more unusual properties of finalizers.
This is pretty normal when you interop with C++, the GC has no hope of discovering that the IntPtr is in use anywhere else. It is not a reference type that the GC has awareness of, nor can it probe the stack frames of native code. The jitter marks the fooCopy object reference in use up to the underlying CALL, not beyond that. In other words, it is eligible for collection while the native code is executing. If another thread triggers a GC then it is sayonora.
You'll find details about the lifetime of local variables in this post.
There are several possible workarounds for this, albeit that the correct one can't be guessed from the question. Beyond the SafeHandle classes, very good at ensuring the finalization is taken care of as well, you could use HandleRef instead of IntPtr in the [DllImport] declaration. Or append GC.KeepAlive() to this code to force the jitter to extend the lifetime of fooCopy.

Understanding garbage collection in .NET

Consider the below code:
public class Class1
{
public static int c;
~Class1()
{
c++;
}
}
public class Class2
{
public static void Main()
{
{
var c1=new Class1();
//c1=null; // If this line is not commented out, at the Console.WriteLine call, it prints 1.
}
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine(Class1.c); // prints 0
Console.Read();
}
}
Now, even though the variable c1 in the main method is out of scope and not referenced further by any other object when GC.Collect() is called, why is it not finalized there?
You are being tripped up here and drawing very wrong conclusions because you are using a debugger. You'll need to run your code the way it runs on your user's machine. Switch to the Release build first with Build + Configuration manager, change the "Active solution configuration" combo in the upper left corner to "Release". Next, go into Tools + Options, Debugging, General and untick the "Suppress JIT optimization" option.
Now run your program again and tinker with the source code. Note how the extra braces have no effect at all. And note how setting the variable to null makes no difference at all. It will always print "1". It now works the way you hope and expected it would work.
Which does leave with the task of explaining why it works so differently when you run the Debug build. That requires explaining how the garbage collector discovers local variables and how that's affected by having a debugger present.
First off, the jitter performs two important duties when it compiles the IL for a method into machine code. The first one is very visible in the debugger, you can see the machine code with the Debug + Windows + Disassembly window. The second duty is however completely invisible. It also generates a table that describes how the local variables inside the method body are used. That table has an entry for each method argument and local variable with two addresses. The address where the variable will first store an object reference. And the address of the machine code instruction where that variable is no longer used. Also whether that variable is stored on the stack frame or a cpu register.
This table is essential to the garbage collector, it needs to know where to look for object references when it performs a collection. Pretty easy to do when the reference is part of an object on the GC heap. Definitely not easy to do when the object reference is stored in a CPU register. The table says where to look.
The "no longer used" address in the table is very important. It makes the garbage collector very efficient. It can collect an object reference, even if it is used inside a method and that method hasn't finished executing yet. Which is very common, your Main() method for example will only ever stop executing just before your program terminates. Clearly you would not want any object references used inside that Main() method to live for the duration of the program, that would amount to a leak. The jitter can use the table to discover that such a local variable is no longer useful, depending on how far the program has progressed inside that Main() method before it made a call.
An almost magic method that is related to that table is GC.KeepAlive(). It is a very special method, it doesn't generate any code at all. Its only duty is to modify that table. It extends the lifetime of the local variable, preventing the reference it stores from getting garbage collected. The only time you need to use it is to stop the GC from being to over-eager with collecting a reference, that can happen in interop scenarios where a reference is passed to unmanaged code. The garbage collector cannot see such references being used by such code since it wasn't compiled by the jitter so doesn't have the table that says where to look for the reference. Passing a delegate object to an unmanaged function like EnumWindows() is the boilerplate example of when you need to use GC.KeepAlive().
So, as you can tell from your sample snippet after running it in the Release build, local variables can get collected early, before the method finished executing. Even more powerfully, an object can get collected while one of its methods runs if that method no longer refers to this. There is a problem with that, it is very awkward to debug such a method. Since you may well put the variable in the Watch window or inspect it. And it would disappear while you are debugging if a GC occurs. That would be very unpleasant, so the jitter is aware of there being a debugger attached. It then modifies the table and alters the "last used" address. And changes it from its normal value to the address of the last instruction in the method. Which keeps the variable alive as long as the method hasn't returned. Which allows you to keep watching it until the method returns.
This now also explains what you saw earlier and why you asked the question. It prints "0" because the GC.Collect call cannot collect the reference. The table says that the variable is in use past the GC.Collect() call, all the way up to the end of the method. Forced to say so by having the debugger attached and by running the Debug build.
Setting the variable to null does have an effect now because the GC will inspect the variable and will no longer see a reference. But make sure you don't fall in the trap that many C# programmers have fallen into, actually writing that code was pointless. It makes no difference whatsoever whether or not that statement is present when you run the code in the Release build. In fact, the jitter optimizer will remove that statement since it has no effect whatsoever. So be sure to not write code like that, even though it seemed to have an effect.
One final note about this topic, this is what gets programmers in trouble that write small programs to do something with an Office app. The debugger usually gets them on the Wrong Path, they want the Office program to exit on demand. The appropriate way to do that is by calling GC.Collect(). But they'll discover that it doesn't work when they debug their app, leading them into never-never land by calling Marshal.ReleaseComObject(). Manual memory management, it rarely works properly because they'll easily overlook an invisible interface reference. GC.Collect() actually works, just not when you debug the app.
[ Just wanted to add further on the Internals of Finalization process ]
You create an object and when the object is garbage collected, the object's Finalize method should be called. But there is more to finalization than this very simple assumption.
CONCEPTS:
Objects not implementing Finalize methods: their memory is reclaimed immediately, unless of course, they are not reachable by application code any more.
Objects implementing Finalize method: the concepts of Application Roots, Finalization Queue, Freachable Queue need to be understood since they are involved in the reclamation process.
Any object is considered garbage if it is not reachable by application code.
Assume: classes/objects A, B, D, G, H do not implement the Finalize method and C, E, F, I, J do implement the Finalize method.
When an application creates a new object, the new operator allocates memory from the heap. If the object's type contains a Finalize method, then a pointer to the object is placed on the finalization queue. Therefore pointers to objects C, E, F, I, J get added to the finalization queue.
The finalization queue is an internal data structure controlled by the garbage collector. Each entry in the queue points to an object that should have its Finalize method called before the object's memory can be reclaimed.
The figure below shows a heap containing several objects. Some of these objects are reachable from the application roots, and some are not. When objects C, E, F, I, and J are created, the .NET framework detects that these objects have Finalize methods and pointers to these objects are added to the finalization queue.
When a GC occurs (1st Collection), objects B, E, G, H, I, and J are determined to be garbage. A,C,D,F are still reachable by application code depicted as arrows from the yellow box above.
The garbage collector scans the finalization queue looking for pointers to these objects. When a pointer is found, the pointer is removed from the finalization queue and appended to the freachable queue ("F-reachable", i.e. finalizer reachable). The freachable queue is another internal data structure controlled by the garbage collector. Each pointer in the freachable queue identifies an object that is ready to have its Finalize method called.
After the 1st GC, the managed heap looks something similar to figure below. Explanation given below:
The memory occupied by objects B, G, and H has been reclaimed immediately because these objects did not have a finalize method that needed to be called.
However, the memory occupied by objects E, I, and J could not be reclaimed because their Finalize method has not been called yet. Calling the Finalize method is done by freachable queue.
A, C, D, F are still reachable by application code depicted as arrows from yellow box above, so they will not be collected in any case.
There is a special runtime thread dedicated to calling Finalize methods. When the freachable queue is empty (which is usually the case), this thread sleeps. But when entries appear, this thread wakes, removes each entry from the queue, and calls each object's Finalize method. The garbage collector compacts the reclaimable memory and the special runtime thread empties the freachable queue, executing each object's Finalize method. So here finally is when your Finalize method gets executed.
The next time the garbage collector is invoked (2nd GC), it sees that the finalized objects are truly garbage, since the application's roots don't point to it and the freachable queue no longer points to it (it's EMPTY too), therefore the memory for the objects E, I, J may be reclaimed from the heap. See figure below and compare it with figure just above.
The important thing to understand here is that two GCs are required to reclaim memory used by objects that require finalization. In reality, more than two collections cab be even required since these objects may get promoted to an older generation.
NOTE: The freachable queue is considered to be a root just like global and static variables are roots. Therefore, if an object is on the freachable queue, then the object is reachable and is not garbage.
As a last note, remember that debugging application is one thing, garbage collection is another thing and works differently. So far you can't feel garbage collection just by debugging applications. If you wish to further investigate memory get started here.

Doubts about .NET Garbage Collector

I've read some docs about the .NET Garbage Collector but i still have some doubts (examples in C#):
1)Does GC.Collect() call a partial or a full collection?
2)Does a partial collection block the execution of the "victim" application? If yes.. then i suppose this is a very "light" things to do since i'm running a game server that uses 2-3GB of memory and i "never" have execution stops (or i can't see them..).
3)I've read about GC roots but still can't understand how exactly they works. Suppose that this is the code (C#):
MyClass1:
[...]
public List<MyClass2> classList = new List<MyClass2>();
[...]
Main:
main()
{
MyClass1 a = new MyClass1();
MyClass2 b = new MyClass2();
a.classList.Add(b);
b = null;
DoSomeLongWork();
}
Will b ever be eligible to be garbage collected(before the DoSomeLongWork finishes)? The reference to b that classList contains, can it be considered a root?
Or a root is only the first reference to the instance? (i mean, b is the root reference because the instantiation happens there).
There are a couple of overloads for GC.Collect. The one without any parameters does a full collect. But keep in mind that it is rarely a good idea to call GC explicitly.
Garbage collections occurs within a managed application. The GC may have to suspend all managed threads in that process in order to compact the heap. It doesn't affect other processes on the system if that is what you're asking. For most applications it is usually not an issue, but you may experience performance related problems due to frequent collects. There are a couple of relevant performance counters (% time in GC, etc.) that you can use to monitor how the GC is performing.
Think of a root as a valid reference. In your example the instance of MyClass2 will not be collected if there are still references to it. I.e. if the instance pointed to by a is still rooted so will your instance of MyClass2. Also, keep in mind that GC is much more aggressive in release mode builds than debug builds.
GC.Collect() does a full collection. That's why it's not a good idea to call it yourself, as it can prematurely promote objects up a generation
AFAIK, not really. The GC blocks for so little time as to be inconsequential.
A stack GC root is an object that is currently being referenced by a method on the stack. Note that the behaviour is different between debug and release builds; on debug builds all variables in a method are kept alive until the method returns (so you can see their value in the debug window). On release builds the CLR keeps track of where method variables are used and the GC can remove objects that are referenced by the currently executing method but aren't used in the section of the method that is still to execute.
So, in your example, both a and b are not referenced again after the DoSomeLongWork() call, so in release builds, during that method execution they would both be eligible for collection. In debug builds they would hang around until the main() method returns.
Garbage collection is automatic. You don't have to interfere unless you are dealing with unmanaged resources. Garbage gets collected whenever the object is going out of scope; and at specific intervals whenever garbage-collector deems necessary - for instance, OS requires memory. This means there is no guarantee on how soon that will be, but your application will not run out of memory before memory from those objects is reclaimed.
Will b ever be eligible to be garbage collected(before the DoSomeLongWork finishes)?
Yes, whenever garbage collector finds it necessary.
Checkout Garbage Collector Basics and Performance Hints
Yes, GC.Collect() does a full collect, but you can do a manual GC.Collect(0) to do the gen 0 only.
All collections block the threads, but a partial collect does so only very briefly.
No, the instance of MyClass2 is still reachable in the other list. Both a and b are root references (during DoSomeLongWork) but since b is null that doesn't matter. Note that the GC is very tied in with the concept of reference types. b is only a local var referencing the anonymous object. The 'roots' for the GC are static fields, everything on the stack and even CPU registers. In a simpler way: everything your code still has access to.
Will b ever be eligible to be garbage collected (before the DoSomeLongWork finishes)?
Potentially yes, provided a and b are expelled from the set of global roots by the compiler (i.e. not kept on the stack) then they can be reclaimed by a collection during DoSomeLongWork.
I have found cases where .NET reclaims happily but Mono leaks memory.
The reference to b that classList contains, can it be considered a root?
Whether or not b will be turned into a global root depends entirely upon the compiler.
A toy compiler might push references from function arguments and returned from function calls onto the stack and unwind the stack at the end of each function. In this case, b will be pushed onto the stack and, therefore, will be a global root.
Production quality compilers perform sophisticated register allocation and maintain only live references on the stack, either overwriting or nulling references on the stack as they die. In this case, b is dead during the call to DoSomeLongWork so its stack entry will have been nulled or overwritten.
None of this can be inferred from the source code without details of what exactly the compiler will do. For example, my HLVM project is only a toy at this stage, using the former technique, but it will actually collect a and b in this case because the call to DoSomeLongWork is a tail call.
Or a root is only the first reference to the instance?
The global roots are the references the GC begins with in order to traverse all of the reachable data in the heap. The global roots are typically global variables and threads' stacks but more sophisticated GC algorithms can introduce new kinds of global roots, e.g. remembered sets.

Garbage collection not functioning as expected

This may likely be an issue with my inexperience using a Managed Language. The issue essentially is a loop within an objects method, which executes over about 20 seconds, throughout the duration of this loop the programs overall memory usage constantly goes up. Now all the variables within the loop which are modified are variables defined within the loops scope (ie. no class members are changed/re-allocated within the loop). After the entire method completes the excess memory is still being used.
I have absolutely no idea why/where this issue is but here are some things which may be a factor:
I use fonts within the loop, but I '.Dispose()' of them and have verified that there is no GDI leak.
I have try/catch statements which are in active use.
Objects are allocated...
So again, any ideas on where this issue may be coming from would be very helpful, I would post code but there is a lot of it. Also as mentioned above the memory is not cleaned up after the method call is done, and even after the object on which the method was invoked has gone out of scope.
Edit
I also just tried the GC.Collect() method, and nothing has changed in the overall result. I have no idea, but does this mean that the memory is not considered 'Garbage'? Again all the allocation is done within the scope of the loop so shouldn't it be considered garbage after the loop terminates. I understand the GC won't immediately get to cleaning it up, but using the GC.Collect() call should force that?
.NET uses traced garbage collection instead of the classic reference counting mechanism.
As soon as your .NET code releases an object or data, it doesn't get cleaned up instantly. It sits around for a while before being cleaned up. The garbage collector is a separate entity wandering around.
Microsoft states about the garbage collector
However, memory is not infinite.
Eventually the garbage collector must
perform a collection in order to free
some memory.
The garbage collector will come around at its own leisure based on complex algorithms. It will eventually clean everything up, if not at the end of the program lifetime. It's not recommended we poke or prod the garbage collector through System.GC members because we're supposed to assume it knows best.
Garbage collector will release objects if there is no pointer pointing to them. Be sure you don't keep not necessary objects in your variables (especially in arrays).

How do you get rid of an object in c#

In the following c# code, how do I get rid of the objects when it's no longer useful? Does it get taken care of automatically, or do I need to do something?
public void Test()
{
object MyObject = new object();
... code ...
}
Automatically. When MyObject goes out of scope (at the end of Test), it's flagged for garbage collection. At some point in the future, the .NET runtime will clean up the memory for you
Edit:
(I should point out for the sake of completeness that if you pass MyObject somewhere, it gets passed by reference, and will not be garbage collected - it's only when no more references to the object are floating around that the GC is free to collect it)
Edit: in a release build, MyObject will usually be collected as soon as it's unused (see my answer for more details --dp)
The short answer is: unless it has unmanaged resources (file handles etc) you don't need to worry.
The long answer is a bit more involved.
When .NET decides it wants to free up some memory, it runs the garbage collector. This looks for all the objects which are still in use, and marks them as such. Any local variable (in any stack frame of any thread) which may still be read counts as a root as do static variables. (In fact I believe that static variables are referenced via live Type objects, which are referenced via live AppDomain objects, but for the most part you can regard static variables as roots.)
The garbage collector looks at each object referred to by a root, and then finds more "live" references based on the instance variables within those objects. It recurses down, finding and marking more and more objects as "live". Once it's finished this process, it can then look at all the rest of the objects and free them.
That's a very broad conceptual picture - but it gets a lot more detailed when you think of the generational model of garbage collection, finalizers, concurrent collection etc. I strongly recommend that you read Jeff Richter's CLR via C# which goes into this in a lot of detail. He also has a two part article (back from 2000, but still very relevant) if you don't want to buy the book.
Of course all this doesn't mean you don't need to worry about memory usage and object lifetimes in .NET. In particular:
Creating objects pointlessly will cost performance. In particular, the garbage collector is fast but not free. Look for simple ways to reduce your memory usage as you code, but micro-optimising before you know you have a problem is also bad.
It's possible to "leak" memory by making objects reachable for longer than you intended. Two reasonably common causes of this are static variables and event subscriptions. (An event subscription makes the event handler reachable from the event publisher, but not the other way round.)
If you use more memory (in a live, reachable way) than you have available, your app will crash. There's not a lot .NET can do to prevent that!
Objects which use non-memory resources typically implement IDisposable. You should call Dispose on them to release those resources when you're finished with the object. Note that this doesn't free the object itself - only the garbage collector can do that. The using statement in C# is the most convenient way of calling Dispose reliably, even in the face of an exception.
The other answers are correct, unless your object is an instance of a class which implements the IDisposable interface, in which case you ought to (explicitly, or implicitly via a using statement) call the object's Dispose method.
In optimized code, it is possible and likely that MyObject will be collected before the end of the method. By default, the debug configuration in Visual Studio will build with the debug switch on and the optimize switch off which means that MyObject will be kept to the end of the method (so that you can look at the value while debugging). Building with optimize off (debug doesn't matter in this case) allows MyObject to be collected after it's determined to be unused. One way to force it to stay alive to the end of the method is to call GC.KeepAlive(MyObject) at the end of the method.
This will force the garbage collector to get rid of unused objects.
GC.Collect();
GC.WaitForPendingFinalizers();
If you want a specific object to be collected.
object A = new Object();
...
A = null;
GC.collect();
GC.WaitForPendingFinalizers();
It gets taken care of automatically.
Typically garbage collection can be depended on to cleanup, but if your object contains any unmanaged resources (database connections, open files etc) you will need to explicitly close those resources and/or exceute the dispose method on the object (if it implements IDisposable). This can lead to bugs so you need to be careful how you deal with these types of objects. Simply closing a file after using it is not always sufficient since an exception before the close executes will leave the file open. Either use a using block or a try-finally block.
Bottom line: "using" is your friend.

Categories

Resources