Weird behaviour of .NET Garbage Collector


MSDN says for GC.Collect()
All objects, regardless of how long they have been in memory, are
considered for collection; however, objects that are referenced in
managed code are not collected. Use this method to force the system
to try to reclaim the maximum amount of available memory.
So I would expect that a Child class that is still referenced in a Parent class is not collected before the Parent is collected.
But the weird thing is that it is MOSTLY collected BEFORE the parent is collected. This does not make any sense to me.
I compile the following code on VS2010 and run it on framework 4.0.
What I get is this:
using System;
namespace GarbageCollector
{
    class Child
    {
        public bool bInUse = true;
        public void Dispose()
        {
            Console.WriteLine("Child finished by Parent.");
            bInUse = false;
        }
        ~Child()
        {
            bInUse = false;
        }
    }
    class Parent
    {
        Child child = new Child();
        ~Parent()
        {
            if (!child.bInUse)
                Console.WriteLine("Finalizing Child that is still in use in a Parent!");
            child.Dispose();
        }
    }
    class Program
    {
        static void Main(string[] args)
        {
            while (true)
            {
                for (int i = 0; i < 10; i++)
                {
                    Parent P = new Parent();
                }
                GC.Collect();
            }
        }
    }
}
Can anybody explain to me what is going on here?
EDIT:
I already found out how to solve the problem. If you want to access class members in the finalizer of your class, this can be a problem if those members themselves also have a finalizer. In this case the members may already have been finalized before the finalizer of your class can access them, because the garbage collector finalizes them in ANY order (Child before Parent, or Child after Parent).
BUT if you access class members that do NOT have their own finalizer, this problem does not appear.
So if you want to store, for example, a list of handles in your class and you want to close these handles in the finalizer, then make sure that this list class does NOT have its own finalizer, otherwise your handles may be gone before you can close them!

So I would expect that a Child class that is still referenced in a Parent class is not collected before the Parent is collected.
That's not a true assumption. The GC is free to collect any object so long as it can prove that the object is no longer accessible from any code that will run at any point in the future. It is allowed to collect any object at that point, but it is free to collect, or leave, any of the objects meeting that condition. If an object references another, but neither are rooted or accessible from any rooted object, the GC is free to delete them in any order, or even to delete the child and not the parent.
It's also worth noting that your code demonstrates nothing about collection order. The finalizer for an object may run at any point between when the object becomes eligible for collection and when it is actually collected. Even if the finalizers for both objects run, the order in which they run isn't guaranteed to be the order in which the objects themselves are collected.
Of course, in practice, odds are very high that both objects will actually be collected at exactly the same time, unless one of the objects has existed for much longer than the other. The GC works by considering all objects (in a given generation) as "dead", and then copying those that are still "alive" into a new section, leaving everything that wasn't copied to be overwritten whenever something happens to need that memory. So if both objects are in the same generation (which is probable), the memory for both becomes available to be overwritten at exactly the same instant. As for when that memory is actually overwritten, it'd be very hard to even find out (if it is ever overwritten at all).
So in the end the entire concept behind the quoted expectation isn't really a sensible premise, on many different levels.
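As a rough sketch of the alternative (not code from the question), the Parent/Child pair could be rewritten so that cleanup is driven by Dispose and the parent's finalizer never touches the finalizable child. The `InUse` and `ChildInUse` members and the `Dispose(bool)` split are assumptions made for this illustration.
```csharp
using System;

sealed class Child : IDisposable
{
    private bool inUse = true;
    public bool InUse { get { return inUse; } }

    public void Dispose()
    {
        inUse = false;
        GC.SuppressFinalize(this);   // explicit cleanup done, no finalizer run needed
    }

    ~Child()
    {
        inUse = false;
    }
}

sealed class Parent : IDisposable
{
    private readonly Child child = new Child();
    private bool disposed;

    public bool ChildInUse { get { return child.InUse; } }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    private void Dispose(bool disposing)
    {
        if (disposed) return;
        disposed = true;
        if (disposing)
        {
            // Reached only from an explicit Dispose call, while the parent
            // (and therefore the child) is still reachable and unfinalized.
            child.Dispose();
        }
        // disposing == false means we are on the finalizer thread:
        // the child's finalizer may already have run, so leave it alone.
    }

    ~Parent()
    {
        Dispose(false);
    }
}

static class Program
{
    public static void Main()
    {
        using (var parent = new Parent())
        {
            Console.WriteLine("Child in use: " + parent.ChildInUse);
        }
    }
}
```
With this shape, the order in which the two finalizers run no longer matters, because neither finalizer depends on the other object's state.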

Related

In C# Can I stop an object from being garbage collected, from the finalizer?

Or is it already too late if the finalize method is reached?
Basically I'm creating some code to log to a MySql database. Each log entry is represented by an object and stored in a queue until it gets flushed to the database in a batch insert / update. I figured it'd be inefficient to create a new object on the heap every time I wanted to write an entry (especially since I might want to write an entry or two in performance sensitive areas). My solution was to create a pool of objects and reuse them.
Basically I'm trying not to re-invent the wheel, by having the .NET garbage collector let me know when an object is no longer needed and can be added back to the pool. The problem is that I need a way to abort garbage collection from the destructor. Is that possible?

Can you? Yes.
Should you? No, it is almost certainly a terrible idea.
The general rule C# developers should remember is the following:
If you find yourself writing a finalizer, you probably did something wrong.
The memory allocators used by well-established managed VMs (such as the CLR or JVM) are extremely fast. One of the things that slows down the garbage collector in these systems is the use of customized finalizers. In an effort to optimize the runtime, you are actually giving up a very fast operation in favor of a much slower operation. Furthermore, the semantics of "bringing an object back to life" are difficult to understand and reason about.
Before you consider using a finalizer, you should understand everything in the following articles.
Never write a finalizer again (well, almost never)
DG Update: Dispose, Finalization, and Resource Management
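If pooling still turns out to be worthwhile after measuring, one option that avoids finalizers entirely is a pool with an explicit return step. The sketch below is only an illustration under that assumption; `LogEntry`, `LogEntryPool`, `Rent` and `Return` are hypothetical names, not part of the question's code.
```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical log entry type; the fields are illustrative only.
sealed class LogEntry
{
    public string Message;
    public DateTime Timestamp;

    public void Reset()
    {
        Message = null;
        Timestamp = default(DateTime);
    }
}

// Callers rent an entry and hand it back explicitly once it has been flushed,
// so neither a finalizer nor resurrection is involved.
sealed class LogEntryPool
{
    private readonly ConcurrentBag<LogEntry> entries = new ConcurrentBag<LogEntry>();

    public int Count { get { return entries.Count; } }

    public LogEntry Rent()
    {
        LogEntry entry;
        return entries.TryTake(out entry) ? entry : new LogEntry();
    }

    public void Return(LogEntry entry)
    {
        entry.Reset();        // clear old data before the entry is reused
        entries.Add(entry);
    }
}

static class Program
{
    public static void Main()
    {
        var pool = new LogEntryPool();
        LogEntry entry = pool.Rent();
        entry.Message = "example";
        entry.Timestamp = DateTime.UtcNow;
        // ... the batch insert would happen here ...
        pool.Return(entry);
        Console.WriteLine("Entries waiting in the pool: " + pool.Count);
    }
}
```
The code that flushes the queue already knows when an entry is finished, so it is the natural place to call `Return`.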

Connection pooling is a feature virtually any major DB connection implementation is already going to natively support, so there is no reason to handle this manually. You'll be able to simply create a new connection for each operation and know that behind the scenes the connections will actually be pooled.
To answer the literal question that you asked, yes. You can ensure that an object is not going to be GCed after it is finalized. You can do so simply by creating a reference to it from some "live" location.
This is a really bad idea though. Take a look at this example:
using System;
public class Foo
{
    public string Data;
    public static Foo instance = null;
    ~Foo()
    {
        Console.WriteLine("Finalized");
        // Storing 'this' in a static field makes the object reachable again.
        instance = this;
    }
}
public static class Program
{
    public static void Bar()
    {
        // The new object is unreachable as soon as this statement completes.
        new Foo() { Data = "Hello World" };
    }
    static void Main(string[] args)
    {
        Bar();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine(Foo.instance.Data);
        Foo.instance = null;
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}
This will print out:
Finalized
Hello World
So here we had an object end up being finalized, and we then accessed it later on. The problem however is that this object has been marked as "finalized". When it is finally hit by the GC again it's not finalized a second time.

You could re-register for finalization in the destructor, like so:
~YourClass()
{
    System.GC.ReRegisterForFinalize(this);
}
And from there you'd probably want something to hold a reference to the object so that it isn't simply finalized again right away, but this is a way to do it.
http://msdn.microsoft.com/en-us/library/system.gc.reregisterforfinalize(v=vs.110).aspx
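A slightly fuller sketch of that idea follows. It is an illustration only: the `Recyclable` class, its counters, and the cap of two finalizer runs per instance are assumptions added here. When each finalizer run happens depends on the runtime and build settings, so no particular output is promised.
```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

class Recyclable
{
    public static int TotalFinalizerRuns;     // across all instances, for observation only
    public static Recyclable Resurrected;      // static root that keeps an instance alive
    private int timesFinalized;

    ~Recyclable()
    {
        Interlocked.Increment(ref TotalFinalizerRuns);
        timesFinalized++;
        // Resurrect at most once per instance, and never during shutdown,
        // so the object cannot keep itself alive forever.
        if (timesFinalized < 2 && !Environment.HasShutdownStarted)
        {
            Resurrected = this;                // reachable again through the static field
            GC.ReRegisterForFinalize(this);    // request another finalizer run later
        }
    }
}

static class Program
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Allocate()
    {
        new Recyclable();                      // unreachable once this method returns
    }

    public static void Main()
    {
        Allocate();
        for (int round = 1; round <= 2; round++)
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
            Console.WriteLine("Round " + round + ": finalizer runs = " + Recyclable.TotalFinalizerRuns);
            Recyclable.Resurrected = null;     // drop the root so the object is unreachable again
        }
    }
}
```
Without the `ReRegisterForFinalize` call, the second round would collect the object without running its finalizer again.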

Will the garbage collector free this List when my method finishes running

Let's say I have something like:
public class Item
{
    public string Code;
    public List<string> Codes = new List<string>();
}
private void SomeMethod()
{
    List<Item> Items = new List<Item>();
    for (int i = 0; i < 10; i++)
    {
        Item NewItem = new Item();
        NewItem.Code = "Something " + i.ToString();
        NewItem.Codes.Add("Something " + i.ToString());
        Items.Add(NewItem); // add the new instance, not the type name
    }
    //Do something with Items
}
I am instantiating Item and not free'ing it because I need to have access to it in the list later on (rough example).
What I am wondering is when SomeMethod() has finished executing, will Items (and the contents of it - including the List<>) be de-referenced and allow the garbage collector to clean up the memory as and when it runs? Basically, will this section of code cause any memory leaks or should everything be de-referenced when SomeMethod() has finished processing.
My understanding is that when nothing holds a reference to an object it will be garbage collected so in my mind this code should be Ok but I just wanted to make sure I understand correctly.
EDIT:
If I was to add one of the objects Items is holding into another list that would still be in scope (a global list for example). What would happen?

Once your variable Items goes out of scope, the garbage collector will indeed dispose of it (and its contents, as your code is written) at its leisure.

The garbage collector will collect anything that is no longer reachable by code. Since you will no longer be able to reach your Items list or any of the Items contained within the list, then the garbage collector will collect them at some point. Your understanding of the garbage collector is correct.

Ok,
List<Item> Items
is declared in the scope of the method. So when the method ends, it goes out of scope and the list is no longer referenced.
The memory will be released at some point after that, when the garbage collector sees fit.
As an aside, since Items is declared in a "local" scope I'd prefer to call it items.

All the variables in the method live in a stack frame, and they are destroyed along with the stack frame after the method has executed.
That is to say, after SomeMethod() has executed, the variable Items no longer exists, so the new List() becomes eligible for collection.
As for the second question: if a global list variable holds a reference to one of the objects in Items, then after SomeMethod() has executed, the List in SomeMethod becomes eligible for collection, and so do all the objects in it except the one referenced by the global list.
The reason is that a list holds references that point to the actual objects on the heap.
So the shared object cannot be collected, because it is still referenced by the global list.

In order to keep things sane and keep its job, GC has to promise two things:
Garbage will be collected. This seems pretty obvious, but if the GC doesn't promise to clean up managed objects without you having to tell it to, it's rather useless.
ONLY garbage will be collected. You don't have to worry about the GC cleaning up objects that are still in play.
So, once Items goes out of scope (that is, once the function returns), it's no longer reachable by running code, so it's eligible for collection without you having to do anything. Since the items in the list are no longer reachable either (their only link was in the list), they're eligible too. If you returned a reference to an entry in the list, though, that object is still in play and can't be collected yet. GC will do the right thing.
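The difference between the two cases can be observed with weak references. This is only a sketch: `GlobalItems` stands in for the "global list" from the question's edit, and the result for the local-only item is typical rather than guaranteed, since the GC decides when to reclaim it.
```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

class Item
{
    public string Code;
}

static class Program
{
    // Stands in for the "global list" from the question's edit.
    static readonly List<Item> GlobalItems = new List<Item>();

    [MethodImpl(MethodImplOptions.NoInlining)]
    internal static WeakReference[] SomeMethod()
    {
        var items = new List<Item>();
        for (int i = 0; i < 2; i++)
        {
            items.Add(new Item { Code = "Something " + i });
        }
        GlobalItems.Add(items[0]);   // items[0] stays reachable through the static list
        // Weak references let us observe the objects without keeping them alive.
        return new[] { new WeakReference(items[0]), new WeakReference(items[1]) };
    }

    public static void Main()
    {
        WeakReference[] refs = SomeMethod();
        GC.Collect();
        Console.WriteLine("Shared item alive:     " + refs[0].IsAlive);
        Console.WriteLine("Local-only item alive: " + refs[1].IsAlive);
    }
}
```
The shared item must still be alive because the static list references it; the local-only item is merely eligible for collection.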
In fact, GC almost always does the right thing. There are really only three major cases you have to worry about:
If you're building a container (like if you're making your own List workalike). Any objects that no longer "exist" in the container as far as the user is concerned should generally be set to null so the GC doesn't see them as reachable through your collection and erroneously keep them alive. If you remove an item from the collection, for example, null out the reference or overwrite it with another.
With large objects (85,000 bytes or more, which are allocated on the large object heap). They'll typically stick around in memory until a full GC cycle runs, because the large object heap is only collected together with generation 2. (Normally, only certain likely-to-be-discarded objects are checked during a cycle. Full collections check pretty much everything, which takes significantly longer but might free more memory.)
If you're using IDisposable objects or native/unmanaged resources. Some incompetents don't know how to implement IDisposable correctly. If you're dealing with a library created by such an incompetent, then you'll need to ensure that you Dispose stuff, or only ever use it within a using block, or things can get really weird. (If you only use the .NET API, you're pretty safe. But disposing is still good manners.)
For pretty much all other occasions, GC just works. Trust it.
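The container case above can be sketched with a tiny array-backed stack. `SimpleStack` is a hypothetical type written for this illustration, not a framework class.
```csharp
using System;

class SimpleStack<T> where T : class
{
    private T[] slots = new T[4];
    private int count;

    public int Count { get { return count; } }

    public void Push(T item)
    {
        if (count == slots.Length)
        {
            Array.Resize(ref slots, count * 2);
        }
        slots[count++] = item;
    }

    public T Pop()
    {
        if (count == 0) throw new InvalidOperationException("Stack is empty.");
        T item = slots[--count];
        // Without this line the backing array would keep referencing the popped
        // object, and the GC would treat it as reachable through the stack.
        slots[count] = null;
        return item;
    }
}

static class Program
{
    public static void Main()
    {
        var stack = new SimpleStack<string>();
        stack.Push("a");
        stack.Push("b");
        Console.WriteLine(stack.Pop());
        Console.WriteLine(stack.Count);
    }
}
```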

Store 'this' at finalization

How can code be written that stores 'this' during finalization? How should the garbage collector behave in that case (if that is defined somewhere)?
In my mind, the GC should finalize the class instance multiple times, and the following test application should print "66". However, the finalizer is executed only once, causing the application to print "6".
A few lines of code:
using System;
namespace Test
{
    class Finalized
    {
        ~Finalized()
        {
            Program.mFinalized = this;
        }
        public int X = 5;
    }
    class Program
    {
        public static Finalized mFinalized = null;
        static void Main(string[] args)
        {
            Finalized asd = new Finalized();
            asd.X = 6;
            asd = null;
            GC.Collect();
            if (mFinalized != null)
                Console.Write("{0}", mFinalized.X);
            mFinalized = null;
            GC.Collect();
            if (mFinalized != null)
                Console.Write("{0}", mFinalized.X);
        }
    }
}
What I'm trying to do is understand how finalizers manage instance memory. In my application it could be desirable to re-use the instance reference for further processing.
It's clear that the finalizer doesn't "free" memory (at least in my test application). Can the memory chunk be reused for other purposes? Or even freed? And if it isn't, would that be a memory leak, or what?
Now, I'm confused more than before.

This is due to resurrection. By storing the object in another variable during finalization (assigning this to a variable), you resurrect the object instance as far as the GC is concerned. You are allowed to resurrect your object in .NET, and you can actually cause the GC to finalize the object more than once, but you have to explicitly request it via GC.ReRegisterForFinalize.
For details, see Automatic Memory Management in the Microsoft .NET Framework.

GC.Collect does a sweep, special-casing any objects with a finalizer and not collecting them. Once these finalizer objects have finalized, GC then runs again over these objects. If they're no longer eligible for collection (by re-rooting, as you do), so be it. Normally the finalizer only runs once, but IIRC, you can request that it runs again.

Finalizer only gets called once. You're free to assign self to somewhere, and prevent the object being garbage collected. But once the object is available again for GC, it doesn't run the finalizer.

I'm interested in any good uses of resurrected objects.
The MSDN states "There are very few good uses of resurrection, and you really should avoid it if possible".
Also, Bill Wagner in his Effective C# says "You cannot make this kind of construct work reliably. Don't try". But the book is 2 years old, so maybe something has changed?

Which objects can I use in a finalizer method?

I have a class that should delete some file when disposed or finalized. Inside finalizers I can't use other objects because they could have been garbage-collected already.
Am I missing some point regarding finalizers, and can strings be used?
UPD: Something like that:
public class TempFileStream : FileStream
{
    private string _filename;
    public TempFileStream(string filename)
        : base(filename, FileMode.Open, FileAccess.Read, FileShare.Read)
    {
        _filename = filename;
    }
    protected override void Dispose(bool disposing)
    {
        base.Dispose(disposing);
        if (_filename == null) return;
        try
        {
            File.Delete(_filename); // <-- oops! _filename could be gc-ed already
            _filename = null;
        }
        catch (Exception e)
        {
            ...
        }
    }
}

Yes, you can most certainly use strings from within a finalizer, and many other object types.
For the definitive source of all this, I would go pick up the book CLR via C#, 3rd edition, written by Jeffrey Richter. In chapter 21 this is all described in detail.
Anyway, here's what is really happening...
During garbage collection, any objects that have a finalizer that still wants to be called are placed on a special list, called the freachable list.
This list is considered a root, just as static variables and live local variables are. Therefore, any objects those objects refer to, and so on recursively, are removed from this garbage collection cycle. They will survive the current garbage collection cycle as though they weren't eligible for collection to begin with.
Note that this includes strings, which was your question, but it also involves all other object types.
Then, at some later point in time, the finalizer thread picks up the object from that list, and runs the finalizer on those objects, and then takes those objects off that list.
Then, the next time garbage collection runs, it finds the same objects once more, but this time the finalizer no longer wants to run, it has already been executed, and so the objects are collected as normal.
Let me illustrate with an example before I tell you what doesn't work.
Let's say you have objects A through Z, and each object references the next one, so you have object A referencing object B, B references C, C references D, and so on until Z.
Some of these objects implement finalizers, and they all implement IDisposable. Let's assume that A does not implement a finalizer but B does, and that some of the rest do as well; beyond A and B, it's not important for this example which ones do.
Your program holds onto a reference to A, and only A.
In an ordinary, and correct, usage pattern you would dispose of A, which would dispose of B, which would dispose of C, etc. but you have a bug, so this doesn't happen. At some point, all of these objects are eligible for collection.
At this point the GC will find all of these objects, but then notice that B has a finalizer that has not yet run. The GC will therefore put B on the freachable list and recursively take C, D, E, etc. up to Z off the GC list: since B suddenly became ineligible for collection, so did the rest. Note that some of these objects are also placed on the freachable list themselves, because they have finalizers of their own, but all the objects they refer to will survive the GC.
A, however, is collected.
Let me make the above paragraph clear. At this point, A has been collected, but B, C, D, etc. up to Z are still alive as though nothing has happened. Though your code no longer has a reference to any of them, the freachable list has.
Then, the finalizer thread runs, and finalizes all of the objects in the freachable list, and takes the objects off of the list.
The next time GC is run, those objects are now collected.
So that certainly works, so what is the big brouhaha about?
The problem is with the finalizer thread. This thread makes no assumptions about the order in which it should finalize those objects. It doesn't do this because in many cases it would be impossible for it to do so.
As I said above, in an ordinary world you would call Dispose on A, which disposes B, which disposes C, etc. If one of these objects is a stream, the object referencing the stream might, in its call to Dispose, say "I'll just go ahead and flush my buffers before disposing the stream." This is perfectly legal, and lots of existing code does this.
However, in the finalization thread, this order is no longer used, and thus if the stream was placed on the list before the objects that referenced it, the stream is finalized, and thus closed, before the object referencing it.
In other words, what you cannot do is summarized as follows:
You cannot access any objects your object refers to that have finalizers, as you have no guarantee that these objects will be in a usable state when your finalizer runs. The objects will still be there, in memory, and not collected, but they may be closed, terminated, finalized, etc. already.
So, back to your question:
Q. Can I use strings in finalizer method?
A. Yes, because strings do not implement a finalizer and do not rely on other objects that have finalizers, and will thus be alive and kicking at the time your finalizer runs.
The assumption that made you take the wrong path is the second sentence of the question:
Inside finalizers I can't use other objects because they could have been garbage-collected already.
The correct sentence would be:
Inside finalizers I can't use other objects that have finalizers, because they could have been finalized already.
For an example of a case where the finalizer thread would have no way of knowing the correct order in which to finalize two objects, consider two objects that refer to each other and that both have finalizers. The finalizer thread would have to analyze the code to determine in which order they would normally be disposed, which might be a "dance" between the two objects. The finalizer thread does not do this; it just finalizes one before the other, and you have no guarantee which is first.
So, is there any time it is safe to access objects that also have a finalizer, from my own finalizer?
The only guaranteed safe scenario is when your program/class library/source code owns both objects, so that you know that it is safe.
Before I explain this: it is not really good programming practice, so you probably shouldn't do it.
Example:
You have an object, Cache, that writes data to a file, this file is never kept open, and is thus only open when the object needs to write data to it.
You have another object, CacheManager, that uses the first one, and calls into the first object to give it data to write to the file.
CacheManager has a finalizer. The semantics here is that if the manager class is collected, but not disposed, it should delete the caches as it cannot guarantee their state.
However, the filename of the cache object is retrievable from a property of the cache object.
So the question is, do I need to make a copy of that filename into the manager object, to avoid problems during finalization?
Nope, you don't. When the manager is finalized, the cache object is still in memory, as is the filename string it refers to. What you cannot guarantee, however, is that any finalizer on the cache object hasn't already run.
However, in this case, if you know that the finalizer of the cache object either doesn't exist, or doesn't touch the file, your manager can read the filename property of the cache object, and delete the file.
However, since you now have a pretty strange dependency going on here, I would certainly advise against it.

Another point not yet mentioned is that although one might not expect that an object's finalizer would ever run while an object is in use, the finalization mechanism does not ensure that. Finalizers can be run in an arbitrary unknown threading context; as a consequence, they should either avoid using any types that aren't thread-safe, or should use locking or other means to ensure that they only use things in thread-safe fashion. Note finalizers should use Monitor.TryEnter rather than Monitor.Enter, and endeavor to act as gracefully as possible if a lock is unexpectedly held. Note that since finalizers aren't supposed to run while an object is still in use, the fact that a lock was unexpectedly held will often suggest that a finalizer was run early. Depending upon the design of the code which uses the lock, it may be possible to have the finalizer set a flag and try again to acquire the lock, and have any other code which uses the lock check after releasing it whether that flag is set and, if so, reregister the object for finalization.
Handling finalization cleanup correctly in all threading scenarios is difficult. Finalization might not seem complicated, but no convenient automated mechanisms exist by which objects can ensure that finalizers won't run while the objects in question are in use. Consequently, finalizers have a lot of subtle thread-safety issues. Code which ignores such issues will "usually" work, but may sometimes fail in difficult-to-diagnose ways.
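A minimal sketch of the Monitor.TryEnter advice might look like the following. The `Worker` class, its `gate` lock object and the `SkippedCleanups` counter are hypothetical; the flag-and-reregister scheme described above is deliberately left out to keep the example short.
```csharp
using System;
using System.Threading;

class Worker
{
    // A plain object has no finalizer of its own, so it is safe to use from ours.
    private readonly object gate = new object();
    public static int SkippedCleanups;

    public void DoWork()
    {
        lock (gate)
        {
            // Use the hypothetical resource here.
        }
    }

    // Returns true if cleanup ran, false if the lock was unexpectedly held.
    public bool TryCleanup()
    {
        bool lockTaken = false;
        try
        {
            // TryEnter returns immediately instead of blocking the calling thread.
            Monitor.TryEnter(gate, ref lockTaken);
            if (!lockTaken)
            {
                // Someone still holds the lock: the object is probably still in use.
                Interlocked.Increment(ref SkippedCleanups);
                return false;
            }
            // Release the hypothetical resource here.
            return true;
        }
        finally
        {
            if (lockTaken) Monitor.Exit(gate);
        }
    }

    ~Worker()
    {
        TryCleanup();
    }
}

static class Program
{
    public static void Main()
    {
        var worker = new Worker();
        worker.DoWork();
        Console.WriteLine("Cleanup ran: " + worker.TryCleanup());
    }
}
```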

You can call the dispose method inside your finalizer and have the file cleanup code in the Dispose method. Along with that you can also pass a boolean to your dispose method that indicates that you are invoking it from the finalizer.
For an excellent reference on the proper usage of Dispose and finalizers, read this: Proper use of the IDisposable interface
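As a rough sketch of that suggestion, assuming a standalone holder class rather than the FileStream subclass from the question: `TempFile` below is a hypothetical type that owns only a string path, which has no finalizer and is therefore safe to use from the finalizer.
```csharp
using System;
using System.IO;

class TempFile : IDisposable
{
    private string path;

    public TempFile(string path)
    {
        this.path = path;
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // cleanup already done, skip the finalizer
    }

    protected virtual void Dispose(bool disposing)
    {
        // 'disposing' is false when called from the finalizer. Nothing below touches
        // another finalizable object, so both paths can share the same cleanup.
        if (path == null) return;
        try
        {
            File.Delete(path);
        }
        catch (IOException)
        {
            // Best effort: the file may be locked by another process.
        }
        catch (UnauthorizedAccessException)
        {
            // Best effort: no permission to delete.
        }
        path = null;
    }

    ~TempFile()
    {
        Dispose(false);
    }
}

static class Program
{
    public static void Main()
    {
        string path = Path.GetTempFileName();   // creates an empty temporary file
        using (new TempFile(path))
        {
            // Work with the file here.
        }
        Console.WriteLine("File still exists: " + File.Exists(path));
    }
}
```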

Clearing a double-linked list

I have a double linked list (queue) I have made on my own.
I am wondering, to clear the linked list, is it enough to simply remove the head and tail references?
E.g
public void Clear()
{
    Head = null;
    Tail = null;
}
I am imagining a domino effect, but I am having a hard time testing it.
It WILL make the whole object appear empty at least. All data requests (such as peek, dequeue etc.) return null.
You can also easily Enqueue some new objects.
Purely functional it seems to be working.
But I'd really like to know if I am doing it the right way.

The short answer is yes, garbage collection will clear out all the linked list nodes, provided that nothing external holds a reference to them.
The easiest way to test is to add a finalizer to your linked list node class that outputs some logging. Note that you can't be sure when the garbage collector runs (without forcing it via GC.Collect()), so you won't see the finalizer called as soon as you call your Clear() method.
The "domino effect" is not going to happen, though. What matters is not whether references to an object are held, but whether references can be traced back to the stack or a static object. So if several objects refer to each other, but nothing refers to them, then they'll all be garbage collected simultaneously.
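A test along those lines could be sketched like this. `Node`, `SimpleQueue` and the `FinalizedCount` counter are stand-ins for the asker's own classes, and the finalizer exists purely for observation: it makes each node survive one extra collection, so it should be removed again afterwards. The printed count depends on the runtime, so none is promised here.
```csharp
using System;
using System.Threading;

class Node
{
    public static int FinalizedCount;   // incremented by the finalizer, for observation only
    public object Value;
    public Node Next;
    public Node Previous;

    ~Node()
    {
        Interlocked.Increment(ref FinalizedCount);
    }
}

class SimpleQueue
{
    public Node Head;
    public Node Tail;

    public void Enqueue(object value)
    {
        var node = new Node { Value = value, Previous = Tail };
        if (Tail == null) Head = node; else Tail.Next = node;
        Tail = node;
    }

    public object Peek()
    {
        return Head == null ? null : Head.Value;
    }

    public void Clear()
    {
        // The nodes still reference each other, but nothing references them any more.
        Head = null;
        Tail = null;
    }
}

static class Program
{
    public static void Main()
    {
        var queue = new SimpleQueue();
        for (int i = 0; i < 1000; i++) queue.Enqueue(i);
        queue.Clear();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine("Nodes finalized so far: " + Node.FinalizedCount);
    }
}
```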

Unless the objects in the collection needs to be disposed of, and it is the responsibility of the collection to do that, then your method is probably the best way.
Since there is no root to the object (no live references), garbage collection can pick it up and remove it.

I am imagining a domino effect
This isn't how the GC works.
The GC first marks everything "dead", then starting at the root objects it traverses all objects referenced by them, marking each one as "alive".
Since your list is no longer referenced by any root objects (or their children), it will be left marked "dead".
The second pass then frees the "dead" objects.
I doubt that you can assume in a finalizer that the objects on either side in the list have not been collected first, i.e. it will be in the GC's own order, not the order of the list.
A bit more detail here:-
http://msdn.microsoft.com/en-us/magazine/bb985010.aspx
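The marking pass described above can be mimicked with a toy model. This is not how the CLR is implemented; it is a simplified simulation over a dictionary of object names, with `ToyMarkPhase` and the graph contents invented for the illustration.
```csharp
using System;
using System.Collections.Generic;

static class ToyMarkPhase
{
    // Returns the names reachable from the roots; everything else stays "dead".
    public static HashSet<string> Mark(IDictionary<string, string[]> references, IEnumerable<string> roots)
    {
        var alive = new HashSet<string>();
        var pending = new Stack<string>(roots);
        while (pending.Count > 0)
        {
            string current = pending.Pop();
            if (!alive.Add(current)) continue;   // already marked, skip to avoid cycles
            string[] targets;
            if (references.TryGetValue(current, out targets))
            {
                foreach (string target in targets) pending.Push(target);
            }
        }
        return alive;
    }
}

static class Program
{
    public static void Main()
    {
        // After Clear(): the queue references no nodes, but the nodes still reference each other.
        var references = new Dictionary<string, string[]>
        {
            { "queue", new string[0] },
            { "node1", new[] { "node2" } },
            { "node2", new[] { "node1", "node3" } },
            { "node3", new[] { "node2" } }
        };
        HashSet<string> alive = ToyMarkPhase.Mark(references, new[] { "queue" });
        foreach (string name in references.Keys)
        {
            Console.WriteLine(name + ": " + (alive.Contains(name) ? "alive" : "dead"));
        }
    }
}
```
Because marking starts from the roots, the mutual references between the nodes never come into play: only the queue is marked alive.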
