Clearing a double-linked list

Clearing a double-linked list - c#

I have a double linked list (queue) I have made on my own.
I am wondering, to clear the linked list, is it enough to simply remove the head and tail references?
E.g
public void Clear()
{
Head = null;
Tail = null;
}
I am imaging a domino effect, but I am having a hard time testing it.
It WILL make the whole object appear empty atleast. All data requests (such as peek, dequeue etc.) returns null.
You can also easily Enqueue some new objects.
Purely functional it seems to be working.
But I'd really like to know if I am doing it the right way.

The short answer is yes, garbage collection will clear out all the linked list nodes, provided that nothing external holds a reference to them.
Easiest way to test is to add a finalizer to your linked list node object that outputs some logging. Note that you can't be sure when the garbage collector runs (without forcing it, via GC.Collect()) so you won't see the finalizer called as soon as you call you Clear() method.
The "domino effect" is not going to happen, though; it doesn't matter if references are held to an object, rather than references can be traced back to the stack or a static object. So if several objects refer to each other, but nothing refers to them, then they'll all be garbage collected simultaneously.

Unless the objects in the collection needs to be disposed of, and it is the responsibility of the collection to do that, then your method is probably the best way.
Since there is no root to the object (no live references), garbage collection can pick it up and remove it.

I am imaging a domino effect
This isn't how the GC works.
The GC first marks everything "dead", then starting at the root objects it traverses all objects referenced by them, marking each one as "alive".
Since your list is no longer reference by any root objects (or children of) it will be left marked "dead".
The second pass then frees the "dead" objects.
I doubt that you can assume in a finalizer that any objects either side in the list have not been collected first, ie it will be in the GC's own order not the order of the list.
A bit more detail here:-
http://msdn.microsoft.com/en-us/magazine/bb985010.aspx

Related

If I remove elements from a List - does this mean my RAM is also cleared/released via the garbagecollector or somehow?

In my program I add millions of records in a List in a loop, so as the program continues my RAM seems to grow too (I suspect because new and new elements are added to the list) - so if a timer tick event or just button push results in list.Clear(): will this immediately release my RAM? The ram seems to grow a lot to 500mb+ so with limited free amount on amazon instances this is an issue...thanx!

List have Clear() or Remove() methods to clear or Remove the Items within the List.
Force the Objects within the List which are not in use, to be cleared. Or Can assign null values to those Objects which are not in use, so that the Null Assigned Objects can Flag the GC that it is not in use and is ready to be Collected.
Assigning List Object to Null after the use, would also help GC to handle it well.
ie: myListVariable = null;
So that the memory can be cleared by the GC and can be freed from the unused Objects.

It's really not recommended, but you can force the garbage collector to run after you've cleared the list?
I found this link:
How to force garbage collector to run?
I'd look into implementing the iDisposable interface into your class. This would make it a managed resource and should automatically clear system resources when you're done with the data.
Good luck!

Garbage collection on removal of root

I have a binary tree implemented in c# with a method clear() which makes the root null, thus removing the reference to the root node in the heap. This makes the root node in the heap eligible for garbage collection.
But during a garbage collection cycle will only the root node be collected and then its two children will become eligible for garbage collection in the next cycle and taking as many cycles to remove the tree as the depth of the tree or will the whole tree in the heap be collected in one cycle?

The whole tree will be collected because the GC finds in one cycle ALL objects which cannot be reached. Because none of the nodes in your tree are reachable (from some live root), the entire set of nodes should be GCed.
This is an efficient way for the GC to work - but it's also a good way for it to deal with self-referential data structures such as a doubly-linked list (which would cause problems for any algorithm which didn't take "reachability from some reachable root" into account).

Actually that depends on how long your objects have been alive.
.NET uses an algorithm named 'mark & sweep', which is described here. The algorithm basically marks everything for removal, except the things that can be reached. Your objects will be removed in one iteration here.
However, naive mark & sweep will take a lot of time, since most "long-lived" objects will survive the GC. The more long-lived objects you have, the more time the GC will require to mark all objects. This is why .NET keeps track of how many GC cycles an object has survived. If it survives a couple of times, the next GC will skip the object. These are called 'generations' and are described on MSDN. Simply put, more times an object has survived a GC cycle, the less frequent it will be visited by the garbage collector for removal.
However, once your structure is at some point marked as 'not referenced' by the GC, the complete structure will be removed in one single pass.

Will the garbage collector free this List when my method finishes running

Lets say I have something like:
public class Item
{
public string Code;
public List<string> Codes = new List<string>();
}
private void SomeMethod()
{
List<Item> Items = new List<Item>();
for (int i = 0; i < 10; i++)
{
Item NewItem = new Item();
NewItem.Code = "Something " + i.ToString();
NewItem.Codes.Add("Something " + i.ToString());
Items.Add(Item);
}
//Do something with Items
}
I am instantiating Item and not free'ing it because I need to have access to it in the list later on (rough example).
What I am wondering is when SomeMethod() has finished executing, will Items (and the contents of it - including the List<>) be de-referenced and allow the garbage collector to clean up the memory as and when it runs? Basically, will this section of code cause any memory leaks or should everything be de-referenced when SomeMethod() has finished processing.
My understanding is that when nothing holds a reference to an object it will be garbage collected so in my mind this code should be Ok but I just wanted to make sure I understand correctly.
EDIT:
If I was to add one of the objects Items is holding into another list that would still be in scope (a global list for example). What would happen?

Once your variable Items goes out of scope, the garbage collector will indeed dispose of it (and its contents, as your code is written) at its leisure.

The garbage collector will collect anything that is no longer reachable by code. Since you will no longer be able to reach your Items list or any of the Items contained within the list, then the garbage collector will collect them at some point. Your understanding of the garbage collector is correct.

Ok,
List<Item> Items
is declared in the scope of the method. So when the method ends, it goes out of scope and is the list is dereferenced.
The memory will be released at some point after that, when the Garbage Collecter sees fit.
As an aside, since Items is declared in a "local" scope I'd prefer to call it items.

all the variables in the method are in a stackframe and all these variable will be destoried along with the stackframe after the method was executed.
that is said after SomeMethod() executed, the variable Items will nolonger exists, thus new List() will be marked "can be collected".
the second question is that, there is a globle list variable hold the reference of one of the object that in the variable Items, then after the SomeMethod() executed, the List in the SomeMethod will be marked as "can be collected", and all the other object beside the item that was referenced by the globle list variable will also be makred as "can be collected"
the reason is, the list actually hold the reference that point to the exact object in the heap
So the shared object can not be collected due to the reason that it is referenced by the globleList

In order to keep things sane and keep its job, GC has to promise two things:
Garbage will be collected. This seems pretty obvious, but if the GC doesn't promise to clean up managed objects without you having to tell it to, it's rather useless.
ONLY garbage will be collected. You don't have to worry about the GC cleaning up objects that are still in play.
So, once Items goes out of scope (that is, once the function returns), it's no longer reachable by running code, so it's eligible for collection without you having to do anything. Since the items in the list are no longer reachable either (their only link was in the list), they're eligible too. If you returned a reference to an entry in the list, though, that object is still in play and can't be collected yet. GC will do the right thing.
In fact, GC almost always does the right thing. There are really only three major cases you have to worry about:
If you're buiding a container (like if you're making your own List workalike). Any objects that no longer "exist" in the container as far as the user is concerned, should generally be set to null so the GC doesn't see them as reachable through your collection and erroneously keep them alive. If you remove an item from the collection, for example, null out the reference or overwrite it with another.
With large (>~85KB?) objects. They'll typically stick around in memory til they crowd out everything else, at which point a full GC cycle runs. (Normally, only certain likely-to-be-discarded objects are checked during a cycle. Full collections check pretty much everything, which takes significantly longer but might free more memory.)
If you're using IDisposable objects or native/unmanaged resources. Some incompetents don't know how to implement IDisposable correctly. If you're dealing with a library created by such an incompetent, then you'll need to ensure that you Dispose stuff, or only ever use it within a using block, or things can get really weird. (If you only use the .net API, you're pretty safe. But disposing is still good manners.)
For pretty much all other occasions, GC just works. Trust it.

Garbage collection and references C#

I have a design and I am not sure if garbage collection will occur correctly.
I have some magic apples, and some are yummy some are bad.
I have a dictionary : BasketList = Dictionary <basketID,Basket>
(list of baskets).
Each Basket object has a single Apple in it and each Basket stores a reference to an objectAppleSeperation.AppleSeperation stores 2 dictionaries, YummyApples = <basketID,Apple> and BadApples = Dictionary<basketID,Apple>, so when I'm asked where an apple is I know.
An Apple object stores BasketsImIn = Dictionary<ID,Basket>, which points to the Basket and in Shops, and the Apple in Basket.
My question is, if I delete a basket from BasketList and make sure I delete the Apple from BadApples and/or YummyApples, will garbage collection happen properly, or will there be some messy references lying around?

You are right to be thinking carefully about this; having the various references has implications not just for garbage collection, but for your application's proper functioning.
However, as long as you are careful to remove all references you have set, and you don't have any other free-standing variables holding onto the reference, the garbage collector will do its work.
The GC is actually fairly sophisticated at collecting unreferenced objects. For example, it can collect two objects that reference each other, but have no other 'living' references in the application.

See http://msdn.microsoft.com/en-us/library/ee787088.aspx for fundamentals on garbage collection.
From the above link, when garbage collection happens...
The system has low physical memory.
The memory that is used by allocated objects on the managed heap surpasses an acceptable threshold. This means that a threshold of acceptable memory usage has been exceeded on the managed heap. This threshold is continuously adjusted as the process runs.
The GC.Collect method is called. In almost all cases, you do not have to call this method, because the garbage collector runs continuously. This method is primarily used for unique situations and testing.
If you are performing the clean-up properly, then you need not worry!

Which objects can I use in a finalizer method?

I have a class that should delete some file when disposed or finalized. Inside finalizers I can't use other objects because they could have been garbage-collected already.
Am I missing some point regarding finalizers and strings could be used?
UPD: Something like that:
public class TempFileStream : FileStream
{
private string _filename;
public TempFileStream(string filename)
:base(filename, FileMode.Open, FileAccess.Read, FileShare.Read)
{
_filename = filename;
}
protected override void Dispose(bool disposing)
{
base.Dispose(disposing);
if (_filename == null) return;
try
{
File.Delete(_filename); // <-- oops! _filename could be gc-ed already
_filename = null;
}
catch (Exception e)
{
...
}
}
}

Yes, you can most certainly use strings from within a finalizer, and many other object types.
For the definitive source of all this, I would go pick up the book CLR via C#, 3rd edition, written by Jeffrey Richter. In chapter 21 this is all described in detail.
Anyway, here's what is really happening...
During garbage collection, any objects that have a finalizer that still wants to be called are placed on a special list, called the freachable list.
This list is considered a root, just as static variables and live local variables are. Therefore, any objects those objects refer to, and so on recursively is removed from the garbage collection cycle this time. They will survive the current garbage collection cycle as though they weren't eligible to collect to begin with.
Note that this includes strings, which was your question, but it also involves all other object types
Then, at some later point in time, the finalizer thread picks up the object from that list, and runs the finalizer on those objects, and then takes those objects off that list.
Then, the next time garbage collection runs, it finds the same objects once more, but this time the finalizer no longer wants to run, it has already been executed, and so the objects are collected as normal.
Let me illustrate with an example before I tell you what doesn't work.
Let's say you have objects A through Z, and each object references the next one, so you have object A referencing object B, B references C, C references D, and so on until Z.
Some of these objects implement finalizers, and they all implement IDisposable. Let's assume that A does not implement a finalizer but B does, and then some of the rest does as well, it's not important for this example which does beyond A and B.
Your program holds onto a reference to A, and only A.
In an ordinary, and correct, usage pattern you would dispose of A, which would dispose of B, which would dispose of C, etc. but you have a bug, so this doesn't happen. At some point, all of these objects are eligible for collection.
At this point GC will find all of these objects, but then notice that B has a finalizer, and it has not yet run. GC will therefore put B on the freachable list, and recursively take C, D, E, etc. up to Z, off of the GC list, because since B suddenly became in- eligible for collection, so does the rest. Note that some of these objects are also placed on the freachable list themselves, because they have finalizers on their own, but all the objects they refer to will survive GC.
A, however, is collected.
Let me make the above paragraph clear. At this point, A has been collected, but B, C, D, etc. up to Z are still alive as though nothing has happened. Though your code no longer has a reference to any of them, the freachable list has.
Then, the finalizer thread runs, and finalizes all of the objects in the freachable list, and takes the objects off of the list.
The next time GC is run, those objects are now collected.
So that certainly works, so what is the big bruaha about?
The problem is with the finalizer thread. This thread makes no assumptions about the order in which it should finalize those objects. It doesn't do this because in many cases it would be impossible for it to do so.
As I said above, in an ordinary world you would call dispose on A, which disposes B, which disposes C, etc. If one of these objects is a stream, the object referencing the stream might, in its call to Dispose, say "I'll just go ahead and flush my buffers before disposing the stream." This is perfectly legal and lots of existing code do this.
However, in the finalization thread, this order is no longer used, and thus if the stream was placed on the list before the objects that referenced it, the stream is finalized, and thus closed, before the object referencing it.
In other words, what you cannot do is summarized as follows:
You can not access any objects your object refer to, that has finalizers, as you have no guarantee that these objects will be in a usable state when your finalizer runs. The objects will still be there, in memory, and not collected, but they may be closed, terminated, finalized, etc. already.
So, back to your question:
Q. Can I use strings in finalizer method?
A. Yes, because strings do not implement a finalizer, and does not rely on other objects that has a finalizer, and will thus be alive and kicking at the time your finalizer runs.
The assumption that made you take the wrong path is the second sentence of the qustion:
Inside finalizers I can't use other objects because they could have been garbage-collected already.
The correct sentence would be:
Inside finalizer I can't use other objects that have finalizers, because they could have been finalized already.
For an example of something the finalizer would have no way of knowing the order in which to correctly finalize two objects, consider two objects that refer to each other and that both have finalizers. The finalizer thread would have to analyze the code to determine in which order they would normally be disposed, which might be a "dance" between the two objects. The finalizer thread does not do this, it just finalizes one before the other, and you have no guarantee which is first.
So, is there any time it is safe to access objects that also have a finalizer, from my own finalizer?
The only guaranteed safe scenario is when your program/class library/source code owns both objects so that you know that it is.
Before I explain this, this is not really good programming practices, so you probably shouldn't do it.
Example:
You have an object, Cache, that writes data to a file, this file is never kept open, and is thus only open when the object needs to write data to it.
You have another object, CacheManager, that uses the first one, and calls into the first object to give it data to write to the file.
CacheManager has a finalizer. The semantics here is that if the manager class is collected, but not disposed, it should delete the caches as it cannot guarantee their state.
However, the filename of the cache object is retrievable from a property of the cache object.
So the question is, do I need to make a copy of that filename into the manager object, to avoid problems during finalization?
Nope, you don't. When the manager is finalized, the cache object is still in memory, as is the filename string it refers to. What you cannot guarantee, however, is that any finalizer on the cache object hasn't already run.
However, in this case, if you know that the finalizer of the cache object either doesn't exist, or doesn't touch the file, your manager can read the filename property of the cache object, and delete the file.
However, since you now have a pretty strange dependency going on here, I would certainly advice against it.

Another point not yet mentioned is that although one might not expect that an object's finalizer would ever run while an object is in use, the finalization mechanism does not ensure that. Finalizers can be run in an arbitrary unknown threading context; as a consequence, they should either avoid using any types that aren't thread-safe, or should use locking or other means to ensure that they only use things in thread-safe fashion. Note finalizers should use Monitor.TryEnter rather than Monitor.Enter, and endeavor to act as gracefully as possible if a lock is unexpectedly held. Note that since finalizers aren't supposed to run while an object is still in use, the fact that a lock was unexpectedly held will often suggest that a finalizer was run early. Depending upon the design of the code which uses the lock, it may be possible to have the finalizer set a flag and try again to acquire the lock, and have any other code which uses the lock check after releasing it whether that flag is set and, if so, reregister the object for finalization.
Handling finalization cleanup correctly in all threading scenarios is difficult. Finalization might not seem complicated, but no convenient automated mechanisms exist by which objects can ensure that finalizers won't run while the objects in question are in use. Consequently, finalizers have a lot of subtle thread-safety issues. Code which ignores such issues will "usually" work, but may sometimes fail in difficult-to-diagnose ways.

You can call the dispose method inside your finalizer and have the file cleanup code in the Dispose method. Along with that you can also pass a boolean to your dispose method that indicates that you are invoking it from the finalizer.
For an excellent reference on the proper usage of Dispose and Fianlizers , read this Proper use of the IDisposable interface

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.