Should you set all the objects to null (Nothing in VB.NET) once you have finished with them?
I understand that in .NET it is essential to dispose of any instances of objects that implement the IDisposable interface to release some resources although the object can still be something after it is disposed (hence the isDisposed property in forms), so I assume it can still reside in memory or at least in part?
I also know that when an object goes out of scope it is then marked for collection ready for the next pass of the garbage collector (although this may take time).
So with this in mind will setting it to null speed up the system releasing the memory as it does not have to work out that it is no longer in scope and are they any bad side effects?
MSDN articles never do this in examples and currently I do this as I cannot
see the harm. However I have come across a mixture of opinions so any comments are useful.
Karl is absolutely correct, there is no need to set objects to null after use. If an object implements IDisposable, just make sure you call IDisposable.Dispose() when you're done with that object (wrapped in a try..finally, or, a using() block). But even if you don't remember to call Dispose(), the finaliser method on the object should be calling Dispose() for you.
I thought this was a good treatment:
Digging into IDisposable
and this
Understanding IDisposable
There isn't any point in trying to second guess the GC and its management strategies because it's self tuning and opaque. There was a good discussion about the inner workings with Jeffrey Richter on Dot Net Rocks here: Jeffrey Richter on the Windows Memory Model and
Richters book CLR via C# chapter 20 has a great treatment:
Another reason to avoid setting objects to null when you are done with them is that it can actually keep them alive for longer.
e.g.
void foo()
{
var someType = new SomeType();
someType.DoSomething();
// someType is now eligible for garbage collection
// ... rest of method not using 'someType' ...
}
will allow the object referred by someType to be GC'd after the call to "DoSomething" but
void foo()
{
var someType = new SomeType();
someType.DoSomething();
// someType is NOT eligible for garbage collection yet
// because that variable is used at the end of the method
// ... rest of method not using 'someType' ...
someType = null;
}
may sometimes keep the object alive until the end of the method. The JIT will usually optimized away the assignment to null, so both bits of code end up being the same.
No don't null objects. You can check out https://web.archive.org/web/20160325050833/http://codebetter.com/karlseguin/2008/04/28/foundations-of-programming-pt-7-back-to-basics-memory/ for more information, but setting things to null won't do anything, except dirty your code.
Also:
using(SomeObject object = new SomeObject())
{
// do stuff with the object
}
// the object will be disposed of
In general, there's no need to null objects after use, but in some cases I find it's a good practice.
If an object implements IDisposable and is stored in a field, I think it's good to null it, just to avoid using the disposed object. The bugs of the following sort can be painful:
this.myField.Dispose();
// ... at some later time
this.myField.DoSomething();
It's good to null the field after disposing it, and get a NullPtrEx right at the line where the field is used again. Otherwise, you might run into some cryptic bug down the line (depending on exactly what DoSomething does).
Chances are that your code is not structured tightly enough if you feel the need to null variables.
There are a number of ways to limit the scope of a variable:
As mentioned by Steve Tranby
using(SomeObject object = new SomeObject())
{
// do stuff with the object
}
// the object will be disposed of
Similarly, you can simply use curly brackets:
{
// Declare the variable and use it
SomeObject object = new SomeObject()
}
// The variable is no longer available
I find that using curly brackets without any "heading" to really clean out the code and help make it more understandable.
In general no need to set to null. But suppose you have a Reset functionality in your class.
Then you might do, because you do not want to call dispose twice, since some of the Dispose may not be implemented correctly and throw System.ObjectDisposed exception.
private void Reset()
{
if(_dataset != null)
{
_dataset.Dispose();
_dataset = null;
}
//..More such member variables like oracle connection etc. _oraConnection
}
The only time you should set a variable to null is when the variable does not go out of scope and you no longer need the data associated with it. Otherwise there is no need.
this kind of "there is no need to set objects to null after use" is not entirely accurate. There are times you need to NULL the variable after disposing it.
Yes, you should ALWAYS call .Dispose() or .Close() on anything that has it when you are done. Be it file handles, database connections or disposable objects.
Separate from that is the very practical pattern of LazyLoad.
Say I have and instantiated ObjA of class A. Class A has a public property called PropB of class B.
Internally, PropB uses the private variable of _B and defaults to null. When PropB.Get() is used, it checks to see if _PropB is null and if it is, opens the resources needed to instantiate a B into _PropB. It then returns _PropB.
To my experience, this is a really useful trick.
Where the need to null comes in is if you reset or change A in some way that the contents of _PropB were the child of the previous values of A, you will need to Dispose AND null out _PropB so LazyLoad can reset to fetch the right value IF the code requires it.
If you only do _PropB.Dispose() and shortly after expect the null check for LazyLoad to succeed, it won't be null, and you'll be looking at stale data. In effect, you must null it after Dispose() just to be sure.
I sure wish it were otherwise, but I've got code right now exhibiting this behavior after a Dispose() on a _PropB and outside of the calling function that did the Dispose (and thus almost out of scope), the private prop still isn't null, and the stale data is still there.
Eventually, the disposed property will null out, but that's been non-deterministic from my perspective.
The core reason, as dbkk alludes is that the parent container (ObjA with PropB) is keeping the instance of _PropB in scope, despite the Dispose().
Stephen Cleary explains very well in this post: Should I Set Variables to Null to Assist Garbage Collection?
Says:
The Short Answer, for the Impatient
Yes, if the variable is a static field, or if you are writing an enumerable method (using yield return) or an asynchronous method (using async and await). Otherwise, no.
This means that in regular methods (non-enumerable and non-asynchronous), you do not set local variables, method parameters, or instance fields to null.
(Even if you’re implementing IDisposable.Dispose, you still should not set variables to null).
The important thing that we should consider is Static Fields.
Static fields are always root objects, so they are always considered “alive” by the garbage collector. If a static field references an object that is no longer needed, it should be set to null so that the garbage collector will treat it as eligible for collection.
Setting static fields to null is meaningless if the entire process is shutting down. The entire heap is about to be garbage collected at that point, including all the root objects.
Conclusion:
Static fields; that’s about it. Anything else is a waste of time.
There are some cases where it makes sense to null references. For instance, when you're writing a collection--like a priority queue--and by your contract, you shouldn't be keeping those objects alive for the client after the client has removed them from the queue.
But this sort of thing only matters in long lived collections. If the queue's not going to survive the end of the function it was created in, then it matters a whole lot less.
On a whole, you really shouldn't bother. Let the compiler and GC do their jobs so you can do yours.
Take a look at this article as well: http://www.codeproject.com/KB/cs/idisposable.aspx
For the most part, setting an object to null has no effect. The only time you should be sure to do so is if you are working with a "large object", which is one larger than 84K in size (such as bitmaps).
I believe by design of the GC implementors, you can't speed up GC with nullification. I'm sure they'd prefer you not worry yourself with how/when GC runs -- treat it like this ubiquitous Being protecting and watching over and out for you...(bows head down, raises fist to the sky)...
Personally, I often explicitly set variables to null when I'm done with them as a form of self documentation. I don't declare, use, then set to null later -- I null immediately after they're no longer needed. I'm saying, explicitly, "I'm officially done with you...be gone..."
Is nullifying necessary in a GC'd language? No. Is it helpful for the GC? Maybe yes, maybe no, don't know for certain, by design I really can't control it, and regardless of today's answer with this version or that, future GC implementations could change the answer beyond my control. Plus if/when nulling is optimized out it's little more than a fancy comment if you will.
I figure if it makes my intent clearer to the next poor fool who follows in my footsteps, and if it "might" potentially help GC sometimes, then it's worth it to me. Mostly it makes me feel tidy and clear, and Mongo likes to feel tidy and clear. :)
I look at it like this: Programming languages exist to let people give other people an idea of intent and a compiler a job request of what to do -- the compiler converts that request into a different language (sometimes several) for a CPU -- the CPU(s) could give a hoot what language you used, your tab settings, comments, stylistic emphases, variable names, etc. -- a CPU's all about the bit stream that tells it what registers and opcodes and memory locations to twiddle. Many things written in code don't convert into what's consumed by the CPU in the sequence we specified. Our C, C++, C#, Lisp, Babel, assembler or whatever is theory rather than reality, written as a statement of work. What you see is not what you get, yes, even in assembler language.
I do understand the mindset of "unnecessary things" (like blank lines) "are nothing but noise and clutter up code." That was me earlier in my career; I totally get that. At this juncture I lean toward that which makes code clearer. It's not like I'm adding even 50 lines of "noise" to my programs -- it's a few lines here or there.
There are exceptions to any rule. In scenarios with volatile memory, static memory, race conditions, singletons, usage of "stale" data and all that kind of rot, that's different: you NEED to manage your own memory, locking and nullifying as apropos because the memory is not part of the GC'd Universe -- hopefully everyone understands that. The rest of the time with GC'd languages it's a matter of style rather than necessity or a guaranteed performance boost.
At the end of the day make sure you understand what is eligible for GC and what's not; lock, dispose, and nullify appropriately; wax on, wax off; breathe in, breathe out; and for everything else I say: If it feels good, do it. Your mileage may vary...as it should...
I think setting something back to null is messy. Imagine a scenario where the item being set to now is exposed say via property. Now is somehow some piece of code accidentally uses this property after the item is disposed you will get a null reference exception which requires some investigation to figure out exactly what is going on.
I believe framework disposables will allows throw ObjectDisposedException which is more meaningful. Not setting these back to null would be better then for that reason.
Some object suppose the .dispose() method which forces the resource to be removed from memory.
Related
I'm working on a TCP socket related application, where an object I've created refers to a System.Net.Sockets.Socket object. That latter object seems to become null and in order to understand why, I would like to check if my own object gets re-created. For that, I thought of the simplest possible approach by checking the memory address of this. However, when adding this to the watch-window I get following error message:
Name Value
&this error CS0211: Cannot take the address of the given expression
As it seems to be impossible to check the memory address of an object in C#, how can I verify that I'm dealing with the same or another object when debugging my code?
In C#, objects are moved during garbage collection. You can't simply take the address of it, because the address changed when the GC heap is compacted.
Dealing with pointers in C# requires unsafe code and you leave the terrain of safe code, basically making it as unsafe as C++.
You can use a debugger like windbg, which displays the memory addresses of objects - but they will still change when GC moves them around.
If you want to see if a new instance of your class gets created, why not set a breakpoint in the constructor?
I am convinced with #thomas answer above.
you can add a unique identifier (such as a GUID) property to your object and use that to determine if you have the same object.
you could override the Equals method to compare two objects if they same as below.
public class MyClass
{
public Guid Id { get; } = Guid.NewGuid();
public override bool Equals(object obj)
{
return obj is MyClass second && this.Id == second.Id;
}
}
As already explained, addresses of objects are not a viable means of reasoning about objects in garbage-collected virtual machines like DotNet. In DotNet you may get the chance to observe the address of an object if you use the fixed keyword, unsafe blocks, or GCHandle.Alloc(), but these are all very hacky and they keep objects fixed in memory so they cannot be garbage collected, which is something that you absolutely do not want. The moment you unfix an object, then its address is free to change, so you cannot keep track of it.
Luckily, you do not need any of that!
You don't need addresses, because all you want is a mnemonic for each object, for the purpose of identifying it during troubleshooting. For this, you have the following options:
Create a singleton which issues unique ids, and in the constructor of each object invoke this singleton to obtain a unique id, store the id with the object, and include the id in the ToString() method of the object, or in whatever other method you might be using for debug display.
Use the System.Runtime.Serialization.ObjectIDGenerator class, which does more or less what the singleton id generator would do, but in a more advanced, and possibly easier to use way. (I have no personal experience using it, so I cannot give any more advice about it.)
Use the System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode( object ) method, which returns what is known in other circles as The Identity Hash-Code of an Object. It is guaranteed to remain unchanged throughout the lifetime of the object, but it is not guaranteed to be unique among all objects. However, since it is 32-bits long, it will be a cold day in hell before another object gets issued the same hash code by coincidence, so it will serve all your troubleshooting purposes just fine.
Do yourself a favor and display the Identity Hash Code of your objects in hexadecimal; the number will be shorter, and will have a wider variety of digits than decimal, so it will be easier to retain in short-term memory while troubleshooting.
I have a strange habit it seems... according to my co-worker at least. We've been working on a small project together. The way I wrote the classes is (simplified example):
[Serializable()]
public class Foo
{
public Foo()
{ }
private Bar _bar;
public Bar Bar
{
get
{
if (_bar == null)
_bar = new Bar();
return _bar;
}
set { _bar = value; }
}
}
So, basically, I only initialize any field when a getter is called and the field is still null. I figured this would reduce overload by not initializing any properties that aren't used anywhere.
ETA: The reason I did this is that my class has several properties that return an instance of another class, which in turn also have properties with yet more classes, and so on. Calling the constructor for the top class would subsequently call all constructors for all these classes, when they are not always all needed.
Are there any objections against this practice, other than personal preference?
UPDATE: I have considered the many differing opinions in regards to this question and I will stand by my accepted answer. However, I have now come to a much better understanding of the concept and I'm able to decide when to use it and when not.
Cons:
Thread safety issues
Not obeying a "setter" request when the value passed is null
Micro-optimizations
Exception handling should take place in a constructor
Need to check for null in class' code
Pros:
Micro-optimizations
Properties never return null
Delay or avoid loading "heavy" objects
Most of the cons are not applicable to my current library, however I would have to test to see if the "micro-optimizations" are actually optimizing anything at all.
LAST UPDATE:
Okay, I changed my answer. My original question was whether or not this is a good habit. And I'm now convinced that it's not. Maybe I will still use it in some parts of my current code, but not unconditionally and definitely not all the time. So I'm going to lose my habit and think about it before using it. Thanks everyone!
What you have here is a - naive - implementation of "lazy initialization".
Short answer:
Using lazy initialization unconditionally is not a good idea. It has its places but one has to take into consideration the impacts this solution has.
Background and explanation:
Concrete implementation:
Let's first look at your concrete sample and why I consider its implementation naive:
It violates the Principle of Least Surprise (POLS). When a value is assigned to a property, it is expected that this value is returned. In your implementation this is not the case for null:
foo.Bar = null;
Assert.Null(foo.Bar); // This will fail
It introduces quite some threading issues: Two callers of foo.Bar on different threads can potentially get two different instances of Bar and one of them will be without a connection to the Foo instance. Any changes made to that Bar instance are silently lost.
This is another case of a violation of POLS. When only the stored value of a property is accessed it is expected to be thread-safe. While you could argue that the class simply isn't thread-safe - including the getter of your property - you would have to document this properly as that's not the normal case. Furthermore the introduction of this issue is unnecessary as we will see shortly.
In general:
It's now time to look at lazy initialization in general:
Lazy initialization is usually used to delay the construction of objects that take a long time to be constructed or that take a lot of memory once fully constructed.
That is a very valid reason for using lazy initialization.
However, such properties normally don't have setters, which gets rid of the first issue pointed out above.
Furthermore, a thread-safe implementation would be used - like Lazy<T> - to avoid the second issue.
Even when considering these two points in the implementation of a lazy property, the following points are general problems of this pattern:
Construction of the object could be unsuccessful, resulting in an exception from a property getter. This is yet another violation of POLS and therefore should be avoided. Even the section on properties in the "Design Guidelines for Developing Class Libraries" explicitly states that property getters shouldn't throw exceptions:
Avoid throwing exceptions from property getters.
Property getters should be simple operations without any preconditions. If a getter might throw an exception, consider redesigning the property to be a method.
Automatic optimizations by the compiler are hurt, namely inlining and branch prediction. Please see Bill K's answer for a detailed explanation.
The conclusion of these points is the following:
For each single property that is implemented lazily, you should have considered these points.
That means, that it is a per-case decision and can't be taken as a general best practice.
This pattern has its place, but it is not a general best practice when implementing classes. It should not be used unconditionally, because of the reasons stated above.
In this section I want to discuss some of the points others have brought forward as arguments for using lazy initialization unconditionally:
Serialization:
EricJ states in one comment:
An object that may be serialized will not have it's contructor invoked when it is deserialized (depends on the serializer, but many common ones behave like this). Putting initialization code in the constructor means that you have to provide additional support for deserialization. This pattern avoids that special coding.
There are several problems with this argument:
Most objects never will be serialized. Adding some sort of support for it when it is not needed violates YAGNI.
When a class needs to support serialization there exist ways to enable it without a workaround that doesn't have anything to do with serialization at first glance.
Micro-optimization:
Your main argument is that you want to construct the objects only when someone actually accesses them. So you are actually talking about optimizing the memory usage.
I don't agree with this argument for the following reasons:
In most cases, a few more objects in memory have no impact whatsoever on anything. Modern computers have way enough memory. Without a case of actual problems confirmed by a profiler, this is pre-mature optimization and there are good reasons against it.
I acknowledge the fact that sometimes this kind of optimization is justified. But even in these cases lazy initialization doesn't seem to be the correct solution. There are two reasons speaking against it:
Lazy initialization potentially hurts performance. Maybe only marginally, but as Bill's answer showed, the impact is greater than one might think at first glance. So this approach basically trades performance versus memory.
If you have a design where it is a common use case to use only parts of the class, this hints at a problem with the design itself: The class in question most likely has more than one responsibility. The solution would be to split the class into several more focused classes.
It is a good design choice. Strongly recommended for library code or core classes.
It is called by some "lazy initialization" or "delayed initialization" and it is generally considered by all to be a good design choice.
First, if you initialize in the declaration of class level variables or constructor, then when your object is constructed, you have the overhead of creating a resource that may never be used.
Second, the resource only gets created if needed.
Third, you avoid garbage collecting an object that was not used.
Lastly, it is easier to handle initialization exceptions that may occur in the property then exceptions that occur during initialization of class level variables or the constructor.
There are exceptions to this rule.
Regarding the performance argument of the additional check for initialization in the "get" property, it is insignificant. Initializing and disposing an object is a more significant performance hit than a simple null pointer check with a jump.
Design Guidelines for Developing Class Libraries at http://msdn.microsoft.com/en-US/library/vstudio/ms229042.aspx
Regarding Lazy<T>
The generic Lazy<T> class was created exactly for what the poster wants, see Lazy Initialization at http://msdn.microsoft.com/en-us/library/dd997286(v=vs.100).aspx. If you have older versions of .NET, you have to use the code pattern illustrated in the question. This code pattern has become so common that Microsoft saw fit to include a class in the latest .NET libraries to make it easier to implement the pattern. In addition, if your implementation needs thread safety, then you have to add it.
Primitive Data Types and Simple Classes
Obvioulsy, you are not going to use lazy-initialization for primitive data type or simple class use like List<string>.
Before Commenting about Lazy
Lazy<T> was introduced in .NET 4.0, so please don't add yet another comment regarding this class.
Before Commenting about Micro-Optimizations
When you are building libraries, you must consider all optimizations. For instance, in the .NET classes you will see bit arrays used for Boolean class variables throughout the code to reduce memory consumption and memory fragmentation, just to name two "micro-optimizations".
Regarding User-Interfaces
You are not going to use lazy initialization for classes that are directly used by the user-interface. Last week I spent the better part of a day removing lazy loading of eight collections used in a view-model for combo-boxes. I have a LookupManager that handles lazy loading and caching of collections needed by any user-interface element.
"Setters"
I have never used a set-property ("setters") for any lazy loaded property. Therefore, you would never allow foo.Bar = null;. If you need to set Bar then I would create a method called SetBar(Bar value) and not use lazy-initialization
Collections
Class collection properties are always initialized when declared because they should never be null.
Complex Classes
Let me repeat this differently, you use lazy-initialization for complex classes. Which are usually, poorly designed classes.
Lastly
I never said to do this for all classes or in all cases. It is a bad habit.
Do you consider implementing such pattern using Lazy<T>?
In addition to easy creation of lazy-loaded objects, you get thread safety while the object is initialized:
http://msdn.microsoft.com/en-us/library/dd642331.aspx
As others said, you lazily-load objects if they're really resource-heavy or it takes some time to load them during object construction-time.
I think it depends on what you are initialising. I probably wouldn't do it for a list as the construction cost is quite small, so it can go in the constructor. But if it was a pre-populated list then I probably wouldn't until it was needed for the first time.
Basically, if the cost of construction outweighs the cost of doing an conditional check on each access then lazy create it. If not, do it in the constructor.
Lazy instantiation/initialization is a perfectly viable pattern. Keep in mind, though, that as a general rule consumers of your API do not expect getters and setters to take discernable time from the end user POV (or to fail).
The downside that I can see is that if you want to ask if Bars is null, it would never be, and you would be creating the list there.
I was just going to put a comment on Daniel's answer but I honestly don't think it goes far enough.
Although this is a very good pattern to use in certain situations (for instance, when the object is initialized from the database), it's a HORRIBLE habit to get into.
One of the best things about an object is that it offeres a secure, trusted environment. The very best case is if you make as many fields as possible "Final", filling them all in with the constructor. This makes your class quite bulletproof. Allowing fields to be changed through setters is a little less so, but not terrible. For instance:
class SafeClass
{
String name="";
Integer age=0;
public void setName(String newName)
{
assert(newName != null)
name=newName;
}// follow this pattern for age
...
public String toString() {
String s="Safe Class has name:"+name+" and age:"+age
}
}
With your pattern, the toString method would look like this:
if(name == null)
throw new IllegalStateException("SafeClass got into an illegal state! name is null")
if(age == null)
throw new IllegalStateException("SafeClass got into an illegal state! age is null")
public String toString() {
String s="Safe Class has name:"+name+" and age:"+age
}
Not only this, but you need null checks everywhere you might possibly use that object in your class (Outside your class is safe because of the null check in the getter, but you should be mostly using your classes members inside the class)
Also your class is perpetually in an uncertain state--for instance if you decided to make that class a hibernate class by adding a few annotations, how would you do it?
If you make any decision based on some micro-optomization without requirements and testing, it's almost certainly the wrong decision. In fact, there is a really really good chance that your pattern is actually slowing down the system even under the most ideal of circumstances because the if statement can cause a branch prediction failure on the CPU which will slow things down many many many more times than just assigning a value in the constructor unless the object you are creating is fairly complex or coming from a remote data source.
For an example of the brance prediction problem (which you are incurring repeatedly, nost just once), see the first answer to this awesome question: Why is it faster to process a sorted array than an unsorted array?
Let me just add one more point to many good points made by others...
The debugger will (by default) evaluate the properties when stepping through the code, which could potentially instantiate the Bar sooner than would normally happen by just executing the code. In other words, the mere act of debugging is changing the execution of the program.
This may or may not be a problem (depending on side-effects), but is something to be aware of.
Are you sure Foo should be instantiating anything at all?
To me it seems smelly (though not necessarily wrong) to let Foo instantiate anything at all. Unless it is Foo's express purpose to be a factory, it should not instantiate it's own collaborators, but instead get them injected in its constructor.
If however Foo's purpose of being is to create instances of type Bar, then I don't see anything wrong with doing it lazily.
I have seen code with the following logic in a few places:
public void func()
{
_myDictonary["foo"] = null;
_myDictionary.Remove("foo");
}
What is the point of setting foo to null in the dictionary before removing it?
I thought the garbage collection cares about the number of things pointing to whatever *foo originally was. If that's the case, wouldn't setting myDictonary["foo"] to null simply decrease the count by one? Wouldn't the same thing happen once myDictonary.Remove("foo") is called?
What is the point of _myDictonary["foo"] = null;
edit: To clarify - when I said "remove the count by one" I meant the following:
- myDictonary["foo"] originally points to an object. That means the object has one or more things referencing it.
- Once myDictonary["foo"] is set to null it is no longer referencing said object. This means that object has one less thing referencing it.
There is no point at all.
If you look at what the Remove method does (using .NET Reflector), you will find this:
this.entries[i].value = default(TValue);
That line sets the value of the dictionary item to null, as the default value for a reference type is null. So, as the Remove method sets the reference to null, there is no point to do it before calling the method.
Setting a dictionary entry to null does not decrease the ref count, as null is a perfectly suitable value to point to in a dictionary.
The two statements do very different things. Setting the value to null indicates that that is what the value should be for that key, whereas removing that key from the dictionary indicates that it should no longer be there.
There isn't much point to it.
However, if the Remove method causes heap allocations, and if the stored value is large, a garbage collection can happen when you call Remove, and it can also collect the value in the process (potentially freeing up memory). Practically, though, people don't usually worry about small things like this, unless it's been shown to be useful.
Edit:
Forgot to mention: Ideally, the dictionary itself should worry about its own implementation like this, not the caller.
It doesn't make much sense there, but there are times when it does make sense.
One example is in a Dispose() method. Consider this type:
public class Owner
{
// snip
private BigAllocation _bigAllocation;
// snip
protected virtual void Dispose(bool disposing)
{
if (disposing)
{
// free managed resources
if (_bigAllocation != null)
{
_bigAllocation.Dispose();
_bigAllocation = null;
}
}
}
}
Now, you could argue that this is unnecessary, and you'd be mostly right. Usually Dispose() is only called before Owner is dereferenced, and when Owner is collected, _bigAllocation will be, as well... eventually.
However:
Setting _bigAllocation to null makes it eligible for collection right away, if nobody else has a reference to it. This can be advantageous if Owner is in a higher-numbered GC generation, or has a finalizer. Otherwise, Owner must be released before _bigAllocation is eligible for collection.
This is sort of a corner case, though. Most types shouldn't have finalizers, and in most cases _bigAllocation and Owner would be in the same generation.
I guess I could maybe see this being useful in a multi-threaded application where you null the object so no other thread can operate on it. Though, this seems heavy handed and poor design.
If that's the case, wouldn't setting myDictonary["foo"] to null
simply decrease the count by one?
Nope, the count doesn't change, the reference is still in the dictionary, it points to null.
I see no reason for the code being the way it is.
I don't know about the internals of Dictionary in particular, but some types of collection may hold references to objects which are effectively 'dead'. For example, a collection may hold an array and a count of how many valid items are in the array; zeroing the count would make any items in the collection inaccessible, but would not destroy any references to them. It may be that deleting an item from a Dictionary ends up making the area that holds the object available for reuse without actually deleting it. If another item with the same hash code gets added to the dictionary, then the first item would actually get deleted, but that might not happen for awhile, if ever.
This looks like an old C++ habit.
I suspect that the author is worried about older collections and/or other languages. If memory serves, some collections in C++ would hold pointers to the collected objects and when 'removed' would only remove the pointer but would not automatically call the destructor of the newly removed object. This causes a very subtle memory leak. The habit became to set the object to null before removing it to make sure the destructor was called.
Working on a number of legacy systems written in various versions of .NET, across many different companies, I keep finding examples of the following pattern:
public void FooBar()
{
object foo = null;
object bar = null;
try
{
foo = new object();
bar = new object();
// Code which throws exception.
}
finally
{
// Destroying objects
foo = null;
bar = null;
}
}
To anybody that knows how memory management works in .NET, this kind of code is painfully unnecessary; the garbage collector does not need you to manually assign null to tell that the old object can be collected, nor does assigning null instructs the GC to immediately collected the object.
This pattern is just noise, making it harder to understand what the code is trying to achieve.
Why, then, do I keep finding this pattern? Is there a school that teaches this practice? Is there a language in which assigning null values to locally scoped variables is required to correctly manage memory? Is there some additional value in explicitly assigning null that I haven't percieved?
It's FUDcargo cult programming (thanks to Daniel Earwicker) by developers who are used to "free" resources, bad GC implementations and bad API.
Some GCs didn't cope well with circular references. To get rid of them, you had to break the cycle "somewhere". Where? Well, if in doubt, then everywhere. Do that for a year and it's moved into your fingertips.
Also setting the field to null gives you the idea of "doing something" because as developers, we always fear "to forget something".
Lastly, we have APIs which must be closed explicitly because there is no real language support to say "close this when I'm done with it" and let the computer figure it out just like with GC. So you have an API where you have to call cleanup code and API where you don't. This sucks and encourages patterns like the above.
It is possible that it came from VB which used a reference counting strategy for memory management and object lifetime. Setting a reference to Nothing (equivalent to null) would decrement the reference count. Once that count became zero then the object was destroyed synchronously. The count would be decremented automatically upon leaving the scope of a method so even in VB this technique was mostly useless, however there were special situations where you would want to greedily destroy an object as illustrated by the following code.
Public Sub Main()
Dim big As Variant
Set big = GetReallyBigObject()
Call big.DoSomething
Set big = Nothing
Call TimeConsumingOperation
Call ConsumeMoreMemory
End Sub
In the above code the object referenced by big would have lingered until the end without the call to Set big = Nothing. That may be undesirable if the other stuff in the method was a time consuming operation or generated more memory pressure.
It comes from C/C++ where explicitly made setting your pointers to null was the norm (to eliminate dangling pointers)
After calling free():
#include <stdlib.h>
{
char *dp = malloc ( A_CONST );
// Now that we're freeing dp, it is a dangling pointer because it's pointing
// to freed memory
free ( dp );
// Set dp to NULL so it is no longer dangling
dp = NULL;
}
Classic VB developers also did the same thing when writing their COM components to prevent memory leaks.
It is more common in languages with deterministic garbage collection and without RAII, such as the old Visual Basic, but even there it's unnecessary and there it was often necessary to break cyclic references. So possibly it really stems from bad C++ programmers who use dumb pointers all over the place. In C++, it makes sense to set dumb pointers to 0 after deleting them to prevent double deletion.
I've seen this a lot in VBScript code (classic ASP) and I think it comes from there.
I think it used to be a common misunderstanding among former C/C++ developers. They knew that the GC will free their memory, but they didn't really understand when and how. Just clean it and carry on :)
I suspect that this pattern comes from translating C++ code to C# without pausing to understand the differences between C# finalization and C++ finalization. In C++ I often null things out in the destructor, either for debugging purposes (so that you can see in the debugger that the reference is no longer valid) or, rarely, because I want a smart object to be released. (If that's the meaning I'd rather call Release on it and make the meaning of the code crystal-clear to the maintainers.) As you note, this is pretty much senseless in C#.
You see this pattern in VB/VBScript all the time too, for different reasons. I mused a bit about what might cause that here:
Link
May be the convention of assigning null originated from the fact that had foo been an instance variable instead of a local variable, you should remove the reference before GC can collect it. Someone slept during the first sentence and started nullifying all their variables; the crowd followed.
It comes from C/C++ where doing a free()/delete on an already released pointer could result in a crash while releasing a NULL-pointer simply did nothing.
This means that this construct (C++) will cause problems
void foo()
{
myclass *mc = new myclass(); // lets assume you really need new here
if (foo == bar)
{
delete mc;
}
delete mc;
}
while this will work
void foo()
{
myclass *mc = new myclass(); // lets assume you really need new here
if (foo == bar)
{
delete mc;
mc = NULL;
}
delete mc;
}
Conclusion: IT's totally unneccessary in C#, Java and just about any other garbage-collecting language.
Consider a slight modification:
public void FooBar()
{
object foo = null;
object bar = null;
try
{
foo = new object();
bar = new object();
// Code which throws exception.
}
finally
{
// Destroying objects
foo = null;
bar = null;
}
vavoom(foo,bar);
}
The author(s) may have wanted to ensure that the great Vavoom (*) did not get pointers to malformed objects if an exception was previously thrown and caught. Paranoia, resulting in defensive coding, is not necessarily a bad thing in this business.
(*) If you know who he is, you know.
VB developers had to dispose all of their objects, to try and mitigate the chance of a Memory leak. I can imagine this is where it has come from as VB devs migrated over to .NEt / c#
I can see it coming from either a misunderstanding of how the garbage collection works, or an effort to force the GC to kick in immediately - perhaps because the objects foo and bar are quite large.
I've seen this in some Java code before. It was used on a static variable to signal that the object should be destroyed.
It probably didn't originate from Java though, as using it for anything other than a static variable would also not make sense in Java.
It comes from C++ code especially smart pointers. In that case it's rougly equivalent to a .Dispose() in C#.
It's not a good practice, at most a developer's instinct. There is no real value by assigning null in C#, except may be helping the GC to break a circular reference.
Will calling close on my WCF service kill all resources or set them up for GC or should I set it to null also?
Firstly, WCF proxies are IDisposable, so you can kind of use using:
using(var proxy = new MyProxy()) { // see below - not quite enough
// use proxy
}
Unfortunately, WCF also has a buggy Dispose() implementation that regularly throws exceptions. However, here's a really cool trick to get it to work correctly. I also blogged about this myself, but I think the first link is a lot better.
So: use IDisposable and using, but use it with caution (in this case).
Setting a field usually makes no difference. There are a few edge-cases (such as variables captured by multiple delegates, static fields, long-life objects, etc), but in general leave it alone. In particular, do not do this, as this can theoretically extend the life:
if(field != null) field = null; // BAD
This is not so much a WCF question as a .NET question; see also
Setting Objects to Null/Nothing after use in .NET
Is disposing this object, enough? or do i need to do more?
In the Dispose(bool) method implementation, Shouldn't one set members to null?
You only need to set a variable to null if it's going to be reachable for a long time afterwards. Say, a field on a long-lived object, or a static field. This holds in general, not just for WCF.