Suppose I have a loop which creates a new instance of a EF Class to add to databasde table in each loop iteration:
foreach (var record in records)
{
InvestmentTransactionStaging newRecord = new()
{
UserId = UserId,
InvestmentTransactionTypeId = interestRepaymentTransactionTypeId,
InvestmentEntityId = InvestmentEntityId,
Description = record.LoanReference,
Date = DateTime.Parse(record.Date),
};
_context.InvestmentTransactionsStaging.Add(newRecord);
}
Now normally, in the languages I am used to such as php it is not important destroy the instance after the call to _context..Add(), the garbage Collector takes care of it. If I omit this as I have above, would this then potentially cause any issues with memory in C#? Do I need to destroy the instance on each itertation that it is instamtiated and if so how? (I could not use the Disapose method as it complains the method is unavailable.
Created objects are automatically destructed when no longer reachable
Meaning, any object created, lives only as long as the running code is still between curly braces, where it was instantiated
Update - Going deep
As mentioned by computercarguy (as well as in the comments), it's not accurate to say the object only lives inside the code block in which it was declared, cause the object lives as long as there is a reference to it in the program.
Meaning, if you declare an instance called newRecord, and you don't assign it to any other pointer which has been declared outside of that code block in which newRecord was declared, then yes, when compiler exits that code block (and goes through the steps of Garbage Collection), it is removed from the memory, but if there is another pointer, for example oldRecord, and you assign the value (address) of the newRecord to it before the end of the code block, then the object isn't removed from memory, for it is still being pointed to, by a pointer ie oldRecord.
In conclusion: The garbage collector checks for objects that are no longer being used by the application, so there is no point in removing the object manually, because the GC does it for you, if it is safe to remove the object, for it can cause problems when implemented by us imperfect humans ☺
TLDR: Even if you set newRecord to null, you wouldn't have destroyed the class instance. It's still accessible in the _context.InvestmentTransactionsStaging object. (In this case, _context is probably a database.)
Long answer:
I might be getting into the weeds here, but your variable is just a pointer to a memory address where the class instance/object was created. When you null out that variable, you simply simply remove the value of the memory address from the variable, not the object. The object remains if other variables are pointing to the same memory address. Automatic garbage collection happens when that object's memory address is no longer being pointed to by any variables or other objects. This is when things get "destroyed".
This happens in many modern languages, not just C#. It might happen differently in C# than it does in PHP, but it ends up with similar results.
Even saying that the object is destroyed is not 100% correct, since the values aren't usually removed. The memory address is just used for something else and overwritten as needed. That can be within the same program or it could be returned to the OS for it to allocate to another process.
And this doesn't cover the complete lifecycle of objects, but it's the most useful explanation until you need to get into much lower level programming of microcontrollers and other embedded systems.
Related
I'm working on a TCP socket related application, where an object I've created refers to a System.Net.Sockets.Socket object. That latter object seems to become null and in order to understand why, I would like to check if my own object gets re-created. For that, I thought of the simplest possible approach by checking the memory address of this. However, when adding this to the watch-window I get following error message:
Name Value
&this error CS0211: Cannot take the address of the given expression
As it seems to be impossible to check the memory address of an object in C#, how can I verify that I'm dealing with the same or another object when debugging my code?
In C#, objects are moved during garbage collection. You can't simply take the address of it, because the address changed when the GC heap is compacted.
Dealing with pointers in C# requires unsafe code and you leave the terrain of safe code, basically making it as unsafe as C++.
You can use a debugger like windbg, which displays the memory addresses of objects - but they will still change when GC moves them around.
If you want to see if a new instance of your class gets created, why not set a breakpoint in the constructor?
I am convinced with #thomas answer above.
you can add a unique identifier (such as a GUID) property to your object and use that to determine if you have the same object.
you could override the Equals method to compare two objects if they same as below.
public class MyClass
{
public Guid Id { get; } = Guid.NewGuid();
public override bool Equals(object obj)
{
return obj is MyClass second && this.Id == second.Id;
}
}
As already explained, addresses of objects are not a viable means of reasoning about objects in garbage-collected virtual machines like DotNet. In DotNet you may get the chance to observe the address of an object if you use the fixed keyword, unsafe blocks, or GCHandle.Alloc(), but these are all very hacky and they keep objects fixed in memory so they cannot be garbage collected, which is something that you absolutely do not want. The moment you unfix an object, then its address is free to change, so you cannot keep track of it.
Luckily, you do not need any of that!
You don't need addresses, because all you want is a mnemonic for each object, for the purpose of identifying it during troubleshooting. For this, you have the following options:
Create a singleton which issues unique ids, and in the constructor of each object invoke this singleton to obtain a unique id, store the id with the object, and include the id in the ToString() method of the object, or in whatever other method you might be using for debug display.
Use the System.Runtime.Serialization.ObjectIDGenerator class, which does more or less what the singleton id generator would do, but in a more advanced, and possibly easier to use way. (I have no personal experience using it, so I cannot give any more advice about it.)
Use the System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode( object ) method, which returns what is known in other circles as The Identity Hash-Code of an Object. It is guaranteed to remain unchanged throughout the lifetime of the object, but it is not guaranteed to be unique among all objects. However, since it is 32-bits long, it will be a cold day in hell before another object gets issued the same hash code by coincidence, so it will serve all your troubleshooting purposes just fine.
Do yourself a favor and display the Identity Hash Code of your objects in hexadecimal; the number will be shorter, and will have a wider variety of digits than decimal, so it will be easier to retain in short-term memory while troubleshooting.
I'm new to object oriented code, and I have a question if this code is safe.
I have a local List "TempCand" that is assigned to a static member List "Candidates" of the class. When I leave the search method, I fear that the memory of my local variable is subject to garbage collection, which would then affect my static variable. Or is this ok?
public class search
{
class candidate
{
// ...
}
static List <candidate> Candidates = new List <candidate>();
static public void clean_Candidates()
{
List <candidate> TempCand = new List <candidate>();
// ...
// copy some elements of Candidates in clean_Candidates()
// ...
Candidates = TempCand;
}
}
I fear that the memory of my local variable is subject to garbage collection
One of the things that means something can't be garbage collected, is that there is an in-use local variable that references it.
which would then affect my static variable
Another of the things that means something can't be garbage collected, is that there is a static field that references it.
Local variables themselves aren't something that garbage collection affects at all; at some point when the method isn't using it (which may be when the scope is left, or may be before then) the local memory can just be re-used for something else. You pretty much will never care, because if you ever go to use it, then by definition you aren't at a point where it will never be used (there's an exception around timers and weak references, but you're using neither here).
Now, if that local variable is a reference type, then it could have been something that was keeping the object referenced alive. However, that again won't generally be visible, because this is only if it's the only reference to this object.
When the garbage collector kicks in, the first thing it does is to find all the things it can't collect:
Anything in a local variable that is in use.
Anything in a static field.
Anything in a field of an object the above two rules have said can't be collected.
Anything in a field that rule 3 said can't be collected, applying this rule recursively.
If you can "see" it, the GC can't collect it.
GC can't affect your static field, because by definition, being in a static field makes it off-limits.
TempCand and Candidates are references toward some data (a list of Candidate objects, in this case). Of course, you can create and destroy additional references without affecting the objects at all.
So, one possible scenario is this:
You create the Candidates list and put in 5 Candidate objects inside.
As the objects are referenced by the list, and the list is referenced by the static property, the objects will not be garbage collected.
You call the CleanCandidates static method. Within it, you create another reference (tempCand) to a list of candidates. You add references to three of the Candidate objects to the new list.
If a garbage run occurs it this moment, none of the Candidate objects will be collected, as they are still referenced from the static list.
You set that the static Candidates property now points to the new list and exit the method. If a garbage run occurs now, it can collect
The old list that Candidates used to reference. It's not referenced by anything now.
The two Candidate objects that were in the old list, but not in the new list, as there is nothing that references them now.
The tempCand reference, as it is out of scope - the method is finished, and a new reference will be created if it is called again. Note that the list that tempCand references will not be collected, as it's available now through the Candidates static field. (Note that collection of local fields is usually done "magically" with stack rewind)
So, the candidates you want alive will be (and stay alive) and the candidates that you threw out will eventually be garbage collected (if not referenced by some other, not shown code).
That said, the code shown is not thread safe. You could potentially execute CleanCandidates twice, and wreak havoc on the collection. You have to be very, very careful as your static list is effectively shared application state.
When an object is created, a reference is returned, and not the object.
What does this mean?
object a = new object();
Here a holds the reference.
It would be helpful if someone explains the creation of the object, creation of references.
I think of a reference as being like a set of directions to get to a house, where the house represents the object itself.
So if you were to tell someone how to get to your house, you might write down those directions on a piece of paper and give it to them - that's like assigning a reference to a variable.
Coming to your example:
object a = new object();
That's like building a new house (calling the constructor) and then on a piece of paper (the variable a) you write the directions to get to the new house. The paper doesn't have the house itself on it - just directions. If someone copies the contents of the piece of paper, like this:
object b = a;
that doesn't create a second house. It just copies the directions from the piece of paper a to the piece of paper b. Likewise then the two pieces of paper are independent - you could change the value of a to a different set of directions, and it wouldn't change what's on b.
I have an article on this which attempts to explain it another way, which you may find helpful.
The following statement:
object a = new object();
Actually does two different things.
First, new object() allocates the necessary memory to store the instance is allocated on the heap and returns the address on the heap that just got allocated.
Secondly, the assignment is evaluated and assigns the value returned from new to a.
When you say "a holds a reference" it means that a is the memory on the stack (or in registers, or on the heap depending no the lifetime of the reference) that points to the heap location of the instance you just created.
When you instantiate a class with the new keyword you create an object on the heap. It is not referenced by anyone yet. If an object has no references to itself it can be soon garbage collected. To operate with an object you need to reference it. So you create a variable which contains the address of the object(the reference).
With most modern languages you don't have to worry about references vs the object themselves. Back in the day (eg c++) you would have to worry about reference (sometimes called pointers) and even the memory allocated for them. Now the system (and what is called the garbage collector) worries about it for you.
Here is the details. Your example line means the following:
1) allocate memory for object
2) run the constructor
3) put the memory location of that object in the variable "a"
What does the mean to you conceptually as a programmer? Not much, most of the time. In C# you can still think of the variable a as the object. You don't have to worry about it pointing to the object under the hood. Most of the time it does not matter.
Where it matters is when you need to be concerned about the lifetime of the object. Because you have a reference to the object it will not be deallocated which means if it is using a system resource it will continue to do so.
When an object is no longer being referenced it will be deallocated by the garbage collector.
Another tongue-in-cheek analogy for the pile: the object is a helium balloon, and a reference is a string tied to that balloon which you are holding. Saying new object() is equivalent to asking a balloon guy (the memory manager) at the fair (your program) for a new balloon. He may either give you a balloon by means of handing you the string, or he may also tell you that there are no balloons left. You may find the latter very upsetting and run crying from the fair.
You may wish to share the balloon with your sibling, and here the analogy starts to fall apart. This could be seen as both you and your sibling holding onto the same string with your hands, or a second string being tied to the balloon for your sibling. Care must be taken to ensure that those who hold the string(s) tied to the balloon coordinate their movements, otherwise the balloon may be ripped or ruined.
Finally, in a language like C# when you become bored of the balloon, you can simply let go of the string. If others are still holding the string it will not go anywhere, but if they are not it floats harmlessly into a net in the sky, from where the balloon guy periodically collects the released balloons to replenish his stock. In other older languages like C, there is no such net and the balloon guy will never be able to recover the helium in that balloon to make another balloon for someone else. He would really appreciate it if you took the balloon back. This causes other problems: you may return the balloon, but forget that your sibling still wanted it. With their back turned, they don't notice you pulling the string out of their hand and later when they turn and look for it, they will find it is gone and be very upset.
Ridiculous analogy? Yes. Will it stick in your mind? Most likely :)
I have seen code with the following logic in a few places:
public void func()
{
_myDictonary["foo"] = null;
_myDictionary.Remove("foo");
}
What is the point of setting foo to null in the dictionary before removing it?
I thought the garbage collection cares about the number of things pointing to whatever *foo originally was. If that's the case, wouldn't setting myDictonary["foo"] to null simply decrease the count by one? Wouldn't the same thing happen once myDictonary.Remove("foo") is called?
What is the point of _myDictonary["foo"] = null;
edit: To clarify - when I said "remove the count by one" I meant the following:
- myDictonary["foo"] originally points to an object. That means the object has one or more things referencing it.
- Once myDictonary["foo"] is set to null it is no longer referencing said object. This means that object has one less thing referencing it.
There is no point at all.
If you look at what the Remove method does (using .NET Reflector), you will find this:
this.entries[i].value = default(TValue);
That line sets the value of the dictionary item to null, as the default value for a reference type is null. So, as the Remove method sets the reference to null, there is no point to do it before calling the method.
Setting a dictionary entry to null does not decrease the ref count, as null is a perfectly suitable value to point to in a dictionary.
The two statements do very different things. Setting the value to null indicates that that is what the value should be for that key, whereas removing that key from the dictionary indicates that it should no longer be there.
There isn't much point to it.
However, if the Remove method causes heap allocations, and if the stored value is large, a garbage collection can happen when you call Remove, and it can also collect the value in the process (potentially freeing up memory). Practically, though, people don't usually worry about small things like this, unless it's been shown to be useful.
Edit:
Forgot to mention: Ideally, the dictionary itself should worry about its own implementation like this, not the caller.
It doesn't make much sense there, but there are times when it does make sense.
One example is in a Dispose() method. Consider this type:
public class Owner
{
// snip
private BigAllocation _bigAllocation;
// snip
protected virtual void Dispose(bool disposing)
{
if (disposing)
{
// free managed resources
if (_bigAllocation != null)
{
_bigAllocation.Dispose();
_bigAllocation = null;
}
}
}
}
Now, you could argue that this is unnecessary, and you'd be mostly right. Usually Dispose() is only called before Owner is dereferenced, and when Owner is collected, _bigAllocation will be, as well... eventually.
However:
Setting _bigAllocation to null makes it eligible for collection right away, if nobody else has a reference to it. This can be advantageous if Owner is in a higher-numbered GC generation, or has a finalizer. Otherwise, Owner must be released before _bigAllocation is eligible for collection.
This is sort of a corner case, though. Most types shouldn't have finalizers, and in most cases _bigAllocation and Owner would be in the same generation.
I guess I could maybe see this being useful in a multi-threaded application where you null the object so no other thread can operate on it. Though, this seems heavy handed and poor design.
If that's the case, wouldn't setting myDictonary["foo"] to null
simply decrease the count by one?
Nope, the count doesn't change, the reference is still in the dictionary, it points to null.
I see no reason for the code being the way it is.
I don't know about the internals of Dictionary in particular, but some types of collection may hold references to objects which are effectively 'dead'. For example, a collection may hold an array and a count of how many valid items are in the array; zeroing the count would make any items in the collection inaccessible, but would not destroy any references to them. It may be that deleting an item from a Dictionary ends up making the area that holds the object available for reuse without actually deleting it. If another item with the same hash code gets added to the dictionary, then the first item would actually get deleted, but that might not happen for awhile, if ever.
This looks like an old C++ habit.
I suspect that the author is worried about older collections and/or other languages. If memory serves, some collections in C++ would hold pointers to the collected objects and when 'removed' would only remove the pointer but would not automatically call the destructor of the newly removed object. This causes a very subtle memory leak. The habit became to set the object to null before removing it to make sure the destructor was called.
Should you set all the objects to null (Nothing in VB.NET) once you have finished with them?
I understand that in .NET it is essential to dispose of any instances of objects that implement the IDisposable interface to release some resources although the object can still be something after it is disposed (hence the isDisposed property in forms), so I assume it can still reside in memory or at least in part?
I also know that when an object goes out of scope it is then marked for collection ready for the next pass of the garbage collector (although this may take time).
So with this in mind will setting it to null speed up the system releasing the memory as it does not have to work out that it is no longer in scope and are they any bad side effects?
MSDN articles never do this in examples and currently I do this as I cannot
see the harm. However I have come across a mixture of opinions so any comments are useful.
Karl is absolutely correct, there is no need to set objects to null after use. If an object implements IDisposable, just make sure you call IDisposable.Dispose() when you're done with that object (wrapped in a try..finally, or, a using() block). But even if you don't remember to call Dispose(), the finaliser method on the object should be calling Dispose() for you.
I thought this was a good treatment:
Digging into IDisposable
and this
Understanding IDisposable
There isn't any point in trying to second guess the GC and its management strategies because it's self tuning and opaque. There was a good discussion about the inner workings with Jeffrey Richter on Dot Net Rocks here: Jeffrey Richter on the Windows Memory Model and
Richters book CLR via C# chapter 20 has a great treatment:
Another reason to avoid setting objects to null when you are done with them is that it can actually keep them alive for longer.
e.g.
void foo()
{
var someType = new SomeType();
someType.DoSomething();
// someType is now eligible for garbage collection
// ... rest of method not using 'someType' ...
}
will allow the object referred by someType to be GC'd after the call to "DoSomething" but
void foo()
{
var someType = new SomeType();
someType.DoSomething();
// someType is NOT eligible for garbage collection yet
// because that variable is used at the end of the method
// ... rest of method not using 'someType' ...
someType = null;
}
may sometimes keep the object alive until the end of the method. The JIT will usually optimized away the assignment to null, so both bits of code end up being the same.
No don't null objects. You can check out https://web.archive.org/web/20160325050833/http://codebetter.com/karlseguin/2008/04/28/foundations-of-programming-pt-7-back-to-basics-memory/ for more information, but setting things to null won't do anything, except dirty your code.
Also:
using(SomeObject object = new SomeObject())
{
// do stuff with the object
}
// the object will be disposed of
In general, there's no need to null objects after use, but in some cases I find it's a good practice.
If an object implements IDisposable and is stored in a field, I think it's good to null it, just to avoid using the disposed object. The bugs of the following sort can be painful:
this.myField.Dispose();
// ... at some later time
this.myField.DoSomething();
It's good to null the field after disposing it, and get a NullPtrEx right at the line where the field is used again. Otherwise, you might run into some cryptic bug down the line (depending on exactly what DoSomething does).
Chances are that your code is not structured tightly enough if you feel the need to null variables.
There are a number of ways to limit the scope of a variable:
As mentioned by Steve Tranby
using(SomeObject object = new SomeObject())
{
// do stuff with the object
}
// the object will be disposed of
Similarly, you can simply use curly brackets:
{
// Declare the variable and use it
SomeObject object = new SomeObject()
}
// The variable is no longer available
I find that using curly brackets without any "heading" to really clean out the code and help make it more understandable.
In general no need to set to null. But suppose you have a Reset functionality in your class.
Then you might do, because you do not want to call dispose twice, since some of the Dispose may not be implemented correctly and throw System.ObjectDisposed exception.
private void Reset()
{
if(_dataset != null)
{
_dataset.Dispose();
_dataset = null;
}
//..More such member variables like oracle connection etc. _oraConnection
}
The only time you should set a variable to null is when the variable does not go out of scope and you no longer need the data associated with it. Otherwise there is no need.
this kind of "there is no need to set objects to null after use" is not entirely accurate. There are times you need to NULL the variable after disposing it.
Yes, you should ALWAYS call .Dispose() or .Close() on anything that has it when you are done. Be it file handles, database connections or disposable objects.
Separate from that is the very practical pattern of LazyLoad.
Say I have and instantiated ObjA of class A. Class A has a public property called PropB of class B.
Internally, PropB uses the private variable of _B and defaults to null. When PropB.Get() is used, it checks to see if _PropB is null and if it is, opens the resources needed to instantiate a B into _PropB. It then returns _PropB.
To my experience, this is a really useful trick.
Where the need to null comes in is if you reset or change A in some way that the contents of _PropB were the child of the previous values of A, you will need to Dispose AND null out _PropB so LazyLoad can reset to fetch the right value IF the code requires it.
If you only do _PropB.Dispose() and shortly after expect the null check for LazyLoad to succeed, it won't be null, and you'll be looking at stale data. In effect, you must null it after Dispose() just to be sure.
I sure wish it were otherwise, but I've got code right now exhibiting this behavior after a Dispose() on a _PropB and outside of the calling function that did the Dispose (and thus almost out of scope), the private prop still isn't null, and the stale data is still there.
Eventually, the disposed property will null out, but that's been non-deterministic from my perspective.
The core reason, as dbkk alludes is that the parent container (ObjA with PropB) is keeping the instance of _PropB in scope, despite the Dispose().
Stephen Cleary explains very well in this post: Should I Set Variables to Null to Assist Garbage Collection?
Says:
The Short Answer, for the Impatient
Yes, if the variable is a static field, or if you are writing an enumerable method (using yield return) or an asynchronous method (using async and await). Otherwise, no.
This means that in regular methods (non-enumerable and non-asynchronous), you do not set local variables, method parameters, or instance fields to null.
(Even if you’re implementing IDisposable.Dispose, you still should not set variables to null).
The important thing that we should consider is Static Fields.
Static fields are always root objects, so they are always considered “alive” by the garbage collector. If a static field references an object that is no longer needed, it should be set to null so that the garbage collector will treat it as eligible for collection.
Setting static fields to null is meaningless if the entire process is shutting down. The entire heap is about to be garbage collected at that point, including all the root objects.
Conclusion:
Static fields; that’s about it. Anything else is a waste of time.
There are some cases where it makes sense to null references. For instance, when you're writing a collection--like a priority queue--and by your contract, you shouldn't be keeping those objects alive for the client after the client has removed them from the queue.
But this sort of thing only matters in long lived collections. If the queue's not going to survive the end of the function it was created in, then it matters a whole lot less.
On a whole, you really shouldn't bother. Let the compiler and GC do their jobs so you can do yours.
Take a look at this article as well: http://www.codeproject.com/KB/cs/idisposable.aspx
For the most part, setting an object to null has no effect. The only time you should be sure to do so is if you are working with a "large object", which is one larger than 84K in size (such as bitmaps).
I believe by design of the GC implementors, you can't speed up GC with nullification. I'm sure they'd prefer you not worry yourself with how/when GC runs -- treat it like this ubiquitous Being protecting and watching over and out for you...(bows head down, raises fist to the sky)...
Personally, I often explicitly set variables to null when I'm done with them as a form of self documentation. I don't declare, use, then set to null later -- I null immediately after they're no longer needed. I'm saying, explicitly, "I'm officially done with you...be gone..."
Is nullifying necessary in a GC'd language? No. Is it helpful for the GC? Maybe yes, maybe no, don't know for certain, by design I really can't control it, and regardless of today's answer with this version or that, future GC implementations could change the answer beyond my control. Plus if/when nulling is optimized out it's little more than a fancy comment if you will.
I figure if it makes my intent clearer to the next poor fool who follows in my footsteps, and if it "might" potentially help GC sometimes, then it's worth it to me. Mostly it makes me feel tidy and clear, and Mongo likes to feel tidy and clear. :)
I look at it like this: Programming languages exist to let people give other people an idea of intent and a compiler a job request of what to do -- the compiler converts that request into a different language (sometimes several) for a CPU -- the CPU(s) could give a hoot what language you used, your tab settings, comments, stylistic emphases, variable names, etc. -- a CPU's all about the bit stream that tells it what registers and opcodes and memory locations to twiddle. Many things written in code don't convert into what's consumed by the CPU in the sequence we specified. Our C, C++, C#, Lisp, Babel, assembler or whatever is theory rather than reality, written as a statement of work. What you see is not what you get, yes, even in assembler language.
I do understand the mindset of "unnecessary things" (like blank lines) "are nothing but noise and clutter up code." That was me earlier in my career; I totally get that. At this juncture I lean toward that which makes code clearer. It's not like I'm adding even 50 lines of "noise" to my programs -- it's a few lines here or there.
There are exceptions to any rule. In scenarios with volatile memory, static memory, race conditions, singletons, usage of "stale" data and all that kind of rot, that's different: you NEED to manage your own memory, locking and nullifying as apropos because the memory is not part of the GC'd Universe -- hopefully everyone understands that. The rest of the time with GC'd languages it's a matter of style rather than necessity or a guaranteed performance boost.
At the end of the day make sure you understand what is eligible for GC and what's not; lock, dispose, and nullify appropriately; wax on, wax off; breathe in, breathe out; and for everything else I say: If it feels good, do it. Your mileage may vary...as it should...
I think setting something back to null is messy. Imagine a scenario where the item being set to now is exposed say via property. Now is somehow some piece of code accidentally uses this property after the item is disposed you will get a null reference exception which requires some investigation to figure out exactly what is going on.
I believe framework disposables will allows throw ObjectDisposedException which is more meaningful. Not setting these back to null would be better then for that reason.
Some object suppose the .dispose() method which forces the resource to be removed from memory.