I'm working on a TCP-socket-related application, where an object I've created refers to a System.Net.Sockets.Socket object. That latter object seems to become null, and in order to understand why, I would like to check whether my own object gets re-created. For that, I thought of the simplest possible approach: checking the memory address of this. However, when adding this to the watch window I get the following error message:
Name Value
&this error CS0211: Cannot take the address of the given expression
As it seems to be impossible to check the memory address of an object in C#, how can I verify that I'm dealing with the same or another object when debugging my code?
In C#, objects are moved during garbage collection. You can't simply take the address of an object, because the address changes when the GC heap is compacted.
Dealing with pointers in C# requires unsafe code; you leave the terrain of safe code, basically making it as unsafe as C++.
You can use a debugger like WinDbg, which displays the memory addresses of objects - but they will still change when the GC moves them around.
If you want to see if a new instance of your class gets created, why not set a breakpoint in the constructor?
I am convinced by Thomas's answer above.
You can add a unique identifier (such as a GUID) property to your object and use that to determine whether you have the same object.
You could also override the Equals method to compare two objects for equality, as below.
public class MyClass
{
    // Each instance gets its own identifier at construction time.
    public Guid Id { get; } = Guid.NewGuid();

    public override bool Equals(object obj)
    {
        return obj is MyClass second && this.Id == second.Id;
    }

    // When overriding Equals, also override GetHashCode so the two stay consistent.
    public override int GetHashCode() => Id.GetHashCode();
}
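For illustration, a quick usage sketch (assuming the MyClass above):
// Illustrative usage: the Id distinguishes instances while debugging.
var first = new MyClass();
var alias = first;
var second = new MyClass();

Console.WriteLine(first.Equals(alias));    // True  - same instance, same Id
Console.WriteLine(first.Equals(second));   // False - a different instance has a different Id
Console.WriteLine(first.Id);               // print the Id to correlate debugger/log output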
As already explained, addresses of objects are not a viable means of reasoning about objects in garbage-collected runtimes like .NET. In .NET you may get the chance to observe the address of an object if you use the fixed keyword, unsafe blocks, or a pinned GCHandle, but these are all very hacky, and they keep objects pinned in memory so the garbage collector cannot move them (and, with a GCHandle, cannot collect them), which is something that you absolutely do not want. The moment you unpin an object, its address is free to change, so you cannot keep track of it.
Luckily, you do not need any of that!
You don't need addresses, because all you want is a mnemonic for each object, for the purpose of identifying it during troubleshooting. For this, you have the following options:
Create a singleton which issues unique ids, and in the constructor of each object invoke this singleton to obtain a unique id, store the id with the object, and include the id in the ToString() method of the object, or in whatever other method you might be using for debug display.
Use the System.Runtime.Serialization.ObjectIDGenerator class, which does more or less what the singleton id generator would do, but in a more advanced, and possibly easier to use way. (I have no personal experience using it, so I cannot give any more advice about it.)
Use the System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode( object ) method, which returns what is known in other circles as the identity hash code of an object. It is guaranteed to remain unchanged throughout the lifetime of the object, but it is not guaranteed to be unique among all objects. However, since it is 32 bits long, it will be a cold day in hell before another object gets issued the same hash code by coincidence, so it will serve all your troubleshooting purposes just fine.
Do yourself a favor and display the Identity Hash Code of your objects in hexadecimal; the number will be shorter, and will have a wider variety of digits than decimal, so it will be easier to retain in short-term memory while troubleshooting.
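A minimal sketch of that approach (the helper name is made up):
using System;
using System.Runtime.CompilerServices;

static class DebugIds
{
    // Identity hash code formatted as hex, for inclusion in ToString()
    // or log output while troubleshooting.
    public static string Of(object obj) =>
        RuntimeHelpers.GetHashCode(obj).ToString("X8");
}

// Usage (illustrative):
// Console.WriteLine($"socket wrapper {DebugIds.Of(this)} created");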
Suppose I have a loop which creates a new instance of an EF class to add to a database table on each loop iteration:
foreach (var record in records)
{
InvestmentTransactionStaging newRecord = new()
{
UserId = UserId,
InvestmentTransactionTypeId = interestRepaymentTransactionTypeId,
InvestmentEntityId = InvestmentEntityId,
Description = record.LoanReference,
Date = DateTime.Parse(record.Date),
};
_context.InvestmentTransactionsStaging.Add(newRecord);
}
Now normally, in the languages I am used to such as PHP, it is not important to destroy the instance after the call to _context.Add(); the garbage collector takes care of it. If I omit this, as I have above, would it potentially cause any issues with memory in C#? Do I need to destroy the instance on each iteration that it is instantiated, and if so, how? (I could not use the Dispose method, as the compiler complains the method is unavailable.)
Created objects are automatically destroyed when they are no longer reachable.
Meaning, any object created lives only as long as the running code is still inside the curly braces where it was instantiated.
Update - Going deep
As mentioned by computercarguy (as well as in the comments), it's not accurate to say the object only lives inside the code block in which it was declared, because the object lives as long as there is a reference to it in the program.
Meaning, if you declare an instance called newRecord and you don't assign it to any other reference declared outside the code block in which newRecord was declared, then yes, when execution exits that code block (and the garbage collector later runs), it is removed from memory. But if there is another reference, for example oldRecord, and you assign the value (the reference) of newRecord to it before the end of the code block, then the object isn't removed from memory, for it is still being referenced by oldRecord, as sketched below.
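A hedged sketch of that scenario (the types here are placeholders):
using System;

class Program
{
    static void Main()
    {
        object oldRecord = null;

        {
            var newRecord = new object();
            oldRecord = newRecord;   // the reference escapes the inner block
        }

        // newRecord's scope has ended, but the object it referred to is still
        // reachable through oldRecord, so the GC will not reclaim it yet.
        Console.WriteLine(oldRecord != null);   // True
    }
}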
In conclusion: the garbage collector checks for objects that are no longer being used by the application, so there is no point in removing an object manually; the GC does it for you when it is safe to remove the object, which is exactly the part that can cause problems when implemented by us imperfect humans ☺
TLDR: Even if you set newRecord to null, you wouldn't have destroyed the class instance. It's still accessible in the _context.InvestmentTransactionsStaging object. (In this case, _context is probably an EF database context.)
Long answer:
I might be getting into the weeds here, but your variable is just a pointer to a memory address where the class instance/object was created. When you null out that variable, you simply remove the memory address value from the variable, not the object itself. The object remains if other variables are pointing to the same memory address. Automatic garbage collection happens when that object's memory address is no longer being pointed to by any variables or other objects. This is when things get "destroyed".
This happens in many modern languages, not just C#. It might happen differently in C# than it does in PHP, but it ends up with similar results.
Even saying that the object is destroyed is not 100% correct, since the values aren't usually removed. The memory address is just used for something else and overwritten as needed. That can be within the same program or it could be returned to the OS for it to allocate to another process.
And this doesn't cover the complete lifecycle of objects, but it's the most useful explanation until you need to get into much lower level programming of microcontrollers and other embedded systems.
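A minimal sketch of that point, using a List&lt;object&gt; as a stand-in for the _context staging collection (names are illustrative):
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        var staging = new List<object>();   // stand-in for _context.InvestmentTransactionsStaging
        var newRecord = new object();

        staging.Add(newRecord);
        newRecord = null;                   // only the local reference is cleared

        // The object is still reachable through the list, so the GC will not collect it.
        Console.WriteLine(staging.Count);   // 1
    }
}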
When I first began as a junior C# dev, I was always told during code reviews that if I was accessing an object's property more than once in a given scope, then I should create a local variable within the routine, as that was cheaper than having to retrieve it from the object each time. I never really questioned it, as it came from people I perceived to be quite knowledgeable at the time.
Below is a rudimentary example
Example 1: storing an object's identifier in a local variable
public void DoWork(MyDataType myData)
{
    long id = myData.Id;
    if (ObjectLookup.TryAdd(id, myData))
    {
        DoSomeOtherWork(id);
    }
}
Example 2: retrieving the identifier from the object's Id property each time it is needed
public void DoWork(MyDataType myData)
{
    if (ObjectLookup.TryAdd(myData.Id, myData))
    {
        DoSomeOtherWork(myData.Id);
    }
}
Does it actually matter or was it more a preference of coding style where I was working? Or perhaps a situational design time choice for the developer to make?
As explained in this answer, if the property is a basic getter/setter, then the CLR "will inline the property access and generate code that's as efficient as accessing a field directly". However, if your property, for example, does some calculations every time it is accessed, then storing the value of the property in a local variable will avoid the overhead of those calculations being repeated.
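As a hedged illustration (the types and names below are made up): a simple auto-property is typically inlined by the JIT, whereas a computed property re-runs its body on every access, so caching the result in a local avoids repeating the work.
using System;
using System.Collections.Generic;
using System.Linq;

public class Report
{
    // Auto-property: the JIT typically inlines this, so repeated access
    // costs about the same as reading a field.
    public long Id { get; set; }

    // Computed property: every access re-runs the LINQ query.
    public decimal Total => Items.Sum(i => i.Price);

    public List<Item> Items { get; } = new();
}

public record Item(decimal Price);

public static class Demo
{
    public static void Describe(Report report)
    {
        // Cache the computed value once instead of summing the list twice.
        decimal total = report.Total;
        Console.WriteLine(total > 1000 ? $"Large: {total}" : $"Small: {total}");
    }
}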
All the memory allocation stuff aside, there is the principle of DRY (don't repeat yourself). When you can deal with one variable with a short name rather than repeating the object nesting to access the external property, why not do that?
Apart from that, by creating that local variable you are respecting the single responsibility principle by isolating the methods from the external entity they don't need to know about.
And lastly, if the so-called reuse leads to unwanted instantiation of reference types or any repeated calculation, then it is a must to create the local variable and reuse it throughout the class/method.
Any way you look at it, this practice helps with readability and more maintainable code, and possibly safer too.
I don't know if it is faster or not (though I would say that the difference is negligible and thus unimportant), but I'll cook up some benchmark for you.
What IS important, though, will be made evident to you with an example:
public class MyDataType
{
    public int Id
    {
        get
        {
            // Some actual code runs on every access
            return this.GetHashCode() * 2;
        }
    }
}
Does this make more sense? The first time I access the Id getter, some code is executed. The second time, the same code is executed again, costing twice as much for no reason.
It is very probable that the reviewers had some such case in mind, and instead of going into every single property to check what it does and whether it is cheap to access, they created a blanket rule.
Another reason to store the value would be usability.
Imagine the following example
object.subObject.someOtherSubObject.id
In this case I ask in reviews to store it in a variable even if it is used just once, because if this expression appears in a complicated if statement, it will reduce the readability and maintainability of the code in the future.
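A hedged sketch of that review comment (all type and property names here are invented for illustration):
using System.Collections.Generic;

public record SomeOtherSubObject(long Id);
public record SubObject(SomeOtherSubObject SomeOtherSubObject);
public record Order(SubObject SubObject);

public static class OrderChecks
{
    public static bool IsKnown(Order order, HashSet<long> knownIds)
    {
        // Store the deeply nested value once, under a descriptive name,
        // instead of repeating order.SubObject.SomeOtherSubObject.Id in the condition.
        long id = order.SubObject.SomeOtherSubObject.Id;
        return id > 0 && knownIds.Contains(id);
    }
}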
A local variable is essentially guaranteed to be fast, whereas there is an unknown amount of overhead involved in accessing the property.
It's almost always a good idea to avoid repeating code whenever possible. Storing the value once means that there is only one thing to change if it needs changing, rather than two or more.
Using a variable allows you to provide a name, which gives you an opportunity to describe your intent.
I would also point out that if you're referring to other members of an object a lot in one place, that can often be a strong indication that the code you're writing actually belongs in that other type instead.
You should also consider that getting a value from a method or property that is computed by an I/O-bound or CPU-bound process can be wasteful to repeat. Therefore, it's better to define a variable and store the result to avoid running the same processing multiple times.
In a case like object.Id, copying the value into a local variable also guarantees that the value you work with will not change within the scope (note that the const keyword only works for compile-time constants, so a plain local variable is what you would use here).
Finally, it's generally better to use a local variable inside classes and methods for values you access repeatedly.
I need to know if two references from completely different parts of the program refer to the same object.
I cannot compare the references programmatically because they are in different contexts (one reference is not visible from the other and vice versa).
Then I want to print a unique identifier for each object using Console.WriteLine(). But the ToString() method doesn't return a "unique" identifier; it just returns the class name.
Is it possible to print a unique identifier in C# (like in Java)?
The closest you can easily get (which won't be affected by the GC moving objects around etc) is probably RuntimeHelpers.GetHashCode(Object). This gives the hash code which would be returned by calling Object.GetHashCode() non-virtually on the object. This is still not a unique identifier though. It's probably good enough for diagnostic purposes, but you shouldn't rely on it for production comparisons.
EDIT: If this is just for diagnostics, you could add a sort of "canonicalizing ID generator" which was just a List<object>... when you ask for an object's "ID" you'd check whether it already existed in the list (by comparing references) and then add it to the end if it didn't. The ID would be the index into the list. Of course, doing this without introducing a memory leak would involve weak references etc, but as a simple hack this might work for you.
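A minimal sketch of that idea, without the weak-reference refinement (so it does keep every object it has seen alive; purely a hypothetical helper for short debugging sessions):
using System.Collections.Generic;

// Hypothetical diagnostic helper, not production code.
public static class ObjectIds
{
    private static readonly List<object> _seen = new();

    public static int GetId(object obj)
    {
        lock (_seen)
        {
            // Compare by reference identity, not by Equals.
            for (int i = 0; i < _seen.Count; i++)
            {
                if (ReferenceEquals(_seen[i], obj))
                    return i;
            }
            _seen.Add(obj);
            return _seen.Count - 1;
        }
    }
}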
one reference is not visible from another and vice versa
I don't buy that. If you couldn't even get the handles, how would you get their IDs?
In C# you can always get handles to objects, and you can always compare them. Even if you have to use reflection to do it.
If you need to know whether two references point to the same object, I'll just cite this:
By default, the operator == tests for reference equality. This is done by determining if two references indicate the same object. Therefore reference types do not need to implement operator == in order to gain this functionality.
So, == operator will do the trick without doing the Id workaround.
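For illustration (noting that == can be overloaded, e.g. for string, in which case object.ReferenceEquals is the explicit reference check):
using System;

var a = new object();
var b = a;              // a second reference to the same object
var c = new object();   // a different object

Console.WriteLine(a == b);                  // True  - same instance
Console.WriteLine(a == c);                  // False - different instances
Console.WriteLine(ReferenceEquals(a, c));   // False - explicit reference comparison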
I presume you're calling ToString on your object reference, but I'm not entirely clear on this or on your explained situation, TBH, so just bear with me.
Does the type expose an ID property? If so, try this:
var idAsString = yourObjectInstance.ID.ToString();
Or, print directly:
Console.WriteLine(yourObjectInstance.ID);
EDIT:
I see Jon saw right through this problem and makes my answer look rather naive - regardless, I'm leaving it in, if for nothing else than to emphasise the lack of clarity of the question, and also, maybe, to provide an avenue to go down based on Jon's statement that 'This [GetHashCode] is still not a unique identifier', should you decide to expose your own uniqueness by way of an identifier.
I have seen code with the following logic in a few places:
public void func()
{
_myDictonary["foo"] = null;
_myDictionary.Remove("foo");
}
What is the point of setting foo to null in the dictionary before removing it?
I thought garbage collection cares about the number of things pointing to whatever _myDictionary["foo"] originally referred to. If that's the case, wouldn't setting _myDictionary["foo"] to null simply decrease that count by one? Wouldn't the same thing happen once _myDictionary.Remove("foo") is called?
What is the point of _myDictionary["foo"] = null;
Edit: To clarify, when I said "decrease the count by one" I meant the following:
- _myDictionary["foo"] originally points to an object. That means the object has one or more things referencing it.
- Once _myDictionary["foo"] is set to null, it is no longer referencing said object. This means the object has one less thing referencing it.
There is no point at all.
If you look at what the Remove method does (using .NET Reflector), you will find this:
this.entries[i].value = default(TValue);
That line sets the value of the dictionary entry to the default value, which for a reference type is null. So, since the Remove method itself sets the reference to null, there is no point in doing it before calling the method.
Setting a dictionary entry to null does not remove the entry or decrease any count; null is a perfectly valid value to store in a dictionary.
The two statements do very different things. Setting the value to null indicates that that is what the value should be for that key, whereas removing that key from the dictionary indicates that it should no longer be there.
There isn't much point to it.
However, if the Remove method causes heap allocations, and if the stored value is large, a garbage collection can happen when you call Remove, and it can also collect the value in the process (potentially freeing up memory). Practically, though, people don't usually worry about small things like this, unless it's been shown to be useful.
Edit:
Forgot to mention: Ideally, the dictionary itself should worry about its own implementation like this, not the caller.
It doesn't make much sense there, but there are times when it does make sense.
One example is in a Dispose() method. Consider this type:
public class Owner
{
// snip
private BigAllocation _bigAllocation;
// snip
protected virtual void Dispose(bool disposing)
{
if (disposing)
{
// free managed resources
if (_bigAllocation != null)
{
_bigAllocation.Dispose();
_bigAllocation = null;
}
}
}
}
Now, you could argue that this is unnecessary, and you'd be mostly right. Usually Dispose() is only called shortly before the last reference to Owner goes away, and when Owner is collected, _bigAllocation will be, as well... eventually.
However:
Setting _bigAllocation to null makes it eligible for collection right away, if nobody else has a reference to it. This can be advantageous if Owner is in a higher-numbered GC generation or has a finalizer. Otherwise, Owner itself must become unreachable before _bigAllocation is eligible for collection.
This is sort of a corner case, though. Most types shouldn't have finalizers, and in most cases _bigAllocation and Owner would be in the same generation.
I guess I could maybe see this being useful in a multi-threaded application where you null the object so no other thread can operate on it. Though, this seems heavy handed and poor design.
If that's the case, wouldn't setting _myDictionary["foo"] to null simply decrease the count by one?
Nope, the count doesn't change; the entry is still in the dictionary, it just points to null.
I see no reason for the code being the way it is.
I don't know about the internals of Dictionary in particular, but some types of collection may hold references to objects which are effectively 'dead'. For example, a collection may hold an array and a count of how many valid items are in the array; zeroing the count would make any items in the collection inaccessible, but would not destroy any references to them. It may be that deleting an item from a Dictionary ends up making the area that holds the object available for reuse without actually deleting it. If another item with the same hash code gets added to the dictionary, then the first item would actually get deleted, but that might not happen for awhile, if ever.
This looks like an old C++ habit.
I suspect that the author is worried about older collections and/or other languages. If memory serves, some collections in C++ would hold pointers to the collected objects and when 'removed' would only remove the pointer but would not automatically call the destructor of the newly removed object. This causes a very subtle memory leak. The habit became to set the object to null before removing it to make sure the destructor was called.
Should you set all the objects to null (Nothing in VB.NET) once you have finished with them?
I understand that in .NET it is essential to dispose of any instances of objects that implement the IDisposable interface to release some resources, although the object can still be something after it is disposed (hence the IsDisposed property on forms), so I assume it can still reside in memory, or at least in part?
I also know that when an object goes out of scope it is then marked for collection ready for the next pass of the garbage collector (although this may take time).
So, with this in mind, will setting it to null speed up the system in releasing the memory, as it does not have to work out that it is no longer in scope, and are there any bad side effects?
MSDN articles never do this in examples, and currently I do it as I cannot see the harm. However, I have come across a mixture of opinions, so any comments are useful.
Karl is absolutely correct: there is no need to set objects to null after use. If an object implements IDisposable, just make sure you call IDisposable.Dispose() when you're done with that object (wrapped in a try..finally or a using() block). But even if you don't remember to call Dispose(), the finaliser method on the object should call Dispose() for you.
I thought this was a good treatment:
Digging into IDisposable
and this
Understanding IDisposable
There isn't any point in trying to second-guess the GC and its management strategies, because it's self-tuning and opaque. There was a good discussion about the inner workings with Jeffrey Richter on .NET Rocks: Jeffrey Richter on the Windows Memory Model. Richter's book CLR via C#, chapter 20, also has a great treatment.
Another reason to avoid setting objects to null when you are done with them is that it can actually keep them alive for longer.
e.g.
void foo()
{
var someType = new SomeType();
someType.DoSomething();
// someType is now eligible for garbage collection
// ... rest of method not using 'someType' ...
}
will allow the object referred to by someType to be GC'd after the call to "DoSomething", but
void foo()
{
var someType = new SomeType();
someType.DoSomething();
// someType is NOT eligible for garbage collection yet
// because that variable is used at the end of the method
// ... rest of method not using 'someType' ...
someType = null;
}
may sometimes keep the object alive until the end of the method. The JIT will usually optimize away the assignment to null, so both bits of code end up being the same.
No don't null objects. You can check out https://web.archive.org/web/20160325050833/http://codebetter.com/karlseguin/2008/04/28/foundations-of-programming-pt-7-back-to-basics-memory/ for more information, but setting things to null won't do anything, except dirty your code.
Also:
using (SomeObject someObject = new SomeObject())
{
    // do stuff with the object
}
// the object will be disposed of
In general, there's no need to null objects after use, but in some cases I find it's a good practice.
If an object implements IDisposable and is stored in a field, I think it's good to null it, just to avoid using the disposed object. The bugs of the following sort can be painful:
this.myField.Dispose();
// ... at some later time
this.myField.DoSomething();
It's good to null the field after disposing it, and get a NullReferenceException right at the line where the field is used again. Otherwise, you might run into some cryptic bug down the line (depending on exactly what DoSomething does).
Chances are that your code is not structured tightly enough if you feel the need to null variables.
There are a number of ways to limit the scope of a variable:
As mentioned by Steve Tranby
using (SomeObject someObject = new SomeObject())
{
    // do stuff with the object
}
// the object will be disposed of
Similarly, you can simply use curly brackets:
{
// Declare the variable and use it
SomeObject someObject = new SomeObject();
}
// The variable is no longer available
I find that using curly brackets without any "heading" really cleans up the code and helps make it more understandable.
In general there is no need to set things to null. But suppose you have Reset functionality in your class.
Then you might do it, because you do not want to call Dispose twice, since some implementations of Dispose may not be written correctly and may throw an ObjectDisposedException.
private void Reset()
{
if(_dataset != null)
{
_dataset.Dispose();
_dataset = null;
}
//..More such member variables like oracle connection etc. _oraConnection
}
The only time you should set a variable to null is when the variable does not go out of scope and you no longer need the data associated with it. Otherwise there is no need.
this kind of "there is no need to set objects to null after use" is not entirely accurate. There are times you need to NULL the variable after disposing it.
Yes, you should ALWAYS call .Dispose() or .Close() on anything that has it when you are done. Be it file handles, database connections or disposable objects.
Separate from that is the very practical pattern of LazyLoad.
Say I have an instantiated ObjA of class A. Class A has a public property called PropB of class B.
Internally, PropB uses the private field _PropB, which defaults to null. When the PropB getter is used, it checks whether _PropB is null and, if it is, opens the resources needed to instantiate a B into _PropB. It then returns _PropB.
In my experience, this is a really useful trick.
Where the need to null comes in is when you reset or change A in some way such that the contents of _PropB were derived from the previous values of A. You then need to Dispose AND null out _PropB so the lazy load can fetch the right value again if the code requires it.
If you only do _PropB.Dispose() and shortly afterwards expect the null check for the lazy load to succeed, it won't be null, and you'll be looking at stale data. In effect, you must null it after Dispose() just to be sure.
I sure wish it were otherwise, but I've got code right now exhibiting this behavior: after a Dispose() on _PropB, and outside of the calling function that did the Dispose (and thus almost out of scope), the private field still isn't null, and the stale data is still there.
Eventually, the disposed property will null out, but that's been non-deterministic from my perspective.
The core reason, as dbkk alludes to, is that the parent container (ObjA with its PropB) is keeping the instance of _PropB in scope, despite the Dispose().
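A hedged sketch of the lazy-load-plus-reset pattern described above (class names A and B follow the description; the implementation details are assumptions):
using System;

public class A : IDisposable
{
    private B _propB;

    // Lazily creates B on first access.
    public B PropB => _propB ??= new B();

    // Reset: dispose AND null the backing field, so the next access
    // to PropB re-creates a fresh B instead of returning stale data.
    public void Reset()
    {
        _propB?.Dispose();
        _propB = null;
    }

    public void Dispose() => Reset();
}

public class B : IDisposable
{
    public void Dispose()
    {
        // release whatever resources B holds
    }
}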
Stephen Cleary explains very well in this post: Should I Set Variables to Null to Assist Garbage Collection?
Says:
The Short Answer, for the Impatient
Yes, if the variable is a static field, or if you are writing an enumerable method (using yield return) or an asynchronous method (using async and await). Otherwise, no.
This means that in regular methods (non-enumerable and non-asynchronous), you do not set local variables, method parameters, or instance fields to null.
(Even if you’re implementing IDisposable.Dispose, you still should not set variables to null).
The important thing that we should consider is Static Fields.
Static fields are always root objects, so they are always considered “alive” by the garbage collector. If a static field references an object that is no longer needed, it should be set to null so that the garbage collector will treat it as eligible for collection.
Setting static fields to null is meaningless if the entire process is shutting down. The entire heap is about to be garbage collected at that point, including all the root objects.
Conclusion:
Static fields; that’s about it. Anything else is a waste of time.
There are some cases where it makes sense to null references. For instance, when you're writing a collection, like a priority queue, and by your contract you shouldn't be keeping those objects alive for the client after the client has removed them from the queue.
But this sort of thing only matters in long lived collections. If the queue's not going to survive the end of the function it was created in, then it matters a whole lot less.
On a whole, you really shouldn't bother. Let the compiler and GC do their jobs so you can do yours.
Take a look at this article as well: http://www.codeproject.com/KB/cs/idisposable.aspx
For the most part, setting an object to null has no effect. The only time you should be sure to do so is if you are working with a "large object", which is one larger than 85,000 bytes in size (such as bitmaps).
I believe that, by design of the GC implementers, you can't speed up GC with nullification. I'm sure they'd prefer you not worry yourself with how/when GC runs -- treat it like this ubiquitous Being protecting and watching over and out for you...(bows head down, raises fist to the sky)...
Personally, I often explicitly set variables to null when I'm done with them as a form of self documentation. I don't declare, use, then set to null later -- I null immediately after they're no longer needed. I'm saying, explicitly, "I'm officially done with you...be gone..."
Is nullifying necessary in a GC'd language? No. Is it helpful for the GC? Maybe yes, maybe no, don't know for certain, by design I really can't control it, and regardless of today's answer with this version or that, future GC implementations could change the answer beyond my control. Plus if/when nulling is optimized out it's little more than a fancy comment if you will.
I figure if it makes my intent clearer to the next poor fool who follows in my footsteps, and if it "might" potentially help GC sometimes, then it's worth it to me. Mostly it makes me feel tidy and clear, and Mongo likes to feel tidy and clear. :)
I look at it like this: Programming languages exist to let people give other people an idea of intent and a compiler a job request of what to do -- the compiler converts that request into a different language (sometimes several) for a CPU -- the CPU(s) could give a hoot what language you used, your tab settings, comments, stylistic emphases, variable names, etc. -- a CPU's all about the bit stream that tells it what registers and opcodes and memory locations to twiddle. Many things written in code don't convert into what's consumed by the CPU in the sequence we specified. Our C, C++, C#, Lisp, Babel, assembler or whatever is theory rather than reality, written as a statement of work. What you see is not what you get, yes, even in assembler language.
I do understand the mindset of "unnecessary things" (like blank lines) "are nothing but noise and clutter up code." That was me earlier in my career; I totally get that. At this juncture I lean toward that which makes code clearer. It's not like I'm adding even 50 lines of "noise" to my programs -- it's a few lines here or there.
There are exceptions to any rule. In scenarios with volatile memory, static memory, race conditions, singletons, usage of "stale" data and all that kind of rot, that's different: you NEED to manage your own memory, locking and nullifying as apropos because the memory is not part of the GC'd Universe -- hopefully everyone understands that. The rest of the time with GC'd languages it's a matter of style rather than necessity or a guaranteed performance boost.
At the end of the day make sure you understand what is eligible for GC and what's not; lock, dispose, and nullify appropriately; wax on, wax off; breathe in, breathe out; and for everything else I say: If it feels good, do it. Your mileage may vary...as it should...
I think setting something back to null is messy. Imagine a scenario where the item being set to null is exposed, say, via a property. Now, if some piece of code accidentally uses this property after the item is disposed, you will get a NullReferenceException, which requires some investigation to figure out exactly what is going on.
I believe framework disposables will usually throw an ObjectDisposedException, which is more meaningful. Not setting these back to null would be better for that reason.
Some objects expose a Dispose() method, which releases the resources they hold; calling it does not by itself remove the object from memory.