Please explain Memory Leakage in Managed Code with example?

Please explain Memory Leakage in Managed Code with example? - c#

I was asked this question in an interview: How does memory leakage problem occur in C# as of all know Garbage Collector responsible for all the memory management related work? So how is it possible?

From MSDN:-
A memory leak occurs when memory is allocated in a program and is
never returned to the operating system, even though the program does
not use the memory any longer. The following are the four basic types of memory leaks:
In a manually managed memory environment: Memory is dynamically allocated and referenced by a pointer. The pointer is erased before the memory is freed. After the pointer is erased, the memory can no longer be accessed and therefore cannot be freed.
In a dynamically managed memory environment: Memory is disposed of but never collected, because a reference to the object is still active. Because a reference to the object is still active, the garbage collector never collects that memory. This can occur with a reference that is set by the system or the program.
In a dynamically managed memory environment: The garbage collector can collect and free the memory but never returns it to the operating system. This occurs when the garbage collector cannot move the objects that are still in use to one portion of the memory and free the rest.
In any memory environment: Poor memory management can result when many large objects are declared and never permitted to leave scope. As a result, memory is used and never freed.
Dim DS As DataSet
Dim cn As New SqlClient.SqlConnection("data source=localhost;initial catalog=Northwind;integrated security=SSPI")
cn.Open()
Dim da As New SqlClient.SqlDataAdapter("Select * from Employees", cn)
Dim i As Integer
DS = New DataSet()
For i = 0 To 1000
da.Fill(DS, "Table" + i.ToString)
Next
Although this code is obviously inefficient and not practical, it is meant to demonstrate that if objects are added to a collection (such as adding the tables to the DataSet collection), the objects are kept active as long as the collection remains alive. If a collection is declared at the global level of the program, and objects are declared throughout the program and added to that collection, this means that even though the objects are no longer in scope, the objects remain alive because they are still being referenced.
You may also check this reference:-
Identify And Prevent Memory Leaks In Managed Code
The above link gives a very good conclusion
Although .NET reduces the need for you to be concerned with memory,
you still must pay attention to your application's use of memory to
ensure that it is well-behaved and efficient. Just because an
application is managed doesn't mean you can throw good software
engineering practices out the window and count on the GC to perform
magic.

Just holding on to a reference to an object when you actually don't have any use for it anymore is a good way to leak. Particularly so when you store it in a static variable, that creates a reference that lives for the life of the AppDomain unless you explicitly set it back to null. If such a reference is stored in a collection then you can get a true leak that can crash your program with OOM. Not usual.
Such leaks can be hard to find. Particularly events can be tricky that way, they will get the event source object to add a reference to the event handler object when you subscribe an event handler. Hard to see in C# code because you never explicitly pass this when you subscribe the event. If the event source object lives for a long time then you can get in trouble with all of the subscriber objects staying referenced.
The tricky ones do require a memory profiler to diagnose.

Example would be a Class Child containing ClickEventHandler method subscribed to an Event ClickEvent of another class Parent.
GC of Child class would be blocked until Parent class goes out of scope..Even if Child goes out of scope it won't be collected by GC until Parent goes out of scope
All such subscribers subscribing to a Broadcaster(Event) would not be collected by GC until the broadcaster goes out of scope.
So, its a one way relation
Broadcaster(ClickEvent) -> Subscribers(ClickEventHandler)
GC of all the ClickEventHandlers would be blocked until ClickEvent goes out of scope!

As example:
You have application with main form and static variable Popups[] collectionOfPopups.
Store all application popups objects into static array and never delete them.
So with each new popup will take memory and GC will never release it.
Try to read how GC works
http://msdn.microsoft.com/en-us/library/ee787088.aspx
This will explains everything.

There can be multiple reasons, but here is one:
Consider two classes:
class A
{
private B b;
}
class B
{
private A a;
}
If you create an object for each of those classes, and cross-link them, and after that both those objects go out of the scope, you will still have link to each of them within another one. It is very difficult for GC to catch these sorts of crosslinks, and it may continue to believe that both objects are still in use.

Related

what happens to memory when reassign array in C#?

In the following code, what happens to the memory myArray initially pointed to once it's reassigned in the 2nd line? Is that memory lost, or does the C# garbage collector take care of it?
static void Main(string[] args)
{
double[] myArray = new Double[10];
myArray = new Double[3];
}
As far as I've been reading, there's no way to explicitly free up the memory, so I'm hoping C# automatically does it.

Of course, C# automatically releases memory that was associated with myArray prior to the assignment. This does not happen right away, but as soon as garbage collector realizes there are no remaining references to that Double[10] array object*, the memory allocated to the object gets reclaimed.
If you change your program slightly to create a second reference to the same array, like this
double[] myArray = new Double[10];
double[] myArray2 = myArray; // Second reference
myArray = new Double[3];
garbage collector would not release the object, as long as a reference to it remains accessible to your program.
* Sometimes your program finishes execution before garbage collector gets to complete its analysis. The memory still gets released, though.

When your variable goes out of scope and the memory is no longer required then it becomes eligible for garbage collection.
MS explains all, as seen here: https://msdn.microsoft.com/en-us/library/aa691138(v=vs.71).aspx
C# employs automatic memory management, which frees developers from manually allocating and freeing the memory occupied by objects.
Automatic memory management policies are implemented by a garbage
collector. The memory management life cycle of an object is as
follows:
When the object is created, memory is allocated for it, the constructor is run, and the object is considered live.
If the object, or any part of it, cannot be accessed by any possible continuation of execution, other than the running of
destructors, the object is considered no longer in use, and it becomes
eligible for destruction. The C# compiler and the garbage collector
may choose to analyze code to determine which references to an object
may be used in the future. For instance, if a local variable that is
in scope is the only existing reference to an object, but that local
variable is never referred to in any possible continuation of
execution from the current execution point in the procedure, the
garbage collector may (but is not required to) treat the object as no
longer in use.
Once the object is eligible for destruction, at some unspecified later time the destructor (Section 10.12) (if any) for the
object is run. Unless overridden by explicit calls, the destructor for
the object is run once only.
Once the destructor for an object is run, if that object, or any part of it, cannot be accessed by any possible continuation of
execution, including the running of destructors, the object is
considered inaccessible and the object becomes eligible for
collection.
Finally, at some time after the object becomes eligible for collection, the garbage collector frees the memory associated with
that object.
If you want to go into more detail then you can look here: https://learn.microsoft.com/en-us/dotnet/standard/garbage-collection/index
or I'm sure you can also find some blogs on the topic.

Creating new objects repeatedly on same variable

I suppose this is a very silly question but I've been looking around and couldn't find answers on the following questions. Really appreciate answers shedding light on this.
1) What happens to the previous object if instantiating a new one within the same method. Example:
DataTable dataTable = new DataTable();
dataTable = new DataTable(); // Will the previously created object be destroyed and memory freed?
2) Same question as (1) but on static variable. Example:
static private DataView dataView;
private void RefreshGridView()
{
dataView = new DataView(GetDataTable()); // Will the previously created objects be destroyed and memory freed?
BindGridView();
}
Thanks!

// Will the previously created object be destroyed and memory freed?
Potentially. As soon as you do this, you'll no longer be holding a reference to the original DataTable. As long as nothing else references this object, it will become eligible for garbage collection. At some point after that, the GC will run, and collect the object, which will in turn free it's memory.
This is true for static, instance, and local variables. The mechanism is identical in all of these scnearios.
Note that if the object you're referencing implements IDisposable, it's a good practice to call Dispose() on it before losing the reference. Technically, a properly implemented IDisposable implementation will eventually handle things properly, but native resources may be tied up until the GC collection occurs, which may not happen quickly. Note that this has nothing to do with (managed) memory, but is still a good practice.
That being said, your first example is a bad practice. While the memory will (eventually) get cleaned up, you're performing extra allocations that serve no purpose whatsoever, so it'd be better to not "double allocate" the variable.

The short answer is that all of this is handled by the garbage collector. The instances won't be immediately deleted, but they will be marked as "unreachable" and freed at a later time.
I suggest having a read through the Garbage Collection article at MSDN

Objects are reference variable in C#. That means they store memory references in them.
So when you re-assign an object, you simply, overwrite the earlier value (memory reference) that it held. And the earlier value is now qualified for Garbage Collection. Now its Garbage Collector's job to free that memory. Same is for all kinds of variables including static

Which objects can I use in a finalizer method?

I have a class that should delete some file when disposed or finalized. Inside finalizers I can't use other objects because they could have been garbage-collected already.
Am I missing some point regarding finalizers and strings could be used?
UPD: Something like that:
public class TempFileStream : FileStream
{
private string _filename;
public TempFileStream(string filename)
:base(filename, FileMode.Open, FileAccess.Read, FileShare.Read)
{
_filename = filename;
}
protected override void Dispose(bool disposing)
{
base.Dispose(disposing);
if (_filename == null) return;
try
{
File.Delete(_filename); // <-- oops! _filename could be gc-ed already
_filename = null;
}
catch (Exception e)
{
...
}
}
}

Yes, you can most certainly use strings from within a finalizer, and many other object types.
For the definitive source of all this, I would go pick up the book CLR via C#, 3rd edition, written by Jeffrey Richter. In chapter 21 this is all described in detail.
Anyway, here's what is really happening...
During garbage collection, any objects that have a finalizer that still wants to be called are placed on a special list, called the freachable list.
This list is considered a root, just as static variables and live local variables are. Therefore, any objects those objects refer to, and so on recursively is removed from the garbage collection cycle this time. They will survive the current garbage collection cycle as though they weren't eligible to collect to begin with.
Note that this includes strings, which was your question, but it also involves all other object types
Then, at some later point in time, the finalizer thread picks up the object from that list, and runs the finalizer on those objects, and then takes those objects off that list.
Then, the next time garbage collection runs, it finds the same objects once more, but this time the finalizer no longer wants to run, it has already been executed, and so the objects are collected as normal.
Let me illustrate with an example before I tell you what doesn't work.
Let's say you have objects A through Z, and each object references the next one, so you have object A referencing object B, B references C, C references D, and so on until Z.
Some of these objects implement finalizers, and they all implement IDisposable. Let's assume that A does not implement a finalizer but B does, and then some of the rest does as well, it's not important for this example which does beyond A and B.
Your program holds onto a reference to A, and only A.
In an ordinary, and correct, usage pattern you would dispose of A, which would dispose of B, which would dispose of C, etc. but you have a bug, so this doesn't happen. At some point, all of these objects are eligible for collection.
At this point GC will find all of these objects, but then notice that B has a finalizer, and it has not yet run. GC will therefore put B on the freachable list, and recursively take C, D, E, etc. up to Z, off of the GC list, because since B suddenly became in- eligible for collection, so does the rest. Note that some of these objects are also placed on the freachable list themselves, because they have finalizers on their own, but all the objects they refer to will survive GC.
A, however, is collected.
Let me make the above paragraph clear. At this point, A has been collected, but B, C, D, etc. up to Z are still alive as though nothing has happened. Though your code no longer has a reference to any of them, the freachable list has.
Then, the finalizer thread runs, and finalizes all of the objects in the freachable list, and takes the objects off of the list.
The next time GC is run, those objects are now collected.
So that certainly works, so what is the big bruaha about?
The problem is with the finalizer thread. This thread makes no assumptions about the order in which it should finalize those objects. It doesn't do this because in many cases it would be impossible for it to do so.
As I said above, in an ordinary world you would call dispose on A, which disposes B, which disposes C, etc. If one of these objects is a stream, the object referencing the stream might, in its call to Dispose, say "I'll just go ahead and flush my buffers before disposing the stream." This is perfectly legal and lots of existing code do this.
However, in the finalization thread, this order is no longer used, and thus if the stream was placed on the list before the objects that referenced it, the stream is finalized, and thus closed, before the object referencing it.
In other words, what you cannot do is summarized as follows:
You can not access any objects your object refer to, that has finalizers, as you have no guarantee that these objects will be in a usable state when your finalizer runs. The objects will still be there, in memory, and not collected, but they may be closed, terminated, finalized, etc. already.
So, back to your question:
Q. Can I use strings in finalizer method?
A. Yes, because strings do not implement a finalizer, and does not rely on other objects that has a finalizer, and will thus be alive and kicking at the time your finalizer runs.
The assumption that made you take the wrong path is the second sentence of the qustion:
Inside finalizers I can't use other objects because they could have been garbage-collected already.
The correct sentence would be:
Inside finalizer I can't use other objects that have finalizers, because they could have been finalized already.
For an example of something the finalizer would have no way of knowing the order in which to correctly finalize two objects, consider two objects that refer to each other and that both have finalizers. The finalizer thread would have to analyze the code to determine in which order they would normally be disposed, which might be a "dance" between the two objects. The finalizer thread does not do this, it just finalizes one before the other, and you have no guarantee which is first.
So, is there any time it is safe to access objects that also have a finalizer, from my own finalizer?
The only guaranteed safe scenario is when your program/class library/source code owns both objects so that you know that it is.
Before I explain this, this is not really good programming practices, so you probably shouldn't do it.
Example:
You have an object, Cache, that writes data to a file, this file is never kept open, and is thus only open when the object needs to write data to it.
You have another object, CacheManager, that uses the first one, and calls into the first object to give it data to write to the file.
CacheManager has a finalizer. The semantics here is that if the manager class is collected, but not disposed, it should delete the caches as it cannot guarantee their state.
However, the filename of the cache object is retrievable from a property of the cache object.
So the question is, do I need to make a copy of that filename into the manager object, to avoid problems during finalization?
Nope, you don't. When the manager is finalized, the cache object is still in memory, as is the filename string it refers to. What you cannot guarantee, however, is that any finalizer on the cache object hasn't already run.
However, in this case, if you know that the finalizer of the cache object either doesn't exist, or doesn't touch the file, your manager can read the filename property of the cache object, and delete the file.
However, since you now have a pretty strange dependency going on here, I would certainly advice against it.

Another point not yet mentioned is that although one might not expect that an object's finalizer would ever run while an object is in use, the finalization mechanism does not ensure that. Finalizers can be run in an arbitrary unknown threading context; as a consequence, they should either avoid using any types that aren't thread-safe, or should use locking or other means to ensure that they only use things in thread-safe fashion. Note finalizers should use Monitor.TryEnter rather than Monitor.Enter, and endeavor to act as gracefully as possible if a lock is unexpectedly held. Note that since finalizers aren't supposed to run while an object is still in use, the fact that a lock was unexpectedly held will often suggest that a finalizer was run early. Depending upon the design of the code which uses the lock, it may be possible to have the finalizer set a flag and try again to acquire the lock, and have any other code which uses the lock check after releasing it whether that flag is set and, if so, reregister the object for finalization.
Handling finalization cleanup correctly in all threading scenarios is difficult. Finalization might not seem complicated, but no convenient automated mechanisms exist by which objects can ensure that finalizers won't run while the objects in question are in use. Consequently, finalizers have a lot of subtle thread-safety issues. Code which ignores such issues will "usually" work, but may sometimes fail in difficult-to-diagnose ways.

You can call the dispose method inside your finalizer and have the file cleanup code in the Dispose method. Along with that you can also pass a boolean to your dispose method that indicates that you are invoking it from the finalizer.
For an excellent reference on the proper usage of Dispose and Fianlizers , read this Proper use of the IDisposable interface

How does a new statement allocate heap memory?

private button btnNew=new button();
btnNew.addclickhandler(this);
private DataGrid grid;
private void onClick(event click) {grid=new DataGrid();}
Hello ,I write a code like this sample ,I want to know that every time a user click on btnNew,what is going on in heap and stack memory?for example does a new block in heap memory assign to this grid?Or an older block remove and this new block replace it ?Or an older block remains in heap memory and also new block assign to it.
Is this block of code allocate a huge memory on several click?
**The DataGrid could be replace with any component I want to know about this type of new statement usage and memory allocation **
sorry, for my bad english!
.

what is going on with respect to heap
and stack memory?
since the button is reference type and declared in global will be allocated in heap, not in stack.
Is a new block in heap memory assigned to this button?
yes if memory is available, else unreached references will be removed and this one is allocated
Does this block of code allocate a
large amount of memory on a single
click?
No, but it will, if you add thousand buttons
Check out this cool article Memory in .NET - what goes where by Jon Skeet to understand the memory internals better..
Cheers

This is a huge topic. This is akin to asking "you type www.amazon.com into a browser. What happens next?" To answer that question fully you have to explain the architecture of the entire internet. To answer your question fully you have to understand the entire memory model of a modern operating system.
You should start by reading about the fundamentals of memory and garbage collection, here:
http://msdn.microsoft.com/en-us/library/ee787088.aspx
and then ask more specific questions about things you don't understand.

Using the new statement allocates memory on the heap. In general new memory is allocated. If the btnNew pointer was the only pointer associated with a button object, it should become a target to the garbage collector. So the memory will be freed again. For multiple clicks the same will happen, but you should be aware that the garbage collector does not work in real time. So in a high-frequency loop allocating large objects - "new" can become a problem in c#.

if button is a class (ref type), it is allocated on the heap. (value-types are allocated on the stack in the current CLR implementation, unless they are contained by another reference type or captured in a closure - in which case they are on the heap.).
The garbage collector has pre-allocated segments of memory of different sizes corresponding to generations 0, 1 and 2. WHen you new up an object, it is allocated in generation 0. And this allocation is really fast, since it is just moving a pointer by a delta = size of the object. The CLR clears the values in the object to default values as a prereq step before executing the ctor.
Periodically all threads are paused and the garbage collector runs. It creates a graph of reachable objects by traversing "roots". All unreachable objects are discarded. The generation segments are moved around / compacted to avoid fragmentation. Gen 0 is collected more frequently than 1 and so on... (since Gen-0 objects are likely to be short-lived objects). After the collection, the app threads resume.
For more on this, refer to documents explaining the garbage collector and generations. Here's one.

Destroying a struct object in C#?

I am a bit confused about the fact that in C# only the reference types get garbage collected.
That means GC picks only the reference types for memory de-allocation.
So what happens with the value types as they also occupy memory on stack ?

For a start, whether they're on the stack or part of the heap depends on what context they're part of - if they're within a reference type, they'll be on the heap anyway. (You should consider how much you really care about the stack/heap divide anyway - as Eric Lippert has written, it's largely an implementation detail.)
However, basically value type memory is reclaimed when the context is reclaimed - so when the stack is popped by you returning from a method, that "reclaims" the whole stack frame. Likewise if the value type value is actually part of an object, then the memory is reclaimed when that object is garbage collected.
The short answer is that you don't have to worry about it :) (This assumes you don't have anything other than the memory to worry about, of course - if you've got structs with references to native handles that need releasing, that's a somewhat different scenario.)

I am a bit confused about the fact that in C# only the reference types get garbage collected.
This is not a fact. Or, rather, the truth or falsity of this statement depends on what you mean by "get garbage collected". The garbage collector certainly looks at value types when collecting; those value types might be alive and holding on to a reference type:
struct S { public string str; }
...
S s = default(S); // local variable of value type
s.str = M();
when the garbage collector runs it certainly looks at s, because it needs to determine that s.str is still alive.
My suggestion: clarify precisely what you mean by the verb "gets garbage collected".
GC picks only the reference types for memory de-allocation.
Again, this is not a fact. Suppose you have an instance of
class C { int x; }
the memory for the integer will be on the garbage-collected heap, and therefore reclaimed by the garbage collector when the instance of C becomes unrooted.
Why do you believe the falsehood that only the memory of reference types is deallocated by the garbage collector? The correct statement is that memory that was allocated by the garbage collector is deallocated by the garbage collector, which I think makes perfect sense. The GC allocated it so it is responsible for cleaning it up.
So what happens with the value types as they also occupy memory on stack ?
Nothing at all happens to them. Nothing needs to happen to them. The stack is a million bytes. The size of the stack is determined when the thread starts up; it starts at a million bytes and it stays a million bytes throughout the entire execution of the thread. Memory on the stack is neither created nor destroyed; only its contents are changed.

There are too many verbs used in this question, like destroyed, reclaimed, deallocated, removed. That doesn't correspond well with what actually happens. A local variable simply ceases to be, Norwegian parrot style.
A method has a single point of entry, first thing that happens is that the CPU stack pointer is adjusted. Creating a "stack frame", storage space for the local variables. The CLR guarantees that this space is initialized to 0, not otherwise a feature you use strongly in C# because of the definite assignment rule.
A method has a single point of exit, even if you method code is peppered with multiple return statements. At that point, the stack pointer is simply restored to its original value. In effect it "forgets" that the local variables where ever there. Their values are not 'scrubbed' in any way, the bytes are still there. But they won't last long, the next call in your program are going to overwrite them again. The CLR zero-initialization rule ensures that you can never observe those old values, that would be insecure.
Very, very fast, takes no more than a single processor cycle. A visible side-effect of this behavior in the C# language is that value types cannot have a finalizer. Ensuring no extra work has to be done.

A value type on the stack is removed from the stack when it goes out of scope.

Value types are destroyed as soon as they go out of scope.

value types would get deallocated when the stack frame is removed after it has been executed i would assume

Would also like to add that stack is at a thread level, and heap is at application domain level.
So, when a thread ends it would reclam the stack memory used by that specific thread.

Every value type instance in .NET will be part of something else, which could be a larger enclosing value type instance, a heap object, or a stack frame. Whenever any of those things come into being, any structures within them will also come into being; those structures will then continue to exist as long as the thing containing them does. When the thing that contains the structure ceases to exist, the structure will as well. There is no way to destroy a structure without destroying the container, and there is no way to destroy something that contains one or more structures without destroying the structures contained therein.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.