I'm currently working with IronPython and for some reason when I try to create many ScriptScopes with variables in a separate AppDomain, my memory usage grows without ever GC'ing. When I run this same code in the current AppDomain, garbage collection works fine and the memory never grows. Below is a simple program to reproduce the OutOfMemoryException.
I have also tested setting one large string variable instead of many variables and get the same result. I'm running IronPython 2.7.4 on .NET version 4.5.50709, with my builds targeting x86 and .NET Framework 4. Is there something special I need to do to release unused ScriptScopes in a separate AppDomain or is this a memory leak?
public static void Main()
{
var testLeak = true;
var appDomain = testLeak ? AppDomain.CreateDomain("test") : AppDomain.CurrentDomain;
var engine = Python.CreateEngine(
appDomain, new Dictionary<string, object> { { "LightweightScopes", true } });
var scriptSource = engine.CreateScriptSourceFromString("pass");
var compiledCode = scriptSource.Compile();
for (var i = 0; i < 10000; i++)
{
var scope = engine.CreateScope();
for (var j = 0; j < 100000; j++)
{
scope.SetVariable("test" + j, j);
}
compiledCode.Execute(scope);
}
}
I figured this one out myself. If you look at the source for ScriptScope, this is how it overrides MarshalByRefObject:
public override object InitializeLifetimeService()
{
return (object) null;
}
This means it will never be garbage collected in a remote AppDomain. I guess I'll have to find a way to work with a single ScriptEngine and ScriptScope.
As you are just creating the AppDomain to host the PythonEngine, why are you not unloading it after usage? This takes care of the memory leak.
AppDomain.Unload(appDomain);
Consider the code below:
using System;
namespace memoryEater
{
internal class Program
{
private static void Main(string[] args)
{
Console.WriteLine("alloc 1");
var big1 = new BigObject();
Console.WriteLine("alloc 2");
var big2 = new BigObject();
Console.WriteLine("null 1");
big1 = null;
//GC.Collect();
Console.WriteLine("alloc3");
big1 = new BigObject();
Console.WriteLine("done");
Console.Read();
}
}
public class BigObject
{
private const uint OneMeg = 1024 * 1024;
private static int _idCnt;
private readonly int _myId;
private byte[][] _bigArray;
public BigObject()
{
_myId = _idCnt++;
Console.WriteLine("BigObject {0} creating... ", _myId);
_bigArray = new byte[700][];
for (int i = 0; i < 700; i++)
{
_bigArray[i] = new byte[OneMeg];
}
for (int j = 0; j < 700; j++)
{
for (int i = 0; i < OneMeg; i++)
{
_bigArray[j][i] = (byte)i;
}
}
Console.WriteLine("done");
}
~BigObject()
{
Console.WriteLine("BigObject {0} finalised", _myId);
}
}
}
I have a class, BigObject, which creates a 700MiB array in its constructor, and has a finalise method which does nothing other than print to console. In Main, I create two of these objects, free one, and then create a third.
If this is compiled for 32 bit (so as to limit memory to 2 gigs), an out of memory exception is thrown when creating the third BigObject. This is because, when memory is requested for the third time, the request cannot be satisfied and so the garbage collector runs. However, the first BigObject, which is ready to be collected, has a finaliser method so instead of being collected is placed on the finalisation queue and is finalised. The garbage collecter then halts and the exception is thrown. However, if the call to GC.Collect is uncommented, or the finalise method is removed, the code will run fine.
My question is, why does the garbage collector not do everything it can to satisfy the request for memory? If it ran twice (once to finalise and again to free) the above code would work fine. Shouldn't the garbage collector continue to finalise and collect until no more memory can be free'd before throwing the exception, and is there any way to configure it to behave this way (either in code or through Visual Studio)?
Its undeterministic when GC will work and try to reclaim memory.
If you add this line after big1 = null . However you should be carefult about forcing GC to collect. Its not recommended unless you know what you are doing.
GC.Collect();
GC.WaitForPendingFinalizers();
Best Practice for Forcing Garbage Collection in C#
When should I use GC.SuppressFinalize()?
Garbage collection in .NET (generations)
I guess its because the time the finalizer executes during garbage collection is undefined. Resources are not guaranteed to be released at any specific time (unless calling a Close method or a Dispose method.), also the order that finalizers are run is random so you could have a finalizer on another object waiting, while your object waits for that.
Suppose an object has a Finalize() method.
When it is first created, a pointer in the finalization queue was added.
The object has no references .
When a garbage collection occurs, it moves the reference from the finalization queue to the f-reachable queue and a thread is started to run the Finalize method (sequentially after other objects' Finalize methods).
So the object now (after resurrection) has only one root which is the pointer from the f-reachable queue.
At this point, does/did the object get promoted to the next generation ?
This is something you can just try. Run this code in the Release build without a debugger attached:
using System;
class Program {
static void Main(string[] args) {
var obj = new Foo();
// There are 3 generations, could have used GC.MaxGeneration
for (int ix = 0; ix < 3; ++ix) {
GC.Collect();
GC.WaitForPendingFinalizers();
}
Console.ReadLine();
}
}
class Foo {
int attempt = 0;
~Foo() {
// Don't try to do this over and over again, rough at program exit
if (attempt++ < 3) {
GC.ReRegisterForFinalize(this);
Console.WriteLine(GC.GetGeneration(this));
}
}
}
Output:
1
2
2
So it stays in the generation it got moved to by the collection, moving to the next one on each collection until it hits the last one. Which makes sense in general.
It seems like the answer is yes, this will happen. http://msdn.microsoft.com/en-us/magazine/bb985010.aspx says:
... Two GCs are required to reclaim memory used by objects that require finalization. In reality, more than two collections may be necessary since the objects could get promoted to an older generation.
Suppose i am having a class
Class ABC
{
public string Method1()
{
return "a";
}
public string Method2()
{
return "b";
}
public string Method3()
{
return "c";
}
}
and Now i am calling this methods in two ways like :
ABC obj=new ABC();
Response.Write(obj.Method1());
Response.Write(obj.Method2());
Another way
Response.Write(new ABC().Method1());
Response.Write(new ABC().Method2());
The output will be same for above two method .
Can some please help me understanding the difference between obj.Method1() and new ABC().Method1()
Thanks in Advance..
obj and new ABC() are separate instances. In your example the output is the same because there is no instance-level data to show.
Try this to see the difference:
Class ABC
{
public string Name = "default";
public string Method1()
{
return "a";
}
}
then use the code below to show the difference with instance-level data:
ABC obj=new ABC();
obj.Name = "NewObject";
Response.Write(obj.Method1());
Response.Write(obj.Name);
Response.Write(new ABC().Method1());
Response.Write(new ABC().Name);
What #d-stanley is trying to say is that you allocate memory on creation that is is very valuable resource.
And the more complete answer is this: Classes created with some logic in mind. Although is perfectly workable Response.Write(new ABC().Method1()); but this is very short function and not as much useless... When you design class you implemented some logic boundary functionality and properties. For example FileStream has a inner property of Stream and make it accessible via various properties and you could set it in overloaded Open() method and destroy it in Dispose() method. And for example another class BinaryReader implements Stream also but threat it differently. From your logic you could implement all functions on single class - some MotherOfAllFunctions class the implements all the functions of FileStream and BinaryReader - but it's not a way of doing it.
Another point: In most of the cases some (or huge) ammount of memory is taken to initialize some internal logic of the class - for example SqlConnection class. Then you call Open() or any other method to call a database - there's some very powerful mechanics is thrown kick-in to support state machine initialization, managed-to-unmanagment calls and a lot of code could be executed.
Actually what you doing in any new SomeCLass().SomeMethod<int>(ref AnotherObject) is:
Response.Write(
var tmpABC = new ABC(); // Constructor call . Executed always (may throw)
string result = tmpABC.Method1(); // Or whatever could be casted to `string`
tmpABC.Dispose(); // GC will kick-in and try to free memory
return result;
);
As you see - this is the same code as if you have written it in this way. So what happens here is a lot of memory allocations and almost immediately all this valuable memory is thrown away. It makes more sense to initialize ABC() class and all it functionality power once and then use it everywhere so minimize memory over allocation. For example - it doesn't make any sense to open SqlConnection function in every function call in your DAL class the then immediately close it - better declare local variable and keep it alive - some fully initialized classes live as long as application thread process exist. So in case of this code style:
public class Program
{
private static FileStream streamToLogFile = new FileStream(...);
public int Main(string [] args)
{
new Run(new Form1(streamToLogFile));
}
}
In this logic - there's no need to keep class Form1 and I created it inline but all the functions the need to access FileStream object (valuable resource !) will access the same instance that been initialized only once.
Summary: C#/.NET is supposed to be garbage collected. C# has a destructor, used to clean resources. What happen when an object A is garbage collected the same line I try to clone one of its variable members? Apparently, on multiprocessors, sometimes, the garbage collector wins...
The problem
Today, on a training session on C#, the teacher showed us some code which contained a bug only when run on multiprocessors.
I'll summarize to say that sometimes, the compiler or the JIT screws up by calling the finalizer of a C# class object before returning from its called method.
The full code, given in Visual C++ 2005 documentation, will be posted as an "answer" to avoid making a very very large questions, but the essential are below:
The following class has a "Hash" property which will return a cloned copy of an internal array. At is construction, the first item of the array has a value of 2. In the destructor, its value is set to zero.
The point is: If you try to get the "Hash" property of "Example", you'll get a clean copy of the array, whose first item is still 2, as the object is being used (and as such, not being garbage collected/finalized):
public class Example
{
private int nValue;
public int N { get { return nValue; } }
// The Hash property is slower because it clones an array. When
// KeepAlive is not used, the finalizer sometimes runs before
// the Hash property value is read.
private byte[] hashValue;
public byte[] Hash { get { return (byte[])hashValue.Clone(); } }
public Example()
{
nValue = 2;
hashValue = new byte[20];
hashValue[0] = 2;
}
~Example()
{
nValue = 0;
if (hashValue != null)
{
Array.Clear(hashValue, 0, hashValue.Length);
}
}
}
But nothing is so simple...
The code using this class is wokring inside a thread, and of course, for the test, the app is heavily multithreaded:
public static void Main(string[] args)
{
Thread t = new Thread(new ThreadStart(ThreadProc));
t.Start();
t.Join();
}
private static void ThreadProc()
{
// running is a boolean which is always true until
// the user press ENTER
while (running) DoWork();
}
The DoWork static method is the code where the problem happens:
private static void DoWork()
{
Example ex = new Example();
byte[] res = ex.Hash; // [1]
// If the finalizer runs before the call to the Hash
// property completes, the hashValue array might be
// cleared before the property value is read. The
// following test detects that.
if (res[0] != 2)
{
// Oops... The finalizer of ex was launched before
// the Hash method/property completed
}
}
Once every 1,000,000 excutions of DoWork, apparently, the Garbage Collector does its magic, and tries to reclaim "ex", as it is not anymore referenced in the remaning code of the function, and this time, it is faster than the "Hash" get method. So what we have in the end is a clone of a zero-ed byte array, instead of having the right one (with the 1st item at 2).
My guess is that there is inlining of the code, which essentially replaces the line marked [1] in the DoWork function by something like:
// Supposed inlined processing
byte[] res2 = ex.Hash2;
// note that after this line, "ex" could be garbage collected,
// but not res2
byte[] res = (byte[])res2.Clone();
If we supposed Hash2 is a simple accessor coded like:
// Hash2 code:
public byte[] Hash2 { get { return (byte[])hashValue; } }
So, the question is: Is this supposed to work that way in C#/.NET, or could this be considered as a bug of either the compiler of the JIT?
edit
See Chris Brumme's and Chris Lyons' blogs for an explanation.
http://blogs.msdn.com/cbrumme/archive/2003/04/19/51365.aspx
http://blogs.msdn.com/clyon/archive/2004/09/21/232445.aspx
Everyone's answer was interesting, but I couldn't choose one better than the other. So I gave you all a +1...
Sorry
:-)
Edit 2
I was unable to reproduce the problem on Linux/Ubuntu/Mono, despite using the same code on the same conditions (multiple same executable running simultaneously, release mode, etc.)
It's simply a bug in your code: finalizers should not be accessing managed objects.
The only reason to implement a finalizer is to release unmanaged resources. And in this case, you should carefully implement the standard IDisposable pattern.
With this pattern, you implement a protected method "protected Dispose(bool disposing)". When this method is called from the finalizer, it cleans up unmanaged resources, but does not attempt to clean up managed resources.
In your example, you don't have any unmanaged resources, so should not be implementing a finalizer.
What you're seeing is perfectly natural.
You don't keep a reference to the object that owns the byte array, so that object (not the byte array) is actually free for the garbage collector to collect.
The garbage collector really can be that aggressive.
So if you call a method on your object, which returns a reference to an internal data structure, and the finalizer for your object mess up that data structure, you need to keep a live reference to the object as well.
The garbage collector sees that the ex variable isn't used in that method any more, so it can, and as you notice, will garbage collect it under the right circumstances (ie. timing and need).
The correct way to do this is to call GC.KeepAlive on ex, so add this line of code to the bottom of your method, and all should be well:
GC.KeepAlive(ex);
I learned about this aggressive behavior by reading the book Applied .NET Framework Programming by Jeffrey Richter.
this looks like a race condition between your work thread and the GC thread(s); to avoid it, i think there are two options:
(1) change your if statement to use ex.Hash[0] instead of res, so that ex cannot be GC'd prematurely, or
(2) lock ex for the duration of the call to Hash
that's a pretty spiffy example - was the teacher's point that there may be a bug in the JIT compiler that only manifests on multicore systems, or that this kind of coding can have subtle race conditions with garbage collection?
I think what you are seeing is reasonable behavior due to the fact that things are running on multiple threads. This is the reason for the GC.KeepAlive() method, which should be used in this case to tell the GC that the object is still being used and that it isn't a candidate for cleanup.
Looking at the DoWork function in your "full code" response, the problem is that immediately after this line of code:
byte[] res = ex.Hash;
the function no longer makes any references to the ex object, so it becomes eligible for garbage collection at that point. Adding the call to GC.KeepAlive would prevent this from happening.
Yes, this is an issue that has come up before.
Its even more fun in that you need to run release for this to happen and you end up stratching your head going 'huh, how can that be null?'.
Interesting comment from Chris Brumme's blog
http://blogs.msdn.com/cbrumme/archive/2003/04/19/51365.aspx
class C {<br>
IntPtr _handle;
Static void OperateOnHandle(IntPtr h) { ... }
void m() {
OperateOnHandle(_handle);
...
}
...
}
class Other {
void work() {
if (something) {
C aC = new C();
aC.m();
... // most guess here
} else {
...
}
}
}
So we can’t say how long ‘aC’ might live in the above code. The JIT might report the reference until Other.work() completes. It might inline Other.work() into some other method, and report aC even longer. Even if you add “aC = null;” after your usage of it, the JIT is free to consider this assignment to be dead code and eliminate it. Regardless of when the JIT stops reporting the reference, the GC might not get around to collecting it for some time.
It’s more interesting to worry about the earliest point that aC could be collected. If you are like most people, you’ll guess that the soonest aC becomes eligible for collection is at the closing brace of Other.work()’s “if” clause, where I’ve added the comment. In fact, braces don’t exist in the IL. They are a syntactic contract between you and your language compiler. Other.work() is free to stop reporting aC as soon as it has initiated the call to aC.m().
That's perfectly nornal for the finalizer to be called in your do work method as after the
ex.Hash call, the CLR knows that the ex instance won't be needed anymore...
Now, if you want to keep the instance alive do this:
private static void DoWork()
{
Example ex = new Example();
byte[] res = ex.Hash; // [1]
// If the finalizer runs before the call to the Hash
// property completes, the hashValue array might be
// cleared before the property value is read. The
// following test detects that.
if (res[0] != 2) // NOTE
{
// Oops... The finalizer of ex was launched before
// the Hash method/property completed
}
GC.KeepAlive(ex); // keep our instance alive in case we need it.. uh.. we don't
}
GC.KeepAlive does... nothing :) it's an empty not inlinable /jittable method whose only purpose is to trick the GC into thinking the object will be used after this.
WARNING: Your example is perfectly valid if the DoWork method were a managed C++ method... You DO have to manually keep the managed instances alive manually if you don't want the destructor to be called from within another thread. IE. you pass a reference to a managed object who is going to delete a blob of unmanaged memory when finalized, and the method is using this same blob. If you don't hold the instance alive, you're going to have a race condition between the GC and your method's thread.
And this will end up in tears. And managed heap corruption...
The Full Code
You'll find below the full code, copy/pasted from a Visual C++ 2008 .cs file. As I'm now on Linux, and without any Mono compiler or knowledge about its use, there's no way I can do tests now. Still, a couple of hours ago, I saw this code work and its bug:
using System;
using System.Threading;
public class Example
{
private int nValue;
public int N { get { return nValue; } }
// The Hash property is slower because it clones an array. When
// KeepAlive is not used, the finalizer sometimes runs before
// the Hash property value is read.
private byte[] hashValue;
public byte[] Hash { get { return (byte[])hashValue.Clone(); } }
public byte[] Hash2 { get { return (byte[])hashValue; } }
public int returnNothing() { return 25; }
public Example()
{
nValue = 2;
hashValue = new byte[20];
hashValue[0] = 2;
}
~Example()
{
nValue = 0;
if (hashValue != null)
{
Array.Clear(hashValue, 0, hashValue.Length);
}
}
}
public class Test
{
private static int totalCount = 0;
private static int finalizerFirstCount = 0;
// This variable controls the thread that runs the demo.
private static bool running = true;
// In order to demonstrate the finalizer running first, the
// DoWork method must create an Example object and invoke its
// Hash property. If there are no other calls to members of
// the Example object in DoWork, garbage collection reclaims
// the Example object aggressively. Sometimes this means that
// the finalizer runs before the call to the Hash property
// completes.
private static void DoWork()
{
totalCount++;
// Create an Example object and save the value of the
// Hash property. There are no more calls to members of
// the object in the DoWork method, so it is available
// for aggressive garbage collection.
Example ex = new Example();
// Normal processing
byte[] res = ex.Hash;
// Supposed inlined processing
//byte[] res2 = ex.Hash2;
//byte[] res = (byte[])res2.Clone();
// successful try to keep reference alive
//ex.returnNothing();
// Failed try to keep reference alive
//ex = null;
// If the finalizer runs before the call to the Hash
// property completes, the hashValue array might be
// cleared before the property value is read. The
// following test detects that.
if (res[0] != 2)
{
finalizerFirstCount++;
Console.WriteLine("The finalizer ran first at {0} iterations.", totalCount);
}
//GC.KeepAlive(ex);
}
public static void Main(string[] args)
{
Console.WriteLine("Test:");
// Create a thread to run the test.
Thread t = new Thread(new ThreadStart(ThreadProc));
t.Start();
// The thread runs until Enter is pressed.
Console.WriteLine("Press Enter to stop the program.");
Console.ReadLine();
running = false;
// Wait for the thread to end.
t.Join();
Console.WriteLine("{0} iterations total; the finalizer ran first {1} times.", totalCount, finalizerFirstCount);
}
private static void ThreadProc()
{
while (running) DoWork();
}
}
For those interested, I can send the zipped project through email.