Memory leaking for service passed to unloaded AppDomain - c#

I am experiencing a memory leak involving AppDomains. I've stripped it down to the following:
I have three projects, two library projects and a console project: Shared, DynamicallyLoadable and RemotingTimeoutPrototype (the console program). Shared contains interfaces used by both DynamicallyLoadable and RemotingTimeoutPrototype. Both reference Shared at compile time. There are no other compile-time references between any of the projects.
Shared contains this:
public interface IHostService
{
string GetStuff();
}
public interface IRemoteClass
{
IHostService Alpha { get; set; }
string CallHostServices();
}
DynamicallyLoadable contains only a single type:
public class RemoteClass : MarshalByRefObject, IRemoteClass
{
public IHostService Alpha { get; set; }
public string CallHostServices()
{
Console.WriteLine("Domain {0}, RemoteClass.CallHostServices():", AppDomain.CurrentDomain.Id);
return Alpha.GetStuff();
}
}
The console project contains an implementation of IHostService:
public class Alpha : MarshalByRefObject, IHostService
{
readonly byte[] mBuffer = new byte[100*1024*1024];
public Alpha()
{
for (var i = 0; i < mBuffer.Length; ++i)
mBuffer[i] = (byte) (i%256);
}
public string GetStuff()
{
return "Alpha";
}
}
Program.Main consists of this:
while (true)
{
var otherDomain = AppDomain.CreateDomain("OtherDomain");
var proxy =
(IRemoteClass)
otherDomain.CreateInstanceFromAndUnwrap("../../../DynamicallyLoadable/bin/debug/DynamicallyLoadable.dll",
"DynamicallyLoadable.RemoteClass");
var alpha = new Alpha();
proxy.Alpha = alpha;
Console.WriteLine(proxy.CallHostServices());
Thread.Sleep(TimeSpan.FromSeconds(2));
AppDomain.Unload(otherDomain);
RemotingServices.Disconnect(alpha); // this was just an attempt, doesn't change a thing whether it's there or not
GC.Collect(); // same here, this shouldn't really be necessary, I just tried it out of despair
}
Now when I let this run for a while, memory consumption increases constantly and eventually I hit an OutOfMemoryException.
I would expect that if the domain holding the proxy is unloaded, then the proxy is unloaded, too, and there are no references to the concrete Alpha anymore, so this would be collected. But obviously it is not.
Note that I also checked that the domain is really being unloaded by referencing mscoree and enumerating the loaded domains with code along the lines of:
var runtimeHost = new CorRuntimeHost();
runtimeHost.EnumDomains(out handle);
// etc.
Also, if I attach a handler to the otherDomain.DomainUnload event, the handler is called just fine.
Can anyone shed some light on this, please?

This might be a typical case of a blocked finalization thread: your while loop probably runs on an STA thread and never explicitly yields control to other threads. Even though you unload the domain, its objects may stay in memory because they have not been finalized. Read more here: http://alexatnet.com/articles/memory-leaks-in-net-applications - specifically the section "Blocked Finalization Thread" (I'm the author)
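To illustrate what "not finalized yet" means, here is a minimal sketch. It is not the asker's AppDomain setup, and the Tracked type is invented for the example. An object with a finalizer is only released after the finalizer thread has run, which is why a collect, wait, collect sequence is often used when diagnosing this kind of leak.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical type: counts how many instances have been finalized.
public sealed class Tracked
{
    public static int FinalizedCount;

    ~Tracked()
    {
        Interlocked.Increment(ref FinalizedCount);
    }
}

public static class FinalizationDemo
{
    // Allocate in a separate, non-inlined method so that no local variable
    // in the caller can keep the object reachable.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Allocate()
    {
        new Tracked();
    }

    public static int CollectAndCount()
    {
        Allocate();
        GC.Collect();                  // queues the object for finalization
        GC.WaitForPendingFinalizers(); // blocks until the finalizer thread is done
        GC.Collect();                  // reclaims the now-finalized object
        return Tracked.FinalizedCount;
    }
}
```

If the finalizer thread were blocked, the WaitForPendingFinalizers call would not return, which is itself a useful diagnostic.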

Related

How to check if a function has no side-effects (is pure) at runtime?

Say we have loaded a function F that takes a set of in/out args and returns a result. How can we check at runtime that F does not act on anything other than the members and functions of its args? That means no Console.WriteLine, no singletons (or anything else not present in args). Is this possible with the CodeContracts library or some other solution?
Say we know that the [Pure] attribute is present in the function definition. That does not help in the many cases where we have a lambda, but at least it would be something.
Here is why I do not see how [Pure] can help - this code compiles:
class Test {
public struct Message {
public string Data;
}
public struct Package {
public int Size;
}
[Pure]
public static List<Package> Decomposse(Message m) {
Console.WriteLine("rrrr"); // This should not happen
var mtu = 1400;
Package p = new Package{Size = mtu};
return Enumerable.Repeat(p, m.Data.Length / mtu).ToList();
}
}
And I want to eliminate such calls (or at least detect that the function calls things like Console.WriteLine("rrrr")).
It doesn't matter if the function has inputs or a result. Too many things can happen in a code body, e.g. instantiated object constructors. The problem is the modern language.
What about safe API calls which just retrieve data, like DateTime.Now? Are you going to build a list of API calls which mutate state and keep it updated for the rest of us over time, for all applications in your organization or on earth? Are you going to document what the compiler inlines? Having reduced this approach to absurdity, can we accept that it is not feasible?
My architecture models machines which should only change "Product" data points, but even I admit this is an unenforceable rule. I have other rules as well to try to enforce determinism. However, these modules must make API calls at some point to do the meaningful work already organized in APIs today. Otherwise we would rewrite them all.
class Machine1Product
{
public Cvar<int> Y { get; set; }
}
class Machine1 : Producer<Machine1Product>, IMachine
{
public Cvar<int> X { get; set; }
public void M()
{
// work which changes only product data points (Y)
}
}
Until a minimalist language is developed for functions, there is no observing or preventing side effects.
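What can be checked at runtime is the declaration, not the behavior. The sketch below uses reflection to test whether a delegate's target method carries a marker attribute. DeclaredPureAttribute, Sample and PurityCheck are hypothetical names standing in for [Pure] and the asker's code. The example deliberately includes a side effect to show that the attribute proves nothing about the body.

```csharp
using System;

// Hypothetical marker attribute standing in for [Pure].
[AttributeUsage(AttributeTargets.Method)]
public sealed class DeclaredPureAttribute : Attribute { }

public static class Sample
{
    public static int Calls;

    [DeclaredPure]
    public static int Twice(int x)
    {
        Calls++; // a side effect the attribute does nothing to prevent
        return x * 2;
    }
}

public static class PurityCheck
{
    // Reports only whether the method was declared with the marker attribute.
    public static bool IsDeclaredPure(Delegate f)
    {
        return f.Method.IsDefined(typeof(DeclaredPureAttribute), false);
    }
}
```

PurityCheck.IsDeclaredPure((Func<int, int>)Sample.Twice) reports the attribute, while an ordinary lambda such as x => x * 2 carries no attribute and is reported as not declared pure, even though it has no side effects.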

Unclear behavior by Garbage Collector while collecting instance properties or fields of reachable object

Until today I thought that members of reachable objects are also considered reachable.
But today I found a behavior that causes a problem for us when either Optimize Code is checked or the application is run in Release mode. Since Release mode enables code optimization as well, code optimization seems to be the reason for this behavior.
Let's take a look at this code:
public class Demo
{
public Action myDelWithMethod = null;
public Demo()
{
myDelWithMethod = new Action(Method);
// ... Pass it to unmanaged library, which will save that delegate and execute during some lifetime
// Check whether object is alive or not after GC
var reference = new WeakReference(myDelWithMethod, false);
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, true);
GC.WaitForPendingFinalizers();
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, true);
Console.WriteLine(reference.IsAlive);
// end - Check whether object is alive or not after GC
}
private void Method() { }
}
I simplified the code a bit. Actually, we are using our own delegate type, not Action, but the behavior is the same. This code was written with "members of reachable objects are also considered reachable" in mind. But the delegate will be collected by the GC as soon as possible, and we have to pass it to an unmanaged library which will use it for some time.
You can test the demo by adding this line to the Main method:
var p = new Demo();
I can understand the reason for that optimization, but what is the recommended way to prevent this without creating another function that uses the myDelWithMethod field and gets called from somewhere? One option I found is to set myDelWithMethod in the constructor like so:
myDelWithMethod = () => { };
Then it won't be collected until the owning instance is collected. It seems the code can't be optimized in the same way when a lambda expression is assigned as the value.
I would be happy to hear your thoughts. Here are my questions:
Is it right that members of reachable objects are also considered to be reachable?
Why is it not collected in the case of a lambda expression?
Are there any recommended ways to prevent collection in such cases?
Strange as it may sound, the JIT is able to treat an object as unreachable even while one of the object's instance methods is executing - including constructors.
An example would be the following code:
static void Main(string[] args)
{
SomeClass sc = new SomeClass() { Field = new Random().Next() };
sc.DoSomethingElse();
}
class SomeClass
{
public int Field;
public void DoSomethingElse()
{
Console.WriteLine(this.Field.ToString());
// LINE 2: further code, possibly triggering GC
Console.WriteLine("Am I dead?");
}
~SomeClass()
{
Console.WriteLine("Killing...");
}
}
that may print:
615323
Killing...
Am I dead?
This is because of inlining and the Eager Root Collection technique: after its first line, the DoSomethingElse method no longer uses any SomeClass fields, so the SomeClass instance is no longer needed from LINE 2 onwards.
The same happens to the code in your constructor. After the // ... Pass it to unmanaged library line, your Demo instance becomes unreachable, and so does its field myDelWithMethod. This answers the first question.
The case of the empty lambda expression is different, because such a lambda is cached in a static field and is therefore always reachable:
public class Demo
{
[Serializable]
[CompilerGenerated]
private sealed class <>c
{
public static readonly <>c <>9 = new <>c();
public static Action <>9__1_0;
internal void <.ctor>b__1_0()
{
}
}
public Action myDelWithMethod;
public Demo()
{
myDelWithMethod = (<>c.<>9__1_0 ?? (<>c.<>9__1_0 = new Action(<>c.<>9.<.ctor>b__1_0)));
}
}
As for recommended ways to handle such scenarios, you need to make sure Demo has a lifetime long enough to cover all of the unmanaged code's execution. How to do that depends on your code architecture: you may make Demo static, or use it in a controlled scope tied to the scope of the unmanaged code.
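One possible way to control that lifetime explicitly, sketched here as an assumption rather than as the answer's own recommendation, is to root the delegate with a GCHandle until the unmanaged side is done with it. The Dispose-based shape and the IsRooted property are illustrative additions to the question's Demo class.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class Demo : IDisposable
{
    private readonly Action _callback;
    private GCHandle _handle; // strong root for the delegate

    public Demo()
    {
        _callback = Method;
        // A Normal GCHandle keeps the delegate alive until Free is called,
        // whether or not this Demo instance is still referenced elsewhere.
        // The delegate targets this instance, so the instance stays alive too.
        _handle = GCHandle.Alloc(_callback, GCHandleType.Normal);
        // ... pass _callback to the unmanaged library here (not shown) ...
    }

    public bool IsRooted
    {
        get { return _handle.IsAllocated; }
    }

    private void Method() { }

    // Call once the unmanaged library no longer uses the callback.
    public void Dispose()
    {
        if (_handle.IsAllocated)
        {
            _handle.Free();
        }
    }
}
```

If the unmanaged code only uses the delegate up to a known point in managed code, calling GC.KeepAlive(demo) at that point is a lighter alternative.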

Load same assembly second time with clean static variables

I have a .dll library, which I cannot modify, with classes that use many static variables and singleton instances.
Now I need a second instance of all these classes, so I need a solution that isolates static variables between instances of a class without altering any other properties of the assembly.
Loading the same assembly a second time doesn't actually load it again, but I found that reading it into a byte array and then loading that actually solves half of the problem:
lib.dll:
namespace lib
{
public class Class1 : ILib
{
private static int i;
public int DoSth()
{
return i++;
}
public string GetPath()
{
return typeof(Class1).Assembly.Location;
}
}
}
app.exe:
namespace test
{
public interface ILib
{
int DoSth();
string GetPath();
}
class Program
{
static void Main()
{
var assembly1 = Assembly.LoadFile(Path.GetFullPath(".\\lib.dll"));
var instance1 = (ILib)assembly1.CreateInstance("lib.Class1");
Console.WriteLine(instance1.GetPath());
Console.WriteLine(instance1.DoSth());
Console.WriteLine(instance1.DoSth());
var assembly2 = Assembly.LoadFile(Path.GetFullPath(".\\lib.dll"));
var instance2 = (ILib)assembly2.CreateInstance("lib.Class1");
Console.WriteLine(instance2.GetPath());
Console.WriteLine(instance2.DoSth());
Console.WriteLine(instance2.DoSth());
var assembly3 = AppDomain.CurrentDomain.Load(File.ReadAllBytes("lib.dll"));
var instance3 = (ILib)assembly3.CreateInstance("lib.Class1");
Console.WriteLine(instance3.GetPath());
Console.WriteLine(instance3.DoSth());
Console.WriteLine(instance3.DoSth());
Console.Read();
}
}
}
this returns:
C:\bin\lib.dll
0
1
C:\bin\lib.dll
2
3
0
1
The static variables were reset, but unfortunately the next problem is that the assembly location, which is used within the library, is empty.
I would like to avoid loading the library into a different AppDomain because that creates too many problems with cross-domain code; some classes are not serializable.
I would like to avoid physically copying the library on disk.
I would like to avoid IL weaving with Mono.Cecil or similar because it's overkill.
Loading the assembly into a separate AppDomain or a separate process are the only sensible options you have. Either deal with cross-domain/cross-process communication, or get a version of the library that does not have the problems you are trying to work around.
If you want to fix your load from bytes, you'd need to read all the articles around https://blogs.msdn.microsoft.com/suzcook/2003/09/19/loadfile-vs-loadfrom/.

Singleton pattern on persistent in-memory cache

Using what I judged to be the best of all worlds from the excellent article Implementing the Singleton Pattern in C#, I have been successfully using the following class to persist user-defined data in memory (for very rarely modified data):
public class Params
{
static readonly Params Instance = new Params();
Params()
{
}
public static Params InMemory
{
get
{
return Instance;
}
}
private IEnumerable<Localization> _localizations;
public IEnumerable<Localization> Localizations
{
get
{
return _localizations ?? (_localizations = new Repository<Localization>().Get());
}
}
public int ChunkSize
{
get
{
// Loc uses the Localizations impl
return LC.Loc("params.chunksize").To<int>();
}
}
public void RebuildLocalizations()
{
_localizations = null;
}
// other similar values coming from the DB and staying in-memory,
// and their refresh methods
}
My usage would look something like this:
var allLocs = Params.InMemory.Localizations; //etc
Whenever I update the database, RebuildLocalizations gets called, so only part of my in-memory store is rebuilt. One production environment out of about 10 seems to misbehave when RebuildLocalizations gets called, not refreshing at all, but this also seems to be intermittent and very odd altogether.
My current suspicion falls on the singleton, although I think it does the job well, and all the unit tests show that the singleton mechanism, the refresh mechanism and the RAM performance work as expected.
That said, I am down to these possibilities:
This customer is lying when he says their environment is not using load balancing, a setup in which I would not expect the in-memory store to work properly (right?)
There is some non-standard pool configuration in their IIS which I am testing against (maybe a Web Garden setting?)
The singleton is failing somehow, but I'm not sure how.
Any suggestions?
.NET 3.5, so not much parallel juice available, and not ready to use the Reactive Extensions for now
Edit1: as per the suggestions, would the getter look something like:
public IEnumerable<Localization> Localizations
{
get
{
lock(_localizations) {
return _localizations ?? (_localizations = new Repository<Localization>().Get());
}
}
}
To expand on my comment, here is how you might make the Localizations property thread safe:
public class Params
{
private object _lock = new object();
private IEnumerable<Localization> _localizations;
public IEnumerable<Localization> Localizations
{
get
{
lock (_lock) {
if ( _localizations == null ) {
_localizations = new Repository<Localization>().Get();
}
return _localizations;
}
}
}
public void RebuildLocalizations()
{
lock(_lock) {
_localizations = null;
}
}
// other similar values coming from the DB and staying in-memory,
// and their refresh methods
}
There is no point in creating a thread-safe singleton if its properties are not going to be thread-safe.
You should either lock around the assignment of the _localizations field, or instantiate it in your singleton's constructor (preferred). Any suggestion that applies to singleton instantiation also applies to this lazily instantiated property.
The same applies to all properties (and their properties) of Localization. Because this is a singleton, any thread can access it at any time, and simply locking the getter will again do nothing.
For example, consider this case:
    Thread 1                               Thread 2
    // both threads access the singleton, but you are "safe" because you locked
1.  var loc1 = Params.Localizations;       var loc2 = Params.Localizations;
    // do stuff                            // thread 2 calls the same property...
2.  var value = loc1.ChunkSize;            var chunk = LC.Loc("params.chunksize");
    // invalidate                          // ...there is a slight pause here...
3.  loc1.RebuildLocalizations();
                                           // ...and gets the wrong value
4.                                         var value = chunk.To<int>();
If you are only reading these values, then it might not matter, but you can see how easily you can get into trouble with this approach.
Remember that with threading, you never know whether a different thread will execute something between two instructions. Only simple reads and writes, such as 32-bit assignments, are atomic; anything made up of several operations is not.
This means that this line:
return LC.Loc("params.chunksize").To<int>();
is, as far as threading is concerned, equivalent to:
var loc = LC.Loc("params.chunksize");
Thread.Sleep(1); // anything can happen here :-(
return loc.To<int>();
Any thread can jump in between Loc and To.
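One way to narrow that window, sketched here under assumptions, is to read the cached value once under the lock and then work only with that local snapshot, so a concurrent rebuild cannot swap the data halfway through an operation. LocalizationCache and its loader delegate are hypothetical stand-ins for Params and Repository<Localization>; they are not part of the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Params: caches a list and allows invalidation.
public sealed class LocalizationCache
{
    private readonly object _lock = new object();
    private readonly Func<IList<string>> _load; // assumed loader, e.g. a DB query
    private IList<string> _items;

    public LocalizationCache(Func<IList<string>> load)
    {
        _load = load;
    }

    // Returns the current snapshot, loading it at most once per rebuild.
    public IList<string> GetSnapshot()
    {
        lock (_lock)
        {
            if (_items == null)
            {
                _items = _load();
            }
            return _items;
        }
    }

    // Drops the cached list; the next GetSnapshot call reloads it.
    public void Rebuild()
    {
        lock (_lock)
        {
            _items = null;
        }
    }
}
```

A caller would take var snapshot = cache.GetSnapshot(); once and compute everything from snapshot. A Rebuild() on another thread then affects only later calls. This assumes the loaded list is never mutated in place.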

C# MultiThread Safe Class Design

I'm trying to design a class and I'm having issues with accessing some of the nested fields, and I have some concerns about how thread-safe the whole design is. I would like to know if anyone has a better idea of how this should be designed, or whether any changes should be made.
using System;
using System.Collections;
namespace SystemClass
{
public class Program
{
static void Main(string[] args)
{
System system = new System();
//Seems like an awkward way to access all the members
dynamic deviceInstance = (((DeviceType)((DeviceGroup)system.deviceGroups[0]).deviceTypes[0]).deviceInstances[0]);
Boolean checkLocked = deviceInstance.locked;
//Seems like this method for accessing fields might have problems with multithreading
foreach (DeviceGroup dg in system.deviceGroups)
{
foreach (DeviceType dt in dg.deviceTypes)
{
foreach (dynamic di in dt.deviceInstances)
{
checkLocked = di.locked;
}
}
}
}
}
public class System
{
public ArrayList deviceGroups = new ArrayList();
public System()
{
//API called to get names of all the DeviceGroups
deviceGroups.Add(new DeviceGroup("Motherboard"));
}
}
public class DeviceGroup
{
public ArrayList deviceTypes = new ArrayList();
public DeviceGroup() {}
public DeviceGroup(string deviceGroupName)
{
//API called to get names of all the Devicetypes
deviceTypes.Add(new DeviceType("Keyboard"));
deviceTypes.Add(new DeviceType("Mouse"));
}
}
public class DeviceType
{
public ArrayList deviceInstances = new ArrayList();
public bool deviceConnected;
public DeviceType() {}
public DeviceType(string DeviceType)
{
//API called to get hardwareIDs of all the device instances
deviceInstances.Add(new Mouse("0001"));
deviceInstances.Add(new Keyboard("0003"));
deviceInstances.Add(new Keyboard("0004"));
//Start thread CheckConnection that updates deviceConnected periodically
}
public void CheckConnection()
{
//API call to check connection and returns true
this.deviceConnected = true;
}
}
public class Keyboard
{
public string hardwareAddress;
public bool keypress;
public bool deviceConnected;
public Keyboard() {}
public Keyboard(string hardwareAddress)
{
this.hardwareAddress = hardwareAddress;
//Start thread to update deviceConnected periodically
}
public void CheckKeyPress()
{
//if API returns true
this.keypress = true;
}
}
public class Mouse
{
public string hardwareAddress;
public bool click;
public Mouse() {}
public Mouse(string hardwareAddress)
{
this.hardwareAddress = hardwareAddress;
}
public void CheckClick()
{
//if API returns true
this.click = true;
}
}
}
Making a class thread-safe is a heck of a difficult thing to do.
The first, naive approach that many people attempt is to add a single lock and ensure that no code touches mutable data without holding it. By that I mean that everything in the class that is subject to change requires taking the lock before touching the data, whether reading from it or writing to it.
However, if this is your solution, then you should probably not do anything at all to the code; just document that the class is not thread-safe and leave it to the programmer who uses it.
Why?
Because you've effectively just serialized all access to it. Two threads that try to use the class at the same time will block, even if they are touching separate parts of it. One of the threads will be given access; the other one will wait until the first one is done.
This actually discourages multi-threaded usage of your class, so in this case you're adding the overhead of locking to your class without getting any benefit from it. Yes, your class is now "thread safe", but it isn't a good thread citizen.
The other way is to start adding granular locks, or writing lock-free constructs (seriously hard), so that if two parts of the object aren't always related, the code that accesses each part has its own lock. This allows multiple threads that access different parts of the data to run in parallel without blocking one another.
This becomes hard wherever you need to work on more than one part of the data at a time, as you need to be super-careful to take the locks in the right order or suffer deadlocks. It should be your class's responsibility to ensure the locks are taken in the right order, not that of the code that uses the class.
As for your specific example, it looks to me as though the only parts that will change from background threads are the "is the device connected" boolean values. In this case I would make that field volatile, and use a lock around each. If, however, the list of devices will change from background threads, you're going to run into problems pretty fast.
You should first try to identify all the parts that will be changed by background threads, and then devise scenarios for how you want the changes to propagate to other threads, how to react to the changes, etc.
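As a rough sketch of the volatile suggestion, assuming the device list is built once and never changes afterwards, the classes might look like this. The typed list replaces ArrayList and dynamic, and the apiResult parameter is a hypothetical stand-in for the real API call.

```csharp
using System.Collections.Generic;

// Sketch of one device class from the question, with the flag made volatile.
public class Keyboard
{
    // volatile: a write by the polling thread becomes visible to readers
    // without a lock; sufficient for a single, independent bool flag.
    private volatile bool _deviceConnected;

    public string HardwareAddress { get; private set; }

    public Keyboard(string hardwareAddress)
    {
        HardwareAddress = hardwareAddress;
    }

    public bool DeviceConnected
    {
        get { return _deviceConnected; }
    }

    // Would be called periodically by a background thread (not shown).
    public void CheckConnection(bool apiResult)
    {
        _deviceConnected = apiResult;
    }
}

public class DeviceType
{
    // Filled once in the constructor and never changed afterwards,
    // so readers can enumerate it without locking.
    private readonly List<Keyboard> _keyboards = new List<Keyboard>();

    public DeviceType(IEnumerable<string> hardwareAddresses)
    {
        foreach (var address in hardwareAddresses)
        {
            _keyboards.Add(new Keyboard(address));
        }
    }

    public IList<Keyboard> Keyboards
    {
        get { return _keyboards.AsReadOnly(); }
    }
}
```

If devices can be added or removed at runtime, this is no longer enough and access to the list itself needs synchronization.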
