Lambda and memory leaks: Looking for alternative approaches

Lambda and memory leaks: Looking for alternative approaches - c#

Edit:
I would be greatful if an experienced programmer with the ability to verify this sort of thing showed me the proof that this method is safe from memory leaks. I've been introducing it to many of my coding efforts but I still have a small doubt in my mind. Unfortunately I'm not good enough / don't know the tools to investigate it.
Original:
I learned recently that some uses of lambda expressions can create memory leaks:
ProjectData Project;
void OnLaunchNewProject()
{
NewProjectUI newProjectUI = new NewProjectUI();
newProjectUI.OnCompleted += (o, e) =>
{
Project = newProjectUI.NewProject;
view.Content = string.Format("Project {0} was created and saved successfully.", Project.Name);
};
newProjectUI.OnCancelled += (o, e) => { view.Content = "Operation was cancelled.";};
view.Content = newProjectUI;
}
I learned the bad impact of this method in this blog.
I do not fully understand the impact of referencing local variables in lambda expressions and this limits my ability to circumvent the problem.
Between the typical approach and the use of lambda, what's the ideal compromise? The thing I like about lambda is skipping the definition of the EventHandler's arguments in the body of my class (the sender/routed args) when I don't need them.

Unfortunately, the blog post you mentioned is wrong. There is not general issue with memory leaks in lambda expressions. In the blog examples, the finalizers are never called because the author never removes the anonymous methods from the event. So the .NET runtime thinks that the method still might be called later and cannot remove the class from memory.
In your code, you will release the NewProjectUI instance somewhere, and this is the point where all events become unsued and when the assigned lambda method also becomes unused. Then the GC can remove the anonymous lambda-helper-class and free the used memory.
So, again: There is no issue in .NET when using local variables in lambda expressions.
But to make your code better, move the code from the lambda expressions to named methods and add these methods to the events. This also makes it possible to remove the methods from the events when they're no longer needed.

Related

Is there a benefit to explicit use of "new EventHandler" declaration?

In assigning event handlers to something like a context MenuItem, for instance, there are two acceptable syntaxes:
MenuItem item = new MenuItem("Open Image", btnOpenImage_Click);
...and...
MenuItem item = new MenuItem("Open Image", new EventHandler(btnOpenImage_Click));
I also note that the same appears to apply to this:
listView.ItemClick += listView_ItemClick;
...and...
listView.ItemClick += new ItemClickEventHandler(listView_ItemClick);
Is there any particular advantage for the second (explicit) over the first? Or is this more of a stylistic question?

In C# 1.0 you had no choice but to explicitly define the delegate type and the target.
Since C# 2.0 the compiler allows you to express yourself in a more succinct manner by means of an implicit conversion from a method group to a compatible delegate type. It's really just syntactic sugar.
Sometimes you have no choice but to use the long-winded syntax if the correct overload cannot be resolved from the method group due to an ambiguity.

It's syntax for older version of C# compiler (<=1.1). Not needed anymore. Current compilers are sophisticated enough to get it right.
There's one (little) benefit, sometimes. If you assign event handlers by "+=", the Intellisense autocomplete feature may make writing code a bit faster.

The only time when this is useful is if it would otherwise be ambiguous - for example, if it was MenuItem(string, Delegate) - of if there were multiple equally matching overloads that would match the signature. This also includes var syntax (shown below) and generic type inference (not shown):
EventHandler handler = SomeMethod; // fine
EventHandler handler = new EventHandler(SomeMethod); // fine
var handler = new EventHandler(SomeMethod); // fine
var handler = (EventHandler)SomeMethod; // fine
var handler = SomeMethod; // not fine
In all other cases, it is redundant and is unnecessary in any compiler from 2.0 onwards.

Related to your edit - The adding of handlers isn't really affected by using new or not but there's a slight difference in removing handlers like this
listView.ItemClick -= listView_ItemClick;
and
listView.ItemClick -= new ItemClickEventHandler(listView_ItemClick);
albeit unlikely to affect most scenarios. The first version, without the new keyword, is supposedly more efficient.
There's a detailed explanation in this post but the conclusion is
So both works but which one should we use? If the events are
subscribed/unsubscribed once at the beginning/end like in a typical
WinForm application then it hardly matters. However if this is done
multiple times then the second approach is preferable as it does less
of costly heap allocations and will work faster
(second approach in that post being the one without the new keyword)
Having said that it seems like a micro optimization to me so it's unlikely to be a bottleneck in the majority of cases.

C# Handling Events

When I deal set up events, I usually write as such:
data.event += new data.returndataeventhandler(method);
And have a method as such:
void method(parameter)
{
dosomething();
}
This is when the event returns an object.
I have just been reading through somebody elses code and they have used, what seems to be a much cleaner way, as such:
data.ReturnData += delegate(DataSet returnedDataSet)
{
dataset = returnedDataSet;
};
Is there any downfall to this way?
Thanks.

The one major downfall of using anonymous delegates (or the even-cleaner Lambda as suggested by tster) is that you're not going to be able to unsubscribe it from the event later unless you give it some sort of name.
In most cases, this is "No Big Deal (tm)" because the delegate will go away whenever the event source goes away, but this can be a "Subtle Mistake (tm)" if you're subscribing to static events or events on long-lived objects (e.g., the WPF Dispatcher object).
In your case, this doesn't look like a problem at all, so I'd definitely recommend going with tster's recommendation (assuming you're using an appropriately recent version of .Net):
data.ReturnData += returnedDataSet => dataset = returnedDataSet;
(The compiler can infer the type of returnedDataSet from the EventHandler type of ReturnData.)

The primary downfall of using anonymous delegates is that they are not reusable. Other than that there is typically no difference between defining a delegate and then using it elsewhere in your code versus using an anonymous delegate.

One down fall is that it will not appear in your method drop down list. If you do it inline, it should only be simple, nothing overly complex.

Like said by others, the most obvious is not reusable.
Other points:
readability in particular if you have large method body
because .NET generate a random name for anonymous method (not very meaningful or readable) if you use reflection type technology or profiler, it may complicate traceability.

The only downfall is that if you have more than one event it's easier to point it to a method. If you had to attach events in different blocks to the same handler, you would have to store your delegate somewhere so that both blocks could "see" it.

Even cleaner:
data.ReturnData += returnedDataSet => dataset = returnedDataSet;

Nope its just anonymous method thats all.
You can read more about anonymous methods here.

Aside from the other answers of reusablity/Intellisense, I believe the only downfall is if you need to remove the handler later. With a delegate/lamba you cannot easily remove your handler if it no longer needs to be called.

Best practices of using lambda expressions for event handlers

After discovering lambda expressions, and their use as anonymous functions, I've found myself writing a lot of more trivial events such as these:
txtLogin.GotFocus += (o, e) =>
{
txtLogin.Text = string.Empty;
txtLogin.ForeColor = SystemColors.ControlText;
};
txtLogin.LostFocus += (o, e) =>
{
txtLogin.Text = "Login...";
txtLogin.ForeColor = SystemColors.InactiveCaptionText;
};
I've also moved away from event handlers which just call other functions, replacing them with small lambdas which do the same:
backgroundWorker.DoWork += (o, e) => DatabaseLookup.Open(e.Argument as string);
I've found some similar questions addressing performance concerns and pointing out that you can't remove them, but I haven't found any addressing the simple question of is this a good idea?
Is the use of lambdas in such a way considered good form, or do more experience programmers look down on this? Does it hide event handlers in hard-to-find places, or does it do the code a service by reducing the number of trivial event handlers?

It's a perfectly reasonable idea - but in this particular case, I would use an anonymous method instead:
txtLogin.LostFocus += delegate
{
txtLogin.Text = "Login...";
txtLogin.ForeColor = SystemColors.InactiveCaptionText;
};
The benefit is that you don't have to specify the parameters - which makes it clearer that you're not using them. This is the only advantage that anonymous methods have over lambda expressions.
The performance hit is almost always going to be negligible. The inability to remove them afterwards is a very real problem if you do need to be able to remove the handler, but I find that often I don't. (Reactive Extensions has a nice approach to this - when you subscribe to an observable sequence, you're given back an IDisposable which will remove the subscription if you call it. Very neat.)

Actually, it's consider it putting event handlers in easy-to-find places, namely right next to the name of the event it's assigned to.
A lot of the time, you'll see event handlers like:
void Text1_KeyDown(....) {....}
attached to the KeyUp event of txtFirstName, because after using Intellisense to create the handler, someone decided to rename the textbox, and that KeyUp worked better. With the Lambda, the object, the event and the function are all together.

It's a tricky one. I remember reading in Code Complete about how some (smart) people say you should keep the flow of control as simple as possible, with many arguing for single entry and exit points from a method, because not doing so made the program harder to follow.
Lambdas are getting even further away from that, making it very difficult in some cases to follow what's happening, with control leaping around from place to place.
Basically, I think it probably is a bad idea because of this, but it's also powerful and makes life easier. I certainly use them a fair amount. In summary, use with caution!

Passing a lambda to a secondary AppDomain as a stream of IL and assembling it back using DynamicMethod

Is it possible to pass a lambda expression to a secondary AppDomain as a stream of IL bytes and then assemble it back there using DynamicMethod so it can be called?
I'm not too sure this is the right way to go in the first place, so here's the (detailed) reason I ask this question...
In my applications, there are a lot of cases when I need to load a couple of assemblies for reflection, so I can determine what to do with them next. The problem part is I need to be able to unload the assemblies after I'm finished reflecting over them. This means I need to load them using another AppDomain.
Now, most of my cases are sort of similar, except not quite. For example, sometimes I need to return a simple confirmation, other times I need to serialize a resource stream from the assembly, and again other times I need to make a callback or two.
So I end up writing the same semi-complicated temporary AppDomain creation code over and over again and implementing custom MarshalByRefObject proxies to communicate between the new domain and the original one.
As this is not really acceptable anymore, I decided to code me an AssemblyReflector class that could be used this way:
using (var reflector = new AssemblyReflector(#"C:\MyAssembly.dll"))
{
bool isMyAssembly = reflector.Execute(assembly =>
{
return assembly.GetType("MyAssembly.MyType") != null;
});
}
AssemblyReflector would automize the AppDomain unloading by virtue of IDisposable, and allow me to execute a Func<Assembly,object>-type lambda holding the reflection code in another AppDomain transparently.
The problem is, lambdas cannot be passed to other domains so simply. So after searching around, I found what looks like a way to do just that: pass the lambda to the new AppDomain as an IL stream - and that brings me to the original question.
Here's what I tried, but didn't work (the problem was BadImageFormatException being thrown when trying to call the new delegate):
public delegate object AssemblyReflectorDelegate(Assembly reflectedAssembly);
public class AssemblyReflector : IDisposable
{
private AppDomain _domain;
private string _assemblyFile;
public AssemblyReflector(string fileName) { ... }
public void Dispose() { ... }
public object Execute(AssemblyReflectorDelegate reflector)
{
var body = reflector.Method.GetMethodBody();
_domain.SetData("IL", body.GetILAsByteArray());
_domain.SetData("MaxStackSize", body.MaxStackSize);
_domain.SetData("FileName", _assemblyFile);
_domain.DoCallBack(() =>
{
var il = (byte[])AppDomain.CurrentDomain.GetData("IL");
var stack = (int)AppDomain.CurrentDomain.GetData("MaxStackSize");
var fileName = (string)AppDomain.CurrentDomain.GetData("FileName");
var args = Assembly.ReflectionOnlyLoadFrom(fileName);
var pars = new Type[] { typeof(Assembly) };
var dm = new DynamicMethod("", typeof(object), pars,
typeof(string).Module);
dm.GetDynamicILInfo().SetCode(il, stack);
var clone = (AssemblyReflectorDelegate)dm.CreateDelegate(
typeof(AssemblyReflectorDelegate));
var result = clone(args); // <-- BadImageFormatException thrown.
AppDomain.CurrentDomain.SetData("Result", result);
});
// Result obviously needs to be serializable for this to work.
return _domain.GetData("Result");
}
}
Am I even close (what's missing?), or is this a pointless excercise all in all?
NOTE: I realize that if this worked, I'd still have to be carefull about what I put into lambda in regard to references. That's not a problem, though.
UPDATE: I managed to get a little further. It seems that simply calling SetCode(...) is not nearly enough to reconstruct the method. Here's what's needed:
// Build a method signature. Since we know which delegate this is, this simply
// means adding its argument types together.
var builder = SignatureHelper.GetLocalVarSigHelper();
builder.AddArgument(typeof(Assembly), false);
var signature = builder.GetSignature();
// This is the tricky part... See explanation below.
di.SetCode(ILTokenResolver.Resolve(il, di, module), stack);
dm.InitLocals = initLocals; // Value gotten from original method's MethodInfo.
di.SetLocalSignature(signature);
The trick is as follows. Original IL contains certain metadata tokens which are valid only in the context of the original method. I needed to parse the IL and replace those tokens with ones that are valid in the new context. I did this by using a special class, ILTokenResolver, which I adapted from these two sources: Drew Wilson and Haibo Luo.
There is still a small problem with this - the new IL doesn't seem to be exactly valid. Depending on the exact contents of the lambda, it may or may not throw an InvalidProgramException at runtime.
As a simple example, this works:
reflector.Execute(a => { return 5; });
while this doesn't:
reflector.Execute(a => { int a = 5; return a; });
There are also more complex examples that are either working or not, depending on some yet-to-be-determined difference. It could be I missed some small but important detail. But I'm reasonably confident I'll find it after a more detailed comparison of the ildasm outputs. I'll post my findings here, when I do.
EDIT: Oh, man. I completely forgot this question was still open. But as it probably became obvious in itself, I gave up on solving this. I'm not happy about it, that's for sure. It's really a shame, but I guess I'll wait for better support from the framework and/or CLR before I attempt this again. There're just to many hacks one has to do to make this work, and even then it's not reliable. Apologies to everyone interested.

I didn't get exactly what is the problem you are trying to solve, but I made a component in the past that may solve it.
Basically, its purpose was to generate a Lambda Expression from a string. It uses a separate AppDomain to run the CodeDOM compiler. The IL of a compiled method is serialized to the original AppDomain, and then rebuild to a delegate with DynamicMethod. Then, the delegate is called and an lambda expression is returned.
I posted a full explanation of it on my blog. Naturally, it's open source. So, if you get to use it, please send me any feedback you think is reasonable.

Probably not, because a lambda is more than just an expression in source code. lambda expressions also create closures which capture/hoist variables into their own hidden classes. The program is modified by the compiler so everywhere you use those variables you're actually talking to the class. So you'd have to not only pass the code for the lambda, but also any changes to closure variables over time.

Ab-using languages

Some time ago I had to address a certain C# design problem when I was implementing a JavaScript code-generation framework. One of the solutions I came with was using the “using” keyword in a totally different (hackish, if you please) way. I used it as a syntax sugar (well, originally it is one anyway) for building hierarchical code structure. Something that looked like this:
CodeBuilder cb = new CodeBuilder();
using(cb.Function("foo"))
{
// Generate some function code
cb.Add(someStatement);
cb.Add(someOtherStatement);
using(cb.While(someCondition))
{
cb.Add(someLoopStatement);
// Generate some more code
}
}
It is working because the Function and the While methods return IDisposable object, that, upon dispose, tells the builder to close the current scope. Such thing can be helpful for any tree-like structure that need to be hard-codded.
Do you think such “hacks” are justified? Because you can say that in C++, for example, many of the features such as templates and operator overloading get over-abused and this behavior is encouraged by many (look at boost for example). On the other side, you can say that many modern languages discourage such abuse and give you specific, much more restricted features.
My example is, of course, somewhat esoteric, but real. So what do you think about the specific hack and of the whole issue? Have you encountered similar dilemmas? How much abuse can you tolerate?

I think this is something that has blown over from languages like Ruby that have much more extensive mechanisms to let you create languages within your language (google for "dsl" or "domain specific languages" if you want to know more). C# is less flexible in this respect.
I think creating DSL's in this way is a good thing. It makes for more readable code. Using blocks can be a useful part of a DSL in C#. In this case I think there are better alternatives. The use of using is this case strays a bit too far from its original purpose. This can confuse the reader. I like Anton Gogolev's solution better for example.

Offtopic, but just take a look at how pretty this becomes with lambdas:
var codeBuilder = new CodeBuilder();
codeBuilder.DefineFunction("Foo", x =>
{
codeBuilder.While(condition, y =>
{
}
}

It would be better if the disposable object returned from cb.Function(name) was the object on which the statements should be added. That internally this function builder passed through the calls to private/internal functions on the CodeBuilder is fine, just that to public consumers the sequence is clear.
So long as the Dispose implementation would make the following code cause a runtime error.
CodeBuilder cb = new CodeBuilder();
var f = cb.Function("foo")
using(function)
{
// Generate some function code
f.Add(someStatement);
}
function.Add(something); // this should throw
Then the behaviour is intuitive and relatively reasonable and correct usage (below) encourages and prevents this happening
CodeBuilder cb = new CodeBuilder();
using(var function = cb.Function("foo"))
{
// Generate some function code
function.Add(someStatement);
}
I have to ask why you are using your own classes rather than the provided CodeDomProvider implementations though. (There are good reasons for this, notably that the current implementation lacks many of the c# 3.0 features) but since you don't mention it yourself...
Edit: I would second Anoton's suggest to use lamdas. The readability is much improved (and you have the option of allowing Expression Trees

If you go by the strictest definitions of IDisposable then this is an abuse. It's meant to be used as a method for releasing native resources in a deterministic fashion by a managed object.
The use of IDisposable has evolved to essentially be used by "any object which should have a deterministic lifetime". I'm not saying this is write or wrong but that's how many API's and users are choosing to use IDisposable. Given that definition it's not an abuse.

I wouldn't consider it terribly bad abuse, but I also wouldn't consider it good form because of the cognitive wall you're building for your maintenance developers. The using statement implies a certain class of lifetime management. This is fine in its usual uses and in slightly customized ones (like #heeen's reference to an RAII analogue), but those situations still keep the spirit of the using statement intact.
In your particular case, I might argue that a more functional approach like #Anton Gogolev's would be more in the spirit of the language as well as maintainable.
As to your primary question, I think each such hack must ultimately stand on its own merits as the "best" solution for a particular language in a particular situation. The definition of best is subjective, of course, but there are definitely times (especially when the external constraints of budgets and schedules are thrown into the mix) where a slightly more hackish approach is the only reasonable answer.

I often "abuse" using blocks. I think they provide a great way of defining scope. I have a whole series of objects that I use for capture and restoring state (e.g. of Combo boxes or the mouse pointer) during operations that may change the state. I also use them for creating and dropping database connections.
E.g.:
using(_cursorStack.ChangeCursor(System.Windows.Forms.Cursors.WaitCursor))
{
...
}

I wouldn't call it abuse. Looks more like a fancied up RAII technique to me. People have been using these for things like monitors.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Lambda and memory leaks: Looking for alternative approaches - c#

Related

Is there a benefit to explicit use of "new EventHandler" declaration?

C# Handling Events

Best practices of using lambda expressions for event handlers

Passing a lambda to a secondary AppDomain as a stream of IL and assembling it back using DynamicMethod

Ab-using languages

Categories

Resources