How to identify a lambda closure with reflection - c#

I am writing a component that involves Actions and came upon a requirement to find a way to identify using reflection cases when the Action.Target object is a closure that the compiler have generated. I am doing an experiment to try and find a way, the purpose of this little experiment is to develop a predicate that takes an Action and returns a bool that tells if the action target is an instance of such closure class.
In my test case, I have the following methods that create 4 different types of actions:
private void _createClosure(int i)
{
ClosureAction = new Action(() =>
{
var j = i;
var k = somenum;
});
}
private void _createLambda()
{
LambdaAction = new Action(() =>
{
this._instanceAction();
});
}
private void _createInstance()
{
InstanceAction = new Action(_instanceAction);
}
private void _createStatic()
{
StaticAction = new Action(_staticAction);
}
private int somenum;
private void _instanceAction()
{
somenum++;
}
private static void _staticAction()
{
}
The following table shows the properties of each action:
As you see, LambaAction and ClosureAction are quite similar in terms of defnition, they both use lambda, but the closure one has a local function variable that is being used inside the lambda, and therefore the compiler is forced into generating a closure class. Its clear that the second row, the one that presents ClosureAction, is the only one that has a target that is a closure type. The static one does not have a target at all, and the other two use the calling class (Called ActionReferences) as target. The next table presents a comparison of the target reflection type properties:
So we can see that what's unique about the closure case is that the target type is not a type info but rather a nested type. It's also the only one that is private nested, sealed and has a name that contains the string +<>c__DisplayClass. Now while I think that these characteristics are conclusive for any normal usage case, I would prefer to define a predicate that I can rely on. I don't like to base this mechanism on compilers naming conventions or properties that are not unique because technically, the user may create a private nested sealed class with the same naming convention... it's not likely, but it's not 100% clean solution.
So finally - the question is this: Is there a clean cut way to write a predicate the identifies actions that are actually compiler generated closures?
Thanks

This isn't 100% accurate, but it generally works:
bool isClosure = action.Target != null && Attribute.IsDefined(
action.Target.GetType(), typeof(CompilerGeneratedAttribute));
Console.WriteLine(isClosure);
You can of course force false positives just by manually adding [CompilerGenerated] to any type you choose.
You could also use action.Method.DeclaringType, but since all captures involve a target instance, it is useful to retain the Target check:
bool isClosure = action.Target != null && Attribute.IsDefined(
action.Method.DeclaringType, typeof(CompilerGeneratedAttribute));

Related

Why does C# compiler create private DisplayClass when using LINQ method Any() and how can I avoid it?

I have this code (the whole code is not important but can be seen on this link):
internal static class PlayCardActionValidator
{
public static bool CanPlayCard(...)
{
// ...
var hasBigger =
playerCards.Any(
c => c.Suit == otherPlayerCard.Suit
&& c.GetValue() > otherPlayerCard.GetValue());
// ...
}
}
After opening the code in decompiler (ILSpy) for example I noticed the existence of newly created class <>c__DisplayClass0_0 by the C# compiler:
This wouldn't be a problem for me if this code wasn't critical for the performance of the system. This method is called millions of times and the garbage collector is cleaning these <>c__DisplayClass0_0 instances which slows down the performance:
How can I avoid creating this class (his instances and their garbage collecting) when using the Any method?
Why does the C# compiler create this class and is there any alternative of Any() I can use?
To understand the "display class" you have to understand closures. The lambda you pass here is a closure, a special type of method that magically drags in state from the scope of the method it's in and "closes around" it.
...except of course that there's no such thing as magic. All that state has to actually live somewhere real, somewhere that's associated with the closure method and readily available from it. And what do you call the programming pattern where you associate state directly with one or more methods?
That's right: classes. The compiler transforms the lambda into a closure class, then instantiates the class inside the hosting method so the hosting method can access the state in the class.
The only way to not have this happen is to not use closures. If this is really impacting performance, use an old-school FOR loop instead of a LINQ expression.
How can I avoid creating this class (his instances and their garbage collecting) when using the Any method?
Why does the C# compiler creates this class and is there any alternative of Any() I can use?
Other posters already explained the why part, so the better question would be How can I avoid creation of a closure?. And the answer is simple: if lambda is using only the passed parameters and/or constants, the compiler will not create a closure. For instance:
bool AnyClub() { return playerCards.Any(c => c.Suit == CardSuit.Club); }
bool AnyOf(CardSuit suit) { return playerCards.Any(c => c.Suit == suit); }
The first will not create a closure while the second will.
With all that in mind, and assuming you don't want to use for/foreach loops, you can create own extension methods similar to those in System.Linq.Enumerable but with additional parameters. For this particular case, something like this would work:
public static class Extensions
{
public static bool Any<T, TArg>(this IEnumerable<T> source, TArg arg, Func<T, TArg, bool> predicate)
{
foreach (var item in source)
if (predicate(item, arg)) return true;
return false;
}
}
and change the code in question to:
var hasBigger =
playerCards.Any(otherPlayerCard,
(c, opc) => c.Suit == opc.Suit
&& c.GetValue() > opc.GetValue());

Reflection, contravariance and polymorphism

I have a base class (abstract) with multiple implementations, and some of them contain collection properties of other implementations - like so:
class BigThing : BaseThing
{
/* other properties omitted for brevity */
List<SquareThing> Squares { get; set; }
List<LittleThing> SmallThings { get; set;}
/* etc. */
}
Now sometimes I get a BigThing and I need to map it to another BigThing, along with all of its collections of BaseThings. However, when this happens, I need to be able to tell if a BaseThing in a collection from the source BigThing is a new BaseThing, and thus should be Add()-ed to the destination BigThing's collection, or if it's an existing BaseThing that should be mapped to one of the BaseThings that already exist in the destination collection. Each implementation of BaseThing has a different set of matching criteria on which it should be evaluated for new-ness. I have the following generic extension method to evaluate this:
static void UpdateOrCreateThing<T>(this T candidate, ICollection<T> destinationEntities) where T : BaseThing
{
var thingToUpdate = destinationEntites.FirstOrDefault(candidate.ThingMatchingCriteria);
if (thingToUpdate == null) /* Create new thing and add to destinationEntities */
else /* Map thing */
}
Which works fine. However I think I am getting lost with the method that deals in BigThings. I want to make this method generic because there are a few different kinds of BigThings, and I don't want to have to write methods for each, and if I add collection properties I don't want to have to change my methods. I have written the following generic method that makes use of reflection, but it is not
void MapThing(T sourceThing, T destinationThing) where T : BaseThing
{
//Take care of first-level properties
Mapper.Map(sourceThing, destinationThing);
//Now find all properties which are collections
var collectionPropertyInfo = typeof(T).GetProperties().Where(p => typeof(ICollection).IsAssignableFrom(p.PropertyType));
//Get property values for source and destination
var sourceProperties = collectionPropertyInfo.Select(p => p.GetValue(sourceThing));
var destinationProperties = collectionPropertyInfo.Select(p => p.GetValue(destinationThing));
//Now loop through collection properties and call extension method on each item
for (int i = 0; i < collectionPropertyInfo.Count; i++)
{
//These casts make me suspicious, although they do work and the values are retained
var thisSourcePropertyCollection = sourceProperties[i] as ICollection;
var sourcePropertyCollectionAsThings = thisSourcePropertyCollection.Cast<BaseThing>();
//Repeat for destination properties
var thisDestinationPropertyCollection = destinationProperties[i] as ICollection;
var destinationPropertyCollectionAsThings = thisDestinationPropertyCollection.Cast<BaseThing>();
foreach (BaseThing thing in sourcePropertyCollectionAsThings)
{
thing.UpdateOrCreateThing(destinationPropertyCollectionAsThings);
}
}
}
This compiles and runs, and the extension method runs successfully (matching and mapping as expected), but the collection property values in destinationThing remain unchanged. I suspect I have lost the reference to the original destinationThing properties with all the casting and assigning to other variables and so on. Is my approach here fundamentally flawed? Am I missing a more obvious solution? Or is there some simple bug in my code that's leading to the incorrect behavior?
Without thinking too much, I'd say you have fallen to a inheritance abuse trap, and now trying to save yourself, you might want to consider how can you solve your problem while ditching the existing design which leads you to do such things at the first place. I know, this is painful, but it's an investment in future :-)
That said,
var destinationPropertyCollectionAsThings =
thisDestinationPropertyCollection.Cast<BaseThing>();
foreach (BaseThing thing in sourcePropertyCollectionAsThings)
{
thing.UpdateOrCreateThing(destinationPropertyCollectionAsThings);
}
You are losing your ICollection when you use Linq Cast operator that creates the new IEnumerable<BaseThing>. You can't use contravariance either, because ICollectiondoes not support it. If it would, you'd get away with as ICollection<BaseThing> which would be nice.
Instead, you have to build the generic method call dynamically, and invoke it. The simplest way is probably using dynamic keyword, and let the runtime figure out, as such:
thing.UpdateOrCreateThing((dynamic)thisDestinationPropertyCollection);

Is there any way to create some codes which compiler will generate some another code like below?

Let say I have this kind of class in c# language:
public class ABC {
public int var_1;
public int var_2;
public int var_3;
//... until 100
public int var_100;
public int GetData_WithBasicIfElse (int id) {
if(id == 1)
return var_1;
else if(id == 2)
return var_2;
else //and so on until
else if(id == 100)
return var_100;
}
public int GetData_WithReflection(int id){
string key = "var_" + id.ToString ();
FieldInfo info = GetType ().GetField (key);
return info != null ? (int)info.GetValue (this) : 0;
}
public int GetData_WithSpecialCode(int id){
//put the simple codes here, then compilers compile it, it will generate code like the method GetData_WithBasicIfElse
}
}
Actually in most cases, I can use the array to hold var_n variable, but I am just curious if there is another way. I do not want to use GetData_WithBasicIfElse (not elegant), but I am wondering if there is another solution beside using reflection.
What I mean with GetData_WithSpecialCode is, it contains the special code that will be transformed by compiler (when compile time, where it will be binary file) into some pattern like GetData_WithBasicIfElse.
UPDATED
This technique's called Template metaprogramming, as you can see in here: http://en.wikipedia.org/wiki/Template_metaprogramming, in the factorial source code.
T4 Template
A T4 Template can generate that desired C# code, that will be later compiled into IL code, as if you have written that code yourself. If you want to use this technique, the most natural way is to use partial classes. The first partial defines all the class except the auto-generated method. The second partial would be generated by a simple T4 template. (In the compiled code there's no difference between a class defined in a single file or in several partials).
Reflection.Emit
If you really want to generate code at runtime, it's much harder to do, but you can do it using Reflection.Emit This allow to directly emit IL at run time.
Expression Trees
This also allows to generate and compile code at run time. It's easier than the second option. See an introudction here.
Reflection
If you want to use your original Reflection solution you should store the FieldInfos in an static structure (array, list, dictionary or whatever) so that you only have the overhead of reflecting the fields once. This will improve the performace.
What to choose
Unless there is a good reason not to do so, I'd prefer the T4 template. It's the easier to implement, and you leave the compiler the reponsibility to compile and optimize your code. besides you don't have to work with "obscure, unusual" concepts.
In general I wouldn't advice you the second option. Between other things, I think this requires full trust. And, you need a good knowledge of what you're doing. You also miss the compiler optimizations.
Using expression trees is not as hard as using Reflection.Emit, but it's still hard to do.
And reflection always add a little overhead, specially if you don't cache the FieldInfos (or PropertyInfos or whatever). I would leave it for cases where is the only solution. For example checking if a property exists or accessing a private or protected member of a class from ouside.
I really wonder why you can't use array. For sure using some kind of dictionary would be better. But believing you really can't, you have at least two options to generate such a method:
1) CSharpCodeProvider
You can build a string with helper class containing your method, and it will be compiled to a different assembly:
string source = "public class Description" +
"{" +
" public int GetData_WithBasicIfElse(int id) {" +
// ... all ifs generated here
" }" +
"}";
CSharpCodeProvider codeProvider = new CSharpCodeProvider();
System.CodeDom.Compiler.CompilerParameters parameters = new CompilerParameters();
parameters.GenerateExecutable = false;
parameters.GenerateInMemory = true;
CompilerResults result = codeProvider.CompileAssemblyFromSource(parameters, source);
if (!result.Errors.HasErrors)
{
Type type = result.CompiledAssembly.GetType("Description");
var instance = Activator.CreateInstance(type);
}
and now you have an instance of helper class
2) Linq Expressions
You can build a method using Linq Expressions
ParameterExpression id = Expression.Parameter(typeof(int), "id");
List<Expression> expressions = new List<Expression>();
// here a lot of adding if-else statement expressions
expressions.Add(...);
var lambda = Expression.Lambda(
Expression.Block(
expressions
),
id);
Then you can use a result of lambda.Compile() as a method to call dynamically.
You can use a dictionary to map the ids:
public class ABC
{
public int var_1;
public int var_2;
public int var_3;
//... until 100
public int var_100;
private Dictionary<int,int> map;
public ABC()
{
//build up the mapping
map = new Dictionary<int,int>();
map.Add(1,var_1);
map.Add(2,var_2);
map.Add(100,var_100);
}
public int GetData(int id)
{
//maybe here you need to do check if the key is present
return map[id];
}
}
Can you change some details in your code? If you define the integers as one Array with 100 elements you can simply use id as an index and return that:
public int GetData_WithSpecialCode(int id){
return var_array(id)
}
If you really need to access the values from outside (they are defined public?) you can expose them using a property wich is preferred to public integers.
I haven't used them, but I know Visual Studio comes with T4 Text Templates that may do what you need.
Of course,
switch (id)
{
case 1:
return this.var_1;
case 2:
return this.var_2;
// etc. etc.
}
or,
var lookup = new Dictionary<int, Func<int>>
{
{ 1, () => return this.var_1 },
{ 2, () => return this.var_2 },
// etc. etc.
};
return lookup[i]();

Creating an object via lambda factory vs direct "new Type()" syntax

For example, consider a utility class SerializableList:
public class SerializableList : List<ISerializable>
{
public T Add<T>(T item) where T : ISerializable
{
base.Add(item);
return item;
}
public T Add<T>(Func<T> factory) where T : ISerializable
{
var item = factory();
base.Add(item);
return item;
}
}
Usually I'd use it like this:
var serializableList = new SerializableList();
var item1 = serializableList.Add(new Class1());
var item2 = serializableList.Add(new Class2());
I could also have used it via factoring, like this:
var serializableList = new SerializableList();
var item1 = serializableList.Add(() => new Class1());
var item2 = serializableList.Add(() => new Class2());
The second approach appears to be a preferred usage pattern, as I've been lately noticing on SO. Is it really so (and why, if yes) or is it just a matter of taste?
Given your example, the factory method is silly. Unless the callee requires the ability to control the point of instantiation, instantiate multiple instances, or lazy evaluation, it's just useless overhead.
The compiler will not be able to optimize out delegate creation.
To reference the examples of using the factory syntax that you gave in comments on the question. Both examples are trying (albeit poorly) to provide guaranteed cleanup of the instances.
If you consider a using statement:
using (var x = new Something()) { }
The naive implementation would be:
var x = new Something();
try
{
}
finally
{
if ((x != null) && (x is IDisposable))
((IDisposable)x).Dispose();
}
The problem with this code is that it is possible for an exception to occur after the assignment of x, but before the try block is entered. If this happens, x will not be properly disposed, because the finally block will not execute. To deal with this, the code for a using statement will actually be something more like:
Something x = null;
try
{
x = new Something();
}
finally
{
if ((x != null) && (x is IDisposable))
((IDisposable)x).Dispose();
}
Both of the examples that you reference using factory parameters are attempting to deal with this same issue. Passing a factory allows for the instance to be instantiated within the guarded block. Passing the instance directly allows for the possibility of something to go wrong along the way and not have Dispose() called.
In those cases, passing the factory parameter makes sense.
Caching
In the example you have provided it does not make sense as others have pointed out. Instead I will give you another example,
public class MyClass{
public MyClass(string file){
// load a huge file
// do lots of computing...
// then store results...
}
}
private ConcurrentDictionary<string,MyClass> Cache = new ....
public MyClass GetCachedItem(string key){
return Cache.GetOrAdd(key, k => new MyClass(key));
}
In above example, let's say we are loading a big file and we are calculating something and we are interested in end result of that calculation. To speedup my access, when I try to load files through Cache, Cache will return me cached entry if it has it, only when cache does not find the item, it will call the Factory method, and create new instance of MyClass.
So you are reading files many times, but you are only creating instance of class that holds data just once. This pattern is only useful for caching purpose.
But if you are not caching, and every iteration requires to call new operator, then it makes no sense to use factory pattern at all.
Alternate Error Object or Error Logging
For some reason, if creation fails, List can create an error object, for example,
T defaultObject = ....
public T Add<T>(Func<T> factory) where T : ISerializable
{
T item;
try{
item = factory();
}catch(ex){
Log(ex);
item = defaultObject;
}
base.Add(item);
return item;
}
In this example, you can monitor factory if it generates an exception while creating new object, and when that happens, you Log the error, and return something else and keep some default value in list. I don't know what will be practical use of this, but Error Logging sounds better candidate here.
No, there's no general preference of passing the factory instead of the value. However, in very particular situations, you will prefer to pass the factory method instead of the value.
Think about it:
What's the difference between passing the parameter as a value, or
passing it as a factory method (e.g. using Func<T>)?
Answer is simple: order of execution.
In the first case, you need to pass the value, so you must obtain it before calling the target method.
In the second case, you can postpone the value creation/calculation/obtaining till it's needed by the target method.
Why would you want to postpone the value creation/calculation/obtaining? obvious things come to mind:
Processor-intensive or memory-intensive creation of the value, that you want to happen only in case the value is really needed (on-demand). This is Lazy loading then.
If the value creation depends on parameters that are accessible by the target method but not from outside of it. So, you would pass Func<T, T> instead of Func<T>.
The question compares methods with different purposes. The second one should be named CreateAndAdd<T>(Func<T> factory).
So depending what functionality is required, should be used one or another method.

Deep Reflection in .NET

I need to create the ability to drill through an objects properties like two or three deep. For instance, class A has a property reference to class B, which I need to access class C. What is the best way to do this: straight reflection, or maybe using the TypeDescriptor, or something else?
Thanks.
It's not too hard to write. I put a few classes together to deal with this so I could serialize properties of a WinForm. Take a look at this class and the related classes.
http://csharptest.net/browse/src/Library/Reflection/PropertySerializer.cs
If you know the path in a static context (ie the path is always the same) and the properties are accessible (internal or public) you can use dynamic
[Test]
public void Foo()
{
var a = new A
{
B = new B
{
C = new C
{
Name = "hello"
}
}
};
DoReflection(a);
}
private void DoReflection(dynamic value)
{
string message = value.B.C.Name;
Debug.WriteLine(message);
}
I you wanna write you own serialization code for whatever reason, you'll be using reflection.
What you do is that you write a recursive method of serlizating a type. You then apply this as you see fit to get the result.
var type = myObjectOfSomeType.GetType();
// now depending on what you want to store
// I'll save all public properties
var properties = type.GetProperties(); // get all public properties
foreach(var p in properties)
{
var value = p.GetValue(myObjectOfSomeType, null);
Writevalue(p.Name, value);
}
The implementation of WriteValue have to recognize the built in types and treat them accordingly, that's typical things like string, char, integer, double, DateTime etc.
If it encounters a sequence or collection you need to write out many values.
If it encounters a non trivial type you'll apply this recursive pattern again.
The end result is a recursive algorithm that traverses your object model and writes out values as it encounters types that I know how to serialize.
However, I do recommend looking into WCF, not for building services, but for serialization. It shipped as part of the .NET 3.0 framework with a new assembly System.Runtime.Serilization and in general is very capable when dealing with serialization and data annotations.

Categories

Resources