Iterating a expression tree to check for null

Iterating a expression tree to check for null - c#

I am really struggling to write a generic class, that needs to check if all members of a passed expression are not null, by invoking the actual object.
I am calling the method like:
new TestClass<Class1, Class2>.IsAnyMemberNull(x => x.Property1.List1, object1);
and the Method looks like:
public bool IsAnyMemberNull(Expression<Func<T, IList<TM>>> listExpression, T entityDb)
{
var expressionsToCheck = new List<MemberExpression>();
var expression = listExpression.Body as MemberExpression;
while (expression != null)
{
expressionsToCheck.Add(expression);
expression = expression.Expression as MemberExpression;
}
for (var i = expressionsToCheck.Count - 1; i >= 0; i--)
{
var objectMember = Expression.Convert(expressionsToCheck[i], typeof (object));
var lambda = Expression.Lambda<Func<T, object>>(objectMember);
var value = lambda.Compile().Invoke(entityDb);
if (value == null)
return true;
}
return false;
}
When executing, I get the exception:
incorrect number of parameters supplied for lambda declaration
Any ideas, what I have done wrong?

While it is possible to get your code working, so that correct lambdas are built and compiled, using repeatedly compiled lambdas to achieve null-checking is a costly overkill.
Generally, using lambdas so casually (compiling a lambda for each property in chain and a single object) comes with a notable performance hit. I've run some tests, and on my computer executing this method 1000 times for a given object yielded ~300-700 ms time (depending on the number of properties in the chain). Dunno how many entities you deal with, but it's not a good sign, and better replacements are available. Please read further...
The question is, what are you using this for? That method of yours reminds me of null-conditional operators quite a lot. Generally, if you:
just want to check if any property in the property chain is a null
use C# 6
have all parameter chain lambdas (like x => x.Property1.List1) known at runtime
Then you might scrap all that IsAnyMemberNull method altogether in favour of something like:
object1.Property1.List1 == null
Much more concise and no additional methods are required. I ran it 1 million times and it was still within ~23ms time. It means it's dozens thousands faster than creating all those lambdas.
If you can't use null-coalescing operators for whatever reason (especially when the expression is built dynamically), you might instead decide to use Field/Property reflection.
I took the liberty of removing all that generic class wrapping in favour of a generic method. From your usage example, it seemed the only purpose of the generic class was to access a specific method with class' generic type parameters. It means one new class would have to be made and stored for each variation of the method, for no apparent reason, if I'm not mistaken, for the rest of application's lifetime. Generic methods in specific classes are generally preferred to specific methods in generic classes, in cases like these.
Also, I removed IList, because I see no reason how requiring the last parameter to be of IList type serves the purpose of the function; it only limits its applicability with no apparent gain.
Overall, the result is the following:
public bool IsAnyMemberNull<TEntity, TMember>(Expression<Func<TEntity, TMember>> paramChain, TEntity entityDb)
{
var expressionsToCheck = new List<MemberExpression>();
var expression = paramChain.Body as MemberExpression;
while (expression != null)
{
expressionsToCheck.Add(expression);
expression = expression.Expression as MemberExpression;
}
object value = entityDb;
for (var i = expressionsToCheck.Count - 1; i >= 0; i--)
{
var member = expressionsToCheck[i].Member;
if (member is PropertyInfo) value = (member as PropertyInfo).GetValue(value);
else if (member is FieldInfo) value = (member as FieldInfo).GetValue(value);
else throw new Exception(); // member generally should be a property or field, shouldn't it?
if (value == null)
return true;
}
return false;
}
After running this ~1000 times, it took about 4-6ms; 50-100 times better than lambdas, though null-propagation still reigns supreme.
Called as following (assuming it still resides in TestClass, which it doesn't need to):
new TestClass().IsAnyMemberNull<Class1,Class2>(x => x.Property1.List1, object1);
(Class1 and Class2 might not be necessary, thanks to type inference)
Hope this helps. It's not exactly what you asked for, but I'm afraid with all this lambda-spawning you could run into serious performance issues; especially if this code was to be used many times per request.

You have a problem in lambda expression creation - it is simpler than you thought. You should build lambda for each expressionToCheck with original expression parameter:
for (var i = expressionsToCheck.Count - 1; i >= 0; i--)
{
var lambda = Expression.Lambda<Func<T, object>>(expressionsToCheck[i], listExpression.Parameters);
var value = lambda.Compile().Invoke(entityDb);
if (value == null)
return true;
}

Related

using expression trees to compare objects by a single property nets InvalidOperationException

I am trying to use Expression Trees because based on description, that seems to be the most correct (performant, configurable) approach.
I expect to be able to craft a statement that gets the first item from the existingItems collection that matches the propertyNameToCompareOn value of the incomingItem.
I have a method with the following signature and simulated code body...
DetectDifferences<T>(List<T> incomingItems, List<T> existingItems)
{
var propertyNameToCompareOn = GetThisValueFromAConfigFile(T.FullName());
//does this belong outside of the loop?
var leftParam = Expression.Parameter(typeof(T), "left");
var leftProperty = Expression.Property(leftParam, identField);
var rightParam = Expression.Parameter(typeof(T), "right");
var rightProperty = Expression.Property(rightParam, identField);
//this throws the error
var condition = Expression.Lambda<Func<T, bool>>(Expression.Equal(leftProperty, rightProperty));
foreach (var incomingItem in incomingItems) //could be a parallel or something else.
{
// also, where am I supposed to provide incomingItem to this statement?
var existingItem = existingItems.FirstOrDefault(expression/condition/idk);
// the statement for Foo would be something like
var existingFoos = exsistingItems.FirstOrDefault(f => f.Bar.Equals(incomingItem.Bar);
//if item does not exist, consider it new for persistence
//if item does exist, compare a configured list of the remaining properties between the
// objects. If they are all the same, report no changes. If any
// important property is different, capture the differences for
// persistence. (This is where precalculating hashes seems like the
// wrong approach due to expense.)
}
}
At the marked line above, I get an "Incorrect number of parameters supplied for lambda declaration" InvalidOperationException. At this point I am just hacking crap together from the web and I really dont know what this wants. There are a bunch of overloads that VS can full my screen with, and none of the examples make sense from the articles on MSDN/SO.
PS - I dont really want an IComparer or similar implementation if it can be helped. I can do that with reflection. I do need to make this as rapid as possible, but allow it to be called for multiple types, hence the choice of expression trees.

When working with expression trees, it's important to first understand, in real code, what you want to do.
I always begin by first writing out (in static code) what the resulting expression looks like with real C# lambda syntax.
Based on your description, your stated goal is that you should be able to (dynamically) look up some property of the type T that gives some sort of quick comparison. How would you write this if both T and TProperty were both known at compile time?
I suspect it would look something like this:
Func<Foo, Foo, bool> comparer = (Foo first, Foo second) =>
first.FooProperty == second.FooProperty;
Right away we can see that your Expression is wrong. You don't need one input T, you need two!
It should also be obvious why you're getting the InvalidOperationException as well. You never supplied any parameters to your lambda expression, only the body. Above, 'first' and 'second' are the parameters provided to the lambda. You'll need to provide them to the Expression.Lambda()call as well.
var condition = Expression.Lambda<Func<T,T, bool>>(
Expression.Equal(leftProperty, rightProperty),
leftParam,
rightParam);
This simply uses the Expression.Lambda(Expression, ParameterExpression[]) overload for Expression.Lambda. Each ParameterExpression is the parameter that is used in the body. That's it. Don't forget to .Compile() your expression into a delegate if you want to actually invoke it.
Of course this doesn't mean that your technique will be necessarily fast. If you're using fancy expression trees to compare two lists with a naive O(n^2) approach, it won't matter.

Here's a method to make a property access expression;
public static Expression<Func<T, object>> MakeLambda<T>(string propertyName)
{
var param = Expression.Parameter(typeof(T));
var propertyInfo = typeof(T).GetProperty(propertyName);
var expr = Expression.MakeMemberAccess(param, propertyInfo);
var lambda = Expression.Lambda<Func<T, object>>(expr, param);
return lambda;
}
which you can use like this;
var accessor = MakeLambda<Foo>("Name").Compile();
accessor(myFooInstance); // returns name
Making your missing line
var existingItem = existingItems.FirstOrDefault(e => accessor(e) == accessor(incomingItem));
Be aware the == only works well for value types like ints; careful of comparing objects.
Here's proof the lambda approach is much faster;
static void Main(string[] args)
{
var l1 = new List<Foo> { };
for(var i = 0; i < 10000000; i++)
{
l1.Add(new Foo { Name = "x" + i.ToString() });
}
var propertyName = nameof(Foo.Name);
var lambda = MakeLambda<Foo>(propertyName);
var f = lambda.Compile();
var propertyInfo = typeof(Foo).GetProperty(nameof(Foo.Name));
var sw1 = Stopwatch.StartNew();
foreach (var item in l1)
{
var value = f(item);
}
sw1.Stop();
var sw2 = Stopwatch.StartNew();
foreach (var item in l1)
{
var value = propertyInfo.GetValue(item);
}
sw2.Stop();
Console.WriteLine($"{sw1.ElapsedMilliseconds} vs {sw2.ElapsedMilliseconds}");
}
As someone's also pointed out, though, the double-loop in the OP is O(N^2) and that should probably be the next consideration if efficiency is the driver here.

Is reflection used when retrieving information from a linq expression?

I have the following extension method:
public static string ToPropertyName<T,E>(this Expression<Func<E, T>> propertyExpression)
{
if (propertyExpression == null)
return null;
string propName;
MemberExpression propRef = (propertyExpression.Body as MemberExpression);
UnaryExpression propVal = null;
// -- handle ref types
if (propRef != null)
propName = propRef.Member.Name;
else
{
// -- handle value types
propVal = propertyExpression.Body as UnaryExpression;
if (propVal == null)
throw new ArgumentException("The property parameter does not point to a property", "property");
propName = ((MemberExpression)propVal.Operand).Member.Name;
}
return propName;
}
I use linq expression when passing property names instead of strings to provide strong typing and I use this function to retrieving the name of the property as a string. Does this method use reflection?
My reason for asking is this method is used quite a lot in our code and I want it to be reasonably fast enough.

As far as I know, reflection is not involved in the sense that some kind of dynamic type introspection happens behind the scenes. However, types from the System.Reflection such as Type or PropertyInfo are used together with types from the System.Linq.Expressions namespace. They are used by the compiler only to describe any Func<T,E> passed to your method as an abstract syntax tree (AST). Since this transformation from a Func<T,E> to an expression tree is done by the compiler, and not at run-time, only the lambda's static aspects are described.
Remember though that building this expression tree (complex object graph) from a lambda at run-time might take somewhat longer than simply passing a property name string (single object), simply because more objects need to be instantiated (the number depends on the complexity of the lambda passed to your method), but again, no dynamic type inspection à la someObject.GetType() is involved.
Example:
This MSDN article shows that the following lambda expression:
Expression<Func<int, bool>> lambda1 = num => num < 5;
is transformed to something like this by the compiler:
ParameterExpression numParam = Expression.Parameter(typeof(int), "num");
ConstantExpression five = Expression.Constant(5, typeof(int));
BinaryExpression numLessThanFive = Expression.LessThan(numParam, five);
Expression<Func<int, bool>> lambda1 =
Expression.Lambda<Func<int, bool>>(
numLessThanFive,
new ParameterExpression[] { numParam });
Beyond this, nothing else happens. So this is the object graph that might then be passed into your method.

Since you're method naming is ToPropertyName, I suppose you're trying to get the typed name of some particular property of a class. Did you benchmark the Expression<Func<E, T>> approach? The cost of creating the expression is quite larger and since your method is static, I see you're not caching the member expression as well. In other words even if the expression approach doesn't use reflection the cost can be high. See this question: How do you get a C# property name as a string with reflection? where you have another approach:
public static string GetName<T>(this T item) where T : class
{
if (item == null)
return string.Empty;
return typeof(T).GetProperties()[0].Name;
}
You can use it to get name of property or a variable, like this:
new { property }.GetName();
To speed up further, you would need to cache the member info. If what you have is absolutely Func<E, T> then your approach suits. Also see this: lambda expression based reflection vs normal reflection
A related question: Get all the property names and corresponding values into a dictionary

Converting a lambda expression into a unique key for caching

I've had a look at other questions similar to this one but I couldn't find any workable answers.
I've been using the following code to generate unique keys for storing the results of my linq queries to the cache.
string key = ((LambdaExpression)expression).Body.ToString();
foreach (ParameterExpression param in expression.Parameters)
{
string name = param.Name;
string typeName = param.Type.Name;
key = key.Replace(name + ".", typeName + ".");
}
return key;
It seems to work fine for simple queries containing integers or booleans but when my query contains nested constant expressions e.g.
// Get all the crops on a farm where the slug matches the given slug.
(x => x.Crops.Any(y => slug == y.Slug) && x.Deleted == false)
The key returned is thus:
(True AndAlso (Farm.Crops.Any(y =>
(value(OzFarmGuide.Controllers.FarmController+<>c__DisplayClassd).slug
== y.Slug)) AndAlso (Farm.Deleted == False)))
As you can see any crop name I pass will give the same key result. Is there a way I can extract the value of the given parameter so that I can differentiate between my queries?
Also converting the y to say the correct type name would be nice.....

As Polity and Marc said in their comments, what you need is a partial evaluator of the LINQ expression. You can read how to do that using ExpressionVisitor in Matt Warren's LINQ: Building an IQueryable Provider - Part III. The article Caching the results of LINQ queries by Pete Montgomery (linked to by Polity) describes some more specifics regarding this kind of caching, e.g. how to represent collections in the query.
Also, I'm not sure I would rely on ToString() like this. I think it's meant mostly for debugging purposes and it might change in the future. The alternative would be creating your own IEqualityComparer<Expression> that can create a hash code for any expression and can compare two expressions for equality. I would probably do that using ExpressionVisitor too, but doing so would be quite tedious.

I've been trying to figure out a scenario where this kind of approach could be useful without leading to bloated cache that is insanely hard to maintain.
I know this isn't directly answering your question, but I want to raise a few questions about this approach that, at first, may sound tempting:
How did you plan to manage parameter ordering? Ie. (x => x.blah == "slug" && !x.Deleted) cache key should equal (x => !x.Deleted && x.blah == "slug") cache key.
How did you plan to avoid duplicate objects in cache? Ie. Same farm from multiple queries would by design be cached separately with each query. Say, for each slug that appears in the farm, we have a separate copy of the farm.
Extending the above with more parameters, such as parcel, farmer etc. would lead to more matching queries with each having a separate copy of the farm cached. The same applies to each type you might query plus the parameters might not be in the same order
Now, what happens if you update the farm? Without knowing which cached queries would contain your farm, you'd be forced to kill your whole cache. Which kind of is counterproductive to what you're trying to achieve.
I can see the reasoning behind this approach. A 0-maintenance performance layer. However, if the above points are not taken into consideration, the approach will first kill the performance, then lead to a lot of attempts to maintain it, then prove to be completely unmaintainable.
I've been down that road. Eventually wasted a lot of time and gave up.
I found a much better approach by caching each resulting entity separately when the results come from the backend with an extension method for each type separately or through a common interface.
Then you can build extension method for your lambda expressions to first try the cache before hitting the db.
var query = (x => x.Crops.Any(y => slug == y.Slug) && x.Deleted == false);
var results = query.FromCache();
if (!results.Any()) {
results = query.FromDatabase();
results.ForEach(x = x.ToCache());
}
Of course, you will still need to track which queries have actually hit the database to avoid query A returning 3 farms from DB satisfying query B with one matching farm from cache while the database would actually have 20 matching farms available. So, each query stll need to hit DB at least once.
And you need to track queries returning 0 results to avoid them consequently hitting the DB for nothing.
But all in all, you get away with a lot less code and as a bonus, when you update a farm, you can
var farm = (f => f.farmId == farmId).FromCache().First();
farm.Name = "My Test Farm";
var updatedFarm = farm.ToDatabase();
updatedFarm.ToCache();

What about this?
public class KeyGeneratorVisitor : ExpressionVisitor
{
protected override Expression VisitParameter(ParameterExpression node)
{
return Expression.Parameter(node.Type, node.Type.Name);
}
protected override Expression VisitMember(MemberExpression node)
{
if (CanBeEvaluated(node))
{
return Expression.Constant(Evaluate(node));
}
else
{
return base.VisitMember(node);
}
}
private static bool CanBeEvaluated(MemberExpression exp)
{
while (exp.Expression.NodeType == ExpressionType.MemberAccess)
{
exp = (MemberExpression) exp.Expression;
}
return (exp.Expression.NodeType == ExpressionType.Constant);
}
private static object Evaluate(Expression exp)
{
if (exp.NodeType == ExpressionType.Constant)
{
return ((ConstantExpression) exp).Value;
}
else
{
MemberExpression mexp = (MemberExpression) exp;
object value = Evaluate(mexp.Expression);
FieldInfo field = mexp.Member as FieldInfo;
if (field != null)
{
return field.GetValue(value);
}
else
{
PropertyInfo property = (PropertyInfo) mexp.Member;
return property.GetValue(value, null);
}
}
}
}
This will replace the complex constant expressions to their original values as well as the parameter names to their type names. So just have to create a new KeyGeneratorVisitor instance and call its Visit or VisitAndConvert method with your expression.
Please note that the Expression.ToString method will be also invoked on your complex types, so either override their ToString methods or write a custom logic for them in the Evaluate method.

How about:
var call = expression.Body as MethodCallExpression;
if (call != null)
{
List<object> list = new List<object>();
foreach (Expression argument in call.Arguments)
{
object o = Expression.Lambda(argument, expression.Parameters).Compile().DynamicInvoke();
list.Add(o);
}
StringBuilder keyValue = new StringBuilder();
keyValue.Append(expression.Body.ToString());
list.ForEach(e => keyValue.Append(String.Format("_{0}", e.ToString())));
string key = keyValue.ToString();
}

when and in what scenario to use Expression Tree

I was reading about Expression Tree feature and how you can create delegates using lambda expressions. I still can't get as to in what scenario it is useful and in what real world example should I use it.

The primary use for expression trees is for out-of-process LINQ providers such as LINQ to SQL.
When you write something like this:
var query = people.Where(x => x.Age > 18)
.Select(x => x.Name);
those lambda expressions can either be converted to delegates, which can then be executed (as they are in LINQ to Object) or they can be converted to expression trees, which can be analyzed by the query source and acted on accordingly (e.g. by turning them into SQL, web service calls etc). The difference is that expression trees represent the code as data. They can be compiled into delegates if necessary, but usually (within LINQ anyway) they're never executed directly - just examined to find out the logic they contain.
Expression trees are also used extensively in the Dynamic Language Runtime, where they represent the code which should execute when a dynamic expression is evaluated. Expression trees are well suited for this as they can be composed and broken down again, and after they're compiled the resulting IL is JIT-compiled as normal.
Most developers will never need to mess with the expression tree API, although it has a few other uses.

Aside from LINQ, another very simple use case is to extract both the name and the value of a property. I use this in a fluent API for validating data transfer objects. It's safer to pass one lambda parameter to define both name and value rather than have a second string parameter for the name, and run the risk of developers getting it wrong.
Here's an example (minus all the safety checks and other housekeeping):
public Validator<T> Check<T>(Expression<Func<T>> expr) {
// Analyse the expression as data
string name = ((MemberExpression) expr.Body).Member.Name;
// Compile and execute it to get the value
T value = (expr.Compile())();
return new Validator<T>(name, value);
}
Example of use:
Check(() => x.Name).NotNull.MinLength(1);
Check(() => x.Age).GreaterThan(18);

I used expression trees to make a null-safe evaluator:
string name = myObject.NullSafeEval(x => x.Foo.GetBar(42).Baz.Name, "Default");
This methods analyzes and rewrites the expression tree to insert null checks before each property or method call along the "path" to Name. If a null is encountered along the way, the default value is returned.
See implementation here
Expression trees are also commonly used to avoid referring to a property by hard-coding its name in a string:
private string _foo;
public string Foo
{
get { return _foo; }
set
{
_foo = value;
OnPropertyChanged(() => Foo);
// Rather than:
// OnPropertyChanged("Foo");
}
}
static string GetPropertyName<T>(Expression<Func<T>> expr)
{
var memberExpr = expr.Body as MemberExpression;
if (memberExpr == null)
throw new ArgumentException("expr", "The expression body must be a member expression");
return memberExpr.Member.Name;
}
protected void OnPropertyChanged<T>(Expression<Func<T>> expr)
{
OnPropertyChanged(GetPropertyName(expr));
}
This enables compile time checking and name refactoring

Pass a property of a Linq entity in a method to set and get result

I'm trying to pass in a property of a Linq entity to be used by my method. I can easily pass a property to be queried
Func<Entities.MyEntity, ResultType> GetProperty = ent => ent.Property;
However this returns ResultType and cannot be used to set the property.
I thought about using reflection to get a propertyInfo, but this will let me fetch the property but then I can't use Linq syntax to call my property. Is there any guru out there that knows how to do this?
I have a hunch I could do it by constructing a chunk of an expression tree and applying it onto the query...
I was really hoping to do something like:
var value = myQueryEntity.CallMagicFunction(); //typesafe
myQueryEntity.CallMagicFunction() = value; //typesafe

Indeed, an expression tree should work; for basic member access (a field/property directly off the object):
static MemberInfo ReadMember(LambdaExpression expr)
{
if(expr == null) throw new ArgumentNullException("expr");
MemberExpression me = expr.Body as MemberExpression;
if(me == null || !ReferenceEquals(me.Expression, expr.Parameters[0])) {
throw new ArgumentException("expr");
}
return me.Member;
}
with
Expression<Func<Customer, int>> func = c => c.Id;
MemberInfo member = ReadMember(func);
// for simplicity assume prop:
PropertyInfo prop = (PropertyInfo)member;
From there you can do pretty much anything; in particular you can get the get/set accessors (if you want to create a delegate), or use GetValue / SetValue.
Note that in .NET 4.0 you can set properties directly on an Expression (but the C# compiler doesn't add any extra support for this, so you'd need to write your own Expression by hand).

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Iterating a expression tree to check for null - c#

Related

using expression trees to compare objects by a single property nets InvalidOperationException

Is reflection used when retrieving information from a linq expression?

Converting a lambda expression into a unique key for caching

when and in what scenario to use Expression Tree

Pass a property of a Linq entity in a method to set and get result

Categories

Resources