Performance and Concept questions in LINQ - c#

I have some questions about using LINQ in my projects. First question: is there a difference in performance between the old (select Item from..) LINQ and the new version (.Select(r => ..))?
Second question: how are LINQ expressions translated (and into what)? Will they be translated to the old syntax first and then to something else (intermediate language)?

There isn't any difference between the two ways we can write a linq query.
Specifically, this
var adults = from customer in customers
where customer.Age>18
select customer;
is equivalent to this:
var adults = customers.Where(customer=>customer.Age>18);
Actually, the compiler translates the first query into the second. The first way of writing a LINQ query is syntactic sugar. Under the hood, if you compile your code and then use a disassembler to look at the IL, you will see that the query has been translated to the second of the forms above.
Queries written the first way are said to use query syntax, while queries written the second way are said to use fluent syntax (also called method syntax).
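A minimal sketch of that equivalence, assuming a hypothetical Customer class and an in-memory list (neither comes from the question). Both forms should produce the same sequence:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type, used only for this illustration.
public class Customer
{
    public string Name { get; set; } = "";
    public int Age { get; set; }
}

public static class QuerySyntaxDemo
{
    // Query syntax: the compiler rewrites this into a call to Where.
    public static List<string> AdultsQuerySyntax(IEnumerable<Customer> customers)
    {
        var adults = from customer in customers
                     where customer.Age > 18
                     select customer;
        return adults.Select(c => c.Name).ToList();
    }

    // Fluent (method) syntax: the same call written out by hand.
    public static List<string> AdultsMethodSyntax(IEnumerable<Customer> customers)
    {
        var adults = customers.Where(customer => customer.Age > 18);
        return adults.Select(c => c.Name).ToList();
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { Name = "Ann", Age = 34 },
            new Customer { Name = "Bob", Age = 17 },
            new Customer { Name = "Cid", Age = 52 },
        };
        Console.WriteLine(string.Join(", ", AdultsQuerySyntax(customers)));  // Ann, Cid
        Console.WriteLine(string.Join(", ", AdultsMethodSyntax(customers))); // Ann, Cid
    }
}
```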

Is there a difference in performance between the old (select Item from..) LINQ and the new version (.Select(r => ..))?
Neither of these is older than the other, as both came into the language at the same time. If anything, .Select() could be argued to be older: while the method call will almost always be a call to an extension method (and hence only available since .NET 3.5 and only callable that way with C# 3.0), method calls in general have been there since 1.0.
There's no difference in performance, as they are different ways to say the same thing. (It's just about possible that you could find a case that resulted in a redundancy for one but not the other, but for the most part those redundancies are caught by the compiler and removed).
How are LINQ expressions translated (and into what)? Will they be translated to the old syntax first and then to something else (intermediate language)?
Consider that, as per the above, from item in someSource select item.ID and someSource.Select(item => item.ID) are the same thing. The compiler has to do two things:
Determine how the call should be made.
Determine how the lambda should be used in that.
These two go hand in hand. The first part is the same as with any other method call:
Look for a method defined on the type of someSource that is called Select() and takes one parameter of the appropriate type (I'll come to "appropriate type" in a minute).
If no method is found, look for a method defined on the immediate base of the type of someSource, and so on until you have no more base classes to examine (after reaching object).
If no method is found, look for an extension method defined on a static class that is available to use through a using which has its first (this) parameter the type of someSource, and its second parameter of the appropriate type that I said I'll come back to in a minute.
If no method is found, look for a generic extension method that can accept the types of someSource and the lambda as parameters.
If no method is found, do the above two steps for the base types of someSource and interfaces it implements, continuing to further base types or interfaces those interfaces extend.
If no method is found, raise a compiler error. Likewise, if any of the above steps finds two or more equally applicable methods in the same step, raise a compiler error.
So far this is the same as how "".IsNormalized() calls the IsNormalized() method defined on string, "".GetHashCode() calls the GetHashCode() method defined on object (though a later step means the override defined on string is what is actually executed) and "".GetType() calls the GetType() method defined on object.
Indeed we can see this in the following:
public class WeirdSelect
{
    public int Select<T>(Func<WeirdSelect, T> ignored)
    {
        Console.WriteLine("Select Was Called");
        return 2;
    }
}
void Main()
{
    int result = from whatever in new WeirdSelect() select whatever;
}
Here because WeirdSelect has its own applicable Select method, that is executed instead of one of the extension methods defined in Enumerable and Queryable.
Now, I hand-waved over "parameter of the appropriate type" above because the one complication that lambdas bring into this is that a lambda in C# code can be turned into either a delegate (in this case a Func<TSource, TResult> where TSource is the type of the lambda's parameter and TResult the type of the value it returns) or an expression (in this case an Expression<Func<TSource, TResult>>) in the produced CIL code.
As such, the method call resolution is looking for either a method that will accept a Func<TSource, TResult> (or a similar delegate) or one that will accept an Expression<Func<TSource, TResult>> (or a similar expression). If it finds both at the same stage in the search there will be a compiler error, hence the following will not work:
public class WeirdSelect
{
    public int Select<T>(Func<WeirdSelect, T> ignored)
    {
        Console.WriteLine("Select Was Called");
        return 2;
    }
    public int Select<T>(Expression<Func<WeirdSelect, T>> ignored)
    {
        Console.WriteLine("Select Was Called on expression");
        return 1;
    }
}
void Main()
{
    int result = from whatever in new WeirdSelect() select whatever;
}
Now, 99.999% of the time we are either using select with something that implements IQueryable<T> or something that implements IEnumerable<T>. If it implements IQueryable<T> then the method call resolution will find public static IQueryable<TResult> Select<TSource, TResult>(this IQueryable<TSource> source, Expression<Func<TSource, TResult>> selector) defined in Queryable and if it implements IEnumerable<T> it will find public static IEnumerable<TResult> Select<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector) defined in Enumerable. It doesn't matter that IQueryable<T> derives from IEnumerable<T> because its method will be found in an earlier step in the process described above, before IEnumerable<T> is considered as a base interface.
Therefore 99.999% of the time there will be a call made to one of those two extension methods. In the IQueryable<T> case the lambda is turned into some code that produces an appropriate Expression, which is then passed to the method (the query engine is then able to turn that into whatever code is appropriate, e.g. creating appropriate SQL queries if it's a database-backed query engine, or something else otherwise). In the IEnumerable<T> case the lambda is turned into an anonymous delegate which is passed to the method, which works a bit like:
public static IEnumerable<TResult> Select<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector)
{
//Simplifying a few things, this code is to show the idea only
foreach(var item in source)
yield return selector(item);
}
To come back to your question:
Will it be translated to old syntax first and then to something else (intermediate language)?
You could think of the newer from item in source select… syntax as being "turned into" the older source.Select(…) syntax (but not really older, since it depends on extension methods over 99% of the time) because it makes the method call a bit clearer, but really they amount to the same thing. In the CIL produced, the differences depend on whether the call was an instance method or (as is almost always the case) an extension method, and even more so on whether the lambda is used to produce an expression or a delegate.

Related

How does the compiler know what datatype the lambda expression should be

When you use EF Core, most of the LINQ extension methods you call take a System.Linq.Expressions expression rather than the usual Func.
So let's say you are using FirstOrDefault on a DbSet.
DbContext.Foos.FirstOrDefault(x=> x.Bar == true);
When you ctrl + lmb on FirstOrDefault it will show you the following overload:
public static TSource FirstOrDefault<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate)
But there is also an overload for Func:
public static TSource FirstOrDefault<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
When you want to store an expression in a variable you can do something like the following:
Func<Entity, bool> predicate = x => x.Bar == true;
and
Expression<Func<Entity, bool>> predicate = x => x.Bar == true;
So how does the compiler decide which overload should be used while using these extension methods?
The accepted answer is a reasonable explanation, but I thought I might provide a little more detail.
So lets say you are using FirstOrDefault on a DbSet. DbContext.Foos.FirstOrDefault(x=> x.Bar == true);
First off, I hope you would not write that. If you want to ask "is it raining?" do you ask "is it raining?" or do you ask "is the statement that it is raining a true statement?" Just say FirstOrDefault(x => x.Bar).
Next, given these overloads:
public static TSource FirstOrDefault<TSource>(
this IQueryable<TSource> source,
Expression<Func<TSource, bool>> predicate)
public static TSource FirstOrDefault<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
How does the compiler choose which overload is the best?
First we do type inference to determine what TSource is in each. The details of the type inference algorithm are complex; ask a more focussed question if you have a question about it.
If type inference fails to determine a type for TSource in either, the failed inference method is discarded from the set of candidates. In your example TSource can be determined to be Foo, presumably.
Next, of the candidates that remain, we check them for applicability of arguments to formals. That is, can we convert every supplied argument to its corresponding formal parameter type? (And of course, is the number of arguments provided correct, and so on.) In your example both methods are applicable.
Of the applicable candidates that remain, we now enter a round of betterness checking. How does betterness checking work? Again, we do it argument-by-argument. In this case we have two questions to answer:
DbContext.Foos can be converted to either IEnumerable<Foo> or IQueryable<Foo>. Which, if either, is the better conversion?
The lambda can be converted to either a delegate or an expression tree. Which, if either, is the better conversion?
The second question is easy to answer: neither is better. We learn nothing from this argument with respect to betterness.
To answer the first question, we apply the rule conversion to specific is better than conversion to general. If given a choice to convert to Giraffe or Mammal, converting to Giraffe is better. So now the question is which is more specific, IQueryable<Foo> or IEnumerable<Foo>?
The rule of specificity checking is straightforward: if X can be implicitly converted to Y but Y cannot be implicitly converted to X, then X is the more specific. A Giraffe can be used where an Animal is needed, but an Animal cannot be used where a Giraffe is needed, so Giraffe is more specific. Or: every giraffe is an animal, but not every animal is a giraffe, so giraffe is more specific.
By this measure, IQueryable<T> is more specific than IEnumerable<T> because every queryable is an enumerable but not every enumerable is a queryable.
So the queryable is more specific, and therefore that conversion is better.
Now we ask the question "is there a unique applicable candidate method where, compared to every other candidate, at least one conversion was better and no conversion was worse?" There is; the queryable candidate has the property that it is better in one argument than every other candidate, and not worse in any other argument, and it is the unique method that has this property.
Therefore overload resolution chooses that method.
I encourage you to read the specification if you have more questions.
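The betterness rules above can be tried out with a small sketch. The two Pick methods below are hypothetical stand-ins with the same parameter shapes as the two FirstOrDefault overloads; they only report which overload the compiler bound to, and the data is an in-memory array rather than a DbSet:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class OverloadDemo
{
    // Hypothetical overload shaped like the Queryable version.
    public static string Pick<T>(this IQueryable<T> source, Expression<Func<T, bool>> predicate)
        => "IQueryable + expression tree";

    // Hypothetical overload shaped like the Enumerable version.
    public static string Pick<T>(this IEnumerable<T> source, Func<T, bool> predicate)
        => "IEnumerable + delegate";

    public static void Main()
    {
        IQueryable<int> queryable = new[] { 1, 2, 3 }.AsQueryable();
        List<int> list = new List<int> { 1, 2, 3 };
        Func<int, bool> isEven = x => x % 2 == 0;

        // Both overloads are applicable; IQueryable<int> is the more specific conversion.
        Console.WriteLine(queryable.Pick(x => x % 2 == 0)); // IQueryable + expression tree

        // A List<int> is not an IQueryable<int>, so only one overload is applicable.
        Console.WriteLine(list.Pick(x => x % 2 == 0));      // IEnumerable + delegate

        // A Func value cannot be converted to an expression tree, so again only one applies.
        Console.WriteLine(queryable.Pick(isEven));          // IEnumerable + delegate
    }
}
```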
Inherited class proximity matters more than exact method parameter types
Notice that the Expression<Func<T,bool>> variant applies to IQueryable<T>, whereas the Func<T, bool> variant applies to IEnumerable<T>.
When looking for a matching method, the compiler will always pick the one closest to the object's type. The inheritance hierarchy is as follows:
DbSet<T> : IQueryable<T> : IEnumerable<T>
Note: there may be other inheritances in between, but that doesn't matter. What matters is which is closest to DbSet<T>. IQueryable<T> is more closely related to DbSet<T> than IEnumerable<T>.
Therefore, the compiler will try to find a matching method in IQueryable<T>. It asks two questions:
Does this type have a method by that name?
Do the method parameter types match/map?
IQueryable<T> has a FirstOrDefault method, so bullet point 1 is satisfied; and since x => x.MyBoolean can be implicitly converted to an Expression<Func<T, bool>>, bullet point 2 is also satisfied.
Therefore, you end up with the Expression<Func<T,bool>> variant defined on IQueryable<T>.
Suppose x => x.MyBoolean could not be implicitly converted to Expression<Func<T,bool>> but could be converted to Func<T,bool> (note: this isn't the case, but this could happen for other types/values), then bullet point 2 would not have been satisfied.
At this point, since the compiler did not find a match in IQueryable<T>, it will keep looking further, stumble on IEnumerable<T> and ask itself the same questions (bullet points). Both bullet points would have been satisfied.
Therefore, in this case, you would've ended up with the Func<T,bool> variant defined on IEnumerable<T>.
Update
Here's a dotnetfiddle example.
Notice that even though I pass int values (which the base method signature uses), the double signature of the Derived class fits (because int implicitly converts to double) and the compiler never looks in the Base class.
However, this isn't true in Derived2. Since int does not implicitly convert to string, there is no match found in Derived2, and the compiler looks further in Base and uses the int method from Base.
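The linked fiddle is not reproduced here, so the following is only a sketch of the behaviour described, with hypothetical class and method names. Note that it shows the rule for instance methods in a class hierarchy; for extension methods such as FirstOrDefault, the choice follows the betterness rules described in the other answer.

```csharp
using System;

public class Base
{
    public string Describe(int value) => "Base.Describe(int)";
}

public class Derived : Base
{
    // Applicable to an int argument because int converts implicitly to double.
    public string Describe(double value) => "Derived.Describe(double)";
}

public class Derived2 : Base
{
    // Not applicable to an int argument: int does not convert implicitly to string.
    public string Describe(string value) => "Derived2.Describe(string)";
}

public static class ProximityDemo
{
    public static void Main()
    {
        // An applicable method on the derived class hides the base candidates.
        Console.WriteLine(new Derived().Describe(1));  // Derived.Describe(double)

        // Nothing applicable on Derived2, so the compiler falls back to Base.
        Console.WriteLine(new Derived2().Describe(1)); // Base.Describe(int)
    }
}
```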
I think the most useful place to look in the C# spec is Anonymous Function Expressions:
An anonymous function does not have a value or type in and of itself, but is convertible to a compatible delegate or expression tree type
...
In an implicitly typed parameter list, the types of the parameters are inferred from the context in which the anonymous function occurs—specifically, when the anonymous function is converted to a compatible delegate type or expression tree type, that type provides the parameter types.
Which then leads us to Anonymous Function Conversions:
A lambda expression F is compatible with an expression tree type Expression<D> if F is compatible with the delegate type D. Note that this does not apply to anonymous methods, only lambda expressions.
Those are the dry bits from the spec. However, it's also helpful to read Eric Lippert's How do we ensure that type inference terminates to pull bits together.

Passing a method to a LINQ query

In a project I'm currently working on, we have many static Expressions that we have to bring in local scope with a variable when we call the Invoke method on them and pass our lambda expressions' arguments to.
Today, we declared a static method whose parameter is exactly the type that the query is expecting. So, my coworker and I were messing around to see if we could get this method to do the projection in the Select statement of our query, instead of invoking it on the whole object, without bringing it into local scope.
And it worked! But we do not understand why.
Imagine code like this
// old way
public static class ManyExpressions {
    public static Expression<Func<SomeDataType, bool>> UsefulExpression {
        get {
            // TODO implement more believable lies and logic here
            return (sdt) => sdt.someCondition == true && false || true;
        }
    }
}
public class ARealController : BaseController {
    /* many declarations of important things */
    public ARealController( /* many ninjected in things */) {
        /* many assignments */
    }
    public JsonNet<ImportantDataResult> getSomeInfo(/* many useful parameters */) {
        var usefulExpression = ManyExpressions.UsefulExpression;
        // the db context is all taken care of in BaseController
        var result = db.SomeDataType
            .Where(sdt => usefulExpression.Invoke(sdt))
            .Select(sdt => new { /* grab important things*/ })
            .ToList();
        return JsonNet(result);
    }
}
And then you get to do this!
// new way
public class SomeModelClass {
    /* many properties, no constructor, and very few useful methods */
    // TODO come up with better fake names
    public static SomeModelClass FromDbEntity(DbEntity dbEntity) {
        return new SomeModelClass { /* init all properties here*/ };
    }
}
public class ARealController : BaseController {
    /* many declarations of important things */
    public ARealController( /* many ninjected in things */) {
        /* many assignments */
    }
    public JsonNet<SomeModelClass> getSomeInfo(/* many useful parameters */) {
        // the db context is all taken care of in BaseController
        var result = db.SomeDataType
            .Select(SomeModelClass.FromDbEntity) // TODO; explain this magic
            .ToList();
        return JsonNet(result);
    }
}
So when ReSharper prompts me to do this (which is not often, as this condition of matching the type that is expected by the Delegate isn't often satisfied), it says convert to a Method Group. I kind of vaguely understand that a Method Group is a set of methods, and the C# compiler can take care of converting the method group to an explicitly typed and appropriate overload for the LINQ provider and what not... but I'm fuzzy on why this works exactly.
What's going on here?
It's great to ask a question when you don't understand something, but the problem is that it can be hard to know which bit someone doesn't understand. I hope I help here, rather than tell you a bunch of stuff you know, and not actually answer your question.
Let's go back to the days before Linq, before expressions, before lambda, and before even anonymous delegates.
In .NET 1.0 we didn't have any of those. We didn't even have generics. We did though have delegates. And a delegate is related to a function pointer (if you know C, C++ or languages with such) or function as argument/variable (if you know Javascript or languages with such).
We could define a delegate:
public delegate int MyDelegate(double someValue, double someOtherValue);
And then use it as a type for a field, property, variable, method argument or as the basis of an event.
But at the time the only way to actually give a value for a delegate was to refer to an actual method.
public int CompareDoubles(double x, double y)
{
if (x < y) return -1;
return y < x ? 1 : 0;
}
MyDelegate dele = CompareDoubles;
We can invoke that with dele.Invoke(1.0, 2.0) or the shorthand dele(1.0, 2.0).
Now, because we have overloading in .NET, we can have more than one thing that CompareDoubles refers to. That isn't a problem, because if we also had e.g. public int CompareDoubles(double x, double y, double z){…} the compiler could know that you could only possibly have meant to assign the other CompareDoubles to dele so it's unambiguous. Still, while in the context CompareDoubles means a method that takes two double arguments and returns an int, outside of that context CompareDoubles means the group of all the methods with that name.
Hence the name: that group of methods is what we call a method group.
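As a small sketch of that point (the three-parameter overload's body is made up for illustration, and the methods are static here only so the example fits in one class): the name CompareDoubles refers to two methods, and the delegate type decides which one is meant.

```csharp
using System;

public delegate int MyDelegate(double someValue, double someOtherValue);

public static class MethodGroupDemo
{
    public static int CompareDoubles(double x, double y)
    {
        if (x < y) return -1;
        return y < x ? 1 : 0;
    }

    // Hypothetical second overload, so that the name CompareDoubles denotes a group of two methods.
    public static int CompareDoubles(double x, double y, double z)
    {
        int first = CompareDoubles(x, y);
        return first != 0 ? first : CompareDoubles(y, z);
    }

    public static void Main()
    {
        // Only the two-parameter overload matches MyDelegate, so the conversion is unambiguous.
        MyDelegate dele = CompareDoubles;
        Console.WriteLine(dele(1.0, 2.0));        // -1
        Console.WriteLine(dele.Invoke(2.0, 1.0)); // 1
    }
}
```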
Now, with .NET 2.0 we got generics, which is useful with delegates, and at the same time in C#2 we got anonymous methods, which is also useful. As of 2.0 we could now do:
MyDelegate dele = delegate (double x, double y)
{
if (x < y) return -1;
return y < x ? 1 : 0;
};
This part was just syntactic sugar from C#2, and behind the scenes there's still a method there, though it has an "unspeakable name" (a name that is valid as a .NET name but not valid as a C# name, so C# names can't clash with it). It was handy if, as was often the case, one was creating methods just to have them used once with a particular delegate though.
Move forward a bit further, and with .NET 3.5 we have the Func and Action delegates (great for reusing the same name based on type, rather than having a bunch of different delegates which were often very similar), and along with it came C#3, which had lambda expressions. (Generic covariance and contravariance, which are great with delegates, followed with .NET 4.0 and C#4.)
Now, these are a bit like anonymous methods in one use, but not in another.
That's why we can't do:
var func = (int i) => i * 2;
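var works out what it means from what's been assigned to it, but lambdas work out what they are from what they've been assigned to, so this is ambiguous. (Since C# 10 a lambda with explicitly typed parameters has a "natural type", so newer compilers accept this line and infer Func<int, int>; in the C#3 described here it is an error.)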
It could mean:
Func<int, int> func = i => i * 2;
In which case it's shorthand for:
Func<int, int> func = delegate(int i){return i * 2;};
Which in turn is shorthand for something like:
int <>SomeNameImpossibleInC# (int i)
{
return i * 2;
}
Func<int, int> func = <>SomeNameImpossibleInC#;
But it can also be used as:
Expression<Func<int, int>> func = i => i * 2;
Which is shorthand for:
ParameterExpression param = Expression.Parameter(typeof(int), "i");
Expression<Func<int, int>> func = Expression.Lambda<Func<int, int>>(
    Expression.Multiply(
        param,
        Expression.Constant(2)
    ),
    param
);
And we also with .NET 3.5 have Linq which makes heavy use of both of these. Indeed, Expressions is considered part of Linq and is in the System.Linq.Expressions namespace. Note that the object we get here is a description of what we want done (take the parameter, multiply it by two, give us the result) not of how to do it.
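To make the "description, not instructions" point concrete, here is a minimal sketch that builds the same tree by hand, compares it with the one the compiler builds from the lambda, and only then compiles it into something runnable. The class and method names are illustrative.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Builds the tree for i => i * 2 by hand.
    public static Expression<Func<int, int>> BuildByHand()
    {
        ParameterExpression param = Expression.Parameter(typeof(int), "i");
        return Expression.Lambda<Func<int, int>>(
            Expression.Multiply(param, Expression.Constant(2)),
            param);
    }

    public static void Main()
    {
        // The compiler builds an equivalent tree when a lambda is assigned to Expression<...>.
        Expression<Func<int, int>> fromLambda = i => i * 2;
        Expression<Func<int, int>> byHand = BuildByHand();

        // Both are data that can be inspected: the body is a Multiply node.
        Console.WriteLine(fromLambda.Body.NodeType); // Multiply
        Console.WriteLine(byHand.Body.NodeType);     // Multiply

        // Compile() turns the description into a delegate that can actually run.
        Func<int, int> compiled = byHand.Compile();
        Console.WriteLine(compiled(21)); // 42
    }
}
```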
Now, Linq operates in two main ways. On IQueryable and IQueryable<T> and on IEnumerable and IEnumerable<T>. The former defines operations to be used on "a provider" with just what "a provider does" being up to that provider, and the latter defines the same operations on in-memory sequences of values.
We can move from one to the other. We can turn an IEnumerable<T> into an IQueryable<T> with AsQueryable which will give us a wrapper on that enumerable, and we can turn the IQueryable<T> into an IEnumerable<T> just by treating it as one, because IQueryable<T> derives from IEnumerable<T>.
The enumerable form uses the delegates. A simplified version of how Select works (there are many optimisations this version leaves out, and I'm skipping error checking and an indirection to ensure that error checking happens immediately) would be:
public static IEnumerable<TResult> Select<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector)
{
foreach(TSource item in source) yield return selector(item);
}
The queryable version, on the other hand, works by taking the expression tree from the Expression<Func<TSource, TResult>>, making it part of an expression that includes the call to Select and the source queryable, and returning an object wrapping that expression. So in other words a call to queryable's Select returns an object that represents a call to queryable's Select!
Just what is done with that depends on the provider. Database providers turn them into SQL, enumerables call Compile() on the expression to create a delegate and then we're back at the first version of Select above, and so on.
But that history considered, let's go backwards through the history again. A lambda can represent either an expression or a delegate (and if an expression, we can Compile() it to get the same delegate). A delegate is a way of pointing to a method through a variable, and a method is part of a method group. All of this is built on technology which in the first version could only be called by creating a method and then passing that.
Now, let's say we have a method that takes a single argument and has a result.
public string IntString(int num) { return num.ToString(); }
Now let's say we referenced it in a lambda selector:
Enumerable.Range(0, 10).Select(i => IntString(i));
We have a lambda creating an anonymous method for a delegate, and that anonymous method in turn calls a method with the same argument and return types. In a way that's a bit like if we had:
public string MyAnonymousMethod(int i){return IntString(i);}
MyAnonymousMethod is a bit pointless here; all it does is call IntString(i) and return the result, so why not just call IntString in the first place and cut out going through that method:
Enumerable.Range(0, 10).Select(IntString);
We've cut out a needless (though see note below about delegate caching) level of indirection by taking the lambda-based delegate and converting it to a method group. Hence ReSharper's advice "Convert to Method Group" or however it's worded (I don't use ReSharper myself).
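A quick sketch to confirm that the two spellings give the same results on an in-memory sequence (IntString is made static here only so the example is self-contained):

```csharp
using System;
using System.Linq;

public static class MethodGroupSelectDemo
{
    public static string IntString(int num) { return num.ToString(); }

    public static void Main()
    {
        // Lambda that does nothing but forward to IntString.
        var viaLambda = Enumerable.Range(0, 3).Select(i => IntString(i)).ToList();

        // Method group: the delegate points at IntString directly.
        var viaMethodGroup = Enumerable.Range(0, 3).Select(IntString).ToList();

        Console.WriteLine(string.Join(",", viaLambda));             // 0,1,2
        Console.WriteLine(string.Join(",", viaMethodGroup));        // 0,1,2
        Console.WriteLine(viaLambda.SequenceEqual(viaMethodGroup)); // True
    }
}
```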
There is though something to be careful of here. IQueryable<T>'s Select only takes expressions, so the provider can try to work out how to convert it to its way of doing stuff (e.g. SQL against a database). IEnumerable<T>'s Select only takes delegates so they can be executed in the .NET application itself. We can go from the former to the latter (when the queryable is really a wrapped enumerable) with Compile(), but we can't go from the latter to the former: We don't have a way of taking a delegate and turning it into an expression that means anything other than "call this delegate" which isn't something that can be turned into SQL.
Now when we use a lambda expression like i => i * 2 it will be an expression when used with IQueryable<T> and a delegate when used with IEnumerable<T> due to overload resolution rules favouring the expression with queryable (as a type it can handle both, but the expression form works with the most derived type). If though we explicitly give it a delegate, whether because we typed it somewhere as Func<> or it comes from a method group, then the overloads taking expressions aren't available and those taking delegates are used. This means it doesn't get passed to the database but rather the linq expression up to that point becomes the "database part" and it gets called and the rest of the work done in memory.
95% of the time that's best avoided. So 95% of the time if you get advice of "convert to method group" with a database-backed query you should think "uh oh! that's actually a delegate. Why is that a delegate? Can I change it to be an expression?". Only the remaining 5% of the time should you think "that'll be slightly shorter if I just pass in the method name". (Also, using a method group instead of a lambda could prevent the caching of delegates that the compiler can otherwise do, so it might be less efficient.)
There, I hope I covered the bit that you didn't understand in the course of all that, or at least there's a bit here you can point to and say "that bit there, that's the bit I don't grok".
Select(SomeModelClass.FromDbEntity)
This uses Enumerable.Select which is not what you want. This transitions out of "queryable-LINQ" into LINQ to objects. This means the database cannot execute this code.
.Where(sdt => usefulExpression.Invoke(sdt))
Here, I assume you meant .Where(usefulExpression). This passes the expression into the expression tree underlying the query. The LINQ provider can translate this expression.
When you perform experiments like this use SQL Profiler to see what SQL goes over the wire. Make sure all relevant parts of the query are translatable.
I don't want to disappoint you, but there is no magic at all. And I would suggest you to be very careful with this "new way".
Always check the result of a function by hovering over it in VS. Remember that IQueryable<T> "inherits" IEnumerable<T>, and also that Queryable contains extension methods with the same names as those in Enumerable; the only difference is that the former works with Expression<Func<...>> while the latter works with just Func<..>.
So any time you use a Func or method group over IQueryable<T>, the compiler will pick the Enumerable overload, thus silently switching from the LINQ to Entities context to the LINQ to Objects context. But there is a huge difference between the two: the former is executed in the database while the latter is executed in memory.
The key point is to stay as long as possible in the IQueryable<T> context, so the "old way" should be preferred. E.g. from your examples
.Where(sdt => sdt.someCondition == true && false || true)
or
.Where(ManyExpressions.UsefulExpression)
or
.Where(usefulExpression)
but not
.Where(sdt => usefulExpression.Invoke(sdt))
And never
.Select(SomeModelClass.FromDbEntity)
This solution threw up some red flags for me. Key among them was:
var result = db.SomeDataType
.Select(SomeModelClass.FromDbEntity) // TODO; explain this magic
.ToList(); // <<!!!!!!!!!!!!!
Whenever you're dealing with Entity Framework, you can read "ToList()" as "Copy the whole thing into memory." So "ToList()" should only be done at the last possible second.
Consider: there are lots of useful objects you can pass around when dealing with EF:
The database context
The specific dataset you're targeting (e.g. context.Orders)
Queries against a context:
var query = context.Orders.Where(o => o.Customer.Name == "John")
    .Where(o => o.TxNumber > 100000)
    .OrderBy(o => o.TxDate);
// I've pulled NO data so far! "var query" is just an object I can pass around
// and even add on to! For example, I can now do this:
query = query.ThenBy(o => o.Items.Description); // and now I've appended that to my query
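The "no data pulled so far" behaviour can be observed without a database. The sketch below is an assumption-laden stand-in: it uses an in-memory list and LINQ to Objects instead of an EF context, and counts how often the filter runs before and after the query is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns how often the predicate ran before and after the query was enumerated.
    public static (int Before, int After, List<int> Results) Run()
    {
        var numbers = new List<int> { 5, 1, 4, 2, 3 };
        int predicateCalls = 0;

        // Building the query only describes the work; nothing runs yet.
        IEnumerable<int> query = numbers
            .Where(n => { predicateCalls++; return n > 2; })
            .OrderBy(n => n);
        int before = predicateCalls;

        // ToList() enumerates the query, which is when the predicate actually runs.
        List<int> results = query.ToList();
        return (before, predicateCalls, results);
    }

    public static void Main()
    {
        var (before, after, results) = Run();
        Console.WriteLine(before);                    // 0
        Console.WriteLine(after);                     // 5
        Console.WriteLine(string.Join(",", results)); // 3,4,5
    }
}
```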
The real magic is that those lambdas can be thrown in to a variable too. Here's a method I use in one of my projects to do that:
/// <summary>
/// Generates the Lambda "TIn => TIn.memberName [comparison] value"
/// </summary>
static Expression<Func<TIn, bool>> MakeSimplePredicate<TIn>(string memberName, ExpressionType comparison, object value)
{
var parameter = Expression.Parameter(typeof(TIn), "t");
Expression left = Expression.PropertyOrField(parameter, memberName);
return (Expression<Func<TIn, bool>>)Expression.Lambda(Expression.MakeBinary(comparison, left, Expression.Constant(value)), parameter);
}
With this code, you can write something like the following:
public IQueryable<Order> GetQuery(string field, string value)
{
    var query = context.Orders;
    var condition = MakeSimplePredicate<Order>(field, ExpressionType.Equal, value);
    return query.Where(condition);
}
The best thing is that at this point, no data call has been made. You can continue to add conditions as you wish. When you're ready to fetch the data, simply iterate through it or call ToList().
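Here is a self-contained sketch of the helper in use. The Order class and its property names are hypothetical, and an in-memory list wrapped with AsQueryable() stands in for the EF context, so this exercises the expression building rather than any SQL translation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity, used only for this illustration.
public class Order
{
    public string CustomerName { get; set; } = "";
    public int TxNumber { get; set; }
}

public static class PredicateDemo
{
    // Generates the lambda "t => t.memberName [comparison] value".
    public static Expression<Func<TIn, bool>> MakeSimplePredicate<TIn>(
        string memberName, ExpressionType comparison, object value)
    {
        var parameter = Expression.Parameter(typeof(TIn), "t");
        Expression left = Expression.PropertyOrField(parameter, memberName);
        return (Expression<Func<TIn, bool>>)Expression.Lambda(
            Expression.MakeBinary(comparison, left, Expression.Constant(value)),
            parameter);
    }

    public static List<int> MatchingTxNumbers(IQueryable<Order> orders)
    {
        // Conditions are appended one by one; nothing is evaluated until ToList().
        IQueryable<Order> query = orders
            .Where(MakeSimplePredicate<Order>("CustomerName", ExpressionType.Equal, "John"))
            .Where(MakeSimplePredicate<Order>("TxNumber", ExpressionType.GreaterThan, 100000));
        return query.Select(o => o.TxNumber).ToList();
    }

    public static void Main()
    {
        IQueryable<Order> orders = new List<Order>
        {
            new Order { CustomerName = "John", TxNumber = 100001 },
            new Order { CustomerName = "John", TxNumber = 5 },
            new Order { CustomerName = "Mary", TxNumber = 200000 },
        }.AsQueryable();

        Console.WriteLine(string.Join(",", MatchingTxNumbers(orders))); // 100001
    }
}
```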
Enjoy!
Oh, and check this out if you'd like to see a more thoroughly-developed solution, albeit from a different context.
My Post on Linq Expression Trees

Why does .Where() on an IQueryable return a different type based on whether a Lambda or Func<T,Tresult> are passed as parameters

In my Entity Framework Code First project I have a repository object which contains routes
public class EFRepository
{
...
public IQueryable<Route> Routes
{
get { return context.Routes; }
}
...
}
If I run
var routes = this.repository.Routes
.Where(r => r.DeployedService.IsActive.HasValue
&& r.DeployedService.IsActive.Value);
The routes object is of type IQueryable<Route>.
However, if I run
Func<Route, bool> routeIsActive = r => r.DeployedService.IsActive.HasValue
&& r.DeployedService.IsActive.Value;
var routes = this.repository.Routes.Where(routeIsActive);
The routes object in this case is of type IEnumerable<Route>.
I would have thought that they would be evaluated the same but clearly I am wrong. What is the difference between the two statements and why do they return different types?
The method .Where(Expression<Func<Route,bool>>) is defined for IQueryable<Route> (as an extension method in the Queryable class). The method .Where(Func<Route,bool>), on the other hand, is defined for IEnumerable<Route> (in the Enumerable class) and not for IQueryable<Route>. Thus each returns its own type for fluent LINQ method chaining.
The additional method defined by IQueryable allows the expression tree to be pushed down to the LINQ provider, e.g. LINQ-to-entities, for lazy late evaluation at the provider level where this is possible.
Because passing Func<Route,bool> will make it Linq-to-objects (Func is .NET delegate which will be executed in .NET code). It tells EF: load all routes and I will do filtering in .NET code.
You need to pass an expression (Expression<Func<Route,bool>>) which will be internally translated to SQL to work with Linq-to-entities. It tells EF: here is the filter I want to translate to SQL and execute on the database server and I want to receive only filtered result set.
IQueryable<T> inherits from IEnumerable<T>. This means that Where is overloaded:
One overload takes an expression and returns an IQueryable<T>.
public static IQueryable<TSource> Where<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate)
The other overload takes a function and returns an IEnumerable<T>.
public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
If you pass in a lambda, both overloads are applicable, and overload resolution prefers the first one. If you pass in a Func<T,bool>, then only the second overload is applicable.
If you change the type of your variable to Expression<Func<Route, bool>>, then you'll get an IQueryable<Route> back.
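The overload behaviour can be checked without a database. The sketch below is an illustration under assumptions: an in-memory array wrapped with AsQueryable() stands in for repository.Routes, and the variable names are made up. The explicit variable types are the point: each assignment compiles only because that overload returns that type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// In-memory stand-in for repository.Routes (an assumption for illustration).
IQueryable<int> source = new[] { 1, 2, 3, 4 }.AsQueryable();

Func<int, bool> asDelegate = n => n % 2 == 0;
Expression<Func<int, bool>> asExpression = n => n % 2 == 0;

// Only Enumerable.Where accepts a Func, so the result is IEnumerable<int>.
IEnumerable<int> viaDelegate = source.Where(asDelegate);

// Queryable.Where accepts the expression, so the result stays IQueryable<int>.
IQueryable<int> viaExpression = source.Where(asExpression);

// An inline lambda converts to either type; the Queryable overload is preferred.
IQueryable<int> viaLambda = source.Where(n => n % 2 == 0);

Console.WriteLine(viaDelegate is IQueryable<int>);
Console.WriteLine(viaExpression.Count());
Console.WriteLine(viaLambda.Count());
```

Changing the declared type of viaDelegate to IQueryable<int> should produce a compile error, which is the same type difference the question observed.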

List<T> Ordering

I have an issue: I am allowing a user to select the criteria for ordering a List.
Let's say my list is called
List<Car> allCars = new List<Car>();
allCars = //call the database and get all the cars
I now want to order this list
allCars.orderBy(registrationDate)
I understand the above doesn't work, but I have no idea what I should be putting in the brackets.
allCars.OrderBy(c => c.RegistrationDate);
The declaration of Enumerable.OrderBy is
public static IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector
)
and, as it's an extension method, it can be invoked as
source.OrderBy(keySelector)
Your List<Car> plays the role of source, because List<T> implements IEnumerable<T>. The second parameter is the more interesting one and the one that you are confused by. It's declared as being of type
Func<TSource, TKey>
This means that it is a delegate that eats instances of TSource (in your case Car) and returns an instance of TKey; it's up to you to decide what TKey is. You have stated that you want to order by Car.registrationDate so it sounds like TKey is DateTime. Now, how do we get one of these delegates?
In the old days we could say
DateTime GetRegistrationDate(Car car) {
return car.registrationDate;
}
and use OrderBy like so:
allCars.OrderBy(GetRegistrationDate)
In C# 2.0 we gained the ability to use anonymous delegates (formally, anonymous methods); these are delegates that don't have a name and are defined in-place.
allCars.OrderBy(delegate(Car car) { return car.registrationDate; });
Then, in C# 3.0, we gained the ability to use lambda expressions, which are very special anonymous delegates with a compact notation
allCars.OrderBy(car => car.registrationDate);
Here, car => car.registrationDate is the lambda expression, and it represents a Func<Car, DateTime> that can be used as the second parameter in Enumerable.OrderBy.
allCars.orderBy(registrationDate)
The reason this doesn't work is because registrationDate is not a delegate. In fact, without any context at all registrationDate is meaningless to the compiler. It doesn't know if you mean Car.registrationDate or maybe you mean ConferenceAttendee.registrationDate or who knows what. This is why you must give additional context to the compiler and tell it that you want the property Car.registrationDate. To do this, you use a delegate in one of the three ways mentioned above.
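The three ways of supplying the key selector can be put side by side. This is a sketch under assumptions: the Car record, its Plate property, and the sample dates are invented for illustration, and the property is spelled RegistrationDate to follow .NET naming conventions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var allCars = new List<Car>
{
    new Car("B", new DateTime(2012, 5, 1)),
    new Car("A", new DateTime(2009, 1, 15)),
    new Car("C", new DateTime(2010, 8, 30)),
};

// 1. A named method, passed as a method group.
DateTime GetRegistrationDate(Car car) { return car.RegistrationDate; }
var byMethod = allCars.OrderBy(GetRegistrationDate).ToList();

// 2. A C# 2.0 anonymous method.
var byAnonymous = allCars
    .OrderBy(delegate(Car car) { return car.RegistrationDate; })
    .ToList();

// 3. A C# 3.0 lambda expression.
var byLambda = allCars.OrderBy(car => car.RegistrationDate).ToList();

// OrderBy returns a new ordered sequence; allCars itself is not reordered.
Console.WriteLine(string.Join(", ", byLambda.Select(c => c.Plate))); // A, C, B
Console.WriteLine(string.Join(", ", allCars.Select(c => c.Plate)));  // B, A, C

// Hypothetical Car type used only for this example.
record Car(string Plate, DateTime RegistrationDate);
```

All three calls produce the same ordering. Note that the result has to be captured (here with ToList()), because OrderBy does not sort the list in place.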

How do I decipher the Select method docs on MSDN?

I'm struggling to get my head around LINQ and have come to the conclusion that searching through dozens of examples until I find one that is near to my own application in C# is not teaching me how to fish.
So back to the docs where I immediately hit a brick wall.
Can someone please help me decipher the Enumerable.Select method as presented here on msdn
http://msdn.microsoft.com/en-us/library/bb548891.aspx and given as a tip by Intellisense?
Enumerable.Select(TSource, TResult) Method (IEnumerable(TSource), Func(TSource, TResult))
Here is the same line broken down into its parts, one per line, if it helps to refer to them:
Enumerable.Select
(TSource, TResult)
Method
(IEnumerable(TSource),
Func
(TSource, TResult))
It might help to look at the definition of this method in C#, from the MSDN article you refer to:
public static IEnumerable<TResult> Select<TSource, TResult>(
this IEnumerable<TSource> source,
Func<TSource, TResult> selector
)
The <angle brackets> denote the type parameters for this generic method, and we can start to explore the purpose of the method simply by looking at what the type parameters are doing.
We begin by looking at the name of the generic method:
Select<TSource, TResult>
This tells us that the method called Select deals with two different types:
The type TSource; and
The type TResult
Let's look at the parameters:
The first parameter is IEnumerable<TSource> source — a source, providing a TSource enumeration.
The second parameter is Func<TSource, TResult> selector — a selector function that takes a TSource and turns it into a TResult. (This can be verified by exploring the definition of Func)
Then we look at its return value:
IEnumerable<TResult>
We now know this method will return a TResult enumeration.
To summarise, we have a function that takes an enumeration of TSource, and a selector function that takes individual TSource items and returns TResult items, and then the whole select function returns an enumeration of TResult.
An example:
To put this into concrete terms, let's say that TSource is of type Person (a class representing a person, with a name, age, gender, etc.), and TResult is of type String (representing the person's name). We're going to give the Select function a list of Persons, and a function that, given a Person, will select just their name. As the output of calling this Select function, we will get a list of Strings containing just the names of the people.
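In code, that example might look like the sketch below. The Person record and the sample names are assumptions made for illustration; only Select and Func come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var people = new List<Person>
{
    new Person("Ada", 36),
    new Person("Grace", 45),
};

// TSource = Person, TResult = string.
Func<Person, string> selector = person => person.Name;

// IEnumerable<TSource> in, IEnumerable<TResult> out.
IEnumerable<string> names = people.Select(selector);

Console.WriteLine(string.Join(", ", names)); // Ada, Grace

// Hypothetical Person type used only for this example.
record Person(string Name, int Age);
```

Writing the selector as a separately typed variable is only for clarity; passing person => person.Name inline does the same thing.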
Aside:
The last piece of the puzzle from the original method signature, at the top, is the this keyword before the first parameter. This is part of the syntax for defining Extension Methods, and all it essentially means is that instead of calling the static Select method (passing in your source enumeration, and selector function) you can just invoke the Select method directly on your enumeration, just as if it had a Select method (and pass in only one parameter — the selector function).
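The two calling styles described in the aside can be compared directly. This is a small sketch with a made-up list of words; it shows that the extension-method form is just another spelling of the static call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var words = new List<string> { "one", "three", "five" };

// Ordinary static call: the source is passed explicitly as the first argument.
IEnumerable<int> viaStatic = Enumerable.Select(words, w => w.Length);

// Extension-method call: the source appears before the dot,
// and only the selector is passed.
IEnumerable<int> viaExtension = words.Select(w => w.Length);

Console.WriteLine(viaStatic.SequenceEqual(viaExtension)); // True
```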
I hope this makes it clearer for you?
The way to think of Select is as mapping each element of a sequence. Hence:
Enumerable.Select<TSource, TResult>: the Select method is parameterised by its source and result types (the type of thing you are mapping and the type you are mapping it to)
IEnumerable<TSource>: the sequence of things to map
Func<TSource, TResult>: the mapping function, that will be applied to each element of the source sequence
The result being an IEnumerable<TResult>, a sequence of mapping results.
For example, you could imagine (as a trivial example) mapping a sequence of integers to their string representations:
IEnumerable<string> strings = ints.Select(i => i.ToString());
Here ints is the IEnumerable<TSource> (IEnumerable<int>) and i => i.ToString() is the Func<TSource, TResult> (Func<int, string>).
I'm of the opinion that the later chapters of C# in Depth do a good job of explaining LINQ, and what it all means. Plus, the rest of the book teaches a lot of other very useful C# knowledge.
