Why does IEnumerable's Where pass first argument as "this" - c#

I've just started to study on LINQ. The following example uses Where which is one of the standard query operators.
string[] names = { "Tom", "Dick", "Harry" };
IEnumerable<string> filteredNames = System.Linq.Enumerable.Where(names, n => n.Length >= 4);
I did some research on how it works and found this source:
public static partial class Enumerable
{
public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate) {
if (source == null) throw Error.ArgumentNull("source");
if (predicate == null) throw Error.ArgumentNull("predicate");
if (source is Iterator<TSource>) return ((Iterator<TSource>)source).Where(predicate);
if (source is TSource[]) return new WhereArrayIterator<TSource>((TSource[])source, predicate);
if (source is List<TSource>) return new WhereListIterator<TSource>((List<TSource>)source, predicate);
return new WhereEnumerableIterator<TSource>(source, predicate);
}
/* ... */
}
I don't understand why its first parameter, this IEnumerable<TSource> source, is prefixed with the this keyword. I know that extension methods allow an existing type to be extended with new methods without
altering the definition of the original type and that the type
of the first parameter will be the type that is extended.
Can you explain its logic beneath?

Because it is an Extension Method.
So that instead of
IEnumerable<string> filteredNames = System.Linq.Enumerable.Where(names, n => n.Length >= 4);
You can also use it like:
IEnumerable<string> filteredNames = names.Where(n => n.Length >= 4);
The Reason it is an extension method is that IEnumerable, List, ... existed long before Linq (which was introduced in .Net 3.5) and its job is just to extend finding, filtering, ordering, ... them. So it is logical to have it as an Extention method rather than a separate library. And also consider that this way you can use chaining, which woudln'd be possible if it wasn't an extension:
name.Where(x => x.Length > 4).Select(x => x.Substring(4));
Compare it to:
System.Linq.Enumerable.Select(System.Linq.Enumerable.Where(name, x => x.Length > 4), x => x.Substring(4));
And this is only a very simple one, consider how dirty it gets with larger, complex queries.

Since it's an Extension Method. It means that Where is not a method on IEnumerable but when you reference Linq namespace ,Where method is added to IEnumerable.
for more info read this :
Extension Methods

First of all, The simple answer is because it is an extension method. The definition of extension method is that Their first parameter specifies which type the method operates on, and the parameter is preceded by the this modifier.
Secondly, I disagree Ashkan Mobayen Khiabani's answers last part. you must have a look fluent implementation without extension method.

Related

Linq: Extension method on IEnumerable to automatically do null-checks when performing selects

When performing a Select on an IEnumerable I believe it is good practice to check for null references, so I often have a Where before my Select like this:
someEnumerable.Where(x => x != null).Select(x => x.SomeProperty);
This gets more complicated when accessing sub-properties:
someEnumerable.Where(x => x != null && x.SomeProperty != null).Select(x => x.SomeProperty.SomeOtherProperty);
To follow this pattern I need to do a lof of calls to Where. I would like to create an extension method on IEnumerable that perform such null-checks automatically depending on what is referenced in the Select. Like this:
someEnumerable.SelectWithNullCheck(x => x.SomeProperty);
someEnumerable.SelectWithNullCheck(x => x.SomeProperty.SomeOtherProperty);
Can this be done? Is it fx. possible to retrieve the selected properties from the selector parameter when creating an extension method such as this?
public static IEnumerable<TResult> SelectWithNullCheck<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector)
{
return source.Where(THIS IS WHERE THE AUTOMATIC NULL-CHECKS HAPPEN).Select(selector);
}
EDIT: I use C# 5.0 with .NET Framework 4.5
As you are using C# 5.0 you can write your extension method the following way:
public static IEnumerable<TResult> SelectWithNullCheck<TSource, TResult>(
this IEnumerable<TSource> source,
Func<TSource, TResult> selector)
{
return source.Where(x => x != null).Select(selector).Where(x => x != null);
}
Before and after the projection (Select call) to apply a check that the result is not null.
Then usage will be:
someEnumerable.SelectWithNullCheck(x => x.SomeProperty)
.SelectWithNullCheck(y => y.SomeOtherProperty);
Notice that the type of item in each call is different.
If you do want it similar to this:
someEnumerable.SelectWithNullCheck(x => x.SomeProperty.SomeOtherProperty);
Then you'd need to go with #Treziac's suggestion and use the ?. operator (introduced in C# 6.0) and then filter nulls out:
public static IEnumerable<TResult> SelectWithNullCheck<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector)
{
return source.Select(selector).Where( x=> x != null);
}
someEnumerable.SelectWithNullCheck(x => x?.SomeProperty?.SomeOtherProperty);
You can use Expression based solution. Below is basic and workable solution for field / property chain call. It will work for very deep call chains. It is not perfect. For example it won't work if there is method call in the chain (obj.Prop1.MethodCall().Prop2).
Expression based solutions are in general slower, because of the need to compile the lambda expression to delegate, which should be taken in consideration.
Performance stats:
Testes with collection of 200k objects with nested call level of 2 (obj.Prop1.Prop2) where all objects fail for the condition.
LINQ Where with C# 6 ?. operator : 2 - 4 ms
Exception based (try / catch): 14,000 - 15,000 ms
Expression based: 4 - 10 ms
NOTE: The expression based solution will add overhead of several ms for every call, this number won't depend on the collection size, because the expression will be compiled for every call which is an expensive operation. You can think for cache mechanism if you are interested.
Source for expression based solution::
public static IEnumerable<T> IgnoreIfNull<T, TProp>(this IEnumerable<T> sequence, Expression<Func<T, TProp>> expression)
{
var predicate = BuildNotNullPredicate(expression);
return sequence.Where(predicate);
}
private static Func<T, bool> BuildNotNullPredicate<T, TProp>(Expression<Func<T, TProp>> expression)
{
var root = expression.Body;
if (root.NodeType == ExpressionType.Parameter)
{
return t => t != null;
}
var pAccessMembers = new List<Expression>();
while (root.NodeType == ExpressionType.MemberAccess)
{
var mExpression = root as MemberExpression;
pAccessMembers.Add(mExpression);
root = mExpression.Expression;
}
pAccessMembers.Reverse();
var body = pAccessMembers
.Aggregate(
(Expression)Expression.Constant(true),
(f, s) =>
{
if (s.Type.IsValueType)
{
return f;
}
return Expression.AndAlso(
left: f,
right: Expression.NotEqual(s, Expression.Constant(null))
);
});
var lambda = Expression.Lambda<Func<T, bool>>(body, expression.Parameters[0]);
var func = lambda.Compile();
return func;
}
This is how is used:
var sequence = ....
var filtered = sequence.IgnoreIfNull(x => x.Prop1.Prop2.Prop3 ... etc);
Why not using ?. operator ?
someEnumerable.Where(x => x?.SomeProperty != null).Select(x => x.SomeProperty.SomeOtherProperty);
(note that this may return null values)
or
someEnumerable.Select(x => x?.SomeProperty?.SomeOtherProperty).Where(x => x != null);
(this will not return any null values)
It's not really good or bad practice, it depends of what you want in your return
Another option is to split the selection null check into a custom operator (e.g. WhereNotNull). Combine this with the ?. operator, solves your problem in an imho very expressive way.
public static IEnumerable<TSource> WhereNotNull<TSource>(this IEnumerable<TSource> source)
{
return source.Where(x=> x != null);
}
This allows you to write:
someEnumerable.Select(x => x?.SomeProperty?.SomeOtherProperty)
.WhereNotNull();
If not you could always chain the selects (for version prior to C# 6):
someEnumerable.Select(x => x.SomeProperty)
.Select(x => x.SomeOtherProperty)
.WhereNotNull();
Given you absolutely want to access x.SomeProperty.SomeOtherProperty the last option would be to catch the NullReferenceException.
public static IEnumerable<TResult> SelectWithNullCheck<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector)
{
return source.Select(x =>
{
try
{
return selector(x);
}
catch(NullReferenceException ex)
{
return default(TResult);
}
})
.Where(x=> default(TResult) != x);
}

Inner method in Linq query that return IQueryable

I would like to do following thing in Entity Framework code first:
Data.GroupBy(x => x.Person.LastName)
.Select(x => x.OtherGrouping(...)).Where(...).GroupBy(...)...etc
public static class Extensions
{
public static IQueryable<IGrouping<T, T>> OtherGrouping<T>(this IQueryable<T> collection /**** Here I want to pass consts and lambdas\expressions*****/)
{
// Grouping, filtration and other stuff here
return collection.GroupBy(x => x);
}
}
The problem is .net compiles OtherGrouping method, it doesn't represent like an expression, and couldn't be transformed to the SQL.
I found LinqKit library which suppose to help in such cases, but I coudn't figure out how to apply it in my specific case. It works fine for simple cases, when I just have expression like x => x+2, but I get stucked with return IQueryable. Probably it is possible to write expression tree completely by hand but I don't want to go so deep.
Any ideas how it might be done or where I can read about it?
Here is what I tried to do based on Rob's comment
void Main()
{
AddressBases.GroupBy(x => x.ParentId).Select(x => new {x.Key, Items = x.AsQueryable().OtherGrouping(a => a.Country) }).Dump();
}
public static class Extensions
{
public static IQueryable<IGrouping<TKey, TSource>> OtherGrouping<TSource, TKey>(this IQueryable<TSource> source, Expression<Func<TSource, TKey>> keySelector)
{
return source.Provider.CreateQuery<IGrouping<TKey, TSource>>(
source.GroupBy(keySelector).Expression);
}
}
And I got an exception:
NotSupportedException: LINQ to Entities does not recognize the method 'System.Linq.IQueryable`1[System.Linq.IGrouping`2[System.String,InsuranceData.EF.AddressBase]] OtherGrouping[AddressBase,String](System.Linq.IQueryable`1[InsuranceData.EF.AddressBase], System.Linq.Expressions.Expression`1[System.Func`2[InsuranceData.EF.AddressBase,System.String]])' method, and this method cannot be translated into a store expression.
Couldn't figure out yet why I have OtherGrouping method in expression tree? OtherGrouping method should just attach another grouping to the expression and pass it to the provider, but not put itself to the tree.
Your current code after the edit is correct - where you're going wrong is a common issue with the syntactic sugar we're given by C#.
If you write the following:
void Main()
{
var res = Containers.OtherGrouping(c => c.ContainerID);
res.Expression.Dump();
res.Dump();
}
public static class Extensions
{
public static IQueryable<IGrouping<TKey, TSource>> OtherGrouping<TSource, TKey>(this IQueryable<TSource> source, Expression<Func<TSource, TKey>> keySelector)
{
return source.Provider.CreateQuery<IGrouping<TKey, TSource>>(
source.GroupBy(keySelector).Expression
);
}
}
You'll see that we get the output:
Table(Container).GroupBy(c => c.ContainerID)
And the result executes with an issue, properly grouping by our predicate. However, you're invoking OtherGrouping while inside an expression, when you're passing it to Select. Let's try a simple case without OtherGrouping:
var res = Containers.OtherGrouping(c => c.ContainerID)
.Select(x => new
{
x.Key,
Items = x.GroupBy(c => c.ContainerID),
});
res.Expression.Dump();
What you're providing to Select is an expression tree. That is, the inner GroupBy's method is never actually invoked. If you change the query to x.AsQueryable().OtherGrouping and put a breakpoint in OtherGrouping, it will only be hit the first time.
In addition to that, x is IEnumerable<T>, not IQueryable<T>. Invoking AsQueryable() gives you an IQueryable, but it also gives you a new query provider. I'm not 100% sure as to the inner workings of entity framework, but I'd wager to say that this is invalid - as we no longer have the database as a target for the provider. Indeed, with linq2sql (using LINQPad), we get an error saying .AsQueryable() is not supported.
So, what can you do? Unfortunately there's no clean way to do this. Since the expression tree is already built before OtherGrouping has a chance, it doesn't matter what the body of OtherGrouping is. We'll need to change the expression tree after it's built, but before it's executed.
For that, you'll need to write an expression visitor which will look for .OtherGrouping expression calls, and replace it with Queryable.GroupBy

Why doesn't IOrderedEnumerable retain order after where filtering

I've created a simplification of the issue. I have an ordered IEnumerable, I'm wondering why applying a where filter could unorder the objects
This does not compile while it should have the potential to
IOrderedEnumerable<int> tmp = new List<int>().OrderBy(x => x);
//Error Cannot Implicitly conver IEnumerable<int> To IOrderedEnumerable<int>
tmp = tmp.Where(x => x > 1);
I understand that there would be no gaurenteed execution order if coming from an IQueryable such as using linq to some DB Provider.
However, when dealing with Linq To Object what senario could occur that would unorder your objects, or why wasn't this implemented?
EDIT
I understand how to properly order this that is not the question. My Question is more of a design question. A Where filter on linq to objects should enumerate the give enumerable and apply filtering. So why is that we can only return an IEnumerable instead of an IOrderedEnumerable?
EDIT
To Clarify the senario in when this would be userful. I'm building Queries based on conditions in my code, I want to reuse as much code as possible. I have a function that is returning an OrderedEnumerable, however after applying the additional where I would have to reorder this even though it would be in its original ordered state
Rene's answer is correct, but could use some additional explanation.
IOrderedEnumerable<T> does not mean "this is a sequence that is ordered". It means "this is a sequence that has had an ordering operation applied to it and you may now follow that up with a ThenBy to impose additional ordering requirements."
The result of Where does not allow you to follow it up with ThenBy, and therefore you may not use it in a context where an IOrderedEnumerable<T> is required.
Make sense?
But of course, as others have said, you almost always want to do the filtering first and then the ordering. That way you are not spending time putting items into order that you are just going to throw away.
There are of course times when you do have to order and then filter; for example, the query "songs in the top ten that were sung by a woman" and the query "the top ten songs that were sung by a woman" are potentially very different! The first one is sort the songs -> take the top ten -> apply the filter. The second is apply the filter -> sort the songs -> take the top ten.
The signature of Where() is this:
public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
So this method takes an IEnumerable<int> as first argument. The IOrderedEnumerable<int> returned from OrderBy implements IEnumerable<int> so this is no problem.
But as you can see, Where returns an IEnumerable<int> and not an IOrderedEnumerable<int>. And this cannot be casted into one another.
Anyway, the object in that sequence will still have the same order. So you could just do it like this
IEnumerable<int> tmp = new List<int>().OrderBy(x => x).Where(x => x > 1);
and get the sequence you expected.
But of course you should (for performance reasons) filter your objects first and sort them afterwards when there are fewer objects to sort:
IOrderedEnumerable<int> tmp = new List<int>().Where(x => x > 1).OrderBy(x => x);
The tmp variable's type is IOrderedEnumerable.
Where() is a function just like any other with a return type, and that return type is IEnumerable. IEnumerable and IOrderedEnumerable are not the same.
So when you do this:
tmp = tmp.Where(x => x > 1);
You are trying to assign the result of a Where() function call, which is an IEnuemrable, to the tmp variable, which is an IOrderedEnumerable. They are not directly compatible, there is no implicit cast, and so the compiler sends you an error.
The problem is you are being too specific with the tmp variable's type. You can make one simple change that will make this all work by being just be a little less specific with your tmp variable:
IEnumerable<int> tmp = new List<int>().OrderBy(x => x);
tmp = tmp.Where(x => x > 1);
Because IOrderedEnumerable inherits from IEnumerable, this code will all work. As long as you don't want to call ThenBy() later on, this should give you exactly the same results as you expect without any other loss of ability to use the tmp variable later.
If you really need an IOrderedEnumerable, you can always just call .OrderBy(x => x) again:
IOrderedEnumerable<int> tmp = new List<int>().OrderBy(x => x);
tmp = tmp.Where(x => x > 1).OrderBy(x => x);
And again, in most cases (not all, but most) you want to get your filtering out of the way before you start sorting. In other words, this is even better:
var tmp = new List<int>().Where(x => x > 1).OrderBy(x => x);
why wasn't this implemented?
Most likely because the LINQ designers decided that the effort to implement, test, document etc. isn't worth enough compared to the potential use cases. In fact your are the first one I hear complaining about that.
But if it's so important to you, you can add that missing functionality yourself (similar to #Jon Skeet MoreLINQ extension library). For instance, something like this:
namespace MyLinq
{
public static class Extensions
{
public static IOrderedEnumerable<T> Where<T>(this IOrderedEnumerable<T> source, Func<T, bool> predicate)
{
return new WhereOrderedEnumerable<T>(source, predicate);
}
class WhereOrderedEnumerable<T> : IOrderedEnumerable<T>
{
readonly IOrderedEnumerable<T> source;
readonly Func<T, bool> predicate;
public WhereOrderedEnumerable(IOrderedEnumerable<T> source, Func<T, bool> predicate)
{
if (source == null) throw new ArgumentNullException(nameof(source));
if (predicate == null) throw new ArgumentNullException(nameof(predicate));
this.source = source;
this.predicate = predicate;
}
public IOrderedEnumerable<T> CreateOrderedEnumerable<TKey>(Func<T, TKey> keySelector, IComparer<TKey> comparer, bool descending) =>
new WhereOrderedEnumerable<T>(source.CreateOrderedEnumerable(keySelector, comparer, descending), predicate);
public IEnumerator<T> GetEnumerator() => Enumerable.Where(source, predicate).GetEnumerator();
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
}
}
And putting it into action:
using System;
using System.Collections.Generic;
using System.Linq;
using MyLinq;
var test = Enumerable.Range(0, 100)
.Select(n => new { Foo = 1 + (n / 20), Bar = 1 + n })
.OrderByDescending(e => e.Foo)
.Where(e => (e.Bar % 2) == 0)
.ThenByDescending(e => e.Bar) // Note this compiles:)
.ToList();

LINQ to Entities does not recognize the method 'Boolean Contains[Int32]

I have the following extension methods in which I am using to do a Contains on LINQ-To-Entities:
public static class Extensions
{
public static IQueryable<TEntity> WhereIn<TEntity, TValue>
(
this ObjectQuery<TEntity> query,
Expression<Func<TEntity, TValue>> selector,
IEnumerable<TValue> collection
)
{
if (selector == null) throw new ArgumentNullException("selector");
if (collection == null) throw new ArgumentNullException("collection");
if (!collection.Any())
return query.Where(t => false);
ParameterExpression p = selector.Parameters.Single();
IEnumerable<Expression> equals = collection.Select(value =>
(Expression)Expression.Equal(selector.Body,
Expression.Constant(value, typeof(TValue))));
Expression body = equals.Aggregate((accumulate, equal) =>
Expression.Or(accumulate, equal));
return query.Where(Expression.Lambda<Func<TEntity, bool>>(body, p));
}
//Optional - to allow static collection:
public static IQueryable<TEntity> WhereIn<TEntity, TValue>
(
this ObjectQuery<TEntity> query,
Expression<Func<TEntity, TValue>> selector,
params TValue[] collection
)
{
return WhereIn(query, selector, (IEnumerable<TValue>)collection);
}
}
When I call the extenion method to check if a list of ids is in a particular table, it works and I get back the List of ids, like this:
List<int> Ids = _context.Persons
.WhereIn(x => x.PersonId, PersonIds)
.Select(x => x.HeaderId).ToList();
When I execute the next statement, it complains that LINQ-To-Entities does not recogonize Contains(int32), but I thought I am not going against the entity anymore, but a collection of ints.
predicate = predicate.And(x=> Ids.Contains(x.HeaderId));
If I have a comma separated string such as "1,2,3", then the following works:
predicate = predicate.And(x=>x.Ids.Contains(x.HeaderId));
I am trying to take the List returned and create comma separated list of strings, the problem here is that now when I do predicate = predicate.And(x=>sb.Contains(x.HeaderId.ToString());, it complains that it does not like ToString().
I also tried doing:
predicate = predicate.And(x=>Extensions.WhereIn(Ids, x.id));, but it can't resolve WhereIn. It says I must add `<>`, but I am not sure what to add here and how implement it.
Where is nothing wrong with your WhereIn, and you are correct: when you use Ids, you are not going against the entity anymore, but a collection of ints.
Problem is when you're using .And on predicate: LINQ-To-Entities tries to convert everything inside those brackets into Entities methods, and there is no corresponding Contains method.
Solution:
Instead of
predicate = predicate.And(x=> Ids.Contains(x.HeaderId));
use
predicate = predicate.And(Contains<XClassName, int>(x.HeaderId));
where Contains defined as follows:
private static Expression<Func<TElement, bool>> Contains<TElement, TValue>(Expression<Func<TElement, TValue>> valueSelector, List<TValue> values)
{
if (null == valueSelector) { throw new ArgumentNullException("valueSelector"); }
if (null == values) { throw new ArgumentNullException("values"); }
if (!values.Any())
return e => false;
var equals = values.Select(value => (Expression)Expression.Equal(valueSelector.Body, Expression.Constant(value, typeof(TValue))));
return Expression.Lambda<Func<TElement, bool>>(#equals.Aggregate(Expression.Or), valueSelector.Parameters.Single());
}
and XClassName is the name of the class of your x
You cant use array like that, you need to previsit this lambda in order to expand it to primitives. Alternatively you can change underlying provider so it knows how to generate IN statement , as it doesnt by default.
Didnt find post where one guys actually implement it, will updated once I did.
Basically when you use your extension method it is like
x=>arr.Contains(x)
So if you try to execute such lambda agains your entityset etc it will throw you exception saying that parameters can only be primitives.
The reason is that underlying provider doesnt know how to convert .Contains method for array as function parameter into sql query. And in order to solve that you have two options
teach it how to use T[] as parameter and use Contains with this parameter
update your extension method in order to generate new lamda which will use 'allowed' building blocks, ie expressions using primitive types like int, string, guid etc.
Check this article
http://msdn.microsoft.com/en-us/library/bb882521(v=vs.90).aspx
Replace your:
List<int> Ids = _context.Persons
.WhereIn(x => x.PersonId, PersonIds)
.Select(x => x.HeaderId).ToList();
with
var Ids = _context.Persons
.WhereIn(x => x.PersonId, PersonIds)
.Select(x => x.HeaderId).ToList();
and then try.

Two candidates in method Enumerable.Where

Has anyone encountered this problem? I have two same candidates to method Enumerable.Where
And what is the Func'2 and Func'3?
When i trying to filter enumerable
var subItems = itemsToShow.Where(item => item.Visible);
I have an error:
Cannot resolve method 'Where(lambda expression)', candidates are
System.Collection.Generic.IEnumerable<T> Where<T>(this System.Collection.Generic.IEnumerable<T>, System.Func'2) (in calss Enumerable)
System.Collection.Generic.IEnumerable<T> Where<T>(this System.Collection.Generic.IEnumerable<T>, System.Func'3) (in calss Enumerable)
On .Net 3.5 this work perfect
A quick look at the MSDN tells you that there are in fact two overloads.
One just filters based on a predicate, and the second overload also takes the index of the item in the enumeration into account.
Func'3 and Func'2 meens that it is a generic class with 2 and 3 type parameters.
I assume that first is for Func<T, bool> where T is your earlier defined type.
and Func<T, int, bool> the same plus indexer.
Func<T, int, bool> - it is a predicate that accepts two arguments of types T and int and returns bool.
Just build the solution and see the detailed error. Mine was a nullable boolean.
It happened to me because I was trying to use .Contains on a List type, while what I needed was .Any
for example
var myObjectsList = new List<MyClass>();
// instead of this
myObjectList.Contains(x => x.Id == 1)
// use this
myObjectList.Any(x => x.Id == 1)
Try casting to an IQueryable. Like so: itemsToShow.AsQueryable()

Categories

Resources