Consider the following code, where dbContext is a SQL Server database context and Examples is a DbSet:
this.dbContext.Examples.Take(5).ToList();
Enumerable.Take(this.dbContext.Examples, 5).ToList();
The first line works as expected and is converted to SQL in the following manner:
SELECT TOP(5) * FROM Examples
However, the second line first fetches all rows and applies the Take operator afterwards. Why is that?
Since I am using expressions to build a dynamic lambda I have to use the second approach (Enumerable.Take):
var call = Expression.Call(
typeof(Enumerable),
"Take",
new[]{ typeof(Examples) },
contextParam,
Expression.Constant(5)
);
Unfortunately, the first approach does not work when working with expressions and the current architecture of the program forces me to build a lambda dynamically.
Why does the second approach fetch all rows, and how can I prevent that so I can use it efficiently in expressions?
You're not calling the same method. The first line is invoking Queryable.Take, not Enumerable.Take.
Since DbSet implements both IQueryable<> and IEnumerable<>, and IQueryable<> itself derives from IEnumerable<>, the compiler treats IQueryable<> as the more specific type. So when it resolves the Take extension method to call, it picks Queryable.Take(...), because that overload takes an IQueryable<> as its first parameter.
This is important because the IQueryable<> interface is what allows LINQ queries to be built as expression trees that get evaluated into SQL. The moment you switch to treating an IQueryable<> as an IEnumerable<>, you lose that behavior and switch to only being able to iterate over the results of whatever query had been built prior to that.
Try this:
Queryable.Take(this.dbContext.Examples, 5).ToList();
or this:
var call = Expression.Call(
typeof(Queryable),
"Take",
new[]{ typeof(Examples) },
contextParam,
Expression.Constant(5)
);
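The overload resolution described above can be checked without a database. The following is only a sketch: it assumes an in-memory array exposed through AsQueryable() as a stand-in for the DbSet, and the class and method names (TakeResolutionSketch, TakeViaExpression, EndsInQueryableCall) are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class TakeResolutionSketch
{
    // Stand-in for a DbSet (assumption): an in-memory array exposed as IQueryable<int>.
    public static IQueryable<int> Source() =>
        new[] { 1, 2, 3, 4, 5, 6, 7, 8 }.AsQueryable();

    // Builds Queryable.Take(source, count) by hand and asks the provider for a query.
    public static IQueryable<int> TakeViaExpression(IQueryable<int> source, int count)
    {
        MethodCallExpression call = Expression.Call(
            typeof(Queryable),
            "Take",
            new[] { typeof(int) },       // generic argument TSource
            source.Expression,           // the query built so far
            Expression.Constant(count));
        return source.Provider.CreateQuery<int>(call);
    }

    // True when the outermost node of the query is a call to a Queryable method.
    public static bool EndsInQueryableCall(IQueryable<int> query) =>
        query.Expression is MethodCallExpression m
        && m.Method.DeclaringType == typeof(Queryable);
}
```

Calling TakeResolutionSketch.Source().Take(5) and TakeViaExpression(Source(), 5) both produce a query whose last node is Queryable.Take, so a SQL provider would still see the Take. Enumerable.Take(Source(), 5), by contrast, returns a plain in-memory iterator that no longer carries an expression tree.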
It works because in the first statement
dbContext.Examples.Take(5).ToList();
You are invoking .Take(5) on an IQueryable, which the LINQ provider can translate into a proper SQL statement and execute against the database.
If you need the query to happen on the database side, you have to construct the query on the IQueryable interface instance.
Enumerable.Take works on an IEnumerable reference, so the Take method executes in memory, on the rows returned by a query that selects everything.
The query sent for this.dbContext.Examples selects all rows; Enumerable.Take then takes the first 5 from that result in memory.
Below is an example from my textbook:
var allGenres = from genre in myEntities.Genres.Include("Reviews")
orderby genre.Name
select new { genre.Name, genre.Reviews };
Repeater1.DataSource = allGenres.ToList();
Repeater1.DataBind();
and the book says:
as soon as you call ToList(), the query is executed and the relevant genres and reviews are retrieved from the database and assigned to the DataSource property
So my question is: if I get rid of Repeater1.DataSource = allGenres.ToList();, what does var allGenres contain, since the query hasn't been executed?
There are three stages to understand.
First, when the query is created.
Second, when the query variable is iterated over (deferred execution).
Third, forcing a query for immediate results.
var allGenres = from genre in myEntities.Genres.Include("Reviews")
orderby genre.Name
select new { genre.Name, genre.Reviews };
In this code, the query is only created. It's as dead as a person in a cemetery.
With deferred execution, the query runs when you iterate over the results, for example in a foreach loop.
To force immediate execution, you can use conversion operators like
ToList, ToArray, ToLookup, and ToDictionary.
Hope it helps.
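The stages can be observed without a database. This is a rough sketch that uses LINQ to Objects as a stand-in for the Entity Framework query (an assumption; the genre names and the DeferredExecutionSketch class are invented). It counts how often the filter lambda runs before and after ToList().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Returns how many times the filter ran before enumeration and after ToList().
    public static (int beforeEnumeration, int afterToList) Run()
    {
        int predicateCalls = 0;
        var genres = new List<string> { "Rock", "Jazz", "Pop" };

        // Stage 1: the query is only described; the lambda has not run yet.
        IEnumerable<string> query = genres
            .Where(g => { predicateCalls++; return g.Length > 3; })
            .OrderBy(g => g);
        int before = predicateCalls;

        // Stage 3: ToList() enumerates the query, so the filter runs now.
        _ = query.ToList();
        return (before, predicateCalls);
    }
}
```

Run() reports zero predicate calls after the query variable is assigned and one call per source element once ToList() has been invoked. A database-backed query behaves analogously: no SQL is sent until something enumerates it.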
You can put a breakpoint on this line:
var allGenres = from genre in myEntities.Genres.Include("Reviews")
orderby genre.Name
select new { genre.Name, genre.Reviews };
and notice that nothing happens. After the following line, voilà, SQL Profiler will display the SQL query being executed:
Repeater1.DataSource = allGenres.ToList();
Your object allGenres is an object that implements IQueryable<...>. This means that it represents a sequence, and it has functions to get the first element of the sequence, and once you've got the element you can get the next one, until there are no more elements.
The <...> part defines what kind of elements are in the sequence. So IQueryable<Book> says that you can query for a sequence of Books, which you can enumerate one after another. Every element of the sequence will be an object of class Book.
This enumeration is provided using a base interface of IQueryable, namely IEnumerable<Book>.
At the lowest level, this Enumeration is done as follows:
IEnumerable<Book> books = ... // for example new List<Book>(), or new Book[10]
IEnumerator<Book> bookEnumerator = books.GetEnumerator();
// as long as there are books, print the title:
// the first MoveNext() moves to the first element
// every other MoveNext() moves to the next element
// it returns false if there is no such element (no first, or no next)
while (bookEnumerator.MoveNext())
{
// the enumerator points to the next element. Property Current contains this element
Book book = bookEnumerator.Current;
Console.WriteLine(book.Title);
}
Normally we won't use this low-level functionality. You'll see the foreach statement more often:
foreach (Book book in books)
{
Console.WriteLine(book.Title);
}
foreach will do the GetEnumerator() / MoveNext() / Current for you.
Note that the IEnumerable does not represent the enumeration itself, it represents the ability to enumerate. Quite often people are not that precise, and call the IEnumerable the sequence itself. But remember: to access the elements of the sequence you need to Enumerate over them (either by using MoveNext, or by calling foreach).
An IQueryable<...> seems very similar to an IEnumerable<...>. The difference is that it is usually meant to be processed by a different process, like a database management system or a server on a different computer. It can also represent the lines in a CSV file, or whatever. The purpose of the IQueryable is to separate how the data is fetched from the manipulation of the fetched data.
Just like an IEnumerable holds the ability to enumerate, the IQueryable holds the ability to query data.
For this, the IQueryable has an Expression and a Provider. The Expression defines in a generic way what data must be fetched. The Provider knows who must provide the data (the database), and what language this data provider needs (SQL).
Perhaps you have noticed there are two kinds of LINQ statements. The ones that return an IQueryable and the ones that don't. The first group are functions like Where, Select, Join, GroupBy. They all return an IQueryable of some kind.
As long as you concatenate functions of this group, the Expression is changed. The query is not executed yet. The return value is still an object that represents the ability to query. These functions use deferred execution (or lazy execution), meaning that the query is not executed yet. You'll recognize these functions because they return IQueryable<...>. The remarks section of the description of these functions also mentions that execution is deferred.
Only after you call GetEnumerator() / MoveNext(), either directly, or indirectly using foreach the query is executed.
If you start enumerating, the Expression is sent to the Provider, who will translate the Expression into the language that the executor of the query understands (SQL) and will order the executor to execute the query. The fetched data is converted to an IEnumerable<...> which is then enumerated as if the data was local.
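To see the Expression grow as operators are chained, here is a small sketch. It assumes the in-memory provider you get from AsQueryable() instead of a database provider, and the QueryShapeSketch class and its method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class QueryShapeSketch
{
    // In-memory stand-in for a table (assumption).
    public static IQueryable<int> Numbers() => new[] { 5, 12, 7, 30 }.AsQueryable();

    // Each chained operator wraps the previous expression in a new method call node.
    public static IQueryable<int> Filtered() => Numbers().Where(n => n > 6);
    public static IQueryable<int> Projected() => Filtered().Select(n => n * 2);

    // Name of the outermost operator recorded in the expression tree.
    public static string OutermostOperator(IQueryable query) =>
        query.Expression is MethodCallExpression call ? call.Method.Name : "(source)";
}
```

OutermostOperator(Numbers()) gives "(source)", OutermostOperator(Filtered()) gives "Where" and OutermostOperator(Projected()) gives "Select". None of these calls enumerate anything; they only read the Expression property.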
It depends a bit on who made the Provider, but sometimes the Provider is really smart. It doesn't fetch all the millions of Products from your products database, but fetches a page of Products. While you enumerate over this page, the Provider fetches the next page. This improves processing speed because not many more Products are fetched than you will actually use; besides, you can start enumerating before all Products are fetched.
I mentioned the LINQ functions that use deferred execution (= return IQueryable). The other group of functions executes the query. This group contains functions like ToList, ToDictionary, Max, FirstOrDefault and Count. They don't return IQueryable<...>, but some TResult. If you look at the source code (google for "reference source queryable tolist"), you will see that they do this by calling foreach or GetEnumerator.
Because the Expression must be translated to SQL, the query has fewer possibilities than an IEnumerable<...>. For instance, you can't use any locally defined methods in your query. If you do that, you will get a run-time exception as soon as the query is executed, telling you that the expression can't be translated into SQL.
What kind of expressions can be executed depends on who must execute your query. There is a list of Supported and Unsupported LINQ methods for LINQ to Entities.
If you are using Entity Framework, allGenres will be an IQueryable<T>, where T is the anonymous type created by the select clause. The behavior is called deferred execution. For more details, see https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/ef/language-reference/query-execution#deferred-query-execution
What is the difference between returning IQueryable<T> vs. IEnumerable<T>, when should one be preferred over the other?
IQueryable<Customer> custs = from c in db.Customers
where c.City == "<City>"
select c;
IEnumerable<Customer> custs = from c in db.Customers
where c.City == "<City>"
select c;
Will both be deferred execution and when should one be preferred over the other?
Yes, both will give you deferred execution.
The difference is that IQueryable<T> is the interface that allows LINQ-to-SQL (LINQ-to-anything, really) to work. So if you further refine your query on an IQueryable<T>, that query will be executed in the database, if possible.
For the IEnumerable<T> case, it will be LINQ-to-Objects, meaning that all objects matching the original query will have to be loaded into memory from the database.
In code:
IQueryable<Customer> custs = ...;
// Later on...
var goldCustomers = custs.Where(c => c.IsGold);
That code will execute SQL that selects only gold customers. The following code, on the other hand, will execute the original query in the database and then filter out the non-gold customers in memory:
IEnumerable<Customer> custs = ...;
// Later on...
var goldCustomers = custs.Where(c => c.IsGold);
This is quite an important difference, and working on IQueryable<T> can in many cases save you from returning too many rows from the database. Another prime example is doing paging: If you use Take and Skip on IQueryable, you will only get the number of rows requested; doing that on an IEnumerable<T> will cause all of your rows to be loaded in memory.
The top answer is good, but it doesn't mention expression trees, which explain "how" the two interfaces differ. Basically, there are two identical sets of LINQ extensions. Where(), Sum(), Count(), FirstOrDefault(), etc. all have two versions: one that accepts functions and one that accepts expressions.
The IEnumerable version signature is: Where(Func<Customer, bool> predicate)
The IQueryable version signature is: Where(Expression<Func<Customer, bool>> predicate)
You've probably been using both of those without realizing it because both are called using identical syntax:
e.g. Where(x => x.City == "<City>") works on both IEnumerable and IQueryable
When using Where() on an IEnumerable collection, the compiler passes a compiled function to Where()
When using Where() on an IQueryable collection, the compiler passes an expression tree to Where(). An expression tree is like the reflection system but for code. The compiler converts your code into a data structure that describes what your code does in a format that's easily digestible.
Why bother with this expression tree thing? I just want Where() to filter my data.
The main reason is that both the EF and Linq2SQL ORMs can convert expression trees directly into SQL where your code will execute much faster.
Oh, that sounds like a free performance boost, should I use AsQueryable() all over the place in that case?
No, IQueryable is only useful if the underlying data provider can do something with it. Converting something like a regular List to IQueryable will not give you any benefit.
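The difference between the two lambda forms can be made concrete with a short sketch. The city name "London" and the LambdaFormsSketch class are assumptions chosen for illustration; only Func, Expression<Func<...>> and Compile() come from the framework.

```csharp
using System;
using System.Linq.Expressions;

public static class LambdaFormsSketch
{
    // Same lambda text, two different things at run time:
    // a compiled delegate ...
    public static readonly Func<string, bool> AsDelegate = city => city == "London";
    // ... and a data structure describing the code.
    public static readonly Expression<Func<string, bool>> AsTree = city => city == "London";

    // The tree can be inspected: here, the kind of node at the root of the body.
    public static ExpressionType BodyKind() => AsTree.Body.NodeType;

    // A tree can also be compiled into a delegate when in-memory execution is needed.
    public static bool RunCompiled(string city) => AsTree.Compile()(city);
}
```

BodyKind() returns ExpressionType.Equal, which is the kind of information a provider walks to produce a WHERE clause. The delegate form can only be invoked; it cannot be inspected this way.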
Yes, both use deferred execution. Let's illustrate the difference using the SQL Server profiler....
When we run the following code:
MarketDevEntities db = new MarketDevEntities();
IEnumerable<WebLog> first = db.WebLogs;
var second = first.Where(c => c.DurationSeconds > 10);
var third = second.Where(c => c.WebLogID > 100);
var result = third.Where(c => c.EmailAddress.Length > 11);
Console.Write(result.First().UserName);
In SQL Server profiler we find a command equal to:
"SELECT * FROM [dbo].[WebLog]"
It approximately takes 90 seconds to run that block of code against a WebLog table which has 1 million records.
So, all table records are loaded into memory as objects, and then with each .Where() it will be another filter in memory against these objects.
When we use IQueryable instead of IEnumerable in the above example (second line):
In SQL Server profiler we find a command equal to:
"SELECT TOP 1 * FROM [dbo].[WebLog] WHERE [DurationSeconds] > 10 AND [WebLogID] > 100 AND LEN([EmailAddress]) > 11"
It approximately takes four seconds to run this block of code using IQueryable.
IQueryable has a property called Expression, which stores an expression tree. The tree is built up as each operator is applied in our example, and nothing is executed until the result is used (this is called deferred execution); at that point the expression is converted to an SQL query that runs on the database engine.
Both will give you deferred execution, yes.
As for which is preferred over the other, it depends on what your underlying datasource is.
Returning an IEnumerable means that any further LINQ operators applied to the result bind to LINQ to Objects and run in memory.
Returning an IQueryable (which implements IEnumerable, by the way) provides the extra functionality to translate your query into something that might perform better on the underlying source (LINQ to SQL, LINQ to Entities, etc.).
A lot has been said previously, but back to the roots, in a more technical way:
IEnumerable is a collection of objects in memory that you can enumerate: an in-memory sequence that can be iterated (most easily with a foreach loop, though you can also use the IEnumerator directly). The objects reside in memory as is.
IQueryable is an expression tree that will get translated into something else at some point, with the ability to enumerate over the final outcome. I guess this is what confuses most people.
They obviously have different connotations.
IQueryable represents an expression tree (a query, simply) that will be translated into something else by the underlying query provider as soon as executing APIs are called, like the LINQ aggregate functions (Sum, Count, etc.) or ToList, ToArray, ToDictionary and so on. IQueryable objects also implement IEnumerable and IEnumerable<T>, so that if they represent a query, the result of that query can be iterated. This means IQueryable objects don't have to be queries only. The right term is that they are expression trees.
How those expressions are executed and what they turn into is all up to the so-called query providers (we can think of them as expression executors).
In the Entity Framework world (which is that mystical underlying data source provider, or query provider), IQueryable expressions are translated into native T-SQL queries. NHibernate does similar things with them. You can write your own provider following the concepts described in the article series "LINQ: Building an IQueryable Provider", for example, if you want a custom querying API for your product store provider service.
So basically, IQueryable objects keep being constructed until we explicitly release them and tell the system to rewrite them into SQL or whatever and send them down the execution chain for onward processing.
As for deferred execution, it's a LINQ feature that holds the expression tree in memory and sends it for execution only on demand, whenever certain APIs are called against the sequence (the same Count, ToList, etc.).
The proper usage of both heavily depends on the task at hand. For the well-known repository pattern I personally opt for returning IList, that is, IEnumerable over Lists (with indexers and the like). So my advice is to use IQueryable only within repositories and IEnumerable anywhere else in the code, not to mention the testability concerns: IQueryable breaks down the separation of concerns principle. If you return an expression from within repositories, consumers may play with the persistence layer however they wish.
A little addition to the mess :) (from a discussion in the comments)
None of them are objects in memory since they're not real types per se, they're markers of a type - if you want to go that deep. But it makes sense (and that's why even MSDN put it this way) to think of IEnumerables as in-memory collections whereas IQueryables as expression trees. The point is that the IQueryable interface inherits the IEnumerable interface so that if it represents a query, the results of that query can be enumerated. Enumeration causes the expression tree associated with an IQueryable object to be executed.
So, in fact, you can't really call any IEnumerable member without having the object in the memory. It will get in there if you do, anyways, if it's not empty. IQueryables are just queries, not the data.
In general terms I would recommend the following:
Return IQueryable<T> if you want to enable the developer using your method to refine the query you return before executing.
Return IEnumerable if you want to transport a set of Objects to enumerate over.
Think of an IQueryable as what it is: a "query" for data (which you can refine if you want to). An IEnumerable is a set of objects (which has already been received or was created) over which you can enumerate.
In general you want to preserve the original static type of the query until it matters.
For this reason, you can define your variable as 'var' instead of either IQueryable<> or IEnumerable<> and you will know that you are not changing the type.
If you start out with an IQueryable<>, you typically want to keep it as an IQueryable<> until there is some compelling reason to change it. The reason for this is that you want to give the query processor as much information as possible. For example, if you're only going to use 10 results (you've called Take(10)) then you want SQL Server to know about that so that it can optimize its query plans and send you only the data you'll use.
A compelling reason to change the type from IQueryable<> to IEnumerable<> might be that you are calling some extension function that the implementation of IQueryable<> in your particular object either cannot handle or handles inefficiently. In that case, you might wish to convert the type to IEnumerable<> (by assigning to a variable of type IEnumerable<> or by using the AsEnumerable extension method for example) so that the extension functions you call end up being the ones in the Enumerable class instead of the Queryable class.
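A minimal sketch of that switch, assuming an in-memory AsQueryable() source in place of a real provider. LocalOnlyCheck is a hypothetical method standing in for something a SQL provider could not translate, and AsEnumerableSketch is an invented name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AsEnumerableSketch
{
    // Hypothetical local method that a SQL provider would not be able to translate.
    private static bool LocalOnlyCheck(int n) =>
        n.ToString().EndsWith("0", StringComparison.Ordinal);

    public static IEnumerable<int> Run(IQueryable<int> source)
    {
        // Still IQueryable: this Where is recorded in the expression tree,
        // so a database provider could turn it into a WHERE clause.
        IQueryable<int> translatable = source.Where(n => n > 5);

        // From here on, operators bind to Enumerable and run in memory.
        return translatable.AsEnumerable().Where(n => LocalOnlyCheck(n));
    }
}
```

For example, AsEnumerableSketch.Run(new[] { 1, 10, 15, 20 }.AsQueryable()) yields 10 and 20. The design point is the placement of AsEnumerable(): everything before it stays available to the provider, everything after it is ordinary LINQ to Objects.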
There is a blog post with brief source code sample about how misuse of IEnumerable<T> can dramatically impact LINQ query performance: Entity Framework: IQueryable vs. IEnumerable.
If we dig deeper and look into the sources, we can see that clearly different extension methods are invoked for IEnumerable<T>:
// Type: System.Linq.Enumerable
// Assembly: System.Core, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
// Assembly location: C:\Windows\Microsoft.NET\Framework\v4.0.30319\System.Core.dll
public static class Enumerable
{
public static IEnumerable<TSource> Where<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
{
return (IEnumerable<TSource>)
new Enumerable.WhereEnumerableIterator<TSource>(source, predicate);
}
}
and IQueryable<T>:
// Type: System.Linq.Queryable
// Assembly: System.Core, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
// Assembly location: C:\Windows\Microsoft.NET\Framework\v4.0.30319\System.Core.dll
public static class Queryable
{
public static IQueryable<TSource> Where<TSource>(
this IQueryable<TSource> source,
Expression<Func<TSource, bool>> predicate)
{
return source.Provider.CreateQuery<TSource>(
Expression.Call(
null,
((MethodInfo) MethodBase.GetCurrentMethod()).MakeGenericMethod(
new Type[] { typeof(TSource) }),
new Expression[]
{ source.Expression, Expression.Quote(predicate) }));
}
}
The first one returns an enumerable iterator, and the second one creates a query through the query provider specified in the IQueryable source.
The main difference between “IEnumerable” and “IQueryable” is about where the filter logic is executed. One executes on the client side (in memory) and the other executes on the database.
For example, suppose we have 10,000 user records in our database, of which only 900 are active users. If we use “IEnumerable”, it first loads all 10,000 records into memory and then applies the IsActive filter, which eventually returns the 900 active users.
If, on the other hand, we use “IQueryable” in the same case, the IsActive filter is applied directly in the database, which returns only the 900 active users.
I would like to clarify a few things due to seemingly conflicting responses (mostly surrounding IEnumerable).
(1) IQueryable extends the IEnumerable interface. (You can send an IQueryable to something which expects IEnumerable without error.)
(2) Both IQueryable and IEnumerable LINQ attempt lazy loading when iterating over the result set. (Note that implementation can be seen in interface extension methods for each type.)
In other words, IEnumerables are not exclusively "in-memory". IQueryables are not always executed on the database. IEnumerable must load things into memory (once retrieved, possibly lazily) because it has no abstract data provider. IQueryables rely on an abstract provider (like LINQ-to-SQL), although this could also be the .NET in-memory provider.
Sample use case
(a) Retrieve list of records as IQueryable from EF context. (No records are in-memory.)
(b) Pass the IQueryable to a view whose model is IEnumerable. (Valid. IQueryable extends IEnumerable.)
(c) Iterate over and access the data set's records, child entities and properties from the view. (May cause exceptions!)
Possible Issues
(1) The IEnumerable attempts lazy loading and your data context is expired. Exception thrown because provider is no longer available.
(2) Entity Framework entity proxies are enabled (the default), and you attempt to access a related (virtual) object with an expired data context. Same as (1).
(3) Multiple Active Result Sets (MARS). If you are iterating over the IEnumerable in a foreach( var record in resultSet ) block and simultaneously attempt to access record.childEntity.childProperty, you may end up with MARS due to lazy loading of both the data set and the relational entity. This will cause an exception if it is not enabled in your connection string.
Solution
I have found that enabling MARS in the connection string works unreliably. I suggest you avoid MARS unless it is well-understood and explicitly desired.
Execute the query and store results by invoking resultList = resultSet.ToList() This seems to be the most straightforward way of ensuring your entities are in-memory.
In cases where you are accessing related entities, you may still require a data context. Either that, or you can disable entity proxies and explicitly Include related entities from your DbSet.
I recently ran into an issue with IEnumerable v. IQueryable. The algorithm being used first performed an IQueryable query to obtain a set of results. These were then passed to a foreach loop, with the items instantiated as an Entity Framework (EF) class. This EF class was then used in the from clause of a Linq to Entity query, causing the result to be IEnumerable.
I'm fairly new to EF and Linq for Entities, so it took a while to figure out what the bottleneck was. Using MiniProfiling, I found the query and then converted all of the individual operations to a single IQueryable Linq for Entities query. The IEnumerable took 15 seconds and the IQueryable took 0.5 seconds to execute. There were three tables involved and, after reading this, I believe that the IEnumerable query was actually forming a three table cross-product and filtering the results.
Try to use IQueryables as a rule-of-thumb and profile your work to make your changes measurable.
Both can be used in the same way; they differ only in performance.
IQueryable executes against the database in an efficient way: it builds the entire select query and fetches only the relevant records.
For example, we want to take the top 10 customers whose name start with ‘Nimal’. In this case the select query will be generated as select top 10 * from Customer where name like ‘Nimal%’.
But if we used IEnumerable, the query would be like select * from Customer where name like ‘Nimal%’ and the top ten will be filtered at the C# coding level (it gets all the customer records from the database and passes them into C#).
In addition to first 2 really good answers (by driis & by Jacob) :
The IEnumerable interface is in the System.Collections namespace.
The IEnumerable object represents a set of data in memory and can move on this data only forward. The query represented by the IEnumerable object is executed immediately and completely, so the application receives data quickly.
When the query is executed, IEnumerable loads all the data, and if we need to filter it, the filtering itself is done on the client side.
IQueryable interface is located in the System.Linq namespace.
The IQueryable object provides remote access to the database and allows you to navigate through the data either in a direct order from beginning to end, or in the reverse order. In the process of creating a query, the returned object is IQueryable, the query is optimized. As a result, less memory is consumed during its execution, less network bandwidth, but at the same time it can be processed slightly more slowly than a query that returns an IEnumerable object.
What to choose?
If you need the entire set of returned data, then it's better to use IEnumerable, which provides the maximum speed.
If you DO NOT need the entire set of returned data, but only some filtered data, then it's better to use IQueryable.
In addition to the above, it's interesting to note that you can get exceptions if you use IQueryable instead of IEnumerable:
The following works fine if products is an IEnumerable:
products.Skip(-4);
However if products is an IQueryable and it's trying to access records from a DB table, then you'll get this error:
The offset specified in a OFFSET clause may not be negative.
This is because the following query was constructed:
SELECT [p].[ProductId]
FROM [Products] AS [p]
ORDER BY (SELECT 1)
OFFSET @__p_0 ROWS
and OFFSET can't have a negative value.
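The in-memory half of this observation is easy to reproduce. A small sketch (the list contents and the NegativeSkipSketch class are made up): Enumerable.Skip treats a count of zero or less as "skip nothing" instead of throwing.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class NegativeSkipSketch
{
    // Enumerable.Skip yields every element when count is zero or negative.
    public static List<int> SkipInMemory(int count) =>
        new List<int> { 1, 2, 3 }.Skip(count).ToList();
}
```

SkipInMemory(-4) returns all three elements, while SkipInMemory(2) returns only the last one. Whether a database provider rejects the negative value depends on the SQL it generates, as the error above shows for OFFSET.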
Case 1:
I joined two different DB contexts by calling ToList() on the queries from both contexts.
Case 2:
I also tried joining the first DB context with ToList() and the second with AsQueryable().
Both worked for me. All I want to know is the difference between those Joinings regarding Performance and Functionality. Which one is better ?
var users = (from usr in dbContext.User.AsNoTracking()
select new
{
usr.UserId,
usr.UserName
}).ToList();
var logInfo= (from log in dbContext1.LogInfo.AsNoTracking()
select new
{
log.UserId,
log.LogInformation
}).AsQueryable();
var finalQuery= (from usr in users
join log in logInfo on usr.UserId equals log.UserId
select new
{
usr.UserName,
log.LogInformation
}).ToList();
I'll elaborate on the answer that Jehof gave in his comment. It is true that this join will be executed in memory, and there are two reasons why.
Firstly, this join cannot be performed in the database because you are joining an in-memory object (users) with a deferred query (logInfo). It is therefore not possible to generate a single query that could be sent to the database. This means that before the actual join is performed, the deferred query is executed and all logs are retrieved from the database. To sum up, in this scenario two queries are executed in the database and the join happens in memory. It doesn't matter whether you use ToList + AsQueryable or ToList + ToList in this case.
Secondly, in your scenario this join can ONLY be performed in memory. Even if you use AsQueryable with both the first and the second context, it will not work. You will get a System.NotSupportedException with the message:
The specified LINQ expression contains references to queries that are associated with different contexts.
I wonder why you're using two DB contexts. Is it really needed? As I explained, because of that you lose the possibility of taking full advantage of deferred queries (lazy evaluation features).
If you really have to use two DB contexts, I'd consider adding some filters (WHERE conditions) to the queries that read users and logs from the DB. Why? For a small number of records there is no problem. However, for large amounts of data it is not efficient to perform joins in memory. That is what databases were created for.
It hasn't been explained yet why the statements actually work and why EF doesn't throw an exception that you can only use sequences of primitive types in a LINQ statement.
If you swap both lists ...
var finalQuery= (from log in logInfo
join usr in users on log.UserId equals usr.UserId
...
EF will throw
Unable to create a constant value of type 'User'. Only primitive types or enumeration types are supported in this context.
So why does your code work?
That will become clear if we convert your statement to method syntax (which the compiler does under the hood):
users.Join(logInfo, usr => usr.UserId, log => log.UserId,
(usr, log) => new
{
usr.UserName,
log.LogInformation
})
Since users is an IEnumerable, the extension method Enumerable.Join is resolved as the appropriate method. This method accepts an IEnumerable as the second list to be joined. Therefore, logInfo is implicitly cast to IEnumerable, so it runs as a separate SQL statement before it partakes in the join.
In the version from log in logInfo join usr ..., Queryable.Join is used. Now users is embedded in the query expression as a constant. This turns the whole statement into one expression that EF unsuccessfully tries to translate into one SQL statement.
Now a few words on
Which one is better?
The best option is the one that does just enough to make it work. That means that
You can remove AsQueryable(), because logInfo already is an IQueryable and it is cast to IEnumerable anyway.
You can replace ToList() by AsEnumerable(), because ToList() builds a redundant intermediate result, while AsEnumerable() only changes the compile-time type of users, without triggering its execution yet.
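Which Join overload gets picked can be checked in isolation. The sketch below is an assumption-laden stand-in: it uses in-memory data and AsQueryable() instead of two EF contexts, and the nested User and Log classes and the sample rows are invented. With the in-memory provider both orderings run; with EF the second ordering would fail as described above.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class JoinResolutionSketch
{
    public sealed class User { public int UserId { get; set; } public string UserName { get; set; } = ""; }
    public sealed class Log { public int UserId { get; set; } public string LogInformation { get; set; } = ""; }

    public static List<User> SampleUsers() => new List<User>
    {
        new User { UserId = 1, UserName = "ann" },
        new User { UserId = 2, UserName = "bob" },
    };

    public static IQueryable<Log> SampleLogs() => new List<Log>
    {
        new Log { UserId = 1, LogInformation = "login" },
        new Log { UserId = 2, LogInformation = "logout" },
    }.AsQueryable();

    // Outer sequence is IEnumerable<User>, so this binds to Enumerable.Join.
    public static IEnumerable<string> UsersFirst(IEnumerable<User> users, IQueryable<Log> logs) =>
        users.Join(logs, u => u.UserId, l => l.UserId,
                   (u, l) => u.UserName + ": " + l.LogInformation);

    // Outer sequence is IQueryable<Log>, so this binds to Queryable.Join
    // and the users list becomes part of the expression tree.
    public static IEnumerable<string> LogsFirst(IQueryable<Log> logs, IEnumerable<User> users) =>
        logs.Join(users, l => l.UserId, u => u.UserId,
                  (l, u) => u.UserName + ": " + l.LogInformation);
}
```

UsersFirst(SampleUsers(), SampleLogs()) returns a plain iterator, whereas the value returned by LogsFirst(SampleLogs(), SampleUsers()) is still an IQueryable<string>. That difference in binding is what decides whether a provider tries to translate the whole join.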
ToList()
Execute the query immediately
You will get all the elements ready in memory
AsQueryable()
lazy (execute the query later)
Parameter: Expression<Func<TSource, bool>>
Convert Expression into T-SQL (with specific provider), query remotely and load result to your application memory.
That’s why DbSet (in Entity Framework) also implements IQueryable: to get efficient queries.
It does not load every record. E.g. if Take(5), it will generate select top 5 * SQL in the background.
Given the following LINQ to SQL query:
var test = from i in Imports
where i.IsActive
select i;
The interpreted SQL statement is:
SELECT [t0].[id] AS [Id] .... FROM [Imports] AS [t0] WHERE [t0].[isActive] = 1
Say I wanted to perform some action in the select that cannot be converted to SQL. It's my understanding that the conventional way to accomplish this is to call AsEnumerable(), thus converting it to a workable object.
Given this updated code:
var test = from i in Imports.AsEnumerable()
where i.IsActive
select new
{
// Make some method call
};
And updated SQL:
SELECT [t0].[id] AS [Id] ... FROM [Imports] AS [t0]
Notice the lack of a where clause in the executed SQL statement.
Does this mean the entire "Imports" table is cached into memory?
Would this slow performance at all if the table contained a large number of records?
Help me to understand what is actually happening behind the scenes here.
The reason for AsEnumerable is this:
AsEnumerable<TSource>(IEnumerable<TSource>)
can be used to choose between query
implementations when a sequence
implements IEnumerable<T> but also has
a different set of public query
methods available
So when you were calling the Where method before, you were calling a different Where method from Enumerable.Where. That Where was the one LINQ to SQL converts to SQL; the new Where is the Enumerable one, which takes an IEnumerable, enumerates it and yields the matching items. That explains why you see different SQL being generated. In your second version of the code, the table is fetched in full from the database before the Where extension is applied. This could create a serious bottleneck, because the entire table has to be in memory or, worse, the entire table has to travel between servers. Let SQL Server execute the Where and do what it does best.
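One way to check which Where was bound is to look at the expression tree. The sketch below assumes an in-memory array wrapped with AsQueryable() in place of the Imports table, and OutermostCallOwner is a hypothetical helper. The Queryable version records its call in the tree, while the version after AsEnumerable() is no longer an IQueryable at all.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class WhereBindingDemo
{
    // Name of the class that declares the outermost method call in the query's expression tree.
    public static string OutermostCallOwner(IQueryable<int> query)
    {
        var call = (MethodCallExpression)query.Expression;
        return call.Method.DeclaringType.Name;
    }

    public static void Main()
    {
        // In-memory stand-in for the Imports table.
        IQueryable<int> imports = new[] { 1, 2, 3, 4 }.AsQueryable();

        IQueryable<int> translated = imports.Where(i => i > 2);   // Queryable.Where
        var local = imports.AsEnumerable().Where(i => i > 2);     // Enumerable.Where

        Console.WriteLine(OutermostCallOwner(translated)); // Queryable
        Console.WriteLine(local is IQueryable<int>);       // False
    }
}
```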
At the point where the sequence is enumerated, the database is queried and the entire result set is retrieved.
A part-and-part solution can be the way. Consider
var res = (
from result in SomeSource
where DatabaseConvertableCriterion(result)
&& NonDatabaseConvertableCriterion(result)
select new {result.A, result.B}
);
Let's say also that NonDatabaseConvertableCriterion requires field C from result. Because NonDatabaseConvertableCriterion does what its name suggests, this has to be performed as an enumeration. However, consider:
var partWay =
(
from result in SomeSource
where DatabaseConvertableCriterion(result)
select new {result.A, result.B, result.C}
);
var res =
(
from result in partWay.AsEnumerable()
where NonDatabaseConvertableCriterion(result)
select new {result.A, result.B}
);
In this case, when res is enumerated, queried or otherwise used, as much work as possible will be passed to the database, which will return enough to continue the job. Assuming that it is indeed really impossible to rewrite so that all the work can be sent to the database, this may be a suitable compromise.
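A runnable version of that split might look as follows. It is a sketch under assumptions: the Row class, the sample data, the A > 2 filter and the body of NonDatabaseConvertableCriterion are all hypothetical, and an in-memory AsQueryable() source plays the role of the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape with the three fields mentioned above.
public sealed class Row
{
    public int A { get; set; }
    public int B { get; set; }
    public int C { get; set; }
}

public static class PartWayDemo
{
    // Hypothetical check that only exists in application code.
    public static bool NonDatabaseConvertableCriterion(int c) => c % 3 == 0;

    public static List<int> Run(IQueryable<Row> someSource)
    {
        // Still an IQueryable: a provider could translate this filter and projection.
        var partWay = from result in someSource
                      where result.A > 2
                      select new { result.A, result.B, result.C };

        // AsEnumerable() switches the rest of the query to LINQ to Objects.
        var res = from result in partWay.AsEnumerable()
                  where NonDatabaseConvertableCriterion(result.C)
                  select result.A;

        return res.ToList();
    }

    public static void Main()
    {
        var rows = Enumerable.Range(1, 6)
            .Select(i => new Row { A = i, B = i * 2, C = i + 1 })
            .AsQueryable();

        Console.WriteLine(string.Join(",", Run(rows))); // 5
    }
}
```

Note that the first query selects C only because the local check needs it; the final projection can drop it again.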
There are three implementations of AsEnumerable.
DataTableExtensions.AsEnumerable
Extends a DataTable to give it an IEnumerable interface so you can use Linq against the DataTable.
Enumerable.AsEnumerable<TSource> and ParallelEnumerable.AsEnumerable<TSource>
The AsEnumerable<TSource>(IEnumerable<TSource>) method has no effect
other than to change the compile-time type of source from a type that
implements IEnumerable<T> to IEnumerable<T> itself.
AsEnumerable<TSource>(IEnumerable<TSource>) can be used to choose
between query implementations when a sequence implements
IEnumerable<T> but also has a different set of public query methods
available. For example, given a generic class Table that implements
IEnumerable<T> and has its own methods such as Where, Select, and
SelectMany, a call to Where would invoke the public Where method of
Table. A Table type that represents a database table could have a
Where method that takes the predicate argument as an expression tree
and converts the tree to SQL for remote execution. If remote execution
is not desired, for example because the predicate invokes a local
method, the AsEnumerable<TSource> method can be used to hide the
custom methods and instead make the standard query operators
available.
In other words.
If I have an
IQueryable<X> sequence = ...;
from a LinqProvider, like Entity Framework, and I do,
sequence.Where(x => SomeUnusualPredicate(x));
that query will be composed and run on the server. This will fail at runtime because Entity Framework doesn't know how to convert SomeUnusualPredicate into SQL.
If I want to run that statement with LINQ to Objects instead, I do,
sequence.AsEnumerable().Where(x => SomeUnusualPredicate(x));
Now the server will return all the data, and the Enumerable.Where from LINQ to Objects will be used instead of the query provider's implementation.
It won't matter that Entity Framework doesn't know how to interpret SomeUnusualPredicate; my function will be used directly. (However, this may be an inefficient approach since all rows will be returned from the server.)
I believe AsEnumerable just tells the compiler which extension methods to use (in this case the ones defined for IEnumerable instead of those for IQueryable).
The execution of the query is still deferred until you call ToArray or enumerate it.