C# extension methods types

C# extension methods types - c#

In C#, There are two sets of extension methods (such as Where) which works on top of IEnumerable and IQueryable. which of them used for manipulate in-memory objects and which is used for working with database.? Please help

IEnumerable<T>.Where method extension.which accepts a Func<TSource, bool> predicate parameter,
forcing the filtering to happen in memory.
While querying data from the database,IEnumerable executes select query on the server-side,
loads data in-memory on the client-side and then filters the data.
For the IEnumerable<T> case, it will be LINQ-to-object, meaning that all objects matching the original query will have to be loaded into memory from the database.
IEnumerable is suitable for querying data from in-memory collections like List, Array and so on.
IQueryable<T>.Where method extension, which accepts a Expression<Func<TSource, bool>> predicate parameter.
Notice that this is an expression, not a delegate, which allows it to translate the where condition to
a database condition.
While querying data from a database, IQueryable executes a select query on server-side with all filters.
The difference is that IQueryable<T> is the interface that allows LINQ-to-SQL (LINQ.-to-anything really) to work.
So if you further refine your query on an IQueryable<T>, that query will be executed in the database, if possible.
IQueryable is suitable for querying data from out-memory (like remote database, service) collections.

IEnumerable<T> exposes the enumerator, which supports a simple iteration over a collection of a specified type. so it's for in-memory objects
IQueryable<T> provides functionality to evaluate queries against a specific data source wherein the type of data is known. so it's for working with database

The answer of Reza Jenabi says the most important part.
I just want to add that sometimes it's very similar in code and the error may take some time to notice. If you write a method like :
public List<Thing> GetList(Func<Thing, bool> filter)
{
return dbContext.Things.Where(filter).ToList();
}
You're "filter" won't be translated to SQL. All the content of the Things table will be loaded in memory, then only the filter on loaded data will apply.
Not noticeable on an almost empty table, but awfully costly on a table with thousands of records.
So take care to always use Expression<Func> when you want it to be executed on SQL server side :
public List<Thing> GetList(Expression<Func<Thing, bool>> filter)
{
return dbContext.Things.Where(filter).ToList();
}
Knowing that both methods will be called exactly the same way, for example :
Repository.GetList(t => t.IsMyThing || t.WhateverYouWant);

Related

Combining C# Expressions with ADO.NET Datatables

I have generic queries to which the return type can be different. Because of this I cannot use TVFs and thus I am using Datatables.
I also want to extend the query of the datatable.
I am trying to do this in the following way:
var data = GetDataTable($"SELECT * FROM {tablename}").AsEnumerable().AsQueryable().GetFilteredList(filters);
The following is the function definition of GetFilteredList:
IQueryable<T> GetFilteredList<T>(this IQueryable<T> items, List<PostedFilter> filters)
The logic within the functions GetDataTable and GetFilteredList is correct as they are already in use for mutltiple years. They are however used seperately, as they come from different libraries. The filters parameter contains strings which are the names of properties of the queries. This way the query can be expanded before being executed. This works perfectly fine with static mvc queries on typed EDMX objects.
This code however, does not work for my Datatable. It does not generate any errors, but the filters do not reduce the data either. (I assume this is, among other reasons, because the query is materialized before the AsQueryable function is called)
Does anyone know a way in which I can create the logic which I am trying to implement? (By this I mean creating one large query that is only materialized after the query has been built up completely)

Could it be that GetDataTable().AsEnumarable() returns a list of DataRow's, while the GetFilteredList is typed to an entity class of some sort?
If that's the case then you should somehow convert your 'filters' to a string in the form of WHERE (Filter1 = 'value1') AND ... and put it in your SQL string you're passing to GetDataTable().

Force Entity Framework 6.1 to use exists instead of populating entire child object graph [duplicate]

What is the difference between returning IQueryable<T> vs. IEnumerable<T>, when should one be preferred over the other?
IQueryable<Customer> custs = from c in db.Customers
where c.City == "<City>"
select c;
IEnumerable<Customer> custs = from c in db.Customers
where c.City == "<City>"
select c;
Will both be deferred execution and when should one be preferred over the other?

Yes, both will give you deferred execution.
The difference is that IQueryable<T> is the interface that allows LINQ-to-SQL (LINQ.-to-anything really) to work. So if you further refine your query on an IQueryable<T>, that query will be executed in the database, if possible.
For the IEnumerable<T> case, it will be LINQ-to-object, meaning that all objects matching the original query will have to be loaded into memory from the database.
In code:
IQueryable<Customer> custs = ...;
// Later on...
var goldCustomers = custs.Where(c => c.IsGold);
That code will execute SQL to only select gold customers. The following code, on the other hand, will execute the original query in the database, then filtering out the non-gold customers in the memory:
IEnumerable<Customer> custs = ...;
// Later on...
var goldCustomers = custs.Where(c => c.IsGold);
This is quite an important difference, and working on IQueryable<T> can in many cases save you from returning too many rows from the database. Another prime example is doing paging: If you use Take and Skip on IQueryable, you will only get the number of rows requested; doing that on an IEnumerable<T> will cause all of your rows to be loaded in memory.

The top answer is good but it doesn't mention expression trees which explain "how" the two interfaces differ. Basically, there are two identical sets of LINQ extensions. Where(), Sum(), Count(), FirstOrDefault(), etc all have two versions: one that accepts functions and one that accepts expressions.
The IEnumerable version signature is: Where(Func<Customer, bool> predicate)
The IQueryable version signature is: Where(Expression<Func<Customer, bool>> predicate)
You've probably been using both of those without realizing it because both are called using identical syntax:
e.g. Where(x => x.City == "<City>") works on both IEnumerable and IQueryable
When using Where() on an IEnumerable collection, the compiler passes a compiled function to Where()
When using Where() on an IQueryable collection, the compiler passes an expression tree to Where(). An expression tree is like the reflection system but for code. The compiler converts your code into a data structure that describes what your code does in a format that's easily digestible.
Why bother with this expression tree thing? I just want Where() to filter my data.
The main reason is that both the EF and Linq2SQL ORMs can convert expression trees directly into SQL where your code will execute much faster.
Oh, that sounds like a free performance boost, should I use AsQueryable() all over the place in that case?
No, IQueryable is only useful if the underlying data provider can do something with it. Converting something like a regular List to IQueryable will not give you any benefit.

Yes, both use deferred execution. Let's illustrate the difference using the SQL Server profiler....
When we run the following code:
MarketDevEntities db = new MarketDevEntities();
IEnumerable<WebLog> first = db.WebLogs;
var second = first.Where(c => c.DurationSeconds > 10);
var third = second.Where(c => c.WebLogID > 100);
var result = third.Where(c => c.EmailAddress.Length > 11);
Console.Write(result.First().UserName);
In SQL Server profiler we find a command equal to:
"SELECT * FROM [dbo].[WebLog]"
It approximately takes 90 seconds to run that block of code against a WebLog table which has 1 million records.
So, all table records are loaded into memory as objects, and then with each .Where() it will be another filter in memory against these objects.
When we use IQueryable instead of IEnumerable in the above example (second line):
In SQL Server profiler we find a command equal to:
"SELECT TOP 1 * FROM [dbo].[WebLog] WHERE [DurationSeconds] > 10 AND [WebLogID] > 100 AND LEN([EmailAddress]) > 11"
It approximately takes four seconds to run this block of code using IQueryable.
IQueryable has a property called Expression which stores a tree expression which starts being created when we used the result in our example (which is called deferred execution), and at the end this expression will be converted to an SQL query to run on the database engine.

Both will give you deferred execution, yes.
As for which is preferred over the other, it depends on what your underlying datasource is.
Returning an IEnumerable will automatically force the runtime to use LINQ to Objects to query your collection.
Returning an IQueryable (which implements IEnumerable, by the way) provides the extra functionality to translate your query into something that might perform better on the underlying source (LINQ to SQL, LINQ to XML, etc.).

A lot has been said previously, but back to the roots, in a more technical way:
IEnumerable is a collection of objects in memory that you can enumerate - an in-memory sequence that makes it possible to iterate through (makes it way easy for within foreach loop, though you can go with IEnumerator only). They reside in the memory as is.
IQueryable is an expression tree that will get translated into something else at some point with ability to enumerate over the final outcome. I guess this is what confuses most people.
They obviously have different connotations.
IQueryable represents an expression tree (a query, simply) that will be translated to something else by the underlying query provider as soon as release APIs are called, like LINQ aggregate functions (Sum, Count, etc.) or ToList[Array, Dictionary,...]. And IQueryable objects also implement IEnumerable, IEnumerable<T> so that if they represent a query the result of that query could be iterated. It means IQueryable don't have to be queries only. The right term is they are expression trees.
Now how those expressions are executed and what they turn to is all up to so called query providers (expression executors we can think them of).
In the Entity Framework world (which is that mystical underlying data source provider, or the query provider) IQueryable expressions are translated into native T-SQL queries. Nhibernate does similar things with them. You can write your own one following the concepts pretty well described in LINQ: Building an IQueryable Provider link, for example, and you might want to have a custom querying API for your product store provider service.
So basically, IQueryable objects are getting constructed all the way long until we explicitly release them and tell the system to rewrite them into SQL or whatever and send down the execution chain for onward processing.
As if to deferred execution it's a LINQ feature to hold up the expression tree scheme in the memory and send it into the execution only on demand, whenever certain APIs are called against the sequence (the same Count, ToList, etc.).
The proper usage of both heavily depends on the tasks you're facing for the specific case. For the well-known repository pattern I personally opt for returning IList, that is IEnumerable over Lists (indexers and the like). So it is my advice to use IQueryable only within repositories and IEnumerable anywhere else in the code. Not saying about the testability concerns that IQueryable breaks down and ruins the separation of concerns principle. If you return an expression from within repositories consumers may play with the persistence layer as they would wish.
A little addition to the mess :) (from a discussion in the comments))
None of them are objects in memory since they're not real types per se, they're markers of a type - if you want to go that deep. But it makes sense (and that's why even MSDN put it this way) to think of IEnumerables as in-memory collections whereas IQueryables as expression trees. The point is that the IQueryable interface inherits the IEnumerable interface so that if it represents a query, the results of that query can be enumerated. Enumeration causes the expression tree associated with an IQueryable object to be executed.
So, in fact, you can't really call any IEnumerable member without having the object in the memory. It will get in there if you do, anyways, if it's not empty. IQueryables are just queries, not the data.

In general terms I would recommend the following:
Return IQueryable<T> if you want to enable the developer using your method to refine the query you return before executing.
Return IEnumerable if you want to transport a set of Objects to enumerate over.
Imagine an IQueryable as that what it is - a "query" for data (which you can refine if you want to). An IEnumerable is a set of objects (which has already been received or was created) over which you can enumerate.

In general you want to preserve the original static type of the query until it matters.
For this reason, you can define your variable as 'var' instead of either IQueryable<> or IEnumerable<> and you will know that you are not changing the type.
If you start out with an IQueryable<>, you typically want to keep it as an IQueryable<> until there is some compelling reason to change it. The reason for this is that you want to give the query processor as much information as possible. For example, if you're only going to use 10 results (you've called Take(10)) then you want SQL Server to know about that so that it can optimize its query plans and send you only the data you'll use.
A compelling reason to change the type from IQueryable<> to IEnumerable<> might be that you are calling some extension function that the implementation of IQueryable<> in your particular object either cannot handle or handles inefficiently. In that case, you might wish to convert the type to IEnumerable<> (by assigning to a variable of type IEnumerable<> or by using the AsEnumerable extension method for example) so that the extension functions you call end up being the ones in the Enumerable class instead of the Queryable class.

There is a blog post with brief source code sample about how misuse of IEnumerable<T> can dramatically impact LINQ query performance: Entity Framework: IQueryable vs. IEnumerable.
If we dig deeper and look into the sources, we can see that there are obviously different extension methods are perfomed for IEnumerable<T>:
// Type: System.Linq.Enumerable
// Assembly: System.Core, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
// Assembly location: C:\Windows\Microsoft.NET\Framework\v4.0.30319\System.Core.dll
public static class Enumerable
{
public static IEnumerable<TSource> Where<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
{
return (IEnumerable<TSource>)
new Enumerable.WhereEnumerableIterator<TSource>(source, predicate);
}
}
and IQueryable<T>:
// Type: System.Linq.Queryable
// Assembly: System.Core, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
// Assembly location: C:\Windows\Microsoft.NET\Framework\v4.0.30319\System.Core.dll
public static class Queryable
{
public static IQueryable<TSource> Where<TSource>(
this IQueryable<TSource> source,
Expression<Func<TSource, bool>> predicate)
{
return source.Provider.CreateQuery<TSource>(
Expression.Call(
null,
((MethodInfo) MethodBase.GetCurrentMethod()).MakeGenericMethod(
new Type[] { typeof(TSource) }),
new Expression[]
{ source.Expression, Expression.Quote(predicate) }));
}
}
The first one returns enumerable iterator, and the second one creates query through the query provider, specified in IQueryable source.

The main difference between “IEnumerable” and “IQueryable” is about where the filter logic is executed. One executes on the client side (in memory) and the other executes on the database.
For example, we can consider an example where we have 10,000 records for a user in our database and let's say only 900 out which are active users, so in this case if we use “IEnumerable” then first it loads all 10,000 records in memory and then applies the IsActive filter on it which eventually returns the 900 active users.
While on the other hand on the same case if we use “IQueryable” it will directly apply the IsActive filter on the database which directly from there will return the 900 active users.

I would like to clarify a few things due to seemingly conflicting responses (mostly surrounding IEnumerable).
(1) IQueryable extends the IEnumerable interface. (You can send an IQueryable to something which expects IEnumerable without error.)
(2) Both IQueryable and IEnumerable LINQ attempt lazy loading when iterating over the result set. (Note that implementation can be seen in interface extension methods for each type.)
In other words, IEnumerables are not exclusively "in-memory". IQueryables are not always executed on the database. IEnumerable must load things into memory (once retrieved, possibly lazily) because it has no abstract data provider. IQueryables rely on an abstract provider (like LINQ-to-SQL), although this could also be the .NET in-memory provider.
Sample use case
(a) Retrieve list of records as IQueryable from EF context. (No records are in-memory.)
(b) Pass the IQueryable to a view whose model is IEnumerable. (Valid. IQueryable extends IEnumerable.)
(c) Iterate over and access the data set's records, child entities and properties from the view. (May cause exceptions!)
Possible Issues
(1) The IEnumerable attempts lazy loading and your data context is expired. Exception thrown because provider is no longer available.
(2) Entity Framework entity proxies are enabled (the default), and you attempt to access a related (virtual) object with an expired data context. Same as (1).
(3) Multiple Active Result Sets (MARS). If you are iterating over the IEnumerable in a foreach( var record in resultSet ) block and simultaneously attempt to access record.childEntity.childProperty, you may end up with MARS due to lazy loading of both the data set and the relational entity. This will cause an exception if it is not enabled in your connection string.
Solution
I have found that enabling MARS in the connection string works unreliably. I suggest you avoid MARS unless it is well-understood and explicitly desired.
Execute the query and store results by invoking resultList = resultSet.ToList() This seems to be the most straightforward way of ensuring your entities are in-memory.
In cases where the you are accessing related entities, you may still require a data context. Either that, or you can disable entity proxies and explicitly Include related entities from your DbSet.

I recently ran into an issue with IEnumerable v. IQueryable. The algorithm being used first performed an IQueryable query to obtain a set of results. These were then passed to a foreach loop, with the items instantiated as an Entity Framework (EF) class. This EF class was then used in the from clause of a Linq to Entity query, causing the result to be IEnumerable.
I'm fairly new to EF and Linq for Entities, so it took a while to figure out what the bottleneck was. Using MiniProfiling, I found the query and then converted all of the individual operations to a single IQueryable Linq for Entities query. The IEnumerable took 15 seconds and the IQueryable took 0.5 seconds to execute. There were three tables involved and, after reading this, I believe that the IEnumerable query was actually forming a three table cross-product and filtering the results.
Try to use IQueryables as a rule-of-thumb and profile your work to make your changes measurable.

We can use both for the same way, and they are only different in the performance.
IQueryable only executes against the database in an efficient way. It means that it creates an entire select query and only gets the related records.
For example, we want to take the top 10 customers whose name start with ‘Nimal’. In this case the select query will be generated as select top 10 * from Customer where name like ‘Nimal%’.
But if we used IEnumerable, the query would be like select * from Customer where name like ‘Nimal%’ and the top ten will be filtered at the C# coding level (it gets all the customer records from the database and passes them into C#).

In addition to first 2 really good answers (by driis & by Jacob) :
IEnumerable
interface is in the System.Collections namespace.
The IEnumerable object represents a set of data in memory and can move on this data only forward. The query represented by the IEnumerable object is executed immediately and completely, so the application receives data quickly.
When the query is executed, IEnumerable loads all the data, and if we need to filter it, the filtering itself is done on the client side.
IQueryable interface is located in the System.Linq namespace.
The IQueryable object provides remote access to the database and allows you to navigate through the data either in a direct order from beginning to end, or in the reverse order. In the process of creating a query, the returned object is IQueryable, the query is optimized. As a result, less memory is consumed during its execution, less network bandwidth, but at the same time it can be processed slightly more slowly than a query that returns an IEnumerable object.
What to choose?
If you need the entire set of returned data, then it's better to use IEnumerable, which provides the maximum speed.
If you DO NOT need the entire set of returned data, but only some filtered data, then it's better to use IQueryable.

In addition to the above, it's interesting to note that you can get exceptions if you use IQueryable instead of IEnumerable:
The following works fine if products is an IEnumerable:
products.Skip(-4);
However if products is an IQueryable and it's trying to access records from a DB table, then you'll get this error:
The offset specified in a OFFSET clause may not be negative.
This is because the following query was constructed:
SELECT [p].[ProductId]
FROM [Products] AS [p]
ORDER BY (SELECT 1)
OFFSET #__p_0 ROWS
and OFFSET can't have a negative value.

LINQ to Entities - Entity Framework

I'm looking to get a better understanding on when we should look to use IEnumerable over IQueryablewith LINQ to Entities.
With really basic calls to the database, IQueryable is way quicker, but when do i need to think about using an IEnumerable in its place?
Where is an IEnumerable optimal over an IQueryable??

Basically, IQueryables are executed by a query provider (for example a database) and some operations cannot be or should not be done by the database. For example, if you want to call a C# function (here as an example, capitalize a name correctly) using a value you got from the database you may try something like;
db.Users.Select(x => Capitalize(x.Name)) // Tries to make the db call Capitalize.
.ToList();
Since the Select is executed on an IQueryable, and the underlying database has no idea about your Capitalize function, the query will fail. What you can do instead is to get the correct data from the database and convert the IQueryable to an IEnumerable (which is basically just a way to iterate through collections in-memory) to do the rest of the operation in local memory, as in;
db.Users.Select(x => x.Name) // Gets only the name from the database
.AsEnumerable() // Do the rest of the operations in memory
.Select(x => Capitalize(x)) // Capitalize in memory
.ToList();
The most important thing when it comes to performance of IQueryable vs. IEnumerable from the side of EF, is that you should always try to filter the data using an IQueryable to get as little data as possible to convert to an IEnumerable. What the AsEnumerable call basically does is to tell the database "give me the data as it is filtered now", and if you didn't filter it, you'll get everything fetched to memory, even data you may not need.

IEnumerable represents a sequence of elements which you enumerate one by one until you find the answer you need, so for example if I wanted all entities that had some property greater than 10, I'd need to go through each one in turn and return only those that matched. Pulling every row of a database table into memory in order to do this would not maybe be a great idea.
IQueryable on the other hand represents a set of elements on which operations like filtering can be deferred to the underlying data source, so in the filtering case, if I were to implement IQueryable on top of a custom data source (or use LINQ to Entities!) then I could give the hard work of filtering / grouping etc to the data source (e.g. a database).
The major downside of IQueryable is that implementing it is pretty hard - queries are constructed as Expression trees which as the implementer you then have to parse in order to resolve the query. If you're not planning to write a provider though then this isn't going to hurt you.
Another aspect of IQueryable that it's worth being aware of (although this is really just a generic caveat about passing processing off to another system that may make different assumptions about the world) is that you may find things like string comparison work in the manner they are supported in the source system, not in the manner they are implemented by the consumer, e.g. if your source database is case-insensitive but your default comparison in .NET is case-sensitive.

Why are Func<> and Expression<Func<>> Interchangeable? Why does one work in my case?

I have a data access class that took me a while to get working. For my app, I need to get different types of SQL Server tables where the WHERE clause only differs by the column name: some columns are read_time, others are ReadTime, and others are LastModifiedTime. So I thought I'd pass in the WHERE clause so I didn't need to create a new method for 50 different tables. It looks simple, and it works, but I don't understand something.
This method, with Expression<> as the parameter, works:
internal List<T> GetObjectsGreaterThanReadTime<T>(Expression<Func<T, bool>> whereClause) where T : class
{
Table<T> table = this.Database.GetTable<T>();
IEnumerable<T> objects = table.Where(whereClause);
return objects.ToList();
}
Now, I was trying it this way (below) for a while, and it would just hang on the last line (ToList()). First, why would this compile? I mean, why can Expression and Func be used interchangeably as a parameter? Then, why does Expression work, and the Func version just hangs?
Note: The only difference between the above method and this one is the method parameter (Expression vs. Func).
internal List<T> GetObjectsGreaterThanReadTime<T>(Func<T, bool> whereClause) where T : class
{
Table<T> table = this.Database.GetTable<T>();
IEnumerable<T> objects = table.Where(whereClause);
return objects.ToList();
}

The Expression version calls Queryable.Where which generates an expression tree, which (when enumerated by ToList) is translated to sql and executed on the database server. Presumably, the database server will avail itself of an index based on the filter criteria, to avoid reading the whole table.
The Func version calls Enumerable.Where which (when enumerated by ToList) loads the whole table (what you perceive as a hang) and then runs the filter criteria against the in-memory objects.

Find the Name of Objects referenced in IQueryable and IEnumerable in ENtity Framework

I am creating a generic function that have the function definition as:
public static List<T> Func_IEnumerable<T>(this IEnumerable<T> q, ObjectContext dc, string CacheId)
and one function as
public static List<T> Func_IQueryable<T>(this IQueryable<T> q, ObjectContext dc, string CacheId)
The question is that I want to find out the tables name,procedure name,function name and/or view name referenced in the Ienumerable or IQueryable Query
Is it possible with the Linq framework
And if not then we may convert the IEnumerable into System.Data.Objects.ObjectQuery and finally using ToTraceString to get the pure SQL.
Now from Pure Sql can we get the object names.
Whether Sql Server has some functions to do the same if not, then how should I parse it to get desired results.
Thanks,
Any help is appreciated.

If you want to use SqlCacheDependency (featrue heavily dependent on SQL) why are you using EF (feature which tries to hide SQL as much as possible) in the first place? You are combining two features which were not designed to work together - that happens quite often in .NET framework.
IQueryable can be converted to ObjectQuery only if the object is ObjectQuery = it was created as query on exposed ObejctSet and you have never called ToList, AsEnumerable or other executing method on it. This also means that calling to stored procedure cannot be converted to ObjectQuery.
Real IEnumerable (executed query) cannot be converted to ObjectQuery and you cannot get any information about source of the result set from IEnumerable. This is case for all queries where ToList or AsEnumerable was called and for example for all calls to mapped stored procedures.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.