What is the right way to enumerate using LINQ? - c#

Code:
var result = db.rows.Take(30).ToList().Select(a => AMethod(a));
db.rows.Take(30) is LINQ to SQL.
I am using ToList() to enumerate the results so that the rest of the query isn't translated to SQL.
What is the fastest way of doing that? ToArray()?

Use Enumerable.AsEnumerable:
var result = db.rows
.Take(30)
.AsEnumerable()
.Select(a => AMethod(a));

Use Enumerable.AsEnumerable() if you don't want to execute the database query immediately, because AsEnumerable() will still defer the database query execution until you start enumerating the LINQ to Objects query.
If you are sure that you will need the data and/or want to execute the database query immediately, use Enumerable.ToList() or Enumerable.ToArray(). The performance difference should not be too big.
I assume the rows are read into a variable-sized container at first in both calls, because the number of rows is not yet known. So I tend to say that ToList() could be a bit faster, because the rows can be read directly into the list, while ToArray() probably reads the rows into a kind of list first and then copies them to an array after all rows have been transferred.
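To illustrate the difference, here is a minimal sketch reusing the db.rows and AMethod names from the question; the only thing that changes is when the database round-trip happens:

// Deferred: no SQL is sent yet; the first 30 rows are fetched only when
// 'deferred' is enumerated (e.g. by the foreach below).
var deferred = db.rows
    .Take(30)
    .AsEnumerable()
    .Select(a => AMethod(a));

// Immediate: ToList() runs the SQL now and buffers the 30 rows in memory;
// the Select then runs as LINQ to Objects over that list.
var immediate = db.rows
    .Take(30)
    .ToList()
    .Select(a => AMethod(a));

foreach (var item in deferred)   // the query executes here
{
    Console.WriteLine(item);
}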

Related

How to bring C# AsQueryable() object into memory

I want to use Entity Framework to pull a database view into memory as a queryable object, because when I use ToList() my LINQ queries stop working and return a list of 0 elements.
Is it possible to do something like
var vwEntityList = dc.vwEntity.ToList().AsQueryable();
? I've tried this, but LINQ fails when I query the object. If I use just .AsQueryable() it works, but it is really slow because it queries the database for every operation. I want to avoid that and bring the EF view into memory as a queryable object.
When I have my list and try to use LINQ on it like
var newlist = vwEntityList.Where(x => x.deal_status == "B").ToList();
newlist is a list with 0 items, even though a similar query run directly against the database returns rows, and the same LINQ also works when I use just .AsQueryable().
Thanks.
Based on what you've provided, it sounds like this works (but is too slow for your liking):
var list = context.vwEntity.Where(x => x.deal_status == "B").ToList();
but this does not:
var totalList = context.vwEntity.ToList();
var list = totalList.Where(x => x.deal_status == "B").ToList();
The reason this would be the case is database collation. Some databases, such as SQL Server, use a case-insensitive comparison for strings by default. So if a record had a deal_status of "b", the first query would return that record because in SQL "B" == "b". However, once the data is loaded into memory, a LINQ to Objects Where clause does a case-sensitive comparison, so "B" <> "b". SQL Server databases can also be configured with case-sensitive collations, so it is dangerous to assume strings will always be treated case-insensitively.
LINQ expressions work against IEnumerable, so you don't need to force a List back into an IQueryable.
If you are working with LINQ to Objects, or against a case-sensitive database, you should make the case handling explicit where you want it rather than relying on collation:
// this should work...
var totalList = context.vwEntity.ToList();
var list = totalList.Where(x => x.deal_status.ToUpper() == "B").ToList();
That said, reading an entire table/view into memory and then querying against the objects should never really be "faster" than querying the entities, unless perhaps you need to run a lot of different queries and want to cache the table in memory. Querying across strings commonly hits performance snags if the fields being queried are not indexed. I would suggest getting familiar with a query profiler for whatever database you are running against, so you can capture the exact queries EF is executing and identify performance bottlenecks. When facing slow queries, I don't think I've ever recommended "load the entire 200k records into memory first" as a solution. :)
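If you'd rather avoid allocating upper-cased copies of every string, an explicit case-insensitive comparison works the same way in LINQ to Objects; a sketch reusing the vwEntity and deal_status names from the question:

var totalList = context.vwEntity.ToList();
var list = totalList
    .Where(x => string.Equals(x.deal_status, "B", StringComparison.OrdinalIgnoreCase))
    .ToList();

Note this form is fine once the data is in memory, but some LINQ providers cannot translate the StringComparison overloads to SQL, so keep it on the LINQ to Objects side.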

Linq Queries containing "AsEnumerable()" equivalence

Are these 2 queries functionally equivalent?
1)
var z=Categories
.Where(s=>s.CategoryName.Contains("a"))
.OrderBy(s => s.CategoryName).AsEnumerable()
.Select((x,i)=>new {x.CategoryName,Rank=i});
2)
var z=Categories.AsEnumerable()
.Where(s=>s.CategoryName.Contains("a"))
.OrderBy(s => s.CategoryName)
.Select((x,i)=>new {x.CategoryName,Rank=i});
I mean, does the position of "AsEnumerable()" in the query change the number of data items retrieved from the database, or the way they are retrieved?
Thank you for your help.
Are these 2 queries functionally equivalent?
If by equivalent you mean the final results, then probably yes (depending on how the provider implements those operations); the difference is that in the second query you are using the in-memory extensions.
I mean, does the position of "AsEnumerable()" in the query change the number of data items retrieved from the database, or the way they are retrieved?
Yes. In the first query, Where and OrderBy will be translated to SQL, and the Select will be executed in memory.
In your second query, all the data from the table is brought into memory and is then filtered and transformed in memory.
Categories is probably an IQueryable, so you will be using the extension methods on the Queryable class. Those overloads receive an Expression as a parameter, and those expression trees are what allow your code to be translated into SQL queries.
AsEnumerable() returns the object as an IEnumerable, so you will be using the extension methods on the Enumerable class, which execute directly in memory.
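For reference, the two Where overloads involved differ only in the type of predicate they accept, which is what determines whether a call can be translated to SQL or must run in memory:

// Queryable.Where receives an expression tree the provider can inspect and translate:
// public static IQueryable<TSource> Where<TSource>(
//     this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate);

// Enumerable.Where receives a compiled delegate and can only run in memory:
// public static IEnumerable<TSource> Where<TSource>(
//     this IEnumerable<TSource> source, Func<TSource, bool> predicate);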
Yes, they produce the same result but in different ways. The first query does the filtering and ordering in the SQL database itself; only the final Select runs in memory.
The second code segment fetches all the rows from the database and stores them in memory. It then filters, orders, and projects the fetched data in memory.
AsEnumerable() breaks the query into two parts (see the annotated query below):
The inside part (the query before AsEnumerable) is executed as LINQ to SQL.
The outside part (the query after AsEnumerable) is executed as LINQ to Objects.
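Applying that split to the first query from the question:

var z = Categories                                        // IQueryable — LINQ to SQL
    .Where(s => s.CategoryName.Contains("a"))             // translated to SQL (LIKE '%a%')
    .OrderBy(s => s.CategoryName)                         // translated to SQL (ORDER BY)
    .AsEnumerable()                                       // switch to LINQ to Objects here
    .Select((x, i) => new { x.CategoryName, Rank = i });  // indexed Select runs in memory

The placement is not accidental: the indexed overload of Select used here is typically not translatable by LINQ to SQL, so it has to run on the in-memory side anyway.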

Linq to entities is very slow using .Take() method

I have a table of 200,000 records from which I am getting only the top 10 using .Take(), but it is taking about 10 seconds to get the data.
My question is: does the .Take() method get all the data from the database and filter the top 10 on the client side?
Here is my code:
mylist = (from mytable in db.spdata().OrderByDescending(f => f.Weight)
          group mytable by mytable.id into g
          select g.FirstOrDefault()).Take(10).ToList();
spdata() is a function import for a stored procedure.
Thanks
The stored procedure probably returns a lot of data to the client, which is very slow. You cannot remote a query into a stored procedure; that would be possible with a view or a table-valued function.
There's no way to compose a stored procedure into a query. You can only execute it by itself.
Your intention probably was to execute the Take(10) on the server. For that to work, you need to switch to an inline query, a view, or a TVF.
The extension method Take does not fetch all the results from the database. That is not how Take works.
However, your db.spdata() call probably does fetch all rows.
I'm not 100% sure, but as I remember you get an IEnumerable result when you call a stored procedure using an EF DataContext...
There are a couple of ways to optimize the performance:
Pass the search criteria as stored procedure parameters and do the filtering in the stored procedure.
Or, if the query in the stored procedure is quite simple, with no variables declared and just some tables joined, then:
Create an indexed view that specifies the query you need and call the Take method on it.
What does this give you? You can map to the created view, and EF will then return an IQueryable result rather than an IEnumerable. This optimizes the SQL command: instead of receiving all of the data and then taking the 10 elements you need, a SQL command that retrieves just those 10 rows is generated (sketched below).
I also advise you to look at the difference between IEnumerable and IQueryable.
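As a rough sketch of that idea, assuming the view has been mapped into the model as a hypothetical vwWeightedRows entity set, the whole chain stays IQueryable and Take(10) ends up in the generated SQL:

var mylist = db.vwWeightedRows               // mapped view, exposed as IQueryable
    .OrderByDescending(f => f.Weight)        // translated to ORDER BY
    .Take(10)                                // translated to TOP (10)
    .ToList();                               // only 10 rows cross the wire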
It does, because you are sorting the data before grouping, which is not possible to do in SQL.
You should use an aggregate to get the highest weight from each group, then sort the weights to get the ten largest:
mylist = (
    from mytable in db.spdata()
    group mytable by mytable.id into g
    select g.Max(f => f.Weight)
).OrderByDescending(w => w).Take(10).ToList();

Convert IEnumerable<T> to List<T> Performance

I have a web service that returns List<SampleClass>. I take data from the database using LINQ to SQL as IEnumerable<SampleClass>. I then convert the IEnumerable<SampleClass> to List<SampleClass>.
The performance of the LINQ to SQL operation is OK, but converting IEnumerable<SampleClass> to List<SampleClass> takes some time. Are there any solutions to get the best performance for this conversion?
I read more than 3000 records from my database.
Thank you
IEnumerable to List takes some time to do the operation.
The reason you are seeing the delay is that ToList is the point at which the actual query gets executed and fetches records from the database.
This is called deferred execution.
var query = db.yourTable.Where(r => r.ID > 10);
var list = query.ToList(); // this is where the actual query gets executed
Whenever you materialize or iterate the query, using ToList(), ToArray(), Count(), etc., that is when the actual query gets executed.
Are there any solutions to get best performance from IEnumerable to List? I read more than 3000 records from my database.
Without improving the query itself, no, you can't. But do you really need to fetch 3000 records? You may look into paging using Skip and Take, as in the sketch below.
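A minimal paging sketch, reusing the hypothetical yourTable / ID names from the earlier snippet (pageIndex and pageSize are illustrative values):

int pageIndex = 0;   // zero-based page number
int pageSize = 50;   // rows per page

var page = db.yourTable
    .Where(r => r.ID > 10)
    .OrderBy(r => r.ID)              // Skip/Take need a deterministic order
    .Skip(pageIndex * pageSize)
    .Take(pageSize)
    .ToList();                       // only one page is fetched from the database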

NHibernate - Equivalent of CountDistinct projection using LINQ

I'm in the midst of trying to replace the Criteria queries I'm using for a multi-field search page with LINQ queries using the new LINQ provider. However, I'm running into a problem getting record counts so that I can implement paging. I'm trying to achieve a result equivalent to that produced by a CountDistinct projection from the Criteria API, using LINQ. Is there a way to do this?
The Distinct() method provided by LINQ doesn't seem to behave the way I would expect, and appending ".Distinct().Count()" to the end of a LINQ query grouped by the field I want a distinct count of (an integer ID column) seems to return a non-distinct count of those values.
I can provide the code I'm using if needed, but since there are so many fields, it's pretty long, so I didn't want to crowd the post if it wasn't needed.
Thanks!
I figured out a way to do this, though it may not be optimal in all situations. Just doing a .Distinct() on the LINQ query does, in fact, produce a "distinct" in the resulting SQL query when used without .Count(). If I cause the query to be enumerated by using .Distinct().ToList() and then use the .Count() method on the resulting in-memory collection, I get the result I want.
This is not exactly equivalent to what I was originally doing with the Criteria query, since the counting is actually being done in the application code, and the entire list of IDs must be sent from the DB to the application. In my case, though, given the small number of distinct IDs, I think it will work, and won't be too much of a performance bottleneck.
I do hope, however, that a true CountDistinct() LINQ operation will be implemented in the future.
You could try selecting the column you want a distinct count of first. It would look something like: Select(p => p.id).Distinct().Count(). As it stands, you're applying Distinct() to the entire object, which compares object references rather than the actual values.
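A minimal sketch of that suggestion with NHibernate's LINQ provider (session.Query<T> comes from NHibernate.Linq; the SearchResult entity, its Id property, and the Where filter are assumed placeholder names):

using NHibernate.Linq;

// ...

var distinctCount = session.Query<SearchResult>()
    .Where(r => r.Name.Contains("a"))   // whatever search filters apply
    .Select(r => r.Id)                  // project to the key column first
    .Distinct()
    .Count();                           // intended to translate to COUNT(DISTINCT Id) on the server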
