LINQ - Speeding up query that has a join to a huge table - c#

I have these two tables:
ExpiredAccount      Account
--------------      ---------------
ExpiredAccountID    AccountID
AccountID (fk)      AccountName
...                 ...
Basically, I want to return a list of ExpiredAccounts displaying the AccountName in the result.
I currently do this using
var expiredAccounts = (from x in ExpiredAccount
join m in Account on x.AccountID equals m.AccountID
select m.AccountName).ToList();
This works, but it takes too long.
There are not many records in expiredAccounts (fewer than 200).
The Account table, on the other hand, has over 300,000 records.
Is there any way I could speed up my query, or another way to do this more efficiently, with or without LINQ?

Firstly, assuming you are using Entity Framework, you don't need to be using the join at all. You could simply do:
var expiredAccounts = (from x in ExpiredAccount
select x.Account.AccountName).ToList();
However, I don't think the two forms will generate different query plans on the database. My guess is that you don't have an index on AccountID in the Account table (although that seems unlikely).
One thing you can do is use ToTraceString (for example: http://social.msdn.microsoft.com/Forums/en-US/adodotnetentityframework/thread/4a17b992-05ca-4e3b-9910-0018e7cc9c8c/) to get the SQL that is being run. Then you can open SQL Server Management Studio and run that SQL with the execution plan option turned on. It will show you the execution plan and which indexes it suggests adding.

You can try using the Contains method:
var expiredAccounts = (from m in Account where ExpiredAccount.Select(x => x.AccountId)
.Contains(m.AccountId)
select m.AccountName).ToList();
It should generate an IN clause in the SQL query that is run against the database.

Related

A simple join consumes too much memory - LINQ

I have this join :
var andlist = (from cust in custFinal
join serv in db.Service on cust.ID equals serv.CustID
select new JoinObj
{
Name = cust.name,
ServiceID = serv.ID,
});
custFinal is a list of Customers that contains only one object. db.Service is a DbSet, and only four rows in the Service table have a CustID equal to that customer's ID. When I call ToList() or Count(), memory use quickly exceeds 1GB and I get an OutOfMemoryException. Can you tell me what is wrong with this code? Thanks in advance.
The reason is that you don't actually perform the join on the server. custFinal, as you said, is just an in-memory list, not a database table or query, so it is an IEnumerable, not an IQueryable. When you perform the join, it calls the IEnumerable.Join method, not IQueryable.Join. The latter would build a query, but the former just pulls all of its arguments into memory and performs the join there. As a result, the whole Service table is pulled into memory and joined there (easy to check if you log the EF context queries: you will see that it just runs a select-all query against Service).
Changing the order of the arguments so that IQueryable.Join is executed won't help either, because Entity Framework cannot join a database table with an in-memory list anyway. So you have to find another way, for example:
var ids = custFinal.Select(c => c.ID).ToArray();
var matchingServices = db.Service.Where(serv => ids.Contains(serv.CustID)).Select(c => new {c.ServiceID, c.CustID}).ToArray();
// now filter `custFinal` based on `matchingServices`, in memory.
That will perform a CustID IN (...) query instead of a join. If you insist on having a join, you will have to do it with raw SQL, without Entity Framework (on SQL Server you will also need to create a custom table type).
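The final in-memory step mentioned in the comment above could look roughly like the sketch below. It is only an illustration: the Customer, ServiceRow and JoinObj records and the InMemoryJoin.Combine helper are hypothetical stand-ins for the types in the question, and matchingServices is assumed to be the small array already fetched by the IN (...) query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the types used in the question.
public record Customer(int ID, string Name);
public record ServiceRow(int ServiceID, int CustID);
public record JoinObj(string Name, int ServiceID);

public static class InMemoryJoin
{
    // Joins two in-memory sequences with LINQ to Objects.
    // Nothing here touches the database: 'matchingServices' is assumed
    // to hold only the rows already filtered by the IN (...) query.
    public static List<JoinObj> Combine(
        IEnumerable<Customer> custFinal,
        IEnumerable<ServiceRow> matchingServices)
    {
        return (from cust in custFinal
                join serv in matchingServices on cust.ID equals serv.CustID
                select new JoinObj(cust.Name, serv.ServiceID)).ToList();
    }
}
```

Because both inputs are small by the time Combine runs, the join is cheap; the expensive part (filtering Service) has already happened on the server.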

Entity Framework 6 - MySQL Query Generates Unnecessary SQL

This is my first time using EF 6 as well as MySQL. I came across an annoyance while updating my LINQ statement from explicitly using joins to using navigation properties to fetch related data.
Here is the statement I'm executing to get a user and all the user's locations.
AspNetUsers.Include("UserLocations")
.Select(u => new {
FullName = u.FullName,
Locations = u.UserLocations.Select(l => l.Title)
})
This statement, using LinqPad4, generates the following SQL:
Why does it join using a select statement instead of doing a join on the table itself, and why does it add all the location columns to the join when the only column needed is the Title?
Wouldn't the following SQL query be better:
SELECT
u.FullName,
l.Title
FROM AspNetUsers u
JOIN UserLocations ul ON u.Id = ul.UserId
JOIN Locations l ON ul.LocationId = l.LocationId;
This is my first time using EF, and I have read that the SQL it generated in the past was not great. I was wondering whether this is just one of those cases or whether there is something I could do to minimize the SQL generated.
Thank you in advance!

EF and LINQ query to database speed

I am wondering whether I am wasting my time, or whether there is anything I could do to improve this query and so improve performance.
Inside a Repository, I am trying to get the 10 most recent items
public List<entity> GetTop10
{
get
{
return m_Context.entity.Where(x => x.bool == false).OrderByDescending(x => x.Created).Take(10).ToList();
}
}
But this is taking a long time, as the table it is querying has over 11000 rows in it. So my question is: is there any way I could speed up this kind of query?
I am trying to get my SQL hat on regarding performance. I know the ordering would slow it down, but how could I achieve the same result?
Thanks
The particular query you posted is a potential candidate for using a filtered index. Say you have a SQL table:
CREATE TABLE Employees
(
ID INT IDENTITY(1,1) PRIMARY KEY,
Name NVARCHAR(100),
IsAlive BIT
)
You can imagine that generally you only want to query on employees that have not (yet) died so will end up with SQL like this:
SELECT Name FROM Employees WHERE IsAlive = 1
So, why not create a filtered index:
CREATE INDEX IX_Employees_IsAliveTrue
ON Employees(IsAlive)
WHERE IsAlive = 1
So now if you query the table it will use this index which may only be a small portion of your table, especially if you've had a recent zombie invasion and 90% of your staff are now the walking dead.
However, an Entity Framework query like this:
var nonZombies = from e in db.Employees
where e.IsAlive == true
select e;
may not be able to use the index (SQL Server has a problem with filtered indexes and parameterised queries). To get round this, you can create a view in your database:
CREATE VIEW NonZombies
AS
SELECT ID, Name, IsAlive FROM Employees WHERE IsAlive = 1
Now you can add that to your framework (how you do this will vary depending on if you are using code/model/database first) and you will now be able to decide which employees deserve urgent attention (like priority access to food and weapons):
var nonZombies = from e in db.NonZombies
select e;
Your LINQ query will be translated into a SQL SELECT similar to this:
SELECT TOP(10) * FROM entity
WHERE bool = 0
ORDER BY Created DESC
Only similar, because instead of the '*' the generated query selects the concrete columns needed to map onto the entity object.
If this is too slow for you, the problem is in the database, not in Entity Framework.
So try adding some indexes to your table.

sql Top 1 vs System.Linq firstordefault

I am rewriting an SProc in C#. The problem is that the SProc contains a query like this:
select top 1 *
from ClientDebt
where ClinetID = 11234
order by Balance desc
For example, I have a client with 3 debts, all with the same balance. The debt ids are 1, 2 and 3.
The C# equivalent of that query is:
debts.OrderByDescending(d => d.Balance)
.FirstOrDefault()
debts represents the client's 3 debts.
The interesting part is that SQL returns the debt with Id 2, but the C# code returns Id 1.
Id 1 makes sense to me, but to keep the code's behaviour the same I need to change the C# code to return the middle one.
I am not sure what the logic behind SQL's TOP 1 is when several rows match the query.
The query will select one debt and update the database. I would like the LINQ query to return the same result as the SQL.
Thanks
debts.OrderByDescending(d => d.Balance).ThenByDescending(d => d.Id)
.FirstOrDefault()
You can start SQL Profiler, execute the stored procedure and review the result, then capture the query that the application sends through LINQ and review that result too.
You can also easily view the execution plan of your procedure and try to optimize it; with a LINQ query you cannot do this as easily.
AFAIK, if you select rows in SQL without ORDER BY, the result set often comes back in primary key order, but that order is not guaranteed.
Likewise, with ORDER BY [field], rows that tie on [field] come back in no guaranteed order, so add an explicit tie-breaker such as the primary key if you need a deterministic result.
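To make the tie-breaking idea from the ThenByDescending answer concrete, here is a small LINQ to Objects sketch. The Debt record and the TopDebt helper names are hypothetical, invented for illustration; only OrderByDescending, ThenByDescending and FirstOrDefault are real LINQ operators.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a debt row, for illustration only.
public record Debt(int Id, int ClientId, decimal Balance);

public static class TopDebt
{
    // No tie-breaker: LINQ to Objects sorting is stable, so among equal
    // balances the element that appears first in the input wins.
    public static Debt PickWithoutTieBreak(IEnumerable<Debt> debts) =>
        debts.OrderByDescending(d => d.Balance).FirstOrDefault();

    // Explicit tie-breaker: highest balance first, then highest Id.
    public static Debt PickWithTieBreak(IEnumerable<Debt> debts) =>
        debts.OrderByDescending(d => d.Balance)
             .ThenByDescending(d => d.Id)
             .FirstOrDefault();
}
```

With three debts (Ids 1, 2, 3) of equal balance supplied in Id order, the first method returns Id 1 and the second returns Id 3. Whichever secondary key you choose, putting the same key in the SQL ORDER BY is what keeps the two sides in agreement.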

Order by a field which is a Navigation Property to an Entity - Linq to Entity

I've got a scenario where I will need to order by on a column which is a navigation property for the Users entity inside my EF model.
The entities:
Users --> Countries 1:n relationship
A simple SQL query would be as follows:
SELECT UserId, u.Name, c.Name
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
So then I tried to replicate the above SQL query using Linq to Entities as follows - (Lazy Loading is enabled)
entities.users.OrderBy(field => field.country.Name).ToList();
But this query does not return my countries sorted by their name as the native SQL query above does.
However I continued a bit more and did the following:
var enumeratedUsers = entities.users.AsEnumerable();
users = enumeratedUsers.OrderBy(fields => fields.country.Name).ToList();
But ordering the enumeratedUsers object for about 50 records took approx. 7 seconds.
Is there a better way to do this that avoids the Enumerable and does not return an anonymous type?
Thanks
EDIT
I forgot to say that the EF provider is a MySQL one, not MS SQL. In fact, I just tried the same query on a replicated database in MS SQL and it works fine, i.e. the country name is ordered correctly. So it looks like I have no option other than getting the result set from MySQL and executing the order by in memory on the enumerable object.
var enumeratedUsers = entities.users.AsEnumerable();
users = enumeratedUsers.OrderBy(fields => fields.country.Name).ToList();
This is LINQ to Objects, not LINQ to Entities.
The OrderBy clause above calls the OrderBy defined in Enumerable.
That is, the ordering is done in memory, which is why it takes a long time.
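The switch from one OrderBy to the other can be observed without a database. The sketch below is only an illustration: the User record and BindingDemo helper are hypothetical, and a List wrapped with AsQueryable() stands in for the EF entity set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened user type, for illustration only.
public record User(string Name, string CountryName);

public static class BindingDemo
{
    // OrderBy on an IQueryable binds to Queryable.OrderBy, so the result
    // is still a query that a provider could translate to SQL.
    public static bool OrderedDirectlyIsQueryable(IQueryable<User> users)
    {
        object result = users.OrderBy(u => u.CountryName);
        return result is IQueryable<User>;
    }

    // After AsEnumerable(), OrderBy binds to Enumerable.OrderBy, so the
    // sort runs in memory over whatever the source yields.
    public static bool OrderedAfterAsEnumerableIsQueryable(IQueryable<User> users)
    {
        object result = users.AsEnumerable().OrderBy(u => u.CountryName);
        return result is IQueryable<User>;
    }
}
```

The first method reports true and the second false: once AsEnumerable() is in the chain, everything after it is ordinary in-memory LINQ, and with lazy loading enabled each access to a navigation property during the sort can trigger its own query.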
Edit
It looks like a MySQL related issue
You may try something like this.
var users = from user in entities.users
join country in entities.Country on user.CountryId equals country.Id
orderby country.Name
select user;
entities.users.OrderBy(field => field.country.Name).ToList();
But this query does not return my countries sorted by their name as the native
SQL query above does.
Yes, it does not return Countries but only Users sorted by the name of country.
When this query is executed, the following sql is sent to DB.
SELECT u.*
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
As you can see, the result does not include any fields of countries. Since you mentioned lazy loading, countries are loaded through it when needed, so they end up in the order in which you touch them through lazy loading. You can access countries through the Local property of an entity set.
This means that if you want users sorted by country name and also countries sorted by name, you need eager loading, as @Dennis mentioned:
entities.users.Include("country").OrderBy(field => field.country.Name).ToList();
This is converted to the following sql.
SELECT u.*, c.*
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
Have you tried using Include?
entities.users.Include("country").OrderBy(field => field.country.Name).ToList();
SOLUTION
Since both the Countries and Users tables have a column named Name, the MySQL Connector was generating this output when order by country.Name was executed:
SELECT `Extent1`.`Username`, `Extent1`.`Name`, `Extent1`.`Surname`, `Extent1`.`CountryId`
FROM `users` AS `Extent1` INNER JOIN `countries` AS `Extent2` ON `Extent1`.`CountryId` = `Extent2`.`CountryId`
ORDER BY `Name` ASC
This therefore orders by users.Name rather than countries.Name.
However, MySQL has released version 6.4.3 of the .NET connector, which resolves a bunch of issues, one of them being:
We are also including some SQL generation improvements related to our entity framework provider. Source: http://forums.mysql.com/read.php?3,425992
Thank you for all your input. I tried to be as clear as possible to help others who might encounter the same issue.
