I want to know which one is better for performance:
//Logical
var query = from i in db.Item
from c in db.Category
where i.FK_IdCategory == c.IdCategory
select new { ItemName = i.name, CategoryName = c.name };
or
//Join
var query2 = from i in db.Item
join c in db.Category
on i.FK_IdCategory equals c.IdCategory
select new { ItemName = i.name, CategoryName = c.name };
Performance of the two queries really depends on which LINQ provider and which RDBMS you're using. Assuming SQL Server, the first would generate the following query:
select i.name, c.name
from Item i, Category c
where i.FK_idCategory = c.IdCategory
Whereas the second would generate:
select i.name, c.name
from Item i
inner join Category c
on i.FK_idCategory = c.IdCategory
Both operate exactly the same in SQL Server, as explained in: Explicit vs implicit SQL joins
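To see that the two LINQ shapes describe the same result, here is a minimal sketch that runs both against in-memory lists. The Item and Category records and the sample rows are hypothetical stand-ins, and because this is LINQ to Objects it says nothing about the SQL a given provider emits; it only shows that the two forms return the same rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the Item and Category tables.
public record Item(string Name, int FK_IdCategory);
public record Category(int IdCategory, string Name);

public static class JoinStyles
{
    public static readonly List<Item> Items = new()
    {
        new Item("Hammer", 1),
        new Item("Apple", 2),
        new Item("Orphan", 99), // no matching category, dropped by both forms
    };

    public static readonly List<Category> Categories = new()
    {
        new Category(1, "Tools"),
        new Category(2, "Food"),
    };

    // Two from clauses filtered by where (the "logical" form).
    public static List<(string Item, string Category)> WhereStyle() =>
        (from i in Items
         from c in Categories
         where i.FK_IdCategory == c.IdCategory
         select (Item: i.Name, Category: c.Name)).ToList();

    // Explicit join: outer key left of equals, inner key right of it.
    public static List<(string Item, string Category)> JoinStyle() =>
        (from i in Items
         join c in Categories on i.FK_IdCategory equals c.IdCategory
         select (Item: i.Name, Category: c.Name)).ToList();
}

public static class Program
{
    public static void Main()
    {
        foreach (var row in JoinStyles.JoinStyle())
            Console.WriteLine($"{row.Item} / {row.Category}");

        // Compares the two result lists element by element.
        Console.WriteLine(JoinStyles.WhereStyle().SequenceEqual(JoinStyles.JoinStyle()));
    }
}
```

Against a real database the only reliable check is still to log the generated SQL for both forms and compare.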
This depends on the ORM you're using and how intelligent it is at optimizing your queries for your backend.
Entity Framework can generate some pretty awful SQL if you don't write your LINQ perfectly, so I'd assume query2 is better.
The only way for you to know for sure would be to inspect the SQL being generated by the two queries.
Eyeballing it, it looks like query1 would result in both tables being pulled in their entirety and then being filtered against each other in your application, while query2 will for sure generate an INNER JOIN in the query, which will let SQL Server do what it does best - set logic.
Is that FK_IdCategory field a member of an actual foreign key index on that table? If not, make it so (and include the name column as an included column in the index) and your query will be very highly performant.
With LINQ to SQL or Entity Framework, you would probably do something like this:
var query = from i in db.Item
select new {i.name, i.Category.Name};
This will generate a proper SQL inner join.
This assumes that a foreign key relationship between Item and Category is defined.
Related
I have two tables in SQL: Document and User. Document has a relation to User, and I want to get the users I have recently sent documents to.
I need to sort by the date the document was sent and get the unique (distinct) users related to those documents.
This is my LINQ query:
var recentClients = documentCaseRepository.Entities
.Where(docCase => docCase.AssignedByAgentId == WC.UserContext.UserId)
.OrderByDescending(userWithDate => userWithDate.LastUpdateDate)
.Take(1000) // Needed: if I comment out this line, EF generates a completely different SQL query.
.Select(doc => new { doc.AssignedToClient.Id, doc.AssignedToClient.FirstName, doc.AssignedToClient.LastName })
.Distinct()
.Take(configuration.MaxRecentClientsResults)
.ToList();
and the generated SQL query is:
SELECT DISTINCT TOP(5) [t].*
FROM (
SELECT TOP(1000) [docCase.AssignedToClient].[Id]
FROM [DocumentCase] AS [docCase]
INNER JOIN [User] AS [docCase.AssignedToClient]
ON ([docCase].[AssignedToClientId] = [docCase.AssignedToClient].[Id])
WHERE [docCase].[AssignedByAgentId] = 3
ORDER BY [docCase].[LastUpdateDate] DESC
)
AS [t]
Everything is correct so far. But if I delete this line
.Take(1000) // I need this because...
EF generates a completely different query:
SELECT DISTINCT TOP(5)
[docCase.AssignedToClient].[Id]
FROM [DocumentCase] AS [docCase]
INNER JOIN [User] AS [docCase.AssignedToClient]
ON ([docCase].[AssignedToClientId] = [docCase.AssignedToClient].[Id])
WHERE [docCase].[AssignedByAgentId] = 3
My question is: why did EF not generate the ORDER BY clause and the subquery with DISTINCT?
Is this an EF bug, or am I doing something wrong? And what must I do in LINQ to generate this SQL query?
SELECT DISTINCT TOP 5 [t].*
FROM ( SELECT [docCase.AssignedToClient].[Id]
FROM [DocumentCase] AS [docCase]
INNER JOIN [User] AS [docCase.AssignedToClient]
ON [docCase].[AssignedToClientId] = [docCase.AssignedToClient].[Id]
WHERE [docCase].[AssignedByAgentId] = 1
ORDER BY [docCase].[LastUpdateDate] DESC
) AS [t]
OrderBy information is not always retained across other operators such as Distinct. Entity Framework does not document (to my knowledge) exactly how OrderBy is propagated.
This kind of makes sense because some operators have undefined output order. The fact that ordering is retained in many situations is a convenience for the developer.
Move the OrderBy to the end of the query (or at least past the Distinct).
The reason for the difference in queries is that Distinct does not guarantee result order. So when you first execute OrderBy and then Distinct, you might just as well not execute OrderBy, because the order is lost anyway. So EF can just optimize it away.
Calling Take in between causes the result set to be semantically different: You first order the items, take the first 1000 items of that order and then call Distinct on them.
What you can change in your query depends mainly on the result you want to achieve. Maybe you want to first make the result set distinct, then order by date, and finally take the required number of items. Other options are also conceivable, depending on your requirements.
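One way to express "distinct first, then order, then take" is to group by client and order the groups by their latest date. This swaps Distinct for GroupBy, which is a different technique from the original query. The sketch below runs on in-memory data with a hypothetical DocumentCase record; whether a particular EF version translates the same shape into a single SQL statement is an assumption you would need to verify by inspecting the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down stand-in for the DocumentCase entity.
public record DocumentCase(int AssignedToClientId, DateTime LastUpdateDate);

public static class RecentClients
{
    // Returns up to maxResults distinct client ids, most recently updated first.
    public static List<int> Get(IEnumerable<DocumentCase> cases, int maxResults) =>
        cases
            .GroupBy(d => d.AssignedToClientId)              // one group per client
            .Select(g => new { ClientId = g.Key, Last = g.Max(d => d.LastUpdateDate) })
            .OrderByDescending(x => x.Last)                  // order after de-duplication
            .Take(maxResults)
            .Select(x => x.ClientId)
            .ToList();
}

public static class Program
{
    public static void Main()
    {
        var cases = new List<DocumentCase>
        {
            new DocumentCase(1, new DateTime(2020, 1, 1)),
            new DocumentCase(1, new DateTime(2020, 1, 10)),
            new DocumentCase(2, new DateTime(2020, 1, 12)),
            new DocumentCase(3, new DateTime(2020, 1, 5)),
        };

        foreach (var id in RecentClients.Get(cases, 2))
            Console.WriteLine(id);
    }
}
```

Because the ordering is applied after the grouping, there is no earlier OrderBy for a later operator to discard.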
I have this join:
var andlist = (from cust in custFinal
join serv in db.Service on cust.ID equals serv.CustID
select new JoinObj
{
Name = cust.name,
ServiceID = serv.ID,
});
custFinal is a list of Customers that contains only one object. db.Service is a DbSet, and there are only four rows in the Service table whose CustID equals the customer object's ID. When I use ToList() or Count(), memory use quickly exceeds 1GB and I get an OutOfMemoryException. Can you tell me what is wrong with this code? Thanks in advance.
The reason is that you don't really perform the join on the server. custFinal, as you said, is just an in-memory list, not a database table or query. So it is an IEnumerable, not an IQueryable. When you perform a join on it, it calls the IEnumerable Join method, not the IQueryable one. The latter would build a query, but the former just pulls all of its arguments into memory and performs the join there. As a result, the whole Service table is pulled into memory and joined there (easy to check if you log the EF context queries: you will see that it just performs a select-all query on Service).
If you change the order of arguments in a join so that IQueryable.Join would be executed - that won't help either, because you cannot join database table with in-memory list with Entity Framework anyway. So you have to find another way, for example:
var ids = custFinal.Select(c => c.ID).ToArray();
var matchingServices = db.Service.Where(serv => ids.Contains(serv.CustID)).Select(serv => new { ServiceID = serv.ID, serv.CustID }).ToArray();
// now filter `custFinal` based on `matchingServices`, in memory.
That will perform a CustID IN (...) query instead of a join. If you insist on having a join, you will have to do that with raw SQL, without Entity Framework (you will also need to create a custom table type in SQL Server, if you use SQL Server).
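The last in-memory step from the snippet above can be sketched as a plain LINQ to Objects join over the small result set. Customer, ServiceRow, and JoinObj here are hypothetical records for illustration, and the allServices list merely simulates what the database would hold; in real code the Contains filter would run on the server.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes for illustration only.
public record Customer(int ID, string Name);
public record ServiceRow(int ServiceID, int CustID);
public record JoinObj(string Name, int ServiceID);

public static class InMemoryJoin
{
    // Joins the already-filtered service rows to the customers in memory.
    public static List<JoinObj> Combine(
        IEnumerable<Customer> custFinal,
        IEnumerable<ServiceRow> matchingServices) =>
        (from cust in custFinal
         join serv in matchingServices on cust.ID equals serv.CustID
         select new JoinObj(cust.Name, serv.ServiceID)).ToList();
}

public static class Program
{
    public static void Main()
    {
        var custFinal = new List<Customer> { new Customer(7, "Acme") };

        // Simulates the Service table; only rows with CustID 7 should survive.
        var allServices = new List<ServiceRow>
        {
            new ServiceRow(100, 7), new ServiceRow(101, 7),
            new ServiceRow(102, 7), new ServiceRow(103, 7),
            new ServiceRow(200, 8),
        };

        var ids = custFinal.Select(c => c.ID).ToArray();
        var matchingServices = allServices.Where(s => ids.Contains(s.CustID)).ToArray();

        foreach (var row in InMemoryJoin.Combine(custFinal, matchingServices))
            Console.WriteLine($"{row.Name}: {row.ServiceID}");
    }
}
```

The in-memory join is cheap here because only the handful of matching rows ever leaves the database.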
This is my first time using EF 6 as well as MySQL. I came across an annoyance while updating my LINQ statement from explicitly using joins to using navigation properties to fetch related data.
Here is the statement I'm executing to get a user and all the user's locations.
AspNetUsers.Include("UserLocations")
.Select(u => new {
FullName = u.FullName,
Locations = u.UserLocations.Select(l => l.Title)
})
This statement, using LinqPad4, generates the following SQL:
Why does it join using a select statement instead of doing a join on the table itself, and why does it add all the location columns to the join when the only column needed is the Title?
Wouldn't the following SQL query be better:
SELECT
u.FullName,
l.Title
FROM AspNetUsers u
JOIN UserLocations ul ON u.Id = ul.UserId
JOIN Locations l ON ul.LocationId = l.LocationId;
This is my first time using EF, and I have read that in the past the generated SQL has not been so great. I was wondering if this is just one of those cases or if there is something I could do to minimize the SQL generated.
Thank you in advance!
I'm using Entity Framework 6 and dotConnect for Oracle, and I have these two queries:
First one, using a simple join (LINQ and Output SQL):
LINQ:
var joinQuery = Db.Products
.Join(Db.Product_Categories.AsEnumerable(), p => p.ProductID,
pc => pc.CategoryID, (pc, p) => new { pc, p })
.ToList();
Output SQL:
SELECT * FROM Products
Second, using Include:
LINQ:
var includeQuery = Db.Products.Include("Product_Categories").ToList();
Output SQL:
SELECT * FROM Products
LEFT OUTER JOIN Product_Categories
ON Products.CategoryID = Product_Categories.CategoryID
I am not sure whether I can always use the "Include" method for left joins. This method is not clear to me.
In the first example, the join should not have .AsEnumerable() on the end of it. By doing that, you cause EF to fetch all the records from Product_Categories and then do the join in memory, which can be very inefficient because it doesn't use any kind of index.
The second option you have isn't pure LINQ. Include is an EF-specific extension method that is not available in other providers.
So if you want common LINQ that you could use with other DB providers, go with option 1. If you want simpler syntax and are okay with being EF-specific, option 2 might be better.
I've got a scenario where I will need to order by on a column which is a navigation property for the Users entity inside my EF model.
The entities:
Users --> Countries 1:n relationship
A simple SQL query would be as follows:
SELECT UserId, u.Name, c.Name
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
So then I tried to replicate the above SQL query using Linq to Entities as follows - (Lazy Loading is enabled)
entities.users.OrderBy(field => field.country.Name).ToList();
But this query does not return my countries sorted by their name as the native SQL query above does.
However I continued a bit more and did the following:
var enumeratedUsers = entities.users.AsEnumerable();
users = enumeratedUsers.OrderBy(fields => fields.country.Name).ToList();
But ordering the enumeratedUsers object for about 50 records took approx. 7 seconds.
Is there a better way to do this without the Enumerable and without returning an anonymous type?
Thanks
EDIT
I forgot to say that the EF provider is a MySQL one, not MS SQL. In fact, I just tried the same query on a replicated database in MS SQL and it works fine, i.e. the results are ordered correctly by country name. So it looks like I have no option other than fetching the result set from MySQL and executing the order by in memory on the enumerable object.
var enumeratedUsers = entities.users.AsEnumerable();
users = enumeratedUsers.OrderBy(fields => fields.country.Name).ToList();
This is LINQ to Objects, not LINQ to Entities.
The OrderBy call above binds to the OrderBy defined in Enumerable.
That is, the ordering is done in memory, which is why it takes a long time.
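The switch from one OrderBy to the other is decided by the static type of the source, which a small sketch can make visible. It uses an array wrapped with AsQueryable as a stand-in for an EF set (an assumption for illustration, no database involved) and checks whether the ordered result is still an IQueryable.

```csharp
using System;
using System.Linq;

public static class BindingDemo
{
    // OrderBy on an IQueryable binds to Queryable.OrderBy and stays composable.
    public static bool StaysQueryable()
    {
        IQueryable<string> source = new[] { "b", "a" }.AsQueryable();
        object ordered = source.OrderBy(s => s);
        return ordered is IQueryable<string>;
    }

    // After AsEnumerable the same call binds to Enumerable.OrderBy (LINQ to Objects).
    public static bool StaysQueryableAfterAsEnumerable()
    {
        IQueryable<string> source = new[] { "b", "a" }.AsQueryable();
        object ordered = source.AsEnumerable().OrderBy(s => s);
        return ordered is IQueryable<string>;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(BindingDemo.StaysQueryable());
        Console.WriteLine(BindingDemo.StaysQueryableAfterAsEnumerable());
    }
}
```

With a real provider, everything after AsEnumerable runs in the application process, so the database only sees the part of the query that came before it.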
Edit
It looks like a MySQL related issue
You may try something like this.
var users = from user in entities.users
join country in entities.Country on user.CountryId equals country.Id
orderby country.Name
select user;
entities.users.OrderBy(field => field.country.Name).ToList();
But this query does not return my countries sorted by their name as the native
SQL query above does.
Yes, it does not return Countries but only Users sorted by the name of country.
When this query is executed, the following sql is sent to DB.
SELECT u.*
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
As you can see, the result does not include any fields of countries. Since you mentioned lazy loading, countries are loaded through it when needed. At that point, countries are ordered in the order you access them through lazy loading. You can access the loaded countries through the Local property of an entity set.
This means that if you want users sorted by country name and also countries sorted by name, you need eager loading, as @Dennis mentioned:
entities.users.Include("country").OrderBy(field => field.country.Name).ToList();
This is translated to the following SQL.
SELECT u.*, c.*
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
Have you tried using Include?
entities.users.Include("country").OrderBy(field => field.country.Name).ToList();
SOLUTION
Since both the Countries and Users tables have a column named Name, MySQL Connector generated this output when the order by country.Name was executed:
SELECT `Extent1`.`Username`, `Extent1`.`Name`, `Extent1`.`Surname`, `Extent1`.`CountryId`
FROM `users` AS `Extent1` INNER JOIN `countries` AS `Extent2` ON `Extent1`.`CountryId` = `Extent2`.`CountryId`
ORDER BY `Name` ASC
Therefore this results in ordering on users.Name rather than countries.Name.
However, MySQL has released version 6.4.3 of the .NET connector, which resolves a number of issues, one of them being:
We are also including some SQL generation improvements related to our entity framework provider. Source: http://forums.mysql.com/read.php?3,425992
Thank you for all your input. I tried to be as clear as possible to help others who might encounter the same issue.