This is my first time using EF 6 as well as MySQL. I came across an annoyance while updating my LINQ statement from explicitly using joins to using navigation properties to fetch related data.
Here is the statement I'm executing to get a user and all the user's locations.
AspNetUsers.Include("UserLocations")
.Select(u => new {
FullName = u.FullName,
Locations = u.UserLocations.Select(l => l.Title)
})
This statement, using LinqPad4, generates the following SQL:
Why does it join using a select statement instead of doing a join on the table itself, and why does it add all the location columns to the join when the only column needed is the Title?
Wouldn't the following SQL query be better:
SELECT
u.FullName,
l.Title
FROM AspNetUsers u
JOIN UserLocations ul ON u.Id = ul.UserId
JOIN Locations l ON ul.LocationId = l.LocationId;
This is my first time using EF, and I have read that the SQL it generated in the past was not great. I was wondering whether this is just one of those cases or whether there is something I could do to minimize the SQL generated.
Thank you in advance!
Related
I am using .Net Core with C# and Entity Framework. I need to get data for reporting and it requires joins with several tables.
My question is, what is the best way to call the SQL query and pass the result to a client application?
If the number of tables used is less than 3 or 4, we can use the EF join to get the data from multiple tables.
Sample linq Join
var q=(from pd in dataContext.tblProducts
join od in dataContext.tblOrders on pd.ProductID equals od.ProductID
join ct in dataContext.tblCustomers
on new {a=od.CustomerID,b=od.ContactNo} equals new {a=ct.CustID,b=ct.ContactNo}
orderby od.OrderID
select new {
od.OrderID,
pd.ProductID,
pd.Name,
pd.UnitPrice,
od.Quantity,
od.Price,
Customer = ct.Name // expose the customer's name as the Customer property
}).ToList();
If more tables than that are involved, I suggest you implement a stored procedure and use it to get the data. You can still use EF to call the stored procedure and bind the data to any custom DTO class.
I'm using Entity Framework 6 and DotConnect for Oracle, and I have these two queries:
First one, using a simple join (LINQ and Output SQL):
LINQ:
var joinQuery = Db.Products
.Join(Db.Product_Categories.AsEnumerable(), p => p.ProductID,
pc => pc.CategoryID, (pc, p) => new { pc, p })
.ToList();
Output SQL:
SELECT * FROM Products
Second, using Include:
LINQ:
var includeQuery = Db.Products.Include("Product_Categories").ToList();
Output SQL:
SELECT * FROM Products
LEFT OUTER JOIN Product_Categories
ON Products.CategoryID = Product_Categories.CategoryID
I am not sure whether I can always use the "Include" method for left joins. This method is not clear to me.
In the first example, the join should not have .AsEnumerable() on the end of it. With it, EF fetches all the records from Product_Categories and then does the join in memory, which can be very inefficient because it cannot use any index.
The second option you have isn't pure LINQ. Include is an EF-specific extension method that is not available in other providers.
So if you want common LINQ that you could use with other DB providers, go with option 1. If you want simpler syntax and are okay with being EF-specific, option 2 might be better.
I want to know which one is better for performance:
//Logical
var query = from i in db.Item
from c in db.Category
where i.FK_IdCategory == c.IdCategory
select new { ItemName = i.name, CategoryName = c.name };
or
//Join
var query2 = from i in db.Item
join c in db.Category
on i.FK_IdCategory equals c.IdCategory
select new { ItemName = i.name, CategoryName = c.name };
Performance of the two queries really depends on which LINQ provider and which RDBMS you're using. Assuming SQL Server, the first would generate the following query:
select i.name, c.name
from Item i, Category c
where i.FK_idCategory = c.IdCategory
Whereas the second would generate:
select i.name, c.name
from Item i
inner join Category c
on i.FK_idCategory = c.IdCategory
These two queries operate exactly the same in SQL Server, as explained in: Explicit vs implicit SQL joins
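To make the equivalence of the two LINQ forms concrete, here is a minimal sketch that runs both against in-memory lists (LINQ to Objects). The Item and Category classes and their sample rows are hypothetical stand-ins, not the asker's model, and the sketch says nothing about the SQL a given provider emits. It only shows that the two query shapes describe the same result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinFormsDemo
{
    // Hypothetical in-memory stand-ins for the Item and Category tables.
    public sealed class Item
    {
        public string Name { get; set; }
        public int FK_IdCategory { get; set; }
    }

    public sealed class Category
    {
        public int IdCategory { get; set; }
        public string Name { get; set; }
    }

    private static readonly List<Item> Items = new List<Item>
    {
        new Item { Name = "Pen", FK_IdCategory = 1 },
        new Item { Name = "Apple", FK_IdCategory = 2 },
        new Item { Name = "Pencil", FK_IdCategory = 1 }
    };

    private static readonly List<Category> Categories = new List<Category>
    {
        new Category { IdCategory = 1, Name = "Stationery" },
        new Category { IdCategory = 2, Name = "Food" }
    };

    // Implicit form: two from clauses filtered by a where clause.
    public static List<string> ImplicitJoin()
    {
        return (from i in Items
                from c in Categories
                where i.FK_IdCategory == c.IdCategory
                select i.Name + "/" + c.Name).ToList();
    }

    // Explicit form: a join clause with the outer key on the left of equals.
    public static List<string> ExplicitJoin()
    {
        return (from i in Items
                join c in Categories on i.FK_IdCategory equals c.IdCategory
                select i.Name + "/" + c.Name).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ImplicitJoin()));
        Console.WriteLine(string.Join(", ", ExplicitJoin()));
    }
}
```

Both methods return the same three pairs here. Whether the database treats the generated SQL identically is a separate question that only the generated SQL and its execution plan can answer.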
This depends on the ORM you're using and how intelligent it is at optimizing your queries for your backend.
Entity Framework can generate some pretty awful SQL if you don't do your linq perfectly, so I'd assume query2 is better.
The only way for you to know for sure would be to inspect the SQL being generated by the two queries.
Eyeballing it, it looks like query1 would result in both tables being pulled in their entirety and then being filtered against each other in your application, while query2 will for sure generate an INNER JOIN in the query, which will let SQL Server do what it does best - set logic.
Is that FK_IdCategory field a member of an actual foreign key index on that table? If not, make it so (and include the name column as an included column in the index) and your query will be very highly performant.
With linq2Sql or EntityFramework, you would probably do something like this:
var query = from i in db.Item
select new {i.name, i.Category.Name};
This will generate a proper SQL inner join.
I do assume that there is a foreign key relation between Item and Category defined.
I have these two tables
ExpiredAccount      Account
--------------      ---------------
ExpiredAccountID    AccountID
AccountID (fk)      AccountName
...                 ...
Basically, I want to return a list of ExpiredAccounts displaying the AccountName in the result.
I currently do this using
var expiredAccounts = (from x in ExpiredAccount
join m in Account on x.AccountID equals m.AccountID
select m.AccountName).ToList();
This works fine. However, this takes too long.
There's not a lot of records in expiredAccounts (<200).
The Account table, on the other hand, has over 300,000 records.
Is there any way I could speed up my query, or alternatively, another way to do this more efficiently with or without using LINQ?
Firstly, assuming you are using Entity Framework, you don't need to be using the join at all. You could simply do:
var expiredAccounts = (from x in ExpiredAccount
select x.Account.AccountName).ToList();
However, I don't think they will generate a different query plan on the database. But my guess is that you don't have an index on AccountID in the Account table (although that seems unlikely).
One thing you can do is use ToTraceString (for example: http://social.msdn.microsoft.com/Forums/en-US/adodotnetentityframework/thread/4a17b992-05ca-4e3b-9910-0018e7cc9c8c/) to get the SQL which is being run. Then you can open SQL Management Studio and run that with the execution plan option turned on and it will show you what the execution plan was and what indexes need to be added to make it better.
You can try using Contains method:
var expiredAccounts = (from m in Account
where ExpiredAccount.Select(x => x.AccountID)
.Contains(m.AccountID)
select m.AccountName).ToList();
It should generate an IN clause in the SQL query that is run against the database.
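As a rough check that the Contains form asks for the same account names as the join, the sketch below runs both against small in-memory lists. The classes and rows are made up for illustration and nothing here touches a database, so it does not show the IN clause itself. Note one difference in shape: the join yields one row per expired record, while Contains yields one row per matching account.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsVersusJoinDemo
{
    // Hypothetical in-memory stand-ins for the two tables in the question.
    public sealed class Account
    {
        public int AccountID { get; set; }
        public string AccountName { get; set; }
    }

    public sealed class ExpiredAccount
    {
        public int ExpiredAccountID { get; set; }
        public int AccountID { get; set; }
    }

    private static readonly List<Account> Accounts = new List<Account>
    {
        new Account { AccountID = 1, AccountName = "Alpha" },
        new Account { AccountID = 2, AccountName = "Beta" },
        new Account { AccountID = 3, AccountName = "Gamma" }
    };

    private static readonly List<ExpiredAccount> Expired = new List<ExpiredAccount>
    {
        new ExpiredAccount { ExpiredAccountID = 10, AccountID = 3 },
        new ExpiredAccount { ExpiredAccountID = 11, AccountID = 1 }
    };

    // Join form: driven by the expired rows, so results follow their order.
    public static List<string> ViaJoin()
    {
        return (from x in Expired
                join m in Accounts on x.AccountID equals m.AccountID
                select m.AccountName).ToList();
    }

    // Contains form: driven by the accounts, filtered by membership in the expired IDs.
    public static List<string> ViaContains()
    {
        return (from m in Accounts
                where Expired.Select(x => x.AccountID).Contains(m.AccountID)
                select m.AccountName).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ViaJoin()));
        Console.WriteLine(string.Join(", ", ViaContains()));
    }
}
```

The two lists contain the same names in a different order, because each query iterates a different source. If an account could appear in ExpiredAccount more than once, the join would repeat its name and the Contains form would not.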
I've got a scenario where I will need to order by on a column which is a navigation property for the Users entity inside my EF model.
The entities:
Users --> Countries 1:n relationship
A simple SQL query would be as follows:
SELECT UserId, u.Name, c.Name
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
So then I tried to replicate the above SQL query using Linq to Entities as follows - (Lazy Loading is enabled)
entities.users.OrderBy(field => field.country.Name).ToList();
But this query does not return my countries sorted by their name as the native SQL query above does.
However I continued a bit more and did the following:
var enumeratedUsers = entities.users.AsEnumerable();
users = enumeratedUsers.OrderBy(fields => fields.country.Name).ToList();
But ordering on the enumeratedUsers object for about 50 records took approx. 7 seconds.
Is there a better way to do this that avoids the Enumerable and does not return an anonymous type?
Thanks
EDIT
I forgot to say that the EF provider is a MySQL one, not MS SQL. In fact, I just tried the same query on a replicated database in MS SQL and it works fine, i.e. the country name is ordered correctly. So it looks like I have no option other than getting the result set from MySQL and executing the order by in memory on the enumerable object.
var enumeratedUsers = entities.users.AsEnumerable();
users = enumeratedUsers.OrderBy(fields => fields.country.Name).ToList();
This is LINQ to Objects, not LINQ to Entities.
The OrderBy clause above calls the OrderBy defined in Enumerable.
That is, the ordering is done in memory, after the rows (and, with lazy loading, each user's country) have been fetched. Hence it takes a long time.
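The binding rule behind this can be sketched without a database. In the example below, a List wrapped with AsQueryable() plays the role of the entity set, and the User class with its flattened CountryName property is a hypothetical stand-in for the real model. The only thing it demonstrates is which OrderBy the compiler picks based on the static type of the source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderByBindingDemo
{
    // Hypothetical stand-in for the users entity; not the asker's actual model.
    public sealed class User
    {
        public string Name { get; set; }
        public string CountryName { get; set; }
    }

    // An in-memory IQueryable that plays the role of entities.users.
    public static IQueryable<User> Users()
    {
        return new List<User>
        {
            new User { Name = "Ana", CountryName = "Malta" },
            new User { Name = "Bo", CountryName = "Chile" },
            new User { Name = "Cy", CountryName = "Italy" }
        }.AsQueryable();
    }

    // OrderBy on an IQueryable binds to Queryable.OrderBy, so the result is
    // still a query that a provider could translate.
    public static bool StaysQueryable()
    {
        object ordered = Users().OrderBy(u => u.CountryName);
        return ordered is IQueryable<User>;
    }

    // After AsEnumerable() the static type is IEnumerable<User>, so OrderBy
    // binds to Enumerable.OrderBy and the sort runs in memory.
    public static bool StaysQueryableAfterAsEnumerable()
    {
        object ordered = Users().AsEnumerable().OrderBy(u => u.CountryName);
        return ordered is IQueryable<User>;
    }

    public static void Main()
    {
        Console.WriteLine(StaysQueryable());
        Console.WriteLine(StaysQueryableAfterAsEnumerable());
    }
}
```

StaysQueryable() returns true and StaysQueryableAfterAsEnumerable() returns false. With a real EF context the second shape means every user row is materialized first and then sorted by the application.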
Edit
It looks like a MySQL related issue
You may try something like this.
var users = from user in entities.users
join country in entities.Country on user.CountryId equals country.Id
orderby country.Name
select user;
entities.users.OrderBy(field => field.country.Name).ToList();
But this query does not return my countries sorted by their name as the native
SQL query above does.
Yes, it does not return Countries but only Users sorted by the name of country.
When this query is executed, the following sql is sent to DB.
SELECT u.*
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
As you can see, the result does not include any fields of countries. Since lazy loading is enabled, as you mentioned, countries are loaded through it when needed. At that point, countries are ordered in the order in which you access them through lazy loading. You can access the loaded countries through the Local property of an entity set.
This means that if you want users sorted by the name of their country and also countries sorted by name, you need eager loading, as @Dennis mentioned:
entities.users.Include("country").OrderBy(field => field.country.Name).ToList();
This is converted to the following sql.
SELECT u.*, c.*
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
Have you tried using Include?
entities.users.Include("country").OrderBy(field => field.country.Name).ToList();
SOLUTION
Since both the Countries and Users tables have a column named Name, MySQL Connector generated this output when the order by country.Name was executed:
SELECT `Extent1`.`Username`, `Extent1`.`Name`, `Extent1`.`Surname`, `Extent1`.`CountryId`
FROM `users` AS `Extent1` INNER JOIN `countries` AS `Extent2` ON `Extent1`.`CountryId` = `Extent2`.`CountryId`
ORDER BY `Name` ASC
Therefore, this results in ordering on users.Name rather than countries.Name.
However, MySQL has released version 6.4.3 of the .NET connector, which resolves a number of issues, one of them being:
We are also including some SQL generation improvements related to our entity framework provider. Source: http://forums.mysql.com/read.php?3,425992
Thank you for all your input. I tried to be as clear as possible to help others who might encounter the same issue.