A simple join consumes too much memory - LINQ - c#

I have this join:
var andlist = (from cust in custFinal
               join serv in db.Service on cust.ID equals serv.CustID
               select new JoinObj
               {
                   Name = cust.name,
                   ServiceID = serv.ID,
               });
custFinal is a list of Customers that contains only one object. db.Service is a DbSet, and the Service table has only four rows whose CustID equals that customer's ID. When I call ToList() or Count(), memory use quickly exceeds 1 GB and I get an OutOfMemoryException. Can you tell me what is wrong with this code? Thanks in advance.

The reason is that the join is not performed on the server. custFinal, as you said, is an in-memory list, not a database table or query, so it is an IEnumerable, not an IQueryable. When you join on it, the call binds to Enumerable.Join rather than Queryable.Join. The latter would build a query; the former enumerates both sequences in memory and joins them there. As a result, the whole Service table is pulled into memory and joined on the client. This is easy to check by logging the EF context's queries: you will see a plain select of every row from Service.
If you swap the order of the arguments so that Queryable.Join is used instead, that won't help either, because Entity Framework cannot join a database table with an in-memory list. So you have to find another way, for example:
var ids = custFinal.Select(c => c.ID).ToArray();
var matchingServices = db.Service
    .Where(serv => ids.Contains(serv.CustID))
    .Select(serv => new { ServiceID = serv.ID, serv.CustID })
    .ToArray();
// now filter `custFinal` based on `matchingServices`, in memory.
That performs a CustID IN (...) query instead of a join. If you insist on a join, you will have to do it with raw SQL, without Entity Framework (on SQL Server you will also need to create a custom table type).
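The final in-memory step mentioned in the comment above could look roughly like the sketch below. It assumes the matching rows have already been fetched by the IN (...) query. The Customer, ServiceRow and JoinObj records and the InMemoryJoin.Combine helper are hypothetical stand-ins for illustration, not types from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's entities (assumed shapes).
public record Customer(int ID, string Name);
public record ServiceRow(int ServiceID, int CustID);
public record JoinObj(string Name, int ServiceID);

public static class InMemoryJoin
{
    // Joins the small customer list with service rows that were ALREADY
    // fetched from the database, so both sides are small in-memory sequences
    // and Enumerable.Join is cheap here.
    public static List<JoinObj> Combine(
        IEnumerable<Customer> custFinal,
        IEnumerable<ServiceRow> matchingServices)
    {
        return (from cust in custFinal
                join serv in matchingServices on cust.ID equals serv.CustID
                select new JoinObj(cust.Name, serv.ServiceID)).ToList();
    }
}
```

Because only the rows matching the IN (...) filter reach the client, the in-memory join works on a handful of objects instead of the whole Service table.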

Related

C# ToList inside LINQ

I'm working with LINQ and I wonder what the difference is between the two snippets below. The result seems to be the same, but does using ToList() inside the query, as in students1, make one more access to the database?
var students1 = (from stud in dbContext.Students.Where(s => s.LastName == "Doe").ToList()
                 join cls in dbContext.Classes
                 on ...).ToList();
var students2 = (from stud in dbContext.Students.Where(s => s.LastName == "Doe")
                 join cls in dbContext.Classes
                 on ...).ToList();
ToList materializes the query. When accessing the database you are working with IQueryable, so your join and where clauses are passed to the database. If you materialize the queryable by calling ToList first, the database no longer processes the join; instead, you join the in-memory data with LINQ to Objects rather than with LINQ to SQL.
LINQ to SQL is a facility for managing and accessing relational data as objects. It connects to a database, converts LINQ constructs into SQL, submits the SQL, transforms the results into objects, and even tracks changes and automatically requests database updates.
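The change in binding can be observed without a database. The sketch below is only an illustration under assumptions: AsQueryable() over in-memory lists stands in for dbContext.Students and dbContext.Classes, and the Student and SchoolClass records and the ClassId key are hypothetical. It shows that an inner ToList() makes the join bind to Enumerable.Join, while leaving it out keeps the whole query an IQueryable that a provider could translate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shapes, assumed for illustration only.
public record Student(string LastName, int ClassId);
public record SchoolClass(int Id, string Title);

public static class JoinBindingDemo
{
    // In-memory queryables standing in for the DbSets from the question.
    static readonly IQueryable<Student> Students =
        new List<Student> { new Student("Doe", 1), new Student("Roe", 2) }.AsQueryable();
    static readonly IQueryable<SchoolClass> Classes =
        new List<SchoolClass> { new SchoolClass(1, "Math"), new SchoolClass(2, "Art") }.AsQueryable();

    // Outer source is a List<Student>, so the join binds to Enumerable.Join
    // and the result is a plain in-memory sequence.
    public static IEnumerable<string> WithInnerToList() =>
        from stud in Students.Where(s => s.LastName == "Doe").ToList()
        join cls in Classes on stud.ClassId equals cls.Id
        select cls.Title;

    // Outer source is still IQueryable<Student>, so the join binds to
    // Queryable.Join and the result remains an IQueryable<string>.
    public static IEnumerable<string> WithoutInnerToList() =>
        from stud in Students.Where(s => s.LastName == "Doe")
        join cls in Classes on stud.ClassId equals cls.Id
        select cls.Title;
}
```

Checking `JoinBindingDemo.WithoutInnerToList() is IQueryable<string>` yields true, while the same check on WithInnerToList() yields false, even though both enumerate the same titles.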

Is it possible to get result from joined SQL query from several tables in C# with Entity Framework?

I am using .Net Core with C# and Entity Framework. I need to get data for reporting and it requires joins with several tables.
My question is, what is the best way to call the SQL query and pass the result to a client application?
If the number of tables used is less than 3 or 4, we can use the EF join to get the data from multiple tables.
Sample LINQ join:
var q = (from pd in dataContext.tblProducts
         join od in dataContext.tblOrders on pd.ProductID equals od.ProductID
         join ct in dataContext.tblCustomers
             on new { a = od.CustomerID, b = od.ContactNo } equals new { a = ct.CustID, b = ct.ContactNo }
         orderby od.OrderID
         select new
         {
             od.OrderID,
             pd.ProductID,
             pd.Name,
             pd.UnitPrice,
             od.Quantity,
             od.Price,
             Customer = ct.Name // name this anonymous-type property Customer
         }).ToList();
If more tables than that are involved, I suggest implementing a stored procedure and using it to get the data. You can still use EF to call the stored procedure and bind the data to any custom DTO class.

LINQ Logical join VS inner join

I want to know which one is better for performance:
//Logical
var query = from i in db.Item
            from c in db.Category
            where i.FK_IdCategory == c.IdCategory
            select new { ItemName = i.name, CategoryName = c.name };
or
//Join
var query2 = from i in db.Item
             join c in db.Category
             on i.FK_IdCategory equals c.IdCategory
             select new { ItemName = i.name, CategoryName = c.name };
Performance of the two queries really depends on which LINQ provider and which RDBMS you're using. Assuming SQL Server, the first would generate the following query:
select i.name, c.name
from Item i, Category c
where i.FK_IdCategory = c.IdCategory
Whereas the second would generate:
select i.name, c.name
from Item i
inner join Category c
    on i.FK_IdCategory = c.IdCategory
These behave exactly the same in SQL Server, as explained in "Explicit vs implicit SQL joins".
This depends on the ORM you're using and how intelligent it is at optimizing your queries for your backend.
Entity Framework can generate some pretty awful SQL if you don't do your linq perfectly, so I'd assume query2 is better.
The only way for you to know for sure would be to inspect the SQL being generated by the two queries.
Eyeballing it, it looks like query1 would result in both tables being pulled in their entirety and then being filtered against each other in your application, while query2 will for sure generate an INNER JOIN in the query, which will let SQL Server do what it does best - set logic.
Is that FK_IdCategory field a member of an actual foreign key index on that table? If not, make it so (and include the name column as an included column in the index) and your query will be very highly performant.
With LINQ to SQL or Entity Framework, you would probably do something like this:
var query = from i in db.Item
            select new { i.name, i.Category.Name };
This generates a proper SQL inner join, assuming a foreign key relationship between Item and Category is defined.

LINQ - Speeding up query that has a join to a huge table

I have these two tables
ExpiredAccount        Account
--------------        ---------------
ExpiredAccountID      AccountID
AccountID (fk)        AccountName
...                   ...
Basically, I want to return a list of ExpiredAccounts displaying the AccountName in the result.
I currently do this using:
var expiredAccounts = (from x in ExpiredAccount
                       join m in Account on x.AccountID equals m.AccountID
                       select m.AccountName).ToList();
This works fine. However, it takes too long.
There are not many records in ExpiredAccount (<200).
The Account table, on the other hand, has over 300,000 records.
Is there any way I could speed up my query, or alternatively, another way to do this more efficiently, with or without LINQ?
Firstly, assuming you are using Entity Framework, you don't need the join at all. You could simply do:
var expiredAccounts = (from x in ExpiredAccount
                       select x.Account.AccountName).ToList();
However, I don't think the two will generate different query plans on the database. My guess is that you don't have an index on AccountID in the Account table (although that seems unlikely).
One thing you can do is use ToTraceString (for example: http://social.msdn.microsoft.com/Forums/en-US/adodotnetentityframework/thread/4a17b992-05ca-4e3b-9910-0018e7cc9c8c/) to get the SQL that is being run. Then you can open SQL Server Management Studio and run it with the execution plan option turned on; it will show you the execution plan and which indexes would improve it.
You can try using the Contains method:
var expiredAccounts = (from m in Account
                       where ExpiredAccount.Select(x => x.AccountID)
                                           .Contains(m.AccountID)
                       select m.AccountName).ToList();
It should generate an IN clause in the SQL query that is run against the database.

Order by a field which is a Navigation Property to an Entity - Linq to Entity

I've got a scenario where I will need to order by on a column which is a navigation property for the Users entity inside my EF model.
The entities:
Users --> Countries 1:n relationship
A simple SQL query would be as follows:
SELECT UserId, u.Name, c.Name
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
So I tried to replicate the above SQL query using LINQ to Entities as follows (lazy loading is enabled):
entities.users.OrderBy(field => field.country.Name).ToList();
But this query does not return my countries sorted by their name as the native SQL query above does.
However, I went a bit further and did the following:
var enumeratedUsers = entities.users.AsEnumerable();
users = enumeratedUsers.OrderBy(fields => fields.country.Name).ToList();
But ordering the enumeratedUsers object for about 50 records took approximately 7 seconds.
Is there a better way to do this that avoids the enumerable and does not return an anonymous type?
Thanks
EDIT
I forgot to say that the EF provider is a MySQL one, not an MS SQL one. In fact, I just tried the same query on a replicated database in MS SQL and it works fine, i.e. the results are ordered correctly by country name. So it looks like I have no option other than getting the result set from MySQL and executing the order by in memory on the enumerable object.
var enumeratedUsers = entities.users.AsEnumerable();
users = enumeratedUsers.OrderBy(fields => fields.country.Name).ToList();
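This is LINQ to Objects, not LINQ to Entities.
The OrderBy clause above calls the OrderBy defined in Enumerable, so the ordering is done in memory after all users have been loaded. With lazy loading enabled, each access to fields.country can also trigger a separate query, which is why it takes a long time.
Edit
It looks like a MySQL-related issue.
You may try something like this: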
var users = from user in entities.users
join country in entities.Country on user.CountryId equals country.Id
orderby country.Name
select user;
entities.users.OrderBy(field => field.country.Name).ToList();
But this query does not return my countries sorted by their name as the native
SQL query above does.
Yes, it does not return countries, only users sorted by the name of their country.
When this query is executed, the following SQL is sent to the database:
SELECT u.*
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
As you can see, the result does not include any fields of countries. Since you mentioned lazy loading, the countries are loaded through it when needed, so they end up in the order in which you access them. You can access the loaded countries through the Local property of an entity set.
This means that if you want the users sorted by country name and also the countries sorted by name, you need eager loading, as @Dennis mentioned:
entities.users.Include("country").OrderBy(field => field.country.Name).ToList();
This is converted to the following sql.
SELECT u.*, c.*
FROM users u join countries c on u.CountryId = c.CountryId
ORDER BY c.Name asc;
Have you tried using Include?
entities.users.Include("country").OrderBy(field => field.country.Name).ToList();
SOLUTION
Since both the Countries and Users tables have a column named Name, MySQL Connector generated this output when the order by country.Name was executed:
SELECT `Extent1`.`Username`, `Extent1`.`Name`, `Extent1`.`Surname`, `Extent1`.`CountryId`
FROM `users` AS `Extent1` INNER JOIN `countries` AS `Extent2` ON `Extent1`.`CountryId` = `Extent2`.`CountryId`
ORDER BY `Name` ASC
This results in ordering on users.Name rather than countries.Name.
However, MySQL has released version 6.4.3 of the .NET connector, which resolves a number of issues, one of them being:
We are also including some SQL generation improvements related to our entity framework provider. Source: http://forums.mysql.com/read.php?3,425992
Thank you for all your input. I tried to be as clear as possible to help others who might encounter the same issue.
