I'm trying to write a LINQ-to-entities query that will take an ICollection navigation property of my main object and attach some metadata to each of them which is determined through joining each of them to another DB table and using an aggregate function. So the main object is like this:
public class Plan
{
...
public virtual ICollection<Room> Rooms { get; set; }
}
And my query is this:
var roomData = (
from rm in plan.Rooms
join conf in context.Conferences on rm.Id equals conf.RoomId into cjConf
select new {
RoomId = rm.Id,
LastUsedDate = cjConf.Count() == 0 ? (DateTime?)null : cjConf.Max(conf => conf.EndTime)
}
).ToList();
What I want is for it to generate some efficient SQL that uses the aggregate function MAX to calculate the LastUsedDate, like this:
SELECT
rm.Id, MAX(conf.EndTime) AS LastUsedDate
FROM
Room rm
LEFT OUTER JOIN
Conference conf ON rm.Id = conf.RoomId
WHERE
rm.Id IN ('a967c9ce-5608-40d0-a586-e3297135d847', '2dd6a82d-3e76-4441-9a40-133663343d2b', 'bb302bdb-6db6-4470-a24c-f1546d3e6191')
GROUP BY
rm.id
But when I profile SQL Server it shows this query from EF:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[RoomId] AS [RoomId],
[Extent1].[ProviderId] AS [ProviderId],
[Extent1].[StartTime] AS [StartTime],
[Extent1].[EndTime] AS [EndTime],
[Extent1].[Duration] AS [Duration],
[Extent1].[ParticipantCount] AS [ParticipantCount],
[Extent1].[Name] AS [Name],
[Extent1].[ServiceType] AS [ServiceType],
[Extent1].[Tag] AS [Tag],
[Extent1].[InstantMessageCount] AS [InstantMessageCount]
FROM [dbo].[Conference] AS [Extent1]
So it is selecting everything from Conference and doing the Max() calculation in memory, which is very inefficient. How can I get EF to generate the proper SQL query with the aggregate function in?
The equivalent LINQ to Entities query which closely translates to the SQL query you are after is like this:
var roomIds = plan.Rooms.Select(rm => rm.Id);
var query =
from rm in context.Rooms
join conf in context.Conferences on rm.Id equals conf.RoomId
into rmConf from conf in rmConf.DefaultIfEmpty() // left join
where roomIds.Contains(rm.Id)
group conf by rm.Id into g
select new
{
RoomId = g.Key,
LastUsedDate = g.Max(conf => (DateTime?)conf.EndTime)
};
The trick is to start the query from EF IQueryable, thus allowing it to be fully translated to SQL, rather than from plan.Rooms as in the query in question which is IEnumerable and makes the whole query execute in memory (context.Conferences is treated as IEnumerable and causes loading the whole table in memory).
The SQL IN clause is produced by calling Contains on the in-memory IEnumerable<Guid>.
Finally, there is no need to check the count. SQL handles NULLs naturally; all you need is to make sure the nullable Max overload is called, which the (DateTime?)conf.EndTime cast achieves. There is also no need to check conf for null as you would in LINQ to Objects, because LINQ to Entities/SQL handles that naturally as well (as long as the receiving variable is nullable).
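The difference between the two Max overloads can be seen in a small in-memory sketch. This runs on LINQ to Objects rather than EF, and the class and method names (NullableMaxDemo, LastUsed, NonNullableMaxThrows) are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullableMaxDemo
{
    // Returns the latest end time, or null when the sequence is empty.
    public static DateTime? LastUsed(IEnumerable<DateTime> endTimes)
    {
        // The cast to DateTime? selects the nullable Max overload,
        // which yields null for an empty sequence instead of throwing.
        return endTimes.Max(t => (DateTime?)t);
    }

    // Reports whether the non-nullable Max overload throws for the given input.
    public static bool NonNullableMaxThrows(IEnumerable<DateTime> endTimes)
    {
        try
        {
            endTimes.Max();
            return false;
        }
        catch (InvalidOperationException)
        {
            // Thrown when a sequence of non-nullable values has no elements.
            return true;
        }
    }
}
```

Calling NullableMaxDemo.LastUsed(new DateTime[0]) returns null, which is the same shape of result that MAX over a left join with no matching rows gives in SQL.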
Since plan.Rooms isn't an IQueryable with a query provider attached, the join statement is compiled as Enumerable.Join. This means that context.Conferences is implicitly treated as an IEnumerable and its content is pulled into memory before any other operators are applied to it.
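A rough way to see which Join the compiler binds to is to check whether the resulting query is still an IQueryable. The sketch below uses in-memory lists with AsQueryable() as a stand-in for a real DbSet; the Room and Conference records and the class name JoinBindingDemo are assumptions made for the example (records need C# 9 or later):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinBindingDemo
{
    // Hypothetical stand-ins for the Room and Conference entities.
    public record Room(Guid Id);
    public record Conference(Guid RoomId, DateTime EndTime);

    // The outer source is a plain collection, so the query binds to Enumerable.Join.
    public static bool CollectionOuterIsQueryable(ICollection<Room> rooms, IQueryable<Conference> conferences)
    {
        IEnumerable<DateTime> query =
            from rm in rooms
            join conf in conferences on rm.Id equals conf.RoomId
            select conf.EndTime;
        return query is IQueryable;
    }

    // The outer source is an IQueryable, so the query binds to Queryable.Join.
    public static bool QueryableOuterIsQueryable(ICollection<Room> rooms, IQueryable<Conference> conferences)
    {
        IEnumerable<DateTime> query =
            from rm in rooms.AsQueryable()
            join conf in conferences on rm.Id equals conf.RoomId
            select conf.EndTime;
        return query is IQueryable;
    }
}
```

The first method returns false and the second returns true. With a real context, only the second shape leaves the provider a chance to translate the whole join to SQL.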
You can fix this by not using join:
var roomIds = plan.Rooms.Select(r => r.Id).ToList();
var maxPerRoom = context.Conferences
.Where(conf => roomIds.Contains(conf.RoomId))
.GroupBy(conf => conf.RoomId)
.Select(g => new
{
RoomId = g.Key,
LastUsedDate = g.Select(conf => conf.EndTime)
.DefaultIfEmpty()
.Max()
}
).ToList();
var roomData = (
from rm in plan.Rooms
join mx in maxPerRoom on rm.Id equals mx.RoomId
select new
{
RoomId = rm.Id,
LastUsedDate = mx.LastUsedDate
}
).ToList();
The first step collects the LastUsedDate data from the context, and the second joins it with the plan.Rooms collection in memory. The second step isn't even necessary if you're not interested in returning or displaying anything other than the room's Id, but that's up to you.
I have a linq query which gave me the warning but it still works. I want to get rid of the warning.
uses First/FirstOrDefault/Last/LastOrDefault operation without OrderBy and filter which may lead to unpredictable results.
The linq query is
var list = (from u in _db.user
join r in _db.resource on u.userId equals r.userId
join t in _db.team on u.bossId equals t.bossId
where r.pid == pid
select new MyDto
{
pid = pid,
userId = u.userId,
teamId = t.teamId,
name = t.name
}).GroupBy(d => d.userId).Select(x => x.First()).OrderBy(y => y.userId).ToList();
I use EntityFramework Core 2.1
UPDATE:
I changed the code by the comments.
var list = (from u in _db.user
join r in _db.resource on u.userId equals r.userId
join t in _db.team on u.bossId equals t.bossId
where r.pid == pid
select new MyDto
{
pid = pid,
userId = u.userId,
teamId = t.teamId,
name = t.name
})
.GroupBy(d => d.userId)
.Select(x => x.OrderBy(y => y.userId)
.First())
.ToList();
Then there is a different warning.
The LINQ expression 'GroupBy([user].userId, new MyDto() {pid =
Convert(_8_locals1_pid_2, Int16), userId = [user].UserId, .....) could
not be translated and will be evaluated locally.
We have this expression
.Select(x => x.First())
Which record will be first for that expression? There's no way to know, because at this point the OrderBy() clause which follows hasn't been processed yet. You could get different results each time you run the same query on the same data, depending on what order the records were returned from the database. The results are not predictable, exactly as the warning message said.
But surely the database will return them in the same order each time? No, you can't assume that. The order of results in an SQL query is not defined unless there is an ORDER BY clause with the query. Most of the time you'll get primary key ordering (which does not have to match insert order!), but there are lots of things that can change this: matching a different index, JOIN to a table with a different order or different index, parallel execution with another query on the same table + round robin index walking, and much more.
To fix this, you must call OrderBy() before you can call First().
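As an in-memory illustration of ordering inside each group before taking First(), here is a minimal sketch. It runs on LINQ to Objects, and both the Row record and the choice of TeamId as the tie-breaker are assumptions; the question does not say which row per user is wanted:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FirstPerGroupDemo
{
    // Hypothetical flattened row, loosely modelled on MyDto from the question.
    public record Row(int UserId, int TeamId, string Name);

    // Picks one row per user deterministically: the one with the lowest TeamId.
    public static List<Row> FirstTeamPerUser(IEnumerable<Row> rows)
    {
        return rows
            .GroupBy(r => r.UserId)
            // Order within the group so that First() is well defined.
            .Select(g => g.OrderBy(r => r.TeamId).First())
            .OrderBy(r => r.UserId)
            .ToList();
    }
}
```

Whatever order the input rows arrive in, the result is the same, which is the property the warning is asking for.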
Looking a little deeper, this is not even part of the SQL. This work is happening on your client. That's not good, because any indexes on the table are no longer available. It should be possible to do all this work on the database server, but selecting the first record of a group may mean you need a lateral join/APPLY or row_number() windowing function, which are hard to reproduce with EF. To completely remove all warnings, you may have to write a raw SQL statement:
select userId, teamId, name, pid
from (
select u.userId, t.teamId, t.name, r.pid, row_number() over (partition by u.userId order by t.teamId) rn
from User u
inner join resource r on r.userId = u.userId
inner join team t on t.bossId = u.bossId
where r.pid = @pid
) d
where d.rn = 1
Looking around, it is possible to use row_number() in EF, but at this point I personally find the SQL much easier to work with. My view is ORMs don't help for these more complicated queries, because you still have to know the SQL you want, and you also have to know the intricacies of the ORM in order to build it. In other words, the tool that was supposed to make your job easier made it harder instead.
I wrote the query below in MS SQL, and now I want to write the same query using C# LINQ.
SELECT JD.*
FROM Job_Details JD
INNER JOIN MstCustomer Cust ON JD.Cust_ID = Cust.Cust_ID
WHERE Cust.SAP = 'Yes'
A fairly simple join would do.
from jd in Job_Details
join cust in MstCustomer
on jd.Cust_ID equals cust.Cust_ID
where cust.SAP == "Yes"
select jd
You asked for it using a lambda expression
You only want the Customers with Cust.SAP equal to "Yes", but you don't want the SAP in the end result. Hence it is more efficient to join only with the customers you actually want in your final result. Therefore do the Where before the Join:
IQueryable<JobDetail> jobDetails = ...
IQueryable<Customer> mstCustomers = ...
// Step 1: filter only the Yes customers:
var yesCustomers = mstCustomers.Where(customer => customer.SAP == "Yes");
// Step 2: Perform the join and select the properties you want:
var result = jobDetails.Join(yesCustomers, // Join jobDetails and yesCustomers
jobDetail => jobDetail.Cust_Id, // from every jobDetail take the Cust_Id
customer => customer.Cust_Id, // from every customer take the Cust_Id
(jobDetail, customer) => new // when they match use the matching items
{ // to create a new object
// select only the properties
// from jobDetail and customer
// you really plan to use
});
TODO: if desired, make it one big LINQ statement. Note that this doesn't influence the performance very much, as these statements do not perform the query. They only change the Expression of the query. Only items that do not return IQueryable perform the query: ToList / FirstOrDefault / Any / Max / ...
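The deferred execution described above can be observed with a small counter. This sketch uses LINQ to Objects as an analogue (an IQueryable against a database defers in the same way, but nothing here touches EF), and the class and method names are invented for the example:

```csharp
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns how often the predicate ran before and after the query was materialized.
    public static (int before, int after) CountPredicateCalls()
    {
        int calls = 0;
        var sapFlags = new[] { "Yes", "No", "Yes" };

        // Building the query only describes the work; the predicate is not run yet.
        var yesOnly = sapFlags.Where(sap => { calls++; return sap == "Yes"; });
        int before = calls;

        // ToList() enumerates the query, so the predicate now runs once per element.
        _ = yesOnly.ToList();
        return (before, calls);
    }
}
```

The method returns (0, 3): nothing is evaluated while the Where is being composed, and all three elements are tested once ToList() is called.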
I need help with a search method for searching the tables for a matching text.
This works, except that the joins need to be LEFT OUTER JOINs; otherwise I don't get any results if the pageId is missing in any of the tables.
This solution takes too long to run. I would appreciate it if someone could help me out with a better solution to handle this task.
public async Task<IEnumerable<Result>> Search(string query)
{
var temp = await (from page in _context.Pages
join pageLocation in _context.PageLocations on page.Id equals pageLocation.PageId
join location in _context.Locations on pageLocation.LocationId equals location.Id
join pageSpecialty in _context.PageSpecialties on page.Id equals pageSpecialty.PageId
join specialty in _context.Specialties on pageSpecialty.SpecialtyId equals specialty.Id
where
page.Name.ToLower().Contains(query)
|| location.Name.ToLower().Contains(query)
|| specialty.Name.ToLower().Contains(query)
select new Result
{
PageId = page.Id,
Name = page.Name,
Presentation = page.Presentation,
Rating = page.Rating
}).ToListAsync();
var results = new List<Result>();
foreach (var t in temp)
{
if (!results.Exists(p => p.PageId == t.PageId))
{
t.Locations = GetLocations(t.PageId);
t.Specialties = GetSpecialties(t.PageId);
results.Add(t);
}
}
return results;
}
Using navigation properties, the query could look like:
var temp = await (from page in _context.Pages
where page.Name.Contains(query)
|| page.PageLocation.Any(pl => pl.Location.Name.Contains(query))
|| page.PageSpecialties.Any(pl => pl.Specialty.Name.Contains(query))
select new Result
{
PageId = page.Id,
Name = page.Name,
Presentation = page.Presentation,
Rating = page.Rating,
Locations = page.PageLocation.Select(pl => pl.Location),
Specialties = page.PageSpecialties.Select(pl => pl.Specialty)
}).ToListAsync();
This has several benefits:
Because there are no joins, the query returns unique Result objects right away, so you don't need to deduplicate them afterwards.
The locations and specialties are loaded in the same query instead of two queries per Result (aka the n+1 problem).
(Likely) ToLower is removed because the search is probably not case sensitive anyway. The query is executed as SQL and, most of the time, SQL databases have case-insensitive collations. Removing ToLower makes the query sargable again.
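The first point, that filtering with Any avoids duplicate rows while a join multiplies them, can be checked in memory. The Page and Location records below are simplified, hypothetical versions of the entities in the question, and the sketch runs on LINQ to Objects rather than EF:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusJoinDemo
{
    // Hypothetical, simplified entities: a page with its locations attached directly.
    public record Location(string Name);
    public record Page(int Id, string Name, List<Location> Locations);

    // Join-style query: one row per matching (page, location) pair.
    public static int RowsFromJoin(IEnumerable<Page> pages, string query)
    {
        return (from page in pages
                from location in page.Locations
                where location.Name.Contains(query)
                select page.Id).Count();
    }

    // Any-style query: one row per page that has at least one matching location.
    public static int RowsFromAny(IEnumerable<Page> pages, string query)
    {
        return pages.Count(p => p.Locations.Any(l => l.Name.Contains(query)));
    }
}
```

For a single page with the locations "Oslo North" and "Oslo South" and the search text "Oslo", RowsFromJoin gives 2 and RowsFromAny gives 1.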
I am pretty new to Entity Framework and LINQ, and I have an entity with more than 10 other associated entities (one-to-many relationships). I'm planning to make a search page in my application in which users can select which fields (i.e. which of those 10+ tables) they want to be considered when searching.
Now, I'm trying to write a query to achieve the above goal. Any help how I could sort this out using LINQ method syntax? I mean, to write a multiple join query based on user's choice. (i.e. which of Class1, Class2, ... to join with main Entity to finally have all the related fields in one place). Below is a sample code (Just a hunch, in fact)
if(somefilter#1)
result = db.Companies.Join(db.Channels, p => p.Id, k => k.CId,
(p, k) => new {Company = p, Channels=k});
if(somefilter#2)
result = result.Join(db.BusinnessType, ........);
if(somefilter#3)
result = result.Join(db.Values, .......);
For complex queries it may be easier to use the other LINQ notation. You could join multiple entities like this:
from myEntity in dbContext.MyEntities
join myOtherEntity in dbContext.MyOtherEntities on myEntity.Id equals myOtherEntity.MyEntityId
join oneMoreEntity in dbContext.OneMoreEntities on myEntity.Id equals oneMoreEntity.MyEntityId
select new {
myEntity.Id,
myEntity.Name,
myOtherEntity.OtherProperty,
oneMoreEntity.OneMoreProperty
}
You can join in other entities by adding more join statements.
You can select properties of any entity from your query. The example I provided uses an anonymous type, but you can also define a class (like MyJoinedEntity) and select into that instead. To do so you would use something like:
...
select new MyJoinedEntity {
Id = myEntity.Id,
Name = myEntity.Name,
OtherProperty = myOtherEntity.OtherProperty,
OneMoreProperty = oneMoreEntity.OneMoreProperty
}
EDIT:
If you want conditional joins, you can define MyJoinedEntity with all the properties you would need if you were to join everything, and then break the join up into multiple methods. Like this:
public IEnumerable<MyJoinedEntity> GetEntities() {
// Keep the query typed as IQueryable so that later joins can still be translated to SQL.
IQueryable<MyJoinedEntity> joinedEntities = from myEntity in dbContext.MyEntities
join myOtherEntity in dbContext.MyOtherEntities on myEntity.Id equals myOtherEntity.MyEntityId
join oneMoreEntity in dbContext.OneMoreEntities on myEntity.Id equals oneMoreEntity.MyEntityId
select new MyJoinedEntity {
Id = myEntity.Id,
Name = myEntity.Name,
OtherProperty = myOtherEntity.OtherProperty,
OneMoreProperty = oneMoreEntity.OneMoreProperty
};
if (condition1) {
joinedEntities = JoinWithRelated(joinedEntities);
}
return joinedEntities;
}
public IQueryable<MyJoinedEntity> JoinWithRelated(IQueryable<MyJoinedEntity> joinedEntities) {
return from joinedEntity in joinedEntities
join relatedEntity in dbContext.RelatedEntities on joinedEntity.Id equals relatedEntity.MyEntityId
// Copy the properties explicitly; a constructor call with arguments generally cannot be translated to SQL.
select new MyJoinedEntity {
Id = joinedEntity.Id,
Name = joinedEntity.Name,
OtherProperty = joinedEntity.OtherProperty,
OneMoreProperty = joinedEntity.OneMoreProperty,
Comments = relatedEntity.Comments
};
}
I have a sql statement like this:
DECLARE @destinations table(destinationId int)
INSERT INTO @destinations
VALUES (414),(416)
SELECT *
FROM GroupOrder grp (NOLOCK)
JOIN DestinationGroupItem destItem (NOLOCK)
ON destItem.GroupOrderId = grp.GroupOrderId
JOIN @destinations dests
ON destItem.DestinationId = dests.destinationId
WHERE OrderId = 5662
I am using entity framework and I am having a hard time getting this query into Linq. (The only reason I wrote the query above was to help me conceptualize what I was looking for.)
I have an IQueryable of GroupOrder entities and a List of integers that are my destinations.
After looking at this I realize that I can probably just do two joins (like my SQL query) and get to what I want.
But it seems a bit odd to do that because a GroupOrder object already has a list of DestinationGroupItem objects on it.
I am a bit confused how to use the Navigation property on the GroupOrder when I have an IQueryable listing of GroupOrders.
Also, if possible, I would like to do this in one trip to the database. (I think I could do a few foreach loops to get this done, but it would not be as efficient as a single IQueryable run to the database.)
NOTE: I prefer fluent linq syntax over the query linq syntax. But beggars can't be choosers so I will take whatever I can get.
If you already have DestinationGroupItem as a navigation property, then you already have your SQL JOIN equivalent. Load the related entities with Include if you need them in the result, and use the list's Contains method to check whether any of the desired DestinationIds is hit. Because a GroupOrder holds a collection of DestinationGroupItem objects, the check is wrapped in Any:
var destinations = new List<int> { 414, 416 };
var query = from order in dataContext.GroupOrder.Include(o => o.DestinationGroupItem) // this is the join via the navigation property
where order.OrderId == 5662 && order.DestinationGroupItem.Any(item => destinations.Contains(item.DestinationId))
select order;
// OR, in method syntax
var query = dataContext.GroupOrder
.Include(o => o.DestinationGroupItem)
.Where(order => order.OrderId == 5662 && order.DestinationGroupItem.Any(item => destinations.Contains(item.DestinationId)));
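Since the question describes GroupOrder as carrying a list of DestinationGroupItem objects, the destination filter can be written as Any over that collection combined with Contains on the id list. The following is an in-memory sketch of that shape only; the record definitions and the Items property name are assumptions, and it runs on LINQ to Objects rather than against a database:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DestinationFilterDemo
{
    // Hypothetical, simplified entities based on the names used in the question.
    public record DestinationGroupItem(int DestinationId);
    public record GroupOrder(int GroupOrderId, int OrderId, List<DestinationGroupItem> Items);

    // Returns the ids of group orders for the given order that touch any wanted destination.
    public static List<int> MatchingGroupOrderIds(
        IEnumerable<GroupOrder> groupOrders, int orderId, List<int> destinations)
    {
        return groupOrders
            .Where(g => g.OrderId == orderId
                     && g.Items.Any(item => destinations.Contains(item.DestinationId)))
            .Select(g => g.GroupOrderId)
            .ToList();
    }
}
```

Each GroupOrder appears at most once in the result, even when several of its items match, which a plain two-table join would not guarantee.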