C# LINQ expression cannot be translated when summing group joins

I have two tables: users and transactions. The transactions table stores user wallet transactions (credits and debits). I'm trying to get a list of users ordered by the current balance of their wallet, largest first. (I can get the total balance by summing the Amount column in the Transactions table.)
var query = from u in _context.users
join t in _context.Transactions on u.id equals t.UserId into gj
from txn in gj.DefaultIfEmpty()
where txn.TransactionStatus == Transaction.Status.Success && !u.Deleted.HasValue
select new
{
balance = gj.Sum(a => a.Amount),
user = u,
};
var result = await query.OrderByDescending(t => t.balance).Skip(offset).Take(range).ToListAsync();
I am getting the error:
The LINQ expression 'gj' could not be translated.
This is the equivalent SQL I'm trying to achieve:
SELECT balance, u.* FROM
(SELECT COALESCE(SUM(Amount), 0) as balance, u.id
FROM dbo.users u
LEFT JOIN dbo.Transactions t ON(u.id = t.UserId AND t.TransactionStatus = 0)
WHERE u.Deleted IS NULL
GROUP BY u.id) as tbl
JOIN dbo.users u ON(u.id = tbl.id)
ORDER BY balance DESC

Starting from users and using the navigation properties to reach Transactions should make this trivial. Give something like this a go:
users.Where(u => u.Deleted == null)
.Select(u => new {
User = u,
Balance = u.Transactions.Where(t => t.Status == 0).Sum(t => t.Amount)
})
.OrderByDescending(at => at.Balance);
If the logic truly is "list all users but only add up transactions for users that are non deleted, and show deleted users as 0 balance" then:
users
.Select(u => new {
User = u,
Balance = u.Deleted != null ? 0 : u.Transactions.Where(t => t.Status == 0).Sum(t => t.Amount)
})
.OrderByDescending(at => at.Balance);
Try not to write EF queries by thinking "right, in SQL I would have this table and join that table, and it'd be a left join, and that would be summed... so I'll tell EF to join this context set to that context set, make it a left join, and sum...". EF will write the joins for you. Express your requirements in terms of how you want the C# object graph to be manipulated and let EF convert that to SQL as best it can; use the navigation properties between entities so it can work out how to bring your data together and arrange the necessary joins. It seldom needs micromanaging in a SQL-flavored way.
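One thing worth checking with the navigation-property version is that a user with no successful transactions still appears with a balance of 0, which is what COALESCE(SUM(Amount), 0) gives in the SQL. Below is a minimal in-memory sketch of the same query shape using LINQ to Objects, where Sum over an empty sequence of decimal returns 0. The User and Txn classes and the BalanceSketch helper are hypothetical stand-ins, not the asker's entities, and the SQL that EF generates for the real query should still be inspected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the users/Transactions entities.
public class Txn
{
    public decimal Amount { get; set; }
    public int Status { get; set; }   // assumption: 0 means success, as in the SQL
}

public class User
{
    public int Id { get; set; }
    public DateTime? Deleted { get; set; }
    public List<Txn> Transactions { get; } = new List<Txn>();
}

public static class BalanceSketch
{
    // Same shape as the answer's query, evaluated with LINQ to Objects.
    public static List<(int UserId, decimal Balance)> Balances(IEnumerable<User> users) =>
        users.Where(u => u.Deleted == null)
             .Select(u => (UserId: u.Id,
                           // Sum over an empty sequence yields 0, so users
                           // without matching transactions are kept.
                           Balance: u.Transactions
                                     .Where(t => t.Status == 0)
                                     .Sum(t => t.Amount)))
             .OrderByDescending(x => x.Balance)
             .ToList();
}
```

Calling BalanceSketch.Balances with one user who has successful transactions of 10 and -3, one user with no transactions, and one deleted user should return the first user with 7, then the second with 0, and omit the deleted one.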

Related

How to rewrite raw sql query with join to EF Core LINQ?

I have this query:
SELECT *
FROM public.lifecycle_data lifecycle_data
INNER JOIN
(SELECT *
FROM public.users
WHERE id = 123) AS t1 ON lifecycle_data.reference = t1.id
WHERE updated IS NULL
ORDER BY created DESC
I want to rewrite this query with EF Core LINQ and I tried this:
var users = db.LifeCycle
.Where(l => l.Updated == null)
.Join(db.Users,
l => l.Reference,
u => u.Id,
(lifeCycle, user) => new User()
{
Id = lifeCycle.Id,
FieldOne = user.FieldOne,
FieldTwo = user.FieldTwo,
Created = lifeCycle.Created
})
.Where(u => u.Id == 123)
.OrderBy(c => c.Created)
.ToList();
But it's interpreted as:
SELECT
l.id AS "Id", u.field_one AS "FieldOne", u.field_two AS "FieldTwo", l.created AS "Created"
FROM
lifecycle.lifecycle_data AS l
INNER JOIN
users.users AS u ON l.reference = u.id
WHERE
FALSE
ORDER BY l.created
It's the ORM's job to generate the query, especially the JOINs, from the relations between entities. There are no tables in EF (or any other ORM), there are entities. A DbContext isn't a model of the database and LINQ isn't a replacement for SQL.
The equivalent in EF Core 5 and later would be a Filtered Include, assuming User has a LifecycleData collection property:
var users = dbContext.Users
.Include(u=>u.LifeCycleData
.Where(l=>l.Updated==null)
.OrderBy(l=>l.Created))
.Where(u => u.Id == 123);

Rewrite a correlated subquery in EF Core 2.1

Is there a way I can rewrite this query so that it does not use correlated subqueries?
var query = (from o in dbcontext.Orders
let lastStatus = o.OrderStatus.Where(x => x.OrderId == o.Id).OrderByDescending(x => x.CreatedDate).FirstOrDefault()
where lastStatus.OrderId != 1
select new { o.Name, lastStatus.Id }
).ToList();
This resulted in:
SELECT [o].[Name], (
SELECT TOP(1) [x0].[Id]
FROM [OrderStatus] AS [x0]
WHERE ([x0].[OrderId] = [o].[Id]) AND ([o].[Id] = [x0].[OrderId])
ORDER BY [x0].[CreatedDate] DESC
) AS [Id]
FROM [Orders] AS [o]
WHERE (
SELECT TOP(1) [x].[OrderId]
FROM [OrderStatus] AS [x]
WHERE ([x].[OrderId] = [o].[Id]) AND ([o].[Id] = [x].[OrderId])
ORDER BY [x].[CreatedDate] DESC
) <> 1
I have tried to do a join on a subquery, but EF 2.1 is doing weird things that I did not expect:
var query = (from o in dbcontext.Orders
join lastStat in (from os in dbcontext.OrderStatus
orderby os.CreatedDate descending
select new { os }
) on o.Id equals lastStat.os.OrderId
where lastStat.os.StatusId != 1
select new { o.Name, lastStat.os.StatusId }).ToList();
In EF6 replacing
let x = (...).FirstOrDefault()
with
from x in (...).Take(1).DefaultIfEmpty()
usually generates better SQL.
So normally I would suggest
var query = (from o in db.Set<Order>()
from lastStatus in o.OrderStatus
.OrderByDescending(s => s.CreatedDate)
.Take(1)
where lastStatus.Id != 1
select new { o.Name, StatusId = lastStatus.Id }
).ToList();
(no need of DefaultIfEmpty (left join) because the where condition will turn it to inner join anyway).
Unfortunately currently (EF Core 2.1.4) there is a translation issue so the above leads to client evaluation.
The current workaround is to replace the navigation property accessor o.OrderStatus with correlated subquery:
var query = (from o in db.Set<Order>()
from lastStatus in db.Set<OrderStatus>()
.Where(s => o.Id == s.OrderId)
.OrderByDescending(s => s.CreatedDate)
.Take(1)
where lastStatus.Id != 1
select new { o.Name, StatusId = lastStatus.Id }
).ToList();
which produces the following SQL for SqlServer database (lateral join):
SELECT [o].[Name], [t].[Id] AS [StatusId]
FROM [Orders] AS [o]
CROSS APPLY (
SELECT TOP(1) [s].*
FROM [OrderStatus] AS [s]
WHERE [s].[OrderId] = [o].[Id]
ORDER BY [s].[CreatedDate] DESC
) AS [t]
WHERE [t].[Id] <> 1
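To make the semantics of the Take(1) pattern concrete, here is a small in-memory sketch using LINQ to Objects. It only illustrates what the query returns, not how EF Core translates it. The Order and OrderStatus records are hypothetical stand-ins with just the properties used above, and records assume C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the entities in the question.
public record Order(int Id, string Name);
public record OrderStatus(int Id, int OrderId, DateTime CreatedDate);

public static class LatestStatusSketch
{
    public static List<(string Name, int StatusId)> Run(
        IEnumerable<Order> orders, IEnumerable<OrderStatus> statuses) =>
        (from o in orders
         // The second 'from' yields at most one row per order: its latest status.
         from lastStatus in statuses
             .Where(s => s.OrderId == o.Id)
             .OrderByDescending(s => s.CreatedDate)
             .Take(1)
         // Orders without any status produce no row at all, which is why
         // DefaultIfEmpty is unnecessary once this filter is applied.
         where lastStatus.Id != 1
         select (o.Name, StatusId: lastStatus.Id)).ToList();
}
```

An order whose latest status has Id 1, and an order with no statuses at all, are both absent from the result. That matches the inner-join behavior of the CROSS APPLY shown above.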
I will assume that you are not actually fetching all the Orders, but only a portion of them (a page or a batch for processing).
In this case, it might be better to split it in two queries (not tested though):
var orders = dbcontext.Orders.Where(o => /* some filter logic */);
var orderIds = orders.Select(o => o.Id).ToList();
// get status for latest change - this should query OrderStatus only
var statusNameMap = dbcontext.OrderStatus
    .Where(os => orderIds.Contains(os.OrderId))
    .GroupBy(os => os.OrderId)
    .Select(grp => grp.OrderByDescending(os => os.CreatedDate).First())
    .ToDictionary(os => os.OrderId, os => os.StatusId);
// aggregate the results
// the orders might fetch only the needed columns to have less data on the wire
// (an order without any status row would need a TryGetValue check here)
var result = orders
    .ToList()
    .Select(o => new { o.Name, StatusId = statusNameMap[o.Id] });
I do not think the queries will be nicer, but it might be easier to understand what is going on here.
If you really have to process all Orders and you have many of them (or many Statuses), you might consider maintaining a LastStatusId column directly in Order table (this should be updated whenever a status is changed).

EF Complex Query Join

The query shown below is very straightforward; it simply pulls up tasks for a specified customer. What I'd now like to be able to do is take a UserId that is passed into this function and validate that the user has permission to view the task.
var dbData = await db.Tasks
.Where(a => a.CustomerId == customerId)
.OrderBy(a => a.CreatedDateTime).ToListAsync();
There is a property in the Tasks table for OrganizationId. A User can belong to n+1 Organizations via a UserOrganizations table. What is the best way to take the known UserId and validate that the Task.OrganizationId is one of the User's?
If the relations are not already properties on the Tasks class, you can write your join in query-syntax. Something along these lines:
var dbData = await (from t in db.Tasks
                    join uo in db.UserOrganizations on t.OrganizationId equals uo.OrganizationId
                    join u in db.Users on uo.UserId equals u.UserId
                    where t.CustomerId == customerId && u.UserId == theUserId
                    orderby t.CreatedDateTime
                    select t).ToListAsync();
Depending on how your data classes were generated, you might already have navigation properties on the Tasks class, allowing you to do:
var dbData = await db.Tasks
    .Where(a => a.CustomerId == customerId
        && a.Organization.UserOrganizations.Any(uo => uo.UserId == theUserId))
    .OrderBy(a => a.CreatedDateTime).ToListAsync();
// userId is the user ID passed into the function
var dbData = await db.Tasks
    .Where(a => a.CustomerId == customerId
        && a.Organization.Users
            .Any(u => u.UserId == userId))
    .OrderBy(a => a.CreatedDateTime).ToListAsync();
This assumes the foreign keys are set up and the relationships are navigable through the entities.
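As a rough illustration of what the Any-based permission check computes, here is an in-memory sketch with LINQ to Objects. The TaskItem and UserOrganization records and the VisibleTasks helper are hypothetical names chosen for this example (records assume C# 9 or later); the real entities and their navigation properties may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the Tasks and UserOrganizations tables.
public record TaskItem(int Id, int CustomerId, int OrganizationId, DateTime CreatedDateTime);
public record UserOrganization(int UserId, int OrganizationId);

public static class TaskAccessSketch
{
    // Returns the customer's tasks whose organization the user belongs to.
    public static List<TaskItem> VisibleTasks(
        IEnumerable<TaskItem> tasks,
        IReadOnlyCollection<UserOrganization> memberships,
        int customerId,
        int userId) =>
        tasks.Where(t => t.CustomerId == customerId
                      // Any() is an existence test, so a task appears at most
                      // once regardless of how many membership rows match.
                      && memberships.Any(m => m.UserId == userId
                                           && m.OrganizationId == t.OrganizationId))
             .OrderBy(t => t.CreatedDateTime)
             .ToList();
}
```

Because Any only tests for existence, a task cannot be duplicated in the result, whereas a join returns one row per matching membership row.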

Linq Query with double sub queries

I am struggling converting the following SQL query I wrote into Linq. I think I'm on the right track, but I must be missing something.
The error I'm getting right now is:
System.Linq.IQueryable does not contain a definition for .Contains
Which is confusing to me, because it should, right?
SQL
select Users.*
from Users
where UserID in (select distinct(UserID)
from UserPermission
where SupplierID in (select SupplierID
from UserPermission
where UserID = 6))
LINQ
var Users = (from u in _db.Users
where (from up in _db.UserPermissions select up.UserID)
.Distinct()
.Contains((from up2 in _db.UserPermissions
where up2.UserID == 6
select up2.SupplierID))
select u);
EDIT: I ended up going back to SqlCommand objects as this was something I had to get done today and couldn't waste too much time trying to figure out how to do it the right way with Linq and EF. I hate code hacks :(
I think there is no need to do a distinct here (maybe I am wrong). But here is a simpler version (assuming you have all the navigational properties defined correctly)
var lstUsers = DBContext.Users.Where(
x => x.UserPermissions.Any(
y => y.Suppliers.Any(z => z.UserID == 6)
)
).ToList();
The above applies if you have a UserID field in the Supplier entity. If you do not, you can again use a navigational property:
var lstUsers = DBContext.Users.Where(
x => x.UserPermissions.Any(
y => y.Suppliers.Any(z => z.User.UserID == 6)
)
).ToList();
Contains() expects a single element, not a sequence, so it won't work as you have it written. Try this as an alternative:
var Users = _db.Users
    .Where(u => _db.UserPermissions
        .Where(up => _db.UserPermissions
            .Where(y => y.UserID == 6)
            .Select(y => y.SupplierID)
            .Contains(up.SupplierID))
        .Select(up => up.UserID)
        .Distinct()
        .Contains(u.UserID));
I didn't try on my side but you can try using the let keyword:
var Users = (from u in _db.Users
             let supplierIds = from up2 in _db.UserPermissions
                               where up2.UserID == 6
                               select up2.SupplierID
             let userIds = (from up in _db.UserPermissions
                            where supplierIds.Contains(up.SupplierID)
                            select up.UserID).Distinct()
             where userIds.Contains(u.UserID)
             select u);
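For reference, the two nested IN subqueries in the SQL can be read as "find the suppliers of user 6, then find every user who has a permission on one of those suppliers". A minimal in-memory sketch of that logic with LINQ to Objects follows. The UserPermission record and the helper name are assumptions made for illustration (records assume C# 9 or later), and the sketch says nothing about how a given EF version translates the equivalent query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the UserPermission table.
public record UserPermission(int UserID, int SupplierID);

public static class SharedSupplierSketch
{
    // IDs of all users who have a permission on any supplier that userId has.
    public static List<int> UsersSharingSupplierWith(
        int userId, IReadOnlyCollection<UserPermission> perms)
    {
        // Inner subquery: select SupplierID from UserPermission where UserID = userId
        var supplierIds = perms.Where(p => p.UserID == userId)
                               .Select(p => p.SupplierID);

        // Outer subquery: select distinct UserID where SupplierID in (inner)
        return perms.Where(p => supplierIds.Contains(p.SupplierID))
                    .Select(p => p.UserID)
                    .Distinct()
                    .OrderBy(id => id)
                    .ToList();
    }
}
```

Each Contains call receives a single value (a SupplierID here, and a UserID when the resulting list is used to filter Users), which is the shape the compiler expects.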

Is there any way to optimize this LINQ to Entities query?

I was asked to produce a report that is driven by a fairly complex SQL query against a SQL Server database. Since the site of the report was already using Entity Framework 4.1, I thought I would attempt to write the query using EF and LINQ:
var q = from r in ctx.Responses
.Where(x => ctx.Responses.Where(u => u.UserId == x.UserId).Count() >= VALID_RESPONSES)
.GroupBy(x => new { x.User.AwardCity, x.Category.Label, x.ResponseText })
orderby r.FirstOrDefault().User.AwardCity, r.FirstOrDefault().Category.Label, r.Count() descending
select new
{
City = r.FirstOrDefault().User.AwardCity,
Category = r.FirstOrDefault().Category.Label,
Response = r.FirstOrDefault().ResponseText,
Votes = r.Count()
};
This query tallies votes, but only from users who have submitted a certain number of required minimum votes.
This approach was a complete disaster from a performance perspective, so we switched to ADO.NET and the query ran very quickly. I did look at the LINQ generated SQL using the SQL Profiler, and although it looked atrocious as usual I didn't see any clues as to how to optimize the LINQ statement to make it more efficient.
Here's the straight TSQL version:
WITH ValidUsers(UserId)
AS
(
SELECT UserId
FROM Responses
GROUP BY UserId
HAVING COUNT(*) >= 103
)
SELECT d.AwardCity
, c.Label
, r.ResponseText
, COUNT(*) AS Votes
FROM ValidUsers u
JOIN Responses r ON r.UserId = u.UserId
JOIN Categories c ON r.CategoryId = c.CategoryId
JOIN Demographics d ON r.UserId = d.Id
GROUP BY d.AwardCity, c.Label, r.ResponseText
ORDER BY d.AwardCity, s.SectionName, COUNT(*) DESC
What I'm wondering is: is this query just too complex for EF and LINQ to handle efficiently or have I missed a trick?
Using a let to reduce the number of r.First()'s will probably improve performance. It's probably not enough yet.
var q = from r in ctx.Responses
.Where()
.GroupBy()
let response = r.First()
orderby response.User.AwardCity, response.Category.Label, r.Count() descending
select new
{
City = response.User.AwardCity,
Category = response.Category.Label,
Response = response.ResponseText,
Votes = r.Count()
};
Maybe this change will improve the performance by removing the nested SQL select that the where clause produces.
First get the votes of each user and put them in a Dictionary:
// Count the responses per user, then materialize the counts as a dictionary
var userVotes = ctx.Responses
    .GroupBy(x => x.UserId)
    .Select(g => new { UserId = g.Key, Votes = g.Count() })
    .ToDictionary(a => a.UserId, a => a.Votes);
// Note: ToList() pulls every response into memory before filtering and grouping
var cityQuery = ctx.Responses.ToList()
    .Where(x => userVotes[x.UserId] >= VALID_RESPONSES)
    .GroupBy(x => new { x.User.AwardCity, x.Category.Label, x.ResponseText })
    .Select(r => new
    {
        City = r.Key.AwardCity,
        Category = r.Key.Label,
        Response = r.Key.ResponseText,
        Votes = r.Count()
    })
    .OrderBy(r => r.City)
    .ThenBy(r => r.Category)
    .ThenByDescending(r => r.Votes);
