Write multiple Sum() SQL query using Lambda in Entity Framework C#

Below is the SQL query that I want to convert to a lambda expression:
SELECT
SUM("Rating" * "Rating") / SUM("Rating")
FROM
public."CustomerRating"
WHERE
"DriverId" = '232'
This is my C# code:
public double GetDriverAvgRating(string id)
{
var driver = _context.CustomerRating
.AsQueryable()
.Where(d => d.DriverId == id);
var avgDriverRating = // i need to perform that query here
return avgDriverRating;
}

I believe EF will be able to correctly translate it as a grouping, something like
var driver = _context.CustomerRating
.Where(cr => cr.DriverId == id)
.GroupBy(cr => cr.DriverId, cr => cr.Rating)
.Select(g => g.Select(r => r * r).Sum() / g.Sum())
.First();
The generated SQL might carry some extra fluff (e.g. a GROUP BY that produces only one group), but I don't expect that to make any significant difference to the planning or performance of the query.
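To see what the grouped expression computes, here is a minimal in-memory sketch using LINQ to Objects rather than EF. The tuple list is a hypothetical stand-in for the CustomerRating table, and Rating is assumed to be a double; if the real column is an integer, the division would truncate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the CustomerRating table.
var ratings = new List<(string DriverId, double Rating)>
{
    ("232", 5.0), ("232", 3.0), ("232", 4.0), ("7", 1.0),
};

// Same shape as the EF query: filter, group on the key, aggregate twice per group.
// FirstOrDefault returns 0 when the driver has no ratings (an assumption of this sketch;
// the answer above uses First, which throws on an empty result).
double WeightedRating(IEnumerable<(string DriverId, double Rating)> source, string id) =>
    source
        .Where(cr => cr.DriverId == id)
        .GroupBy(cr => cr.DriverId, cr => cr.Rating)
        .Select(g => g.Sum(r => r * r) / g.Sum())
        .FirstOrDefault();

// For driver "232" this evaluates (25 + 9 + 16) / (5 + 3 + 4).
Console.WriteLine(WeightedRating(ratings, "232"));
```

Whether EF produces a single SELECT with two SUM aggregates for this shape depends on the provider and version, so it is worth checking the generated SQL.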

Related

LINQ avoid OUTER APPLY Oracle 11g

Running the LINQ query below against an Oracle 11g instance throws an "OUTER APPLY is not supported" error.
var shipmentDetails = (
from r in _db.XXF_SHIPMENT_DETAILs
where r.SHIP_TO == tradingPartnerId && r.PICKUP_DATE >= pickUpDate
select r)
.GroupBy(x => x.HEADERID)
.Select(x => x.FirstOrDefault());
"OUTER APPLY is not supported by Oracle Database 11g and lower. Oracle
12c or higher is required to run this LINQ statement correctly. If you
need to run this statement with Oracle Database 11g or lower, rewrite
it so that it can be converted to SQL, supported by the version of
Oracle you use."
The solution is to use simple statements to achieve the results you are after. Referencing the query above, we...
First, get all the shipments. Use .ToList() to force query execution:
var shipmentDetails = (from r in _db.XXF_SHIPMENT_DETAILs where r.SHIP_TO == tradingPartnerId && r.PICKUP_DATE >= pickUpDate select r).ToList();
Now apply .GroupBy() and .Select() to filter. This runs in memory rather than on the server, which avoids the unsupported OUTER APPLY:
var uniqueShipmentsWithDistinctHeaderIds = shipmentDetails.GroupBy(x => x.HEADERID).Select(x => x.FirstOrDefault());
You can use the following query, which gets the latest record from each group:
var filtered = db.XXF_SHIPMENT_DETAILs
.Where(r => r.SHIP_TO == tradingPartnerId && r.PICKUP_DATE >= pickUpDate);
var grouped = filtered
.GroupBy(r => r.HEADERID)
.Select(g => new
{
HEADERID = g.Key,
LastId = g.Max(x => x.Id)
});
var shipmentDetails =
from s in filtered
join g in grouped on s.Id equals g.LastId
select s;
This is still not as good as raw SQL with window functions, but it should perform much better than processing the data on the client side.
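The "max Id per group, then join back" idea can be checked in memory. This is a minimal LINQ to Objects sketch; the rows and the Id column are hypothetical, and it only illustrates the logic, not the SQL that an Oracle provider would generate.

```csharp
using System;
using System.Linq;

// Hypothetical rows: (Id, HEADERID).
var rows = new[]
{
    (Id: 1, HEADERID: 10), (Id: 2, HEADERID: 10),
    (Id: 3, HEADERID: 20), (Id: 4, HEADERID: 20), (Id: 5, HEADERID: 30),
};

// Step 1: one entry per group holding the highest Id in that group.
var lastIds = rows
    .GroupBy(r => r.HEADERID)
    .Select(g => new { HEADERID = g.Key, LastId = g.Max(x => x.Id) });

// Step 2: join back on Id to recover the full row for each group.
var latest = (from r in rows
              join g in lastIds on r.Id equals g.LastId
              select r).ToList();

foreach (var r in latest)
    Console.WriteLine($"{r.HEADERID}: {r.Id}");
// prints:
// 10: 2
// 20: 4
// 30: 5
```

Because both steps are plain GROUP BY and JOIN operations, a provider has no need for OUTER APPLY to express them.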

EF Core 2.1 GROUP BY and select first item in each group

Let's imagine a forum with a list of topics and posts in them.
I want to get the list of topics and the title of the last post (by date) for each topic.
Is there a way to achieve this using EF Core (2.1)?
In SQL it could be done like
SELECT Posts.Title, Posts.CreatedDate, Posts.TopicId FROM
(SELECT Max(CreatedDate) AS CreatedDate, TopicId FROM Posts GROUP BY TopicId) lastPosts
JOIN Posts ON Posts.CreatedDate = lastPosts.CreatedDate AND Posts.TopicId = lastPosts.TopicId
In EFCore I can select LastDates
_context.Posts.GroupBy(x => x.TopicId, (x, y) => new
{
CreatedDate = y.Max(z => z.CreatedDate),
TopicId = x,
});
And if I run .ToList() the query is correctly translated to GROUP BY.
But I can't go further.
The following is executed in memory, not in SQL (resulting in SELECT * FROM Posts):
.GroupBy(...)
.Select(x => new
{
x.TopicId,
Post = x.Posts.Where(z => z.CreatedDate == x.CreatedDate)
//Post = x.Posts.FirstOrDefault(z => z.CreatedDate == x.CreatedDate)
})
Attempting to JOIN gives NotSupportedException (Could not parse expression):
.GroupBy(...)
.Join(_context.Posts,
(x, y) => x.TopicId == y.TopicId && x.CreatedDate == y.CreatedDate,
(x, post) => new
{
post.Title,
post.CreatedDate,
})
I know I can do it using SELECT N+1 (running a separate query per topic), but I'd like to avoid that.
I don't know which version of EF Core first made this possible, but there is now a simpler single-query alternative:
context.Topic
.SelectMany(topic => topic.Posts.OrderByDescending(z => z.CreatedDate).Take(1),
(topic, post) => new {topic.Id, topic.Title, post.Text, post.CreatedDate})
.OrderByDescending(x => x.CreatedDate)
.ToList();
What I'm doing now is this. After running
var topics = _context.Posts.GroupBy(x => x.TopicId, (x, y) => new
{
CreatedDate = y.Max(z => z.CreatedDate),
TopicId = x,
}).ToList();
I build the following query:
Expression<Func<Post, bool>> lastPostsQuery = post => false;
foreach (var topic in topics)
{
lastPostsQuery = lastPostsQuery.Or(post => post.TopicId == topic.TopicId && post.CreatedDate == topic.CreatedDate); //.Or is implemented in PredicateBuilder
}
var lastPosts = _context.Posts.Where(lastPostsQuery).ToList();
This results in one query (instead of N) like SELECT * FROM Posts WHERE (Posts.TopicId = 1 AND Posts.CreatedDate = '2017-08-01') OR (Posts.TopicId = 2 AND Posts.CreatedDate = '2017-08-02') OR ....
It is not extremely efficient, but since the number of topics per page is quite low it does the trick.
In EF Core 2.1 the GroupBy LINQ operator is translated to the SQL GROUP BY clause only in the most common cases. Aggregation functions like sum, max ...
linq-groupby-translation
Until EF Core fully supports GroupBy, you can use Dapper.
I am not sure which version of EF Core makes this possible, but you can try something like this. It groups by topic, orders each group by id descending, and returns the first record (the one with the highest id) from each group.
var firstProducts = Context.Posts
.GroupBy(p => p.TopicId)
.Select(g => g.OrderByDescending(p => p.id).FirstOrDefault())
.ToList();
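The intended result of that group-then-pick-first query can be verified with LINQ to Objects. This is a minimal sketch over hypothetical post tuples; it shows the logic only and says nothing about whether a given EF Core version translates the query to SQL or evaluates it on the client.

```csharp
using System;
using System.Linq;

// Hypothetical posts: a higher Id means a later post.
var posts = new[]
{
    (Id: 1, TopicId: 1, Title: "first"),
    (Id: 2, TopicId: 1, Title: "reply"),
    (Id: 3, TopicId: 2, Title: "other"),
};

// Group by topic, then keep the post with the highest Id in each group.
var lastPerTopic = posts
    .GroupBy(p => p.TopicId)
    .Select(g => g.OrderByDescending(p => p.Id).First())
    .ToList();

foreach (var p in lastPerTopic)
    Console.WriteLine($"{p.TopicId}: {p.Title}");
// prints:
// 1: reply
// 2: other
```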

Best way to filter query by navigation property(child objects) in LINQ to SQL

I have two options to filter my query:
var users = query.Select(x => x.User).FindUsers(searchString);
query = query.Where(x => users.Contains(x.User));
or
var userIds = query.Select(x => x.User).FindUsers(searchString).Select(x => x.Id);
query = query.Where(x => userIds.Contains(x.UserId));
Description of FindUsers extension:
static IQueryable<User> FindUsers(this IQueryable<User> query, string searchString);
So what SQL will I end up with? Which of these two queries is better for performance?
If someone has another suggestion, please write it in an answer.
Thanks in advance!
Both queries are similar in EF v6 or higher; the Contains condition will be translated to SQL EXISTS.
Take a look at the code below:
using (var dbContext = new DbContextTest("DatabaseConnectionString"))
{
var users = dbContext.Users.Select(u => u).Where(x => x.UserId > 0);
var query = dbContext.Users.Where(x => users.Contains(x));
Console.WriteLine(query.ToList());
}
using (var dbContext = new DbContextTest("DatabaseConnectionString"))
{
var ids = dbContext.Users.Select(u => u.UserId).Where(x => x > 0);
var query = dbContext.Users.Where(x => ids.Contains(x.UserId));
Console.WriteLine(query.ToList());
}
The generated SQL queries are exactly the same (you can use a profiler or the EF logger to see that). One thing that may matter: selecting only the Ids is lighter for materialization and caching.
Tip:
If you add ToList() to the Ids query, as in dbContext.Users.Select(u => u.UserId).Where(x => x > 0).ToList();, the next select query will be translated to SQL "IN" instead of "EXISTS", which may improve performance. Take a look at the linked question "Difference between EXISTS and IN in SQL?" and decide what is better for you.
Note:
ToList() will materialize the Ids, which means you will be working with an interface other than IQueryable!

LINQ Multiple GroupBy Query Performing several times slower than T-SQL

I'm totally new to LINQ.
I have an SQL GroupBy which runs in barely a few milliseconds. But when I try to achieve the same thing via LINQ, it just seems awfully slow.
What I'm trying to achieve is to fetch the average monthly duration of a certain database update.
In SQL =>
select SUBSTRING(yyyyMMdd, 0,7),
AVG (duration)
from (select (CONVERT(CHAR(8), mmud.logDateTime, 112)) as yyyyMMdd,
DateDIFF(ms, min(mmud.logDateTime), max(mmud.logDateTime)) as duration
from mydb.mydbo.updateData mmud
left
join mydb.mydbo.updateDataKeyValue mmudkv
on mmud.updateDataid = mmudkv.updateDataId
left
join mydb.mydbo.updateDataDetailKey mmuddk
on mmudkv.updateDataDetailKeyid = mmuddk.Id
where dbname = 'MY_NEW_DB'
and mmudkv.value in ('start', 'finish')
group
by (CONVERT(CHAR(8), mmud.logDateTime, 112))
) as resultSet
group
by substring(yyyyMMdd, 0,7)
order
by substring(yyyyMMdd, 0,7)
In LINQ => I first fetch the record from a table that links the database name to UpdateData, and then filter and group the related information.
entry.updatedata.Where(
ue => ue.updatedataKeyValue.Any(
uedkv =>
uedkv.Value.ToLower() == "starting update" ||
uedkv.Value.ToLower() == "client release"))
.Select(
ue =>
new
{
logDateTimeyyyyMMdd = ue.logDateTime.Date,
logDateTime = ue.logDateTime
})
.GroupBy(
updateDataDetail => updateDataDetail.logDateTimeyyyyMMdd)
.Select(
groupedupdatedata => new
{
UpdateDateyyyyMM = groupedupdatedata.Key.ToString("yyyyMMdd"),
Duration =
(groupedupdatedata.Max(groupMember => groupMember.logDateTime) -
groupedupdatedata.Min(groupMember => groupMember.logDateTime)
)
.TotalMilliseconds
}
).
ToList();
var updatedataMonthlyDurations =
updatedataInDateRangeWithDescriptions.GroupBy(ue => ue.UpdateDateyyyyMM.Substring(0,6))
.Select(
group =>
new updatedataMonthlyAverageDuration
{
DbName = entry.DbName,
UpdateDateyyyyMM = group.Key.Substring(0,6),
Duration =
group.Average(
gmember =>
(gmember.Duration))
}
).ToList();
I know that GroupBy in LINQ isn't the same as GROUP BY in T-SQL, but I'm not sure what happens behind the scenes. Could anyone explain the difference and what happens in memory when I run the LINQ version? After I added .ToList() after the first GroupBy, things got a little faster. But even then, this way of finding the average duration is really slow.
What would be the best alternative, and are there ways of improving a slow LINQ statement using Visual Studio 2012?
Your LINQ query is doing most of its work in LINQ to Objects. You should be constructing a LINQ to Entities/SQL query that generates the complete query in one shot.
Your query seems to have a redundant group by clause, and I am not sure which table dbname comes from, but the following query should get you on the right track.
var query = from mmud in context.updateData
from mmudkv in context.updateDataKeyValue
.Where(x => mmud.updateDataid == x.updateDataId)
.DefaultIfEmpty()
from mmuddk in context.updateDataDetailKey
.Where(x => mmudkv.updateDataDetailKeyid == x.Id)
.DefaultIfEmpty()
where mmud.dbname == "MY_NEW_DB"
where mmudkv.value == "start" || mmudkv.value == "finish"
group mmud by mmud.logDateTime.Date into g
select new
{
Date = g.Key,
Average = EntityFunctions.DiffMilliseconds(g.Min(x => x.logDateTime), g.Max(x => x.logDateTime)),
};
var queryByMonth = from x in query
group x by new { x.Date.Year, x.Date.Month } into x
select new
{
Year = x.Key.Year,
Month = x.Key.Month,
Average = x.Average(y => y.Average)
};
// A single SQL statement is sent to your database
var result = queryByMonth.ToList();
If you are still having problems, we will need to know whether you are using Entity Framework or LINQ to SQL, and you will need to provide your context/model information.
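The two-stage grouping (duration per day, then average per month) can be sanity-checked in memory. This is a minimal LINQ to Objects sketch with hypothetical timestamps. It uses plain DateTime subtraction instead of EntityFunctions.DiffMilliseconds, which is only meaningful inside a LINQ to Entities query.

```csharp
using System;
using System.Linq;

// Hypothetical log timestamps: two days in January, one in February.
var logs = new[]
{
    new DateTime(2013, 1, 5, 10, 0, 0), new DateTime(2013, 1, 5, 10, 0, 2),
    new DateTime(2013, 1, 9, 8, 0, 0),  new DateTime(2013, 1, 9, 8, 0, 4),
    new DateTime(2013, 2, 1, 7, 0, 0),  new DateTime(2013, 2, 1, 7, 0, 1),
};

// Stage 1: duration per day = latest minus earliest timestamp of that day.
var perDay = logs
    .GroupBy(t => t.Date)
    .Select(g => new { Date = g.Key, DurationMs = (g.Max() - g.Min()).TotalMilliseconds });

// Stage 2: average of the daily durations per (year, month).
var perMonth = perDay
    .GroupBy(d => new { d.Date.Year, d.Date.Month })
    .Select(g => new { g.Key.Year, g.Key.Month, AverageMs = g.Average(d => d.DurationMs) })
    .ToList();

foreach (var m in perMonth)
    Console.WriteLine($"{m.Year}-{m.Month:D2}: {m.AverageMs}");
// prints:
// 2013-01: 3000
// 2013-02: 1000
```

The difference from the slow version in the question is where this runs: here everything is in memory by design, while the query above keeps both groupings inside one IQueryable so the database does the work.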

Only parameterless constructors and initializers are supported in LINQ to Entities message

I have a method that returns data from an EF model.
I'm getting the above message, but I can't work out how to circumvent the problem.
public static IEnumerable<FundedCount> GetFundedCount()
{
var today = DateTime.Now;
var daysInMonth = DateTime.DaysInMonth(today.Year, today.Month);
var day1 = DateTime.Now.AddDays(-1);
var day31 = DateTime.Now.AddDays(-31);
using (var uow = new UnitOfWork(ConnectionString.PaydayLenders))
{
var r = new Repository<MatchHistory>(uow.Context);
return r.Find()
.Where(x =>
x.AppliedOn >= day1 && x.AppliedOn <= day31 &&
x.ResultTypeId == (int)MatchResultType.Accepted)
.GroupBy(x => new { x.BuyerId, x.AppliedOn })
.Select(x => new FundedCount(
x.Key.BuyerId,
x.Count() / 30 * daysInMonth))
.ToList();
}
}
FundedCount is not an EF entity, but MatchHistory is, so I can't understand why it is complaining.
All advice appreciated.
The reason it is complaining is because it doesn't know how to translate your Select() into a SQL expression. If you need to do a data transformation to a POCO that is not an entity, you should first get the relevant data from EF and then transform it to the POCO.
In your case it should be as simple as calling ToList() earlier:
return r.Find()
.Where(x => x.AppliedOn >= day1 && x.AppliedOn <= day31 &&
x.ResultTypeId == (int)MatchResultType.Accepted)
.GroupBy(x => new { x.BuyerId, x.AppliedOn })
.ToList() // this causes the query to execute
.Select(x => new FundedCount(x.Key.BuyerId, x.Count() / 30 * daysInMonth));
Be careful with this, though, and make sure that you're limiting the size of the data set returned by ToList() as much as possible so that you're not trying to load an entire table into memory.
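A side note on the projection itself, separate from the constructor error: if the count and daysInMonth are both integers, as they appear to be in the question, then Count() / 30 * daysInMonth uses integer division and drops the fractional part before multiplying. A minimal sketch with hypothetical values:

```csharp
using System;

int count = 45;        // hypothetical number of rows in one group
int daysInMonth = 31;

// Integer arithmetic: 45 / 30 is 1, so the result is 1 * 31.
int truncated = count / 30 * daysInMonth;

// Floating-point arithmetic: 45 / 30.0 is 1.5, so the result is 1.5 * 31 = 46.5.
double scaled = count / 30.0 * daysInMonth;

Console.WriteLine(truncated); // 31
Console.WriteLine(scaled);
```

Whether this matters depends on what the FundedCount constructor expects, which the question does not show.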
The message is clear: LINQ to Entities doesn't support constructing objects without a parameterless constructor.
So:
Solution 1
Enumerate first (or project to an intermediate anonymous type and enumerate that):
.ToList()
.Select(x => new FundedCount(
x.Key.BuyerId,
x.Count() / 30 * daysInMonth))
.ToList();
Solution 2
Add a parameterless constructor to your FundedCount class (if possible):
public FundedCount() {}
and use
.Select(x => new FundedCount{
<Property1> = x.Key.BuyerId,
<Property2> = x.Count() / 30 * daysInMonth
})
.ToList();
It's complaining because it can't convert references to FundedCount to SQL statements.
All LINQ providers convert LINQ statements and expressions to operations that their target can understand. LINQ to SQL and LINQ to EF will convert LINQ to SQL, PLINQ will convert it to Tasks and parallel operations, LINQ to Sharepoint will convert it to CAML etc.
What happens if they can't do the conversion, depends on the provider. Some providers will return intermediate results and convert the rest of the query to a LINQ to Objects query. Others will simply fail with an error message.
Failing with a message is actually a better option when talking to a database. Otherwise the server would have to return all columns to the client when only 1 or 2 would be actually necessary.
In your case you should modify your select to return an anonymous type with the data you want, call ToList() and THEN create the FundedCount objects, eg:
.Select(x => new { Id = x.Key.BuyerId, Count = x.Count() / 30 * daysInMonth })
.ToList()
.Select(y => new FundedCount(y.Id,y.Count))
.ToList();
The first ToList() forces generation of the SQL statement and executes the query, which returns only the data you need. The rest of the query is LINQ to Objects: it takes that data and creates the final objects.
I had the same exception with GroupBy. I found that the message "Only parameterless constructors and initializers are supported in LINQ to Entities" is not a 100% accurate description.
I had a GroupBy() in my LINQ to Entity Framework query that used a struct as the key. That did not work. When I changed the struct to a normal class, everything worked fine.
Code sample
var affectedRegistrationsGrouped = await db.Registrations
.Include(r => r.Person)
.Where(r =>
//whatever
)
.GroupBy(r => new GroupByKey
{
EventId = r.EventId,
SportId = r.SportId.Value
})
.ToListAsync();
...
...
// this does not work
private struct GroupByKey {...}
// this works fine
private class GroupByKey {...}
