SQL Server query with subquery to LINQ query - c#

I wrote a SQL query that returns the ticket count, the closed-ticket count and the closure rate (%), grouped by month for the current year. I would like to express it as a LINQ query that produces the same result.
SELECT *, (ClosedCount * 100 / TicketCount) AS ClosureRate FROM (
SELECT COUNT(Id) as TicketCount, MONTH(InsertDate) as MonthNumber, DATENAME(MONTH, E1.InsertDate) as MonthName,
(SELECT COUNT(Id) FROM EvaluationHistoryTable E2 WHERE TicketStatus = 'CLOSED' AND YEAR(E2.InsertDate) = '2021') AS 'ClosedCount'
FROM EvaluationHistoryTable E1
WHERE YEAR(E1.InsertDate) = 2021
GROUP BY MONTH(InsertDate), DATENAME(MONTH, E1.InsertDate));
This is code that I'm working on:
var ytdClosureRateData = _context.EvaluationHistoryTable
.Where(t => t.InsertDate.Value.Year == DateTime.Now.Year)
.GroupBy(m => new
{
Month = m.InsertDate.Value.Month
})
.Select(g => new YtdTicketClosureRateModel
{
MonthName = DateTimeFormatInfo.CurrentInfo.GetAbbreviatedMonthName(g.Key.Month),
MonthNumber = g.Key.Month,
ItemCount = g.Count(),
ClosedCount = // problem
ClosureRate = // problem
}).AsEnumerable()
.OrderBy(a => a.MonthNumber)
.ToList();
I am having trouble expressing the count of closed tickets (ClosedCount) in LINQ. I need that count to calculate the ClosureRate.

This won't be the same SQL but it should produce the same result in memory:
var ytdClosureRateData = _context.EvaluationHistoryTable
.Where(t => t.InsertDate.Value.Year == DateTime.Now.Year)
.GroupBy(m => new
{
Month = m.InsertDate.Value.Month
})
.Select(g => new
{
Month = g.Key.Month,
ItemCount = g.Count(),
ClosedCount = g.Where(t => t.TicketStatus == "CLOSED").Count()
}).OrderBy(a => a.Month)
.ToList()
.Select(x => new YtdTicketClosureRateModel
{
MonthName = DateTimeFormatInfo.CurrentInfo.GetAbbreviatedMonthName(x.Month),
MonthNumber = x.Month,
ItemCount = x.ItemCount,
ClosedCount = x.ClosedCount,
ClosureRate = x.ClosedCount * 100D / x.ItemCount
})
.ToList();
Two techniques have been implemented here:
1. Use the fluent (method) syntax to specify the filter to apply for the ClosedCount set. You can combine fluent and query syntax to your heart's content; each has pros and cons, and in this instance the fluent form simply keeps the syntax shorter.
2. Focus the DB query on bringing back only the data that you need; the rest can easily be calculated in memory after the initial DB execution. That is why there are two projections here: the first should be expressed purely in SQL, and the second is evaluated as LINQ to Objects.
The general assumption is that traffic over the wire and serialization are the usual bottlenecks for simple queries like this, so we force LINQ to Entities (or LINQ to SQL) to produce the smallest payload that is practical and build the rest of the values and calculations in memory.
UPDATE:
Svyatoslav Danyliv makes a really good point in this answer
The logic can be simplified, from both an SQL and LINQ perspective by using a CASE expression on the TicketStatus to return 1 or 0 and then we can simply sum that column, which means you can avoid a nested query and can simply join on the results.

Original query can be simplified to this one:
SELECT *,
(ClosedCount * 100 / TicketCount) AS ClosureRate
FROM (
SELECT
COUNT(Id) AS TicketCount,
MONTH(InsertDate) AS MonthNumber,
DATENAME(MONTH, E1.InsertDate) AS MonthName,
SUM(CASE WHEN TicketStatus = 'CLOSED' THEN 1 ELSE 0 END) AS 'ClosedCount'
FROM EvaluationHistoryTable E1
WHERE YEAR(E1.InsertDate) = 2021
GROUP BY MONTH(InsertDate), DATENAME(MONTH, E1.InsertDate)) AS Monthly;
Which is easily convertible to server-side LINQ:
var grouped =
from eh in _context.EvaluationHistoryTable
where eh.InsertDate.Value.Year == DateTime.Now.Year
group eh by new { eh.InsertDate.Value.Month } into g
select new
{
g.Key.Month,
ItemCount = g.Count(),
ClosedCount = g.Sum(t => t.TicketStatus == "CLOSED" ? 1 : 0)
};
var query =
from x in grouped
orderby x.Month
select new YtdTicketClosureRateModel
{
MonthName = DateTimeFormatInfo.CurrentInfo.GetAbbreviatedMonthName(x.Month),
MonthNumber = x.Month,
ItemCount = x.ItemCount,
ClosedCount = x.ClosedCount,
ClosureRate = x.ClosedCount * 100D / x.ItemCount
};
var result = query.ToList();
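The conditional-sum idea can be tried without a database. The following is a minimal sketch that runs the same grouping over an in-memory list with LINQ to Objects; the `Ticket` and `MonthlyClosure` records and the `ClosureRateSketch` class are hypothetical names standing in for the entity and model from the question, and the sketch says nothing about the SQL a particular provider would generate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity and the result model in the question.
public record Ticket(DateTime InsertDate, string TicketStatus);
public record MonthlyClosure(int MonthNumber, int ItemCount, int ClosedCount, double ClosureRate);

public static class ClosureRateSketch
{
    // Groups tickets by month and counts closed ones with a conditional sum,
    // mirroring SUM(CASE WHEN TicketStatus = 'CLOSED' THEN 1 ELSE 0 END).
    public static List<MonthlyClosure> Compute(IEnumerable<Ticket> tickets, int year) =>
        tickets
            .Where(t => t.InsertDate.Year == year)
            .GroupBy(t => t.InsertDate.Month)
            .Select(g => new
            {
                Month = g.Key,
                ItemCount = g.Count(),
                ClosedCount = g.Sum(t => t.TicketStatus == "CLOSED" ? 1 : 0)
            })
            .OrderBy(x => x.Month)
            // 100D forces floating-point division so the rate keeps its fraction.
            .Select(x => new MonthlyClosure(x.Month, x.ItemCount, x.ClosedCount,
                                            x.ClosedCount * 100D / x.ItemCount))
            .ToList();
}
```

Calling `ClosureRateSketch.Compute(tickets, 2021)` returns one row per month that has tickets. Unlike the SQL above, where `ClosedCount * 100 / TicketCount` is integer division, the rate here is a `double`.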

Related

How do I implement tsql's sum(...) over () in linq to SQL?

I have this functional t-sql query that counts the entries in a group by clause, and at the same time produces a percentage of the count compared to the entire set.
It is blazing fast (~90 ms) in Azure. I'd like to implement in a similar manner with LINQ to SQL, but I can't figure it out...
select f.worktype, f.counted, (100.0 * f.counted)/ (sum(f.counted) over ()) as percentage
from
(SELECT
wa.skillEN AS workType,
count(wa.skillEN) counted
FROM [dbo].WorkAssignments as WA
join [dbo].WorkOrders as WO ON (WO.ID = WA.workorderID)
WHERE wo.dateTimeOfWork < ('1/1/2014')
and wo.dateTimeOfWork > ('1/1/2013')
and wo.statusEN = 'Completed'
group by wa.skillEN) as f
group by f.worktype, f.counted
The LINQ I've been trying in LINQPad...
WorkAssignments
.Where(wa => wa.WorkOrder.DateTimeofWork > DateTime.Now.AddYears(-2)
&& wa.WorkOrder.DateTimeofWork < DateTime.Now)
.GroupBy(wa => wa.SkillEN)
.Select(g => new
{
label = g.Key,
count = g.Count()
})
.GroupBy(g => new {g.label, g.count})
.Select(gg => new
{
label = gg.Key.label,
count = gg.Key.count,
pct = gg.Sum(a => a.count)
})
(The dates in the where clause are slightly different, but I don't think it's relevant)
So, how would I implement the over () feature in LINQ to SQL?
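One way the calculation is sometimes approached, sketched here against in-memory data only, is to compute the grouped counts first, then the grand total, then divide. `PercentageSketch` and its input (a plain sequence of skill names) are hypothetical, and whether a LINQ to SQL provider would turn an equivalent query into a single statement with a window function is not something this sketch demonstrates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PercentageSketch
{
    // Returns (label, count, percentage of all counted rows) per skill.
    public static List<(string Label, int Count, double Pct)> Compute(IEnumerable<string> skills)
    {
        // First pass: one row per skill with its count (the inner GROUP BY).
        var counted = skills
            .GroupBy(s => s)
            .Select(g => new { Label = g.Key, Count = g.Count() })
            .ToList();

        // Grand total over every group: the role played by SUM(counted) OVER ().
        double total = counted.Sum(c => c.Count);

        // Second pass: attach each group's share of the total.
        return counted
            .Select(c => (Label: c.Label, Count: c.Count, Pct: 100.0 * c.Count / total))
            .ToList();
    }
}
```

Because the grouped result is small (one row per skill), the second pass is cheap even when it runs in memory after the grouped counts have been fetched.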

Slow EF query grouping data by Month/Year

I have a record set of approximately 1 million records. I'm trying to query the records to report monthly figures.
The following MySQL query executes in about 0.3 seconds
SELECT SUM(total), MONTH(create_datetime), YEAR(create_datetime)
FROM orders GROUP BY MONTH(create_datetime), YEAR(create_datetime)
However, I am unable to figure out an Entity Framework lambda expression that executes anywhere near as fast.
The only statement I have come up with that actually works is
var monthlySales = db.Orders
.Select(c => new
{
Total = c.Total,
CreateDateTime = c.CreateDateTime
})
.GroupBy(c => new { c.CreateDateTime.Year, c.CreateDateTime.Month })
.Select(c => new
{
CreateDateTime = c.FirstOrDefault().CreateDateTime,
Total = c.Sum(d => d.Total)
})
.OrderBy(c => c.CreateDateTime)
.ToList();
But it is horribly slow.
How can I get this query to execute as quickly as it does directly in MySQL?
When you call .ToList() in the middle of a query (before the grouping), EF pulls all orders from the database into memory and then does the grouping in C#. Depending on the amount of data in your table, that can take a while, and I think this is why your query is so slow.
Try to rewrite your query so that only one expression enumerates the results (ToList, ToArray, AsEnumerable).
Try this:
var monthlySales = (from c in db.Orders
group c by new { y = c.CreateDateTime.Year, m = c.CreateDateTime.Month } into g
select new {
Total = g.Sum(t => t.Total),
Year = g.Key.y,
Month = g.Key.m }).ToList();
I came across this setup which executes quickly
var monthlySales = db.Orders
.GroupBy(c => new { Year = c.CreateDateTime.Year, Month = c.CreateDateTime.Month })
.Select(c => new
{
Month = c.Key.Month,
Year = c.Key.Year,
Total = c.Sum(d => d.Total)
})
.OrderByDescending(a => a.Year)
.ThenByDescending(a => a.Month)
.ToList();

LINQ Multiple GroupBy Query Performing several times slower than T-SQL

I'm totally new to LINQ.
I have an SQL GroupBy which runs in barely a few milliseconds. But when I try to achieve the same thing via LINQ, it just seems awfully slow.
What I'm trying to achieve is to fetch the average monthly duration of a certain database update.
In SQL =>
select SUBSTRING(yyyyMMdd, 0,7),
AVG (duration)
from (select (CONVERT(CHAR(8), mmud.logDateTime, 112)) as yyyyMMdd,
DateDIFF(ms, min(mmud.logDateTime), max(mmud.logDateTime)) as duration
from mydb.mydbo.updateData mmud
left
join mydb.mydbo.updateDataKeyValue mmudkv
on mmud.updateDataid = mmudkv.updateDataId
left
join mydb.mydbo.updateDataDetailKey mmuddk
on mmudkv.updateDataDetailKeyid = mmuddk.Id
where dbname = 'MY_NEW_DB'
and mmudkv.value in ('start', 'finish')
group
by (CONVERT(CHAR(8), mmud.logDateTime, 112))
) as resultSet
group
by substring(yyyyMMdd, 0,7)
order
by substring(yyyyMMdd, 0,7)
in LINQ => I first fetch the record from a table that links information of the Database Name and UpdateData and then do filtering and groupby on the related information.
var updatedataInDateRangeWithDescriptions = entry.updatedata.Where(
ue => ue.updatedataKeyValue.Any(
uedkv =>
uedkv.Value.ToLower() == "starting update" ||
uedkv.Value.ToLower() == "client release"))
.Select(
ue =>
new
{
logDateTimeyyyyMMdd = ue.logDateTime.Date,
logDateTime = ue.logDateTime
})
.GroupBy(
updateDataDetail => updateDataDetail.logDateTimeyyyyMMdd)
.Select(
groupedupdatedata => new
{
UpdateDateyyyyMM = groupedupdatedata.Key.ToString("yyyyMMdd"),
Duration =
(groupedupdatedata.Max(groupMember => groupMember.logDateTime) -
groupedupdatedata.Min(groupMember => groupMember.logDateTime)
)
.TotalMilliseconds
}
).
ToList();
var updatedataMonthlyDurations =
updatedataInDateRangeWithDescriptions.GroupBy(ue => ue.UpdateDateyyyyMM.Substring(0,6))
.Select(
group =>
new updatedataMonthlyAverageDuration
{
DbName = entry.DbName,
UpdateDateyyyyMM = group.Key.Substring(0,6),
Duration =
group.Average(
gmember =>
(gmember.Duration))
}
).ToList();
I know that GroupBy in LINQ isn't the same as GroupBy in T-SQL, but not sure what happens behind the scenes. Could anyone explain the difference and what happens in memory when I run the LINQ version? After I did the .ToList() after the first GroupBy things got a little faster. But even then this way of finding average duration is really slow.
What would be the best alternative and are there ways of improving a slow LINQ statement using Visual Studio 2012?
Your linq query is doing most of its work in linq-to-objects. You should be constructing a linq-to-entities/sql query that generates the complete query in one shot.
Your query seems to have a redundant group by clause, and I am not sure which table dbname comes from, but the following query should get you on the right track.
var query = from mmud in context.updateData
from mmudkv in context.updateDataKeyValue
.Where(x => mmud.updateDataid == x.updateDataId)
.DefaultIfEmpty()
from mmuddk in context.updateDataDetailKey
.Where(x => mmudkv.updateDataDetailKeyid == x.Id)
.DefaultIfEmpty()
where mmud.dbname == "MY_NEW_DB"
where mmudkv.value == "start" || mmudkv.value == "finish"
group mmud by mmud.logDateTime.Date into g
select new
{
Date = g.Key,
Average = EntityFunctions.DiffMilliseconds(g.Min(x => x.logDateTime), g.Max(x => x.logDateTime)),
};
var queryByMonth = from x in query
group x by new { x.Date.Year, x.Date.Month } into x
select new
{
Year = x.Key.Year,
Month = x.Key.Month,
Average = x.Average(y => y.Average)
};
// A single SQL statement is sent to your database
var result = queryByMonth.ToList();
If you are still having problems, we will need to know whether you are using Entity Framework or LINQ to SQL, and you will need to provide your context/model information.
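To check the two-level grouping logic independently of the database, a minimal in-memory sketch may help. It takes a plain sequence of log timestamps (a hypothetical simplification of the joined tables), computes each day's duration as last minus first entry, then averages those durations per month. It runs as LINQ to Objects, so it illustrates the shape of the calculation only, not the server-side translation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MonthlyDurationSketch
{
    // Per day: milliseconds between the first and last log entry.
    // Per month: the average of those daily durations.
    public static List<(int Year, int Month, double AverageMs)> Compute(IEnumerable<DateTime> logTimes) =>
        logTimes
            .GroupBy(t => t.Date)
            .Select(day => new
            {
                Date = day.Key,
                DurationMs = (day.Max() - day.Min()).TotalMilliseconds
            })
            .GroupBy(d => new { d.Date.Year, d.Date.Month })
            .Select(m => (Year: m.Key.Year, Month: m.Key.Month,
                          AverageMs: m.Average(d => d.DurationMs)))
            .OrderBy(r => r.Year).ThenBy(r => r.Month)
            .ToList();
}
```

A day with a single log entry contributes a duration of zero, which pulls the monthly average down; filtering such days out first is a design choice the original query leaves open.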

EF issue with grouping on server

I ran into this problem when I was doing group by with entity framework.
.Net: 4.5, EF: 5.0, Database: Oracle
My problem was when I was grouping on the server and getting back the data, the grouped data (list of entities) was returning the first record over and over for all the grouped data - but the group KEY was correct.
If I don't do a group by the records return as expected, but I have some grouping requirements and my workaround is ... yeah not making me feel that good and the code should work... but it does not.
x.D is a string; the rest are a mix of integers and strings.
Here is the code that did not work:
db.ENTITY_NAME
.Where(x =>
wantedGs.Contains(x.G) &&
wantedAs.Contains(x.A)
)
.GroupBy(x => x.D)
.ToList()
.Select(x => x.FirstOrDefault())
.Select(x => new MyEntity
{
A = x.A,
B = x.B,
C = x.C,
E = x.E,
D = x.D,
F = x.F,
G = x.G
})
.ToList();
Here is the workaround that does what I want:
db.ENTITY_NAME
.Where(x =>
wantedGs.Contains(x.G) &&
wantedAs.Contains(x.A)
)
.Select(x => new
{
x.A,
x.B,
x.C,
x.D,
x.E,
x.F,
x.G
})
.ToList()
.GroupBy(x => x.D)
.Select(x => x.FirstOrDefault())
.Select(x => new MyEntity
{
A = x.A,
B = x.B,
C = x.C,
E = x.E,
D = x.D,
F = x.F,
G = x.G
})
.ToList();
Try the following. If it doesn't work, please post some sample data that shows the problem:
db.ENTITY_NAME
.Where(x =>
wantedGs.Contains(x.G) &&
wantedAs.Contains(x.A)
)
.GroupBy(x => x.D)
.Select(x => x.FirstOrDefault())
.AsEnumerable()
.Select(x => new MyEntity
{
A = x.A,
B = x.B,
C = x.C,
E = x.E,
D = x.D,
F = x.F,
G = x.G
})
.ToList();
I find LINQPad useful in diagnosing this sort of problem. Querying against an Oracle table and switching from the Results tab to the SQL tab, notice how the first example results in one initial SQL select, followed by multiple subsequent select statements that are not going to be useful in achieving the proper grouping required. Looks like a bug to me.
This problem appears to be Oracle-specific (possibly particular client versions). A similar GroupBy on a Microsoft SQL Express database gave the correct results, although there were also multiple SQL selects.
It seems we need to be careful when using GroupBy on database connections; it can be both quicker and more accurate to evaluate early (e.g. conversion to a list) so that we're using LINQ to data from that point on.
Update with repro case:
First the Oracle (9i) table creation and row insertion:
create table payees (
name varchar2(10),
amount number(5));
insert into payees values ('JACK', 150);
insert into payees values ('BARRY', 100);
insert into payees values ('EMMA', 20);
insert into payees values ('FLAVIA', 15);
insert into payees values ('SYLVIA', 300);
commit;
The good and bad LINQ statements (using Oracle 9i client):
var good = Payees.ToList().GroupBy(p => p.Amount / 100);
var bad = Payees.GroupBy(p => p.Amount / 100);
An example of a query I'd anticipated an intelligent LINQ to Oracle driver to use:
select trunc(amount/100) pay_category, name, amount
from payees
order by pay_category;
PAY_CATEGORY NAME           AMOUNT
------------ ---------- ----------
           0 EMMA               20
           0 FLAVIA             15
           1 JACK              150
           1 BARRY             100
           3 SYLVIA            300
The actual strange queries LINQPad reports in the SQL tab, resulting in no useful grouping at all:
SELECT t0.AMOUNT
FROM GENSYS.PAYEES t0
GROUP BY t0.AMOUNT
SELECT t0.AMOUNT, t0.NAME
FROM GENSYS.PAYEES t0
WHERE ((t0.AMOUNT IS NULL AND :n0 IS NULL) OR (t0.AMOUNT = :n0))
-- n0 = [15]
SELECT t0.AMOUNT, t0.NAME
FROM GENSYS.PAYEES t0
WHERE ((t0.AMOUNT IS NULL AND :n0 IS NULL) OR (t0.AMOUNT = :n0))
-- n0 = [20]
SELECT t0.AMOUNT, t0.NAME
FROM GENSYS.PAYEES t0
WHERE ((t0.AMOUNT IS NULL AND :n0 IS NULL) OR (t0.AMOUNT = :n0))
-- n0 = [100]
SELECT t0.AMOUNT, t0.NAME
FROM GENSYS.PAYEES t0
WHERE ((t0.AMOUNT IS NULL AND :n0 IS NULL) OR (t0.AMOUNT = :n0))
-- n0 = [150]
SELECT t0.AMOUNT, t0.NAME
FROM GENSYS.PAYEES t0
WHERE ((t0.AMOUNT IS NULL AND :n0 IS NULL) OR (t0.AMOUNT = :n0))
-- n0 = [300]
I may be expecting too much of LINQ to SQL though. (My LINQPad reports the LINQPad driver is IQ V2.0.7.0, if that helps).

Write a comparable LINQ query for aggregate distinct count in sql?

I want to get a count for each month, but each day should be counted at most once even if there are multiple occurrences. I have a SQL query that works, but I am having trouble converting it into LINQ:
select
count(DISTINCT DAY(date)) as Monthly_Count,
MONTH(date) as Month,
YEAR(date)
from
activity
where
id = @id
group by
YEAR(date),
MONTH(date)
Could anyone help me translate the above query to LINQ? Thanks!
Per "LINQ to SQL using GROUP BY and COUNT(DISTINCT)" given by @Rick, this should work:
var query = from act in db.Activity
where act.Id == id
group act by new { act.Date.Year, act.Date.Month } into g
select new
{
MonthlyCount = g.Select(act => act.Date.Day).Distinct().Count(),
Month = g.Key.Month,
Year = g.Key.Year
};
I don't know if L2S can convert the inner g.Select(act => act.Date.Day).Distinct().Count() properly.
var results = db.activities.Where(a => a.id == myID)
.GroupBy(a => new
{
Month = a.date.Month,
Year = a.date.Year
})
.Select(g => new
{
Month = g.Key.Month,
Year = g.Key.Year,
Monthly_Count = g.Select(d => d.date.Day)
.Distinct()
.Count()
});
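The distinct-day count itself can be verified in memory before worrying about how a provider translates it. This is a minimal sketch over a plain sequence of dates; `DistinctDaySketch` is a hypothetical name, and the code runs as LINQ to Objects, so it does not settle the question of whether LINQ to SQL converts the inner `Distinct().Count()` into `COUNT(DISTINCT ...)`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctDaySketch
{
    // For each (year, month), counts how many different days have at least one entry.
    public static List<(int Year, int Month, int MonthlyCount)> Compute(IEnumerable<DateTime> dates) =>
        dates
            .GroupBy(d => new { d.Year, d.Month })
            .Select(g => (Year: g.Key.Year, Month: g.Key.Month,
                          // Several entries on the same day collapse to one via Distinct().
                          MonthlyCount: g.Select(d => d.Day).Distinct().Count()))
            .OrderBy(r => r.Year).ThenBy(r => r.Month)
            .ToList();
}
```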
