I ran into this problem when doing a group by with Entity Framework.
.NET: 4.5, EF: 5.0, Database: Oracle
When I grouped on the server and fetched the data, each group's contents (a list of entities) repeated the first record over and over, although the group KEY was correct.
Without the group by, the records come back as expected, but I do have grouping requirements. My workaround does not make me feel good: the original code should work, but it does not.
x.D is a string; the remaining properties are a mix of integers and strings.
Here is the code that did not work:
db.ENTITY_NAME
.Where(x =>
wantedGs.Contains(x.G) &&
wantedAs.Contains(x.A)
)
.GroupBy(x => x.D)
.ToList()
.Select(x => x.FirstOrDefault())
.Select(x => new MyEntity
{
A = x.A,
B = x.B,
C = x.C,
E = x.E,
D = x.D,
F = x.F,
G = x.G
})
.ToList();
Here is the workaround that does what I want:
db.ENTITY_NAME
.Where(x =>
wantedGs.Contains(x.G) &&
wantedAs.Contains(x.A)
)
.Select(x => new
{
x.A,
x.B,
x.C,
x.D,
x.E,
x.F,
x.G
})
.ToList()
.GroupBy(x => x.D)
.Select(x => x.FirstOrDefault())
.Select(x => new MyEntity
{
A = x.A,
B = x.B,
C = x.C,
E = x.E,
D = x.D,
F = x.F,
G = x.G
})
.ToList();
Try the following. If it doesn't work, please post some sample data that shows the problem:
db.ENTITY_NAME
.Where(x =>
wantedGs.Contains(x.G) &&
wantedAs.Contains(x.A)
)
.GroupBy(x => x.D)
.Select(x => x.FirstOrDefault())
.AsEnumerable()
.Select(x => new MyEntity
{
A = x.A,
B = x.B,
C = x.C,
E = x.E,
D = x.D,
F = x.F,
G = x.G
})
.ToList();
I find LINQPad useful in diagnosing this sort of problem. Querying against an Oracle table and switching from the Results tab to the SQL tab, notice how the first example results in one initial SQL select, followed by multiple subsequent select statements that are not going to be useful in achieving the proper grouping required. Looks like a bug to me.
This problem appears to be Oracle-specific (possibly particular client versions). A similar GroupBy on a Microsoft SQL Express database gave the correct results, although there were also multiple SQL selects.
It seems we need to be careful when using GroupBy on database connections; it can be both quicker and more accurate to evaluate early (e.g. by converting to a list) so that we're using LINQ to Objects from that point on.
Update with repro case:
First the Oracle (9i) table creation and row insertion:
create table payees (
name varchar2(10),
amount number(5));
insert into payees values ('JACK', 150);
insert into payees values ('BARRY', 100);
insert into payees values ('EMMA', 20);
insert into payees values ('FLAVIA', 15);
insert into payees values ('SYLVIA', 300);
commit;
The good and bad LINQ statements (using Oracle 9i client):
var good = Payees.ToList().GroupBy(p => p.Amount / 100);
var bad = Payees.GroupBy(p => p.Amount / 100);
An example of a query I'd anticipated an intelligent LINQ to Oracle driver to use:
select trunc(amount/100) pay_category, name, amount
from payees
order by pay_category;
PAY_CATEGORY NAME AMOUNT
------------ ---------- ----------
0 EMMA 20
0 FLAVIA 15
1 JACK 150
1 BARRY 100
3 SYLVIA 300
The actual strange queries LINQPad reports in the SQL tab, resulting in no useful grouping at all:
SELECT t0.AMOUNT
FROM GENSYS.PAYEES t0
GROUP BY t0.AMOUNT
SELECT t0.AMOUNT, t0.NAME
FROM GENSYS.PAYEES t0
WHERE ((t0.AMOUNT IS NULL AND :n0 IS NULL) OR (t0.AMOUNT = :n0))
-- n0 = [15]
SELECT t0.AMOUNT, t0.NAME
FROM GENSYS.PAYEES t0
WHERE ((t0.AMOUNT IS NULL AND :n0 IS NULL) OR (t0.AMOUNT = :n0))
-- n0 = [20]
SELECT t0.AMOUNT, t0.NAME
FROM GENSYS.PAYEES t0
WHERE ((t0.AMOUNT IS NULL AND :n0 IS NULL) OR (t0.AMOUNT = :n0))
-- n0 = [100]
SELECT t0.AMOUNT, t0.NAME
FROM GENSYS.PAYEES t0
WHERE ((t0.AMOUNT IS NULL AND :n0 IS NULL) OR (t0.AMOUNT = :n0))
-- n0 = [150]
SELECT t0.AMOUNT, t0.NAME
FROM GENSYS.PAYEES t0
WHERE ((t0.AMOUNT IS NULL AND :n0 IS NULL) OR (t0.AMOUNT = :n0))
-- n0 = [300]
I may be expecting too much of LINQ to SQL though. (My LINQPad reports the LINQPad driver is IQ V2.0.7.0, if that helps).
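For comparison, here is a minimal in-memory sketch of the "good" variant, using the five rows from the repro table. It assumes Amount maps to decimal, so the truncation that trunc(amount/100) performs in SQL is written out explicitly; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PayeeGroupingSketch
{
    // Sample rows copied from the repro table above.
    public static readonly (string Name, decimal Amount)[] Payees =
    {
        ("JACK", 150m), ("BARRY", 100m), ("EMMA", 20m), ("FLAVIA", 15m), ("SYLVIA", 300m)
    };

    // Groups in memory (LINQ to Objects) by the truncated hundreds bucket,
    // which mirrors trunc(amount/100) in the hand-written SQL.
    public static List<(int PayCategory, string[] Names)> GroupByCategory(
        IEnumerable<(string Name, decimal Amount)> rows)
    {
        return rows
            .GroupBy(p => (int)Math.Truncate(p.Amount / 100m))
            .OrderBy(g => g.Key)
            .Select(g => (PayCategory: g.Key, Names: g.Select(p => p.Name).ToArray()))
            .ToList();
    }

    public static void Main()
    {
        foreach (var (category, names) in GroupByCategory(Payees))
            Console.WriteLine($"{category}: {string.Join(", ", names)}");
    }
}
```

With the sample rows this yields the buckets 0, 1 and 3, the same categories as the PAY_CATEGORY column above.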
Related
I wrote a SQL query that gets the count of tickets, the count of closed tickets and the closure rate (%), grouped by month for the current year, and I would like to express it as a LINQ query that achieves the same result.
SELECT *, (ClosedCount * 100 / TicketCount) AS ClosureRate FROM (
SELECT COUNT(Id) as TicketCount, MONTH(InsertDate) as MonthNumber, DATENAME(MONTH, E1.InsertDate) as MonthName,
(SELECT COUNT(Id) FROM EvaluationHistoryTable E2 WHERE TicketStatus = 'CLOSED' AND YEAR(E2.InsertDate) = '2021') AS 'ClosedCount'
FROM EvaluationHistoryTable E1
WHERE YEAR(E1.InsertDate) = 2021
GROUP BY MONTH(InsertDate), DATENAME(MONTH, E1.InsertDate));
This is code that I'm working on:
var ytdClosureRateData = _context.EvaluationHistoryTable
.Where(t => t.InsertDate.Value.Year == DateTime.Now.Year)
.GroupBy(m => new
{
Month = m.InsertDate.Value.Month
})
.Select(g => new YtdTicketClosureRateModel
{
MonthName = DateTimeFormatInfo.CurrentInfo.GetAbbreviatedMonthName(g.Key.Month),
MonthNumber = g.Key.Month,
ItemCount = g.Count(),
ClosedCount = // problem
ClosureRate = // problem
}).AsEnumerable()
.OrderBy(a => a.MonthNumber)
.ToList();
I am having trouble expressing the count of closed tickets (ClosedCount) in LINQ; I need that count to calculate the ClosureRate.
This won't be the same SQL but it should produce the same result in memory:
var ytdClosureRateData = _context.EvaluationHistoryTable
.Where(t => t.InsertDate.Value.Year == DateTime.Now.Year)
.GroupBy(m => new
{
Month = m.InsertDate.Value.Month
})
.Select(g => new
{
Month = g.Key.Month,
ItemCount = g.Count(),
ClosedCount = g.Where(t => t.TicketStatus == "CLOSED").Count()
}).OrderBy(a => a.Month)
.ToList()
.Select(x => new YtdTicketClosureRateModel
{
MonthName = DateTimeFormatInfo.CurrentInfo.GetAbbreviatedMonthName(x.Month),
MonthNumber = x.Month,
ItemCount = x.ItemCount,
ClosedCount = x.ClosedCount,
ClosureRate = x.ClosedCount * 100D / x.ItemCount
})
.ToList();
Two techniques have been implemented here:
Use fluent (method) syntax to specify the filter for the ClosedCount set. You can combine fluent and query syntax to your heart's content; each has pros and cons, and in this instance the fluent form simply made the syntax simpler.
Focus the DB query on bringing back only the data that you need; the rest can easily be calculated in memory after the initial DB execution. That is why there are two projections here: the first should be expressed purely in SQL, and the second is evaluated as LINQ to Objects.
The general assumption is that traffic over the wire and serialization are the usual bottlenecks for simple queries like this, so we force LINQ to Entities (or LINQ to SQL) to produce the smallest payload that is practical and build the rest of the values and calculations in memory.
UPDATE:
Svyatoslav Danyliv makes a really good point in this answer
The logic can be simplified, from both an SQL and LINQ perspective by using a CASE expression on the TicketStatus to return 1 or 0 and then we can simply sum that column, which means you can avoid a nested query and can simply join on the results.
Original query can be simplified to this one:
SELECT *,
(ClosedCount * 100 / TicketCount) AS ClosureRate
FROM (
SELECT
COUNT(Id) AS TicketCount,
MONTH(InsertDate) AS MonthNumber,
DATENAME(MONTH, E1.InsertDate) AS MonthName,
SUM(CASE WHEN TicketStatus = 'CLOSED' THEN 1 ELSE 0 END) AS 'ClosedCount'
FROM EvaluationHistoryTable E1
WHERE YEAR(E1.InsertDate) = 2021
GROUP BY MONTH(InsertDate), DATENAME(MONTH, E1.InsertDate)) AS T;
Which is easily convertible to server-side LINQ:
var grouped =
from eh in _context.EvaluationHistoryTable
where eh.InsertDate.Value.Year == DateTime.Now.Year
group eh by new { eh.InsertDate.Value.Month } into g
select new
{
g.Key.Month,
ItemCount = g.Count(),
ClosedCount = g.Sum(t => t.TicketStatus == "CLOSED" ? 1 : 0)
};
var query =
from x in grouped
orderby x.Month
select new YtdTicketClosureRateModel
{
MonthName = DateTimeFormatInfo.CurrentInfo.GetAbbreviatedMonthName(x.Month),
MonthNumber = x.Month,
ItemCount = x.ItemCount,
ClosedCount = x.ClosedCount,
ClosureRate = x.ClosedCount * 100D / x.ItemCount
};
var result = query.ToList();
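To check the conditional-sum logic without a database, the same shape can be run against an in-memory list. This is only a sketch: the Ticket class is a hypothetical stand-in for a row of EvaluationHistoryTable, InsertDate is assumed non-nullable here, and the invariant culture is used so that the month names are predictable.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class ClosureRateSketch
{
    // Hypothetical stand-in for a row of EvaluationHistoryTable.
    public sealed class Ticket
    {
        public DateTime InsertDate { get; set; }
        public string TicketStatus { get; set; }
    }

    public static List<(int MonthNumber, string MonthName, int ItemCount, int ClosedCount, double ClosureRate)>
        Summarize(IEnumerable<Ticket> tickets, int year)
    {
        return tickets
            .Where(t => t.InsertDate.Year == year)
            .GroupBy(t => t.InsertDate.Month)
            // First projection: only counts, the part that should stay in SQL.
            .Select(g => new
            {
                Month = g.Key,
                ItemCount = g.Count(),
                // Conditional sum; g.Count(t => t.TicketStatus == "CLOSED") gives the same number.
                ClosedCount = g.Sum(t => t.TicketStatus == "CLOSED" ? 1 : 0)
            })
            .OrderBy(x => x.Month)
            // Second projection: formatting and the rate calculation.
            .Select(x => (
                MonthNumber: x.Month,
                MonthName: CultureInfo.InvariantCulture.DateTimeFormat.GetAbbreviatedMonthName(x.Month),
                ItemCount: x.ItemCount,
                ClosedCount: x.ClosedCount,
                ClosureRate: x.ClosedCount * 100D / x.ItemCount))
            .ToList();
    }

    // Made-up sample data: 2 tickets in January (1 closed), 4 in February (3 closed).
    public static List<(int MonthNumber, string MonthName, int ItemCount, int ClosedCount, double ClosureRate)> SampleRun()
    {
        var tickets = new List<Ticket>
        {
            new Ticket { InsertDate = new DateTime(2020, 12, 31), TicketStatus = "CLOSED" },
            new Ticket { InsertDate = new DateTime(2021, 1, 5), TicketStatus = "CLOSED" },
            new Ticket { InsertDate = new DateTime(2021, 1, 20), TicketStatus = "OPEN" },
            new Ticket { InsertDate = new DateTime(2021, 2, 1), TicketStatus = "CLOSED" },
            new Ticket { InsertDate = new DateTime(2021, 2, 2), TicketStatus = "CLOSED" },
            new Ticket { InsertDate = new DateTime(2021, 2, 3), TicketStatus = "CLOSED" },
            new Ticket { InsertDate = new DateTime(2021, 2, 4), TicketStatus = "OPEN" }
        };
        return Summarize(tickets, 2021);
    }

    public static void Main()
    {
        foreach (var row in SampleRun())
            Console.WriteLine($"{row.MonthName}: {row.ClosedCount}/{row.ItemCount} = {row.ClosureRate}%");
    }
}
```

Multiplying by 100D before dividing keeps the rate as a double, so the integer division that the SQL version performs does not happen here.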
I'm trying to convert a sql stored proc to linq. I'm having issues with the groupby and inner joins.
Here is what I've tried:
var r = _context.Table1
.GroupBy(x => new { x.OptionId, x.Years, x.Strike })
.Join(_context.Table2,
oc => oc.OptionId, o => o.OptionId, (oc, o) => new
{
OptionsCosts = oc,
Options = o
}).Where(x => x.Options.OptionType == 1
&& x.Options.QualifierId != null
&& x.Options.CreditingMethod != "xxx")
.Select(y => new DataModel.Table1()
{
Years = y.Select(a => a.OptionsCosts.Years).FirstOrDefault(),
Strike = y.Select(a => a.OptionsCosts.Strike).FirstOrDefault(),
Value = y.Select(a => a.OptionsCosts.Value).FirstOrDefault(),
ChangeUser = y.Select(a => a.OptionsCosts.ChangeUser).FirstOrDefault(),
ChangeDate = DateTime.Now,
OptionId = y.Select(a => a.OptionsCosts.OptionId).FirstOrDefault()
});
Here is the SQL that I'm trying to convert:
SELECT o2.OptionId, o2.Years, o2.Strike, SUM(d2.Weights) as 'TotalWeight', COUNT(*) as 'Counts'
FROM Table1 o2
INNER JOIN #Dates d2 --this is a temp table that just holds dates. I was thinking just a where statement could do it???
ON d2.EffectiveDate = o2.EffectiveDate
INNER JOIN Table2 od2
ON od2.OptionId = o2.OptionId
AND od2.OptionType = 1
AND od2.qualifierid is null
AND od2.CreditingMethod <> 'xxx' --28095
GROUP BY o2.OptionId,o2.Years, o2.Strike
My data is way off so I'm sure I'm doing something wrong.
// Group Table1 first, then join the grouped rows to Table2 in memory.
var table1 = _context.Table1
.GroupBy(o2 => new
{
o2.OptionId,
o2.Years,
o2.Strike
})
.Select(s => new
{
s.Key.OptionId,
s.Key.Years,
s.Key.Strike,
TotalWeight = s.Sum(x => x.Weights),
Counts = s.Count()
}).ToList();
var result = table1
.Join(_context.Table2, oc => oc.OptionId, o => o.OptionId, (oc, o) => new { OptionsCosts = oc, Options = o })
.Where(x => x.Options.OptionType == 1
&& x.Options.QualifierId == null // the SQL filters on qualifierid IS NULL
&& x.Options.CreditingMethod != "xxx")
.Select(x => new
{
x.OptionsCosts.OptionId, x.OptionsCosts.Years, x.OptionsCosts.Strike, x.OptionsCosts.TotalWeight, x.OptionsCosts.Counts
}).ToList();
A small piece of advice: when rewriting SQL queries, use LINQ query syntax, which is close to SQL and makes errors easier to avoid.
var dates = new List<DateTime>() { DateTime.Now }; // fill list
var query =
from o2 in _context.Table1
where dates.Contains(o2.EffectiveDate)
from od2 in _context.Table2.Where(od2 => // another way to join
od2.OptionId == o2.OptionId
&& od2.OptionType == 1
&& od2.QualifierId == null
&& od2.CreditingMethod != "xxx")
group o2 by new { o2.OptionId, o2.Years, o2.Strike } into g
select new
{
g.Key.OptionId,
g.Key.Years,
g.Key.Strike,
Counts = g.Count()
// SUM(d2.Weights) as 'TotalWeight' is not available here because the dates are in memory
};
If you are just starting out and trying to rewrite stored procedures in LINQ, EF Core is a bad idea: its IQueryable support is too limited, and you will usually have to fight for each complex LINQ query.
Try linq2db, which supports temporary tables, so your stored proc can be rewritten as equivalent LINQ queries. Or you can use linq2db.EntityFrameworkCore to extend EF Core functionality.
Disclaimer: I am the creator of this extension and one of the linq2db creators.
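If it helps to verify the expected numbers first, the stored proc's logic can be mirrored in LINQ to Objects, where the dates list carries its Weights and both joins are available. This is a sketch under assumptions: CostRow, DateRow and OptionRow are hypothetical stand-ins for Table1, #Dates and Table2, and the column types are guesses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OptionWeightsSketch
{
    // Hypothetical in-memory stand-ins for Table1, #Dates and Table2.
    public sealed class CostRow
    {
        public int OptionId { get; set; }
        public int Years { get; set; }
        public decimal Strike { get; set; }
        public DateTime EffectiveDate { get; set; }
    }

    public sealed class DateRow
    {
        public DateTime EffectiveDate { get; set; }
        public decimal Weights { get; set; }
    }

    public sealed class OptionRow
    {
        public int OptionId { get; set; }
        public int OptionType { get; set; }
        public int? QualifierId { get; set; }
        public string CreditingMethod { get; set; }
    }

    public static List<(int OptionId, int Years, decimal Strike, decimal TotalWeight, int Counts)> Summarize(
        IEnumerable<CostRow> table1, IEnumerable<DateRow> dates, IEnumerable<OptionRow> table2)
    {
        var query =
            from o2 in table1
            join d2 in dates on o2.EffectiveDate equals d2.EffectiveDate
            join od2 in table2 on o2.OptionId equals od2.OptionId
            // Note: unlike SQL's <>, != also keeps rows whose CreditingMethod is null.
            where od2.OptionType == 1
                && od2.QualifierId == null
                && od2.CreditingMethod != "xxx"
            group d2 by new { o2.OptionId, o2.Years, o2.Strike } into g
            select (
                OptionId: g.Key.OptionId,
                Years: g.Key.Years,
                Strike: g.Key.Strike,
                TotalWeight: g.Sum(d => d.Weights),
                Counts: g.Count());

        return query.ToList();
    }

    // Made-up sample data: only option 1 passes the Table2 filters.
    public static List<(int OptionId, int Years, decimal Strike, decimal TotalWeight, int Counts)> SampleRun()
    {
        var jan = new DateTime(2024, 1, 1);
        var feb = new DateTime(2024, 2, 1);
        var table1 = new[]
        {
            new CostRow { OptionId = 1, Years = 5, Strike = 100m, EffectiveDate = jan },
            new CostRow { OptionId = 1, Years = 5, Strike = 100m, EffectiveDate = feb },
            new CostRow { OptionId = 2, Years = 3, Strike = 90m, EffectiveDate = jan },
            new CostRow { OptionId = 3, Years = 5, Strike = 100m, EffectiveDate = jan }
        };
        var dates = new[]
        {
            new DateRow { EffectiveDate = jan, Weights = 0.5m },
            new DateRow { EffectiveDate = feb, Weights = 0.25m }
        };
        var table2 = new[]
        {
            new OptionRow { OptionId = 1, OptionType = 1, QualifierId = null, CreditingMethod = "abc" },
            new OptionRow { OptionId = 2, OptionType = 1, QualifierId = 7, CreditingMethod = "abc" },
            new OptionRow { OptionId = 3, OptionType = 2, QualifierId = null, CreditingMethod = "abc" }
        };
        return Summarize(table1, dates, table2);
    }

    public static void Main()
    {
        foreach (var row in SampleRun())
            Console.WriteLine($"{row.OptionId} {row.Years} {row.Strike}: weight {row.TotalWeight}, count {row.Counts}");
    }
}
```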
Is there a way I can rewrite this query so that it does not use correlated subqueries?
var query = (from o in dbcontext.Orders
let lastStatus = o.OrderStatus.Where(x => x.OrderId == o.Id).OrderByDescending(x => x.CreatedDate).FirstOrDefault()
where lastStatus.OrderId != 1
select new { o.Name, lastStatus.Id }
).ToList();
This resulted in:
SELECT [o].[Name], (
SELECT TOP(1) [x0].[Id]
FROM [OrderStatus] AS [x0]
WHERE ([x0].[OrderId] = [o].[Id]) AND ([o].[Id] = [x0].[OrderId])
ORDER BY [x0].[CreatedDate] DESC
) AS [Id]
FROM [Orders] AS [o]
WHERE (
SELECT TOP(1) [x].[OrderId]
FROM [OrderStatus] AS [x]
WHERE ([x].[OrderId] = [o].[Id]) AND ([o].[Id] = [x].[OrderId])
ORDER BY [x].[CreatedDate] DESC
) <> 1
I have tried to do a join on a subquery, but EF Core 2.1 does weird things, not what I expected:
var query = (from o in dbcontext.Orders
join lastStat in (from os in dbcontext.OrderStatus
orderby os.CreatedDate descending
select new { os }
) on o.Id equals lastStat.os.OrderId
where lastStat.os.StatusId != 1
select new { o.Name, lastStat.os.StatusId }).ToList();
In EF6 replacing
let x = (...).FirstOrDefault()
with
from x in (...).Take(1).DefaultIfEmpty()
usually generates better SQL.
So normally I would suggest
var query = (from o in db.Set<Order>()
from lastStatus in o.OrderStatus
.OrderByDescending(s => s.CreatedDate)
.Take(1)
where lastStatus.Id != 1
select new { o.Name, StatusId = lastStatus.Id }
).ToList();
(no need of DefaultIfEmpty (left join) because the where condition will turn it to inner join anyway).
Unfortunately currently (EF Core 2.1.4) there is a translation issue so the above leads to client evaluation.
The current workaround is to replace the navigation property accessor o.OrderStatus with a correlated subquery:
var query = (from o in db.Set<Order>()
from lastStatus in db.Set<OrderStatus>()
.Where(s => o.Id == s.OrderId)
.OrderByDescending(s => s.CreatedDate)
.Take(1)
where lastStatus.Id != 1
select new { o.Name, StatusId = lastStatus.Id }
).ToList();
which produces the following SQL for SqlServer database (lateral join):
SELECT [o].[Name], [t].[Id] AS [StatusId]
FROM [Orders] AS [o]
CROSS APPLY (
SELECT TOP(1) [s].*
FROM [OrderStatus] AS [s]
WHERE [s].[OrderId] = [o].[Id]
ORDER BY [s].[CreatedDate] DESC
) AS [t]
WHERE [t].[Id] <> 1
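The semantics of the Take(1) pattern can be checked against in-memory collections before looking at the generated SQL. A minimal sketch, assuming simplified Order and OrderStatus classes that are hypothetical stand-ins for the real entities:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LatestStatusSketch
{
    // Hypothetical, simplified entity shapes.
    public sealed class Order
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public sealed class OrderStatus
    {
        public int Id { get; set; }
        public int OrderId { get; set; }
        public DateTime CreatedDate { get; set; }
    }

    // For each order, take its most recent status row and keep it unless its Id is 1.
    // Orders without any status drop out, as with an inner join.
    public static List<(string Name, int StatusId)> LatestStatuses(
        IEnumerable<Order> orders, IEnumerable<OrderStatus> statuses)
    {
        return (from o in orders
                from lastStatus in statuses
                    .Where(s => s.OrderId == o.Id)
                    .OrderByDescending(s => s.CreatedDate)
                    .Take(1)
                where lastStatus.Id != 1
                select (Name: o.Name, StatusId: lastStatus.Id)).ToList();
    }

    // Made-up sample data: A's latest status is 11, B's latest status is 1, C has none.
    public static List<(string Name, int StatusId)> SampleRun()
    {
        var orders = new[]
        {
            new Order { Id = 1, Name = "A" },
            new Order { Id = 2, Name = "B" },
            new Order { Id = 3, Name = "C" }
        };
        var statuses = new[]
        {
            new OrderStatus { Id = 10, OrderId = 1, CreatedDate = new DateTime(2024, 1, 1) },
            new OrderStatus { Id = 11, OrderId = 1, CreatedDate = new DateTime(2024, 1, 5) },
            new OrderStatus { Id = 12, OrderId = 2, CreatedDate = new DateTime(2024, 1, 1) },
            new OrderStatus { Id = 1, OrderId = 2, CreatedDate = new DateTime(2024, 1, 2) }
        };
        return LatestStatuses(orders, statuses);
    }

    public static void Main()
    {
        foreach (var row in SampleRun())
            Console.WriteLine($"{row.Name}: {row.StatusId}");
    }
}
```

In memory this rescans the status list once per order, so it only demonstrates the result shape; it says nothing about how the database executes the query.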
I will assume that you are not actually fetching all the Orders, but only a portion of them (a page or a batch for processing).
In this case, it might be better to split it into two queries (not tested though):
var orders = dbcontext.Orders.Where(o => /* some filter logic */);
var orderIds = orders.Select(o => o.Id).ToList();
// get status for latest change - this should query OrderStatus only
var statusNameMap = dbcontext.OrderStatus
.Where(os => orderIds.Contains(os.OrderId))
.GroupBy(os => os.OrderId)
.Select(grp => grp.OrderByDescending(s => s.CreatedDate).First())
.ToDictionary(os => os.OrderId, os => os.StatusId);
// aggregate the results
// the orders might fetch only the needed columns to have less data on the wire
var result = orders
.ToList()
.Select(o => new { o.Name, StatusId = statusNameMap[o.Id] });
I do not think the queries will be nicer, but it might be easier to understand what is going on here.
If you really have to process all Orders and you have many of them (or many Statuses), you might consider maintaining a LastStatusId column directly in Order table (this should be updated whenever a status is changed).
I'm working with the following dataset:
ID SearchTags
1 Cats,Birds,Dogs,Snakes,Roosters
2 Mice,Chickens,Cats,Lizards
3 Birds,Zebras,Sheep,Horses,Monkeys,Chimps
4 Lions,Tigers,Bears,Chickens
5 Cats,Goats,Pandas
6 Birds,Zebras,Sheep,Horses
7 Rats,Dogs,Hawks,Eagles,Tigers
8 Cats,Tigers,Dogs,Pandas
9 Dogs,Beavers,Sharks,Vultures
10 Cats,Bears,Bats,Leopards,Chickens
I need to query out a list of the most popular SearchTags.
I have a query which will return the most popular SearchTags but it returns the whole list of words. (which I expected). Is it possible to split the SearchTags column on (,) and generate a list of the most popular tags so that I end up with a list/count as follows?:
Cats 5
Dogs 4
Chickens 3
Tigers 3
Bears 2
Sharks 1
etc...
instead of what I get now:
Cats,Birds,Dogs,Snakes,Roosters 1
Dogs,Beavers,Sharks,Vultures 1
Cats,Bears,Bats,Leopards,Chickens 1
etc...
Here's the query that returns the list of words.
SELECT SearchTags, COUNT(*) AS TagCount
FROM Animals
GROUP BY SearchTags
ORDER BY TagCount DESC
I'm using SQL Server. I'd prefer a query but can create a stored procedure if needed.
Thanks for any help you can offer.
You have tagged the question with C# and LINQ, if you have the data in a DataTable then you can do:
DataTable dt = GetDataTableFromDB();
var query = dt.AsEnumerable()
.Select(r => r.Field<string>("SearchTags").Split(','))
.SelectMany(r => r)
.GroupBy(r => r)
.Select(grp => new
{
Key = grp.Key,
Count = grp.Count()
});
If you have LINQ TO SQL set up then you can do:
var query = db.YourTable
.Select(r=> r.SearchTags)
.AsEnumerable()
.Where(r=> !string.IsNullOrWhiteSpace(r))
.Select(r => r.Split(','))
.SelectMany(r => r)
.GroupBy(r => r)
.Select(grp => new
{
Key = grp.Key,
Count = grp.Count()
});
This will load all the SearchTags in memory and then you would be able to apply Split.
You can also filter out null or empty string values for SearchTags at your database end like:
var query = db.YourTable
.Where(r=> r.SearchTags != null && r.SearchTags.Trim() != "")
.Select(r=> r.SearchTags)
.AsEnumerable()
.Select(r => r.Split(','))
.SelectMany(r => r)
.GroupBy(r => r)
.Select(grp => new
{
Key = grp.Key,
Count = grp.Count()
});
The above will filter out the null or empty strings/only white spaces, from the returned collection at the database end and would work more efficiently.
To filter by date as well:
DateTime dt = DateTime.Today.AddDays(-14);
var query = db.YourTable
.Where(r=> r.SearchTags != null &&
r.SearchTags.Trim() != "" &&
r.MediaDate >= dt)
.Select(r=> r.SearchTags)
.AsEnumerable()
.Select(r => r.Split(','))
.SelectMany(r => r)
.GroupBy(r => r)
.Select(grp => new
{
Key = grp.Key,
Count = grp.Count()
});
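To get the ranked list the question asks for, the grouped result still needs an ordering. Here is a self-contained sketch over the ten sample rows from the question; trimming the tags, comparing them case-insensitively and breaking ties alphabetically are my own assumptions, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagCountSketch
{
    // The SearchTags column from the question's sample data.
    public static readonly string[] SearchTags =
    {
        "Cats,Birds,Dogs,Snakes,Roosters",
        "Mice,Chickens,Cats,Lizards",
        "Birds,Zebras,Sheep,Horses,Monkeys,Chimps",
        "Lions,Tigers,Bears,Chickens",
        "Cats,Goats,Pandas",
        "Birds,Zebras,Sheep,Horses",
        "Rats,Dogs,Hawks,Eagles,Tigers",
        "Cats,Tigers,Dogs,Pandas",
        "Dogs,Beavers,Sharks,Vultures",
        "Cats,Bears,Bats,Leopards,Chickens"
    };

    // Splits every row on commas, counts each tag and sorts by popularity.
    public static List<(string Tag, int Count)> CountTags(IEnumerable<string> rows)
    {
        return rows
            .Where(r => !string.IsNullOrWhiteSpace(r))
            .SelectMany(r => r.Split(','))
            .Select(t => t.Trim())
            .Where(t => t.Length > 0)
            .GroupBy(t => t, StringComparer.OrdinalIgnoreCase)
            .Select(g => (Tag: g.Key, Count: g.Count()))
            .OrderByDescending(x => x.Count)
            .ThenBy(x => x.Tag, StringComparer.Ordinal) // alphabetical tie-break
            .ToList();
    }

    public static void Main()
    {
        foreach (var (tag, count) in CountTags(SearchTags))
            Console.WriteLine($"{tag} {count}");
    }
}
```

For the sample rows the list starts with Cats 5 and Dogs 4, followed by the three tags that occur three times (Birds, Chickens, Tigers).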
Assuming you want TSQL...
There are numerous T-SQL functions for splitting strings, but those using XQuery are by far the fastest compared with the plethora of looping functions.
I use something similar to this in a production system on a table with 10-15K CSV values, and it runs in seconds, versus an old looping function which sometimes took up to a minute.
Anyway, here's a quick demo to get you going.
DECLARE @DATA TABLE (ID INT, SEARCHTAGS VARCHAR(100))
INSERT INTO @DATA
SELECT 1,'Cats,Birds,Dogs,Snakes,Roosters' UNION ALL
SELECT 2,'Mice,Chickens,Cats,Lizards' UNION ALL
SELECT 3,'Birds,Zebras,Sheep,Horses,Monkeys,Chimps' UNION ALL
SELECT 4,'Lions,Tigers,Bears,Chickens' UNION ALL
SELECT 5,'Cats,Goats,Pandas' UNION ALL
SELECT 6,'Birds,Zebras,Sheep,Horses' UNION ALL
SELECT 7,'Rats,Dogs,Hawks,Eagles,Tigers' UNION ALL
SELECT 8,'Cats,Tigers,Dogs,Pandas' UNION ALL
SELECT 9,'Dogs,Beavers,Sharks,Vultures' UNION ALL
SELECT 10,'Cats,Bears,Bats,Leopards,Chickens'
;WITH TagList AS
(
SELECT ID, Split.a.value('.', 'VARCHAR(max)') AS String
FROM (SELECT ID,
-- VARCHAR(MAX): a CAST to VARCHAR without a length truncates at 30 characters
CAST ('<M>' + REPLACE(CAST(SEARCHTAGS AS VARCHAR(MAX)), ',', '</M><M>') + '</M>' AS XML) AS String
FROM @DATA) AS A
CROSS APPLY String.nodes ('/M') AS Split(a)
)
SELECT TOP (10) String, COUNT(*) AS [SearchCount]
FROM TagList
GROUP BY String
ORDER BY [SearchCount] DESC
NB: Anything to do with string manipulation is almost always faster if you can handle it in C#, so the answer by Habib would probably be more efficient than a T-SQL solution.
I'm totally new to LINQ.
I have an SQL GroupBy which runs in barely a few milliseconds. But when I try to achieve the same thing via LINQ, it just seems awfully slow.
What I'm trying to achieve is to fetch the average monthly duration of a certain database update.
In SQL =>
select SUBSTRING(yyyyMMdd, 0,7),
AVG (duration)
from (select (CONVERT(CHAR(8), mmud.logDateTime, 112)) as yyyyMMdd,
DateDIFF(ms, min(mmud.logDateTime), max(mmud.logDateTime)) as duration
from mydb.mydbo.updateData mmud
left
join mydb.mydbo.updateDataKeyValue mmudkv
on mmud.updateDataid = mmudkv.updateDataId
left
join mydb.mydbo.updateDataDetailKey mmuddk
on mmudkv.updateDataDetailKeyid = mmuddk.Id
where dbname = 'MY_NEW_DB'
and mmudkv.value in ('start', 'finish')
group
by (CONVERT(CHAR(8), mmud.logDateTime, 112))
) as resultSet
group
by substring(yyyyMMdd, 0,7)
order
by substring(yyyyMMdd, 0,7)
in LINQ => I first fetch the record from a table that links information of the Database Name and UpdateData and then do filtering and groupby on the related information.
var updatedataInDateRangeWithDescriptions = entry.updatedata.Where(
ue => ue.updatedataKeyValue.Any(
uedkv =>
uedkv.Value.ToLower() == "starting update" ||
uedkv.Value.ToLower() == "client release"))
.Select(
ue =>
new
{
logDateTimeyyyyMMdd = ue.logDateTime.Date,
logDateTime = ue.logDateTime
})
.GroupBy(
updateDataDetail => updateDataDetail.logDateTimeyyyyMMdd)
.Select(
groupedupdatedata => new
{
UpdateDateyyyyMM = groupedupdatedata.Key.ToString("yyyyMMdd"),
Duration =
(groupedupdatedata.Max(groupMember => groupMember.logDateTime) -
groupedupdatedata.Min(groupMember => groupMember.logDateTime)
)
.TotalMilliseconds
}
).
ToList();
var updatedataMonthlyDurations =
updatedataInDateRangeWithDescriptions.GroupBy(ue => ue.UpdateDateyyyyMM.Substring(0,6))
.Select(
group =>
new updatedataMonthlyAverageDuration
{
DbName = entry.DbName,
UpdateDateyyyyMM = group.Key.Substring(0,6),
Duration =
group.Average(
gmember =>
(gmember.Duration))
}
).ToList();
I know that GroupBy in LINQ isn't the same as GroupBy in T-SQL, but not sure what happens behind the scenes. Could anyone explain the difference and what happens in memory when I run the LINQ version? After I did the .ToList() after the first GroupBy things got a little faster. But even then this way of finding average duration is really slow.
What would be the best alternative and are there ways of improving a slow LINQ statement using Visual Studio 2012?
Your LINQ query is doing most of its work in LINQ to Objects. You should be constructing a LINQ to Entities/SQL query that generates the complete query in one shot.
Your query seems to have a redundant group by clause, and I am not sure which table dbname comes from, but the following query should get you on the right track.
var query = from mmud in context.updateData
from mmudkv in context.updateDataKeyValue
.Where(x => mmud.updateDataid == x.updateDataId)
.DefaultIfEmpty()
from mmuddk in context.updateDataDetailKey
.Where(x => mmudkv.updateDataDetailKeyid == x.Id)
.DefaultIfEmpty()
where mmud.dbname == "MY_NEW_DB"
where mmudkv.value == "start" || mmudkv.value == "finish"
group mmud by mmud.logDateTime.Date into g
select new
{
Date = g.Key,
Average = EntityFunctions.DiffMilliseconds(g.Min(x => x.logDateTime), g.Max(x => x.logDateTime)),
};
var queryByMonth = from x in query
group x by new { x.Date.Year, x.Date.Month } into x
select new
{
Year = x.Key.Year,
Month = x.Key.Month,
Average = x.Average(y => y.Average)
};
// A single SQL statement is sent to your database
var result = queryByMonth.ToList();
If you are still having problems, we will need to know whether you are using Entity Framework or LINQ to SQL, and you will need to provide your context/model information.
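To sanity-check the numbers that the two-level grouping should produce, here is a small LINQ to Objects sketch of the same logic (duration per day, then the average per month). It takes a plain list of log timestamps, which is an assumption for illustration; it is meant for verifying results on sample data, not as a replacement for running the query in the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MonthlyDurationSketch
{
    public static List<(int Year, int Month, double AverageMs)> AverageDailyDurationByMonth(
        IEnumerable<DateTime> logTimes)
    {
        // Level 1: one duration per calendar day (last log time minus first log time).
        var daily = logTimes
            .GroupBy(t => t.Date)
            .Select(g => (Date: g.Key, DurationMs: (g.Max() - g.Min()).TotalMilliseconds));

        // Level 2: average of the daily durations per (year, month).
        return daily
            .GroupBy(d => (Year: d.Date.Year, Month: d.Date.Month))
            .OrderBy(g => g.Key)
            .Select(g => (Year: g.Key.Year, Month: g.Key.Month, AverageMs: g.Average(d => d.DurationMs)))
            .ToList();
    }

    // Made-up sample data: two days in January (2 s and 4 s), one day in February (1 s).
    public static List<(int Year, int Month, double AverageMs)> SampleRun()
    {
        var logTimes = new[]
        {
            new DateTime(2024, 1, 1, 8, 0, 0), new DateTime(2024, 1, 1, 8, 0, 2),
            new DateTime(2024, 1, 2, 9, 0, 0), new DateTime(2024, 1, 2, 9, 0, 4),
            new DateTime(2024, 2, 10, 10, 0, 0), new DateTime(2024, 2, 10, 10, 0, 1)
        };
        return AverageDailyDurationByMonth(logTimes);
    }

    public static void Main()
    {
        foreach (var row in SampleRun())
            Console.WriteLine($"{row.Year}-{row.Month:00}: {row.AverageMs} ms");
    }
}
```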