Queries generated by group by vs group join - c#

I have the following group join LINQ statement:
from c in Categories
join p in Products on c equals p.Category into ps
select new { Category = new {c.CategoryID, c.CategoryName}, Products = ps };
However, this generates the following left outer join query and returns all categories, even those with no associated products.
SELECT [t0].[CategoryID], [t0].[CategoryName], [t1].[ProductID], [t1].[ProductName], [t1].[SupplierID], [t1].[CategoryID] AS [CategoryID2], [t1].[QuantityPerUnit], [t1].[UnitPrice], [t1].[UnitsInStock], [t1].[UnitsOnOrder], [t1].[ReorderLevel], [t1].[Discontinued], (
SELECT COUNT(*)
FROM [Products] AS [t2]
WHERE [t0].[CategoryID] = [t2].[CategoryID]
) AS [value]
FROM [Categories] AS [t0]
LEFT OUTER JOIN [Products] AS [t1] ON [t0].[CategoryID] = [t1].[CategoryID]
ORDER BY [t0].[CategoryID], [t1].[ProductID]
What I really want is to return only those categories that have associated products. But if I re-write the linq query like so:
from c in Categories
join p in Products on c equals p.Category
group p by new {c.CategoryID, c.CategoryName} into ps
select new { Category = ps.Key, Products = ps };
This gives me the desired result, but a separate query is generated for each category:
SELECT [t0].[CategoryID], [t0].[CategoryName]
FROM [Categories] AS [t0]
INNER JOIN [Products] AS [t1] ON [t0].[CategoryID] = [t1].[CategoryID]
GROUP BY [t0].[CategoryID], [t0].[CategoryName]
GO
-- Region Parameters
DECLARE @x1 Int SET @x1 = 1
DECLARE @x2 NVarChar(9) SET @x2 = 'Beverages'
-- EndRegion
SELECT [t1].[ProductID], [t1].[ProductName], [t1].[SupplierID], [t1].[CategoryID], [t1].[QuantityPerUnit], [t1].[UnitPrice], [t1].[UnitsInStock], [t1].[UnitsOnOrder], [t1].[ReorderLevel], [t1].[Discontinued]
FROM [Categories] AS [t0]
INNER JOIN [Products] AS [t1] ON [t0].[CategoryID] = [t1].[CategoryID]
WHERE (@x1 = [t0].[CategoryID]) AND (@x2 = [t0].[CategoryName])
GO
-- Region Parameters
DECLARE @x1 Int SET @x1 = 2
DECLARE @x2 NVarChar(10) SET @x2 = 'Condiments'
-- EndRegion
SELECT [t1].[ProductID], [t1].[ProductName], [t1].[SupplierID], [t1].[CategoryID], [t1].[QuantityPerUnit], [t1].[UnitPrice], [t1].[UnitsInStock], [t1].[UnitsOnOrder], [t1].[ReorderLevel], [t1].[Discontinued]
FROM [Categories] AS [t0]
INNER JOIN [Products] AS [t1] ON [t0].[CategoryID] = [t1].[CategoryID]
WHERE (@x1 = [t0].[CategoryID]) AND (@x2 = [t0].[CategoryName])
GO
...
Is there a way to do the equivalent of an inner join and group by and still only produce a single query like the group join?

var queryYouWant =
from c in Categories
join p in Products on c equals p.Category
select new {Category = c, Product = p};
var result =
from x in queryYouWant.AsEnumerable()
group x.Product by x.Category into g
select new { Category = g.Key, Products = g };
Is there a way to do the equivalent of an inner join and group by and still only produce a single query like the group join?
No. When a GroupBy is followed by non-aggregated access to the group's elements, LINQ to SQL issues a repeated query for each group, with the group key as the filter.
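The two-step shape of the workaround above (flat inner join first, grouping in memory second) can be sketched against in-memory data. This is only an illustration under assumptions: the tuple arrays and the extra "Empty" category are hypothetical stand-ins for the Categories and Products tables, and LINQ to Objects is used, so no SQL is generated here.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for the Categories and Products tables.
var categories = new[]
{
    (CategoryID: 1, CategoryName: "Beverages"),
    (CategoryID: 2, CategoryName: "Condiments"),
    (CategoryID: 3, CategoryName: "Empty"),   // has no products
};
var products = new[]
{
    (ProductID: 1, ProductName: "Chai", CategoryID: 1),
    (ProductID: 2, ProductName: "Chang", CategoryID: 1),
    (ProductID: 3, ProductName: "Aniseed Syrup", CategoryID: 2),
};

// Step 1: flat inner join. Against a database, this is the part
// that would be sent to the server as a single query.
var flat =
    from c in categories
    join p in products on c.CategoryID equals p.CategoryID
    select new { Category = c, Product = p };

// Step 2: group the already-fetched rows in memory.
var grouped =
    (from x in flat.AsEnumerable()
     group x.Product by x.Category into g
     select new { Category = g.Key, Products = g.ToList() }).ToList();

// Categories without products never appear, because the join is an inner join.
foreach (var g in grouped)
    Console.WriteLine($"{g.Category.CategoryName}: {g.Products.Count} product(s)");
```

The trade-off is that every joined row travels to the client with its category columns repeated, which is usually acceptable when the per-group queries are the bigger cost.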

What is the purpose of that join?
Your original query is identical to this:
from c in Categories
select new { Category = new { c.CategoryID, c.CategoryName }, c.Products }
Am I somehow missing something obvious???
If you want only categories with products, then do this:
from c in Categories
where c.Products.Any()
select new { Category = new { c.CategoryID, c.CategoryName }, c.Products }
Or, if you want to flatten the results:
from p in Products
select new { p, p.Category.CategoryID, p.Category.CategoryName }
The latter will translate into an inner or outer join - depending on whether that relationship is nullable. You can force the equivalent of an inner join as follows:
from p in Products
where p.Category != null
select new { p, p.Category.CategoryID, p.Category.CategoryName }

Related

Complex joins in Linq with multiple tables and LEFT OUTER JOIN

Hoping someone can point me in the right direction with this join. I'm trying to convert some SQL to Linq. My SQL has a left outer join after several inner joins. The following SQL produces the desired result:
SELECT TOP(50) [t].[TagFriendlyName] AS [TagName], [t0].[timeStamp] AS [LastSeen], [l].[Name] AS [LocationName]
FROM [Tags] AS [t]
INNER JOIN [tag_reads] AS [t0] ON [t].[epc] = [t0].[epc]
INNER JOIN [ReaderData] AS [r] ON [t0].[ReaderDataId] = [r].[Id]
LEFT OUTER JOIN [Readers] AS [r0] ON [r].[mac_address] = [r0].[mac_address]
INNER JOIN [Locations] AS [l] on [t0].[antennaPort] = [l].[AntennaId] AND [r].[Id] = [l].[ReaderId]
GROUP BY [t].[TagFriendlyName], [t0].[timeStamp], [l].[Name]
ORDER BY [t0].[timeStamp] DESC
My Linq code is as follows, but I can't figure out how to get the left outer join inserted properly. Not sure how to introduce the Readers table that needs the LEFT OUTER JOIN:
var query = (
from tags in db.Tags
join tagreads in db.tag_reads on tags.epc equals tagreads.epc
join readerdata in db.ReaderData on tagreads.ReaderDataId equals readerdata.Id
join readers in db.Readers on readerdata.mac_address equals readers.mac_address
group tags by new { tags.TagFriendlyName, timestamp = tagreads.timeStamp, readerdata.mac_address } into grp
select new CurrentStatus()
{
TagName = grp.Key.TagFriendlyName,
LastSeen = grp.Key.timestamp,
LocationName = grp.Key.mac_address
}
)
.OrderByDescending(o => o.LastSeen)
According to the documentation I need to use DefaultIfEmpty(), but I'm not sure where to introduce the Readers table.
Using EF Core 3.1.0. THANKS!
You should apply Left Join this way:
join readers in db.Readers on readerdata.mac_address equals readers.mac_address into readersJ
from readers in readersJ.DefaultIfEmpty()
The full code:
var query = (
from tags in db.Tags
join tagreads in db.tag_reads on tags.epc equals tagreads.epc
join readerdata in db.ReaderData on tagreads.ReaderDataId equals readerdata.Id
join readers in db.Readers on readerdata.mac_address equals readers.mac_address into readersJ
from readers in readersJ.DefaultIfEmpty()
join locations in db.Locations
on new { ap = tagreads.antennaPort, rd = readerdata.Id }
equals new { ap = locations.AntennaId, rd = locations.ReaderId }
group tags by new { tags.TagFriendlyName, timestamp = tagreads.timeStamp, readerdata.mac_address } into grp
select new CurrentStatus()
{
TagName = grp.Key.TagFriendlyName,
LastSeen = grp.Key.timestamp,
LocationName = grp.Key.mac_address
}
)
.OrderByDescending(o => o.LastSeen)
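To see what the `into readersJ` / `DefaultIfEmpty()` pair does on its own, here is a minimal in-memory sketch. The sample rows and the `Name` property on readers are assumptions made for illustration; only the `mac_address` join key mirrors the question. With LINQ to Objects an unmatched row yields `null` for the right side, so the projection has to guard against it.

```csharp
using System;
using System.Linq;

// Hypothetical sample rows; only the join key mirrors the question.
var readerData = new[]
{
    new { Id = 1, mac_address = "AA:01" },
    new { Id = 2, mac_address = "BB:02" },   // no matching reader
};
var readers = new[]
{
    new { mac_address = "AA:01", Name = "Dock reader" },
};

var rows =
    (from rd in readerData
     // Group join: collect all matching readers for this row...
     join r in readers on rd.mac_address equals r.mac_address into readersJ
     // ...then flatten, emitting one null entry when nothing matched.
     from r in readersJ.DefaultIfEmpty()
     select new { rd.Id, ReaderName = r?.Name ?? "(no reader)" }).ToList();

foreach (var row in rows)
    Console.WriteLine($"{row.Id}: {row.ReaderName}");
// 1: Dock reader
// 2: (no reader)
```

Both left-side rows survive, which is the behavior of a LEFT OUTER JOIN; dropping the `into` / `DefaultIfEmpty()` lines would turn it back into an inner join and lose row 2.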

Generate a report from Northwind DB using Linq

I'm trying to generate the following report from the popular Northwind DB using LINQ. It should be grouped by Customer and OrderYear.
CustomerName OrderYear Amount
I have to use the following tables: Customer, Order and Order Details.
So far this is what I've done.
NorthwindDataContext north = new NorthwindDataContext();
var query = from o in north.Orders
group o by o.Customer.CompanyName into cg
select new
{
Company = cg.Key,
YearGroup = ( from y in cg
group y by y.OrderDate.Value.Year into yg
select new
{
Year = yg.Key,
YearOrdes = yg
}
)
};
foreach (var q in query)
{
Console.WriteLine("Customer Name : " + q.Company);
foreach (var o in q.YearGroup)
{
Console.WriteLine("Year " + o.Year);
Console.WriteLine("Sum " + o.YearOrdes.Sum(yo => yo.Order_Details.Sum( yd=> Convert.ToDecimal(yd.UnitPrice* yd.Quantity))));
}
Console.WriteLine();
}
It gives me the expected results; I verified them by running the equivalent T-SQL against the database. But I have two questions:
1. In the inner foreach, the second statement computes the sum. Is this the proper approach, or is there a better one?
2. How can I compute the sum in the LINQ query itself?
Got it in single LINQ to SQL query:
var query = from o in north.Orders
from c in north.Customers.Where(c => c.CustomerID == o.CustomerID).DefaultIfEmpty()
from d in north.Order_Details.Where(d => d.OrderID == o.OrderID).DefaultIfEmpty()
group new { o, c, d } by new { o.OrderDate.Value.Year, c.CompanyName } into g
select new
{
Company = g.Key.CompanyName,
OrderYear = g.Key.Year,
Amount = g.Sum(e => e.d.UnitPrice * e.d.Quantity)
};
You can then simply get results:
var results = query.ToList();
Or sort it before fetching:
var results = query.OrderBy(g => g.Company).ThenByDescending(g => g.OrderYear).ToList();
I was curious about the SQL generated by that LINQ to SQL query, so I set a custom Log, and here it is:
SELECT [t5].[value22] AS [Company], [t5].[value2] AS [OrderYear], [t5].[value] AS [Amount]
FROM (
SELECT SUM([t4].[value]) AS [value], [t4].[value2], [t4].[value22]
FROM (
SELECT [t3].[UnitPrice] * (CONVERT(Decimal(29,4),[t3].[Quantity])) AS [value], [t3].[value] AS [value2], [t3].[value2] AS [value22]
FROM (
SELECT DATEPART(Year, [t0].[OrderDate]) AS [value], [t1].[CompanyName] AS [value2], [t2].[UnitPrice], [t2].[Quantity]
FROM [dbo].[Orders] AS [t0]
LEFT OUTER JOIN [dbo].[Customers] AS [t1] ON [t1].[CustomerID] = [t0].[CustomerID]
LEFT OUTER JOIN [dbo].[Order Details] AS [t2] ON [t2].[OrderID] = [t0].[OrderID]
) AS [t3]
) AS [t4]
GROUP BY [t4].[value2], [t4].[value22]
) AS [t5]
ORDER BY [t5].[value22], [t5].[value2] DESC
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 3.5.30729.6387
A bit scary, isn't it? But if you look closer, a standard LEFT JOIN is used to combine all three tables together! All the rest is just grouping, sorting and summing.
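The core of that query, grouping by a composite key and summing inside the projection, can be tried without a database. The sketch below assumes hypothetical, already-flattened order lines (the names and amounts are made up) and runs as LINQ to Objects, so it shows the shape of the grouping rather than the SQL translation.

```csharp
using System;
using System.Linq;

// Hypothetical flattened order lines: customer, order date, unit price, quantity.
var lines = new[]
{
    (Company: "Alfreds", OrderDate: new DateTime(1997, 3, 1), UnitPrice: 10.00m, Quantity: 2),
    (Company: "Alfreds", OrderDate: new DateTime(1997, 9, 5), UnitPrice: 5.00m,  Quantity: 4),
    (Company: "Alfreds", OrderDate: new DateTime(1998, 1, 7), UnitPrice: 7.50m,  Quantity: 2),
    (Company: "Bolido",  OrderDate: new DateTime(1997, 6, 2), UnitPrice: 3.00m,  Quantity: 3),
};

var report =
    (from l in lines
     // Composite key: one group per (company, year) pair.
     group l by new { l.Company, l.OrderDate.Year } into g
     orderby g.Key.Company, g.Key.Year descending
     select new
     {
         Company = g.Key.Company,
         OrderYear = g.Key.Year,
         // The aggregate is computed per group, inside the query itself.
         Amount = g.Sum(e => e.UnitPrice * e.Quantity)
     }).ToList();

foreach (var r in report)
    Console.WriteLine($"{r.Company} {r.OrderYear} {r.Amount}");
```

Because the projection only touches the key and an aggregate, this is the form of GroupBy that a LINQ provider can translate into a single GROUP BY statement.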

How to write a LINQ query to prevent duplicate joins?

I have a query that searches for all accommodations in an order, sorted by day. When I check on the server which query is executed, I see multiple joins to the same table on the same keys.
var parcourt = this.DataService.From<OrderItem>()
.Where(i => i.OrderId == orderId && i.Product.ProductTypeId == (int)ProductTypes.Accommodation)
.OrderBy(i => i.DayNumber)
.ThenBy(i => i.OrderItemId)
.Select(i => new
{
i.OrderItemId,
i.DayNumber,
i.Product.Establishment.Address,
i.Product.Establishment.Coordinates
});
If you check the resulting SQL (as shown by ToTraceString), you can see two joins each on the Products and Establishments tables.
SELECT
[Project1].[OrderItemId] AS [OrderItemId],
[Project1].[DayNumber] AS [DayNumber],
[Project1].[Address] AS [Address],
[Project1].[EstablishmentId] AS [EstablishmentId],
[Project1].[Latitude] AS [Latitude],
[Project1].[Longitude] AS [Longitude]
FROM ( SELECT
[Extent1].[OrderItemId] AS [OrderItemId],
[Extent1].[DayNumber] AS [DayNumber],
[Extent4].[Address] AS [Address],
[Extent5].[EstablishmentId] AS [EstablishmentId],
[Extent5].[Latitude] AS [Latitude],
[Extent5].[Longitude] AS [Longitude]
FROM [dbo].[OrderItems] AS [Extent1]
INNER JOIN [dbo].[Products] AS [Extent2] ON [Extent1].[ProductId] = [Extent2].[ProductId]
LEFT OUTER JOIN [dbo].[Products] AS [Extent3] ON [Extent1].[ProductId] = [Extent3].[ProductId]
LEFT OUTER JOIN [dbo].[Establishments] AS [Extent4] ON [Extent3].[EstablishmentId] = [Extent4].[EstablishmentId]
LEFT OUTER JOIN [dbo].[Establishments] AS [Extent5] ON [Extent3].[EstablishmentId] = [Extent5].[EstablishmentId]
WHERE (1 = [Extent2].[ProductTypeId]) AND ([Extent1].[OrderId] = @p__linq__0)
) AS [Project1]
ORDER BY [Project1].[DayNumber] ASC, [Project1].[OrderItemId] ASC
How can I prevent this linq-to-entities from joining twice on a table? How can I rewrite the query to avoid this situation?
The table structure is as follows (simplified; the original diagram is not included here).
Could you try this query? I think if you call all your joins explicitly, it'll not create joins automatically.
var parcourt = (from i in this.DataService.OrderItem
// In a join clause the outer range variable goes on the left of 'equals', the new one on the right.
join p in this.DataService.Product on i.ProductId equals p.ProductId
join e in this.DataService.Establishments on p.EstablishmentId equals e.EstablishmentId
where i.OrderId == orderId && p.ProductTypeId == (int)ProductTypes.Accommodation
orderby i.DayNumber, i.OrderItemId
select new
{
i.OrderItemId,
i.DayNumber,
e.Address,
e.Coordinates
});

How to use the results of the previous query in the next query?

As a result of this query I have a table:
select i.id, o.[name] from Item i
LEFT OUTER JOIN sys.objects o on o.[name]='I' + cast(i.id as nvarchar(20))
where o.name is not null
Now I need to use the result of this table in the next query:
select PriceListItem.ProductExternalId, #id.Id, #id.FriendlyName, #id.BriefWiki,
[PriceListItem].[ProductExternalDesc]
from [#id]
inner join [Product] on Product.ItemId = #name and Product.InstanceId = #id.ID
inner join [PriceListItem] on Product.ID = PriceListItem.ProductId
instead of '#id' I should use data from the table with name= id, and instead of '#name' I should use data from the table with name= name
Standard SQL way, works in most RDBMS
select PriceListItem.ProductExternalId, #id.Id, #id.FriendlyName, #id.BriefWiki,
[PriceListItem].[ProductExternalDesc]
from
(
select i.id, o.[name] from Item i
LEFT OUTER JOIN sys.objects o on o.[name]='I' + cast(i.id as nvarchar(20))
where o.name is not null
)
X
inner JOIN
[#id] ON X.id = #id.id --change here as needed
inner join [Product] on Product.ItemId = #name and Product.InstanceId = #id.ID
inner join [PriceListItem] on Product.ID = PriceListItem.ProductId
Since you're on SQL 2K8 you can use a CTE:
-- You may need a ; before WITH as in ;WITH
WITH FirstQuery AS (
select i.id, o.[name] from Item i
LEFT OUTER JOIN sys.objects o on o.[name]='I' + cast(i.id as nvarchar(20))
where o.name is not null
)
select PriceListItem.ProductExternalId,
FQ.Id,
-- Neither of these are in your FirstQuery so you can not use them
-- #id.FriendlyName, #id.BriefWiki,
[PriceListItem].[ProductExternalDesc]
from FirstQuery FQ
inner join [Product] on Product.ItemId = FQ.name
and Product.InstanceId = FQ.ID
inner join [PriceListItem] on Product.ID = PriceListItem.ProductId;
From the queries alone it's tough to tell how you plan to JOIN them, but this will allow you to make use of the first query in the subsequent one.
Looks like you have some syntax errors in your second query - #id.id?
select i.id, o.[name]
into #temp_table
from Item i
LEFT OUTER JOIN sys.objects o on o.[name]='I' + cast(i.id as nvarchar(20))
where o.name is not null
Now you can use #temp_table as you want :)

linq to sql optimized a group with multiple joins

I need help generating a more efficient LINQ query:
Table: Positions
-PositionID
-Name
Table: Person
-PersonID
-Name, etc...
Table: PersonPosition
-PersonID
-PositionID
I need a result set that groups the people assigned to each position:
PositionID  Person
1           John
            Bob
            Frank
2           Bill
            Tom
            Frank, etc...
My first thought was this LINQ query:
from perspos in PersonPositions
join pers in Persons on perspos.PersonID equals pers.PersonID
group pers by perspos.PositionID into groups
select new {groups.Key, groups}
Which works great, but produces the following SQL:
SELECT [t0].[PositionID] AS [Key]
FROM [PersonPosition] AS [t0]
INNER JOIN [Person] AS [t1] ON [t0].[PersonID] = [t1].[PersonID]
GROUP BY [t0].[PositionID]
GO
-- Region Parameters
DECLARE @x1 Int = 3
-- EndRegion
SELECT [t1].[PersonID], [t1].[UserID], [t1].[Firstname], [t1].[Lastname], [t1].[Email], [t1].[Phone], [t1].[Mobile], [t1].[Comment], [t1].[Permissions]
FROM [PersonPosition] AS [t0]
INNER JOIN [Person] AS [t1] ON [t0].[PersonID] = [t1].[PersonID]
WHERE @x1 = [t0].[PositionID]
GO
-- Region Parameters
DECLARE @x1 Int = 4
-- EndRegion
SELECT [t1].[PersonID], [t1].[UserID], [t1].[Firstname], [t1].[Lastname], [t1].[Email], [t1].[Phone], [t1].[Mobile], [t1].[Comment], [t1].[Permissions]
FROM [PersonPosition] AS [t0]
INNER JOIN [Person] AS [t1] ON [t0].[PersonID] = [t1].[PersonID]
WHERE @x1 = [t0].[PositionID]
GO
-- Region Parameters
DECLARE @x1 Int = 5
-- EndRegion
SELECT [t1].[PersonID], [t1].[UserID], [t1].[Firstname], [t1].[Lastname], [t1].[Email], [t1].[Phone], [t1].[Mobile], [t1].[Comment], [t1].[Permissions]
FROM [PersonPosition] AS [t0]
INNER JOIN [Person] AS [t1] ON [t0].[PersonID] = [t1].[PersonID]
WHERE @x1 = [t0].[PositionID]
GO
on and on...
Is there a better LINQ query that translates to a more efficient SQL statement?
You should already have the relationship defined in your database, and also on your dbml.
Avoid doing joins when you don't have to; they are really tedious. Let LINQ-to-SQL do this for you. Something like this should work:
var data = context.PersonPositions
.Select(pos => new { pos.PositionID, pos.Person });
return data.GroupBy(pos => pos.PositionID);
or
return context.Positions.Select(pos =>
// An anonymous-type member built from a method call needs an explicit name.
new { pos, People = pos.PersonPositions.Select(pp => pp.Person).ToList() }).ToList();
I'm fairly sure you have to just join the tables and select the result, then call .AsEnumerable(), and group after that:
(from perspos in PersonPositions
join pers in Persons
on perspos.PersonID equals pers.PersonID
select new { perspos.PositionID, Person = pers })
.AsEnumerable().GroupBy(p => p.PositionID, p => p.Person);
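A close variation on the last snippet, not taken from the answers above, is to materialize the joined rows into a lookup with `ToLookup`, which also runs in memory and groups in a single pass. The sketch below uses hypothetical tuple arrays in place of the PersonPosition and Person tables, so it shows only the client-side grouping step.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for the PersonPosition and Person tables.
var personPositions = new[]
{
    (PersonID: 1, PositionID: 1),
    (PersonID: 2, PositionID: 1),
    (PersonID: 3, PositionID: 1),
    (PersonID: 4, PositionID: 2),
    (PersonID: 3, PositionID: 2),
};
var persons = new[]
{
    (PersonID: 1, Name: "John"),
    (PersonID: 2, Name: "Bob"),
    (PersonID: 3, Name: "Frank"),
    (PersonID: 4, Name: "Bill"),
};

// Flat join first (one query against a database), then group on the client.
var byPosition =
    (from pp in personPositions
     join p in persons on pp.PersonID equals p.PersonID
     select new { pp.PositionID, Person = p })
    .AsEnumerable()
    .ToLookup(x => x.PositionID, x => x.Person);

foreach (var group in byPosition)
    Console.WriteLine($"{group.Key}: {string.Join(", ", group.Select(p => p.Name))}");
// 1: John, Bob, Frank
// 2: Bill, Frank
```

A lookup is evaluated immediately and can be indexed by key (`byPosition[1]`), whereas `GroupBy` on an `IEnumerable` is deferred until enumerated; which one fits better depends on how the result is consumed.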
