Self join in Entity Framework - c#

I want to have following type of query in entity frame work
SELECT c2.*
FROM Category c1 INNER JOIN Category c2
ON c1.CategoryID = c2.ParentCategoryID
WHERE c1.ParentCategoryID is NULL
How to do the above work in Entity framework...

Well, I don't know much about EF, but that looks something like:
var query = from c1 in db.Category
where c1.ParentCategoryId == null
join c2 in db.Category on c1.CategoryId equals c2.ParentCategoryId
select c2;

Just to tidy this up the following is a bit nicer and does the same thing:
var query = from c1 in db.Category
from c2 in db.Category
where c1.ParentCategoryId == null
where c1.CategoryId == c2.ParentCategoryId
select c2;

In EF 4.0+, LEFT JOIN syntax is a little different and presents a crazy quirk:
var query = from c1 in db.Category
join c2 in db.Category on c1.CategoryID equals c2.ParentCategoryID
into ChildCategory
from cc in ChildCategory.DefaultIfEmpty()
select new CategoryObject
{
CategoryID = c1.CategoryID,
ChildName = cc.CategoryName
}
If you capture the execution of this query in SQL Server Profiler, you will see that it does indeed perform a LEFT OUTER JOIN. HOWEVER, if you have multiple LEFT JOIN ("Group Join") clauses in your Linq-to-Entity query, I have found that the self-join clause MAY actually execute as in INNER JOIN - EVEN IF THE ABOVE SYNTAX IS USED!
The resolution to that? As crazy and, according to MS, wrong as it sounds, I resolved this by changing the order of the join clauses. If the self-referencing LEFT JOIN clause was the 1st Linq Group Join, SQL Profiler reported an INNER JOIN. If the self-referencing LEFT JOIN clause was the LAST Linq Group Join, SQL Profiler reported an LEFT JOIN.

Related

SQL to LINQ Left Outer Join with Multiple Conditions [duplicate]

This question already has answers here:
How to do joins in LINQ on multiple fields in single join
(13 answers)
Closed 1 year ago.
I am struggling with this left outer join with multiple conditions in LINQ. My initial SQL statement is as follows.
SELECT DISTINCT [Product].[prod_id],
[Product].[brd_id],
[Product].[prod_brdId],
[Size].[size_id],
[Size].[size_sort],
[Bin].[bin_rack],
[Bin].[bin_alpha]
FROM [Proddetails]
JOIN [Prodcolor]
ON [Proddetails].[pcl_id] = [Prodcolor].[pcl_id]
JOIN [Product]
ON [Prodcolor].[prod_id] = [Product].[prod_id]
JOIN [Size]
ON [Proddetails].[pdt_size] = [Size].[size_id]
LEFT OUTER JOIN [Bin]
ON [Product].[prod_id] = [Bin].[prod_id]
ORDER BY [Product].[brd_id], [Product].[prod_brdId], [Size].[size_sort]
And my corresponding LINQ statement in .NET
BinProds = (
from pd in applicationDbContext.Proddetails
join pc in applicationDbContext.Prodcolors on pd.PclId equals pc.PclId
join pr in applicationDbContext.Products on pc.ProdId equals pr.ProdId
join sz in applicationDbContext.Sizes on pd.PdtSize equals sz.SizeId
join bn in applicationDbContext.Bins on pr.ProdId equals bn.ProdId into ps
from bn in ps.DefaultIfEmpty()
select new CoreBin
{
... fields here
}
).Distinct().OrderBy(i=>i.BrdId).ThenBy(j=>j.ProdBrdId).ThenBy(k=>k.SizeSort).ToList();
So, both of these execute successfully in SQL Server Studio and .NET respectively. However it's not giving me the exact result I'm looking for because it's missing one additional condition on the LEFT OUTER JOIN. Below is the one change I made to my SQL statement which gives me exactly what I want in SQL Server Studio. I just can't figure out how to adjust the LINQ to get the same result in my .NET project.
LEFT OUTER JOIN [Bin]
ON [Product].[prod_id] = [Bin].[prod_id]
AND [Bin].[size_id] = [Size].[size_id]
I'm hoping there is a reasonable adjustment. If you need my table outputs or db design, plz let me know.
This one should give you desired SQL:
.....
join bn in applicationDbContext.Bins
on new { pr.ProdId, sz.size_id } equals new { bn.ProdId, bn.size_id } into ps
from bn in ps.DefaultIfEmpty()
select new CoreBin
{
... fields here
}
.....

SQL to LINQ - Left Join Before Inner Join

So I have a SQL query that I would like to convert to LINQ.
Here is said query:
SELECT *
FROM DatabaseA.SchemaA.TableA ta
LEFT OUTER JOIN DatabaseA.SchemaA.TableB tb
ON tb.ShipId = ta.ShipId
INNER JOIN DatabaseA.SchemaA.TableC tc
ON tc.PostageId= tb.PostageId
WHERE tc.PostageCode = 'Package'
AND ta.MailId = 'Specification'
The problem I am struggling with is I cannot seem to figure out how to do a left join in LINQ before an inner join, since doing a left join in LINQ is not as clear to me at least.
I have found numerous examples of a LINQ inner join and then a left join, but not left join and then inner join.
If it helps, here is the LINQ query I have been playing around with:
var query = from m in tableA
join s in tableB on m.ShipId equals s.ShipId into queryDetails
from qd in queryDetails.DefaultIfEmpty()
join p in tableC on qd.PostageId equals p.PostageId
where m.MailId == "Specification" && p.PostageCode == "Package"
select m.MailId;
I have tried this a few different ways but I keep getting an "Object reference not set to an instance of an object" error on qd.PostageId.
LINQ is very new to me and I love learning it, so any help on this would be much appreciated. Thanks!
From my SQL conversion recipe:
JOIN conditions that aren't all equality tests with AND must be handled using where clauses outside the join, or with cross product (from ... from ...) and then where
JOIN conditions that are multiple ANDed equality tests between the two tables should be translated into anonymous objects
LEFT JOIN is simulated by using into joinvariable and doing another from from the joinvariable followed by .DefaultIfEmpty().
The order of JOIN clauses doesn't change how you translate them:
var ans = from ta in TableA
join tb in TableB on ta.ShipId equals tb.ShipId into tbj
from tb in tbj.DefaultIfEmpty()
join tc in TableC on tb.PostageId equals tc.PostageId
where tc.PostageCode == "Package" && ta.MailId == "Specification"
select new { ta, tb, tc };
However, because the LEFT JOIN is executed before the INNER JOIN and then the NULL PostageIds in TableB for unmatched rows will never match any row in TableC, it becomes equivalent to an INNER JOIN as well, which translates as:
var ans2 = from ta in tableA
join tb in tableB on ta.ShipId equals tb.ShipId
join tc in tableC on tb.PostageId equals tc.PostageId
where tc.PostageCode == "Package" && ta.MailId == "Specification"
select new { ta, tb, tc };
Use:
var query = from m in tableA
join s in tableB on m.ShipId equals s.ShipId
join p in tableC on s.PostageId equals p.PostageId
where m.MailId == "Specification" && p.PostageCode == "Package"
select m.MailId;
Your query uses a LEFT OUTER JOIN but it doesn't need it.
It will, in practice, function as an INNER JOIN due to your tc.PostageCode = 'Package' clause. If you compare to a column value in a table in a WHERE clause (and there are no OR clauses and you aren't comparing to NULL) then effectively all joins to get to that table will be treated as INNER).
That clause will never be true if TableB is null (which is why you use LEFT OUTER JOIN vs INNER JOIN) - so you should just use an INNER JOIN to make the problem simpler.

Left Outer Join in Linq not working

I'm trying to do a left outer join in Linq but the below code is not working
var result = from dataRows1 in agdt.AsEnumerable()
join dataRows2 in hwt.AsEnumerable()
on dataRows1.Field<string>("ID") equals dataRows2.Field<string>("HWID")
where ((dataRows2.Field<string>("HWID") == null) &&
(dataRows1.Field<string>("TYPE")=="a"))
select dataRows1;
Without the where clauses I receive about 37000 rows and with it I receieve 0. The agdt table has 12000 rows and the hwt table has 6000. This is getting very frustrating. Can someone please help?
You are missing the DefaultIfEmpty method call.
From what I understand from your query, it should look something like:
var result = from dataRows1 in agdt.AsEnumerable()
join dataRows2 in hwt.AsEnumerable()
on dataRows1.Field<string>("ID") equals dataRows2.Field<string>("HWID")
into groupJoin
from leftOuterJoinedTable in groupJoin.DefaultIfEmpty()
where (leftOuterJoinedTable == null &&
(dataRows1.Field<string>("TYPE")=="a"))
select dataRows1;
It seems to me that this in essence would be the same as running the following SQL query
SELECT
DR1.*
FROM DataRows1 DR1
INNER JOIN DataRows2 DR2
ON DR1.ID=DR2.HWID
WHERE
DR2.HWID IS NULL
AND
DR1.Type='a'
Essentially your LINQ is doing an inner join and then executing the where. To truly do a left join, see the link
LEFT OUTER JOIN in LINQ

Multiple table join with multiple table where clause

I have a query that's something like this.
Select a.*
from table1 a
inner join table2 b on a.field1 = b.field1
inner join table3 c on b.field2 = c.field2
where b.field4 = beta and c.field5 = gamma.
On LINQ, I tried to do that this way:
var query = (from a in table1
join b in table2 on a["field1"] equals b["field1"]
join c in table3 on b["field2"] equals c["field2"]
where (b["field4"] == beta && c["field5"] == gamma)
select a).ToList();
But for some reason, when I try to do this I get an error that says that the entity "table2" doesn't have the field Name = "field5", as though as the where clause was all about the last joined table and the other ones were unaccessible. Furthermore, the compiler doesn't seem to notice neither, because it lets me write c["field5"] == gamma with no warning.
Any ideas? Am I writing this wrong?
Thanks
See these links:
How to: Perform Inner Joins (C# Programming Guide)
What is the syntax for an inner join in linq to sql?
Why you don't create View in database, and Select your data from View in LINQ?

linq-to-sql joins with multiple from clauses syntax vs. traditional join syntax

What the difference between writing a join using 2 from clauses and a where like this:
var SomeQuery = from a in MyDC.Table1
from b in MyDC.Table2
where a.SomeCol1 == SomeParameter && a.SomeCol2 === b.SomeCol1
and writing a join using the join operator.
This is for a join on 2 tables but of course, sometimes, we need to join even more tables and we need to combine other from clauses with where if we choose the syntax above.
I know both syntax queries return the same data but I was wondering if there's a performance difference, or another kind of difference, that would conclusively favor one syntax over the other.
Thanks for your suggestions.
This question is actually answered pretty good in these two.
INNER JOIN ON vs WHERE clause
INNER JOIN vs multiple table names in "FROM"
I've included two examples on how three different LINQ expressions will be translated into SQL.
Implicit join:
from prod in Articles
from kat in MainGroups
where kat.MainGroupNo == prod.MainGroupNo
select new { kat.Name, prod.ArticleNo }
Will be translated into
SELECT [t1].[Name], [t0].[ArticleNo]
FROM [dbo].[Article] AS [t0], [dbo].[MainGroup] AS [t1]
WHERE [t1].[MainGroupNo] = [t0].[MainGroupNo]
Inner join:
from prod in Articles
join kat in MainGroups on prod.MainGroupNo equals kat.MainGroupNo
select new { kat.Name, prod.ArticleNo }
Will be translated into
SELECT [t1].[Name], [t0].[ArticleNo]
FROM [dbo].[Article] AS [t0]
INNER JOIN [dbo].[MainGroup] AS [t1] ON [t0].[MainGroupNo] = [t1].[MainGroupNo]
Left outer join:
from prod in Articles
join g1 in MainGroups on prod.MainGroupNo equals g1.MainGroupNo into prodGroup
from kat in prodGroup.DefaultIfEmpty()
select new { kat.Name, prod.ArticleNo }
Will be translated into
SELECT [t1].[Name] AS [Name], [t0].[ArticleNo]
FROM [dbo].[Article] AS [t0]
LEFT OUTER JOIN [dbo].[MainGroup] AS [t1] ON [t0].[MainGroupNo] = [t1].[MainGroupNo]
If you want to test how your expressions will be translated into SQL, I recommend that you try LINQPad. It's an awesome tool for figuring out this kind of stuff.
var result = from a in DB.classA
from b in DB.classB
where a.id.Equals(b.id)
select new{a.b};

Categories

Resources