Left Outer Join in Linq not working - c#

I'm trying to do a left outer join in Linq but the below code is not working
var result = from dataRows1 in agdt.AsEnumerable()
join dataRows2 in hwt.AsEnumerable()
on dataRows1.Field<string>("ID") equals dataRows2.Field<string>("HWID")
where ((dataRows2.Field<string>("HWID") == null) &&
(dataRows1.Field<string>("TYPE")=="a"))
select dataRows1;
Without the where clauses I receive about 37000 rows and with it I receieve 0. The agdt table has 12000 rows and the hwt table has 6000. This is getting very frustrating. Can someone please help?

You are missing the DefaultIfEmpty method call.
From what I understand from your query, it should look something like:
var result = from dataRows1 in agdt.AsEnumerable()
join dataRows2 in hwt.AsEnumerable()
on dataRows1.Field<string>("ID") equals dataRows2.Field<string>("HWID")
into groupJoin
from leftOuterJoinedTable in groupJoin.DefaultIfEmpty()
where (leftOuterJoinedTable == null &&
(dataRows1.Field<string>("TYPE")=="a"))
select dataRows1;

It seems to me that this in essence would be the same as running the following SQL query
SELECT
DR1.*
FROM DataRows1 DR1
INNER JOIN DataRows2 DR2
ON DR1.ID=DR2.HWID
WHERE
DR2.HWID IS NULL
AND
DR1.Type='a'
Essentially your LINQ is doing an inner join and then executing the where. To truly do a left join, see the link
LEFT OUTER JOIN in LINQ

Related

What's wrong with the joins in this LINQ query?

I'm trying to replicate the following SQL query in LINQ:
SELECT *
FROM Table1 AS D INNER JOIN Table2 AS DV ON D.Table1Id = DV.Table1Id
INNER JOIN Table3 AS VT ON DV.Table3Id = VT.Table3Id
INNER JOIN Table4 AS C ON DV.CurrencyId = C.CurrencyId
INNER JOIN Table5 AS FP ON DV.DVDate BETWEEN FP.StartDate AND FP.EndDate
INNER JOIN Table6 AS FX ON DV.CurrencyId = FX.FromCurrencyId AND FX.ToCurrencyId = 'USD' AND FX.FiscalPeriodId = FP.FiscalPeriodId
This is what I have in LINQ:
from d in db.Table1
join dv in db.Table2 on d.Table1Id equals dv.Table1Id
join vt in db.Table3 on dv.Table3Id equals vt.Table3Id
join c in db.Table4 on dv.CurrencyId equals c.CurrencyId
join fp in db.Table5 on dv.DVDate >= fp.StartDate && dv.DVDate <= fp.EndDate //error on this line
join fx in db.Table6 on dv.CurrencyId equals fx.FromCurrencyId && fx.ToCurrencyId equals "USD" && fx.FiscalPeriodId equals fp.FiscalPeriodId //error also on this line
The last two joins to fp and fx are the problem but it's not clear to me what's wrong, it doesn't seem to like && but there's no and keyword like there is an equals that replaces =.
I've removed the select portion from LINQ as it's not relevant to the problem and I'd like to avoid spending more time obfuscating table and field names.
"A join clause performs an equijoin. In other words, you can only base matches on the equality of two keys. Other types of comparisons such as "greater than" or "not equals" are not supported. To make clear that all joins are equijoins, the join clause uses the equals keyword instead of the == operator. "
reference: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/join-clause
you need to do this in the where clause. Like here:
https://stackoverflow.com/a/3547706/3058487
To do a join using composite keys, you need to do something like here:
new { dv.CurrencyId, fp.FiscalPeriodId } equals new { CurrencyId = fx.ToCurrencyId, fx.FiscalPeriodId }
Reference:
https://learn.microsoft.com/en-us/dotnet/csharp/linq/join-by-using-composite-keys

SQL to LINQ - Left Join Before Inner Join

So I have a SQL query that I would like to convert to LINQ.
Here is said query:
SELECT *
FROM DatabaseA.SchemaA.TableA ta
LEFT OUTER JOIN DatabaseA.SchemaA.TableB tb
ON tb.ShipId = ta.ShipId
INNER JOIN DatabaseA.SchemaA.TableC tc
ON tc.PostageId= tb.PostageId
WHERE tc.PostageCode = 'Package'
AND ta.MailId = 'Specification'
The problem I am struggling with is I cannot seem to figure out how to do a left join in LINQ before an inner join, since doing a left join in LINQ is not as clear to me at least.
I have found numerous examples of a LINQ inner join and then a left join, but not left join and then inner join.
If it helps, here is the LINQ query I have been playing around with:
var query = from m in tableA
join s in tableB on m.ShipId equals s.ShipId into queryDetails
from qd in queryDetails.DefaultIfEmpty()
join p in tableC on qd.PostageId equals p.PostageId
where m.MailId == "Specification" && p.PostageCode == "Package"
select m.MailId;
I have tried this a few different ways but I keep getting an "Object reference not set to an instance of an object" error on qd.PostageId.
LINQ is very new to me and I love learning it, so any help on this would be much appreciated. Thanks!
From my SQL conversion recipe:
JOIN conditions that aren't all equality tests with AND must be handled using where clauses outside the join, or with cross product (from ... from ...) and then where
JOIN conditions that are multiple ANDed equality tests between the two tables should be translated into anonymous objects
LEFT JOIN is simulated by using into joinvariable and doing another from from the joinvariable followed by .DefaultIfEmpty().
The order of JOIN clauses doesn't change how you translate them:
var ans = from ta in TableA
join tb in TableB on ta.ShipId equals tb.ShipId into tbj
from tb in tbj.DefaultIfEmpty()
join tc in TableC on tb.PostageId equals tc.PostageId
where tc.PostageCode == "Package" && ta.MailId == "Specification"
select new { ta, tb, tc };
However, because the LEFT JOIN is executed before the INNER JOIN and then the NULL PostageIds in TableB for unmatched rows will never match any row in TableC, it becomes equivalent to an INNER JOIN as well, which translates as:
var ans2 = from ta in tableA
join tb in tableB on ta.ShipId equals tb.ShipId
join tc in tableC on tb.PostageId equals tc.PostageId
where tc.PostageCode == "Package" && ta.MailId == "Specification"
select new { ta, tb, tc };
Use:
var query = from m in tableA
join s in tableB on m.ShipId equals s.ShipId
join p in tableC on s.PostageId equals p.PostageId
where m.MailId == "Specification" && p.PostageCode == "Package"
select m.MailId;
Your query uses a LEFT OUTER JOIN but it doesn't need it.
It will, in practice, function as an INNER JOIN due to your tc.PostageCode = 'Package' clause. If you compare to a column value in a table in a WHERE clause (and there are no OR clauses and you aren't comparing to NULL) then effectively all joins to get to that table will be treated as INNER).
That clause will never be true if TableB is null (which is why you use LEFT OUTER JOIN vs INNER JOIN) - so you should just use an INNER JOIN to make the problem simpler.

Multiple table join with multiple table where clause

I have a query that's something like this.
Select a.*
from table1 a
inner join table2 b on a.field1 = b.field1
inner join table3 c on b.field2 = c.field2
where b.field4 = beta and c.field5 = gamma.
On LINQ, I tried to do that this way:
var query = (from a in table1
join b in table2 on a["field1"] equals b["field1"]
join c in table3 on b["field2"] equals c["field2"]
where (b["field4"] == beta && c["field5"] == gamma)
select a).ToList();
But for some reason, when I try to do this I get an error that says that the entity "table2" doesn't have the field Name = "field5", as though as the where clause was all about the last joined table and the other ones were unaccessible. Furthermore, the compiler doesn't seem to notice neither, because it lets me write c["field5"] == gamma with no warning.
Any ideas? Am I writing this wrong?
Thanks
See these links:
How to: Perform Inner Joins (C# Programming Guide)
What is the syntax for an inner join in linq to sql?
Why you don't create View in database, and Select your data from View in LINQ?

linq-to-sql joins with multiple from clauses syntax vs. traditional join syntax

What the difference between writing a join using 2 from clauses and a where like this:
var SomeQuery = from a in MyDC.Table1
from b in MyDC.Table2
where a.SomeCol1 == SomeParameter && a.SomeCol2 === b.SomeCol1
and writing a join using the join operator.
This is for a join on 2 tables but of course, sometimes, we need to join even more tables and we need to combine other from clauses with where if we choose the syntax above.
I know both syntax queries return the same data but I was wondering if there's a performance difference, or another kind of difference, that would conclusively favor one syntax over the other.
Thanks for your suggestions.
This question is actually answered pretty good in these two.
INNER JOIN ON vs WHERE clause
INNER JOIN vs multiple table names in "FROM"
I've included two examples on how three different LINQ expressions will be translated into SQL.
Implicit join:
from prod in Articles
from kat in MainGroups
where kat.MainGroupNo == prod.MainGroupNo
select new { kat.Name, prod.ArticleNo }
Will be translated into
SELECT [t1].[Name], [t0].[ArticleNo]
FROM [dbo].[Article] AS [t0], [dbo].[MainGroup] AS [t1]
WHERE [t1].[MainGroupNo] = [t0].[MainGroupNo]
Inner join:
from prod in Articles
join kat in MainGroups on prod.MainGroupNo equals kat.MainGroupNo
select new { kat.Name, prod.ArticleNo }
Will be translated into
SELECT [t1].[Name], [t0].[ArticleNo]
FROM [dbo].[Article] AS [t0]
INNER JOIN [dbo].[MainGroup] AS [t1] ON [t0].[MainGroupNo] = [t1].[MainGroupNo]
Left outer join:
from prod in Articles
join g1 in MainGroups on prod.MainGroupNo equals g1.MainGroupNo into prodGroup
from kat in prodGroup.DefaultIfEmpty()
select new { kat.Name, prod.ArticleNo }
Will be translated into
SELECT [t1].[Name] AS [Name], [t0].[ArticleNo]
FROM [dbo].[Article] AS [t0]
LEFT OUTER JOIN [dbo].[MainGroup] AS [t1] ON [t0].[MainGroupNo] = [t1].[MainGroupNo]
If you want to test how your expressions will be translated into SQL, I recommend that you try LINQPad. It's an awesome tool for figuring out this kind of stuff.
var result = from a in DB.classA
from b in DB.classB
where a.id.Equals(b.id)
select new{a.b};

Self join in Entity Framework

I want to have following type of query in entity frame work
SELECT c2.*
FROM Category c1 INNER JOIN Category c2
ON c1.CategoryID = c2.ParentCategoryID
WHERE c1.ParentCategoryID is NULL
How to do the above work in Entity framework...
Well, I don't know much about EF, but that looks something like:
var query = from c1 in db.Category
where c1.ParentCategoryId == null
join c2 in db.Category on c1.CategoryId equals c2.ParentCategoryId
select c2;
Just to tidy this up the following is a bit nicer and does the same thing:
var query = from c1 in db.Category
from c2 in db.Category
where c1.ParentCategoryId == null
where c1.CategoryId == c2.ParentCategoryId
select c2;
In EF 4.0+, LEFT JOIN syntax is a little different and presents a crazy quirk:
var query = from c1 in db.Category
join c2 in db.Category on c1.CategoryID equals c2.ParentCategoryID
into ChildCategory
from cc in ChildCategory.DefaultIfEmpty()
select new CategoryObject
{
CategoryID = c1.CategoryID,
ChildName = cc.CategoryName
}
If you capture the execution of this query in SQL Server Profiler, you will see that it does indeed perform a LEFT OUTER JOIN. HOWEVER, if you have multiple LEFT JOIN ("Group Join") clauses in your Linq-to-Entity query, I have found that the self-join clause MAY actually execute as in INNER JOIN - EVEN IF THE ABOVE SYNTAX IS USED!
The resolution to that? As crazy and, according to MS, wrong as it sounds, I resolved this by changing the order of the join clauses. If the self-referencing LEFT JOIN clause was the 1st Linq Group Join, SQL Profiler reported an INNER JOIN. If the self-referencing LEFT JOIN clause was the LAST Linq Group Join, SQL Profiler reported an LEFT JOIN.

Categories

Resources