LINQ Join with Multiple From Clauses


When writing LINQ queries in C#, I know I can perform a join using the join keyword. But what does the following do?
from c in Companies
from e in c.Employees
select e;
A LINQ book I have says it's a type of join, but not a proper join (which uses the join keyword). So exactly what type of join is it, then?

Multiple "from" clauses form a compound LINQ query. They behave like nested foreach statements. The MSDN page has a great example:
var scoreQuery = from student in students
from score in student.Scores
where score > 90
select new { Last = student.LastName, score };
This statement could be rewritten as:
// SomeDupCollection is a placeholder for any collection that allows duplicate keys.
var nameScore = new SomeDupCollection<string, int>();
foreach (Student curStudent in students)
{
    foreach (int curScore in curStudent.Scores)
    {
        if (curScore > 90)
        {
            nameScore.Add(curStudent.LastName, curScore);
        }
    }
}

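This gets translated into a SelectMany() call. When the second sequence does not depend on the first, the result is essentially a cross join; when it does (as with c.Employees), each company is flattened into its own employees.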
Jon Skeet talks about it on his blog, as part of the Edulinq series. (Scroll down to Secondary "from" clauses.)
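
To make the SelectMany() translation concrete, here is a minimal LINQ-to-Objects sketch. The company names and employee lists are made-up sample data, and the anonymous types stand in for whatever classes the real Companies collection holds.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: each company carries its own employee list.
var companies = new[]
{
    new { Name = "Acme", Employees = new[] { "Ann", "Bob" } },
    new { Name = "Initech", Employees = new[] { "Cid" } },
    new { Name = "Empty Co", Employees = new string[0] },
};

// Query syntax with two from clauses.
var viaQuery = (from c in companies
                from e in c.Employees
                select e).ToList();

// Roughly what the compiler generates for the query above.
var viaSelectMany = companies
    .SelectMany(c => c.Employees, (c, e) => e)
    .ToList();

Console.WriteLine(string.Join(", ", viaQuery));            // Ann, Bob, Cid
Console.WriteLine(viaQuery.SequenceEqual(viaSelectMany));  // True
```

A company with no employees contributes nothing to the result, which is why this behaves like an inner join over the parent/child relationship rather than a full cross product.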

The code that you listed:
from c in company
from e in c.Employees
select e;
... will produce a list of every employee for every company in the company variable. If an employee works for two companies, they will be included in the list twice.
The only "join" that might occur here is when you say c.Employees. In an SQL-backed provider, this would translate to an inner join from the Company table to the Employee table.
However, the double-from construct is often used to perform "joins" manually, like so:
from c in companies
from e in employees
where c.CompanyId == e.CompanyId
select e;
This would have a similar effect to the code you posted, with potential subtle differences depending on what the employees variable contains. It would also be equivalent to the following join:
from c in companies
join e in employees
on c.CompanyId equals e.CompanyId
select e;
If you wanted a Cartesian product, however, you could just remove the where clause. (To make it worth anything, you'd probably want to change the select slightly, too, though.)
from c in companies
from e in employees
select new {c, e};
This last query would give you every possible combination of company and employee.
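
The three variants can be compared side by side in memory. This is only a sketch: the CompanyId and Name properties and the sample rows are assumptions made for illustration, and a database provider may generate different SQL for each form.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data.
var companies = new[]
{
    new { CompanyId = 1, Name = "Acme" },
    new { CompanyId = 2, Name = "Initech" },
};
var employees = new[]
{
    new { CompanyId = 1, Name = "Ann" },
    new { CompanyId = 1, Name = "Bob" },
    new { CompanyId = 2, Name = "Cid" },
    new { CompanyId = 3, Name = "Dee" }, // no matching company
};

// "Manual" join: double from plus a where clause.
var manual = (from c in companies
              from e in employees
              where c.CompanyId == e.CompanyId
              select e.Name).ToList();

// The same result using the join keyword.
var joined = (from c in companies
              join e in employees on c.CompanyId equals e.CompanyId
              select e.Name).ToList();

// No where clause: every combination of company and employee.
var cartesian = (from c in companies
                 from e in employees
                 select new { c, e }).ToList();

Console.WriteLine(string.Join(", ", manual));   // Ann, Bob, Cid
Console.WriteLine(string.Join(", ", joined));   // Ann, Bob, Cid
Console.WriteLine(cartesian.Count);             // 8
```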

Every object in the first set is paired with every object in the second set. For example, the following test will pass:
[TestMethod()]
public void TestJoin()
{
    var list1 = new List<Object1>();
    var list2 = new List<Object2>();
    list1.Add(new Object1 { Prop1 = 1, Prop2 = "2" });
    list1.Add(new Object1 { Prop1 = 4, Prop2 = "2av" });
    list1.Add(new Object1 { Prop1 = 5, Prop2 = "2gks" });
    list2.Add(new Object2 { Prop1 = 3, Prop2 = "wq" });
    list2.Add(new Object2 { Prop1 = 9, Prop2 = "sdf" });
    // 3 items x 2 items = 6 combinations
    var list = (from l1 in list1
                from l2 in list2
                select l1).ToList();
    Assert.AreEqual(6, list.Count);
}

Related

Too much data in Contains (Linq): How to increase performance

I have a linq query like this:
(from a in context.table_A
join b in
(from temp in context.table_B
where idlist.Contains(temp.id)
select temp)
on a.seq_id equals b.seq_id into c
where
idlist.Contains(a.id)
select new MyObject
{
...
}).ToList();
idlist is a List.
The problem is that idlist holds too many values (hundreds of thousands to several million). The query works fine with a few records, but when there are too many, the Contains call fails.
The error log says:
The query plan cannot be created due to lack of internal resources of
the query processor. This is a rare event and only occurs for very
complex queries or queries that reference a very large number of
tables or partitions. Make the query easy.
I want to improve the performance of this section. Any ideas?

I would suggest installing the linq2db.EntityFrameworkCore extension and using temporary tables with fast BulkCopy:
public class IdItem
{
public int Id { get; set; }
}
...
var items = idList.Select(id => new IdItem { Id = id });
using var db = context.CreateLinqToDBConnection();
using var temp = db.CreateTempTable("#IdList", items);
var query =
from a in context.table_A
join id1 in temp on a.Id equals id1.Id
join b in context.table_B on a.seq_id equals b.seq_id
join id2 in temp on b.Id equals id2.Id
select new MyObject
{
...
};
// switch to alternative translator
query = query.ToLinqToDB();
var result = query.ToList();

How to Convert Linq to Lambda Expression

var getr = (from d in _context.DR
join r in _context.R on d.RID equals r.RID
where HID == r.HID && cI >= d.DRD && cO < d.DRD
group d by new {d.RID, d.RGID} into g
select g);
How to convert Linq to lambda? This is what I got:
var getr = _context.DR.Join(_context.R, x => x.RID, y => y.RID, (x, y) => new { R= x, DR= y}).Where(z => z.DR.RID== y.RID);
Are there any pros and cons of using either one?
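
For reference, one possible method-syntax form of the query in this question is sketched below. The arrays dr and rs are hypothetical in-memory stand-ins for _context.DR and _context.R, and all columns are assumed to be int, which the question does not state.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for _context.DR and _context.R (column types assumed).
var dr = new[]
{
    new { RID = 1, RGID = 10, DRD = 5 },
    new { RID = 1, RGID = 10, DRD = 6 },
    new { RID = 1, RGID = 11, DRD = 50 },
    new { RID = 2, RGID = 20, DRD = 5 },
};
var rs = new[]
{
    new { RID = 1, HID = 7 },
    new { RID = 2, HID = 8 },
};
int hid = 7, cI = 10, cO = 0;

// Query syntax, as in the question.
var queryForm = (from d in dr
                 join r in rs on d.RID equals r.RID
                 where hid == r.HID && cI >= d.DRD && cO < d.DRD
                 group d by new { d.RID, d.RGID } into g
                 select g).ToList();

// Method syntax: Join carries both sides forward, then Where, then GroupBy.
var methodForm = dr
    .Join(rs, d => d.RID, r => r.RID, (d, r) => new { d, r })
    .Where(x => hid == x.r.HID && cI >= x.d.DRD && cO < x.d.DRD)
    .GroupBy(x => new { x.d.RID, x.d.RGID }, x => x.d)
    .ToList();

foreach (var g in methodForm)
    Console.WriteLine($"{g.Key.RID}/{g.Key.RGID}: {g.Count()} rows");  // 1/10: 2 rows
```

The key point is that the Join result selector has to keep both d and r (here in an anonymous type), because the Where that follows needs columns from both.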

In terms of performance, there is no difference whatsoever between the two.
Which one to use is mostly personal preference, but it's important to bear in mind that there are situations where one is better suited than the other.
int[] ints = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
// using Query expression
var evensQuery = from i in ints where isEven(i) select i;
// using Lambda expression
var evensLambda = ints.Where(isEven);
Many operators are available only through method (lambda) syntax, e.g. Single(), First(), Take(), Skip().
You can, however, mix and match the two by calling the lambda-only methods at the end of the query:
// mix and match query and Lambda syntax
//Example ver :1
var query = (from person in people
join pet in pets on person equals pet.Owner
select new { OwnerName = person.Name, Pet = pet.Name }).Skip(1).Take(2);
Or, for better readability:
//Example ver :2
var query = from person in people
join pet in pets on person equals pet.Owner
select new { OwnerName = person.Name, Pet = pet.Name };
var result = query.Skip(1).Take(2);
Both versions return the same output with no performance difference because of deferred (delayed) execution: the query does not execute where it is declared, but when you iterate over the result variable.
However, if you don't want deferred execution, or you need one of the aggregate functions such as Average() or Sum(), be aware that the underlying sequence may be modified between the assignments to query and result. In that case, I'd argue it's best to use lambda expressions from the start, or to add the lambda-only methods to the query expression.
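
A small sketch of that pitfall, using a plain List<int> as made-up sample data: the query expression is evaluated lazily, while Count() and ToList() run immediately.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3, 4 };

// Deferred: nothing is evaluated yet.
var evens = from n in numbers where n % 2 == 0 select n;

// Immediate: both of these run against the list as it is right now.
var evenCountNow = numbers.Count(n => n % 2 == 0);
var snapshot = evens.ToList();

// The underlying sequence changes after the query was declared.
numbers.Add(6);

Console.WriteLine(evenCountNow);    // 2
Console.WriteLine(snapshot.Count);  // 2
Console.WriteLine(evens.Count());   // 3, because the query re-reads the list
```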

Entity Framework + LINQ Expression outer join error

I want to create a left outer join in a LINQ expression that queries data from a database via Entity Framework. This is the LINQ expression. Basically, I am trying to look up problem_vehicle_id from problemVehiclesTicket in the Problems table to see if it exists; if it doesn't, I want to return a problem object that is null/empty. I believe this is a left outer join.
var ticketsDetails = (from tickets in DbContext.tickets
                      join problemVehiclesTicket in DbContext.problem_vehicle on tickets.tickets_id equals problemVehiclesTicket.tickets_id
                      join problems in DbContext.problem on problemVehiclesTicket.problem_vehicle_id equals problems.problem_vehicle_id into problemGroup
                      from problems in problemGroup.DefaultIfEmpty(new problem { })
                      where (tickets.tickets_id == ticketsId)
                      select new TicketsDetails
                      {
                          Ticket = tickets,
                          ProblemVehicle = problemVehiclesTicket,
                          Problems = problems,
                      }).ToList();
Problem is a class that mirrors the Problem table in the database:
`Problem`
id (int), description (string), type (short)
The error I got is "The entity or complex type 'SPOTS_Repository.speeding_offence' cannot be constructed in a LINQ to Entities query." The source is Entity Framework.
Any help is greatly appreciated.

The type problem in your case is a mapped entity. Therefore, you cannot project onto it. You can use an anonymous type or another non-mapped class (DTO).
Because in your DefaultIfEmpty call you are constructing a new problem, which is a mapped entity, this is not allowed.
Fix
You do not need to pass anything to the DefaultIfEmpty method. In fact, in your case you are not even allowed to, because the only thing you could pass is a problem, and that is mapped. Therefore, use .DefaultIfEmpty() without creating a new problem.
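
As a rough LINQ-to-Objects sketch of that fix: call DefaultIfEmpty() with no argument, then handle the null case in the projection, projecting onto an anonymous type instead of the mapped entity. The arrays and property values below are made up to mimic the problem_vehicle and problem tables; this has not been run against Entity Framework.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for the problem_vehicle and problem tables.
var problemVehicles = new[]
{
    new { problem_vehicle_id = 1, tickets_id = 100 },
    new { problem_vehicle_id = 2, tickets_id = 100 },
};
var problems = new[]
{
    new { id = 10, problem_vehicle_id = 1, description = "Broken light" },
};

var rows = (from pv in problemVehicles
            join p in problems
                on pv.problem_vehicle_id equals p.problem_vehicle_id into problemGroup
            // No argument: match is null when no problem exists for this vehicle.
            from match in problemGroup.DefaultIfEmpty()
            select new
            {
                pv.problem_vehicle_id,
                Description = match == null ? "(none)" : match.description
            }).ToList();

foreach (var row in rows)
    Console.WriteLine($"{row.problem_vehicle_id}: {row.Description}");
// 1: Broken light
// 2: (none)
```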
More Belabor
Here is an example, which will clarify the usage of DefaultIfEmpty:
Option 1: DefaultIfEmpty() with No Parameter
var list1 = new List<int> { 1, 2, 3, 6, 4 };
var list2 = new List<int> { 4, 1, 2 };
var selection =
from l1 in list1
join l2 in list2 on l1 equals l2 into joined
from j in joined.DefaultIfEmpty()
select j;
Output: 1, 2, 0, 0, 4
Why? Because 3 and 6 are not found and DefaultIfEmpty for an integer returns a 0.
Option 2: DefaultIfEmpty() with Parameter
In some cases we may want to specify what to return when an item is not found in the join. We can do that by passing a single argument to the DefaultIfEmpty method, like this:
var list1 = new List<int> { 1, 2, 3, 6, 4 };
var list2 = new List<int> { 4, 1, 2 };
var selection =
from l1 in list1
join l2 in list2 on l1 equals l2 into joined
from j in joined.DefaultIfEmpty(99) //<-- see this
select j;
Output: 1, 2, 99, 99, 4
Why? Because 3 and 6 are not found and we instructed DefaultIfEmpty to return a 99 in that case.
Please note that DefaultIfEmpty is a generic method. In my case it required an int because I am joining to the second list which is a List of int(s). In your case it is problem(s) but that is mapped. Therefore, you cannot construct it in your query.
Here is another example:
var depts = new List<Department>
{
new Department { Name = "Accounting" },
new Department { Name = "IT" },
new Department { Name = "Marketing" }
};
var persons = new List<Person>
{
new Person { DeptName = "Accounting", Name = "Bob" }
};
var selection2 =
from d in depts
join p in persons on d.Name equals p.DeptName into joined2
// See here DefaultIfEmpty can be passed a Person
from j2 in joined2.DefaultIfEmpty(new Person { DeptName = "Unknown", Name = "Alien" })
select j2;
foreach(var thisJ in selection2)
{
Console.WriteLine("Dept: {0}, Name: {1}", thisJ.DeptName, thisJ.Name);
}
Output:
Dept: Accounting, Name: Bob
Dept: Unknown, Name: Alien
Dept: Unknown, Name: Alien

public class problem
{
    public int id;
    public string description;
    public short type;
}
.DefaultIfEmpty(
    new problem()
    {
        id = ticketsId,
        description = string.Empty,
    });
Create a class like this and use it in the LINQ query.
Hope it helps.

Entity framework - select by multiple conditions in same column - referenced table

Example scenario:
Two tables: order and orderItem, relationship One to Many.
I want to select all orders that have at least one orderItem with price 100 and at least one orderItem with price 200.
I can do it like this:
var orders = (from o in kontextdbs.orders
join oi in kontextdbs.order_item on o.id equals oi.order_id
join oi2 in kontextdbs.order_item on o.id equals oi2.order_id
where oi.price == 100 && oi2.price == 200
select o).Distinct();
But what if those conditions are user-generated?
Then I don't know how many conditions there will be.

You need to loop through all the values, applying Where with Any for each one, like this:
List<int> values = new List<int> { 100, 200 };
var orders = from o in kontextdbs.orders
select o;
foreach(int value in values)
{
int tmpValue = value;
orders = orders.Where(x => kontextdbs.order_item.Where(oi => x.id == oi.order_id)
.Any(oi => oi.price == tmpValue));
}
orders = orders.Distinct();
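
The same pattern can be tried out in memory. This is a sketch with made-up orders and order items, using anonymous types in place of the entity classes; against a real context the lambdas would be translated to SQL instead of being run directly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data.
var orders = new[] { new { id = 1 }, new { id = 2 }, new { id = 3 } };
var orderItems = new[]
{
    new { order_id = 1, price = 100 },
    new { order_id = 1, price = 200 },
    new { order_id = 2, price = 100 },
    new { order_id = 3, price = 200 },
    new { order_id = 3, price = 300 },
};
var requiredPrices = new List<int> { 100, 200 }; // user-supplied conditions

// Start with all orders, then narrow once per required price.
var query = orders.AsEnumerable();
foreach (var price in requiredPrices)
{
    var wanted = price; // local copy so each lambda captures its own value
    query = query.Where(o =>
        orderItems.Any(oi => oi.order_id == o.id && oi.price == wanted));
}

var matchingIds = query.Select(o => o.id).ToList();
Console.WriteLine(string.Join(", ", matchingIds)); // 1
```

Each Where adds one more "has an item with this price" condition, so only order 1, which has both a 100 and a 200 item, survives.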

List<int> orderValues = new List<int> { 100, 200 };
ObjectQuery<Order> orders = kontextdbs.Orders;
foreach(int value in orderValues) {
orders = (ObjectQuery<Order>)(from o in orders
join oi in kontextdbs.order_item
on o.id equals oi.order_id
where oi.price == value
select o);
}
orders = orders.Distinct();
ought to work, or at least that's the general pattern - you can apply extra queries to the IObjectQueryables at each stage.
Note that in my experience generating dynamic queries like this with EF gives terrible performance, unfortunately - it spends a few seconds compiling each one into SQL the first time it gets a specific pattern. If the number of order values is fairly stable though then this particular query ought to work OK.

LINQ In Line Property Update During Join

I have two objects, A and B for this discussion. I can join these objects (tables) via a common relationship or foreign key. I am using LINQ to do this join and I only want to return ObjectA in my result set; however, I would like to update a property of ObjectA with data from ObjectB during the join, so that the ObjectAs I get out of my LINQ query are "slightly" different from their original state in the storage medium of choice.
Here is my query. You can see that I would just like to be able to do something like objectA.SomeProperty = objectB.AValueIWantBadly.
I know I could do a new in my select and spin up new ObjectAs, but I would like to avoid that if possible and simply update a field.
return from objectA in GetObjectAs()
join objectB in GetObjectBs()
on objectA.Id equals objectB.AId
// update object A with object B data before selecting it
select objectA;

Add an update method to your ClassA
class ClassA {
public ClassA UpdateWithB(ClassB objectB) {
// Do the update
return this;
}
}
then use
return from objectA in GetObjectAs()
join objectB in GetObjectBs()
on objectA.Id equals objectB.AId
// update object A with object B data before selecting it
select objectA.UpdateWithB(objectB);
EDIT:
Or use a local lambda function like:
Func<ClassA, ClassB, ClassA> f = ((a,b)=> { a.DoSomethingWithB(b); return a;});
return from objectA in GetObjectAs()
join objectB in GetObjectBs()
on objectA.Id equals objectB.AId
select f(objectA, objectB);

From the word "tables", it sounds like you are getting this data from a database. In which case, no: you can't do this. The closest you can get would be to select the objects and the extra columns, and update the properties afterwards:
var qry = from objectA in GetObjectAs()
join objectB in GetObjectBs()
on objectA.Id equals objectB.AId
select new { A = objectA,
objectB.SomeProp, objectB.SomeOtherProp };
foreach(var item in qry) {
item.A.SomeProp = item.SomeProp;
item.A.SomeOtherProp = item.SomeOtherProp;
// perhaps "yield return item.A;" here
}
If you were doing LINQ-to-Objects, there are perhaps some hacky ways you could do it with fluent APIs - not pretty, though. (edit - like this other reply)

I am doing a left join here so I still have all the data from objectA even if the corresponding property in objectB is null. So if the corresponding property in objectB is null then you have to define what to do in objectA. I use this statement all the time for joining two sets of data. You do not need to exhaustively list all properties in objectA and how they map, you only need to list the values you want to update with objectB. Pre-existing values in objectA are safe unless a mapping to objectB is defined.
return from objectA in GetObjectAs()
join objectB in GetObjectBs()
on objectA.Id equals objectB.AId into combinedObj
from subObject in combinedObj.DefaultIfEmpty()
// update object A with object B data before selecting it
select ((Func<objectAType>)(() =>
{
objectA.property = ((subObject == null) ? "Object B was null" : subObject.property);
objectA.property = ((subObject == null) ? "Object B was null" : subObject.property);
return objectA;
}))()

First extend Linq to have an Each option by creating a class called LinqExtensions.
public static class LinqExtensions
{
public static void Each<T>(this IEnumerable<T> source, Action<T> method)
{
foreach (var item in source)
{
method(item);
}
}
}
Then you can use Join to return a list of new objects that contain the original objects with their appropriate values. The Each will iterate over them, allowing you to either assign the values or pass them as parameters to each object.
Assignment example:
objectA.Join(objectB,a=>a.Id,b=>b.Id,(a,b) => new {a,b.AValueIWant}).Each(o=>o.a.SomeProperty=o.AValueIWant);
Parameter passing example:
objectA.Join(objectB,a=>a.Id,b=>b.Id,(a,b) => new {a,b.AValueIWant}).Each(o=>o.a.SomeMethod(o.AValueIWant));
The nice thing about this is that ObjectA and ObjectB do not have to be the same type. I have done this with a list of objects joined to a Dictionary (like a lookup). The bad thing is that it isn't clear what is going on. You would be better off skipping the Each extension and writing it like this:
foreach(var change in objectA.Join(objectB,a=>a.Id,b=>b.Id,(a,b) => new {a,b.AValueIWant}))
{
change.a.SomeProperty = change.AValueIWant;
change.a.SomeMethod(change.AValueIWant);
}
But for more clarity I would probably do this:
foreach(var update in objectA.Join(objectB,objectA=>objectA.Id,objectB=>objectB.Id,(objectA,objectB) => new {objectA, Value = objectB.AValueIWant}))
{
update.objectA.SomeProperty = update.Value;
}
You will need to return the whole ObjectA in your new object, because the new object will be read-only. The only reason this works is that the objects in a collection are held by reference, which lets you change properties on the original objects.
But in the end it would be clearest to skip the LINQ join altogether and just loop through the collections looking for matches; this will help with future maintenance. LINQ is awesome, but just as having a hammer doesn't make everything a nail, having a collection doesn't mean LINQ is the answer.

Can you try the let clause? (I'm not at my dev machine to test this out myself.)
return from objectA in GetObjectAs()
join objectB in GetObjectBs()
on objectA.Id equals objectB.AId
let objectA.SomeProperty = objectB.AValueIWantBadly
select objectA;
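
Note that a let clause can only introduce a new range variable; it cannot assign to a property of an existing one, so the query above would not compile as written. A sketch of what let does allow is shown below, using made-up anonymous-type data. It builds new objects in the select, which is exactly what the question hoped to avoid.

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for GetObjectAs() and GetObjectBs().
var objectAs = new[]
{
    new { Id = 1, SomeProperty = "old" },
    new { Id = 2, SomeProperty = "old" },
};
var objectBs = new[]
{
    new { AId = 1, AValueIWantBadly = "new" },
};

var result = (from a in objectAs
              join b in objectBs on a.Id equals b.AId
              let wanted = b.AValueIWantBadly      // new range variable, not an assignment
              select new { a.Id, SomeProperty = wanted }).ToList();

foreach (var item in result)
    Console.WriteLine($"{item.Id}: {item.SomeProperty}"); // 1: new
```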

You can try the following:
var list1 = new List<ItemOne>
{
new ItemOne {IDItem = 1, OneProperty = "1"},
new ItemOne {IDItem = 2, OneProperty = null},
new ItemOne {IDItem = 3, OneProperty = "3"},
new ItemOne {IDItem = 4, OneProperty = "4"}
};
var list2 = new List<ItemTwo>
{
new ItemTwo {IDItem = 2, TwoProperty = "2"},
new ItemTwo {IDItem = 3, TwoProperty = "3"},
};
var query = list1.Join(list2, l1 => l1.IDItem, l2 => l2.IDItem, (l1, l2) =>
{
l1.OneProperty = l2.TwoProperty;
return l1;
});
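
One caveat with a side-effecting result selector like this: Join is lazily evaluated, so nothing is updated until the query is enumerated, and the lambda runs again on every enumeration. The sketch below uses a counter (a stand-in for the property assignment) and made-up data to show this.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; only the join key matters here.
var list1 = new[] { new { IDItem = 1 }, new { IDItem = 2 }, new { IDItem = 3 } };
var list2 = new[] { new { IDItem = 2 }, new { IDItem = 3 } };

var selectorRuns = 0; // counts how often the side-effecting lambda executes

var query = list1.Join(list2, l1 => l1.IDItem, l2 => l2.IDItem, (l1, l2) =>
{
    selectorRuns++;   // stands in for "l1.OneProperty = l2.TwoProperty"
    return l1;
});

var runsBeforeEnumeration = selectorRuns; // still 0: the query has not run
var firstPass = query.ToList();
var runsAfterFirstPass = selectorRuns;    // 2: one run per matching pair
var secondPass = query.ToList();
var runsAfterSecondPass = selectorRuns;   // 4: enumerating again repeats the side effect

Console.WriteLine($"{runsBeforeEnumeration}, {runsAfterFirstPass}, {runsAfterSecondPass}"); // 0, 2, 4
```

Calling ToList() once and working with the resulting list avoids repeating the updates.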
