Understanding the GroupJoin and Join in Linq chaining syntax (Homework) - c#

I need help understanding the fourth argument of GroupJoin. From what I understand so far, GroupJoin takes four arguments. (1, 2) The first is the secondary list, and the second is a Func that returns the key from the first object type, in other words from the first list (people in this case). (3, 4) The third is a Func that returns the key from the second object type, from the second list (items in this case), and the fourth is one that stores the grouped object with the group itself (I can't understand the code for this part). Considering this and having the code below:
var products = new Product[]
{
new Product { Id = 1, Type = "Phone", Model = "OnePlus", Price = 1000 },
new Product { Id = 2, Type = "Phone", Model = "Apple", Price = 2000 },
new Product { Id = 3, Type = "Phone", Model = "Samsung", Price = 1500 },
new Product { Id = 4, Type = "TV", Model = "Samsung 32", Price = 200 },
};
var people = new Person[]
{
new Person { Id = 1, Name = "Ivan Ivanov", Money = 150000 },
new Person { Id = 2, Name = "Dragan Draganov", Money = 250000 },
new Person { Id = 3, Name = "Ivelin Ivelinov", Money = 350000 }
};
var items = new Item[]
{
new Item { PersonId = 1, ProductId = 1, Amount = 1 },
new Item { PersonId = 1, ProductId = 4, Amount = 1 },
new Item { PersonId = 1, ProductId = 5, Amount = 1 },
new Item { PersonId = 1, ProductId = 7, Amount = 1 },
new Item { PersonId = 2, ProductId = 2, Amount = 1 },
};
Query:
var productOwnerList = people
.GroupJoin(
items,
o => o.Id,
i => i.PersonId,
(o, i) => new <--- (**)
{
Person = o,
Products = i
.Join(products,
o1 => o1.ProductId,
i2 => i2.Id,
(o1, i2) => i2) <--- (*)
.ToArray()
})
.ToArray();
Just to mention, I have posted only a few lines of the data. I need help understanding what the fourth argument of the Join method is doing here -> (*) (stores the grouped object with the group itself?). When I inspect the result, I see that it associates each person's ID with the product keys and joins the two lists based on the items list (one to many). But I cannot work out what exactly the line (o1, i2) => i2) means. It's obvious what it does (it puts all the items associated with the person's ID into an array (items[]) for every person), but what is happening "under the hood" here? Also, one question about the line marked (**): it creates a new object. Is this an anonymous class, and if not, what is it?

The fourth argument, which maps to the fifth parameter in the documentation (because the first parameter is the target of the extension method call), is just the result selector. It's a function accepting two parameters: the first is an element of the "outer" sequence (the people array in your case) and the second is a sequence of elements from the "inner" sequence (the items array in your case) which have the same key as the outer element. The function should return a "result" element, and the overall result of the method call is a sequence of those results.
The function is called once for each of the "outer" elements, so you'd have:
First call: person ID 1, and the items with product IDs 1, 4, 5, 7
Second call: person ID 2, and the item with product ID 2
Third call: person ID 3, and an empty sequence of items
Your query is complex because you're using an anonymous type for your result, and constructing an instance of the anonymous type using another query. Here's a simpler query that might help to clarify:
var productOwnerList = people
.GroupJoin(
items,
o => o.Id,
i => i.PersonId,
(person, personItems) => $"{person.Id}: {string.Join(",", personItems.Select(item => item.ProductId))}")
.ToArray();
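The nested Join in the original query follows the same pattern: its result selector is called once per matching (item, product) pair, and (o1, i2) => i2 simply keeps the product and discards the item. Here is a minimal, self-contained sketch of both selectors working together. It assumes C# 9 or later for top-level statements, and it uses tuples as hypothetical stand-ins for the Person, Item and Product classes, so the names below are illustrative rather than the asker's actual types.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in data: tuples replace the Person/Product/Item classes.
var people = new[] { (Id: 1, Name: "Ivan"), (Id: 2, Name: "Dragan"), (Id: 3, Name: "Ivelin") };
var products = new[] { (Id: 1, Model: "OnePlus"), (Id: 2, Model: "Apple"), (Id: 4, Model: "Samsung 32") };
var items = new[]
{
    (PersonId: 1, ProductId: 1),
    (PersonId: 1, ProductId: 4),
    (PersonId: 1, ProductId: 5), // no product has Id 5, so the inner Join drops this item
    (PersonId: 2, ProductId: 2),
};

var owners = people
    .GroupJoin(
        items,
        person => person.Id,
        item => item.PersonId,
        // Called once per person; personItems holds every item whose PersonId matches.
        (person, personItems) => new
        {
            person.Name,
            Models = personItems
                .Join(
                    products,
                    item => item.ProductId,
                    product => product.Id,
                    // Called once per matching (item, product) pair; keep only the model.
                    (item, product) => product.Model)
                .ToArray()
        })
    .ToArray();

foreach (var owner in owners)
{
    Console.WriteLine($"{owner.Name}: [{string.Join(", ", owner.Models)}]");
}
// Prints:
// Ivan: [OnePlus, Samsung 32]
// Dragan: [Apple]
// Ivelin: []
```

Note that the third person still appears in the output with an empty array, which is the difference between GroupJoin and a plain Join.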

Related

Intersect two object lists on a common property and then compare a different property

I have two lists
List<objA> List1
List<objA> List2
I want to compare these two lists on the ID field. Once a match is found, I want to compare another field, Distance, between the two lists and grab the object with the lower value.
Using LINQ is not giving me the result I want, at least for the first part of the problem.
var test = List1.Select(x => x.ID)
.Intersect(List2.Select(y => y.ID));
Here's one way you could do this with Linq. Firstly, join the two lists together with Union. Then, group them by the Id field. Lastly, order those sub lists by Distance within the grouping, and take the first one of each to get a list of objects by Id with the minimum available distance.
var aList = new[]
{
new SomeObject() { Id = 1, Distance = 3 },
new SomeObject() { Id = 2, Distance = 5 }
};
var bList = new[]
{
new SomeObject() { Id = 1, Distance = 2 },
new SomeObject() { Id = 2, Distance = 6 }
};
var results = aList
.Union(bList)
.GroupBy(a => a.Id, a => a)
.Select(a => a.OrderBy(b => b.Distance).First());
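If only IDs that occur in both lists should be returned (which is how I read "intersect" in the question), a Join may be closer to the requirement than Union plus GroupBy. The following is a sketch under two assumptions: IDs are unique within each list, and tuples are acceptable as hypothetical stand-ins for the objA class.

```csharp
using System;
using System.Linq;

// Hypothetical data: tuples stand in for the question's objA class.
var list1 = new[] { (Id: 1, Distance: 3), (Id: 2, Distance: 5), (Id: 7, Distance: 1) };
var list2 = new[] { (Id: 1, Distance: 2), (Id: 2, Distance: 6), (Id: 9, Distance: 4) };

var closest = list1
    .Join(
        list2,
        a => a.Id,
        b => b.Id,
        // Runs only for Ids present in both lists; on a tie the element from list1 wins.
        (a, b) => a.Distance <= b.Distance ? a : b)
    .ToList();

foreach (var item in closest)
{
    Console.WriteLine($"Id {item.Id}: Distance {item.Distance}");
}
// Prints:
// Id 1: Distance 2
// Id 2: Distance 5
```

Ids 7 and 9 are dropped because each appears in only one list.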

Entity Framework + LINQ Expression outer join error

I want to create a left outer join in a LINQ expression that queries data from the database via Entity Framework. This is the LINQ expression. Basically, what I am trying to do is look up the problem_vehicle_id from problemVehiclesTicket in the Problems table to see if it exists; if it doesn't exist, I want to return a problem object that is null/empty. I believe this is a left outer join.
var ticketsDetails = (from tickets in DbContext.tickets
join problemVehiclesTicket in DbContext.problem_vehicle on tickets.tickets_id equals problemVehiclesTicket.tickets_id
join problems in DbContext.problem on problemVehiclesTicket.problem_vehicle_id equals problems.problem_vehicle_id into problemGroup
from problems in problemGroup.DefaultIfEmpty(new problem { })
where (tickets.tickets_id == ticketsId)
select new TicketsDetails
{
Ticket = tickets,
ProblemVehicle = problemVehiclesTicket,
Problems = problems,
}).ToList();
Problem is a class that mirrors that of the Problem table in database
`Problem`
id (int), description (string), type (short)
The error I got is "The entity or complex type 'SPOTS_Repository.speeding_offence' cannot be constructed in a LINQ to Entities query." The source is Entity Framework.
Any help is greatly appreciated.
The type problem in your case is a mapped entity. Therefore, you cannot project onto it. You can use an anonymous type or another non-mapped class (DTO).
In your DefaultIfEmpty call you are constructing a new problem, which is a mapped entity, and this is not allowed.
Fix
You do not need to pass anything to the DefaultIfEmpty method. In fact, in your case you cannot, because the only thing you could pass is a problem, and that is mapped. Therefore, use .DefaultIfEmpty() without creating a new problem.
More Belabor
Here is an example, which will clarify the usage of DefaultIfEmpty:
Option 1: DefaultIfEmpty() with No Parameter
var list1 = new List<int> { 1, 2, 3, 6, 4 };
var list2 = new List<int> { 4, 1, 2 };
var selection =
from l1 in list1
join l2 in list2 on l1 equals l2 into joined
from j in joined.DefaultIfEmpty()
select j;
Output: 1, 2, 0, 0, 4
Why? Because 3 and 6 are not found and DefaultIfEmpty for an integer returns a 0.
Option 2: DefaultIfEmpty() with Parameter
In some cases we may want to indicate that if the item is not found in the join, what to return instead. We can do that by sending a single parameter to DefaultIfEmpty method like this:
var list1 = new List<int> { 1, 2, 3, 6, 4 };
var list2 = new List<int> { 4, 1, 2 };
var selection =
from l1 in list1
join l2 in list2 on l1 equals l2 into joined
from j in joined.DefaultIfEmpty(99) //<-- see this
select j;
Output: 1, 2, 99, 99, 4
Why? Because 3 and 6 are not found and we instructed DefaultIfEmpty to return a 99 in that case.
Please note that DefaultIfEmpty is a generic method. In my case it required an int because I am joining to the second list which is a List of int(s). In your case it is problem(s) but that is mapped. Therefore, you cannot construct it in your query.
Here is another example:
var depts = new List<Department>
{
new Department { Name = "Accounting" },
new Department { Name = "IT" },
new Department { Name = "Marketing" }
};
var persons = new List<Person>
{
new Person { DeptName = "Accounting", Name = "Bob" }
};
var selection2 =
from d in depts
join p in persons on d.Name equals p.DeptName into joined2
// See here DefaultIfEmpty can be passed a Person
from j2 in joined2.DefaultIfEmpty(new Person { DeptName = "Unknown", Name = "Alien" })
select j2;
foreach(var thisJ in selection2)
{
Console.WriteLine("Dept: {0}, Name: {1}", thisJ.DeptName, thisJ.Name);
}
Output:
Dept: Accounting, Name: Bob
Dept: Unknown, Name: Alien
Dept: Unknown, Name: Alien
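To tie this back to the original query, here is a sketch of a left outer join that calls DefaultIfEmpty() with no argument and handles the "no match" case in the projection instead. It runs against in-memory arrays of anonymous objects, which are hypothetical stand-ins for the tickets and problem tables, so it has not been checked against a real Entity Framework context. The null check uses a plain conditional expression rather than the ?. operator, because ?. is not allowed inside expression trees.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for the tickets and problem tables.
var tickets = new[] { new { TicketId = 1 }, new { TicketId = 2 } };
var problems = new[] { new { TicketId = 1, Description = "Flat tyre" } };

var details =
    (from ticket in tickets
     join p in problems on ticket.TicketId equals p.TicketId into problemGroup
     // No argument: a ticket without problems yields one null instead of a constructed object.
     from problem in problemGroup.DefaultIfEmpty()
     select new
     {
         ticket.TicketId,
         // Deal with the missing row here, in the projection, not in DefaultIfEmpty.
         Description = problem == null ? "(none)" : problem.Description
     }).ToList();

foreach (var d in details)
{
    Console.WriteLine($"Ticket {d.TicketId}: {d.Description}");
}
// Prints:
// Ticket 1: Flat tyre
// Ticket 2: (none)
```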
public class problem
{
public int Id;
public string Description;
public short Type;
}
.DefaultIfEmpty(
new problem()
{
Id = ticketsId,
Description = string.Empty,
});
Create a class and make use of it in the LINQ query.
Hope it helps you.

Select Distinct rows using Linq

The data is as follows:
ID  Title         Category  About           Link        CategoryID
1   The Matrix    Sci-Fi    Text goes here  http://...  1
2   The Simpsons  Cartoon   Text goes here  http://...  2
3   Avengers      Action    Text goes here  http://...  3
4   The Matrix    Sci-Fi    Text goes here  http://...  1
5   The One       Sci-Fi    Text goes here  http://...  1
6   The Hobbit    Sci-Fi    Text goes here  http://...  1
I have a checkbox list containing the categories. The problem is that if the user selects 'Action' and 'Sci-Fi' as the categories to display, The Matrix will be displayed twice.
This is my attempt at getting unique rows with a SQL query:
select distinct title, about, link from mytable
inner join tableCategories on categoryID = tableCategoriesID
group by title, about, link
Using LINQ:
(from table in movieTables
join x in categoryIDList
on table.categoryID equals x
select table).Distinct()
Note that the categories are in a separate table linked by the categoryID.
Need help displaying unique or distinct rows in LINQ.
You can happily select your result into a list of whatever you want:
var v = from entry in tables
where matching_logic_here
select new {id = some_id, val=some_value};
and then you can run Distinct on that list (a ToList() on the above will make it one), based on your needs.
The following should illustrate what I mean (just paste it into LINQPad; if you're using Visual Studio, remove the .Dump() calls):
void Main()
{
var input = new List<mock_entry> {
new mock_entry {id = 1, name="The Matrix", cat= "Sci-Fi"},
new mock_entry {id = 2, name="The Simpsons" ,cat= "Cartoon"},
new mock_entry {id = 3, name="Avengers" ,cat= "Action"},
new mock_entry {id = 4, name="The Matrix", cat= "Sci-Fi"},
new mock_entry {id = 5, name="The One" ,cat= "Sci-Fi"},
new mock_entry {id = 6, name="The Hobbit",cat= "Sci-Fi"},
};
var v = input.Where(e=>e.cat == "Action" || e.cat =="Sci-Fi")
.Dump()
.Select(e => new {n = e.name, c =e.cat})
.Dump()
;
var d = v.Distinct()
.Dump()
;
}
// Define other methods and classes here
public struct mock_entry {
public int id {get;set;}
public string name {get;set;}
public string cat {get;set;}
}
Another option would be to use DistinctBy from MoreLINQ, as suggested in this question.
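On .NET 6 or later there is also a built-in Enumerable.DistinctBy, so the extra library may not be needed. A minimal sketch, assuming that runtime and using tuples as a hypothetical stand-in for the mock_entry struct:

```csharp
using System;
using System.Linq;

// Hypothetical stand-in rows; tuples replace the mock_entry struct.
var movies = new[]
{
    (Id: 1, Title: "The Matrix", Category: "Sci-Fi"),
    (Id: 3, Title: "Avengers", Category: "Action"),
    (Id: 4, Title: "The Matrix", Category: "Sci-Fi"),
    (Id: 5, Title: "The One", Category: "Sci-Fi"),
};

// Enumerable.DistinctBy (.NET 6+) keeps the first element seen for each key.
var unique = movies.DistinctBy(m => m.Title).ToList();

foreach (var movie in unique)
{
    Console.WriteLine($"{movie.Id} {movie.Title} {movie.Category}");
}
// Prints:
// 1 The Matrix Sci-Fi
// 3 Avengers Action
// 5 The One Sci-Fi
```

Unlike Distinct on a projected anonymous type, this keeps the whole row, including the Id of the first occurrence.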
Edit:
Even simpler, you can use GroupBy, and just select the first entry (you'll lose the id though, but up to you).
Here's an example that will work with the above:
var v = input.GroupBy (i => i.name)
.Select(e => e.First ())
.Dump()
.Where(e=>e.cat == "Action" || e.cat =="Sci-Fi")
.Dump()
;
will yield:
1 The Matrix Sci-Fi
3 Avengers Action
5 The One Sci-Fi
6 The Hobbit Sci-Fi

Speed of linq query grouping and intersect in particular

Say 3 lists exist with over 500,000 records and we need to perform a set of operations (subsets shown below):
1) Check for repeating ids in lists one and two and retrieve the distinct ids, summing up "ValueA" for duplicate ids, and put the results in a list. Let's call this list list12.
2) Compare all the values with matching ids between list3 and list12 and print the results, say to the console.
3) Ensure optimal performance.
This is what I have so far:
var list1 = new List<abc>()
{
new abc() { Id = 0, ValueA = 50},
new abc() { Id = 1, ValueA = 40},
new abc() { Id = 1, ValueA = 70}
};
var list2 = new List<abc>()
{
new abc() { Id = 0, ValueA = 40},
new abc() { Id = 1, ValueA = 60},
new abc() { Id = 3, ValueA = 20},
};
var list3 = new List<abc>()
{
new abc() { Id = 0, ValueA = 50},
new abc() { Id = 1, ValueA = 40},
new abc() { Id = 4, ValueA = 70},
};
1) With the help of the solution from here [link][1], I was able to resolve part 1.
var list12 = list2.GroupBy(i => i.Id)
.Select(g => new
{
Id = g.Key,
NewValueA = g.Sum(j => j.ValueA),
});
2) I can't seem to get the complete result set for this part. I can get the matching account numbers (maybe someone knows of a faster way than hash sets), but I also need the ValueA from each list along with the matching account numbers.
foreach (var values in list3.ToHashSet().Select(i => i.ID).Intersect(list12.ToHashSet().Select(j => j.UniqueAccount)))
{
Console.WriteLine(values); //prints matching account number
//?? how do I get ValueA from both lists with this in the quickest way possible
}
3) My only attempt at improving performance, from reading online, is to use hash sets as seen in the attempt above, but I may be doing this incorrectly and someone may have a better solution.
I don't think that any conversion to HashSet, however efficient, will increase performance. The reason is that the lists must be enumerated to create the HashSets and then the HashSets must be enumerated to get to the results.
If you put everything in one LINQ statement the number of enumerations will be minimized. And by calculating the sums at the end the number of calculations is reduced to the absolute minimum:
list1.Concat(list2)
.Join(list3, x => x.Id, l3 => l3.Id, (l12,l3) => l12)
.GroupBy (x => x.Id)
.Select(g => new
{
Id = g.Key,
NewValueA = g.Sum(j => j.ValueA),
})
With your data this shows:
Id NewValueA
0 90
1 170
I don't know if I understood all requirements well, but this should give you the general idea.
If you want to get access to both elements you probably want a join. A join is a very general construct that can be used to construct all other set operations.
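As a sketch of that idea, the result selector of Join receives both matching elements, so it can return the summed value from list12 alongside the ValueA from list3. The tuples below are hypothetical stand-ins for the abc class, and the names Summed and FromList3 are made up for illustration.

```csharp
using System;
using System.Linq;

// Same numbers as the question; tuples stand in for the abc class.
var list1 = new[] { (Id: 0, ValueA: 50), (Id: 1, ValueA: 40), (Id: 1, ValueA: 70) };
var list2 = new[] { (Id: 0, ValueA: 40), (Id: 1, ValueA: 60), (Id: 3, ValueA: 20) };
var list3 = new[] { (Id: 0, ValueA: 50), (Id: 1, ValueA: 40), (Id: 4, ValueA: 70) };

// Step 1: one summed row per Id across list1 and list2.
var list12 = list1.Concat(list2)
    .GroupBy(x => x.Id)
    .Select(g => (Id: g.Key, NewValueA: g.Sum(x => x.ValueA)));

// Step 2: the result selector sees both sides of each match.
var matches = list12
    .Join(
        list3,
        l12 => l12.Id,
        l3 => l3.Id,
        (l12, l3) => (Id: l12.Id, Summed: l12.NewValueA, FromList3: l3.ValueA))
    .ToList();

foreach (var m in matches)
{
    Console.WriteLine($"Id {m.Id}: list12 = {m.Summed}, list3 = {m.FromList3}");
}
// Prints:
// Id 0: list12 = 90, list3 = 50
// Id 1: list12 = 170, list3 = 40
```

Join builds a hash-based lookup over the inner sequence internally, so there is no need to convert the lists to HashSets first.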

Linq query to fetch DB records corresponding to List of int[2]

I am trying to count the VALUEs corresponding to a List<int[]> using Linq to Entity Framework.
1. I have a List<int[]> where each int[] in the List is of length 2.
2. I have a DB table VALUES which contains 3 columns, called ID, PARENT and VALUE, where each int[] in the List (see 1) may correspond to a record in the table, index 0 being the ID and index 1 being the PARENT. But some arrays likely do not correspond to any existing records in the table.
3. Each combination of ID and PARENT corresponds to multiple DB records, with different VALUEs.
Several points that are important to note:
One of the problems is that I can't rely on ID alone - each value is defined/located according to both the ID and PARENT.
None of the int arrays repeat, though the value in each index may appear in several arrays, e.g.
List<int[]> myList = new List<int[]>();
myList.Add(new int[]{2, 1});
myList.Add(new int[]{3, 1}); //Notice - same "PARENT"
myList.Add(new int[]{4, 1}); //Notice - same "PARENT"
myList.Add(new int[]{3, 1}); //!!!! Cannot exist - already in the List
I can't seem to figure out how to request all of the VALUEs from the VALUES table that correspond to the ID, PARENT pairs in the List<int[]>.
I've tried several variations but keep hitting the pitfall of attempting to compare an array in a LINQ statement... I can't seem to crack it without loading substantially more information than I actually need.
Probably the closest I've gotten is with the following line:
var myList = new List<int[]>();
// ... fill the list ...
var res = myContext.VALUES.Where(v => myList.Any(listItem => listItem[0] == v.ID && listItem[1] == v.PARENT));
Of course, this can't work, because "The LINQ expression node type 'ArrayIndex' is not supported in LINQ to Entities."
@chris huber
I tried it out but it was unsuccessful.
2 things:
Where you created "myValues" I have a DB table entity, not a List.
Due to point number 1, I am using LINQ to Entities, as opposed to LINQ to Objects.
My code then comes to something like this:
var q2 = from value in myContext.VALUES where myList.Select(x => new { ID = x.ElementAt(0), Parent = x.ElementAt(1) }).Contains(new { ID = value.ID, Parent = value.PARENT }) select value;
This returns the following error message when run:
LINQ to Entities does not recognize the method 'Int32 ElementAt[Int32](System.Collections.Generic.IEnumerable` 1[System.Int32],Int32)' method, and this method cannot be translated into a store expression.
@Ovidiu
I attempted your solution as well, but hit the same problem as above:
As I am using LINQ to Entities, there are simply certain things that cannot be performed; in this case, the ToString() method is "not recognized". Removing the ToString() call and attempting to simply use Int32 + "|" + Int32 gives me a whole other error about LINQ to Entities not being able to cast an Int32 to Object.
Use the following LINQ expression:
List<int[]> myList = new List<int[]>();
myList.Add(new int[] { 2, 1 });
myList.Add(new int[] { 3, 1 }); //Notice - same "PARENT"
myList.Add(new int[] { 4, 1 }); //Notice - same "PARENT"
myList.Add(new int[] { 3, 1 });
List<int[]> myValues = new List<int[]>();
myValues.Add(new int[] { 2, 1 , 1});
myValues.Add(new int[] { 3, 1 , 2}); //Notice - same "PARENT"
myValues.Add(new int[] { 4, 1 , 3}); //Notice - same "PARENT"
myValues.Add(new int[] { 3, 1, 4 });
myValues.Add(new int[] { 3, 2, 4 });
var q2 = from value in myValues where myList.Select(x => new { ID = x.ElementAt(0), Parent = x.ElementAt(1) }).Contains(new { ID = value.ElementAt(0), Parent = value.ElementAt(1) }) select value;
var list = q2.ToList();
You could use this workaround: create a new List<string> from your List<int[]> and compare the values in your table with the new list.
I didn't test this with EF, but it might work
List<string> strList = myList.Select(x => x[0].ToString() + "|" + x[1].ToString()).ToList();
var res = myContext.VALUES.Where(x => strList.Contains(SqlFunctions.StringConvert((double)x.ID).Trim() + "|" + SqlFunctions.StringConvert((double)x.PARENT).Trim()));
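Another possible workaround, offered as an untested sketch rather than a confirmed fix: filter coarsely on two plain lists of ints (Contains on a list of primitives is the kind of call Entity Framework can normally translate to SQL IN), then switch to LINQ to Objects and match the exact pairs in memory. The values array below is a hypothetical stand-in for myContext.VALUES, so this only demonstrates the logic, not the SQL translation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var myList = new List<int[]> { new[] { 2, 1 }, new[] { 3, 2 }, new[] { 4, 1 } };

// Hypothetical stand-in for myContext.VALUES; with EF this would be the DbSet.
var values = new[]
{
    (ID: 2, PARENT: 1, VALUE: 10), // exact pair match
    (ID: 3, PARENT: 2, VALUE: 20), // exact pair match
    (ID: 3, PARENT: 1, VALUE: 30), // passes the coarse filter, but (3, 1) is not a listed pair
    (ID: 5, PARENT: 1, VALUE: 40), // rejected by the coarse filter
};

// Plain int lists for the coarse filter, plus a set of pairs for the exact check.
var ids = myList.Select(p => p[0]).Distinct().ToList();
var parents = myList.Select(p => p[1]).Distinct().ToList();
var pairs = new HashSet<(int, int)>(myList.Select(p => (p[0], p[1])));

var result = values
    .Where(v => ids.Contains(v.ID) && parents.Contains(v.PARENT)) // would run in the database with EF
    .AsEnumerable()                                               // everything after this runs in memory
    .Where(v => pairs.Contains((v.ID, v.PARENT)))                 // exact (ID, PARENT) match
    .ToList();

foreach (var row in result)
{
    Console.WriteLine($"{row.ID}, {row.PARENT}: {row.VALUE}");
}
// Prints:
// 2, 1: 10
// 3, 2: 20
```

The trade-off is that the coarse filter can pull back some rows that are later discarded, so it helps most when the ID and PARENT lists are selective.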
