nested linq queries, how to get distinct values?


The table has two columns, "category" and "subcategory".
I want to get a collection of "category", [subcategories] pairs.
With the code below I get duplicates. Putting .Distinct() after the outer "from" does not help much. What am I missing?
var rootcategories = (from p in sr.products
                      orderby p.category
                      select new
                      {
                          category = p.category,
                          subcategories = (
                              from p2 in sr.products
                              where p2.category == p.category
                              select p2.subcategory).Distinct()
                      }).Distinct();
sr.products looks like this:
category   subcategory
----------------------
cat1       subcat1
cat1       subcat2
cat2       subcat3
cat2       subcat3
What I get in the results is:
cat1, [subcat1,subcat2]
cat1, [subcat1,subcat2]
but I only want one entry per category.
I solved my problem with this code:
var rootcategories2 = (from p in sr.products
                       group p.subcategory by p.category into subcats
                       select subcats);
Now maybe it is time to think about what the right question was. (-:
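The grouping approach can be tried out against the sample rows from the question. This is only a sketch in LINQ to Objects: the tuple array is a stand-in for sr.products, and the names raw and subcategories are made up for illustration. It also shows that grouping alone keeps repeated subcategories inside a group, so a Distinct() per group may still be wanted.

```csharp
using System;
using System.Linq;

// Stand-in for sr.products (an assumption): the four rows from the question.
var products = new[]
{
    (category: "cat1", subcategory: "subcat1"),
    (category: "cat1", subcategory: "subcat2"),
    (category: "cat2", subcategory: "subcat3"),
    (category: "cat2", subcategory: "subcat3"),
};

// One group per category; the group elements are the subcategories.
var rootCategories2 = (from p in products
                       group p.subcategory by p.category into subcats
                       select new
                       {
                           category = subcats.Key,
                           raw = subcats.ToList(),                      // group as-is
                           subcategories = subcats.Distinct().ToList()  // duplicates removed
                       }).ToList();

foreach (var rc in rootCategories2)
    Console.WriteLine($"{rc.category}: raw [{string.Join(",", rc.raw)}], distinct [{string.Join(",", rc.subcategories)}]");
```

Each category appears once, but the cat2 group still contains subcat3 twice until Distinct() is applied to it.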

Solved with this code:
var rootcategories2 = (from p in sr.products
                       group p.subcategory by p.category into subcats
                       select subcats);
Thanks, everyone.

I think you need two Distinct() calls, one for the main categories and another for the subcategories.
This should work for you:
var mainCategories = (from p in products select p.category).Distinct();
var rootCategories =
    from c in mainCategories
    select new {
        category = c,
        subcategories = (from p in products
                         where p.category == c
                         select p.subcategory).Distinct()
    };
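A minimal runnable sketch of the two-Distinct() idea, assuming LINQ to Objects and using a tuple array as a stand-in for the products source (the sample rows are the ones from the question):

```csharp
using System;
using System.Linq;

// Stand-in for the products source (an assumption).
var products = new[]
{
    (category: "cat1", subcategory: "subcat1"),
    (category: "cat1", subcategory: "subcat2"),
    (category: "cat2", subcategory: "subcat3"),
    (category: "cat2", subcategory: "subcat3"),
};

// First Distinct(): one entry per category.
var mainCategories = products.Select(p => p.category).Distinct();

// Second Distinct(): one entry per subcategory within each category.
var rootCategories = mainCategories
    .Select(c => new
    {
        category = c,
        subcategories = products
            .Where(p => p.category == c)
            .Select(p => p.subcategory)
            .Distinct()
            .ToList()
    })
    .ToList();

foreach (var rc in rootCategories)
    Console.WriteLine($"{rc.category}, [{string.Join(",", rc.subcategories)}]");
```

Because the outer sequence is already distinct, no Distinct() is needed on the final result.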

The algorithm behind Distinct() needs a way to tell whether two objects in the source IEnumerable are equal.
By default it uses the type's own Equals and GetHashCode. Anonymous types compare property by property, but the subcategories property here is a sequence, and sequences are compared by reference, so it is likely that no two of the objects you create with the "new" keyword are "equal".
One option is to write a custom class that implements IEqualityComparer<T> and pass an instance of it to the Distinct() call.
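A small sketch, assuming LINQ to Objects, of how Distinct() treats anonymous objects. The sample values are made up; the point is only the difference between a plain-value property and a sequence-typed one.

```csharp
using System;
using System.Linq;

// Anonymous objects whose properties are plain values compare by value.
var flat = new[]
{
    new { category = "cat1", subcategory = "subcat1" },
    new { category = "cat1", subcategory = "subcat1" },
};
int flatCount = flat.Distinct().Count();

// With a sequence-typed property, each object holds a different array instance.
// Arrays compare by reference, so the two objects are not considered equal.
var nested = new[]
{
    new { category = "cat1", subcategories = new[] { "subcat1", "subcat2" } },
    new { category = "cat1", subcategories = new[] { "subcat1", "subcat2" } },
};
int nestedCount = nested.Distinct().Count();

Console.WriteLine($"flat: {flatCount}, nested: {nestedCount}");
```

The first count is 1 and the second is 2, which matches the duplicates seen in the question.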

Your main query is on Products, so you're going to get records for each product. Switch it around so you're querying on Category, but filtering on Product.Category

Related

Understanding DefaultIfEmpty in LINQ

I don't understand how the DefaultIfEmpty method works. It is usually used to emulate a left outer join in LINQ.
DefaultIfEmpty() must be called on a collection.
DefaultIfEmpty() cannot be called on a null collection reference.
There are some points I don't understand in the code example below:
Does p, which comes after the into keyword, refer to products?
Is ps the group of product objects? I mean, a sequence of sequences.
If DefaultIfEmpty() isn't used, doesn't p, in from p in ps.DefaultIfEmpty(), reach the select? Why?
#region left-outer-join
string[] categories = {
    "Beverages",
    "Condiments",
    "Vegetables",
    "Dairy Products",
    "Seafood"
};
List<Product> products = GetProductList();
var q = from c in categories
        join p in products on c equals p.Category into ps
        from p in ps.DefaultIfEmpty()
        select (Category: c, ProductName: p == null ? "(No products)" : p.ProductName);
foreach (var v in q)
{
    Console.WriteLine($"{v.ProductName}: {v.Category}");
}
#endregion
Code from 101 Examples of LINQ.

I don't usually answer my own questions; however, I think some people might find this one somewhat intricate.
As a first step, the working logic of the DefaultIfEmpty method group should be understood (LINQ doesn't support its overloaded versions, by the way).
class foo
{
    public string Test { get; set; }
}
// list1
var l1 = new List<foo>();
//l1.Add(null); --> try the code with this line uncommented too
// list2
var l2 = l1.DefaultIfEmpty();
foreach (var x in l1)
    Console.WriteLine((x == null ? "null" : "not null") + " entered l1");
foreach (var x in l2)
    Console.WriteLine((x == null ? "null" : "not null") + " entered l2");
When run, it prints "null entered l2".
What if l1.Add(null); is uncommented? That is left to you; it is not hard to guess.
l2 has one item, which is null, because foo is a reference type and the default value of a reference type is null. For a value type such as Int32 or Char, the type's default value is supplied instead (0 for Int32, '\0' for Char). String is also a reference type, so its default is null, not a blank string.
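To see which default is supplied for different element types, here is a short sketch. The list names are made up for illustration; DefaultIfEmpty(), Single() and ToList() are the standard System.Linq methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var noInts = new List<int>();
var noStrings = new List<string>();

// An empty source yields exactly one element: default(T).
var intDefault = noInts.DefaultIfEmpty().Single();        // default(int)
var stringDefault = noStrings.DefaultIfEmpty().Single();  // default(string)

// A non-empty source passes through unchanged.
var oneInt = new List<int> { 7 }.DefaultIfEmpty().ToList();

Console.WriteLine(intDefault);             // 0
Console.WriteLine(stringDefault == null);  // True
Console.WriteLine(oneInt.Count);           // 1
```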
Now let's examine the LINQ statement in question.
As a reminder, unless an aggregate operator or a To{collection}() method is applied to a LINQ expression, evaluation is lazy (deferred).
The following image, although not specific to C#, helps show what that means.
Given lazy evaluation, the LINQ query expression is evaluated only when its results are requested, that is, on demand.
So, ps contains product items if and only if the equality expressed after the on keyword of the join is satisfied. Further, ps holds different product items each time the LINQ expression is asked for its next category. If a category has no products and DefaultIfEmpty() is not used, select is not reached for that category, so nothing is iterated and no Console.WriteLine($"{productName}: {category}"); output is produced for it. (Please correct me at this point if I'm wrong.)

Answers
Does p refer to products after into keyword?
The p in the from clause is a new local variable referring to a single product of one category.
Is ps the group of product objects? I mean a sequence of sequences.
Yes, ps is the group of products for the category c. But it is not a sequence of sequences, just a simple IEnumerable<Product>, just like c is a single category, not all categories in the group join.
In the query you only see data for one result row, never the whole group join result. Look at the final select, it prints one category and one product it joined with. That product comes from the ps group of product that one category joined with.
The query then does the walking over all categories and all their groups of products.
If DefaultIfEmpty() isn't used, doesn't p, from p in ps.DefaultIfEmpty(), run into select? Why?
It is not equal to a Select, because the from clause creates a new join with itself, which turns into SelectMany.
Structure
Taking the query by parts, first the group join:
from c in categories
join p in products on c equals p.Category into ps
After this only c and ps are usable, representing a category and its joined products.
Now note that the whole query is in the same form as:
from car in Cars
from passenger in car.Passengers
select (car, passenger)
Which joins Cars with its own Passengers using Cars.SelectMany(car => car.Passengers, (car, passenger) => (car, passenger));
So in your query
from group_join_result into ps
from p in ps.DefaultIfEmpty()
creates a new join, using SelectMany, of the previous group join result with its own data (the lists of grouped products), each run through DefaultIfEmpty.
Conclusion
In the end the complexity is in the LINQ query and not in the DefaultIfEmpty method. The method is explained simply on the MSDN page I posted in a comment. It turns a collection with no elements into a collection that has one element, which is either the default() value or the supplied value.
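The effect on the join can be checked with a reduced version of the query. This is a sketch under assumptions: tuples stand in for the Product class, only two categories are used, and the overload that takes a supplied value replaces the null check because a tuple is a value type.

```csharp
using System;
using System.Linq;

string[] categories = { "Beverages", "Seafood" };
// Stand-in for GetProductList() (an assumption): only one product, in Beverages.
var products = new[] { (Category: "Beverages", ProductName: "Chai") };

// Without DefaultIfEmpty: the Seafood group is empty, so the second from
// clause yields nothing for it and the category drops out (inner join).
var inner = (from c in categories
             join p in products on c equals p.Category into ps
             from p in ps
             select (Category: c, p.ProductName)).ToList();

// With DefaultIfEmpty(value): an empty group becomes a one-element group,
// so every category produces at least one row (left outer join).
var outer = (from c in categories
             join p in products on c equals p.Category into ps
             from p in ps.DefaultIfEmpty((Category: c, ProductName: "(No products)"))
             select (Category: c, p.ProductName)).ToList();

Console.WriteLine($"inner rows: {inner.Count}, outer rows: {outer.Count}");
```

The inner version returns one row and the outer version two, the second being Seafood paired with the placeholder name.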
Compiled source
This is approximately the C# code the query gets compiled to:
//Pairs of: (category, the products that joined with the category)
IEnumerable<(string category, IEnumerable<Product> groupedProducts)> groupJoinData = Enumerable.GroupJoin(
    categories,
    products,
    (string c) => c,
    (Product p) => p.Category,
    (string c, IEnumerable<Product> ps) => (c, ps)
);
//Flattening of the pair collection, calling DefaultIfEmpty on each joined group of products
IEnumerable<(string Category, string ProductName)> q = groupJoinData.SelectMany(
    catProdsPair => catProdsPair.groupedProducts.DefaultIfEmpty(),
    (catProdsPair, p) => (catProdsPair.category, (p == null) ? "(No products)" : p.ProductName)
);
Done with the help of ILSpy using C# 8.0 view.

Linq to SQL, contains in where clause repeated values

I have a simple table of items, called "ITEMS":
id  description
--  -----------
1   Something
2   Another thing
3   Best thing
I have a list of Int32 which are item IDs I'd like to show:
List<Int32> currentItemsCodes = new List<int>();
For this example currentItemsCodes contains 1,2,2,3
Currently I have this Linq-to-SQL:
var itemDescriptions = (from a in db.ITEMS
                        where currentItemsCodes.Contains(a.id)
                        select new {a.id, a.description});
What this returns is:
1,Something
2,Another thing
3,Best thing
I need to return two "Another things":
1,Something
2,Another thing
2,Another thing
3,Best thing
Or if currentItemsCodes was 3,3,3,3 I would need 4 x "Best thing" returned

You should do an inner join in LINQ to get what you are looking for. Use the query below to do that.
var itemDescriptions = (from a in db.ITEMS
                        join c in currentItemsCodes
                        on a.id equals c
                        select new {a.id, a.description});

You can use a join clause for that:
var itemDescriptions = (from item in db.ITEMS
                        join i in currentItemsCodes on item.id equals i
                        select new
                        {
                            id = item.id,
                            description = item.description
                        }).ToList();
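Here is a sketch of the join idea in LINQ to Objects, with an in-memory tuple array standing in for db.ITEMS (an assumption). LINQ to SQL may refuse to translate a join against a local list, so in practice the matching rows might first have to be loaded with the Contains query and ToList(), and then joined in memory like this.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-in for the ITEMS table (an assumption).
var items = new[]
{
    (id: 1, description: "Something"),
    (id: 2, description: "Another thing"),
    (id: 3, description: "Best thing"),
};
var currentItemsCodes = new List<int> { 1, 2, 2, 3 };

// Drive the join from the code list so that every code,
// repeats included, produces its own result row.
var itemDescriptions = (from c in currentItemsCodes
                        join a in items on c equals a.id
                        select (a.id, a.description)).ToList();

foreach (var row in itemDescriptions)
    Console.WriteLine($"{row.id},{row.description}");
```

With the codes 1,2,2,3 this yields four rows, and "Another thing" appears twice.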

Something like this?
var x = db.items;
var itemDescriptions = (from a in currentItemsCodes
                        select new {a, x[a].description});
As Kris notes in the comments, replace x[a] with a method that looks up the item by id.

How do I select an object by a sub-property

I've got a List of objects, let's call them Product, each of which contains a bunch of properties and also a List of Version (which are also objects).
Version also has a bunch of properties and contains a List of Customer (which again are objects).
Customer again has properties, one of which is its ID (a Guid).
What I am trying to do is build a List of Product, selected by a certain ID of its Product.VersionList.Version.ID.
I would prefer a join query, but any efficient way is welcome. This is what I have tried so far, but because I have only a single ID to compare with, I don't know how to construct the join.
lp = List<Entity.Product>;
g = GetGuid();
var query = from product in Entity.ProductCollection
join g in g
on product.Version.Where(x => x.id == g)
select product;
lp.AddRange(query);

I'm guessing you mean:
var query = from product in Entity.ProductCollection
            where product.Version.Any(x => x.id == g)
            select product;
i.e. select all the products that have a version where the id matches the guid you were thinking of.
Note that joining to the versions would cause product duplication if any product has multiple matching versions.
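The duplication point can be shown with a small sketch. The tuple shape (a product name plus the ids of its versions) and the fixed Guid values are hypothetical stand-ins for the Product and Version classes in the question.

```csharp
using System;
using System.Linq;

var g = new Guid("11111111-1111-1111-1111-111111111111");
var other = new Guid("22222222-2222-2222-2222-222222222222");

// Hypothetical stand-ins: a product name plus the ids of its versions.
var products = new[]
{
    (Name: "A", VersionIds: new[] { g, g }),   // two matching versions
    (Name: "B", VersionIds: new[] { other }),  // no matching version
};

// Any: each product is returned at most once.
var withAny = products
    .Where(p => p.VersionIds.Any(id => id == g))
    .Select(p => p.Name)
    .ToList();

// Flattening over the versions: one row per matching version.
var withJoin = (from p in products
                from id in p.VersionIds
                where id == g
                select p.Name).ToList();

Console.WriteLine($"Any: {withAny.Count}, flattened: {withJoin.Count}");
```

Product A is returned once by the Any query and twice by the flattened one.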

Try this. You may want to dig into it further.
var query = from Product product in pc
            from version in product.Version
            let v = version as Entity.Version
            where v.id == g
            select product;

var query = Entity.ProductCollection.Where(p => p.Version.Any(v => v.Id == g));
You can use Any rather than having to do a self join.

Using LINQ to select desired results between two related IEnumerable query objects

I think this is kind of a basic question but I'm getting confused. I have two objects, Orders and OrderTags. In the database, Orders has no relation to OrderTags, but OrderTags has a FK relation to Orders.
So I capture both objects in my context like so:
orders = context.Orders;
tags = context.OrderTags.Where(tag => tag.ID == myID);
Now I want to reduce the orders list to only be equal to the orders that exist in my tags list. Here is my best pseudocode of what I want to do:
orders = orders.Where(every order id exists somewhere in the tags list of order ids)
For clarification, each Tag object has a TagID and an OrderID. So I only want the orders that correspond to the tags I have looked up. Can anyone assist me with the syntax so I can get what I'm looking for?

Using a LINQ query:
var results = (from o in context.Orders
               join t in context.OrderTags on o.OrderId equals t.OrderId
               where t.ID == myID
               select o).ToList();

Using LINQ query:
orders = orders.Where(order => tags.Any(tag => tag.OrderID == order.OrderID)).ToList();

Using List<T> methods with lambda expressions (this assumes both orders and tags are List<T>):
orders.RemoveAll(x => !tags.ConvertAll(y => y.OrderID).Contains(x.ID));

Something like this should work.
orders = orders.Where(o => tags.Any(t => o.ID == t.OrderID));
You could also just perform a join.
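A minimal in-memory sketch of the filter, with tuples standing in for the Order and OrderTag entities. The property names ID, TagID and OrderID follow the question's description, but the exact shapes are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-ins for context.Orders and the already-filtered tags (assumptions).
var orders = new[] { (ID: 1, Name: "first"), (ID: 2, Name: "second"), (ID: 3, Name: "third") };
var tags = new[] { (TagID: 10, OrderID: 1), (TagID: 10, OrderID: 3), (TagID: 10, OrderID: 3) };

// Any takes a predicate; Contains only accepts a value to look for.
var viaAny = orders.Where(o => tags.Any(t => t.OrderID == o.ID)).ToList();

// Same result: project the tags to their order ids first, then use Contains.
var orderIds = new HashSet<int>(tags.Select(t => t.OrderID));
var viaContains = orders.Where(o => orderIds.Contains(o.ID)).ToList();

foreach (var o in viaAny)
    Console.WriteLine($"{o.ID}: {o.Name}");
```

Unlike a plain join, both forms return each order once even when several tags point at it.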

Set operation: how to filter a collection based on a COMBINED set of ints

I have this expression:
var result = from pav in ProductAttributes
             join id in valueIds
             on pav.AttributeValueID equals id
             select pav.ProductImageID;
which works up to a point. The issue is that the collection ProductAttributes contains the same product many times, once for each attribute. Its structure is:
ID - unique
ProductID
ProductAttributeValueID
ProductImageID
So a Product may appear many times in the collection. I want the result to filter OUT every product that is missing any of the values in valueIds (which is a list of ProductAttributeValueIDs).
In other words, I want to return ONLY products that have ALL of the COMBINED valueIds, not just ANY of them, which is what the above LINQ expression is doing.
PS: I can post SQL code that shows what I mean if that helps!
@devgeezer posted an answer which was close enough, but it only worked for one value.
I ended up with the code below, which works. I group on the ProductID, then use that in a second query to filter the original collection:
var result =
    from pav in ProductAttributeValues
    join id in valueIds
    on pav.AttributeValueID equals id
    group pav by pav.ProductID into gj
    where gj.Count() == valueIds.Count()
    select gj.Key;
var imageIds = from pav in ProductAttributeValues
               join id in result
               on pav.ProductID equals id
               select pav.ProductImageID;

You might try a group and filter approach something like the following:
var result =
    from pav in ProductAttributes
    join id in valueIds
    on pav.AttributeValueID equals id
    group pav by pav.ProductImageID into gj
    where gj.Count() == valueIds.Count()
    select gj.Key;
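The group-and-count filter can be sketched against a few in-memory rows. The tuple array is a made-up stand-in for the attribute collection, grouped by ProductID as in the asker's final version. The count comparison assumes that valueIds contains no repeats and that each (product, value) pair occurs only once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up sample rows (an assumption): one row per product/attribute value pair.
var productAttributes = new[]
{
    (ProductID: 1, AttributeValueID: 10),
    (ProductID: 1, AttributeValueID: 20),
    (ProductID: 2, AttributeValueID: 10),
    (ProductID: 3, AttributeValueID: 20),
    (ProductID: 3, AttributeValueID: 30),
};
var valueIds = new List<int> { 10, 20 };

// Keep only the rows whose value is wanted, group them per product, and
// keep the products whose number of matches equals the number of wanted values.
var matchingProducts = (from pav in productAttributes
                        join id in valueIds on pav.AttributeValueID equals id
                        group pav by pav.ProductID into gj
                        where gj.Count() == valueIds.Count
                        select gj.Key).ToList();

Console.WriteLine(string.Join(",", matchingProducts));
```

Only product 1 has both values 10 and 20, so it is the only key returned. If duplicates are possible, comparing the count of distinct matched values instead would be the safer check.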
