Extensible relational division in LINQ

In this example class IcdPatient represents a many-to-many relationship between a Patient table (not shown in this example) and a lookup table Icd.
public class IcdPatient
{
public int PatientId { get; set; }
public int ConditionCode { get; set; }
public static List<IcdPatient> GetIcdPatientList()
{
return new List<IcdPatient>()
{
new IcdPatient { PatientId = 100, ConditionCode = 111 },
new IcdPatient { PatientId = 100, ConditionCode = 222 },
new IcdPatient { PatientId = 200, ConditionCode = 111 },
new IcdPatient { PatientId = 200, ConditionCode = 222 },
new IcdPatient { PatientId = 3, ConditionCode = 222 },
};
}
}
public class Icd
{
public int ConditionCode { get; set; }
public string ConditionName { get; set; }
public static List<Icd> GetIcdList()
{
return new List<Icd>()
{
new Icd() { ConditionCode =111, ConditionName ="Condition 1"},
new Icd() { ConditionCode =222, ConditionName ="Condition 2"},
};
}
}
I would like for the user to be able to enter as many conditions as they want, and get a LINQ object back that tells them how many PatientIds satisfy that query. I've come up with:
List<string> stringFilteredList = new List<string> { "Condition 1", "Condition 2" };
List<int> filteringList = new List<int> { 111,222 };
var manyToMany = IcdPatient.GetIcdPatientList();
var icdList = Icd.GetIcdList();
/*Working method without joining on the lookup table*/
var grouped = from m in manyToMany
group m by m.PatientId into g
where g.Count() == filteringList.Distinct().Count()
select new
{
PatientId = g.Key,
Count = g.Count()
};
/*End*/
foreach (var item in grouped)
{
Console.WriteLine(item.PatientId);
}
Let's say that IcdPatient has a composite primary key on both fields, so we know that each row is unique. If we count the distinct entries in filteringList and compare that to the number of times a PatientId shows up, a match means we've found a patient who has all of the conditions. Because the codes can be esoteric, I would like to let the user type in the ConditionName from type Icd and perform the same operation. I've not used LINQ this way a lot, and this is what I've put together so far:
List<int> filteringList = new List<int> { 111,222 };
List<string> stringFilteredList= new List<string>{"Condition 1","Condition 2" };
filteringList.Distinct();
var manyToMany = IcdPatient.GetIcdPatientList();
var icdList = Icd.GetIcdList();
/*Working method without joining on the lookup table*/
var grouped = from m in manyToMany
join i in icdList on
m.ConditionCode equals i.ConditionCode
//group m by m.PatientId into g
group new {m,i} by new { m.ConditionCode }into g
where g.Count() == filteringList.Distinct().Count()
select new
{
Condition = g.Key.ConditionCode
};
/*End*/
but can't get anything to work. This is essentially a join on top of my first query, but I'm not getting what I need to group on.

You don't need to group anything in this case, just use a join and a contains:
List<string> stringFilteredList= new List<string>{"Condition 1","Condition 2" };
var patients =
from icd in Icd.GetIcdList()
join patient in IcdPatient.GetIcdPatientList() on icd.ConditionCode equals patient.ConditionCode
where stringFilteredList.Contains(icd.ConditionName)
select patient.PatientId;
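Note that a join plus Contains on its own yields one PatientId per matching row, so it finds patients with any of the named conditions. A minimal sketch of tightening that to "all of the named conditions" is to group the joined rows per patient and compare counts. The tuple arrays below are stand-ins for the IcdPatient and Icd lists from the question, and the helper name PatientsWithAll is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DivisionByName
{
    // Stand-ins for the IcdPatient and Icd rows from the question.
    public static readonly (int PatientId, int ConditionCode)[] IcdPatients =
    {
        (100, 111), (100, 222), (200, 111), (200, 222), (3, 222),
    };

    public static readonly (int ConditionCode, string ConditionName)[] Icds =
    {
        (111, "Condition 1"), (222, "Condition 2"),
    };

    // Hypothetical helper: ids of the patients that have every named condition.
    public static List<int> PatientsWithAll(IEnumerable<string> conditionNames)
    {
        var wanted = conditionNames.Distinct().ToList();

        return (from icd in Icds
                join row in IcdPatients on icd.ConditionCode equals row.ConditionCode
                where wanted.Contains(icd.ConditionName)
                group icd.ConditionName by row.PatientId into g
                // A patient qualifies only when every requested name was matched.
                where g.Distinct().Count() == wanted.Count
                orderby g.Key
                select g.Key).ToList();
    }

    public static void Main()
    {
        var ids = PatientsWithAll(new[] { "Condition 1", "Condition 2" });
        Console.WriteLine(string.Join(", ", ids)); // 100, 200
    }
}
```

Patient 3 drops out because only one of the two requested names matches its rows. A requested name that does not exist in the lookup table makes the count comparison fail for everyone, which seems like the sensible reading of "has all conditions".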

Let's say that IcdPatient has a composite primary key on both fields, so we know that each row is unique. If we find the distinct number of entries in filteringList and do a count on the number of times a PatientId shows up, that means we've found all the people who have all conditions. Because the codes can be esoteric, I would like to do something like let the user type in the ConditionName in type Icd and perform the same operation.
I believe you're asking:
Given a list of ConditionCodes, return a list of PatientIds where every patient has every condition in the list.
In that case, the easiest thing to do is group your IcdPatients table by Id, so that we can tell every condition that a patient has by looking once. Then we check that every ConditionCode we're looking for is in the group. In code, that looks like:
var result = IcdPatient.GetIcdPatientList()
    // group up all the objects with the same PatientId
    .GroupBy(patient => patient.PatientId)
    // gather the information we care about into a single object of type {int, IEnumerable<int>}
    .Select(patients => new { Id = patients.Key,
                              Conditions = patients.Select(p => p.ConditionCode) })
    // keep only the patients who have every condition we're looking for
    .Where(conditionsByPatient =>
        filteringList.All(condition => conditionsByPatient.Conditions.Contains(condition)))
    .Select(conditionsByPatient => conditionsByPatient.Id);
In query format, that looks like:
var groupedInfo = from patient in IcdPatient.GetIcdPatientList()
                  group patient by patient.PatientId
                  into patients
                  select new { Id = patients.Key,
                               Conditions = patients.Select(patient => patient.ConditionCode) };
var resultAlt = from g in groupedInfo
                // every code we're looking for must appear among the patient's conditions
                where filteringList.All(condition => g.Conditions.Contains(condition))
                select g.Id;
Edit: If you'd also like to let your user specify the ConditionName rather than the ConditionCode, simply convert from one to the other and store the result in filteringList, like so:
var conditionNames = new List<string> { "Condition 1", "Condition 2" }; // names entered by the user
var filteringList = Icd.GetIcdList().Where(icd => conditionNames.Contains(icd.ConditionName))
                                    .Select(icd => icd.ConditionCode)
                                    .ToList();
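The check "every code we're looking for is in the patient's group" can also be written as a subset test, which generalises to any key/value pairing. The following is only a sketch over in-memory data; the generic helper DivideBy is a hypothetical name, and the tuples stand in for the IcdPatient rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RelationalDivision
{
    // Hypothetical helper: keys of the groups whose values cover every required value.
    public static List<TKey> DivideBy<TRow, TKey, TValue>(
        IEnumerable<TRow> rows,
        Func<TRow, TKey> keySelector,
        Func<TRow, TValue> valueSelector,
        IEnumerable<TValue> required)
    {
        var requiredSet = new HashSet<TValue>(required);

        return rows
            .GroupBy(keySelector, valueSelector)
            // IsSubsetOf: every required value occurs among the group's values.
            .Where(g => requiredSet.IsSubsetOf(g))
            .Select(g => g.Key)
            .ToList();
    }

    public static void Main()
    {
        var rows = new[]
        {
            (PatientId: 100, ConditionCode: 111),
            (PatientId: 100, ConditionCode: 222),
            (PatientId: 200, ConditionCode: 111),
            (PatientId: 200, ConditionCode: 222),
            (PatientId: 3, ConditionCode: 222),
        };

        var result = DivideBy(rows, r => r.PatientId, r => r.ConditionCode, new[] { 111, 222 });
        Console.WriteLine(string.Join(", ", result)); // 100, 200
    }
}
```

One design consequence worth knowing: with an empty list of required codes, every patient qualifies, because an empty set is a subset of anything.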

Related

Group by linq for nested objects

I am making a group by LINQ statement where I convert a single list of data into a list with a nested list. Here is my code so far:
[TestMethod]
public void LinqTestNestedSelect2()
{
// initialization
List<combi> listToLinq = new List<combi>() {
new combi{ id = 1, desc = "a", name = "A", count = 1 },
new combi{ id = 1, desc = "b", name = "A", count = 2 },
new combi{ id = 2, desc = "c", name = "B", count = 3 },
new combi{id = 2, desc = "d", name = "B", count = 4 },
};
// linq group by
var result = (from row in listToLinq
group new { des = row.desc, count = row.count } by new { name = row.name, id = row.id } into obj
select new A { name = obj.Key.name, id = obj.Key.id, descriptions = (from r in obj select new B() { des = r.des, count = r.count }).ToList() }).ToList();
// validation of the results
Assert.AreEqual(2, result.Count);
Assert.AreEqual(2, result[0].descriptions.Count);
Assert.AreEqual(2, result[1].descriptions.Count);
}
public class A
{
public int id;
public string name;
public List<B> descriptions;
}
public class B
{
public int count;
public string des;
}
public class combi
{
public int id;
public string name;
public int count;
public string desc;
}
This is fine if the objects are small like the example. However I will implement this for objects with a lot more properties. How can I efficiently write this statement so I don't have to write field names twice in my linq statement?
I would like to return the objects in the statement and I want something like:
// not working, wishful-thinking code
var result = (from row in listToLinq
group new { des = row.desc, count = row.count } by new { name = row.name, id = row.id } into obj
select new (A){ this = obj.key , descriptions = obj.ToList<B>()}).ToList();
Background: I am rewriting a web API that retrieves objects with nested objects in a single database call for the sake of DB performance. It's basically a big query with a join that retrieves a large amount of data, which I need to sort out into objects.
Probably important: the ID is unique.
EDIT:
Based on the answers so far I have made a solution which sort of works for me, but it is still a bit ugly, and I would like it to be better looking.
{
// start part
return (from row in reader.AsEnumerable()
group row by row.id into grouping
select CreateA(grouping)).ToList();
}
private static A CreateA(IGrouping<int, combi> grouping)
{
A retVal = StaticCreateAFunction(grouping.First());
retVal.descriptions = grouping.Select(item => StaticCreateBFunction(item)).ToList();
return retVal;
}
I hope the StaticCreateAFunction is obvious enough for what it does. In this scenario I only have to write out each property once, which is what I really wanted. But I hope there is a more clever or linq-ish way to write this.

var result = (from row in listToLinq
group new B { des = row.desc, count = row.count } by new { row.name, row.id } into obj
select new A { name = obj.Key.name, id = obj.Key.id, descriptions = obj.ToList() }).ToList();

You can add to each of the A and B classes a constructor that receives a combi and then it takes from it only what it needs. For example for a:
public class A
{
public A(combi c)
{
id = c.id;
name = c.name;
}
}
public class B
{
public B(combi c)
{
count = c.count;
des = c.desc;
}
}
Then your query can look like:
var result = (from row in listToLinq
group row by new { row.id, row.name } into grouping
select new A(grouping.First())
{
descriptions = grouping.Select(item => new B(item)).ToList()
}).ToList();
If you don't like the grouping.First(), you can instead override Equals and GetHashCode in A, group by a new A built from the relevant fields (the ones compared in Equals), and then add a copy constructor that takes an A.
Another way, which decouples the A/B classes from combi, is to extract the conversion logic into a collection of static methods.
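As a rough sketch of the static-method variant, the mapping can live in a separate class so that each field is written out once and A, B and combi stay unaware of each other. The class name CombiMapper and its method names are hypothetical; the field names follow the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class combi { public int id; public string name; public int count; public string desc; }
public class B { public int count; public string des; }
public class A { public int id; public string name; public List<B> descriptions; }

// Hypothetical mapper: every field is copied in exactly one place, outside the query.
public static class CombiMapper
{
    public static B ToB(combi c) => new B { count = c.count, des = c.desc };

    public static A ToA(IEnumerable<combi> rows)
    {
        var list = rows.ToList();
        var first = list[0]; // all rows of one group share id and name
        return new A
        {
            id = first.id,
            name = first.name,
            descriptions = list.Select(c => ToB(c)).ToList(),
        };
    }

    public static List<A> Nest(IEnumerable<combi> rows) =>
        rows.GroupBy(r => r.id).Select(g => ToA(g)).ToList();
}

public static class MapperDemo
{
    public static void Main()
    {
        var rows = new List<combi>
        {
            new combi { id = 1, desc = "a", name = "A", count = 1 },
            new combi { id = 1, desc = "b", name = "A", count = 2 },
            new combi { id = 2, desc = "c", name = "B", count = 3 },
            new combi { id = 2, desc = "d", name = "B", count = 4 },
        };

        foreach (var a in CombiMapper.Nest(rows))
            Console.WriteLine($"{a.id} {a.name}: {a.descriptions.Count} descriptions");
    }
}
```

The query itself shrinks to a GroupBy and a Select, and adding a field to A or B only touches the mapper.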

using variable in linq group by clause

Let's say that I have a table with multiple columns (student, teacher, subject, marks) and I want to compare each of these columns to another table with the same columns (with sum(marks)).
I have the following method with the column name as an argument, which is then used in the group by clause.
public List<AllMarks> RIMarks(string filter)
{
var MarkTable = from p in MainDB.classTable
let fil = filter
group p by new { fil } into g
select new AllMarks
{
Column = g.Key.fil,
Marks = g.Sum(f => f.Mark)
};
List<AllMarks> lstRI = MarkTable.ToList();
return lstRI;
}
public void Test()
{
var filter = new string[] {"Student", "Teacher", "Subject"};
foreach (var f in filter)
{
// Call RIMarks(f) and do something
}
}
I have a number of distinct students in my table. However, with this method what I get is just a single row with sum(marks) over all students for the first filter criterion (which is Student); it's not actually grouping by the student.
How do I use a local variable in a LINQ group by clause?
Update:
Sample DB:
Student Teacher Subject Marks
stu1 teac1 sub1 23
stu1 teac1 sub1 45
stu2 teac2 sub2 34

You're trying to group by the name of a particular property of some data object you have.
You need to use reflection to look up the property with that name, then group by that property's values.
You can't just provide the name, because then every row gets the same constant key and the whole collection ends up in a single group.
Test this out in LinqPad.
What you have are the marks being summed per group; I'm not sure if you wanted the average, but
what you have here should lead you to getting the average.
void Main()
{
var mySchool = new List<School>{
new School{Student = "Student A", Teacher = "Teacher A", Subject = "Math", Marks = 80},
new School{Student = "Student B", Teacher = "Teacher A", Subject = "Math", Marks = 65},
new School{Student = "Student C", Teacher = "Teacher A", Subject = "Math", Marks = 95},
new School{Student = "Student A", Teacher = "Teacher B", Subject = "History", Marks = 80},
new School{Student = "Student B", Teacher = "Teacher B", Subject = "History", Marks = 100},
};
GroupByFilter("Student", mySchool);
GroupByFilter("Teacher", mySchool);
GroupByFilter("Subject", mySchool);
}
public void GroupByFilter(string filter, List<School> school)
{
PropertyInfo prop = typeof(School).GetProperties()
.Where(x => x.Name == filter)
.First();
var grouping = from s in school
group s by new {filter = prop.GetValue(s)} into gr
select new {
Filter = gr.Key.filter,
Marks = gr.Sum(x => x.Marks)
};
grouping.Dump(); // this is linqpad specific
}
// Define other methods and classes here
public class School{
public string Student {get;set;}
public string Teacher {get;set;}
public string Subject {get;set;}
public int Marks {get;set;}
}
Results
Group By Student
Filter Marks
Student A 160
Student B 165
Student C 95
Group By Teacher
Filter Marks
Teacher A 240
Teacher B 180
Group By Subject
Filter Marks
Math 240
History 180
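If reflection feels heavy, one alternative for in-memory data is a lookup from column name to key selector. This is only a sketch: the MarkTotals class and its Selectors dictionary are hypothetical names, and the School class mirrors the one in the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class School
{
    public string Student { get; set; }
    public string Teacher { get; set; }
    public string Subject { get; set; }
    public int Marks { get; set; }
}

public static class MarkTotals
{
    // Hypothetical lookup from column name to key selector; add one entry per column.
    private static readonly Dictionary<string, Func<School, string>> Selectors =
        new Dictionary<string, Func<School, string>>
        {
            ["Student"] = s => s.Student,
            ["Teacher"] = s => s.Teacher,
            ["Subject"] = s => s.Subject,
        };

    // Sums Marks per distinct value of the named column.
    public static Dictionary<string, int> SumBy(string column, IEnumerable<School> rows)
    {
        if (!Selectors.TryGetValue(column, out var selector))
            throw new ArgumentException("Unknown column: " + column, nameof(column));

        return rows.GroupBy(selector)
                   .ToDictionary(g => g.Key, g => g.Sum(r => r.Marks));
    }

    public static void Main()
    {
        var mySchool = new List<School>
        {
            new School { Student = "Student A", Teacher = "Teacher A", Subject = "Math", Marks = 80 },
            new School { Student = "Student B", Teacher = "Teacher A", Subject = "Math", Marks = 65 },
            new School { Student = "Student C", Teacher = "Teacher A", Subject = "Math", Marks = 95 },
            new School { Student = "Student A", Teacher = "Teacher B", Subject = "History", Marks = 80 },
            new School { Student = "Student B", Teacher = "Teacher B", Subject = "History", Marks = 100 },
        };

        Console.WriteLine(SumBy("Teacher", mySchool)["Teacher A"]); // 240
        Console.WriteLine(SumBy("Student", mySchool)["Student B"]); // 165
    }
}
```

An unknown column name fails loudly instead of silently grouping everything together. For a database-backed query the selectors would presumably need to be expression trees rather than delegates, which this sketch does not cover.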

Please try the code below.
public List<AllMarks> RIMarks(string filter)
{
    IEnumerable<AllMarks> markTable;
    if (filter == "Student") {
        markTable = from p in MainDB.classTable
                    group p by p.Student into g
                    select new AllMarks
                    {
                        Column = g.Key,
                        Marks = g.Sum(f => f.Mark)
                    };
    }
    else if (filter == "Teacher") {
        markTable = from p in MainDB.classTable
                    group p by p.Teacher into g
                    select new AllMarks
                    {
                        Column = g.Key,
                        Marks = g.Sum(f => f.Mark)
                    };
    }
    else if (filter == "Subject") {
        markTable = from p in MainDB.classTable
                    group p by p.Subject into g
                    select new AllMarks
                    {
                        Column = g.Key,
                        Marks = g.Sum(f => f.Mark)
                    };
    }
    else {
        // unknown column name: fail instead of returning nothing
        throw new ArgumentException("Unknown filter: " + filter, nameof(filter));
    }
    return markTable.ToList();
}

EntityFramework / LinQ load entity from database to dto

I have a problem loading the correct data to a DTO using EF and linq.
From my DB I receive following example data:
1, 1, 1
1, 1, 2
1, 1, 3
2, 1, 4
2, 1, 5
etc.
I want to load these data in a DTO which should look like this:
int, int, ICollection<int>
so for the example data:
new MyDto(1, 1, new List<int> { 1, 2, 3 });
new MyDto(2, 1, new List<int> { 4, 5 });
This is my linq query
var result = (from adresses in context.Adress
join person in context.Person on adresses.PersonId equals person.Id
select new MyObj { Id1 = adresses.Id1, Id2 = adresses.Id2, PersonId = person.Id })
But it is wrong, since it doesn't group by Id1 and Id2 and doesn't put the personIds in the list...
Could you please tell me how I can achieve this?

Pivoting the data using LINQ is a better way. You can take a look at this link:
Is it possible to Pivot data using LINQ
To answer your question, below is an example:
var result = (from address in context.Adress
join person in context.Person on address.PersonId equals person.Id
group address by address.Id1 into gResult
select new {
Id1 = gResult.Key,
Id2 = gResult.Select(r => r.Id2).FirstOrDefault(),
Id3 = gResult.Select(r => r.Id3)
});

In your Address class, do you have a property for a Person instance so you're able to set up a relationship between the two classes? If so, the following query may get you the result set that you're looking for:
public class Address
{
public int Id1 { get; set; }
public int Id2 { get; set; }
public virtual Person Person { get; set; }
}
public void Foo()
{
IEnumerable<MyObj> result = context.Address.Select(x => new MyObj {
Id1 = x.Id1,
Id2 = x.Id2,
PersonId = x.Person.Id
});
}

Thanks for the good answers, guys. I could finally work it out :-)
var result = from tuple in (from address in context.Adresses
join person in context.Persons on address.PersonId equals person.Id
select new { PersonId = person.Id, address.Id1, address.Id2 })
group tuple by new { tuple.Id1, tuple.Id2 } into myGrouping
select
new MyObj
{
Id1 = myGrouping.Key.Id1,
Id2 = myGrouping.Key.Id2,
PersonIds = myGrouping.Select(x => x.PersonId).Distinct()
};
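The same grouping can be tried without a database. The sketch below runs against in-memory tuples using the sample rows from the question; the MyDto shape follows the int, int, ICollection<int> layout described there, and the helper name ToDtos is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyDto
{
    public int Id1 { get; set; }
    public int Id2 { get; set; }
    public List<int> PersonIds { get; set; }
}

public static class DtoGrouping
{
    // Groups rows by the composite key (Id1, Id2) and collects the person ids per key.
    public static List<MyDto> ToDtos(IEnumerable<(int Id1, int Id2, int PersonId)> rows) =>
        rows.GroupBy(r => new { r.Id1, r.Id2 }, r => r.PersonId)
            .Select(g => new MyDto
            {
                Id1 = g.Key.Id1,
                Id2 = g.Key.Id2,
                PersonIds = g.Distinct().ToList(),
            })
            .ToList();

    public static void Main()
    {
        var rows = new[] { (1, 1, 1), (1, 1, 2), (1, 1, 3), (2, 1, 4), (2, 1, 5) };

        foreach (var dto in ToDtos(rows))
            Console.WriteLine($"{dto.Id1}, {dto.Id2}: {string.Join(", ", dto.PersonIds)}");
        // 1, 1: 1, 2, 3
        // 2, 1: 4, 5
    }
}
```

Anonymous types compare by value, which is what makes new { r.Id1, r.Id2 } usable as a composite grouping key.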

Select All distinct values in a column using LINQ

I created a Web Api in VS 2012.
I am trying to get all the values from one column, "Category", that is, all the unique values; I don't want the list to be returned with duplicates.
I used this code to get the products in a particular category. How do I get a full list of categories (all the unique values in the Category column)?
public IEnumerable<Product> GetProductsByCategory(string category)
{
return repository.GetAllProducts().Where(
p => string.Equals(p.Category, category, StringComparison.OrdinalIgnoreCase));
}

To have unique Categories:
var uniqueCategories = repository.GetAllProducts()
.Select(p => p.Category)
.Distinct();
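Since the question compares categories with StringComparison.OrdinalIgnoreCase, it may be worth passing a matching comparer to Distinct so that "Toys" and "toys" collapse into one entry. A small sketch over plain strings; the class name, helper name and sample category values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CategoryQueries
{
    // Returns each category once, ignoring case, in alphabetical order.
    public static List<string> UniqueCategories(IEnumerable<string> categories) =>
        categories
            .Where(c => !string.IsNullOrWhiteSpace(c))
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .OrderBy(c => c, StringComparer.OrdinalIgnoreCase)
            .ToList();

    public static void Main()
    {
        var raw = new[] { "Toys", "toys", "Groceries", "Hardware", "TOYS" };
        Console.WriteLine(string.Join(", ", UniqueCategories(raw))); // Groceries, Hardware, Toys
    }
}
```

Distinct keeps the first spelling it encounters, so the casing of the result depends on the order of the input.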

var uniq = allvalues.GroupBy(x => x.Id).Select(y=>y.First()).Distinct();
Easy and simple

I have to find distinct rows with the following details
class : Scountry
columns: countryID, countryName,isactive
There is no primary key in this. I have succeeded with the following query:
public DbSet<SCountry> country { get; set; }
public List<SCountry> DoDistinct()
{
var query = (from m in country group m by new { m.CountryID, m.CountryName, m.isactive } into mygroup select mygroup.FirstOrDefault()).Distinct();
var Countries = query.ToList().Select(m => new SCountry { CountryID = m.CountryID, CountryName = m.CountryName, isactive = m.isactive }).ToList();
return Countries;
}

Interestingly enough, I tried both of these in LinqPad, and the variant using group by from Dmitry Gribkov appears to be quicker (also, the final Distinct is not required, as the result is already distinct).
My (somewhat simple) code was:
public class Pair
{
public int id {get;set;}
public string Arb {get;set;}
}
void Main()
{
var theList = new List<Pair>();
var randomiser = new Random();
for (int count = 1; count < 10000; count++)
{
theList.Add(new Pair
{
id = randomiser.Next(1, 50),
Arb = "not used"
});
}
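var timer = Stopwatch.StartNew();
// ToList() forces execution; without it the deferred query never runs inside the timed region
var distinct = theList.GroupBy(c => c.id).Select(p => p.First().id).ToList();
timer.Stop();
Debug.WriteLine(timer.Elapsed);
timer.Restart(); // reset so the second measurement does not include the first
var otherDistinct = theList.Select(p => p.id).Distinct().ToList();
timer.Stop();
Debug.WriteLine(timer.Elapsed);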
}

Difference Between Select and SelectMany

I've been searching the difference between Select and SelectMany but I haven't been able to find a suitable answer. I need to learn the difference when using LINQ To SQL but all I've found are standard array examples.
Can someone provide a LINQ To SQL example?

SelectMany flattens queries that return lists of lists. For example
public class PhoneNumber
{
public string Number { get; set; }
}
public class Person
{
public IEnumerable<PhoneNumber> PhoneNumbers { get; set; }
public string Name { get; set; }
}
IEnumerable<Person> people = new List<Person>();
// Select gets a list of lists of phone numbers
IEnumerable<IEnumerable<PhoneNumber>> phoneLists = people.Select(p => p.PhoneNumbers);
// SelectMany flattens it to just a list of phone numbers.
IEnumerable<PhoneNumber> phoneNumbers = people.SelectMany(p => p.PhoneNumbers);
// And to include data from the parent in the result:
// pass an expression to the second parameter (resultSelector) in the overload:
var directory = people
.SelectMany(p => p.PhoneNumbers,
(parent, child) => new { parent.Name, child.Number });

SelectMany can work like a cross join in SQL: it takes the cross product of two sequences.
For example, if we have
Set A={a,b,c}
Set B={x,y}
SelectMany can be used to get the following set:
{ (x,a) , (x,b) , (x,c) , (y,a) , (y,b) , (y,c) }
Note that here we take all the possible combinations that can be made from the elements of set A and set B.
Here is a LINQ example you can try:
List<string> animals = new List<string>() { "cat", "dog", "donkey" };
List<int> number = new List<int>() { 10, 20 };
var mix = number.SelectMany(num => animals, (n, a) => new { n, a });
mix will contain the following elements in a flat structure:
{(10,cat), (10,dog), (10,donkey), (20,cat), (20,dog), (20,donkey)}

var players = db.SoccerTeams.Where(c => c.Country == "Spain")
.SelectMany(c => c.players);
foreach(var player in players)
{
Console.WriteLine(player.LastName);
}
De Gea
Alba
Costa
Villa
Busquets
...

SelectMany() lets you collapse a multidimensional sequence in a way that would otherwise require a second Select() or loop.

There are several overloads of SelectMany. One of them allows you to keep track of the relationship between parent and children while traversing the hierarchy.
Example: suppose you have the following structure: League -> Teams -> Player.
You can easily return a flat collection of players. However, you may lose any reference to the team the player is part of.
Fortunately there is an overload for this purpose:
var teamsAndTheirLeagues =
from helper in leagues.SelectMany
( l => l.Teams
, ( league, team ) => new { league, team } )
where helper.team.Players.Count > 2
&& helper.league.Teams.Count < 10
select new
{ LeagueID = helper.league.ID
, Team = helper.team
};
The previous example is taken from Dan's IK blog. I strongly recommend you take a look at it.

I understand SelectMany to work like a join shortcut.
So you can:
var orders = customers
.Where(c => c.CustomerName == "Acme")
.SelectMany(c => c.Orders);

The SelectMany() method is used to flatten a sequence in which each of the elements is itself a separate sequence.
I have a User class like this:
class User
{
public string UserName { get; set; }
public List<string> Roles { get; set; }
}
main:
var users = new List<User>
{
new User { UserName = "Reza" , Roles = new List<string>{"Superadmin" } },
new User { UserName = "Amin" , Roles = new List<string>{"Guest","Reseption" } },
new User { UserName = "Nima" , Roles = new List<string>{"Nurse","Guest" } },
};
var query = users.SelectMany(user => user.Roles, (user, role) => new { user.UserName, role });
foreach (var obj in query)
{
Console.WriteLine(obj);
}
//output
//{ UserName = Reza, role = Superadmin }
//{ UserName = Amin, role = Guest }
//{ UserName = Amin, role = Reseption }
//{ UserName = Nima, role = Nurse }
//{ UserName = Nima, role = Guest }
You can apply operations to each inner sequence, or to the flattened result:
int[][] numbers = {
new[] {1, 2, 3},
new[] {4},
new[] {5, 6 , 6 , 2 , 7, 8},
new[] {12, 14}
};
IEnumerable<int> result = numbers
.SelectMany(array => array.Distinct())
.OrderBy(x => x);
//output
//{ 1, 2 , 2 , 3, 4, 5, 6, 7, 8, 12, 14 }
List<List<int>> numbers = new List<List<int>> {
new List<int> {1, 2, 3},
new List<int> {12},
new List<int> {5, 6, 5, 7},
new List<int> {10, 10, 10, 12}
};
IEnumerable<int> result = numbers
.SelectMany(list => list)
.Distinct()
.OrderBy(x=>x);
//output
// { 1, 2, 3, 5, 6, 7, 10, 12 }

Select is a simple one-to-one projection from source element to a result element. SelectMany is used when there are multiple from clauses in a query expression: each element in the original sequence is used to generate a new sequence.
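To make the link between multiple from clauses and SelectMany concrete, here is a small sketch that writes the same query both ways. The class and method names are hypothetical; both methods should return identical results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FromClauses
{
    // Query syntax: the second from clause runs once per element of the first.
    public static List<string> QuerySyntax(int[] numbers, string[] letters) =>
        (from n in numbers
         from l in letters
         select n + l).ToList();

    // Method syntax: the same query written as a SelectMany call with a result selector.
    public static List<string> MethodSyntax(int[] numbers, string[] letters) =>
        numbers.SelectMany(n => letters, (n, l) => n + l).ToList();

    public static void Main()
    {
        var numbers = new[] { 1, 2 };
        var letters = new[] { "a", "b" };

        Console.WriteLine(string.Join(", ", QuerySyntax(numbers, letters)));  // 1a, 1b, 2a, 2b
        Console.WriteLine(string.Join(", ", MethodSyntax(numbers, letters))); // 1a, 1b, 2a, 2b
    }
}
```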

The formal description for SelectMany() is:
Projects each element of a sequence to an IEnumerable and flattens
the resulting sequences into one sequence.
SelectMany() flattens the resulting sequences into one sequence, and invokes a result selector function on each element therein.
class PetOwner
{
public string Name { get; set; }
public List<String> Pets { get; set; }
}
public static void SelectManyEx()
{
PetOwner[] petOwners =
{ new PetOwner { Name="Higa, Sidney",
Pets = new List<string>{ "Scruffy", "Sam" } },
new PetOwner { Name="Ashkenazi, Ronen",
Pets = new List<string>{ "Walker", "Sugar" } },
new PetOwner { Name="Price, Vernette",
Pets = new List<string>{ "Scratches", "Diesel" } } };
// Query using SelectMany().
IEnumerable<string> query1 = petOwners.SelectMany(petOwner => petOwner.Pets);
Console.WriteLine("Using SelectMany():");
// Only one foreach loop is required to iterate
// through the results since it is a
// one-dimensional collection.
foreach (string pet in query1)
{
Console.WriteLine(pet);
}
// This code shows how to use Select()
// instead of SelectMany().
IEnumerable<List<String>> query2 =
petOwners.Select(petOwner => petOwner.Pets);
Console.WriteLine("\nUsing Select():");
// Notice that two foreach loops are required to
// iterate through the results
// because the query returns a collection of lists.
foreach (List<String> petList in query2)
{
foreach (string pet in petList)
{
Console.WriteLine(pet);
}
Console.WriteLine();
}
}
/*
This code produces the following output:
Using SelectMany():
Scruffy
Sam
Walker
Sugar
Scratches
Diesel
Using Select():
Scruffy
Sam
Walker
Sugar
Scratches
Diesel
*/
The main difference is the shape of the result: SelectMany() returns a flattened result, whereas Select() returns a list of lists.
Therefore the result of SelectMany is a list like
{Scruffy, Sam , Walker, Sugar, Scratches , Diesel}
which you can iterate with a single foreach. With the result of Select you need an extra foreach loop to iterate through the results, because the query returns a collection of lists.

Some uses of SelectMany may not be necessary. The two queries below give the same result.
Customers.Where(c=>c.Name=="Tom").SelectMany(c=>c.Orders)
Orders.Where(o=>o.Customer.Name=="Tom")
For a 1-to-many relationship:
if you start from the "1" side, SelectMany is needed; it flattens the "many".
if you start from the "many" side, SelectMany is not needed (you can still filter on the "1" side, and this is simpler than the standard join query below).
from o in Orders
join c in Customers on o.CustomerID equals c.ID
where c.Name == "Tom"
select o

Just for an alternate view that may help some functional programmers out there:
Select is map
SelectMany is bind (or flatMap for your Scala/Kotlin people)
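For readers less familiar with those terms, a tiny sketch of the difference: mapping a sequence-returning function with Select leaves a nested sequence, while SelectMany applies the same function and flattens in one step. The class and method names here are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MapAndBind
{
    // The function being applied: each number x becomes x copies of itself.
    private static IEnumerable<int> Repeat(int x) => Enumerable.Repeat(x, x);

    // map: one result per input, so the outer sequence keeps its length and stays nested.
    public static List<List<int>> Map(IEnumerable<int> xs) =>
        xs.Select(x => Repeat(x).ToList()).ToList();

    // bind / flatMap: the inner sequences are concatenated into one flat sequence.
    public static List<int> Bind(IEnumerable<int> xs) =>
        xs.SelectMany(x => Repeat(x)).ToList();

    public static void Main()
    {
        var xs = new[] { 1, 2, 3 };

        Console.WriteLine(Map(xs).Count);                 // 3
        Console.WriteLine(string.Join(", ", Bind(xs)));   // 1, 2, 2, 3, 3, 3
    }
}
```

Flattening the mapped result afterwards, for example with Map(xs).SelectMany(inner => inner), gives the same sequence as Bind(xs).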

Without getting too technical - database with many Organizations, each with many Users:-
var orgId = "123456789";
var userList1 = db.Organizations
.Where(a => a.OrganizationId == orgId)
.SelectMany(a => a.Users)
.ToList();
var userList2 = db.Users
.Where(a => a.OrganizationId == orgId)
.ToList();
both return the same ApplicationUser list for the selected Organization.
The first "projects" from Organization to Users, the second queries the Users table directly.

It's clearer when the query returns a string (a sequence of char):
For example, if the list 'Fruits' contains 'apple',
'Select' returns the string:
Fruits.Select(s=>s)
[0]: "apple"
'SelectMany' flattens the string:
Fruits.SelectMany(s=>s)
[0]: 97 'a'
[1]: 112 'p'
[2]: 112 'p'
[3]: 108 'l'
[4]: 101 'e'

Consider this example :
var array = new string[2]
{
"I like what I like",
"I like what you like"
};
// query1 returns two elements, something like this:
// the first element would be a string[5]: "I" "like" "what" "I" "like"
// the second element would be a string[5]: "I" "like" "what" "you" "like"
IEnumerable<string[]> query1 = array.Select(s => s.Split(' ')).Distinct();
// query2 returns a flat result, something like this:
// "I" "like" "what" "you"
IEnumerable<string> query2 = array.SelectMany(s => s.Split(' ')).Distinct();
So, as you can see, duplicate values like "I" or "like" have been removed from query2, because SelectMany flattens the result into a single sequence of strings before Distinct is applied.
But query1 returns a sequence of string arrays, and since the two arrays in query1 are different objects (arrays are compared by reference), nothing is removed.

The SelectMany method knocks down an IEnumerable<IEnumerable<T>> into an IEnumerable<T>: every element ends up at the same level, regardless of which inner sequence it came from.
var words = new [] { "a,b,c", "d,e", "f" };
var splitAndCombine = words.SelectMany(x => x.Split(','));
// returns { "a", "b", "c", "d", "e", "f" }

One more example of how SelectMany + Select can be used to gather data from objects in sub-collections.
Suppose we have users with their phones:
class Phone {
public string BasePart = "555-xxx-xxx";
}
class User {
public string Name = "Xxxxx";
public List<Phone> Phones;
}
Now we need to select the BasePart of every phone of every user:
var usersArray = new List<User>(); // each user holds a list of phones
List<string> allBaseParts = usersArray.SelectMany(ua => ua.Phones).Select(p => p.BasePart).ToList();

Suppose you have an array of countries:
var countries = new[] { "France", "Italy" };
If you perform Select on countries, you get the elements of the array back as an IEnumerable<string>:
IEnumerable<string> selectQuery = countries.Select(country => country);
In the above code, country is a string that refers to each country in the array. Now iterate over selectQuery to get the countries:
foreach(var country in selectQuery)
Console.WriteLine(country);
// output
//
// France
// Italy
If you want to print every character of each country, you have to use a nested foreach:
foreach (var country in selectQuery)
{
foreach (var charOfCountry in country)
{
Console.Write(charOfCountry + ", ");
}
}
// output
// F, r, a, n, c, e, I, t, a, l, y,
OK. Now try to perform SelectMany on countries. This time SelectMany gets each country as a string (as before), and because a string is a sequence of chars, SelectMany splits each country into its constituent chars and returns them all as a single IEnumerable<char>:
IEnumerable<char> selectManyQuery = countries.SelectMany(country => country);
In the above code, country is again a string that refers to each country in the array, but the return value is the chars of each country.
In effect, SelectMany reaches two levels into the collections and flattens the second level into a single IEnumerable<T>.
Now iterate over selectManyQuery to get the chars of each country:
foreach(var charOfCountry in selectManyQuery)
Console.Write(charOfCountry + ", ");
// output
// F, r, a, n, c, e, I, t, a, l, y,

Here is a code example with an initialized small collection for testing:
class Program
{
static void Main(string[] args)
{
List<Order> orders = new List<Order>
{
new Order
{
OrderID = "orderID1",
OrderLines = new List<OrderLine>
{
new OrderLine
{
ProductSKU = "SKU1",
Quantity = 1
},
new OrderLine
{
ProductSKU = "SKU2",
Quantity = 2
},
new OrderLine
{
ProductSKU = "SKU3",
Quantity = 3
}
}
},
new Order
{
OrderID = "orderID2",
OrderLines = new List<OrderLine>
{
new OrderLine
{
ProductSKU = "SKU4",
Quantity = 4
},
new OrderLine
{
ProductSKU = "SKU5",
Quantity = 5
}
}
}
};
//required result is the list of all SKUs in orders
List<string> allSKUs = new List<string>();
//With Select case 2 foreach loops are required
var flattenedOrdersLinesSelectCase = orders.Select(o => o.OrderLines);
foreach (var flattenedOrderLine in flattenedOrdersLinesSelectCase)
{
foreach (OrderLine orderLine in flattenedOrderLine)
{
allSKUs.Add(orderLine.ProductSKU);
}
}
//With SelectMany case only one foreach loop is required
allSKUs = new List<string>();
var flattenedOrdersLinesSelectManyCase = orders.SelectMany(o => o.OrderLines);
foreach (var flattenedOrderLine in flattenedOrdersLinesSelectManyCase)
{
allSKUs.Add(flattenedOrderLine.ProductSKU);
}
//If the required result is a flattened list with OrderID, ProductSKU and Quantity,
//SelectMany with a result selector is a convenient way to get it.
//It avoids hand-written for loops, which in my experience makes the code faster when
//hundreds of thousands of data rows must be processed
List<OrderLineForReport> ordersLinesForReport = orders.SelectMany(o => o.OrderLines,
(o, ol) => new OrderLineForReport
{
OrderID = o.OrderID,
ProductSKU = ol.ProductSKU,
Quantity = ol.Quantity
}).ToList();
}
}
class Order
{
public string OrderID { get; set; }
public List<OrderLine> OrderLines { get; set; }
}
class OrderLine
{
public string ProductSKU { get; set; }
public int Quantity { get; set; }
}
class OrderLineForReport
{
public string OrderID { get; set; }
public string ProductSKU { get; set; }
public int Quantity { get; set; }
}

The Select operator is used to select values from a collection, and the SelectMany operator is used to select values from a collection of collections, i.e. a nested collection.

I think this is the best way to understand it.
var query =
Enumerable
.Range(1, 10)
.SelectMany(ints => Enumerable.Range(1, 10), (a, b) => $"{a} * {b} = {a * b}")
.ToArray();
Console.WriteLine(string.Join(Environment.NewLine, query));
Console.Read();
This is a multiplication table example.
