LINQ Query to count by distinct category / subcategory - c#

I have a table, generated from a LINQ query on a datatable, which has subcategory and category fields:
Name...........Category.........Subcategory
Kiss...........Rock.............Glam Rock
Metallica......Rock.............Hard Rock
Bon Jovi.......Rock.............Soft Rock
Slade..........Rock.............Glam Rock
Meatloaf.......Rock.............Soft Rock
Wilee..........Dance............Grime
Mgmt...........Dance............Nu Rave
Dizee..........Dance............Grime
The LINQ query I am using to generate this table is:
var qCategory = from c in dtCategory.AsEnumerable()
select new {
Artist = c.Field<string>("Artist"),
Category = c.Field<string>("Category"),
Subcategory = c.Field<string>("Subcategory")
};
Now I want to get a count of each category/subcategory pair. e.g. for the above example I want to return:
Category............Subcategory.......Count
Rock................Glam Rock.........2
Rock................Soft Rock.........2
Rock................Hard Rock.........1
Dance...............Grime.............2
Dance...............Nu Rave...........1
How can I achieve this?

Try:
var counts = from artist in qCategory
group artist by new { artist.Category, artist.Subcategory }
into g
select new {
g.Key.Category,
g.Key.Subcategory,
Count = g.Count()
};
If you want to enforce that subcategories always have the same parent category (given that the sub-categories are named "Glam Rock" etc., I assume that this is in fact the case), do:
var counts = from artist in qCategory
group artist by artist.Subcategory into g
select new {
Category = g.Select(a => a.Category)
.Distinct()
.Single(),
Subcategory = g.Key,
Count = g.Count()
};
This will throw an exception if "Rap Rock" turns up as a subcategory of both "Rap" and "Rock".
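As a minimal, self-contained sketch of the composite-key grouping above, the following runs against an in-memory list instead of the DataTable. The `Artist` record and the `CategoryCounter` helper are hypothetical names introduced only for illustration. The grouping works because anonymous types compare by value, so two rows with the same Category and Subcategory land in the same group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one row of the question's table.
public record Artist(string Name, string Category, string Subcategory);

public static class CategoryCounter
{
    // Groups on the (Category, Subcategory) pair and counts the rows in each group.
    public static List<(string Category, string Subcategory, int Count)> CountPairs(
        IEnumerable<Artist> artists) =>
        artists
            .GroupBy(a => new { a.Category, a.Subcategory })
            .Select(g => (g.Key.Category, g.Key.Subcategory, Count: g.Count()))
            .ToList();
}

public static class CategoryCountDemo
{
    public static void Main()
    {
        var artists = new List<Artist>
        {
            new("Kiss", "Rock", "Glam Rock"),
            new("Metallica", "Rock", "Hard Rock"),
            new("Bon Jovi", "Rock", "Soft Rock"),
            new("Slade", "Rock", "Glam Rock"),
            new("Meatloaf", "Rock", "Soft Rock"),
            new("Wilee", "Dance", "Grime"),
            new("Mgmt", "Dance", "Nu Rave"),
            new("Dizee", "Dance", "Grime"),
        };

        // One line per distinct category/subcategory pair.
        foreach (var row in CategoryCounter.CountPairs(artists))
            Console.WriteLine($"{row.Category}\t{row.Subcategory}\t{row.Count}");
    }
}
```

With LINQ to Objects the groups come out in the order their keys first appear, so add an OrderBy if you need a specific ordering.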

qCategory.
GroupBy(item => new {Category = item.Category, Subcategory = item.Subcategory}).
Select(group => new {Category = group.Key.Category, Subcategory = group.Key.Subcategory, Count = group.Count()})

Related

LINQ left join, group by and Count generates wrong result

I'm struggling with linq (left join - group - count). Please help me.
Below is my code and it gives me this result.
Geography 2
Economy 1
Biology 1
I'm expecting this...
Geography 2
Economy 1
Biology 0
How can I fix it?
class Department
{
public int DNO { get; set; }
public string DeptName { get; set; }
}
class Student
{
public string Name { get; set; }
public int DNO { get; set; }
}
class Program
{
static void Main(string[] args)
{
List<Department> departments = new List<Department>
{
new Department {DNO=1, DeptName="Geography"},
new Department {DNO=2, DeptName="Economy"},
new Department {DNO=3, DeptName="Biology"}
};
List<Student> students = new List<Student>
{
new Student {Name="Peter", DNO=2},
new Student {Name="Paul", DNO=1},
new Student {Name="Mary", DNO=1},
};
var query = from dp in departments
join st in students on dp.DNO equals st.DNO into gst
from st2 in gst.DefaultIfEmpty()
group st2 by dp.DeptName into g
select new
{
DName = g.Key,
Count = g.Count()
};
foreach (var st in query)
{
Console.WriteLine("{0} \t{1}", st.DName, st.Count);
}
}
}
var query =
from department in departments
join student in students on department.DNO equals student.DNO into gst
select new
{
DepartmentName = department.DeptName,
Count = gst.Count()
};
I don't think any grouping is required for answering your question.
You only want to know 2 things:
- name of department
- number of students per department
The 'join ... into gst' form is a group join: for each department, gst holds the matching students, and it is an empty sequence when there are none. You only have to count the elements of gst.
var query = from dp in departments
from st in students.Where(stud => stud.DNO == dp.DNO).DefaultIfEmpty()
group st by dp.DeptName into g
select new
{
DName = g.Key,
Count = g.Count(x => x!=null)
};
You want to group the students by department name, but the count has to skip the null placeholder that DefaultIfEmpty() produces for a department with no students. I also changed the join syntax slightly, although that does not really matter.
Well, see what @Danny said in his answer; it's the best and cleanest fix for this case. By the way, you could also rewrite it in lambda syntax:
var query = departments.GroupJoin(students,
dp => dp.DNO, st => st.DNO,
(dept,studs) => new
{
DName = dept.DeptName,
Count = studs.Count()
});
I find this syntax much more predictable in results, and often, shorter.
BTW: .GroupJoin keeps every outer element even when it has no matches, so it behaves like a "left join", while .Join is an "inner join". Be careful not to mistake one for the other.
My answer is similar to @Igor's:
var query = from dp in departments
join st in students on dp.DNO equals st.DNO into gst
from st2 in gst.DefaultIfEmpty()
group st2 by dp.DeptName into g
select new
{
DName = g.Key,
Count = g.Count(std => std != null)
};
Replacing g.Count() with g.Count(std => std != null) is the only change you need to make.
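To make the difference between these answers concrete, here is a small sketch that runs the three counting strategies side by side on the question's data. The record types mirror the question's classes, and the `DepartmentCounts` helper with its method names is hypothetical. The unfiltered count reports 1 for Biology because DefaultIfEmpty() turns an empty match list into a single null element, and that null is still counted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record Department(int DNO, string DeptName);
public record Student(string Name, int DNO);

public static class DepartmentCounts
{
    // Flattened left join with a plain Count(): the null placeholder is counted too.
    public static Dictionary<string, int> Unfiltered(
        IEnumerable<Department> departments, IEnumerable<Student> students) =>
        (from dp in departments
         join st in students on dp.DNO equals st.DNO into gst
         from st2 in gst.DefaultIfEmpty()
         group st2 by dp.DeptName into g
         select new { g.Key, Count = g.Count() })
        .ToDictionary(x => x.Key, x => x.Count);

    // Same query, but the count ignores the null placeholder.
    public static Dictionary<string, int> NullFiltered(
        IEnumerable<Department> departments, IEnumerable<Student> students) =>
        (from dp in departments
         join st in students on dp.DNO equals st.DNO into gst
         from st2 in gst.DefaultIfEmpty()
         group st2 by dp.DeptName into g
         select new { g.Key, Count = g.Count(s => s != null) })
        .ToDictionary(x => x.Key, x => x.Count);

    // Group join only: count the matches directly, no flattening or regrouping.
    public static Dictionary<string, int> ViaGroupJoin(
        IEnumerable<Department> departments, IEnumerable<Student> students) =>
        departments
            .GroupJoin(students, d => d.DNO, s => s.DNO,
                       (d, studs) => new { d.DeptName, Count = studs.Count() })
            .ToDictionary(x => x.DeptName, x => x.Count);
}

public static class DepartmentCountDemo
{
    public static void Main()
    {
        var departments = new[]
        {
            new Department(1, "Geography"),
            new Department(2, "Economy"),
            new Department(3, "Biology"),
        };
        var students = new[]
        {
            new Student("Peter", 2),
            new Student("Paul", 1),
            new Student("Mary", 1),
        };

        Console.WriteLine(DepartmentCounts.Unfiltered(departments, students)["Biology"]);   // 1
        Console.WriteLine(DepartmentCounts.NullFiltered(departments, students)["Biology"]); // 0
        Console.WriteLine(DepartmentCounts.ViaGroupJoin(departments, students)["Biology"]); // 0
    }
}
```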

How to aggregate over one property while summing another?

I have a list of invoices, and each record consists of a customer ID and an amount. The aim is to produce a list of payments with one payment per customer (a customer may have multiple invoices), where each payment's total is the sum of that customer's invoice amounts.
Producing a list of distinct invoices (with respect to the customer ID) is very easy. The thing is that I then only have the amount of the first invoice and not the sum.
List<Payment> distinct = invoices
.GroupBy(invoice => invoice.CustomerId)
.Select(group => group.First())
.Select(invoice => new Payment
{
CustomerId = invoice.CustomerId,
Total = invoice.Amount
}).ToList();
Is there a smooth LINQ-fu for that or do I need to go foreach on my list?
If you have something like this
Invoice[] invoices = new Invoice[3];
invoices[0] = new Invoice { Id = 1, Amount = 100 };
invoices[1] = new Invoice { Id = 1, Amount = 150 };
invoices[2] = new Invoice { Id = 2, Amount = 300 };
Then you could reach your goal with:
var results = from i in invoices
group i by i.Id into g
select new { Id = g.Key, TotalAmount = g.Sum(i => i.Amount)};
Based on the answer of jmelosegui:
List<Payment> distinct = invoices
.GroupBy(c => c.CustomerId)
.Select(c => new Payment
{
CustomerId = c.Key,
Total = c.Sum(x => x.Amount)
}).ToList();
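For reference, a runnable version of that grouping might look like the sketch below. The positional `Invoice` and `Payment` records, the `decimal` amount type, and the `PaymentBuilder` helper are assumptions; the question does not show the real class definitions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shapes; the original classes are not shown in the question.
public record Invoice(int CustomerId, decimal Amount);
public record Payment(int CustomerId, decimal Total);

public static class PaymentBuilder
{
    // One payment per customer; Total is the sum over that customer's invoices.
    public static List<Payment> FromInvoices(IEnumerable<Invoice> invoices) =>
        invoices
            .GroupBy(i => i.CustomerId)
            .Select(g => new Payment(g.Key, g.Sum(i => i.Amount)))
            .ToList();
}

public static class PaymentDemo
{
    public static void Main()
    {
        var invoices = new[]
        {
            new Invoice(1, 100m),
            new Invoice(1, 150m),
            new Invoice(2, 300m),
        };

        foreach (var p in PaymentBuilder.FromInvoices(invoices))
            Console.WriteLine($"{p.CustomerId}: {p.Total}");
    }
}
```

The key difference from the question's attempt is that the aggregate runs over the whole group (g.Sum) instead of taking group.First() and reading a single invoice.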

LINQ Expression accessing more fields

I am really new to using LINQ and I was wondering what I need to do to the below expression to grab extra fields
public class Foo
{
public string Name {get;set;}
public string Manufacturer {get;set;}
public float Price {get;set;}
}
var result= (
from row in dt.AsEnumerable()
group row by row.Field<string>("NAME") into g
select new Foo
{
Name = g.Key,
Price=g.Min (x =>x.Field<float>("PRICE"))
//Manufacturer = ????
}
).ToList();
I basically need to get the Manufacturer from the MANUFACTURER field and set its value in the object. I've tried:
row.Field<string>("MANUFACTURER")
//and
g.Field<string>("MANUFACTURER")
But I am having no luck accessing the field in the DataTable. Can anyone advise what I'm doing wrong please?
So you want to group by name. But how do you want to aggregate the manufacturers for each name-group?
Presuming that you just want to take the first manufacturer:
var result= (
from row in dt.AsEnumerable()
group row by row.Field<string>("NAME") into g
select new Foo
{
Name = g.Key,
Price=g.Min (x =>x.Field<float>("PRICE")),
Manufacturer = g.First().Field<string>("MANUFACTURER")
}
).ToList();
Maybe you instead want to concatenate all with a separator:
// ...
Manufacturer = string.Join(",", g.Select(r=> r.Field<string>("MANUFACTURER")))
As your logic stands you may have more than one Manufacturer if you only group by Name.
To illustrate this consider the following data, which is supported by your data structure.
Example
ProductA, ManufacturerA
ProductA, ManufacturerB
If you group by just "ProductA" then Manufacturer is a collection of ["ManufacturerA", "ManufacturerB"]
Potential Solution
You could group by Name and Manufacturer then access both Name and Manufacturer
var result= (
from row in dt.AsEnumerable()
group row by new
{
// Anonymous-type members built from method calls need explicit names
Name = row.Field<string>("NAME"),
Manufacturer = row.Field<string>("MANUFACTURER")
} into g
select new Foo
{
Name = g.Key.Name,
Manufacturer = g.Key.Manufacturer,
Price=g.Min (x =>x.Field<float>("PRICE"))
}
).ToList();
EDIT
Based on comment "I am trying to pull the name with the cheapest price and the manufacturer along with it."
var result= (
from row in dt.AsEnumerable()
group row by row.Field<string>("NAME") into g
// Within each name group, pick the row with the lowest price
let cheapest = g.OrderBy(r => r.Field<float>("PRICE")).First()
select new Foo
{
Name = g.Key,
Manufacturer = cheapest.Field<string>("MANUFACTURER"),
Price = cheapest.Field<float>("PRICE")
}
).ToList();
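The "cheapest row per name" idea can also be sketched without a DataTable. In the example below, the `Product` record and the `CheapestFinder` helper are hypothetical, and the sample prices are made up for illustration. Ordering each group by price and taking the first element keeps the whole row, so the manufacturer travels along with the minimum price.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a DataTable row.
public record Product(string Name, string Manufacturer, float Price);

public static class CheapestFinder
{
    // For each name, keep the full row that has the lowest price.
    public static List<Product> CheapestPerName(IEnumerable<Product> rows) =>
        rows
            .GroupBy(r => r.Name)
            .Select(g => g.OrderBy(r => r.Price).First())
            .ToList();
}

public static class CheapestDemo
{
    public static void Main()
    {
        var rows = new[]
        {
            new Product("ProductA", "ManufacturerA", 12.5f),
            new Product("ProductA", "ManufacturerB", 9.99f),
            new Product("ProductB", "ManufacturerA", 4f),
        };

        foreach (var p in CheapestFinder.CheapestPerName(rows))
            Console.WriteLine($"{p.Name}\t{p.Manufacturer}\t{p.Price}");
    }
}
```

OrderBy is a stable sort, so when two rows tie on price the one that appears first in the source wins. On .NET 6 or later, g.MinBy(r => r.Price) expresses the same selection without sorting the group.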

Group by child table in LINQ

I have a students table and a subjects table, and I need to group students by subject. I tried the following, but I cannot reach s.StudentSubjects.SubjectName inside the group clause. How can I write a group by over a child table?
Students -> StudentID | Name
StudentSubjects -> SubjectID | StudentID | SubjectName
var list = from s in students
group s by s.StudentSubjects.? into g
select new StudentSubjectsCounts
{
Name = g.Key,
Count = g.Count(),
};
Sounds like you should query off of StudentSubjects instead of Student:
var list = from ss in studentSubjects
group ss by ss.SubjectName into g
select new StudentSubjectsCounts
{
Name = g.Key,
Count = g.Count(),
};
Or, to start from a list of students:
var list = students.SelectMany(s => s.StudentSubjects)
.GroupBy(ss => ss.SubjectName)
.Select(g => new StudentSubjectsCounts
{
Name = g.Key,
Count = g.Count(),
});
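A self-contained sketch of the SelectMany approach is shown below, assuming each student carries a collection of subject rows. The record shapes follow the table layout in the question, while the `SubjectCounter` helper and the sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record StudentSubject(int SubjectID, int StudentID, string SubjectName);
public record Student(int StudentID, string Name, List<StudentSubject> StudentSubjects);

public static class SubjectCounter
{
    // Flatten every student's subject rows, then count the rows per subject name.
    public static List<(string Name, int Count)> CountBySubject(IEnumerable<Student> students) =>
        students
            .SelectMany(s => s.StudentSubjects)
            .GroupBy(ss => ss.SubjectName)
            .Select(g => (Name: g.Key, Count: g.Count()))
            .ToList();
}

public static class SubjectCountDemo
{
    public static void Main()
    {
        var students = new List<Student>
        {
            new(1, "Ann", new List<StudentSubject> { new(10, 1, "Math"), new(11, 1, "Physics") }),
            new(2, "Bob", new List<StudentSubject> { new(10, 2, "Math") }),
            new(3, "Cid", new List<StudentSubject>()),
        };

        foreach (var row in SubjectCounter.CountBySubject(students))
            Console.WriteLine($"{row.Name}: {row.Count}");
    }
}
```

A student with no subjects simply contributes nothing to the flattened sequence, so no null handling is needed here.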
If each student references a single StudentSubjects object (not a collection), you should be able to group by that object itself
var list = from s in students
group s by s.StudentSubjects into g
select new StudentSubjectsCounts
{
Name = g.Key.SubjectName,
Count = g.Count(),
};
but if you don't want to, or if StudentSubjects is a collection, flatten it with a second from clause and group on the subject name
var list = from s in students
from ss in s.StudentSubjects
group s by ss.SubjectName into g
select new StudentSubjectsCounts
{
Name = g.Key,
Count = g.Count(),
};

LINQ Query for GroupBy and Max in a Single Query

I have the following LINQ query, and I want to modify it so that it groups by staffId and picks only the record with the latest ObservationDate for each staffId.
from ob in db.TDTObservations.OfType<TDTSpeedObservation>()
select new
{
Id = ob.ID,
AcademicYearId = ob.Teachers.FirstOrDefault().Classes.FirstOrDefault().AcademicYearID,
observationDate = ob.ObservationDate,
schoolId = ob.Teachers.FirstOrDefault().Classes.FirstOrDefault().SchoolID,
staffId=ob.Teachers.FirstOrDefault().ID
};
var observations =
from ob in db.TDTObservations.OfType<TDTSpeedObservation>()
select new {
Id = ob.ID,
AcademicYearId = ob.Teachers.FirstOrDefault().Classes.FirstOrDefault().AcademicYearID,
observationDate = ob.ObservationDate,
schoolId = ob.Teachers.FirstOrDefault().Classes.FirstOrDefault().SchoolID,
staffId=ob.Teachers.FirstOrDefault().ID
};
var result = from o in observations
group o by o.staffId into g
select g.OrderByDescending(x => x.observationDate).First();
What about this? You first group the observations by their teacher's ID, and then from each group (grp) you pick the one with the latest ObservationDate:
var observations = from d in db.TDTObservations.OfType<TDTSpeedObservation>()
group d by d.Teachers.FirstOrDefault().ID into grp
select grp.OrderByDescending(g => g.ObservationDate).FirstOrDefault();
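The "latest record per group" pattern used in both answers can be tried in memory with the sketch below. The flat `Observation` record and the `LatestObservations` helper are hypothetical simplifications of the entity model in the question, and this runs as LINQ to Objects; how a database provider translates the same query is not covered here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened observation; the real entities are not shown in full.
public record Observation(int Id, int StaffId, DateTime ObservationDate);

public static class LatestObservations
{
    // For each staff member, keep only the observation with the latest date.
    public static List<Observation> LatestPerStaff(IEnumerable<Observation> observations) =>
        observations
            .GroupBy(o => o.StaffId)
            .Select(g => g.OrderByDescending(o => o.ObservationDate).First())
            .ToList();
}

public static class LatestObservationDemo
{
    public static void Main()
    {
        var observations = new[]
        {
            new Observation(1, 7, new DateTime(2020, 1, 10)),
            new Observation(2, 7, new DateTime(2020, 3, 5)),
            new Observation(3, 8, new DateTime(2019, 12, 1)),
        };

        foreach (var o in LatestObservations.LatestPerStaff(observations))
            Console.WriteLine($"staff {o.StaffId}: observation {o.Id} on {o.ObservationDate:yyyy-MM-dd}");
    }
}
```

First() is safe here because a group produced by GroupBy always contains at least one element.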
