Using a variable in a LINQ group by clause - C#

Let's say I have a table with multiple columns (student, teacher, subject, marks), and I want to compare each of these columns against another table with the same columns (with sum(marks)).
I have the following method, which takes a column name as an argument and uses it in the group by clause.
public List<AllMarks> RIMarks(string filter)
{
var MarkTable = from p in MainDB.classTable
let fil = filter
group p by new { fil } into g
select new AllMarks
{
Column = g.Key.fil,
Marks = g.Sum(f => f.Mark)
};
List<AllMarks> lstRI = MarkTable.ToList();
return lstRI;
}
public void Test()
{
var filter = new string[] {"Student", "Teacher", "Subject"};
foreach (var f in filter)
{
// Call RIMarks(f) and do something
}
}
My table contains a number of distinct students. However, with this method I get just a single row containing sum(marks) over all students for the first filter criterion (Student); it is not actually grouping by student.
How do I use a local variable in a LINQ group by clause?
Update:
Sample DB:
Student Teacher Subject Marks
stu1 teac1 sub1 23
stu1 teac1 sub1 45
stu2 teac2 sub2 34

You're trying to group by the name of a particular property of your data object.
Instead, use reflection to look up the property with that name, then group by that property's values.
You can't just supply the name: the key is then the same constant string for every row, so the whole collection ends up in a single group.
Test this out in LINQPad.
The code below sums the marks in each group; I'm not sure whether you wanted the average instead, but
what is here should lead you to it.
void Main()
{
var mySchool = new List<School>{
new School{Student = "Student A", Teacher = "Teacher A", Subject = "Math", Marks = 80},
new School{Student = "Student B", Teacher = "Teacher A", Subject = "Math", Marks = 65},
new School{Student = "Student C", Teacher = "Teacher A", Subject = "Math", Marks = 95},
new School{Student = "Student A", Teacher = "Teacher B", Subject = "History", Marks = 80},
new School{Student = "Student B", Teacher = "Teacher B", Subject = "History", Marks = 100},
};
GroupByFilter("Student", mySchool);
GroupByFilter("Teacher", mySchool);
GroupByFilter("Subject", mySchool);
}
public void GroupByFilter(string filter, List<School> school)
{
PropertyInfo prop = typeof(School).GetProperties()
.Where(x => x.Name == filter)
.First();
var grouping = from s in school
group s by new {filter = prop.GetValue(s)} into gr
select new {
Filter = gr.Key.filter,
Marks = gr.Sum(x => x.Marks)
};
grouping.Dump(); // this is linqpad specific
}
// Define other methods and classes here
public class School{
public string Student {get;set;}
public string Teacher {get;set;}
public string Subject {get;set;}
public int Marks {get;set;}
}
Results

Group By Student
Filter       Marks
Student A    160
Student B    165
Student C    95

Group By Teacher
Filter       Marks
Teacher A    240
Teacher B    180

Group By Subject
Filter       Marks
Math         240
History      180
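Reflection inside the key selector works for in-memory lists. If the source is an IQueryable (as MainDB.classTable in the question may be), one option is to build the key selector as an expression tree instead. The following is only a sketch under stated assumptions: the School and AllMarks shapes are guessed from the code above, DynamicGroupBy, KeySelector and SumMarksBy are hypothetical names, and whether a given query provider can translate the resulting query is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Assumed shapes, mirroring the classes used in this thread.
public class School
{
    public string Student { get; set; }
    public string Teacher { get; set; }
    public string Subject { get; set; }
    public int Marks { get; set; }
}

public class AllMarks
{
    public string Column { get; set; }
    public int Marks { get; set; }
}

public static class DynamicGroupBy
{
    // Builds the expression s => s.<propertyName> for a string-typed property.
    // Expression.Property throws ArgumentException if the property does not exist.
    public static Expression<Func<T, string>> KeySelector<T>(string propertyName)
    {
        ParameterExpression s = Expression.Parameter(typeof(T), "s");
        MemberExpression property = Expression.Property(s, propertyName);
        return Expression.Lambda<Func<T, string>>(property, s);
    }

    // Groups by the named column and sums Marks per group.
    public static List<AllMarks> SumMarksBy(IQueryable<School> source, string propertyName)
    {
        return source
            .GroupBy(KeySelector<School>(propertyName))
            .Select(g => new AllMarks { Column = g.Key, Marks = g.Sum(x => x.Marks) })
            .ToList();
    }
}
```

For an in-memory list, call it as DynamicGroupBy.SumMarksBy(mySchool.AsQueryable(), "Student").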

Please try the code below.
public List<AllMarks> RIMarks(string filter)
{
    // Pick the grouping column explicitly for each supported filter value.
    IEnumerable<AllMarks> markTable;
    if (filter == "Student")
    {
        markTable = from p in MainDB.classTable
                    group p by new { p.Student } into g
                    select new AllMarks
                    {
                        Column = g.Key.Student,
                        Marks = g.Sum(f => f.Mark)
                    };
    }
    else if (filter == "Teacher")
    {
        markTable = from p in MainDB.classTable
                    group p by new { p.Teacher } into g
                    select new AllMarks
                    {
                        Column = g.Key.Teacher,
                        Marks = g.Sum(f => f.Mark)
                    };
    }
    else if (filter == "Subject")
    {
        markTable = from p in MainDB.classTable
                    group p by new { p.Subject } into g
                    select new AllMarks
                    {
                        Column = g.Key.Subject,
                        Marks = g.Sum(f => f.Mark)
                    };
    }
    else
    {
        // Fail loudly on a column name this method does not know about.
        throw new ArgumentException("Unsupported filter: " + filter, nameof(filter));
    }
    return markTable.ToList();
}

Related

Dynamic LINQ Sub Query

I have a LINQ query that I want to generate dynamically:
var groupData =
from l in data
group l by l.Field1 into field1Group
select new MenuItem()
{
Key = field1Group.Key,
Count = field1Group.Count(),
Items = (from k in field1Group
group k by k.Field2 into field2Group
select new MenuItem()
{
Key = field2Group.Key,
Count = field2Group.Count()
}).ToList()
};
The ultimate goal is to be able to dynamically group the data by any combination of fields with no limit on the nested queries.
I can get as far as the first level but I'm struggling with the nested sub queries:
string field1 = "Field1";
string field2 = "Field2";
var groupDataD =
data.
GroupBy(field1, "it").
Select("new ( it.Key, it.Count() as Count )");
Is this possible with chained dynamic LINQ? Or is there a better way to achieve this?
The following should work (though personally I would rather avoid using such code):
Follow this answer to add the following to ParseAggregate:
Expression ParseAggregate(Expression instance, Type elementType, string methodName, int errorPos)
{
// Change starts here
var originalIt = it;
var originalOuterIt = outerIt;
// Change ends here
outerIt = it;
ParameterExpression innerIt = Expression.Parameter(elementType, elementType.Name);
it = innerIt;
Expression[] args = ParseArgumentList();
// Change starts here
it = originalIt;
outerIt = originalOuterIt;
// Change ends here
...
}
Add Select, GroupBy, and ToList to IEnumerableSignatures, and add the respective conditions in ParseAggregate, as explained in this answer:
interface IEnumerableSignatures
{
...
void GroupBy(object selector);
void Select(object selector);
void ToList();
...
}
Expression ParseAggregate(Expression instance, Type elementType, string methodName, int errorPos)
{
...
if (signature.Name == "Min" ||
signature.Name == "Max" ||
signature.Name == "GroupBy" ||
signature.Name == "Select")
...
}
Finally, your query would be:
string field1 = "Field1";
string field2 = "Field2";
var result =
data
.GroupBy(field1, "it")
.Select($@"new (
it.Key,
it.Count() as Count,
it.GroupBy({field2})
.Select(new (it.Key, it.Count() as Count))
.ToList() as Items
)");
Note that "it" refers to a different instance in the parent query than in the subquery. I tried to use "outerIt" to get around this conflation, but unfortunately without success (maybe you will succeed; maybe 1, 2 would help).
A simple example for future reference:
public class Person
{
public string State { get; set; }
public int Age { get; set; }
}
public static void Main()
{
var persons = new List<Person>
{
new Person { State = "CA", Age = 20 },
new Person { State = "CA", Age = 20 },
new Person { State = "CA", Age = 30 },
new Person { State = "WA", Age = 60 },
new Person { State = "WA", Age = 70 },
};
var result = persons
.GroupBy("State", "it")
.Select(@"new (
it.Key,
it.Count() as Count,
it.GroupBy(Age)
.Select(new (it.Key, it.Count() as Count))
.ToList() as Items
)");
foreach (dynamic group in result)
{
Console.WriteLine($"Group.Key: {group.Key}");
foreach (dynamic subGroup in group.Items)
{
Console.WriteLine($"SubGroup.Key: {subGroup.Key}");
Console.WriteLine($"SubGroup.Count: {subGroup.Count}");
}
}
}
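If modifying the Dynamic LINQ parser is not appealing, one alternative (not taken from the answer above) is plain recursion over a list of key selectors. This is a minimal sketch for in-memory data, assuming a MenuItem shape like the one in the question; NestedGrouping and GroupByMany are hypothetical names, and Person is the sample class from the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MenuItem
{
    public object Key { get; set; }
    public int Count { get; set; }
    public List<MenuItem> Items { get; set; }
}

public class Person
{
    public string State { get; set; }
    public int Age { get; set; }
}

public static class NestedGrouping
{
    // Groups by the first selector, then recurses into each group with the
    // remaining selectors. Leaf levels get an empty Items list.
    public static List<MenuItem> GroupByMany<T>(
        IEnumerable<T> source, params Func<T, object>[] keySelectors)
    {
        if (keySelectors.Length == 0)
        {
            return new List<MenuItem>();
        }

        Func<T, object> first = keySelectors[0];
        Func<T, object>[] rest = keySelectors.Skip(1).ToArray();

        return source
            .GroupBy(first)
            .Select(g => new MenuItem
            {
                Key = g.Key,
                Count = g.Count(),
                Items = GroupByMany(g, rest)
            })
            .ToList();
    }
}
```

Usage would look like NestedGrouping.GroupByMany(persons, p => p.State, p => p.Age). The selectors could also be looked up by property name if the fields are only known as strings at run time.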

LINQ left join, group by and Count generates wrong result

I'm struggling with linq (left join - group - count). Please help me.
Below is my code and it gives me this result.
Geography 2
Economy 1
Biology 1
I'm expecting this...
Geography 2
Economy 1
Biology 0
How can I fix it?
class Department
{
public int DNO { get; set; }
public string DeptName { get; set; }
}
class Student
{
public string Name { get; set; }
public int DNO { get; set; }
}
class Program
{
static void Main(string[] args)
{
List<Department> departments = new List<Department>
{
new Department {DNO=1, DeptName="Geography"},
new Department {DNO=2, DeptName="Economy"},
new Department {DNO=3, DeptName="Biology"}
};
List<Student> students = new List<Student>
{
new Student {Name="Peter", DNO=2},
new Student {Name="Paul", DNO=1},
new Student {Name="Mary", DNO=1},
};
var query = from dp in departments
join st in students on dp.DNO equals st.DNO into gst
from st2 in gst.DefaultIfEmpty()
group st2 by dp.DeptName into g
select new
{
DName = g.Key,
Count = g.Count()
};
foreach (var st in query)
{
Console.WriteLine("{0} \t{1}", st.DName, st.Count);
}
}
}
var query =
from department in departments
join student in students on department.DNO equals student.DNO into gst
select new
{
DepartmentName = department.DeptName,
Count = gst.Count()
};
I don't think any grouping is required for answering your question.
You only want to know 2 things:
- name of department
- number of students per department
By using the 'join' and 'into' you're putting the results of the join in the temp identifier gst. You only have to count the number of results in the gst.
var query = from dp in departments
from st in students.Where(stud => stud.DNO == dp.DNO).DefaultIfEmpty()
group st by dp.DeptName into g
select new
{
DName = g.Key,
Count = g.Count(x => x!=null)
};
You want to group the students by department name, but the count should exclude null students. I did change the join syntax slightly, although that really does not matter too much.
Here is a working fiddle
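The reason Biology shows 1 in the original query is that DefaultIfEmpty() turns an empty sequence into a sequence holding one default (null) element, and a plain Count() counts that placeholder. A tiny sketch of just that behaviour; DefaultIfEmptyDemo and its method names are made up for illustration:

```csharp
using System;
using System.Linq;

public static class DefaultIfEmptyDemo
{
    // Counts every element after DefaultIfEmpty, including the null placeholder.
    public static int CountAll(string[] students) =>
        students.DefaultIfEmpty().Count();

    // Counts only real (non-null) elements.
    public static int CountNonNull(string[] students) =>
        students.DefaultIfEmpty().Count(s => s != null);
}
```

For a department with no students, CountAll gives 1 and CountNonNull gives 0; for a non-empty array both give the array length.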
Well, see what @Danny said in his answer; it's the best and cleanest fix for this case. By the way, you could also rewrite it in lambda syntax:
var query = departments.GroupJoin(students,
dp => dp.DNO, st => st.DNO,
(dept,studs) => new
{
DName = dept.DeptName,
Count = studs.Count()
});
I find this syntax much more predictable in results, and often, shorter.
BTW: .GroupJoin is effectively a "left join", and .Join is "inner join". Be careful to not mistake one for another.
And my answer is similar to @Igor's:
var query = from dp in departments
join st in students on dp.DNO equals st.DNO into gst
from st2 in gst.DefaultIfEmpty()
group st2 by dp.DeptName into g
select new
{
DName = g.Key,
Count = g.Count(std => std != null)
};
g.Count(std => std != null) is the only change you need to make.

Group by linq for nested objects

I am writing a LINQ group by statement that converts a single flat list of data into a list with a nested list. Here is my code so far:
[TestMethod]
public void LinqTestNestedSelect2()
{
// initialization
List<combi> listToLinq = new List<combi>() {
new combi{ id = 1, desc = "a", name = "A", count = 1 },
new combi{ id = 1, desc = "b", name = "A", count = 2 },
new combi{ id = 2, desc = "c", name = "B", count = 3 },
new combi{id = 2, desc = "d", name = "B", count = 4 },
};
// linq group by
var result = (from row in listToLinq
group new { des = row.desc, count = row.count } by new { name = row.name, id = row.id } into obj
select new A { name = obj.Key.name, id = obj.Key.id, descriptions = (from r in obj select new B() { des = r.des, count = r.count }).ToList() }).ToList();
// validation of the results
Assert.AreEqual(2, result.Count);
Assert.AreEqual(2, result[0].descriptions.Count);
Assert.AreEqual(2, result[0].descriptions.Count);
Assert.AreEqual(2, result[1].descriptions.Count);
Assert.AreEqual(2, result[1].descriptions.Count);
}
public class A
{
public int id;
public string name;
public List<B> descriptions;
}
public class B
{
public int count;
public string des;
}
public class combi
{
public int id;
public string name;
public int count;
public string desc;
}
This is fine if the objects are small like the example. However I will implement this for objects with a lot more properties. How can I efficiently write this statement so I don't have to write field names twice in my linq statement?
I would like to return the objects in the statement and I want something like:
// not working wishfull thinking code
var result = (from row in listToLinq
group new { des = row.desc, count = row.count } by new { name = row.name, id = row.id } into obj
select new (A){ this = obj.key , descriptions = obj.ToList<B>()}).ToList();
Background: I am rewriting a web API that retrieves objects with nested objects in a single database call for the sake of DB performance. It's basically a big query with a join that retrieves a crap load of data, which I need to sort out into objects.
Probably important: the ID is unique.
EDIT:
Based on the answers so far, I have made a solution that sort of works for me, but it is still a bit ugly and I would like it to look better.
{
// start part
return (from row in reader.AsEnumerable()
group row by row.id into grouping
select CreateA(grouping)).ToList();
}
private static A CreateA(IGrouping<object, listToLinq> grouping)
{
A retVal = StaticCreateAFunction(grouping.First());
retVal.descriptions = grouping.Select(item => StaticCreateBFunction(item)).ToList();
return retVal;
}
I hope the StaticCreateAFunction is obvious enough for what it does. In this scenario I only have to write out each property once, which is what I really wanted. But I hope there is a more clever or linq-ish way to write this.
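// Note: grouping by "new A { ... }" only works if A overrides Equals and GetHashCode;
// otherwise every row lands in its own group.
var result = (from row in listToLinq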
group new B { des = row.desc, count = row.count } by new A { name = row.name, id = row.id } into obj
select new A { name = obj.Key.name, id = obj.Key.id, descriptions = obj.ToList() }).ToList();
You can add to each of the A and B classes a constructor that receives a combi and then it takes from it only what it needs. For example for a:
public class A
{
public A(combi c)
{
id = c.id;
name = c.name;
}
}
public class B
{
public B(combi c)
{
count = c.count;
des = c.desc;
}
}
Then your query can look like:
var result = (from row in listToLinq
group row by new { row.id, row.name } into grouping
select new A(grouping.First())
{
descriptions = grouping.Select(item => new B(item)).ToList()
}).ToList();
If you don't like the grouping.First() call, you can override Equals and GetHashCode on A, group by a new A built from the relevant fields (the ones used in Equals), and then add a copy constructor to A.
Another way, which decouples the A/B classes from combi, is to extract the conversion logic into a collection of static methods.
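The static-method variant could look roughly like this. It is a sketch only: CombiMapper, ToA, ToB and ToNested are hypothetical names, the classes are copied from the question, and grouping by id alone relies on the question's statement that the ID is unique.

```csharp
using System.Collections.Generic;
using System.Linq;

public class A
{
    public int id;
    public string name;
    public List<B> descriptions;
}

public class B
{
    public int count;
    public string des;
}

public class combi
{
    public int id;
    public string name;
    public int count;
    public string desc;
}

public static class CombiMapper
{
    // Each property mapping is written exactly once, outside A and B.
    public static A ToA(combi c) => new A { id = c.id, name = c.name };

    public static B ToB(combi c) => new B { count = c.count, des = c.desc };

    // Groups the flat rows by id and builds one A per group with its nested Bs.
    public static List<A> ToNested(IEnumerable<combi> rows)
    {
        return rows
            .GroupBy(row => row.id)
            .Select(g =>
            {
                A a = ToA(g.First());
                a.descriptions = g.Select(row => ToB(row)).ToList();
                return a;
            })
            .ToList();
    }
}
```

With this, the query in the test method shrinks to CombiMapper.ToNested(listToLinq).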

Get a list of duplicate entries in List based on 2 fields using Linq

I am using the following LINQ during some custom validation to determine whether my list contains any invalid entries. I want to know whether any Persons have been assigned the same Number within the Company they work for. This works fine:
var duplicates = Persons.GroupBy(x =>
new { x.Number, x.CompanyId}, (key) => new { key.Number, key.CompanyId })
.Where(y => y.Count() > 1);
For a simple class of Person:
class Person
{
public string Name { get; set; }
public int Number { get; set; }
public int CompanyId { get; set; }
}
So build some test data:
List<Person> Persons = new List<Person>();
// add people (users would do this!)
Persons.Add(new Person() { Name = "Person 1", Number = 1, CompanyId = 1 }); // invalid
Persons.Add(new Person() { Name = "Person 2", Number = 2, CompanyId = 1 });
Persons.Add(new Person() { Name = "Person 3", Number = 3, CompanyId = 1 });
Persons.Add(new Person() { Name = "Person 4", Number = 1, CompanyId = 1 }); // invalid
Persons.Add(new Person() { Name = "Person 5", Number = 2, CompanyId = 2 }); // invalid
Persons.Add(new Person() { Name = "Person 6", Number = 2, CompanyId = 2 }); // invalid
Check if any duplicates and handle:
var duplicates = Persons.GroupBy(x =>
new { x.Number, x.CompanyId}, (key) => new { key.Number, key.CompanyId })
.Where(y => y.Count() > 1);
if (duplicates.Any())
{
// build a string
}
What I want to do is to get a list of the invalid entries and inform the user. So in the above case, I would want to output the following text:
Person 1 and Person 4 have been assigned the same Number #1 for Company #1.
Person 5 and Person 6 have been assigned the same Number #2 for Company #2.
Change your GroupBy to select the name as the element and group by your key. String.Join will then merge the names.
var duplicates = Persons
.GroupBy(key => new { key.Number, key.CompanyId }, a=>a.Name)
.Where(y => y.Count() > 1);
var sb = new StringBuilder();
foreach (var duplicate in duplicates)
{
sb.AppendLine(String.Format("{0} have been assigned the same Number #{1} for Company #{2}.",
String.Join(" and ", duplicate), duplicate.Key.Number,
duplicate.Key.CompanyId));
}
var message = sb.ToString();
Now check if message is empty to know if you have duplicates instead of your Any() statement.
I find it easier to write grouping LINQ queries correctly using the query syntax rather than the fluent syntax. Here's a query that'll get you the strings you want:
from p in Persons
group p by new { p.Number, p.CompanyId } into g
where g.Count() > 1
select string.Format(
"{0} have been assigned the same Number #{1} for Company #{2}.",
string.Join(" and ", g.Select (x => x.Name)),
g.Key.Number,
g.Key.CompanyId);
Note that this query won't work as a LINQ-to-SQL/Entities query; it will only work against in-memory data.

Extensible relational division in LINQ

In this example class IcdPatient represents a many-to-many relationship between a Patient table (not shown in this example) and a lookup table Icd.
public class IcdPatient
{
public int PatientId { get; set; }
public int ConditionCode { get; set; }
public static List<IcdPatient> GetIcdPatientList()
{
return new List<IcdPatient>()
{
new IcdPatient { PatientId = 100, ConditionCode = 111 },
new IcdPatient { PatientId = 100, ConditionCode = 222 },
new IcdPatient { PatientId = 200, ConditionCode = 111 },
new IcdPatient { PatientId = 200, ConditionCode = 222 },
new IcdPatient { PatientId = 3, ConditionCode = 222 },
};
}
}
public class Icd
{
public int ConditionCode { get; set; }
public string ConditionName { get; set; }
public static List<Icd> GetIcdList()
{
return new List<Icd>()
{
new Icd() { ConditionCode =111, ConditionName ="Condition 1"},
new Icd() { ConditionCode =222, ConditionName ="Condition 2"},
};
}
}
I would like for the user to be able to enter as many conditions as they want, and get a LINQ object back that tells them how many PatientIds satisfy that query. I've come up with:
List<string> stringFilteredList = new List<string> { "Condition 1", "Condition 2" };
List<int> filteringList = new List<int> { 111,222 };
var manyToMany = IcdPatient.GetIcdPatientList();
var icdList = Icd.GetIcdList();
/*Working method without joining on the lookup table*/
var grouped = from m in manyToMany
group m by m.PatientId into g
where g.Count() == filteringList.Distinct().Count()
select new
{
PatientId = g.Key,
Count = g.Count()
};
/*End*/
foreach (var item in grouped)
{
Console.WriteLine(item.PatientId);
}
Let's say that IcdPatient has a composite primary key on both fields, so we know that each row is unique. If we count the distinct entries in filteringList and compare that with the number of times a PatientId shows up, a match means we've found a patient who has all the conditions. Because the codes can be esoteric, I would like to let the user type in the ConditionName from type Icd and perform the same operation. I've not used LINQ this way a lot, and this is what I've pieced together:
List<int> filteringList = new List<int> { 111,222 };
List<string> stringFilteredList= new List<string>{"Condition 1","Condition 2" };
filteringList.Distinct();
var manyToMany = IcdPatient.GetIcdPatientList();
var icdList = Icd.GetIcdList();
/*Working method without joining on the lookup table*/
var grouped = from m in manyToMany
join i in icdList on
m.ConditionCode equals i.ConditionCode
//group m by m.PatientId into g
group new {m,i} by new { m.ConditionCode }into g
where g.Count() == filteringList.Distinct().Count()
select new
{
Condition = g.Key.ConditionCode
};
/*End*/
but can't get anything to work. This is essentially a join on top of my first query, but I'm not getting what I need to group on.
You don't need to group anything in this case; just use a join and a Contains:
List<string> stringFilteredList= new List<string>{"Condition 1","Condition 2" };
var patients =
from icd in Icd.GetIcdList()
join patient in IcdPatient.GetIcdPatientList() on icd.ConditionCode equals patient.ConditionCode
where stringFilteredList.Contains(icd.ConditionName)
select patient.PatientId;
Let's say that IcdPatient has a composite primary key on both fields, so we know that each row is unique. If we count the distinct entries in filteringList and compare that with the number of times a PatientId shows up, a match means we've found a patient who has all the conditions. Because the codes can be esoteric, I would like to let the user type in the ConditionName from type Icd and perform the same operation.
I believe you're asking:
Given a list of ConditionCodes, return a list of PatientIds where every patient has every condition in the list.
In that case, the easiest thing to do is group your IcdPatient rows by PatientId, so that we can see every condition a patient has in one look. Then we check that every ConditionCode we're looking for is in the group. In code, that looks like:
var result = IcdPatient.GetIcdPatientList()
// group up all the objects with the same PatientId
.GroupBy(patient => patient.PatientId)
// gather the information we care about into a single object of type {int, IEnumerable<int>}
.Select(patients => new {Id = patients.Key,
Conditions = patients.Select(p => p.ConditionCode)})
// get rid of the patients without every condition:
// every code we are looking for must appear among the patient's conditions
.Where(conditionsByPatient =>
filteringList.All(condition => conditionsByPatient.Conditions.Contains(condition)))
.Select(conditionsByPatient => conditionsByPatient.Id);
In query format, that looks like:
var groupedInfo = from patient in IcdPatient.GetIcdPatientList()
group patient by patient.PatientId
into patients
select new { Id = patients.Key,
Conditions = patients.Select(patient => patient.ConditionCode) };
var resultAlt = from g in groupedInfo
where filteringList.All(condition => g.Conditions.Contains(condition))
select g.Id;
Edit: If you'd also like to let your user specify the ConditionName rather than the ConditionCode, simply convert from one to the other and store the result in filteringList, like so:
var conditionNames = // some list of names from the user
var filteringList = Icd.GetIcdList().Where(icd => conditionNames.Contains(icd.ConditionName))
.Select(icd => icd.ConditionCode);
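Putting the two steps together, here is a self-contained sketch of the "patients who have every requested condition" check, using a set-containment test per patient. The Icd and IcdPatient classes are trimmed copies of the ones in the question; RelationalDivision, CodesForNames and PatientsWithAll are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class IcdPatient
{
    public int PatientId { get; set; }
    public int ConditionCode { get; set; }
}

public class Icd
{
    public int ConditionCode { get; set; }
    public string ConditionName { get; set; }
}

public static class RelationalDivision
{
    // Translates user-entered condition names into condition codes.
    public static List<int> CodesForNames(IEnumerable<Icd> lookup, IEnumerable<string> names)
    {
        var wanted = new HashSet<string>(names);
        return lookup
            .Where(icd => wanted.Contains(icd.ConditionName))
            .Select(icd => icd.ConditionCode)
            .ToList();
    }

    // Returns the ids of patients whose conditions include every required code.
    // Note: an empty requiredCodes list matches every patient.
    public static List<int> PatientsWithAll(IEnumerable<IcdPatient> rows, IEnumerable<int> requiredCodes)
    {
        var required = new HashSet<int>(requiredCodes);
        return rows
            .GroupBy(row => row.PatientId)
            .Where(g => required.IsSubsetOf(g.Select(row => row.ConditionCode)))
            .Select(g => g.Key)
            .ToList();
    }
}
```

With the sample data from the question, asking for "Condition 1" and "Condition 2" keeps patients 100 and 200 and drops patient 3, who only has code 222.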
