I want to be able to execute the following query:
select distinct studentname, subject, grade from studenttable;
The student table has the following fields:
studentid
studentname
subject
grade
The linq query I have now is:
var students=dc.Select(s => new {s.studentname, s.subject, s.grade}).Distinct();
dc is the data context.
This query works, but how do I get the student id while still satisfying the distinct condition on the studentname, subject, grade set?
The issue here is that you've projected just the three properties you need. That is enough to pass through Distinct, but the projected values have no link back to the original rows, and adding such a link (the studentid) would break the default implementation of Distinct.
What you CAN do, however, is use the overload of Distinct that takes an IEqualityComparer. This lets you compare for equality only on the desired fields while running Distinct over the entire collection.
var students = dc
    .AsEnumerable() // If dc is a LINQ to SQL source, this makes the rest of the query execute in memory
    .Distinct(new SpecialtyComparer());
//...
public class SpecialtyComparer : IEqualityComparer<StudentTable>
{
    public int GetHashCode(StudentTable s)
    {
        // Combine the field hashes with XOR (&& is a boolean operator and does not compile on ints)
        return s.studentname.GetHashCode()
            ^ s.subject.GetHashCode()
            ^ s.grade.GetHashCode();
    }
    public bool Equals(StudentTable s1, StudentTable s2)
    {
        return s1.studentname.Equals(s2.studentname)
            && s1.subject.Equals(s2.subject)
            && s1.grade.Equals(s2.grade);
    }
}
I believe your design is broken, but I'll answer your specific question....
I'm assuming you're trying to group by name, subject and grade, and retrieve the first representative student of each group.
In this case, you can group by Tuples. A tuple will give you an Equals and GetHashCode method for free, so can be used in Group operations.
IEnumerable<Student> distinctStudents = students
.AsEnumerable()
.GroupBy(s => Tuple.Create
(
s.studentname,
s.subject,
s.grade
)
)
.Select(g => g.First()); // Returns the first student of each group
Here is a dot net fiddle example: https://dotnetfiddle.net/7K13DJ
When you call Distinct() on a list of objects, you are collapsing those rows into a smaller number of rows and discarding any duplicates, so your result will no longer have the studentid. To preserve the studentid property, you need to use a GroupBy. This returns your key (a tuple of studentname, subject, grade) together with the list of original rows. You can see an example in this question here: (GroupBy and count the unique elements in a List).
You will have a list of studentids to choose from, since there might be many rows with the same studentname, subject, and grade. (Otherwise you would not be doing a distinct: the rows would already be unique, and the tuple { studentname, subject, grade } would be a natural key.) So the question you may need to ask yourself is "Which studentid do I need?"
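To make the "which studentid?" choice concrete, here is a minimal in-memory sketch. The sample rows and the names AllIds and FirstId are made up for illustration; against a real data context you would likely need AsEnumerable() first, as in the answers above.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory rows standing in for studenttable.
var rows = new[]
{
    new { studentid = 1, studentname = "Ann", subject = "Math", grade = "A" },
    new { studentid = 2, studentname = "Ann", subject = "Math", grade = "A" },
    new { studentid = 3, studentname = "Bob", subject = "Art",  grade = "B" },
};

// Group on the three "distinct" columns. Anonymous types compare by value,
// so rows 1 and 2 end up in the same group.
var distinctWithIds = rows
    .GroupBy(r => new { r.studentname, r.subject, r.grade })
    .Select(g => new
    {
        g.Key.studentname,
        g.Key.subject,
        g.Key.grade,
        AllIds = g.Select(r => r.studentid).ToList(), // every id behind this key
        FirstId = g.Min(r => r.studentid)             // one possible way to pick a single id
    })
    .ToList();

foreach (var d in distinctWithIds)
    Console.WriteLine($"{d.studentname} {d.subject} {d.grade}: {string.Join(",", d.AllIds)}");
```

Picking the minimum id is only one option; the maximum, or the whole list, would fit equally well depending on what the id is needed for.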
I have model, where properties are
Id, Name, Category, DateCreated, Rating
I have one working solution, but I need to rewrite it in fluent (method) syntax, without from.
return await ( from x in _context.Products
orderby x.DateCreated
group x.Category by x.Category into g
orderby g.Count() descending
select g.Key
).Take(3)
.Select(element => ((Categories)element).ToString())
.ToListAsync();
Could you help me?
So you have a sequence of Products, and you want to make groups of Products. Every group contains Products of one Category. This Category is the Key of the Group.
You don't want all groups; you only want the 3 groups that have the most Products.
You are also not interested in what is in the groups; you are only interested in the string representation of the Category of the Products in each group.
Whenever you want to make groups of items, but you want something more than just groups of items, consider using the overload of Queryable.GroupBy that has a resultSelector parameter:
var result = dbContext.Products.GroupBy(product => product.Category,
// parameter resultSelector: take any found Category, and all Products that have
// this Category to make one new:
(category, productsWithThisCategory) => new
{
Category = category,
Count = productsWithThisCategory.Count(),
})
// only keep the 3 groups that have the most Products
.OrderByDescending(productGroup => productGroup.Count)
.Take(3)
Well, you are not interested in the Count anymore, you only want the string representation of the Category.
.Select(productGroup => productGroup.Category.ToString());
IMHO it is not wise to convert your Category to a string. You will probably need more bytes to transfer the data to a local process, and if callers of your procedure (other software, not human operators) want to interpret the fetched data, they have to use relatively slow string comparisons. Besides, if you change a Category, the compiler won't complain, but your code will fail.
So consider returning the Category as the type it really is (an enum?). Only convert to strings when you need readable text (for operators, text files, etc.).
.Select(productGroup => productGroup.Category);
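Putting the pieces above together, the whole fluent chain can be sketched in memory as follows. The sample data is invented, and Category is a plain string here only to keep the sketch short; against a DbContext you would presumably end with ToListAsync() as in the question.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: 4 Books, 3 Toys, 2 Games, 1 Food.
var categories = new[]
{
    "Books", "Toys", "Books", "Games", "Toys",
    "Books", "Food", "Games", "Toys", "Books",
};
var products = categories.Select((c, i) => new { Id = i + 1, Category = c });

var topCategories = products
    // resultSelector overload: one { Category, Count } item per group
    .GroupBy(p => p.Category,
             (category, group) => new { Category = category, Count = group.Count() })
    // keep only the 3 groups with the most products
    .OrderByDescending(g => g.Count)
    .Take(3)
    // the count is no longer needed, only the category itself
    .Select(g => g.Category)
    .ToList();

Console.WriteLine(string.Join(", ", topCategories));
```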
Here is a query:
from order in db.tblCustomerBuys
where selectedProducts.Contains(order.ProductID)
select order.CustomerID;
selectedProducts is a list containing some target products IDs, for example it is { 1, 2, 3}.
The query above returns the CustomerIDs of customers who have bought any one of the selectedProducts. For example, if someone has bought product 1 or 2, their ID will be in the result.
But I need to collect the CustomerIDs of customers who have bought all of the products. For example, if someone has bought product 1 AND 2 AND 3, then they will be in the result.
How do I edit this query?
The tblCustomerBuys table looks like this:
CustomerID - ID of Customer
ProductID - the product which the customer has bought
something like this:
CustomerID   ProductID
----------------------
110          1
110          2
112          3
112          3
115          5
Updated:
Based on the answers, I should use grouping, and for certain reasons I need to use this type of query:
var ID = from order in db.tblCustomerBuys
group order by order.CustomerID into g
where (selectedProducts.All(selProdID => g.Select(order => order.ProductID).Contains(selProdID)))
select g.Key;
but it will give this error:
Local sequence cannot be used in LINQ to SQL implementations of query operators except the Contains operator.
The updated query is the general LINQ solution of the issue.
But your query provider does not support mixing in-memory sequences with database tables inside the query (other than Contains, which is translated to SQL IN (value_list)), so you need an equivalent alternative to the All method. One option is to count the (distinct) matches and compare that to the number of selected items.
If the { CustomerID, ProductID } combination is unique in tblCustomerBuys, then the query could be as follows:
var selectedCount = selectedProducts.Distinct().Count();
var customerIDs =
from order in db.tblCustomerBuys
group order by order.CustomerID into customerOrders
where customerOrders.Where(order => selectedProducts.Contains(order.ProductID))
.Count() == selectedCount
select customerOrders.Key;
And if it's not unique, use the following criteria:
where customerOrders.Where(order => selectedProducts.Contains(order.ProductID))
.Select(order => order.ProductID).Distinct().Count() == selectedCount
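As a quick sanity check of the counting approach, here is an in-memory sketch that runs the non-unique variant over the sample rows from the question. The tuple array is only a stand-in for db.tblCustomerBuys, and selectedProducts is assumed to be { 1, 2 }.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for tblCustomerBuys, using the rows shown in the question.
(int CustomerID, int ProductID)[] buys =
{
    (110, 1), (110, 2), (112, 3), (112, 3), (115, 5),
};
var selectedProducts = new List<int> { 1, 2 };
int selectedCount = selectedProducts.Distinct().Count();

var customerIDs = buys
    .GroupBy(b => b.CustomerID)
    // count the distinct selected products each customer bought
    .Where(g => g.Where(b => selectedProducts.Contains(b.ProductID))
                 .Select(b => b.ProductID)
                 .Distinct()
                 .Count() == selectedCount)
    .Select(g => g.Key)
    .ToList();

Console.WriteLine(string.Join(", ", customerIDs));
```

Only customer 110 bought both product 1 and product 2, so that is the only key that survives the filter.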
As your question is written, it is a bit difficult to understand your structure. If I have understood correctly, you have an enumerable selectedProducts, which contains several Ids. You also have an enumeration of order objects, which have two properties we care about, ProductId and CustomerId, which are integers.
In this case, this should do the job:
var result = db.tblCustomerBuys.GroupBy(order => order.CustomerID)
.Where(group => !selectedProducts.Except(group.Select(order => order.ProductID)).Any())
.Select(group => group.Key);
What we are doing here is grouping all the orders by CustomerID, so that we can treat each customer as a single value. Then we treat the product IDs in each group as a superset of selectedProducts, using a piece of LINQ trickery commonly used to check whether one enumeration is a subset of another. We filter the groups on that, and then select the CustomerID (the key) of each group that matches.
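The subset check on its own can be sketched with plain arrays. The values below are made up; the point is only that Except leaves nothing behind when every required item is present.

```csharp
using System;
using System.Linq;

int[] required = { 1, 2, 3 };
int[] boughtAll = { 3, 1, 2, 5 };  // contains every required id, plus an extra one
int[] boughtSome = { 1, 2 };       // product 3 is missing

// required is a subset of boughtAll when removing boughtAll leaves nothing.
bool allPresent = !required.Except(boughtAll).Any();

// Here Except leaves { 3 }, so something is missing.
bool someMissing = required.Except(boughtSome).Any();

Console.WriteLine(allPresent);
Console.WriteLine(someMissing);
```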
You can use the Any condition of LINQ.
Step 1: Create a list of int that stores all the required product IDs.
Step 2: Use the Any condition of LINQ to compare against that list.
List<int> selectedProducts = new List<int>() { 1, 2 }; // This list will contain the required product IDs
db.tblCustomerBuys.Where(o => selectedProducts.Any(p => p == o.ProductID)).Select(o => o.CustomerID); // This will return every CustomerID that bought product 1 or 2
I am struggling with converting MySQL query to linq syntax in C# (for use of Entity Framework). MySQL query looks like this:
SELECT *
FROM Availability as tableData
WHERE ID = (
SELECT Availability.ID
FROM Availability
WHERE Availability.FrameID = tableData.FrameID
ORDER BY Availability.Date DESC limit 1)
I don't know how to convert this part FROM table AS someName.
So far the only solution I have, is to execute raw SQL query such as:
dataContext.Availability.SqlQuery("SELECT * FROM Availability as tableData WHERE ID = (SELECT ID FROM Availability WHERE FrameID = tableData.FrameID ORDER BY Availability.Date DESC limit 1)").ToArray();
But it would be nice to know if linq can provide such a query.
Thanks in advance, for your answers!
If you need only latest record for every frame id, then use grouping:
dataContext.Availability
.GroupBy(a => a.FrameID)
.Select(g => g.OrderByDescending(a => a.Date).FirstOrDefault());
This query produces required result, though generated sql will be a little different. It will look like
SELECT /* limit1 fields */
FROM (
SELECT DISTINCT tableData.FrameID
FROM Availability as tableData) AS distinct1
OUTER APPLY (
SELECT TOP(1) /* project1 fields */
FROM (SELECT /* extent1 fields */
FROM Availability AS extent1
WHERE Availability.FrameID = distinct1.FrameID) AS project1
ORDER BY project1.Date DESC) AS limit1
NOTE: EF does not support the First() extension inside this projection; use FirstOrDefault() instead.
Take all the Availabilities, group by FrameId, order each group by date, take the first entry of each group.
The ToList() at the end fetches all the results and puts them in a List.
var tableDate = dataContext.Availability
.GroupBy(x => x.FrameId)
.Select(x => x.OrderByDescending(y => y.Date).FirstOrDefault())
.ToList();
Yes Linq can do this, but you need to have a starting sequence on which the linq should operate. Usually this sequence has the same type as your table, in your case Availability.
From your sql I gather that each record in the Availabilities table has at least properties Id, FrameId and Date:
class Availability
{
public int Id {get; set;}
public int FrameId {get; set;}
public DateTime Date {get; set;}
}
Of course this can also be an anonymous type. The important thing is that you somehow have a sequence of items with these properties:
IQueryable<Availability> availabilities = ...
You wrote:
I need only one record (with max Date of insert) for every FrameID
So every Availability has a FrameId, and you want for every FrameId the record with the highest Date value.
You could use Enumerable.GroupBy and group by FrameId
var groupsWithSameFrameId = availabilities.GroupBy(availability => availability.FrameId);
The result is a sequence of groups. Every group contains the sequence of all availabilities with the same FrameId. In other words: if you take a group, you'll have a group.Key with a FrameId value and a sequence of all availabilities that have this FrameId value.
We won't use the group.Key.
If you sort the elements in each group in descending order by Date and take the first element, you'll have the availability with the highest Date value
var recordWithMaxDateOfInsert = groupsWithSameFrameId
    .Select(group => group.OrderByDescending(groupElement => groupElement.Date)
        .FirstOrDefault()); // EF only allows First() as the final query operation
For every group, sort all elements of the group by descending Date value and take the first element of the sorted group.
Result: from your original availabilities, you have for every frameId the availability with the highest value for date.
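The same steps can be tried in memory with a handful of invented rows. This is only a sketch: the tuple array stands in for the Availability table, and since it runs as LINQ to Objects, First() is fine here.

```csharp
using System;
using System.Linq;

// Hypothetical rows standing in for the Availability table.
(int Id, int FrameId, DateTime Date)[] availabilities =
{
    (1, 10, new DateTime(2020, 1, 1)),
    (2, 10, new DateTime(2020, 3, 1)),
    (3, 20, new DateTime(2020, 2, 1)),
    (4, 20, new DateTime(2020, 1, 15)),
};

var latestPerFrame = availabilities
    .GroupBy(a => a.FrameId)
    // within each FrameId, newest first, then keep only the newest row
    .Select(g => g.OrderByDescending(a => a.Date).First())
    .ToList();

foreach (var a in latestPerFrame)
    Console.WriteLine($"Frame {a.FrameId}: Id {a.Id} on {a.Date:yyyy-MM-dd}");
```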
For a school project I need to filter students who have signed up for multiple courses at the same timeblock. Instead of querying the DB via procedures/views I want to use LINQ to filter it in memory for learning purposes.
Everything seems alright according to the debugger however the result of my linq query is 0 and I can't figure out how.
Here's the code:
foreach (Timeblock tb in ctx.Timeblocks)
{
List<Student> doublestudents = new List<Student>();
//Get the schedules matching the timeblock.
Schedule[] schedules = (from sched in ctx.Schedules
where sched.Timeblock.Id == tb.Id
select sched).ToArray();
// ^ Gives me 2 schedules matching that timeblock.
if (schedules.Count() > 1)
{
doublestudents = (from s in ctx.Students
where s.Courses.Contains(schedules[0].Course) && s.Courses.Contains(schedules[1].Course)
select s).ToList();
Console.WriteLine(doublestudents.Count); // <<< count results in 0 students.
}
}
While debugging it seems everything should work alright.
Each student has a List<Course> and each Course has a List<Student>
schedules[0].Course has Id 1
schedules[1].Course has Id 6
The student with Id 14 has both these courses in its list.
Still, the LINQ query does not return this student. Could this be because the courses are not the same references, so .Contains() won't find a match?
It's driving me totally crazy, since every way I try this it won't return any results even though there are matches...
You are comparing on Course which is a reference type. This means the objects are pointing to locations in memory rather than the actual values of the Course object itself, so you will never get a match because the courses of the student and the courses from the timeblock query are all held in different areas of memory.
You should instead use a value type for the comparison, like the course ID. Value types are the actual data itself so using something like int (for integer) will let the actual numerical values be compared. Two different int variables set to the same number will result in an equality.
You can also revise the comparison to accept any number of courses instead of just two so that it's much more flexible to use.
if (schedules.Count() > 1)
{
var scheduleCourseIds = schedules.Select(sch => sch.Course.Id).ToList();
doublestudents = (from s in ctx.Students
let studentCourseIds = s.Courses.Select(c => c.Id)
where !scheduleCourseIds.Except(studentCourseIds).Any()
select s).ToList();
Console.WriteLine(doublestudents.Count);
}
Some notes:
Compare the Course IDs (assuming these are unique and what you use to match them in the database) so that you're comparing value types and will get a match.
Use the let keyword in Linq to create temporary variables you can use in the query and make everything more readable.
Use the logic for one set containing all the elements of another set (found here) so you can have any number of duplicated courses to match against.
The problem is that your schedules[0].Course object and the objects in s.Courses, which come from the new query, are completely different instances.
You can use the element's key to evaluate your equality condition, as:
if (schedules.Count() > 1)
{
doublestudents = (from s in ctx.Students
where s.Courses.Any(x=> x.Key == schedules[0].Course.Key) && s.Courses.Any(x=> x.Key == schedules[1].Course.Key)
select s).ToList();
Console.WriteLine(doublestudents.Count);
}
In order to achieve this you will need to include
using System.Linq;
As you have guessed, this is probably related to reference equality. Here is a quick fix:
doublestudents =
(from s in ctx.Students
where s.Courses.Any(c => c.Id == schedules[0].Course.Id) &&
s.Courses.Any(c => c.Id == schedules[1].Course.Id)
select s).ToList();
Please note that I am assuming that the Course class has a property called Id which is the primary key. Replace it as needed.
Please note that this code assumes that there are two schedules. You need to work on the code to make it work for any number of schedules.
Another approach is to override the Equals and GetHashCode methods on the Course class so that objects of this type are compared based on their values (the values of their properties, possibly the ID property alone?).
I have a many to many table structure called PropertyPets. It contains a dual primary key consisting of a PropertyID (from a Property table) and one or more PetIDs (from a Pet table).
Next I have a search screen where people can multiple select pets from a jquery multiple select dropdown. Let's say somebody selects Dogs and Cats.
Now, I want to be able to return all properties that contain BOTH dogs and cats in the many to many table, PropertyPets. I'm trying to do this with Linq to Sql.
I've looked at the Contains clause, but it doesn't seem to work for my requirement:
var result = properties.Where(p => search.PetType.Contains(p.PropertyPets));
Here, search.PetType is an int[] array of the Id's for Dog and Cat (which were selected in the multiple select drop down). The problem is first, Contains requires a string not an IEnumerable of type PropertyPet. And second, I need to find the properties that have BOTH dogs and cats and not just simply containing one or the other.
Thank you for any pointers.
You can do this using a nested where clause.
You need to filter p.PropertyPets using Contains: return all rows where PetID is in search.PetType.
Then only return rows from properties where all search IDs have been found, e.g. number of rows >= number of search IDs.
All together:
var result = from p in properties
where p.PropertyPets.Where(c => search.PetType.Contains(c.PetID)).Count() >= search.PetType.Count()
select p;
The claim that Contains requires a string is not accurate: if search.PetType is an int[], Contains takes an int. That means you need to turn p.PropertyPets into ints. To convert p.PropertyPets to an IEnumerable<int>, select the PetID field: p.PropertyPets.Select(propertyPet => propertyPet.PetID). That gives you a whole sequence rather than the single int Contains needs. (.First() would give you one int, but would not solve your problem.)
What you really want to do is
var result = properties.Where(p =>
    search.PetType.Except(p.PropertyPets.Select(propertyPet =>
        propertyPet.PetID)).Count() == 0);
But Except is not available in LINQ2SQL.
The best option I can find is to apply Contains for each item in search.PetType.
Something like this:
var result = properties;
foreach(var petType in search.PetType)
{
    result = from p in result
             where p.PropertyPets.Select(propertyPet =>
                 propertyPet.PetID).Contains(petType)
             select p;
}
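To see how chaining one filter per selected pet narrows the result, here is a small in-memory sketch. The property data, the PetIDs array and the pet ids 1 and 2 are all invented for illustration; it says nothing about how LINQ to SQL translates the query.

```csharp
using System;
using System.Linq;

// Hypothetical data: each property with the pet ids linked to it.
var properties = new[]
{
    new { PropertyID = 1, PetIDs = new[] { 1, 2 } },
    new { PropertyID = 2, PetIDs = new[] { 1 } },
    new { PropertyID = 3, PetIDs = new[] { 2, 1, 3 } },
};
int[] selectedPets = { 1, 2 }; // e.g. the ids for Dog and Cat

// Start with everything and add one Where per selected pet,
// so a property must pass every filter to stay in the result.
var result = properties.AsEnumerable();
foreach (var petId in selectedPets)
{
    result = result.Where(p => p.PetIDs.Contains(petId));
}

var matches = result.Select(p => p.PropertyID).ToList();
Console.WriteLine(string.Join(", ", matches));
```

Property 2 drops out because it only has pet 1, while properties 1 and 3 have both selected pets.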