DISTINCT() and ORDERBY issue - c#

I am learning about LINQ-to-SQL and everything was going well until something strange happened:
I tried to make an example of Distinct, so, using the Northwind database, I wrote the following query:
var query =
from o in db.Orders
orderby o.CustomerID
select new
{
o.CustomerID
};
If I print the SQL generated by LINQ-to-SQL for the query stored in query it looks like this:
SELECT [t0].[CustomerID]
FROM [dbo].[Orders] AS [t0]
ORDER BY [t0].[CustomerID]
As expected, the query returns the CustomerID of every order in the Orders table, sorted alphabetically.
But! If I use the Distinct() method like this:
var query = (
from o in db.Orders
orderby o.CustomerID
select new
{
o.CustomerID
}).Distinct();
The query returns the distinct values as expected, but the CustomerIDs are not ordered, even though I wrote orderby o.CustomerID!
The SQL query for this second LINQ query is the following:
SELECT DISTINCT [t0].[CustomerID]
FROM [dbo].[Orders] AS [t0]
As we can see, the ORDER BY clause is missing. Why is that?
Why does the ORDER BY clause disappear when I use the Distinct() method?

From the Queryable.Distinct documentation:
The expected behavior is that it returns an unordered sequence of the unique items in source.
In other words, any order the existing IQueryable has is lost when you call Distinct() on it.
What you probably want is an OrderBy() applied after the Distinct():
var query = (from o in db.Orders
select new
{
o.CustomerID
}).Distinct().OrderBy(x => x.CustomerID);

Try rearranging the calls to place the OrderBy after the Distinct. You'll have to revert to method chaining:
db.Orders.Select(o => o.CustomerID).Distinct().OrderBy(id => id);
This would be the more efficient way to set up the query in LINQ to Objects (Enumerable) anyway, because the OrderBy then operates only on the unique items rather than on all of them. Also, according to MSDN, Enumerable.Distinct does not guarantee the order of the elements it returns, so ordering before deduplicating is pointless.
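As a rough in-memory illustration of the dedupe-then-sort order, here is a minimal LINQ to Objects sketch. The class name, method name, and sample IDs are hypothetical stand-ins for the CustomerID column, and the ordinal comparer is an assumption made only to keep the output deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctThenOrder
{
    // Dedupe first, then sort: the sort only touches the unique IDs.
    public static List<string> UniqueSortedIds(IEnumerable<string> customerIds) =>
        customerIds.Distinct()
                   .OrderBy(id => id, StringComparer.Ordinal) // ordinal comparer is an assumption
                   .ToList();

    public static void Main()
    {
        // Hypothetical stand-in for the CustomerID values of the Orders table.
        var ids = new[] { "VINET", "ALFKI", "VINET", "HANAR", "ALFKI" };
        Console.WriteLine(string.Join(", ", UniqueSortedIds(ids)));
        // prints "ALFKI, HANAR, VINET"
    }
}
```

With five input rows, the sort here only has to compare three values, which is the efficiency point made above.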

Because of the Distinct, the order of the returned list is not guaranteed. LINQ to SQL is smart enough to recognize this, and therefore drops the ORDER BY.
If you place the OrderBy after your Distinct, everything will work as you expect.
var query = (from o in db.Orders
select new
{
o.CustomerID
}).Distinct().OrderBy(o => o.CustomerID);
or
var query = db.Orders.Select(o => o.CustomerID).Distinct().OrderBy(o => o.CustomerID);
Please see this article for clarification:
http://programminglinq.com/blogs/marcorusso/archive/2008/07/20/use-of-distinct-and-orderby-in-linq.aspx

You can simulate ORDER BY and DISTINCT together with this construction, grouping by the key and ordering the groups by that key:
var distinctItems = employees.GroupBy(x => x.EmpID).OrderBy(g => g.Key).Select(g => g.First());
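A small LINQ to Objects sketch of the grouping approach may make it easier to see what comes out. The Employee type and its Name property are hypothetical; only EmpID appears in the answer. The groups are ordered by their key here, since a grouping object itself has no natural sort order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; only EmpID is taken from the text above.
public sealed class Employee
{
    public int EmpID { get; set; }
    public string Name { get; set; } = "";
}

public static class GroupDistinct
{
    // One employee per EmpID, returned in ascending EmpID order.
    public static List<Employee> DistinctById(IEnumerable<Employee> employees) =>
        employees.GroupBy(e => e.EmpID)
                 .OrderBy(g => g.Key)      // order the groups by their key
                 .Select(g => g.First())   // in memory: first occurrence in the source
                 .ToList();

    public static void Main()
    {
        var employees = new[]
        {
            new Employee { EmpID = 2, Name = "Bea" },
            new Employee { EmpID = 1, Name = "Al" },
            new Employee { EmpID = 2, Name = "Bea (duplicate)" },
        };
        foreach (var e in DistinctById(employees))
            Console.WriteLine($"{e.EmpID}: {e.Name}");
        // prints "1: Al" then "2: Bea"
    }
}
```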

Related

uses First/FirstOrDefault/Last/LastOrDefault operation without OrderBy and filter which may lead to unpredictable results

I have a linq query which gave me the warning but it still works. I want to get rid of the warning.
uses First/FirstOrDefault/Last/LastOrDefault operation without OrderBy and filter which may lead to unpredictable results.
The linq query is
var list = (from u in _db.user
join r in _db.resource on u.userId equals r.userId
join t in _db.team on u.bossId equals t.bossId
where r.pid == pid
select new MyDto
{
pid = pid,
userId = u.userId,
teamId = t.teamId,
name = t.name
}).GroupBy(d => d.userId).Select(x => x.First()).OrderBy(y => y.userId).ToList();
I use EntityFramework Core 2.1
UPDATE:
I changed the code as suggested in the comments.
var list = (from u in _db.user
join r in _db.resource on u.userId equals r.userId
join t in _db.team on u.bossId equals t.bossId
where r.pid == pid
select new MyDto
{
pid = pid,
userId = u.userId,
teamId = t.teamId,
name = t.name
})
.GroupBy(d => d.userId)
.Select(x => x.OrderBy(y => y.userId)
.First())
.ToList();
Then there is a different warning.
The LINQ expression 'GroupBy([user].userId, new MyDto() {pid =
Convert(_8_locals1_pid_2, Int16), userId = [user].UserId, .....) could
not be translated and will be evaluated locally.
We have this expression
.Select(x => x.First())
Which record will be first for that expression? There's no way to know, because at this point the OrderBy() clause that follows hasn't been processed yet. You could get different results each time you run the same query on the same data, depending on the order in which the database returned the records. The results are not predictable, exactly as the warning says.
But surely the database will return them in the same order each time? No, you can't assume that. The order of results in an SQL query is not defined unless there is an ORDER BY clause with the query. Most of the time you'll get primary key ordering (which does not have to match insert order!), but there are lots of things that can change this: matching a different index, JOIN to a table with a different order or different index, parallel execution with another query on the same table + round robin index walking, and much more.
To fix this, you must call OrderBy() before you can call First().
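A minimal in-memory sketch of "order before First()" could look like the following. It runs as LINQ to Objects and says nothing about how EF Core translates the query. The Row type is hypothetical, and using TeamId as the tie-breaker is an assumption: sorting a group by its own key (userId) would not help, because every row in the group shares that value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for the joined result.
public sealed class Row
{
    public int UserId { get; set; }
    public int TeamId { get; set; }
}

public static class FirstPerGroup
{
    // Sort inside each group before taking First(), so the chosen row
    // no longer depends on the order in which the rows happened to arrive.
    public static List<Row> LowestTeamPerUser(IEnumerable<Row> rows) =>
        rows.GroupBy(r => r.UserId)
            .Select(g => g.OrderBy(r => r.TeamId).First()) // TeamId tie-breaker is an assumption
            .OrderBy(r => r.UserId)
            .ToList();

    public static void Main()
    {
        var rows = new[]
        {
            new Row { UserId = 2, TeamId = 9 },
            new Row { UserId = 1, TeamId = 7 },
            new Row { UserId = 1, TeamId = 3 },
        };
        foreach (var r in LowestTeamPerUser(rows))
            Console.WriteLine($"{r.UserId} -> {r.TeamId}");
        // prints "1 -> 3" then "2 -> 9"
    }
}
```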
Looking a little deeper, this is not even part of the SQL. This work is happening on your client. That's not good, because any indexes on the table are no longer available. It should be possible to do all this work on the database server, but selecting the first record of a group may mean you need a lateral join/APPLY or row_number() windowing function, which are hard to reproduce with EF. To completely remove all warnings, you may have to write a raw SQL statement:
select userId, teamId, name, pid
from (
select u.userId, t.teamId, t.name, r.pid,
-- number the rows within each user; ordering by teamId is an assumed tie-breaker
row_number() over (partition by u.userId order by t.teamId) rn
from [User] u
inner join resource r on r.userId = u.userId
inner join team t on t.bossId = u.bossId
where r.pid = @pid
) d
where d.rn = 1
Looking around, it is possible to use row_number() in EF, but at this point I personally find the SQL much easier to work with. My view is ORMs don't help for these more complicated queries, because you still have to know the SQL you want, and you also have to know the intricacies of the ORM in order to build it. In other words, the tool that was supposed to make your job easier made it harder instead.

How to write a linq query to select data from two table?

Hello, this is my LINQ query:
var RoutineRemarks = (from i in _context.TableA.Include(a => a.pm_routine_report_type)
from j in _context.TableB.Include(a => a.PM_Evt_Cat).Include(b => b.department).Include(c => c.employees).Include(d => d.provncs)
orderby i.seen_by_executive_on descending
orderby j.English_seen_by_executive_on descending
// Here i face the problem, i want to select i+j
select i+j).ToList();
At the end it only lets me select either i or j, but I want to select both. How can I do that?
Try it this way:
select new {I=i, J=j}).ToList();
I also agree with @GertArnold: most likely your main query needs a join, but it is hard to tell what you need to do without knowing your ERD.

LINQ to Entities: Group then Order By

From what I've read, I can use LINQ to first group, then order each group by using "SelectMany", which is described here: How to group a IQueryable by property 1 but order by property 2?
But I guess this doesn't work for IQueryable.
We basically get a BusinessObject with a Main-Entity and an IEnumerable of Entities, so I'd like to order first by the Main-Entity sequence, then by the Name of each of the Subentities.
So I guess my query would look like this in LINQ to objects:
var qry = GetQueryFromSomeWhere();
qry = qry.OrderBy(f => f.MainEntity.SequenceNumber)
.ThenBy(f => f.SubEntities.SelectMany(f => f.Name));
I could order these Names in the Query-Service, but it should be up to the consumer to order the entities as needed.
Is there any way to make this work without loading all entities into memory?
If I understand correctly, you want to sort the records inside each group by Name. I think you could accomplish this by ordering the records before doing the group by. Try this code:
var q = from m in MainEntities
join s in SubEntities on m.Id equals s.MainId
orderby m.SequenceNumber, s.Name
group new { m, s } by m into grp
orderby grp.Key.SequenceNumber
select grp;
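To see what "order first, then group" produces, here is a hedged LINQ to Objects sketch. In memory, GroupBy keeps elements in source order within each group and yields groups in order of first appearance; whether a database provider preserves this is not something the sketch demonstrates. The tuple shapes and sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderedGroups
{
    // Sort the joined rows, then group; returns one "mainId: names" line per group.
    public static List<string> Describe(
        IEnumerable<(int Id, int SequenceNumber)> mains,
        IEnumerable<(int MainId, string Name)> subs)
    {
        var q = from m in mains
                join s in subs on m.Id equals s.MainId
                orderby m.SequenceNumber, s.Name
                group s.Name by m.Id into grp
                select grp.Key + ": " + string.Join(", ", grp);
        return q.ToList();
    }

    public static void Main()
    {
        // Hypothetical data: main entity 2 has the lower sequence number.
        var mains = new[] { (Id: 1, SequenceNumber: 2), (Id: 2, SequenceNumber: 1) };
        var subs = new[] { (MainId: 1, Name: "b"), (MainId: 1, Name: "a"),
                           (MainId: 2, Name: "z"), (MainId: 2, Name: "c") };
        foreach (var line in Describe(mains, subs))
            Console.WriteLine(line);
        // prints "2: c, z" then "1: a, b"
    }
}
```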

Using LINQ to select desired results between two related IEnumerable query objects

I think this is kind of a basic question but I'm getting confused. I have two objects, Orders and OrderTags. In the database, Orders has no relation to OrderTags, but OrderTags has a FK relation to Orders.
So I capture both objects in my context like so:
orders = context.Orders;
tags = context.OrderTags.Where(tag => tag.ID == myID);
Now I want to reduce the orders list to only be equal to the orders that exist in my tags list. Here is my best pseudocode of what I want to do:
orders = orders.Where(every order id exists somewhere in the tags list of order ids)
For clarification, each Tag object has a TagID and an OrderID. So I only want the orders that correspond to the tags I have looked up. Can anyone assist me with the syntax so I can get what I'm looking for?
Using a LINQ query:
var results = (from o in context.Orders
join t in context.OrderTags on o.OrderId equals t.OrderID
where t.ID == myID
select o ).ToList();
Using LINQ method syntax (Any takes a predicate; Contains does not):
orders = orders.Where(order => tags.Any(tag => tag.OrderID == order.OrderID)).ToList();
Using List<T> methods instead of LINQ (this requires both orders and tags to be materialized as List<T> first):
orders.RemoveAll(o => !tags.ConvertAll(t => t.OrderID).Contains(o.OrderID));
Something like this should work.
orders = orders.Where(o => tags.Any(t => o.ID == t.OrderID));
You could also just perform a join.
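The filtering idea in these answers can be sketched in memory as follows. This is an illustration over plain collections rather than the asker's context: the tuple shape for tags and the use of bare integer order IDs are assumptions, and the HashSet is a swapped-in detail to avoid rescanning the tag list for every order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderFilter
{
    // Keep only the order IDs that appear among the tags' OrderID values.
    public static List<int> OrdersWithTags(
        IEnumerable<int> orderIds,
        IEnumerable<(int TagID, int OrderID)> tags)
    {
        // Collect the referenced order IDs once, so each lookup is a set probe.
        var tagged = new HashSet<int>(tags.Select(t => t.OrderID));
        return orderIds.Where(id => tagged.Contains(id)).ToList();
    }

    public static void Main()
    {
        var orderIds = new[] { 10, 11, 12 };
        var tags = new[] { (TagID: 1, OrderID: 12), (TagID: 2, OrderID: 10), (TagID: 3, OrderID: 12) };
        Console.WriteLine(string.Join(", ", OrdersWithTags(orderIds, tags)));
        // prints "10, 12"
    }
}
```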

Use LINQ to convert comma separated strings in a table into a distinct collection of values

I'm working through this MVC3 tutorial and have entered the genre of a film as a comma separated string.
In part 6 we take the genres from the table to populate a drop down list.
I'd like to populate the drop down list with a distinct collection of single genres but I just can't get it to work.
This is what the tutorial suggests as a starting point:
var GenreLst = new List<string>();
var GenreQry = from d in db.Movies
orderby d.Genre
select d.Genre;
GenreLst.AddRange(GenreQry.Distinct());
... and this is where I'd got to
var GenreLst = new List<string>();
var GenreQry = (from d in db.Movies
orderby d.Genre
select d.Genre ).Select(s=>s.Split(','))
.Distinct();
GenreLst.AddRange( GenreQry );
LINQ to SQL cannot translate the s.Split(',') call to SQL, so it should throw an exception. You can do this instead:
var GenreQry = (from d in db.Movies
orderby d.Genre
select d.Genre ).Distinct().ToList();
GenreLst.AddRange( GenreQry.SelectMany(x=>x.Split(',')).Distinct());
About the code above:
Calling ToList() at the end of the query fetches the data, so from then on you are working with an in-memory list.
In the second part, SelectMany flattens the split strings into a single IEnumerable of strings.
Edit: In the first part you can also call .AsEnumerable() instead of .ToList() to fetch the data, which seems a better way.
In case you find the SelectMany syntax a bit confusing, consider the following (which compiles into a SelectMany method call under the covers, but which I find easier to read):
var GenreQry = (from d in db.Movies.AsEnumerable()
from s in d.Genre.Split(',')
select s)
.Distinct()
.OrderBy(s => s);
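The split-then-dedupe step can be tried on its own with a small in-memory sketch. The sample genre strings are hypothetical, and trimming whitespace and dropping empty entries are assumptions added here (the answers above do neither), on the guess that values such as "Comedy, Drama" contain a space after the comma.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GenreSplitter
{
    // Flatten comma-separated genre strings into a sorted list of unique genres.
    public static List<string> DistinctGenres(IEnumerable<string> genreColumns) =>
        genreColumns.SelectMany(g => g.Split(','))
                    .Select(g => g.Trim())        // assumption: ignore surrounding spaces
                    .Where(g => g.Length > 0)     // assumption: skip empty entries
                    .Distinct()
                    .OrderBy(g => g, StringComparer.Ordinal)
                    .ToList();

    public static void Main()
    {
        // Hypothetical values of the Genre column, already fetched into memory.
        var genres = new[] { "Comedy, Drama", "Drama", "Action,Comedy" };
        Console.WriteLine(string.Join(", ", DistinctGenres(genres)));
        // prints "Action, Comedy, Drama"
    }
}
```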
