linq to sql syntax different but should get the same results - c#

I was playing around with expression trees and various Linq syntax. I wrote the following:
using (NorthwindDataContext DB = new NorthwindDataContext())
{
DataLoadOptions dlo = new DataLoadOptions();
// Version 1
dlo.AssociateWith<Customer>(c => c.Orders.Where(o => o.OrderID < 10700).Select(o => o));
// Version 2
dlo.AssociateWith<Customer>(c => from o in c.Orders
where o.OrderID < 10700
select o);
}
Version 1 throws an error saying "The operator 'Select' is not supported in Subquery", while Version 2 runs just fine. From what I understand I am writing the exact same thing, but one uses the "dot" notation (method) syntax and the other uses query expression syntax.
Am I missing something here? Why does one fail but not the other if they are in fact the same query?

To expand on Daniel's answer, the select o is known as a degenerate query expression - and it's removed by the C# compiler. So your query is translated to:
c.Orders.Where(o => o.OrderID < 10700)
Note that without the where clause, however, the compiler would still include the Select call, so:
from o in c.Orders
select o
is translated to
c.Orders.Select(o => o)
From section 7.15.2.3 of the language spec:
A degenerate query expression is one that trivially selects the elements of the source. A later phase of the translation removes degenerate queries introduced by other translation steps by replacing them with their source. It is important however to ensure that the result of a query expression is never the source object itself, as that would reveal the type and identity of the source to the client of the query. Therefore this step protects degenerate queries written directly in source code by explicitly calling Select on the source. It is then up to the implementers of Select and other query operators to ensure that these methods never return the source object itself.
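The difference can be observed without a database. The following is a minimal sketch, assuming an in-memory list wrapped with AsQueryable as a stand-in for c.Orders (the IDs are made-up sample values, and the program uses C# top-level statements). It prints the expression tree that each syntax produces:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-in for c.Orders; the IDs are made-up sample values.
IQueryable<int> orderIds = new List<int> { 10650, 10699, 10700, 10750 }.AsQueryable();

// Query syntax: the trailing "select o" is degenerate, so the compiler drops it.
var querySyntax = from o in orderIds
                  where o < 10700
                  select o;

// Method syntax: the explicit Select call is an ordinary method call and is kept.
var methodSyntax = orderIds.Where(o => o < 10700).Select(o => o);

// Render both expression trees as text for comparison.
string queryTree = querySyntax.Expression.ToString();
string methodTree = methodSyntax.Expression.ToString();

Console.WriteLine(queryTree);
Console.WriteLine(methodTree);
```

Only the second tree contains a Select node, which is the node a provider such as LINQ to SQL may refuse to handle inside AssociateWith.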

You don't need the .Select(o => o) in your query.

Related

EF Core query where Where clause is a collection?

I am trying to build a sane query in EF Core that returns a collection of things that are, in turn, derived from a collection of things. Basically, in raw SQL one would do a JOIN.
It's in ASP.NET Core, so the initial collection is the list of Roles on the SecurityPrincipal object:
var roles = User.FindAll(ClaimTypes.Role).Select(r=>r.Value);
These roles are then mapped to Groups in our Database, so I can look those up:
var groupsQuery = dbContext.Groups.Where(g=>roles.Any(r=>r==g.GroupName));
var groups = await groupsQuery.ToListAsync();
This query is quite happy and returns a collection of groups as expected. The groups, however, have access to another resource, which is what I really want, and because it's a many-to-many relationship there is a bridging table.
This is me trying to query the AssetGroup joining table so I can get all the Assets referenced by all the Groups that map to a Role on the SecurityPrincipal.
var assetGroupsQuery = dbContext.AssetsGroups.Where(ag => groupsQuery.Any(g => g.Id == ag.GroupId));
var assetGroups = await assetGroupsQuery.ToListAsync();
When I perform the second query I get a lot of spam in my output window:
The LINQ expression 'where ([ag].Id == [ag].GroupId)' could not be translated and will be evaluated locally.
The LINQ expression 'Any()' could not be translated and will be evaluated locally.
The LINQ expression 'where {from Group g in __groups_0 where ([ag].Id == [ag].GroupId) select [ag] => Any()}' could not be translated and will be evaluated locally.
The LINQ expression 'where ([ag].Id == [ag].GroupId)' could not be translated and will be evaluated locally.
The LINQ expression 'Any()' could not be translated and will be evaluated locally.
Any clues on how one should phrase a nested query like this so EF Core can compose a single SQL query properly?
In general, avoid using Any or any LINQ operator other than Contains on an in-memory collection like your roles (which, according to the code, should be of type IEnumerable<string>).
In other words, instead of
.Where(g => roles.Any(r => r == g.GroupName))
use the functionally equivalent
.Where(g => roles.Contains(g.GroupName))
The latter is guaranteed to be translated to SQL IN, while the former isn't.
Interestingly, and somewhat misleadingly, EF Core tries to be smart and translate the former the same way as Contains. It succeeds when the containing query is executed on its own, but not when that query is used as part of another query.
This could be considered a defect in the current EF Core implementation. The workaround (as mentioned at the beginning) is to not rely on it and always use Contains.
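The functional equivalence of the two forms can be checked in memory. This is only a sketch with hypothetical sample data (the role and group names are invented), and it runs as LINQ to Objects, so it says nothing about how EF Core translates either form to SQL:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data standing in for the role claims and the Groups table.
IEnumerable<string> roles = new[] { "Admin", "Auditor" };
var groupNames = new[] { "Admin", "Sales", "Auditor", "Support" };

// Any with an equality predicate: fine in memory, but SQL translation is not guaranteed.
var viaAny = groupNames.Where(g => roles.Any(r => r == g)).ToList();

// Contains: the same result, and the form recommended above for translation to SQL IN.
var viaContains = groupNames.Where(g => roles.Contains(g)).ToList();

Console.WriteLine(string.Join(", ", viaAny));
Console.WriteLine(string.Join(", ", viaContains));
```

Both lists hold the same group names in the same order, so switching to Contains does not change the result of the query.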

How to Linq-ify this query?

I currently have this Linq query:
return this.Context.StockTakeFacts
.OrderByDescending(stf => stf.StockTakeId)
.Where(stf => stf.FactKindId == ((int)kind))
.Take(topCount)
.ToList<IStockTakeFact>();
The intent is to return every fact for the top topCount StockTakes, but instead I can see that I will only get topCount facts.
How do I Linq-ify this query to achieve my aim?
I could use two queries to get the top topCount StockTakeIds and then do a "between", but I wondered what tricks Linq might have.
This is what I'm trying to beat. Note that it's really more about learning than about not being able to find a solution. I'm also concerned about performance, not for these queries but in general: I don't want to just do the easy stuff and find out it's thrashing behind the scenes. For example, what is the penalty of the Contains clause in my second query below?
List<long> stids = this.Context.StockTakes
.OrderByDescending(st => st.StockTakeId)
.Take(topCount)
.Select(st => st.StockTakeId)
.ToList<long>();
return this.Context.StockTakeFacts
.Where(stf => (stf.FactKindId == ((int)kind)) && (stids.Contains(stf.StockTakeId)))
.ToList<IStockTakeFact>();
What about this?
return this.Context.StockTakeFacts
.OrderByDescending(stf => stf.StockTakeId)
.Where(stf => stf.FactKindId == ((int)kind))
.Take(topCount)
.Select(stf=>stf.Fact)
.ToList();
If I've understood what you're after correctly, how about:
return this.Context.StockTakes
.OrderByDescending(st => st.StockTakeId)
.Take(topCount)
.Join(
this.Context.StockTakeFacts,
st => st.StockTakeId,
stf => stf.StockTakeId,
(st, stf) => stf)
.OrderByDescending(stf => stf.StockTakeId)
.ToList<IStockTakeFact>();
Here's my attempt using mostly query syntax and using two separate queries:
var stids =
from st in this.Context.StockTakes
orderby st.StockTakeId descending
select st.StockTakeId;
var topFacts =
from stid in stids.Take(topCount)
join stf in this.Context.StockTakeFacts
on stid equals stf.StockTakeId
where stf.FactKindId == (int)kind
select stf;
return topFacts.ToList<IStockTakeFact>();
As others suggested, what you were looking for is a join. Because the join extension has so many parameters they can be a bit confusing - so I prefer query syntax when doing joins - the compiler gives errors if you get the order wrong, for instance. Join is by far preferable to a filter not only because it spells out how the data is joined together, but also for performance reasons because it uses indexes when used in a database and hashes when used in linq to objects.
You should note that I call Take in the second query to limit to the topCount stids used in the second query. Instead of having two queries, I could have used an into (i.e., query continuation) on the select line of the stids query to combine the two queries, but that would have created a mess for limiting it to topCount items. Another option would have been to put the stids query in parentheses and invoked Take on it. Instead, separating it out into two queries seemed the cleanest to me.
I ordinarily avoid specifying generic types whenever I think the compiler can infer the type; however, IStockTakeFact is almost certainly an interface and whatever concrete type implements it is likely contained by this.Context.StockTakeFacts; which creates the need to specify the generic type on the ToList call. Ordinarily I omit the generic type parameter to my ToList calls - that seems to be an element of my personal tastes, yours may differ. If this.Context.StockTakeFacts is already a List<IStockTakeFact> you could safely omit the generic type on the ToList call.
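To see the join approach return every fact for the top stock takes, here is a minimal in-memory sketch. The data is hypothetical (five stock takes, each with three facts of two kinds, modelled as tuples rather than the real entity types), and it runs as LINQ to Objects rather than against this.Context:

```csharp
using System;
using System.Linq;

// Hypothetical sample data: five stock takes, each with three facts of kind 1 and kind 2.
var stockTakeIds = new long[] { 1, 2, 3, 4, 5 };
var facts = (from id in stockTakeIds
             from kindId in new[] { 1, 2 }
             from n in Enumerable.Range(0, 3)
             select (StockTakeId: id, FactKindId: kindId, Value: n)).ToList();

int topCount = 2;
int kind = 1;

// Limit the stock takes first...
var topIds = stockTakeIds.OrderByDescending(id => id).Take(topCount);

// ...then join, so every matching fact of those stock takes is returned.
var topFacts = (from id in topIds
                join f in facts on id equals f.StockTakeId
                where f.FactKindId == kind
                select f).ToList();

// For contrast: applying Take to the facts themselves caps the number of facts.
var cappedFacts = facts.OrderByDescending(f => f.StockTakeId)
                       .Where(f => f.FactKindId == kind)
                       .Take(topCount)
                       .ToList();

Console.WriteLine(topFacts.Count);    // 6: three kind-1 facts for each of stock takes 5 and 4
Console.WriteLine(cappedFacts.Count); // 2: only topCount facts
```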

Differences in LINQ vs Method expression

Why does the IL for the LINQ query expression omit the Select projection, whereas the IL for the corresponding method expression keeps it?
I suppose these two pieces of code do the same thing.
var a = from c in companies
where c.Length >10
select c;
//
var b = companies.Where(c => c.Length > 10).Select(c => c);
//IL - LINQ
IEnumerable<string> a = this.companies.
Where<string>(CS$<>9__CachedAnonymousMethodDelegate1);
//IL
IEnumerable<string> b = this.companies.Where<string>
(CS$<>9__CachedAnonymousMethodDelegate4).Select<string, string>
(CS$<>9__CachedAnonymousMethodDelegate5);
Then why the difference in IL?
EDITED :
then why
var a = from c in companies
select c;
result in a Select projection even in the IL? Couldn't it also be omitted?
The C# compiler is clever and removes useless clauses from LINQ query expressions. The select c clause is useless, so the compiler removes it. When you write Select(c => c), the compiler can't tell that the call is useless, because it is a function call, so it doesn't remove it.
If you remove it yourself, the IL becomes the same.
EDIT :
LINQ query syntax is a "descriptive" language: you say what you want and the compiler works out the transformation. You don't have any control over that transformation. The compiler tries to optimize the function calls and doesn't emit Select because you aren't doing a projection, so it would be useless.
When you write Select(c => c), you call a function explicitly, so the compiler won't remove it.
var a = from c in companies select c;
var a = companies.Select(elt => elt);
Select is useful in this example. If you remove it, a has the type of companies; otherwise a is an IEnumerable.
@mexianto is of course correct that this is a compiler optimization.
Note that this is explicitly called out in the language specification under "Degenerate Query expressions." Also note that the compiler is smart enough to not perform the optimization when doing so would return the original source object (the user might want to use a degenerate query to make it difficult for the client to mutate the source object, assuming that it is mutable).
7.16.2.3 Degenerate query expressions
A query expression of the form
from x in e select x
is translated into
( e ) . Select ( x => x )
[...] A degenerate query expression is one that
trivially selects the elements of the source. A later phase of the
translation removes degenerate queries introduced by other translation
steps by replacing them with their source. It is important however to
ensure that the result of a query expression is never the source
object itself, as that would reveal the type and identity of the
source to the client of the query. Therefore this step protects
degenerate queries written directly in source code by explicitly
calling Select on the source. It is then up to the implementers of
Select and other query operators to ensure that these methods never
return the source object itself.
In your second example, the call to Select is not redundant. If you would omit the Select call, the query would just return the original collection, whereas Select returns an IEnumerable.
In your first example, Where already returns an IEnumerable and the select clause doesn't do any work, so it is omitted.
Because in the query version there is no actual select projecting 'c' into something else, it is just passing on 'c' as-is. Which results in only a call to 'Where'.
In the second variation, you explicitly call 'Select' and thus do a projection. Yes, you are only returning the same objects, but the compiler will not see this.
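The reason the compiler keeps Select for a bare from ... select query can be illustrated with a small sketch. The company names are invented sample data; the point is only that the query result is a different object from the source list:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var companies = new List<string> { "Contoso", "Northwind Traders", "Fabrikam" };

// Degenerate query written directly in source: compiled as companies.Select(c => c).
IEnumerable<string> viaQuery = from c in companies select c;

// What a caller would receive if the compiler dropped that Select.
IEnumerable<string> direct = companies;

// The query result is a separate iterator, so it cannot be cast back to the list.
bool sameObject = object.ReferenceEquals(viaQuery, companies);
bool queryIsList = viaQuery is List<string>;
bool directIsList = direct is List<string>;

Console.WriteLine(sameObject);   // False
Console.WriteLine(queryIsList);  // False
Console.WriteLine(directIsList); // True
```

A caller holding direct could cast it to List<string> and mutate the source, which is the exposure the spec passage describes.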

How does a LINQ expression know that Where() comes before Select()?

I'm trying to create a LINQ provider. I'm using the guide LINQ: Building an IQueryable provider series, and I have added the code up to LINQ: Building an IQueryable Provider - Part IV.
I am getting a feel of how it is working and the idea behind it. Now I'm stuck on a problem, which isn't a code problem but more about the understanding.
I'm firing off this statement:
QueryProvider provider = new DbQueryProvider();
Query<Customer> customers = new Query<Customer>(provider);
int i = 3;
var newLinqCustomer = customers.Select(c => new { c.Id, c.Name}).Where(p => p.Id == 2 | p.Id == i).ToList();
Somehow the code, or expression, knows that the Where comes before the Select. But how and where?
Nothing in the code sorts the expression; in fact, the ToString() in debug mode shows that the Select comes before the Where.
I was trying to make the code fail. Normally I would write the Where first and then the Select.
So how does the expression sort this? I have not made any changes to the code in the guide.
The expressions are "interpreted", "translated" or "executed" in the order you write them, so the Where does not come before the Select.
If you execute:
var newLinqCustomer = customers.Select(c => new { c.Id, c.Name})
.Where(p => p.Id == 2 | p.Id == i).ToList();
Then the Where is executed on the IEnumerable or IQueryable of the anonymous type.
If you execute:
var newLinqCustomer = customers.Where(p => p.Id == 2 | p.Id == i)
.Select(c => new { c.Id, c.Name}).ToList();
Then the Where is executed on the IEnumerable or IQueryable of the customer type.
The only thing I can think of is that maybe you're seeing some generated SQL where the SELECT and WHERE have been reordered? In which case I'd guess that there's an optimisation step somewhere in the (e.g.) LINQ to SQL provider that takes SELECT Id, Name FROM (SELECT Id, Name FROM Customer WHERE Id=2 || Id=@i) and converts it to SELECT Id, Name FROM Customer WHERE Id=2 || Id=@i, but this must be a provider-specific optimisation.
No, in the general case (such as LINQ to Objects) the select will be executed before the where. Think of it as a pipeline: your first step is a transformation, the second a filter. It is not the other way round, as would be the case if you wrote Where...Select.
Now, a LINQ Provider has the freedom to walk the expression tree and optimize it as it sees fit. Be aware that you may not change the semantics of the expression though. This means that a smart LINQ to SQL provider would try to pull as many where clauses it can into the SQL query to reduce the amount of data travelling over the network. However, keep the example from Stuart in mind: Not all query providers are clever, partly because ruling out side effects from query reordering is not as easy as it seems.
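The written order is preserved in the expression tree as nesting, which a provider then walks. Here is a minimal sketch, assuming an in-memory array wrapped with AsQueryable in place of the guide's Query<Customer> and an anonymous type with only an Id property:

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

int i = 3;
var query = new[] { 1, 2, 3, 4 }.AsQueryable()
    .Select(c => new { Id = c })
    .Where(p => p.Id == 2 | p.Id == i);

// The last operator written is the outermost node of the tree.
var outer = (MethodCallExpression)query.Expression;

// Its first argument is the source it operates on: the earlier Select call.
var inner = (MethodCallExpression)outer.Arguments[0];

Console.WriteLine(outer.Method.Name); // Where
Console.WriteLine(inner.Method.Name); // Select

// Executing the query in memory filters the projected objects.
var results = query.ToList();
Console.WriteLine(results.Count);     // 2 (Id 2 and Id 3)
```

A provider that emits SQL visits this nested structure and is free to produce a single SELECT ... WHERE statement, as long as the result is the same.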

LINQ Query Syntax to Lambda

Wondering if there is any way to get the lambda expressions that result from a LINQ "query" syntax expression.
Given:
var query = from c in dc.Colors
where c.ID == 213
orderby c.Name, c.Description
select new {c.ID, c.Name, c.Description };
Is there any way to get the generated "lambda" code / expression?
var query = dc.Colors
.Where(c => c.ID == 213)
.OrderBy(c => c.Name)
.ThenBy(c => c.Description)
.Select(c => new {c.ID, c.Name, c.Description, });
I know these are very simple examples and that the C# compiler generates a lambda expression from the query expression when compiling the code. Is there any way to get a copy of that expression?
I am hoping to use this as a training tool for some of my team members who aren't very comfortable with lambda expressions. Also, I have used LINQPad, but ideally this can be accomplished without a third-party tool.
Simply go:
string lambdaSyntax = query.Expression.ToString();
The disadvantage compared to LINQPad is that the result is formatted all on one line.
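As a rough illustration of this approach without a database, the sketch below assumes a hypothetical in-memory array of anonymous objects wrapped with AsQueryable in place of dc.Colors (the sample rows are invented):

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for dc.Colors.
var colors = new[]
{
    new { ID = 213, Name = "Red", Description = "Warm" },
    new { ID = 7, Name = "Blue", Description = "Cool" },
}.AsQueryable();

var query = from c in colors
            where c.ID == 213
            orderby c.Name, c.Description
            select new { c.ID, c.Name, c.Description };

// The expression tree renders as a chain of method calls on one line.
string lambdaSyntax = query.Expression.ToString();
Console.WriteLine(lambdaSyntax);
```

The printed string shows the Where, OrderBy, ThenBy and Select calls in order, although anonymous types appear under their compiler-generated names, so the text is not valid C# as printed.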
You could try compiling the assembly and then having a look at it using Reflector.
This might be a bit more complicated than you want though, because the compiler will compile things right down to the direct method calls (everything will be static method calls, not extension methods, and the lambdas will get compiled into their own functions, which are usually called something like <ClassName>b_88f).
You'll certainly figure out what's going on though :-)
ReSharper has that feature. It will convert a LINQ query expression to lambda (method) syntax and back again at the stroke of a key. It is also very (very) useful for other things.
