Why is my linq alias out of scope? - c#

I am trying to better understand more complex LINQ statements, and thanks to a few articles on the web I am coming around. One thing I don't understand is why my query alias is out of scope in this statement:
(from query in _context.WebQueries
select query).Where((from qry in _context.WebQueries
join qg in _context.WebQueryGroups on qry.QueryKey equals qg.QueryKey
where qg.QueryGroupNameKey == key
//This is out of scope
select qry.QueryKey).Contains(query.QueryKey));
//if you replaced it with this same problem
.Contains(qry.QueryKey));
I know that I can use an anonymous object call and gain the results I want. I will just have to iterate the object and pull out the List that I want:
(from query in _context.WebQueries
select new {query, key = query.QueryKey})
.Where(q => !(from qry in _context.WebQueries
join qg in _context.WebQueryGroups on qry.QueryKey equals qg.QueryKey
where qg.QueryGroupNameKey == key
select qry.QueryKey).Contains(q.key));
That returns an object with the list I want and the int that I want to reference later in the query.
Why are both query and qry out of scope though? I would much rather just return the linq statement in my method instead of having to parse an object to get the list to return. An article discussing this issue would be great.

I notice you did not get an answer to your specific question, which is as I understand it "what are the scoping rules in queries?"
First off, let's carefully define "scope". The scope of an entity is a region of program text in which a particular entity can be referred to by an unqualified name.
The key to understanding range variable scope is in understanding how queries are translated by the compiler. It is a syntactic translation. When you say:
from r in s where t select u
that is translated syntactically by the compiler into:
((s).Where(r => t)).Select(r => u)
In the translated version there are two rs, both lambda formal parameters, and the regular scoping rules for lambdas apply; each is in scope only for the body of the lambda.
So now you know why you cannot use a range variable outside of a query; that range variable is actually the formal parameter of one or more lambdas, and so is only valid in what is going to be the bodies of those lambdas.
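To make the translation concrete, here is a minimal sketch that writes the same query both ways. The array Source and the class name TranslationDemo are made up for illustration; they stand in for "s" and are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TranslationDemo
{
    // Hypothetical sample data standing in for "s".
    static readonly int[] Source = { 1, 2, 3, 4 };

    // Query syntax: r looks like one variable shared by the whole query.
    public static List<int> QuerySyntax() =>
        (from r in Source where r % 2 == 0 select r * r).ToList();

    // The syntactic translation: two separate lambdas, each with its own
    // parameter r that is in scope only inside that lambda's body.
    public static List<int> MethodSyntax() =>
        Source.Where(r => r % 2 == 0).Select(r => r * r).ToList();

    static void Main()
    {
        Console.WriteLine(string.Join(", ", QuerySyntax()));  // 4, 16
        Console.WriteLine(string.Join(", ", MethodSyntax())); // 4, 16
    }
}
```

Renaming the parameter of the second lambda (say, to x) changes nothing, which is another way to see that r does not exist outside the lambda bodies.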
You can learn the rest of the rules by reading the C# specification section on query translation. I note that the rules for "transparent identifier" queries have tricky scoping, so read that section carefully. I've been meaning to write a blog article on that.
UPDATE: I got around to writing that blog entry; you can read it here:
http://ericlippert.com/2014/07/31/transparent-identifiers-part-one/
http://ericlippert.com/2014/08/01/transparent-identifiers-part-two/

Although there is an answer that solves the underlying problem and makes the query work, the actual question isn't answered yet. The answer to the question why the variables are out of scope will probably help you understand LINQ better.
The statement...
from query in _context.WebQueries select query
can be rewritten as:
_context.WebQueries.Select(query => query)
(the part .Select(query => query) is redundant, but I leave it here for sake of the explanation)
this statement can be rewritten as a lambda expression with a method body:
WebQueries.Select(query => { return query; })
(I'll explain later why I don't use _context.WebQueries anymore)
this can be rewritten as an expression with an anonymous method:
WebQueries.Select(delegate(WebQuery query) { return query; })
and this can be rewritten into an expression using a named method:
WebQueries.Select(ReturnArg)
where ReturnArg is this method:
WebQuery ReturnArg(WebQuery query)
{
return query;
}
This is C# history in reverse order: we used to have named methods and delegates only. Later, in order to implement LINQ and other features, anonymous methods and lambda expressions were introduced. But the thing to note here is that under the hood, for the compiler, method syntax still applies, so the lambda expression query => query is nothing but a method having a parameter that is named query. As with all methods, the parameter is scoped to the method body.
In LINQ terminology, this parameter is called a range variable, because it serves as a reference to each successive element in the query.
In short: the range variable is scoped to the LINQ statement in which it is defined. (from query in _context.WebQueries select query) is one LINQ statement. The subsequent Where is a new LINQ statement.
The reason why I stopped using _context.WebQueries is that EF doesn't accept lambda expressions with a method body. EF's query methods take expression trees (Expression<Func<...>>), and the C# compiler cannot convert a lambda with a statement body into an expression tree, so those forms only compile against an in-memory sequence, where the parameter is a plain Func. In fact, an EF query is never executed in the CLR, but translated into SQL and executed by the database engine. However, for the C# compiler the same scoping rules still apply. A range variable is a range variable, irrespective of what it's going to be used for.
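Putting the scoping rule to work, the following is a sketch of how the original query could be shaped so that every name is in scope where it is used. The classes WebQuery and WebQueryGroup, the Name field and the sample data are assumptions made up for this example, and it runs against in-memory lists rather than an EF context.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity classes in the question.
class WebQuery { public int QueryKey; public string Name = ""; }
class WebQueryGroup { public int QueryKey; public int QueryGroupNameKey; }

static class ScopeDemo
{
    public static List<string> QueriesInGroup(
        IEnumerable<WebQuery> webQueries,
        IEnumerable<WebQueryGroup> webQueryGroups,
        int key)
    {
        // Inner query: qry and qg are in scope only inside this query.
        var keysInGroup =
            from qry in webQueries
            join qg in webQueryGroups on qry.QueryKey equals qg.QueryKey
            where qg.QueryGroupNameKey == key
            select qry.QueryKey;

        // Where takes a lambda; its parameter "query" is declared here
        // and is in scope for the body of this lambda only.
        return webQueries
            .Where(query => keysInGroup.Contains(query.QueryKey))
            .Select(query => query.Name)
            .ToList();
    }

    static void Main()
    {
        var queries = new List<WebQuery>
        {
            new WebQuery { QueryKey = 1, Name = "A" },
            new WebQuery { QueryKey = 2, Name = "B" },
            new WebQuery { QueryKey = 3, Name = "C" },
        };
        var groups = new List<WebQueryGroup>
        {
            new WebQueryGroup { QueryKey = 1, QueryGroupNameKey = 10 },
            new WebQueryGroup { QueryKey = 2, QueryGroupNameKey = 20 },
            new WebQueryGroup { QueryKey = 3, QueryGroupNameKey = 10 },
        };
        Console.WriteLine(string.Join(", ", QueriesInGroup(queries, groups, 10))); // A, C
    }
}
```

The only change from the failing version is that Where receives a lambda that declares its own parameter, instead of trying to reuse a range variable from the preceding query.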

I don't know what you're trying to accomplish, but this does the same thing:
(from query in _context.WebQueries
from qry in _context.WebQueries
join qg in _context.WebQueryGroups on qry.QueryKey equals qg.QueryKey
where qg.QueryGroupNameKey == request.Key
where qry.QueryKey == query.QueryKey
select query).Distinct();
However, this looks really strange. The join doesn't serve any function, for example.
Why not this?
from query in _context.WebQueries
join qg in _context.WebQueryGroups on query.QueryKey equals qg.QueryKey
where qg.QueryGroupNameKey == request.Key
select query

Related

What are the ways to create the LINQ query expression using query syntax dynamically?

What are the possible ways to create a LINQ expression dynamically, but using the query syntax? Is the query syntax a C# thing only, and if so, is the only viable way of creating such expressions using Roslyn dynamic compilation?
When writing LINQ expressions manually, I find them more natural when written using method chaining syntax, for example ctx.Foo.Where(foo => foo.Type.Name == "Bar") but there are some cases where I would need to write them like this:
from foo in ctx.Foo
join fooType in ctx.Types on foo.TypeId equals fooType.Id
where fooType.Name == "Bar"
I love how expression trees ensure type safety when creating expressions dynamically, but how would one create expressions using the query syntax?
Thanks everyone for your comments.
So it turns out it's not possible to do this because query syntax is just a C# language syntactic sugar.
Additionally, if someone else stumbles upon this question, take a look at the excellent answer by @Gert: https://stackoverflow.com/a/15599143/828023
That answer explains that the query syntax is "sugar", while the method syntax shows what really goes on under the hood, where for example join x in y on z equals x.something into somethingElse is actually a GroupJoin method call and there is no way to express this with expression trees without actually calling GroupJoin.
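As an illustration of that point, here is a small sketch showing a join ... into query next to the GroupJoin call it corresponds to. The tuple arrays Types and Foos and the class name GroupJoinDemo are invented sample data, not the ctx.Foo and ctx.Types sets from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GroupJoinDemo
{
    // Hypothetical in-memory data.
    static readonly (int Id, string Name)[] Types = { (1, "Bar"), (2, "Baz") };
    static readonly (int Id, int TypeId)[] Foos = { (10, 1), (11, 1), (12, 2) };

    // Query syntax with "into": each type is paired with its group of foos.
    public static List<string> QuerySyntax() =>
        (from t in Types
         join f in Foos on t.Id equals f.TypeId into foosOfType
         select t.Name + ":" + foosOfType.Count()).ToList();

    // The same query as an explicit GroupJoin call.
    public static List<string> MethodSyntax() =>
        Types.GroupJoin(
            Foos,
            t => t.Id,
            f => f.TypeId,
            (t, foosOfType) => t.Name + ":" + foosOfType.Count()).ToList();

    static void Main()
    {
        Console.WriteLine(string.Join(", ", QuerySyntax()));  // Bar:2, Baz:1
        Console.WriteLine(string.Join(", ", MethodSyntax())); // Bar:2, Baz:1
    }
}
```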

What is the difference between Joining two different DB Context using ToList() and .AsQueryable()?

Case 1:
I joined two different DB contexts by calling ToList() on both.
Case 2:
I also tried joining the first DB context with ToList() and the second with AsQueryable().
Both worked for me. All I want to know is the difference between the two joins in terms of performance and functionality. Which one is better?
var users = (from usr in dbContext.User.AsNoTracking()
select new
{
usr.UserId,
usr.UserName
}).ToList();
var logInfo= (from log in dbContext1.LogInfo.AsNoTracking()
select new
{
log.UserId,
log.LogInformation
}).AsQueryable();
var finalQuery= (from usr in users
join log in logInfo on usr.UserId equals log.UserId
select new
{
usr.UserName,
log.LogInformation
}).ToList();
I'll elaborate on the answer that Jehof gave in his comment. It is true that this join will be executed in memory, and there are two reasons why.
Firstly, this join cannot be performed in the database because you are joining an in-memory object (users) with a deferred query (logInfo). It is not possible to generate a single query from that to send to the database. So before the actual join is performed, the deferred query is executed and all logs are retrieved from the database. To sum up, in this scenario two queries are executed in the database and the join happens in memory. It doesn't matter whether you use ToList + AsQueryable or ToList + ToList in this case.
Secondly, in your scenario this join can ONLY be performed in memory. Even if you use AsQueryable with both the first and the second context, it will not work. You will get a System.NotSupportedException with the message:
The specified LINQ expression contains references to queries that are associated with different contexts.
I wonder why you're using two DB contexts. Is it really needed? As I explained, because of that you lose the ability to take full advantage of deferred queries (lazy evaluation).
If you really have to use two DB contexts, I would consider adding some filters (WHERE conditions) to the queries that read users and logs from the DB. Why? For a small number of records there is no problem. However, for large amounts of data it is not efficient to perform joins in memory. That is what databases were created for.
It hasn't been explained yet why the statements actually work and why EF doesn't throw an exception that you can only use sequences of primitive types in a LINQ statement.
If you swap both lists ...
var finalQuery= (from log in logInfo
join usr in users on log.UserId equals usr.UserId
...
EF will throw
Unable to create a constant value of type 'User'. Only primitive types or enumeration types are supported in this context.
So why does your code work?
That will become clear if we convert your statement to method syntax (which the compiler does under the hood):
users.Join(logInfo, usr => usr.UserId, log => log.UserId,
(usr, log) => new
{
usr.UserName,
log.LogInformation
})
Since users is an IEnumerable, the extension method Enumerable.Join is resolved as the appropriate method. This method accepts an IEnumerable as the second list to be joined. Therefore, logInfo is implicitly cast to IEnumerable, so it runs as a separate SQL statement before it partakes in the join.
In the version from log in logInfo join usr ..., Queryable.Join is used. Now users is pulled into the IQueryable expression as a constant. This turns the whole statement into one expression that EF unsuccessfully tries to translate into one SQL statement.
Now a few words on
Which one is better?
The best option is the one that does just enough to make it work. That means that:
You can remove AsQueryable(), because logInfo already is an IQueryable and it is cast to IEnumerable anyway.
You can replace ToList() by AsEnumerable(), because ToList() builds a redundant intermediate result, while AsEnumerable() only changes the compile-time type of users, without triggering its execution yet.
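The overload resolution described above can be observed without a database. The sketch below is an assumption-laden stand-in: User, LogEntry and the sample rows are invented, and logInfo is simulated with AsQueryable() over a list, so nothing is translated to SQL. It only shows which Join the compiler binds to depending on which sequence comes first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entities in the question.
class User { public int UserId { get; set; } public string UserName { get; set; } = ""; }
class LogEntry { public int UserId { get; set; } public string LogInformation { get; set; } = ""; }

static class JoinBindingDemo
{
    // In-memory list, like users after ToList().
    static readonly List<User> Users = new List<User>
    {
        new User { UserId = 1, UserName = "ann" },
    };

    // IQueryable source, standing in for the deferred logInfo query.
    static readonly IQueryable<LogEntry> Logs = new List<LogEntry>
    {
        new LogEntry { UserId = 1, LogInformation = "login" },
    }.AsQueryable();

    // Outer sequence is a List<User>: binds to Enumerable.Join.
    public static IEnumerable<string> UsersFirst() =>
        from usr in Users
        join log in Logs on usr.UserId equals log.UserId
        select usr.UserName + ": " + log.LogInformation;

    // Outer sequence is an IQueryable<LogEntry>: binds to Queryable.Join,
    // so the whole statement becomes one expression tree.
    public static IEnumerable<string> LogsFirst() =>
        from log in Logs
        join usr in Users on log.UserId equals usr.UserId
        select usr.UserName + ": " + log.LogInformation;

    static void Main()
    {
        Console.WriteLine(UsersFirst() is IQueryable); // False
        Console.WriteLine(LogsFirst() is IQueryable);  // True
        Console.WriteLine(string.Join(", ", UsersFirst())); // ann: login
    }
}
```

With LINQ to Objects both versions run, because the in-memory provider can evaluate the constant. A provider that has to produce SQL is where the second form fails.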
ToList()
Executes the query immediately
You get all the elements ready in memory
AsQueryable()
Lazy (the query is executed later)
Parameter: Expression<Func<TSource, bool>>
The expression is converted into T-SQL (by the specific provider), run remotely, and the result is loaded into your application's memory.
That’s why DbSet (in Entity Framework) also implements IQueryable, to get efficient queries.
It does not load every record. E.g. with Take(5), it will generate select top 5 * SQL in the background.

Of what type is the result of a LINQ query?

Examples of LINQ show this:
var query = context.Contacts
.Where(q => q.FirstName == "Tom");
I'm wondering what object is "query"? And also is it possible (advisable) to pass it to a method (within the same class)?
The query object is most likely of type IQueryable<Contact>. You can of course pass it to a method, whether that is in the same class or in another class does not matter.
But keep in mind that LINQ does use a mechanism named "deferred execution". That means that query does not get enumerated immediately, but rather when it is needed. All the stuff you put in your query (the Where-clause for example) gets executed then. For more information about deferred execution have a look at MSDN: Query Execution.
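Deferred execution is easy to observe with an in-memory list. This is a minimal sketch; the list of names and the class name DeferredDemo are made up, and it uses LINQ to Objects rather than a database context.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo
{
    public static (int Snapshot, int Deferred) Run()
    {
        var names = new List<string> { "Tom", "Ann" };

        // Defining the query runs nothing yet.
        IEnumerable<string> query = names.Where(n => n == "Tom");

        // ToList() enumerates right now and copies the current matches.
        List<string> snapshot = query.ToList();

        // Change the source after the query was defined.
        names.Add("Tom");

        // Count() enumerates the query again and sees the new element.
        return (snapshot.Count, query.Count());
    }

    static void Main()
    {
        var (snapshot, deferred) = Run();
        Console.WriteLine(snapshot); // 1
        Console.WriteLine(deferred); // 2
    }
}
```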
NB: You can find out the exact type of the query variable if you hover your mouse over it or over the var keyword in Visual Studio.

Using the result of a SQL LINQ query leads to the database being queried twice

I query a database and push the result out to the console and a file using two methods like this:
var result = from p in _db.Pages
join r in rank on p.PkId equals r.Key
orderby r.NumPages descending
select new KeyNameCount
{
PageID = p.PkId,
PageName = p.Name,
Count = r.NumPages
};
WriteFindingsToFile(result, parentId);
WriteFindingsToConsole(result, parentId);
IEnumerable<T>, not IQueryable<T>, is used as the parameter type for result in the two method calls.
In both calls the result is iterated in a foreach. This leads to two identical calls against the database, one for each method.
I could refactor the two methods into one and only use one foreach, but that would fast become very hard to maintain (adding write to html, write to xml, etc.)
I am pretty sure this is a fairly common question, but using Google has not made me any wiser, so I turn to you guys :-)
Every time you enumerate a LINQ query it will requery the database to refresh the data. To stop this happening, use .ToArray() or .ToList().
e.g.
var result = (from p in _db.Pages
select p).ToList(); //will query now
Write(result); //will not requery
Write(result); //will not requery
It's important to understand that a raw LINQ query is run when it is used, not when it is written in the code (e.g. don't dispose of your _db before then).
It can be surprising when you realise how it really works. This allows method chaining and later modification of the query to be reflected in the final query run on the DB. It is important to always keep this in mind, as it can cause run-time bugs that will not be caught at compile time, usually because the DB connection is closed before the list is used, since you are passing around what appears to be a simple IEnumerable.
EDIT: Changed to remove my opinion and reflect the discussion in the comments. Personally I think the compiler should assume that the end result of chained queries is immediately run unless you explicitly say that it'll be further modified later. Just to avoid the run-time bugs it inevitably causes.
If you look at the definition of IQueryable, you will see that it also extends IEnumerable. So you can pass an IQueryable as a parameter to a function that takes an IEnumerable without actually enumerating it.
But because of LINQ's deferred execution pattern, the IQueryable will only be executed against the database when you iterate over it (with a foreach loop, as in your case, or with ToList() or functions like First/Single).
Here is a blog post that explains how this works.
If you change your code to the following you will hit the database only once and then pass the result in memory to your functions:
var result = (from p in _db.Pages
join r in rank on p.PkId equals r.Key
orderby r.NumPages descending
select new KeyNameCount
{
PageID = p.PkId,
PageName = p.Name,
Count = r.NumPages
}).ToList();
WriteFindingsToFile(result, parentId);
WriteFindingsToConsole(result, parentId);
Any time you iterate over your LINQ result, the database will be queried again.
This is called deferred query execution. You may want to take a deeper look at one (out of many) of the corresponding MSDN articles:
Execution of the query is deferred until the query variable is iterated over in a foreach or For Each loop
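The double round trip can be reproduced without a database. In the sketch below, FakeQuery is a hypothetical iterator that stands in for the database-backed query and counts how often it is actually run; Consume stands in for WriteFindingsToFile and WriteFindingsToConsole. None of these names come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RequeryDemo
{
    // Counts how many times the fake "database" is actually hit.
    public static int RoundTrips;

    // Hypothetical stand-in for a deferred, database-backed query.
    static IEnumerable<int> FakeQuery()
    {
        RoundTrips++; // runs once per enumeration, not once per call
        yield return 1;
        yield return 2;
    }

    // Stand-in for a method that iterates the result in a foreach.
    static void Consume(IEnumerable<int> items)
    {
        foreach (var item in items) { }
    }

    public static int WithoutToList()
    {
        RoundTrips = 0;
        var result = FakeQuery(); // nothing has run yet
        Consume(result);          // first enumeration
        Consume(result);          // second enumeration
        return RoundTrips;
    }

    public static int WithToList()
    {
        RoundTrips = 0;
        var result = FakeQuery().ToList(); // enumerated once, here
        Consume(result);                   // iterates the in-memory list
        Consume(result);
        return RoundTrips;
    }

    static void Main()
    {
        Console.WriteLine(WithoutToList()); // 2
        Console.WriteLine(WithToList());    // 1
    }
}
```

The methods that consume the result do not need to change; only the caller decides whether to hand them a deferred query or a materialized list.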

Entity Framework query builder methods: why "it" and not lambdas?

I'm just getting started with EF and a query like the following strikes me as odd:
var departmentQuery =
schoolContext.Departments.Include("Courses").
OrderBy("it.Name");
Specifically, what sticks out to me is "it.Name." When I was tooling around with LINQ to SQL, pretty much every filter in a query-builder query could be specified with a lambda, like, in this case, d => d.Name.
I see that there are overloads of OrderBy that take lambdas and return an IOrderedQueryable or an IOrderedEnumerable, but those obviously don't have the Execute method needed to get the ObjectResult that can then be databound.
It seems strange to me after all I've read about how lambdas make so much sense for this kind of stuff, and how they are translated into expression trees and then to a target language - why do I need to use "it.Name"?
I get lambda expressions with mine; I can do Where (it.SomeProperty == 1)... do you have System.Linq as a namespace? You can try restructuring as:
var departmentQuery = from d in schoolContext.Departments.Include("Courses")
orderby d.Name
select d;
Those are some possibilities.
HTH.
