EF Core Join from two IQueryables throws System.ArgumentNullException


I get a System.ArgumentNullException when trying to Join two data sets from different contexts with EF Core.
I'm using ASP.NET Core's built-in Identity feature and I have two contexts: MyDbContext and IdentityDbContext. I have a User entity in MyDbContext which holds an IdentityGuid, so I know which User is associated with which IdentityUser.
I would like to do a Join, get a KeyValuePair for each match, and provide that as a ViewModel.
var users = _context.Users;
var identityUsers = _identityContext.Users;
var selection = users.Join(identityUsers,
u => u.IdentityGuid,
iu => iu.Id,
(u, iu) => new KeyValuePair<User, IdentityUser>(u, iu));
return View(await selection.ToListAsync());
Executing that Join throws System.ArgumentNullException; however, if I call ToList() on both data sets before the Join, it works just fine. Here is the exception:
ArgumentNullException: Value cannot be null.
Parameter name: entityType
Microsoft.EntityFrameworkCore.Utilities.Check.NotNull(T value, string parameterName)

As Miamy mentioned in his comment, EF Core does not support deferred-execution joins across contexts. One (or both) of the queries needs .ToList() called on it so that the data is evaluated and pulled into local memory before the join.
In theory this could also be done by writing the join SQL manually and using EF to execute it (as this would allow the cross-context join), but generally the solution should be to pull the data into memory with .ToList() and then join.
It's unfortunate that cross-context joins like that aren't supported, though I don't believe they are in NHibernate either, so it's an all-round limitation to my knowledge.
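The in-memory join described above can be sketched without EF at all. This is a minimal illustration, not the asker's actual code: AppUser and IdentityAccount are hypothetical stand-ins for the two entity types, and the two lists are assumed to be the results of calling ToListAsync() on each context separately.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entities that live in the two separate contexts.
public class AppUser
{
    public int Id { get; set; }
    public string IdentityGuid { get; set; }
}

public class IdentityAccount
{
    public string Id { get; set; }
    public string Email { get; set; }
}

public static class InMemoryJoinSketch
{
    // Both lists are assumed to be already materialized, so this is a plain
    // LINQ to Objects join and no single query has to span two contexts.
    public static List<KeyValuePair<AppUser, IdentityAccount>> JoinUsers(
        List<AppUser> users, List<IdentityAccount> accounts)
    {
        return users.Join(accounts,
                          u => u.IdentityGuid,
                          a => a.Id,
                          (u, a) => new KeyValuePair<AppUser, IdentityAccount>(u, a))
                    .ToList();
    }

    public static void Main()
    {
        var users = new List<AppUser>
        {
            new AppUser { Id = 1, IdentityGuid = "guid-1" },
            new AppUser { Id = 2, IdentityGuid = "guid-without-account" }
        };
        var accounts = new List<IdentityAccount>
        {
            new IdentityAccount { Id = "guid-1", Email = "one@example.com" }
        };

        foreach (var pair in JoinUsers(users, accounts))
        {
            Console.WriteLine($"{pair.Key.Id} {pair.Value.Email}");
        }
    }
}
```

Because the join is an inner join, users without a matching account are dropped. Filtering each side before materializing it keeps the amount of data pulled into memory small.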

Related

HasDbFunction, table value function without a mapped entity class

From everything I've found so far, if you are calling a table-valued function, the return value must be an IQueryable. For example:
public IQueryable<AlbumsByGenre> ufn_AlbumsByGenre_ITVF(int genreId)
=> FromExpression(() => ufn_AlbumsByGenre_ITVF(genreId));
Most often when I'm using a table-valued function, the table type that it returns is a DTO. That is, it doesn't match any actual table in the database. Here is an example:
CREATE FUNCTION dbo.ufn_AlbumsByGenre_ITVF(@GenreId int)
RETURNS TABLE
AS
RETURN(
SELECT
ar.ArtistName,
al.AlbumName,
g.Genre
FROM Genres g
INNER JOIN Albums al
ON g.GenreId = al.GenreId
INNER JOIN Artists ar
ON al.ArtistId = ar.ArtistId
WHERE g.GenreId = @GenreId
);
Creating an entity for the return type results in an unnecessary, unused, and unwanted table in the database. In this instance the table name is "AlbumsByGenre".
Is there any way to have the return type be an unmapped type to prevent the unnecessary table?

Currently (as of EF Core 6.0) the type must be a model type (with or without key). There are plans for Raw SQL queries for unmapped types for EF Core 7.0 which might or might not allow the mapping you are asking for.
So for now your type must be registered in the model (it cannot be unmapped). But creating an associated table is not mandatory and can be avoided by configuring the type with ToView(null), e.g.
modelBuilder.Entity<AlbumsByGenre>()
.HasNoKey() // keyless
.ToView(null); // no table or view

For me .ToView(null) still generates a table for the AlbumsByGenre.
My workaround is to map this type to a raw SQL query (tested on Entity Framework Core 6.0.9). Here I call the function with default parameters (in the author's case, the function would then have to accept a NULL value).
modelBuilder.Entity<AlbumsByGenre>()
.HasNoKey()
.ToSqlQuery("SELECT * FROM ufn_AlbumsByGenre_ITVF(NULL)");

Yes, just create a class that has ArtistName, AlbumName and Genre. However, the return type should not be an IQueryable, just a List<AlbumsByGenre>, an IEnumerable<AlbumsByGenre> or an ICollection.
IQueryable is a deferred query builder for a table or view.
SQL function results are not further queryable in SQL Server, so just drop the IQueryable.

Why does EF Core 3 not mark a collection as loaded when including .Query()?

In one of my applications I'm using EF Core 3.1 with lazy-loading proxies enabled. According to the documentation here and this StackOverflow answer, it seems like I should be able to explicitly load a collection and have that collection marked as loaded if I do something like:
await context.Entry(parent)
.Collection(p => p.Children)
.Query()
.Include(c => c.Grandchildren)
.Where(p => p.Id > 100)
.LoadAsync();
This should fetch a filtered subset of the child entities for the parent entity and pull back a collection of related entities on the children themselves.
At that point, I would expect that parent.Children is considered loaded by EF Core, such that I can access that collection later in my code and it won't try to go back to my database i.e. won't try to lazy-load it.
In my application I'm finding that that's not at all the case. In the screenshot below, you can see that the Order collection on the productConsultant entity isn't marked as loaded after I try to do something similar:
I can see it goes to the database and performs close to the query I specify (it includes the "where" condition but doesn't join on to the related entities):
SELECT [o].[Id],
[o].[Assignee],
[o].[CommissionAuthorisedInPeriod],
[o].[CommissionPaidInPeriod],
[o].[CsiMaxScore],
[o].[CsiScore],
[o].[CustomerName],
[o].[DateDelivered],
[o].[DateSigned],
[o].[DeliveredInPeriod],
[o].[EmailStatus],
[o].[HasZeroCsiScore],
[o].[IsFleetOrder],
[o].[IsSubPrime],
[o].[OrderStreamCheckpoint],
[o].[PaymentMethod],
[o].[Status],
[t].[Id],
[t].[Vehicle],
[t].[VehicleSaleType]
FROM [Order] AS [o]
LEFT JOIN
(
SELECT [o0].[Id],
[o0].[Vehicle],
[o0].[VehicleSaleType]
FROM [Order] AS [o0]
WHERE [o0].[VehicleSaleType] IS NOT NULL
AND [o0].[Vehicle] IS NOT NULL
) AS [t] ON [o].[Id] = [t].[Id]
WHERE ([o].[Assignee] = @__p_0)
AND ([o].[CommissionAuthorisedInPeriod] = @__activePeriod_1);
However, because it doesn't mark the collection as loaded, when I try to reference the Order collection within the ProductConsultant class, it goes off and does another database query for all the orders the product consultant has ever delivered, even though I've tried explicitly loading the order collection earlier with only a subset of orders:
public void PayCommissions(Period period)
{
var payableOrders = from o in Orders
where o.CommissionAuthorisedInPeriod == period
where !o.HasBeenPaid
select o;
// ...
}
SELECT [o].[Id],
[o].[Assignee],
[o].[CommissionAuthorisedInPeriod],
[o].[CommissionPaidInPeriod],
[o].[CsiMaxScore],
[o].[CsiScore],
[o].[CustomerName],
[o].[DateDelivered],
[o].[DateSigned],
[o].[DeliveredInPeriod],
[o].[EmailStatus],
[o].[HasZeroCsiScore],
[o].[IsFleetOrder],
[o].[IsSubPrime],
[o].[OrderStreamCheckpoint],
[o].[PaymentMethod],
[o].[Status],
[t].[Id],
[t].[Vehicle],
[t].[VehicleSaleType]
FROM [Order] AS [o]
LEFT JOIN
(
SELECT [o0].[Id],
[o0].[Vehicle],
[o0].[VehicleSaleType]
FROM [Order] AS [o0]
WHERE [o0].[VehicleSaleType] IS NOT NULL
AND [o0].[Vehicle] IS NOT NULL
) AS [t] ON [o].[Id] = [t].[Id]
WHERE [o].[Assignee] = @__p_0;
If I were to change the original query such that it didn't do anything but load the collection, then it does mark the collection as loaded, but this is pointless as it pulls back far too much data and doesn't allow me to include nested entity collections:
Like dozens of other problems I've encountered with EF Core 3, this is something that could probably be easily resolved with EF Core 5 (using filtered includes). Until I'm able to upgrade this project to target .NET 5, is there a way to get this to work with EF Core 3? Is there something I'm missing?
Update
Based on Stephen's answer, I've made the changes shown below. This allows me to fetch back only the relevant data from the database, and prevent EF Core from doing the second query later on:

The short of it is that EF Core's IsLoaded property is true if and only if it can make certain that all related entities are loaded. Projecting after the query invalidates this guarantee.
IsLoaded
true if all the related entities are loaded or the IsLoaded has been explicitly set to true
A potential workaround is to explicitly set this property to tell EF that in this context, you know better, and to not issue queries to reload the entities.

The reason is that after calling .Query() you call .Include(), which EF treats as a request for a new query. It throws out your earlier query filters, which is why it hits the database a second time.

Cannot Cast String to GUID in LINQ to Entities

The following code works fine in LinqPad:
(from u in AspNetUsers
let userOrders = (from uo in UserOrders where uo.UserId == new Guid(u.Id) select uo)
select u)
However, when I try to execute the exact same code from my MVC application, it gives me the following error:
var users = (from u in ctx.Users
let userOrders = (from uo in ctx.UserOrders where uo.UserId == new Guid(u.Id) select uo)
select u);
Only parameterless constructors and initializers are supported in LINQ to Entities.
I don't recall ever having any issues converting a string to a GUID in Linq to Entities before. Am I missing some reference?
I have already included all the references I could think of:
using System.Linq;
using System.Data.Entity;
Why does it work in LinqPad but not in MVC?
Edit: Oddly enough, this seems to work:
let userOrders = (from uo in ctx.UserOrders where uo.UserId.ToString() == u.Id select uo)

While you are inside a context, the object is not realized yet. Why are you not just using the built-in navigation properties anyway?
When you have a foreign key in a database or data structure, Entity Framework creates a 'navigation' property for it. With it you can get to child objects much more quickly. Depending on your EF options for lazy versus eager loading, navigations are often loaded automatically. 'Include' forces that navigation to be loaded. So if I just wanted the orders (a separate table) for a row in my person table, I would query the person and then get its child table. There is also the concept of a query being 'realized' in Entity Framework: until you materialize something with 'ToList' or similar, wiring up too much under the hood may not work.
static void Main(string[] args)
{
using (var context = new TesterEntities())
{
var peopleOrders = context.tePerson.Include("teOrder").First(p => p.PersonId == 1).teOrder.ToList();
peopleOrders.ForEach(x => Console.WriteLine($"{x.OrderId} {x.Description}"));
}
}

Ultimately, I just decided to go with the easier approach of casting my GUID .ToString() like so:
let userOrders = (from uo in ctx.UserOrders where uo.UserId.ToString() == u.Id select uo)
The reason is that the default implementation of AspNetUsers defines the column as varchar, not uniqueidentifier. To change that I would likely have to do something like what is described in "How to make EF-Core use a Guid instead of String for its ID/Primary key", which I am not presently interested in doing!

How to get an EF query to compile the most optimised SQL?

I'm fairly new to EF, and this is something that has been bugging me for a couple of days now:
I have a User entity. It has a parent WorkSpace, which has a collection of Users.
Each User also has a collection of children Schedule, in a User.Schedules property.
I'm navigating through the objects like this:
var query = myUser.WorkSpace.Users.SelectMany(u => u.Schedules);
When enumerating the results of query (myUser is an instance of User that has been loaded previously using .Find(userid)), I noticed that EF makes one query to the db for each User in WorkSpace.Users
How come EF doesn't get the results in one single query, starting from the primary key of myUser, and joining on this with all the tables involved?
If I do something else directly from the context such as this, it works fine though:
context.Users.Where(u => u.ID == userid).SelectMany(u => u.WorkSpace.Users.SelectMany(wu => wu.Schedules))
Is there something I'm doing wrong?

Let's take the first query:
var query = myUser.WorkSpace.Users.SelectMany(u => u.Schedules);
If you look at the type of the query variable, you'll see that it is IEnumerable<Schedule>, which means this is a regular LINQ to Objects query. Why? Because it starts from a materialized object and then accesses another object or collection through it, and so on. This, combined with EF's lazy loading feature, leads to the multiple database queries you are seeing.
If you do the same for the second query:
var query = context.Users.Where(u => u.ID == userid)
.SelectMany(u => u.WorkSpace.Users.SelectMany(wu => wu.Schedules));
you'll notice that the type of the query is now IQueryable<Schedule>, which means now you have a LINQ to Entities query. This is because neither context.Users nor other object/collections used inside the query are real objects - they are just metadata used to build, execute and materialize the query.
To recap, you are not doing anything wrong. This is how lazy loading works. If you don't care about the so-called N+1 query issue, you can use the first approach. If you do care, use the second.
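The difference between the two query kinds shows up in the static types alone. The following is a minimal sketch with hypothetical Member and Schedule classes; AsQueryable() over a list stands in for a DbSet here, which is an assumption made only so the example runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model classes, loosely mirroring User and Schedule from the question.
public class Schedule
{
    public string Name { get; set; }
}

public class Member
{
    public int Id { get; set; }
    public List<Schedule> Schedules { get; set; } = new List<Schedule>();
}

public static class QueryKindSketch
{
    // Starts from objects already in memory: the compiler binds to
    // Enumerable.SelectMany, i.e. LINQ to Objects.
    public static IEnumerable<Schedule> FromObjects(List<Member> members)
        => members.SelectMany(m => m.Schedules);

    // Starts from an IQueryable: the compiler binds to Queryable.Where and
    // Queryable.SelectMany, which only build an expression tree for the provider.
    public static IQueryable<Schedule> FromQueryable(IQueryable<Member> members, int id)
        => members.Where(m => m.Id == id).SelectMany(m => m.Schedules);

    public static void Main()
    {
        var members = new List<Member>
        {
            new Member { Id = 1, Schedules = { new Schedule { Name = "A" }, new Schedule { Name = "B" } } },
            new Member { Id = 2, Schedules = { new Schedule { Name = "C" } } }
        };

        Console.WriteLine(FromObjects(members).Count());
        Console.WriteLine(FromQueryable(members.AsQueryable(), 1).Count());
    }
}
```

With a real EF provider, the second shape is what allows the whole chain to be translated into one SQL statement, whereas the first shape walks navigation properties object by object.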

What is the difference between Joining two different DB Context using ToList() and .AsQueryable()?

Case 1:
I joined two different DB contexts by calling ToList() on both.
Case 2:
I also tried joining the first DB context with ToList() and the second with AsQueryable().
Both worked for me. All I want to know is the difference between those joins regarding performance and functionality. Which one is better?
var users = (from usr in dbContext.User.AsNoTracking()
select new
{
usr.UserId,
usr.UserName
}).ToList();
var logInfo= (from log in dbContext1.LogInfo.AsNoTracking()
select new
{
log.UserId,
log.LogInformation
}).AsQueryable();
var finalQuery= (from usr in users
join log in logInfo on usr.UserId equals log.UserId
select new
{
usr.UserName,
log.LogInformation
}).ToList();

I'll elaborate on the answer that Jehof gave in his comment. It is true that this join will be executed in memory, and there are two reasons why.
Firstly, this join cannot be performed in the database because you are joining an in-memory object (users) with a deferred query (logInfo). Based on that, it is not possible to generate a query that could be sent to a database. This means that before the actual join is performed, the deferred query is executed and all logs are retrieved from the database. To sum up, in this scenario two queries are executed in the database and the join happens in memory. It doesn't matter whether you use ToList + AsQueryable or ToList + ToList in this case.
Secondly, in your scenario this join can ONLY be performed in memory. Even if you use AsQueryable with both the first and the second context, it will not work. You will get a System.NotSupportedException with the message:
The specified LINQ expression contains references to queries that are associated with different contexts.
I wonder why you're using two DB contexts. Is it really needed? As I explained, because of that you lose the ability to take full advantage of deferred queries (lazy evaluation).
If you really have to use two DB contexts, I would consider adding some filters (WHERE conditions) to the queries that read users and logs from the DB. Why? For a small number of records there is no problem. However, for large amounts of data it is not efficient to perform joins in memory. That is what databases were created for.

It hasn't been explained yet why the statements actually work and why EF doesn't throw an exception that you can only use sequences of primitive types in a LINQ statement.
If you swap both lists ...
var finalQuery= (from log in logInfo
join usr in users on log.UserId equals usr.UserId
...
EF will throw
Unable to create a constant value of type 'User'. Only primitive types or enumeration types are supported in this context.
So why does your code work?
That will become clear if we convert your statement to method syntax (which the compiler does under the hood):
users.Join(logInfo, usr => usr.UserId, log => log.UserId,
(usr, log) => new
{
usr.UserName,
log.LogInformation
});
Since users is an IEnumerable, the extension method Enumerable.Join is resolved as the appropriate method. This method accepts an IEnumerable as the second list to be joined. Therefore, logInfo is implicitly cast to IEnumerable, so it runs as a separate SQL statement before it takes part in the join.
In the version from log in logInfo join usr ..., Queryable.Join is used. Now users becomes part of an IQueryable expression. This turns the whole statement into one expression that EF unsuccessfully tries to translate into one SQL statement.
Now a few words on
Which one is better?
The best option is the one that does just enough to make it work. That means:
You can remove AsQueryable(), because logInfo already is an IQueryable and it is cast to IEnumerable anyway.
You can replace ToList() by AsEnumerable(), because ToList() builds a redundant intermediate result, while AsEnumerable() only changes the compile-time type of users, without triggering its execution yet.
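The difference between ToList() and AsEnumerable() in terms of execution can be seen with a plain iterator. This is only a sketch: the Source method and its counter are hypothetical and stand in for a query that would otherwise hit the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Counts how many times the source has actually been enumerated.
    public static int EnumerationCount;

    // An iterator method: its body only runs once enumeration starts.
    public static IEnumerable<int> Source()
    {
        EnumerationCount++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    public static void Main()
    {
        var lazy = Source().AsEnumerable();   // nothing has been enumerated yet
        Console.WriteLine(EnumerationCount);  // prints 0

        var list = Source().ToList();         // enumerates immediately and buffers the results
        Console.WriteLine(EnumerationCount);  // prints 1

        var sum = lazy.Sum();                 // the deferred sequence is enumerated only now
        Console.WriteLine(EnumerationCount);  // prints 2
        Console.WriteLine(sum + list.Count);  // prints 9
    }
}
```

AsEnumerable() returns the same sequence typed as IEnumerable<T>, so later operators bind to the Enumerable methods while the underlying work is still deferred until something enumerates it.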

ToList()
Execute the query immediately
You will get all the elements ready in memory
AsQueryable()
lazy (execute the query later)
Parameter: Expression<Func<TSource, bool>>
Convert Expression into T-SQL (with specific provider), query remotely and load result to your application memory.
That’s why DbSet (in Entity Framework) also implements IQueryable, to allow efficient querying.
It does not load every record. E.g. with Take(5), it will generate a SELECT TOP 5 ... SQL statement in the background.
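How an IQueryable records operators such as Take instead of running them can be sketched as follows. An in-memory AsQueryable() source is used here as an assumption so the example runs without a database; with a real EF provider the recorded expression is what gets translated to SQL.

```csharp
using System;
using System.Linq;

public static class ComposedQuerySketch
{
    // Composes a filter and a limit; nothing is executed at this point,
    // the operators are only recorded in the query's expression tree.
    public static IQueryable<int> FirstFive(IQueryable<int> source)
        => source.Where(x => x > 2).Take(5);

    public static void Main()
    {
        var query = FirstFive(Enumerable.Range(1, 100).AsQueryable());

        // The expression tree contains both Where and Take as method calls.
        Console.WriteLine(query.Expression);

        // Enumerating executes the query; only five elements are produced.
        Console.WriteLine(string.Join(", ", query)); // prints 3, 4, 5, 6, 7
    }
}
```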
