IQueryable Intersect is currently not supported - c#

When I try to do this:
// data.Photos is an IEnumerable<Photo>. The comparer compares by Id.
List<Photo> inDb = db.Photos
.Intersect(data.Photos, new PhotoComparer())
.ToList();
I get an exception:
NotSupportedException: Could not parse expression
'value(Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryable`1[ReportViewer.Models.DbContexts.Photo]).Intersect(__p_0, __p_1)'
This overload of the method 'System.Linq.Queryable.Intersect' is currently not supported.
// This works
List<Photo> inDb = db.Photos
.ToList()
.Intersect(data.Photos, new PhotoComparer())
.ToList();
// But will it take a long time - or not ?
What do I need to do to use Intersect between an IQueryable and an in-memory IEnumerable collection?

Because of the custom comparer, trivial as its functionality may be, the framework currently cannot translate your statement to SQL (which I suspect you are using).
Next, it seems that you have an in-memory collection on which you want to perform this intersect.
So if you're wondering about speed: to get it working on the server you'll need to send your data to the database server and retrieve the rows based on the Ids.
So basically, you are looking for a way to perform an inner join, which would be the SQL equivalent of the intersect.
You could express that with the following LINQ query:
//disclaimer: from the top of my head
var list= from dbPhoto in db.Photos
join dataPhoto in data.Photos on dbPhoto.Id equals dataPhoto.Id
select dbPhoto;
This will not work though, since as far as I know EF isn't able to perform a join against an in-memory dataset.
So, alternatively you could:
fetch the data as IEnumerable (but yes, you'll be retrieving the whole set first)
use Contains; be careful though, if you're not using primitive types this can translate to a bunch of SQL OR statements
But basically it depends on the amount of data you're querying. You might want to reconsider your setup and try to query the data based on some ownership, such as the user, or by other means.
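The Contains option can be sketched roughly as follows. This is only an illustration under assumptions: the Photo class here is a stand-in for the one in the question, and PhotoLookup and FindExisting are made-up names. The idea is to reduce the in-memory photos to a list of primitive Ids first, so the filter is a Contains over plain ints rather than a custom comparer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Photo entity.
public class Photo
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public static class PhotoLookup
{
    // Returns the photos from dbPhotos whose Id also occurs in incoming.
    public static List<Photo> FindExisting(IQueryable<Photo> dbPhotos, IEnumerable<Photo> incoming)
    {
        // Reduce the in-memory objects to a list of primitive keys first.
        List<int> ids = incoming.Select(p => p.Id).Distinct().ToList();

        // A Contains over a list of ints is the kind of filter a SQL-backed
        // provider can typically translate to an IN (...) clause.
        return dbPhotos.Where(p => ids.Contains(p.Id)).ToList();
    }
}
```

To try it without a database, pass something like new List<Photo> { ... }.AsQueryable() as dbPhotos; against a real context you would pass db.Photos. How well it performs still depends on how many Ids end up in the list.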

Related

MongoDB and returning collections efficiently

I am very new to Mongo (this is actually day 1) and using the C# driver that is available for it. One thing that I want to know (as I am not sure how to word it in Google) is how does mongo handle executing queries when I want to grab a part of the collection.
What I mean by this is that I know that with NHibernate and EF Core, the query is first built and only fires when you materialize it, for example by going from IQueryable to IEnumerable, calling .ToList(), etc.
Ex:
//Query is fired when I call .ToList, until that point it is just building it
context.GetLinqQuery<MyObject>().Where(x => x.a == "blah").ToList();
However, with Mongo's examples it appears to me that if I want to grab a filtered result I will first need to get the collection, and then filter it down.
Ex:
var collection = _database.GetCollection<MyObject>("MyObject");
//Empty filter for ease of typing for example purposes
var filter = Builders<MyObject>.Filter.Empty;
var results = collection.Find(filter).ToList();
Am I missing something here? I did not see any overload of the GetCollection method that accepts a filter. Does this mean that it will first load the whole collection into memory, then filter it? Or will it still be building the query and only execute it once I call either .Find or .ToList on it?
I ask this because at work we have had situations where improper positioning of .ToList() would result in seriously weak performance. Apologies if this is not the right place to ask.
References:
https://docs.mongodb.com/guides/server/read_queries/
The equivalent to your context.GetLinqQuery<MyObject>() would be to use AsQueryable:
collection.AsQueryable().Where(x => x.a == "blah").ToList();
The above query will be executed server side* and is equivalent to:
collection.Find(Builders<MyObject>.Filter.Eq(x => x.a, "blah")).ToEnumerable().ToList();
* The docs state that:
Only LINQ queries that can be translated to an equivalent MongoDB query are supported. If you write a LINQ query that can’t be translated you will get a runtime exception and the error message will indicate which part of the query wasn’t supported.
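On the deferred-execution part of the question, the general idea can be seen without the MongoDB driver at all. The following is a minimal sketch using plain LINQ to Objects; DeferredDemo, BuildQuery and the Evaluations counter are made-up names for illustration. It only demonstrates that composing a query does no work until something enumerates it, not how the driver is implemented internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the filter predicate has actually run.
    public static int Evaluations;

    // Builds a filtered query; nothing is evaluated here.
    public static IEnumerable<int> BuildQuery(IEnumerable<int> source)
    {
        return source.Where(x =>
        {
            Evaluations++;          // side effect so execution is observable
            return x % 2 == 0;      // keep even numbers only
        });
    }
}
```

Calling BuildQuery leaves Evaluations untouched; the predicate only runs once the result is enumerated, for example by .ToList(). That is why the position of .ToList() in a chain matters so much.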

Linq query timing out, how to streamline query

Our front end UI has a filtering system that, in the back end, operates over millions of rows. It uses an IQueryable that is built up over the course of the logic, then executed all at once. Each individual UI component is ANDed together (for example, Dropdown1 and Dropdown2 will only return rows that have both of what is selected in common). This is not a problem. However, Dropdown3 has two types of data in it, and the checked items need to be ORed together, then ANDed with the rest of the query.
Due to the large amount of rows it is operating over, it keeps timing out. Since there are some additional joins that need to happen, it is somewhat tricky. Here is my code, with the table names replaced:
//The end list has driver ids in it--but the data comes from two different places. Build a list of all the driver ids.
driverIds = db.CarDriversManyToManyTable.Where(
cd =>
filter.CarIds.Contains(cd.CarId) //get driver IDs for each car ID listed in filter object
).Select(cd => cd.DriverId).Distinct().ToList();
driverIds = driverIds.Concat(
db.DriverShopManyToManyTable.Where(ds => filter.ShopIds.Contains(ds.ShopId)) //Get driver IDs for each Shop listed in filter object
.Select(ds => ds.DriverId)
.Distinct()).Distinct().ToList();
//Now we have a list solely of driver IDs
//The query operates over the Driver table. The query is built up like this for each item in the UI. Changing from Linq is not an option.
query = query.Where(d => driverIds.Contains(d.Id));
How can I streamline this query so that I don't have to retrieve thousands and thousands of IDs into memory, then feed them back into SQL?
There are several ways to produce a single SQL query. They all require keeping the parts of the query as IQueryable<T>, i.e. not using ToList, ToArray, AsEnumerable or similar methods that force them to be executed and evaluated in memory.
One way is to create a Union query containing the filtered Ids (which will be unique by definition) and use the join operator to apply it to the main query:
var driverIdFilter1 = db.CarDriversManyToManyTable
.Where(cd => filter.CarIds.Contains(cd.CarId))
.Select(cd => cd.DriverId);
var driverIdFilter2 = db.DriverShopManyToManyTable
.Where(ds => filter.ShopIds.Contains(ds.ShopId))
.Select(ds => ds.DriverId);
var driverIdFilter = driverIdFilter1.Union(driverIdFilter2);
query = query.Join(driverIdFilter, d => d.Id, id => id, (d, id) => d);
Another way could be using two OR-ed Any based conditions, which would translate to EXISTS(...) OR EXISTS(...) SQL query filter:
query = query.Where(d =>
db.CarDriversManyToManyTable.Any(cd => d.Id == cd.DriverId && filter.CarIds.Contains(cd.CarId))
||
db.DriverShopManyToManyTable.Any(ds => d.Id == ds.DriverId && filter.ShopIds.Contains(ds.ShopId))
);
You could try and see which one performs better.
The answer to this question is complex and has many facets that, individually, may or may not help in your particular case.
First of all, consider using pagination: .Skip(PageNum * PageSize).Take(PageSize). I doubt your user needs to see millions of rows at once in the front end. Show them only 100, or whatever other smaller number seems reasonable to you.
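As a rough sketch of that pagination idea (Paging and GetPage are made-up names, and pageNum is assumed to be zero-based), the page can be taken while the query is still an IQueryable, so only one page of rows is materialized. An explicit ordering is included because Skip and Take only give stable pages over an ordered query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class Paging
{
    // Returns a single page of an ordered query; pageNum is zero-based.
    public static List<T> GetPage<T, TKey>(
        IQueryable<T> query,
        Expression<Func<T, TKey>> orderBy,
        int pageNum,
        int pageSize)
    {
        return query
            .OrderBy(orderBy)            // stable order so pages do not shift
            .Skip(pageNum * pageSize)    // skip the earlier pages
            .Take(pageSize)              // keep only this page
            .ToList();                   // the query executes here
    }
}
```

In the question's setup this would be called on the fully built query as the last step, for example Paging.GetPage(query, d => d.Id, 0, 100).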
You've mentioned that you need to use joins to get the data you need. These joins can be done while forming your IQueryable (entity framework), rather than in-memory (linq to objects). Read up on join syntax in linq.
HOWEVER - performing explicit joins in LINQ is not the best practice, especially if you are designing the database yourself. If you are doing database first generation of your entities, consider placing foreign-key constraints on your tables. This will allow database-first entity generation to pick those up and provide you with Navigation Properties which will greatly simplify your code.
If you do not have any control or influence over the database design, however, then I recommend you construct your query in SQL first to see how it performs. Optimize it there until you get the desired performance, and then translate it into an entity framework linq query that uses explicit joins as a last resort.
To speed such queries up, you will likely need indexes on all of the "key" columns that you are joining on. The best way to figure out which indexes you need is to take the SQL query generated by your EF LINQ and bring it over to SQL Server Management Studio. From there, update the generated SQL to provide some predefined values for your @p parameters, just to make an example. Once you've done this, right click on the query and use either Display Estimated Execution Plan or Include Actual Execution Plan. If indexing can improve your query performance, there is a pretty good chance that this feature will tell you about it and even provide you with scripts to create the indexes you need.
It looks to me that using the instance versions of the LINQ extensions is creating several collections before you're done. Using the from statement versions should cut that down quite a bit:
driverIds = (from record in db.CarDriversManyToManyTable
where filter.CarIds.Contains(record.CarId)
select record.DriverId).Concat
(from record in db.DriverShopManyToManyTable
where filter.ShopIds.Contains(record.ShopId)
select record.DriverId).Distinct();
Also using the groupby extension would give better performance than querying each driver Id.

Linq: Method has no supported translation to SQL - but how to dump to memory?

I have a Linq query that reads from a SQL table, and one of the fields it returns comes from a custom function (in C#).
Something like:
var q = from my in MyTable
select new
{
ID = my.ID,
Amount = GetAmount(my.ID)
};
If I do a q.Dump() in LinqPad, it shows the results, which tells me that it runs the custom function without trying to send it to SQL.
Now I want to union this to another query, with:
var q1 = (from p in AnotherQuery.Union(q)...
and then I get the error that Method has no supported translation to SQL.
So, my logic tells me that I need to dump q in memory and then try to union to that. I've tried doing that with ToList() and creating a secondary query that populates itself from the List, but that leads to a long list of different errors. Am I on the right track, by trying to get q in memory and union on that, or are there better ways of doing this?
You can't use any custom functions in a LINQ query that gets translated - only the functions supported by the given LINQ provider. If you want your query to happen on the server, you need to stick with the supported functions (even if it sometimes means having to inline code that would otherwise be reused).
The difference between your two queries boils down to when (and where) the projection happens. In your first case, the data from MyTable is returned from the DB - in your sample, just the ID. Then, the projection happens on top of this - the GetAmount method is called in your application for each ID.
On the other hand, there's no such way for this to happen in your second query, since you're not using GetAmount in the final projection.
You either need to replace the custom function with an inlined query the provider understands, or refactor all your queries to use the supported functions in addition to whatever you need to do in-memory. There's no point in giving you any sample code, since it depends entirely on your actual query, and what you're really trying to query for.
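Purely to illustrate the shape of the in-memory route, and not the asker's real query, here is a small sketch. Everything in it is an assumption: AmountUnion and Combine are made-up names, GetAmount is a placeholder for the custom function, and both sources are reduced to (ID, Amount) tuples so that Union can compare rows by value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AmountUnion
{
    // Placeholder for the custom C# function from the question.
    public static decimal GetAmount(int id) => id * 10m;

    public static List<(int ID, decimal Amount)> Combine(
        IQueryable<int> myTableIds,
        IQueryable<(int ID, decimal Amount)> anotherQuery)
    {
        // AsEnumerable switches to LINQ to Objects, so GetAmount runs in the
        // application and is never handed to the provider for translation.
        var computed = myTableIds
            .AsEnumerable()
            .Select(id => (ID: id, Amount: GetAmount(id)));

        // The union is also done in memory; tuples compare by value,
        // so identical rows from both sides collapse into one.
        return anotherQuery.AsEnumerable().Union(computed).ToList();
    }
}
```

The cost is that both result sets are pulled into memory before they are combined, which is only reasonable when they are small.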

Linq query nhibernate; not supported exception

I'm fairly new to nHibernate having come from an EF background and I'm struggling with the following query :
_patientSearchResultModel = (from patient in _patientRepository.Query(patientSearch.BuildPatientSpecification())
join admission in _admissionRepository.Query(patientSearch.BuildAdmissionSpecification())
on patient.Id equals admission.Patient.Id
orderby admission.AdmissionDate
select new PatientSearchResultModel(patient.Id,
admission.Id,
false,
_phaseTypeMapper.GetPhaseTypeModel(admission.PhaseType),
patient.Last, patient.First,
admission.InPatientLocation,
admission.AdmissionDate,
admission.DischargeDate,
admission.RRI,
null,
admission.CompletionStatus,
admission.FollowupStatus)).ToList();
The intent of this query is to allow users to filter the two queries on parameters built up using the two Build???Specification functions and return the resultset. There could be many admission records and I would only like one PatientSearchResultModel per patient object, with the admission object being the newest one by Admission Date.
These objects are coming from nHibernate and it keeps returning a Not Supported exception. There is also an association between Patient and Admissions, thus: Patient.Admissions, but I couldn't figure out how to then add the query filters returned from the Build???Specification functions.
I'd be really grateful if someone could point me in the right direction; am I up against the Linq provider implementation here in nHibernate and need to move to Criteria or is it my Linq query ?
If anyone has any links or suggestions for good books or other learning materials in this area that would also be really helpful too.
I see several potential problems:
1) If you're using NHibernate 2.x + Linq2NHibernate, explicit joins like that are not supported; in other versions they're just considered a smell.
2) I don't think NHibernate supports calling parameterized constructors in select clauses.
3) I'm very sure NHibernate does not support calling instance methods in the select lambda.
I'd suggest using the lambda syntax and SelectMany to alleviate potential join issues. Points #2 & #3 can be solved by projecting into an anonymous type, calling AsEnumerable then projecting into your model type.
Overall I'd suggest restructuring your code like:
var patientSpec = patientSearch.BuildPatientSpecification();
var admissionSpec = patientSearch.BuildAdmissionSpecification();
_patientSearchResultModel = _patientRepository.Where(patientSpec)
.SelectMany(p=>p.Admissions).Where(admissionSpec)
.Select(a=> new {
PatientId = a.Patient.Id,
AdmissionId = a.Id,
a.PhaseType,
a.Patient.Last,
a.Patient.First,
a.InPatientLocation,
a.AdmissionDate,
a.DischargeDate,
a.RRI,
a.CompletionStatus,
a.FollowupStatus
}).AsEnumerable()
.Select(x=> new PatientSearchResultModel(x.PatientId, x.AdmissionId ...))
.ToList();
Divide your query into parts and check which part runs and which doesn't.
My take on this is that select new ... is not supported in Linq to nHibernate.
I would recommend using something else, because it is simply too immature and feature-poor to use seriously.
As with most popular LINQ-to-Database query providers, NHibernate will try to translate the whole query into a SQL statement to run against the database. This requires that all elements of your query are possible to express in the SQL flavour you're using.
In your query, the select new statement cannot be expressed in SQL, because you're making a call to the constructor of your PatientSearchResultModel class and are making a call to a GetPhaseTypeModel method.
You should restructure your query to express what you want to execute on the SQL database, then call AsEnumerable() to force the remainder of the query to be evaluated in-memory. After that call, you can call the constructor of your class and any .NET methods, and they will be executed as native code.
This query is too complex to express in Linq. It would also give a wrong result in the end (if a Patient has more than one admission record, the result would contain duplicate entries).
I see two steps to a solution:
1) At the development stage, use an in-memory query. So, fetch the Patients using ToList() first (the db is queried at this moment). Some predicates (Patient filters like MRN, First, Last) could be applied at this stage.
Then do the search in memory. It is not fast, but it is a working solution. Mark it for refactoring so it can be optimized later.
2) Finally, use NHibernate IQuery (ISQLQuery) and build the sql query manually to make sure it works as expected and runs fast enough on the SQL Server side. This is just a read-only query and does not require the NHibernate query engine (Linq to NHibernate) at all.

NHibernate - Equivalent of CountDistinct projection using LINQ

I'm in the midst of trying to replace the Criteria queries I'm using for a multi-field search page with LINQ queries using the new LINQ provider. However, I'm running into a problem getting record counts so that I can implement paging. I'm trying to achieve a result equivalent to that produced by a CountDistinct projection from the Criteria API using LINQ. Is there a way to do this?
The Distinct() method provided by LINQ doesn't seem to behave the way I would expect, and appending ".Distinct().Count()" to the end of a LINQ query grouped by the field I want a distinct count of (an integer ID column) seems to return a non-distinct count of those values.
I can provide the code I'm using if needed, but since there are so many fields, it's pretty long, so I didn't want to crowd the post if it wasn't needed.
Thanks!
I figured out a way to do this, though it may not be optimal in all situations. Just doing a .Distinct() on the LINQ query does, in fact, produce a "distinct" in the resulting SQL query when used without .Count(). If I cause the query to be enumerated by using .Distinct().ToList() and then use the .Count() method on the resulting in-memory collection, I get the result I want.
This is not exactly equivalent to what I was originally doing with the Criteria query, since the counting is actually being done in the application code, and the entire list of IDs must be sent from the DB to the application. In my case, though, given the small number of distinct IDs, I think it will work, and won't be too much of a performance bottleneck.
I do hope, however, that a true CountDistinct() LINQ operation will be implemented in the future.
You could try selecting the column you want a distinct count of first. It would look something like: Select(p => p.id).Distinct().Count(). As it stands, you're applying Distinct to the entire object, which compares object references rather than the actual values.
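A small sketch of the difference, using LINQ to Objects rather than NHibernate (the Order class and the DistinctCount helper are made up for illustration): projecting to the key column before Distinct counts distinct values, while Distinct on the objects themselves counts every instance, because class instances compare by reference by default.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; CustomerId plays the role of the integer ID column.
public class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
}

public static class DistinctCount
{
    // Distinct over the projected column: counts distinct CustomerId values.
    public static int ByColumn(IQueryable<Order> orders) =>
        orders.Select(o => o.CustomerId).Distinct().Count();

    // Distinct over whole objects: in memory this compares references.
    public static int ByObject(IQueryable<Order> orders) =>
        orders.Distinct().Count();
}
```

With three orders belonging to two customers, ByColumn gives 2 and ByObject gives 3. Whether a given provider turns the first form into a single COUNT(DISTINCT ...) statement is something to verify against the generated SQL.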
