deferred execution of where in sql

deferred execution of where in sql - c#

i have an editor where people can create custom sql-queries. those queries are NHibernate.IQuery objects.
in our project we are using NHibernate.Linq.LinqExtensionMethods.Query to get an IQueryable object which can be used to apply filters that are executed deferred (everything on the DB).
now i want to create another filter that is based on the custom sql-queries.
what i want is something like this:
queryable.Where(x=> <sql-query contains x>)
the only thing i can do right now is execute the sql-query beforehand and then filter the queryable with the resulting list of elements.
IList<T> elements = query.List<T>();
queryable.Where(x => elements.Contains(x)).ToList();
the problem with that approach is, that the list of elements can be huge. if its possible, i would like to perform the whole statement directly on the database, so that i dont have to transfer all objects to my application, then send all objects back to the database as the filter-parameters...
edit: the first query (the one yielding the list of elements to check for in the second query) is constructed from a plain sql-string as follows:
ISQLQuery sqlQuery = CurrentSession.CreateSQLQuery(myQueryString);//create query
sqlQuery.AddEntity(typeof (T)); //set result type
IQuery query = sqlQuery.SetReadOnly(true);

If you have id's, then you should select them in your query, and compare it to the corresponding id column in the second query.
var elementIdsToUseInContain = query
.Select(x => x.YourIdProperty);
var result = queryable
.Where(x => elementIdsToUseInContain.Contains(x.YourIdPropertyToCompareTo))
.ToList();
Note: I haven't tested it, but it is fairly standard Linq. It depends on if NHibernate supports it, which i do not know.

Related

Split a string in a linq select expression

The geoStartLoc holds the string in this format "54.5,44.5". I am trying to split and store the results in lat and longitude. I am getting the following error
LINQ to Entities does not recognize the method 'System.String[] Split(Char[])' method, and this method cannot be translated into a store expression. (select statement)
var data = from UberTrip in db.UberTrips
group UberTrip by new { UberTrip.startLoc, UberTrip.geoStartLoc }
into startLocGroup
select new LocationGroup() {
startLocation = startLocGroup.Key.startLoc,
latitude = startLocGroup.Key.geoStartLoc.Split(',').Count().ToString(),
//longitude= startLocGroup.Key.
countTrips = startLocGroup.Count()
};

This is a schema problem in your database. It's bad to ever store delimited values in a column. You should really have two columns: one for geoLatitude, and one for geoLongitude.
But since you probably can't make that change on your own, what you'll need to do is make sure the data is pulled down to your program's code and then split it there. Right now, linq is trying to take the expression tree created by this code and convert it to an SQL query, and it can't do it for the Split() call because not all supported database targets have an analogous Split method available. You need to save that part for after the data is loaded to memory in your program.
To accomplish this, just retrieve the full geoStartLoc string (into an anonymous type if you have to), use .ToList() to force the query compile and retrieve all the data, and then use a .Select() to convert to your LocationGroup objects.

Whenever you get this error message you are performing a query AsQueryable instead of AsEnumerable.
The main difference is that AsQueryable will usually be performed in another process like a database, while AsEnumerable will be performed in-memory in your process.
An IEnumerable object knows how to create an Enumerator. An Enumerator, knows how to do two things: "Give me the first element of your sequence", and "Give me the next element of your sequence (or null if there is no next element).
If your query is AsEnumerable it knows can use all classes and data from your process to create the enumerator. Therefore it can call functions like String.Split.
The AsQueryable does not hold all data to create the enumerator, it holds an Expression and a Provider. The Provider knows where the query needs to be performed on (usually a database, but it can also be a json string or a web service), and it knows how to translate the Expression into a format that the executor understands. In your case, the Provider knows how to translate the Expression into a SQL statement suitable for your database.
Of course SQL does not know your own defined functions. Although there are a lot of .NET functions that can be translated into SQL, not every function can. String.Split is one of them.
See: Supported and Unsupported LINQ Methods (LINQ to Entities)
To solve your problems you could bring the queried data to local memory. This is done using the extension function Enumerable.AsEnumerable(). After that you can use your sequence as if it is an IEnumerable.
The disadvantage of AsEnumerable is that it brings the queried data to local memory. This would be a waste if you remove a lot of this data to create your end result. Therefore make sure that you do your AsEnumerable with not too much data.
Luckily you need the Split in your final Select, so all data you need as input of your final select is thrown away, unless you will do a Skip / Take / FirstOrDefault / etc. In that case it is better to limit your Select by doing your Skip / Take etc before AsEnumerable().
I'm more familiar with MethodSyntax.
Your query divided into smaller steps (if desired make one LINQ):
// still AsQueryable:
var startLocGroups = db.UberTrips
.GroupBy(uberTrip => new {uberTip.startLoc, uberTrip.geoStartLoc)
// make asEnumerable
var localStartLocGroups = startLocGroups.AsEnumerable();
// now you can do your Select
var result = localStartLocGroups
.Select(group => new LocationGroup()
{
startLocation = startLocGroup.Key.startLoc,
latitude = group.Key.geoStartLoc.Split(',')
.Count()
.ToString(),
//longitude= sgroup.Key.
countTrips = group.Count(),
});

Since that isn't supported by the underlying Linq to Entities you'll have to do it after you get the data but before you return the results.
This way you do the query to get the data without invoking the string split function and then modify the results into the proper format.
var data = from UberTrip in db.UberTrips
group UberTrip by new { UberTrip.startLoc, UberTrip.geoStartLoc } into startLocGroup
select new {
startLocation = startLocGroup.Key.startLoc,
geoStartLoc = startLocGroup.Key.geoStartLoc,
countTrips = startLocGroup.Count()
};
return data.Select(trip => new LocationGroup() {
startLocation = trip.startLocation,
latitude = trip.geoStartLoc.Split(',')[0],
longitude= trip.geoStartLoc.Split(',')[1],
countTrips = trip.countTrips
}).ToList();

Linq query timing out, how to streamline query

Our front end UI has a filtering system that, in the back end, operates over millions of rows. It uses a an IQueryable that is built up over the course of the logic, then executed all at once. Each individual UI component is ANDed together (for example, Dropdown1 and Dropdown2 will only return rows that have both of what is selected in common). This is not a problem. However, Dropdown3 has has two types of data in it, and the checked items need to be ORd together, then ANDed with the rest of the query.
Due to the large amount of rows it is operating over, it keeps timing out. Since there are some additional joins that need to happen, it is somewhat tricky. Here is my code, with the table names replaced:
//The end list has driver ids in it--but the data comes from two different places. Build a list of all the driver ids.
driverIds = db.CarDriversManyToManyTable.Where(
cd =>
filter.CarIds.Contains(cd.CarId) && //get driver IDs for each car ID listed in filter object
).Select(cd => cd.DriverId).Distinct().ToList();
driverIds = driverIds.Concat(
db.DriverShopManyToManyTable.Where(ds => filter.ShopIds.Contains(ds.ShopId)) //Get driver IDs for each Shop listed in filter object
.Select(ds => ds.DriverId)
.Distinct()).Distinct().ToList();
//Now we have a list solely of driver IDs
//The query operates over the Driver table. The query is built up like this for each item in the UI. Changing from Linq is not an option.
query = query.Where(d => driverIds.Contains(d.Id));
How can I streamline this query so that I don't have to retrieve thousands and thousands of IDs into memory, then feed them back into SQL?

There are several ways to produce a single SQL query. All they require to keep the parts of the query of type IQueryable<T>, i.e. do not use ToList, ToArray, AsEnumerable etc. methods that force them to be executed and evaluated in memory.
One way is to create Union query containing the filtered Ids (which will be unique by definition) and use join operator to apply it on the main query:
var driverIdFilter1 = db.CarDriversManyToManyTable
.Where(cd => filter.CarIds.Contains(cd.CarId))
.Select(cd => cd.DriverId);
var driverIdFilter2 = db.DriverShopManyToManyTable
.Where(ds => filter.ShopIds.Contains(ds.ShopId))
.Select(ds => ds.DriverId);
var driverIdFilter = driverIdFilter1.Union(driverIdFilter2);
query = query.Join(driverIdFilter, d => d.Id, id => id, (d, id) => d);
Another way could be using two OR-ed Any based conditions, which would translate to EXISTS(...) OR EXISTS(...) SQL query filter:
query = query.Where(d =>
db.CarDriversManyToManyTable.Any(cd => d.Id == cd.DriverId && filter.CarIds.Contains(cd.CarId))
||
db.DriverShopManyToManyTable.Any(ds => d.Id == ds.DriverId && filter.ShopIds.Contains(ds.ShopId))
);
You could try and see which one performs better.

The answer to this question is complex and has many facets that, individually, may or may not help in your particular case.
First of all, consider using pagination. .Skip(PageNum * PageSize).Take(PageSize) I doubt your user needs to see millions of rows at once in the front end. Show them only 100, or whatever other smaller number seems reasonable to you.
You've mentioned that you need to use joins to get the data you need. These joins can be done while forming your IQueryable (entity framework), rather than in-memory (linq to objects). Read up on join syntax in linq.
HOWEVER - performing explicit joins in LINQ is not the best practice, especially if you are designing the database yourself. If you are doing database first generation of your entities, consider placing foreign-key constraints on your tables. This will allow database-first entity generation to pick those up and provide you with Navigation Properties which will greatly simplify your code.
If you do not have any control or influence over the database design, however, then I recommend you construct your query in SQL first to see how it performs. Optimize it there until you get the desired performance, and then translate it into an entity framework linq query that uses explicit joins as a last resort.
To speed such queries up, you will likely need to perform indexing on all of the "key" columns that you are joining on. The best way to figure out what indexes you need to improve performance, take the SQL query generated by your EF linq and bring it on over to SQL Server Management Studio. From there, update the generated SQL to provide some predefined values for your #p parameters just to make an example. Once you've done this, right click on the query and either use display estimated execution plan or include actual execution plan. If indexing can improve your query performance, there is a pretty good chance that this feature will tell you about it and even provide you with scripts to create the indexes you need.

It looks to me that using the instance versions of the LINQ extensions is creating several collections before you're done. using the from statement versions should cut that down quite a bit:
driveIds = (from var record in db.CarDriversManyToManyTable
where filter.CarIds.Contains(record.CarId)
select record.DriverId).Concat
(from var record in db.DriverShopManyToManyTable
where filter.ShopIds.Contains(record.ShopId)
select record.DriverId).Distinct()
Also using the groupby extension would give better performance than querying each driver Id.

How can I get a nested IN clause with a linq2sql query?

I am trying to implement some search functionality within our app and have a situation where a User can select multiple Topics from a list and we want to return all activities that match at least one of the selected Topics. Each Activity can have 0-to-many Topics.
I can write a straight SQL query that gives me the results I want like so:
SELECT *
FROM dbo.ACTIVITY_VERSION av
WHERE ACTIVITY_VERSION_ID IN (
SELECT ACTIVITY_VERSION_ID
FROM dbo.ACTIVITY_TOPIC at
WHERE at.TOPIC_ID IN (3,4)
)
What I can't figure out is how to write a LINQ query (we are using Linq to Sql) that returns the same results.
I've tried:
activities.Where(x => criteria.TopicIds.Intersect(x.TopicIds).Any());
this works if activities is a list of in memory objects (i.e. a Linq to Objects query), but I get an error if I try to use the same code in a query that hits the database. The error I receive is:
Local sequence cannot be used in LINQ to SQL implementations of query operators except the Contains operator.
I believe that this means that Linq to Sql doesn't know how to translate either Intersect or Any (or possibly both). If that is the case, I understand why it isn't working, but I still don't know how to make it do what I want it to and my Google-fu has not provided me with anything that works.

Haven't tested it. But this is how you ll go about it.
List<int> IDs = new List<int>();
IDs.Add(3);
IDs.Add(4);
var ACTIVITY_VERSION_IDs = ACTIVITY_TOPIC
.Where(AT => IDs.Contains(AT.TOPIC_ID))
.Select(AT=>AT.ACTIVITY_VERSION_ID)
var results = ACTIVITY_VERSION
.Where(AV => ACTIVITY_VERSION_IDs.Contains(AV.ACTIVITY_VERSION_ID))

Differences when using IEnumerable and IQueryable as a type of ObjectSet

As I understand it when I use LINQ extension methods (with lambda expression syntax) on IQueryable that is in the fact instance of ObjectSet they are translated into LINQ to SQL queries. What I mean is that command
IQueryable<User> users = db.UserSet;
var users32YearsOld = users.Where(user => user.Age == 32);
is exactly the same as
IQueryable<User> users = db.UserSet;
var users32YearsOld = from user in users where user.Age == 32 select user;
So non of them hits database until they users32YearsOld are enumerated in for cycle or such. (Hope I understand this correctly).
But what is going to happen if I don't mask that ObjectSet as IQueryable but as IEnumerable ? So if the type of it is IEnumerable ?
IEnumerable<User> users = db.UserSet;
var users32YearsOld = users.Where(user => user.Age == 32);
Is it going to hit the database immediately (if so then when ? Right on the first line or on the second) ? Or is it going to behave as the previous command that is will not hit database until users32YearsOld is enumerated ? Will there be any difference if I use following instead ?
IEnumerable<User> users = db.UserSet;
var users32YearsOld = from user in users where user.Age == 32 select user;
Thank you

Undeleting my answer because I just tested it and it works exactly as I described:
None of mentioned queries will hit the database because there was no enumeration. The difference between IQueryable query and IEnumerable query is that in the case of IQueryable the filtering will be executed on the database server whereas in the case of IEnumerable all objects will be loaded from the database to a memory and the filtering will be done in .NET code (linq-to-objects). As you can imagine that is usually performance killer.
I wrote simple test in my project:
[TestMethod]
public void Test()
{
// ObjectQuery<Department> converted ot IEnumerable<Department>
IEnumerable<Department> departmetns = CreateUnitOfWork().GetRepository<Department>().GetQuery();
// No query execution here - Enumerable has also deffered exection
var query = departmetns.Where(d => d.Id == 1);
// Queries ALL DEPARTMENTS here and executes First on the retrieved result set
var result = departmetns.First();
}

Here's a simple explanation:
IEnumerable<User> usersEnumerable = db.UserSet;
IQueryable<User> usersQueryable = db.UserSet;
var users = /* one of usersEnumerable or usersQueryable */;
var age32StartsWithG = users.Where(user => user.Age == 32)
.Where(user => user.Name.StartsWith("G");
If you use usersEnumerable, when you start enumerating over it, the two Wheres will be run in sequence; first the ObjectSet will fetch all objects and the objects will be filtered down to those of age 32, and then these will be filtered down to those whose name starts with G.
If you use usersQueryable, the two Wheres will return new objects which will accumulate the selection criteria, and when you start enumerating over it, it will translate all of the criteria to a query. This makes a noticeable difference.
Normally, you don't need to worry, since you'll either say var users or ObjectSet users when you declare your variable, which means that C# will know that you are interested in invoking the most specific method that's available on ObjectSet, and the IQueryable query operator methods (Where, Select, ...) are more specific than the IEnumerable methods. However, if you pass around objects to methods that take IEnumerable parameters, they might end up invoking the wrong methods.
You can also use the way this works to your advantage by using the AsEnumerable() and AsQueryable() methods to start using the other approach. For example, var groupedPeople = users.Where(user => user.Age > 15).AsEnumerable().GroupBy(user => user.Age); will pull down the right users with a database query and then group the objects locally.
As other have said, it's worth repeating that nothing happens until you start enumerating the sequences (with foreach). You should now understand why it couldn't be any other way: if all results were retrieved at once, you couldn't build up queries to be translated into a more efficient query (like an SQL query).

The difference is that IEnumerable performs the filters if they are more, one at a time. For example, from 100 elements will output 20 by the first filter and then will filter second time the needed 10. It will make one query to the database but will download unnecessary data. Using IQueryable will download again with one query but only the required 10 items. The following link gives some excellent examples of how these queries work:
https://filteredcode.wordpress.com/2016/04/29/ienumerable-vs-iqueryable-part-2-practical-questions/

You are correct about IQueryable. As for IEnumerable, it would hit the database immediately upon assigning IEnumerable user.
There is no real difference between using Linq Extensions vs. syntax in the example you provided. Sometimes one or the other will be more convenient (see linq-extension-methods-vs-linq-syntax), but IMO it's more about personal preference.

NHibernate Criteria with multiple results

Using HQL, you can build a query like the following which will return an array of objects for each row selected by simply using a comma delimited list in the select clause:
select mother, offspr, mate.Name
from Eg.DomesticCat as mother
inner join mother.Mate as mate
left outer join mother.Kittens as offspr
How would you do this with an ICriteria? Specifically, I am trying to build a paged query which also returns the total row count in the same query.
Ayende shows how you can do this with HQL (http://ayende.com/Blog/archive/2007/04/27/Paged-data--Count-with-NHibernate-The-really-easy-way.aspx). I need to replicate this using an ICriteria.

I would presume that the answer lies in prefetching the associations that you want to. That way you can fetch the needed part of your object graph in a single shot. You would do this in criteria queries like so.
ICriteria query = session.CreateCriteria(typeof (Cat))
.SetFetchMode("Mate", FetchMode.Join)
.SetFetchMode("Kittens", FetchMode.Join);
IList<Cat> results = query.List<Cat>();
This will give you back a list of cats with both the Mate and the Kittens prepopulated. You can then navigate to these properties without incurring an N+1 penalty. If you need a more flattened result I'd do it using linq, perhaps like this.
var results = query.List<Cat>()
.Select((c) => new {mother = c, mate = c.Mate, offspr = c.Kittens});
This will give you back a flattened list of anonymous types with the given properties. This will work if all you need is prefetching the object graph. However if you need to prefetch things such as counts or sums then you will need to examine the Projections and Alias parts of Criteria queries.
One more thing. If you are trying to exactly duplicate your above query, you can do so like this.
ICriteria query = session.CreateCriteria(typeof (Cat), "mother")
.CreateAlias("Mate", "mate", JoinType.InnerJoin)
.CreateAlias("Kittens", "offspr", JoinType.LeftOuterJoin)
.SetResultTransformer(Transformers.AliasToEntityMap);
This will return you basically the same as your hql query, however instead of using an indexed list it will use a dictionary that maps the alias to the entity.

You can convert your criteria to a count criteria by this way
DetachedCriteria countCriteria = CriteriaTransformer.TransformToRowCount(criteria); // criteria = your criteria
rowCount = countCriteria.GetExecutableCriteria(session).UniqueResult<int>();

Ayende also added Multi Criteria support: http://ayende.com/Blog/archive/2007/05/20/NHibernate-Multi-Criteria.aspx. That's in the 2.0 release, apparently.

You can also do paging by manipulating the resultset on the Criteria:
ICriteria query = session.CreateCriteria(typeof (Cat), "mother")
query.SetFirstResult(x); // x = page start
query.SetMaxResults(y); // y = page size
List results = query.List();

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.