Simple Linq to SQL Compiled.query for count - c#

I have a function that does a database count every minute and I would like to compile the query to do so. The table is being written to (and deleted from) by multiple sources constantly and I want to gather results on the number of unique rows in use throughout the day.
My normal linq query is:
var query =
from result in dcontext.constantly_changing_table
select result.used_transaction_id;
// I'm looking for the count of unique used_transaction_id's
Console.WriteLine((query.Distinct()).Count());
How do I turn this into a compiled query?
Background: I've been doing LINQ for a week and I can write complicated join statements with it. This is day one on compiled queries and I'm a bit lost. I'm not so interested in whether this may (or may not) improve performance on the query; I just want to see a working compiled query that I can build on.
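A minimal sketch of what a compiled version of this count might look like, using CompiledQuery.Compile from System.Data.Linq. The names MyDataContext, ChangingRow and the id column are assumptions made up for illustration (in a real project the context and entity classes would normally come from the .dbml designer), and the sketch needs a reachable database before it returns anything.

```csharp
using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Hypothetical entity; the real mapping would come from your designer file.
[Table(Name = "constantly_changing_table")]
public class ChangingRow
{
    [Column(IsPrimaryKey = true)] public int id;   // assumed key column
    [Column] public int used_transaction_id;
}

// Hypothetical typed context exposing the table from the question.
public class MyDataContext : DataContext
{
    public MyDataContext(string connectionString) : base(connectionString) { }

    public Table<ChangingRow> constantly_changing_table
    {
        get { return GetTable<ChangingRow>(); }
    }
}

public static class Queries
{
    // Compiled once and stored in a static field; every call reuses the translation.
    // The data context is passed in as the first argument of the delegate.
    public static readonly Func<MyDataContext, int> DistinctTransactionCount =
        CompiledQuery.Compile((MyDataContext dc) =>
            dc.constantly_changing_table
              .Select(r => r.used_transaction_id)
              .Distinct()
              .Count());
}
```

It would then be called once a minute as Console.WriteLine(Queries.DistinctTransactionCount(dcontext)); where dcontext is an instance of the typed context. Extra parameters (for example a date cut-off) can be added as further lambda arguments after the context.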

Related

Linq query timing out, how to streamline query

Our front end UI has a filtering system that, in the back end, operates over millions of rows. It uses an IQueryable that is built up over the course of the logic, then executed all at once. Each individual UI component is ANDed together (for example, Dropdown1 and Dropdown2 will only return rows that have both of what is selected in common). This is not a problem. However, Dropdown3 has two types of data in it, and the checked items need to be ORed together, then ANDed with the rest of the query.
Due to the large amount of rows it is operating over, it keeps timing out. Since there are some additional joins that need to happen, it is somewhat tricky. Here is my code, with the table names replaced:
//The end list has driver ids in it--but the data comes from two different places. Build a list of all the driver ids.
driverIds = db.CarDriversManyToManyTable.Where(
cd =>
filter.CarIds.Contains(cd.CarId) //get driver IDs for each car ID listed in filter object
).Select(cd => cd.DriverId).Distinct().ToList();
driverIds = driverIds.Concat(
db.DriverShopManyToManyTable.Where(ds => filter.ShopIds.Contains(ds.ShopId)) //Get driver IDs for each Shop listed in filter object
.Select(ds => ds.DriverId)
.Distinct()).Distinct().ToList();
//Now we have a list solely of driver IDs
//The query operates over the Driver table. The query is built up like this for each item in the UI. Changing from Linq is not an option.
query = query.Where(d => driverIds.Contains(d.Id));
How can I streamline this query so that I don't have to retrieve thousands and thousands of IDs into memory, then feed them back into SQL?
There are several ways to produce a single SQL query. All they require is keeping the parts of the query as IQueryable<T>, i.e. not using methods such as ToList, ToArray, or AsEnumerable, which force them to be executed and evaluated in memory.
One way is to create a Union query containing the filtered Ids (which will be unique by definition) and use the join operator to apply it to the main query:
var driverIdFilter1 = db.CarDriversManyToManyTable
.Where(cd => filter.CarIds.Contains(cd.CarId))
.Select(cd => cd.DriverId);
var driverIdFilter2 = db.DriverShopManyToManyTable
.Where(ds => filter.ShopIds.Contains(ds.ShopId))
.Select(ds => ds.DriverId);
var driverIdFilter = driverIdFilter1.Union(driverIdFilter2);
query = query.Join(driverIdFilter, d => d.Id, id => id, (d, id) => d);
Another way could be using two OR-ed Any based conditions, which would translate to EXISTS(...) OR EXISTS(...) SQL query filter:
query = query.Where(d =>
db.CarDriversManyToManyTable.Any(cd => d.Id == cd.DriverId && filter.CarIds.Contains(cd.CarId))
||
db.DriverShopManyToManyTable.Any(ds => d.Id == ds.DriverId && filter.ShopIds.Contains(ds.ShopId))
);
You could try and see which one performs better.
The answer to this question is complex and has many facets that, individually, may or may not help in your particular case.
First of all, consider using pagination: .Skip(PageNum * PageSize).Take(PageSize). I doubt your user needs to see millions of rows at once in the front end. Show them only 100, or whatever other smaller number seems reasonable to you.
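As a rough illustration of the paging expression, here is a small sketch that runs against an in-memory stand-in for the table. The Page helper and the sample data are hypothetical; the explicit OrderBy is there because paging only makes sense over a defined order (and Entity Framework rejects Skip on unsorted input).

```csharp
using System;
using System.Linq;

public static class PagingDemo
{
    // Returns one page of an already ordered query; pageNum is zero-based.
    public static IQueryable<T> Page<T>(IOrderedQueryable<T> ordered, int pageNum, int pageSize)
    {
        return ordered.Skip(pageNum * pageSize).Take(pageSize);
    }

    public static void Main()
    {
        // In-memory stand-in for a large table (hypothetical data).
        var rows = Enumerable.Range(1, 1000).AsQueryable();

        // Nothing is enumerated until ToList() is called.
        var page = Page(rows.OrderBy(r => r), 2, 100).ToList();

        Console.WriteLine(page.Count);   // 100
        Console.WriteLine(page.First()); // 201
        Console.WriteLine(page.Last());  // 300
    }
}
```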
You've mentioned that you need to use joins to get the data you need. These joins can be done while forming your IQueryable (entity framework), rather than in-memory (linq to objects). Read up on join syntax in linq.
HOWEVER - performing explicit joins in LINQ is not the best practice, especially if you are designing the database yourself. If you are doing database first generation of your entities, consider placing foreign-key constraints on your tables. This will allow database-first entity generation to pick those up and provide you with Navigation Properties which will greatly simplify your code.
If you do not have any control or influence over the database design, however, then I recommend you construct your query in SQL first to see how it performs. Optimize it there until you get the desired performance, and then translate it into an entity framework linq query that uses explicit joins as a last resort.
To speed such queries up, you will likely need to index all of the "key" columns that you are joining on. The best way to figure out which indexes you need is to take the SQL query generated by your EF LINQ and bring it over to SQL Server Management Studio. From there, update the generated SQL to provide some predefined values for your @p parameters, just to make an example. Once you've done this, right click on the query and use either Display Estimated Execution Plan or Include Actual Execution Plan. If indexing can improve your query performance, there is a pretty good chance that this feature will tell you about it and even provide you with scripts to create the indexes you need.
It looks to me as though using the instance versions of the LINQ extensions creates several collections before you're done. Using the from statement versions should cut that down quite a bit:
driverIds = (from record in db.CarDriversManyToManyTable
where filter.CarIds.Contains(record.CarId)
select record.DriverId).Concat
(from record in db.DriverShopManyToManyTable
where filter.ShopIds.Contains(record.ShopId)
select record.DriverId).Distinct();
Also, using the GroupBy extension would give better performance than querying each driver Id.

Which query is optimized?

I am fetching a list of products including their prices. I want to get only the enabled prices.
I wrote two types of queries:
context.Products.Include("Prices").Where(p=>p.Prices.Where(pr=>pr.Enable==true).Count()>0).ToList();
And the other one is:
context.Products.Include("Prices").ToList().RemoveAll(p => p.Prices.Where(pr => pr.Enable == true).ToList().Count == 0);
Which one is more optimized?
Assuming you are using an EntityFramework context, the first one is way better.
This is because the LINQ provider translates the statement into a SQL statement. The Where calls result in a corresponding SQL WHERE clause, so only the necessary subset of the elements is retrieved.
The second statement retrieves all Products and Prices and then removes the unwanted elements.
This assumes that you have a remote database. If your database is running locally, or you already have all Products and Prices in memory, it's not so easy to tell (you would have to use the profiler for that).
This kind of question really depends on a lot of things, so it is not easy to say which is better.
But from the code, the first one applies the where clause on the SQL side, whereas the second one pulls all the data out of SQL and applies the where in the application.
So it will depend on the SQL server, the application hardware, and the amount of data.
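One detail worth checking when comparing the two statements: List<T>.RemoveAll returns the number of removed items rather than the filtered list, so the second statement as written evaluates to an int. The sketch below uses hypothetical in-memory Product and Price classes (not the asker's entities) to show this, together with an Any-based form of the first filter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity classes in the question.
public class Price { public bool Enable; }
public class Product
{
    public string Name;
    public List<Price> Prices = new List<Price>();
}

public static class EnabledPricesDemo
{
    public static void Main()
    {
        var products = new List<Product>
        {
            new Product { Name = "A", Prices = { new Price { Enable = true } } },
            new Product { Name = "B", Prices = { new Price { Enable = false } } },
            new Product { Name = "C" }
        };

        // Same filter as the first query, written with Any instead of Where(...).Count() > 0.
        var withEnabled = products.AsQueryable()
            .Where(p => p.Prices.Any(pr => pr.Enable))
            .ToList();
        Console.WriteLine(string.Join(",", withEnabled.Select(p => p.Name))); // A

        // RemoveAll mutates the list and returns how many items it removed,
        // so the list has to be held in a variable before calling it.
        var all = products.ToList();
        int removed = all.RemoveAll(p => !p.Prices.Any(pr => pr.Enable));
        Console.WriteLine(removed);   // 2
        Console.WriteLine(all.Count); // 1
    }
}
```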

Linq results are displayed in different ordering

I am currently fiddling around with the AdventureWorks sample database and LinqPad to sketch out some ideas.
This is the query in question:
SalesOrderHeaders.GroupBy (soh => new {soh.CustomerID, soh.BillToAddressID})
.Where(soh => soh.Skip(1).Any())
.Dump();
The idea was to find duplicates based on some criteria and then display them except the first set of data. The result should be deleted from the table.
After executing the query I get result A)
After executing the query again I get Result B)
I do not care about the correct results of the query, but about the ordering of the resultset. Only those two possibilities exist and they alternate on every run of the query.
Surely I could just order by Key, but I am more interested in why this occurs at all. Why is the order changing/alternating?
The order of a SQL Server select query's result set is not deterministic. That is just how SQL Server works, and it is not a bug in either LINQ or LINQPad. The only way to get deterministic results from your query, as you noted yourself, is to use an OrderBy clause.
Edit: About getting same results in SSMS if you run the query multiple times, see this. This post explains why you might get the same results if you execute a query multiple times and why you shouldn't rely on it.
Ordering is never deterministic, as suggested earlier. Adding an orderby clause to either your SQL query or the LINQ query is the only way to make it deterministic.
In fact, let's look a little deeper into the database for the reason. The database fetches all the data from disk via I/O. The data is stored in internal structures such as pages, extents, and segments (these are the Oracle terms for data blocks; I hope SQL Server has something similar). When the query is fired, the database knows all the different locations to fetch the data from, but the fetch is not a serial operation. It is more of a parallel fetch, where the separate data sets are then combined into the view the user sees. As we know from threads, it is never deterministic which one goes first and which one returns first; that is entirely up to the OS scheduler. I hope that clarifies things a little further.
An OrderBy clause sorts the fetched data into a particular order, so it always yields a deterministic result.
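A small sketch of the duplicate-finding query with an explicit ordering on the group key. The rows are a hypothetical in-memory stand-in for SalesOrderHeaders, so this only shows the shape of the query, not SQL Server's behaviour.

```csharp
using System;
using System.Linq;

public static class OrderedGroupsDemo
{
    public static void Main()
    {
        // Hypothetical rows standing in for SalesOrderHeaders.
        var orders = new[]
        {
            new { SalesOrderID = 1, CustomerID = 7, BillToAddressID = 20 },
            new { SalesOrderID = 2, CustomerID = 3, BillToAddressID = 10 },
            new { SalesOrderID = 3, CustomerID = 7, BillToAddressID = 20 },
            new { SalesOrderID = 4, CustomerID = 3, BillToAddressID = 10 },
            new { SalesOrderID = 5, CustomerID = 5, BillToAddressID = 30 }
        }.AsQueryable();

        var duplicates = orders
            .GroupBy(soh => new { soh.CustomerID, soh.BillToAddressID })
            .Where(g => g.Skip(1).Any())            // keep groups with more than one row
            .OrderBy(g => g.Key.CustomerID)         // explicit, repeatable order
            .ThenBy(g => g.Key.BillToAddressID);

        foreach (var g in duplicates)
            Console.WriteLine(g.Key.CustomerID + "/" + g.Key.BillToAddressID + ": " + g.Count());
        // 3/10: 2
        // 7/20: 2
    }
}
```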

Speed up linq query without where clauses

Quick LINQ performance question.
I have a database with many many records and it's used for a webshop.
All query logic and paging is done with LINQ, and it performs quite well.
This is, because the usual search for products contains one or more where clause, and that shortens my result set to a couple of hundred results at max.
But there is an option to list all products (when no search criteria are provided), and that query is slow, really slow. Even though I'm just asking for a single page with .Skip(20).Take(10), it's still slow because the total result is something like 140000 products. Is there a way to limit this (or every) query, so that the speed of the whole thing stays acceptable?
I don't want to force my customers to provide one or more criteria, but on the other hand I have no problem with telling them that they can never find more than 2000 products.
Thanks for helping!
Tys
Why don't you limit the number of records on the sql side as described in this post
http://www.sqlservercurry.com/2009/06/skip-and-take-n-number-of-records-in.html
Watch out for any "premature" enumerations when you pass down queries/results in your code!
There are also several LINQ visualizers available, which can help to see what the LINQ expressions actually translate to. Or you can play around with expressions in LINQPad before integrating them in your code…
What you can do is to have Linq use stored procedure from the database.
In that case, it will be faster because it is the database engine who will do the work and return it to Linq; the database engine is made for that, and it is closer to data than Linq.
I suggest you give it a try and give us feedback
You can check which indexes the table has and what its PK is. It could be that the table has no index at all, so records are compared by field values. You can also catch the query in SQL Profiler, run it separately, and analyse its query plan.

Dynamically Generating a Linq/Lambda Where Clause

I've been searching here and Google, but I'm at a loss. I need to let users search a database for reports using a form. If a field on the form has a value, the app will get any reports with that field set to that value. If a field on a form is left blank, the app will ignore it. How can I do this? Ideally, I'd like to just write Where clauses as Strings and add together those that are not empty.
.Where("Id=1")
I've heard this is supposed to work, but I keep getting an error: "could not be resolved in the current scope of context Make sure all referenced variables are in scope...".
Another approach is to pull all the reports then filter it one where clause at a time. I'm hesitant to do this because 1. that's a huge chunk of data over the network and 2. that's a lot of processing on the user side. I'd like to take advantage of the server's processing capabilities. I've heard that it won't query until it's actually requested. So doing something like this
var qry = ctx.Reports
.Select(r => r);
does not actually run the query until I do:
qry.First()
But if I start doing:
qry = qry.Where(r => r.Id == 1).Select(r => r);
qry = qry.Where(r => r.reportDate == new DateTime(2010, 2, 2)).Select(r => r);
Would that run the query? Since I'm adding a where clause to it. I'd like a simple solution...in the worst case I'd use the Query Builder things...but I'd rather avoid that (seems complex).
Any advice? :)
Linq delays record fetching until a record must be fetched.
That means stacking Where clauses is only adding AND/OR clauses to the query, but still not executing.
Execution of the generated query will be done in the precise moment you try to get a record (First, Any etc), a list of records(ToList()), or enumerate them (foreach).
.Take(N) is not considered fetching records - but adding a (SELECT TOP N / LIMIT N) to the query
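The deferred behaviour can be observed with plain LINQ to Objects. The sketch below is an in-memory illustration only (there is no database, and the counter is a made-up device to show when the predicate actually runs); a database provider behaves analogously by not sending SQL until the results are requested.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static void Main()
    {
        int evaluated = 0; // counts how often the first predicate is invoked
        var source = new List<int> { 1, 2, 3, 4, 5 };

        // Stacking Where and Take only builds up the query.
        var query = source.Where(n => { evaluated++; return n > 1; })
                          .Where(n => n < 5)
                          .Take(2);
        Console.WriteLine(evaluated); // 0

        // ToList() is what finally enumerates the source.
        var result = query.ToList();
        Console.WriteLine(string.Join(",", result)); // 2,3
        Console.WriteLine(evaluated);                // 3
    }
}
```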
No, this will not run the query, you can structure your query this way, and it is actually preferable if it helps readability. You are taking advantage of lazy evaluation in this case.
The query will only run if you enumerate results from it, e.g. with foreach, or if you force eager evaluation of the query results, e.g. with .ToList(), or otherwise force evaluation, e.g. by evaluating to a single result with First() or Single().
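Putting this together for the search form, one possible sketch is to add a Where only for the fields that were filled in. The Report class, the Author field and the Filter helper are hypothetical, and an in-memory list stands in for ctx.Reports; with a real context the same method would leave all filtering to the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical report entity; Author is an invented field for the example.
public class Report
{
    public int Id { get; set; }
    public string Author { get; set; }
    public DateTime ReportDate { get; set; }
}

public static class ReportSearch
{
    // Each argument is optional; null or empty means the form field was left blank.
    public static IQueryable<Report> Filter(
        IQueryable<Report> reports, int? id, string author, DateTime? date)
    {
        var qry = reports;
        if (id.HasValue) qry = qry.Where(r => r.Id == id.Value);
        if (!string.IsNullOrEmpty(author)) qry = qry.Where(r => r.Author == author);
        if (date.HasValue) qry = qry.Where(r => r.ReportDate == date.Value);
        return qry; // still not executed at this point
    }

    public static void Main()
    {
        var reports = new List<Report>
        {
            new Report { Id = 1, Author = "ann", ReportDate = new DateTime(2010, 2, 1) },
            new Report { Id = 2, Author = "ann", ReportDate = new DateTime(2010, 2, 2) },
            new Report { Id = 3, Author = "bob", ReportDate = new DateTime(2010, 2, 2) }
        }.AsQueryable();

        // The Id field is blank, so only author and date are applied (ANDed together).
        var hits = Filter(reports, null, "ann", new DateTime(2010, 2, 2)).ToList();

        Console.WriteLine(hits.Count); // 1
        Console.WriteLine(hits[0].Id); // 2
    }
}
```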
Try checking out this dynamic Linq dll that was released a few years back - it still works just fine and looks to be exactly what you are looking for.
