Is SQL View faster than Table while using Linq? - c#

My WPF application has a lookup screen for selecting customers. The customer table contains nearly 10,000 records. It's very slow when loading and filtering records using my LINQ query (I am not doing any ordering of records). Is there a way to increase the speed? I have heard about using indexed views. Can someone please give me some ideas?
lstCustomerData = dbContext.customers.Where(c => c.Status == "Activated").ToList();
dgCustomers.ItemsSource = lstCustomerData;
filtering:
string searchKey = TxtCustName.Text.Trim();
var list = (from c in lstCustomerData
where (c.LastName == null ? "" : c.LastName.ToUpper()).Contains(searchKey.ToUpper())
select c).ToList();
if (list != null)
dgCustomers.ItemsSource = list;

It depends on what is slow. Is the SQL query slow? Is the UI rendering slow? Are you sorting/filtering in memory or going back to the DB?
You should profile your app to find out exactly what the slowest piece is, then tackle that first.
If the Linq query you added is what is slow then adding an index to the Status column in your database may help.
You might get some improvement by changing your Where clause:
var list = (from c in lstCustomerData
            // skip rows without a last name, then compare case-insensitively
            where c.LastName != null && c.LastName.ToUpper().Contains(searchKey.ToUpper())
            select c).ToList();
if (list != null)
    dgCustomers.ItemsSource = list;
since it doesn't have to compare an empty string. However, if you have very few NULL records then this probably won't help much.
In this case, however, all of the filtering is done in memory so using an indexed view in the DB won't help unless you push the filtering back to the source repository.
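If the filtering stays in memory, as in the question, a minimal sketch of the null-safe filter could look like the following. It uses plain tuples as a hypothetical stand-in for the customer entity, and it swaps the two ToUpper() calls for IndexOf with StringComparison.OrdinalIgnoreCase, which avoids building an upper-cased copy of every name on each search. FilterByLastName is an illustrative name, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns the customers whose last name contains the
// search key, ignoring case and skipping rows with no last name.
static List<(int Id, string? LastName)> FilterByLastName(
    IEnumerable<(int Id, string? LastName)> customers, string searchKey)
{
    string key = searchKey.Trim();
    return customers
        .Where(c => c.LastName != null &&
                    c.LastName.IndexOf(key, StringComparison.OrdinalIgnoreCase) >= 0)
        .ToList();
}

// Stand-in for the list that was already loaded from the database.
var lstCustomerData = new List<(int Id, string? LastName)>
{
    (1, "Smith"), (2, null), (3, "Goldsmith"), (4, "Jones"),
};

var matches = FilterByLastName(lstCustomerData, " smi ");
Console.WriteLine(string.Join(", ", matches.Select(c => c.LastName)));
// prints "Smith, Goldsmith"
```

With about 10,000 rows this kind of in-memory pass is usually cheap, so it is still worth profiling the initial load and the grid rendering before changing the database.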

Related

LINQ Query is running slow but only on two of my statements

I have 2 LINQ statements below that are part of a larger query. I have about 6 other statements that do very similar things to these 2 statements. My query without these 2 statements executes in about 237 ms. When I add these 2, it adds about 10 seconds.
The demandXPCILStatuses table has about 30k records and the demand has about 13k.
The PCILStatuses table has 6 records in it.
After timing other tables that have about the same number of records, I have pretty much ruled out it being too much data. I never really thought it was anyway, but I thought I would run some tests.
DemandXPCILStatus = (from demandXPCILStatus in demandXPCILStatuses
where demand.ID == demandXPCILStatus.DemandID
&& demandXPCILStatus.Active == true
select demandXPCILStatus).FirstOrDefault(),
PCILStatus = (from demandXPCILStatus in demandXPCILStatuses
join PCILStatus in PCILStatuses
on new { A = demandXPCILStatus.PCILStatusID,
B = demandXPCILStatus.DemandID,
C = demandXPCILStatus.Active }
equals new { A = PCILStatus.ID, B = demand.ID, C = true }
select PCILStatus).FirstOrDefault(),
Here is how my tables are designed
I tried to post an image of my database design but I don't have enough points to do that.
So here is how it is designed
DemandXPCILStatus
ID (PK, int, not null)
DemandID (int, not null)
PCILStatusID (int not null)
PCILTime (datetime, null)
LastUpdatedOn (datetime, null)
Active (bit, null)
PCILStatus
ID (PK, int, not null)
Status (nvarchar(50), null)
Code (nvarchar(10), null)
Class (nvarchar(30), null)
At this point I don't know what else to try. Any suggestions? FYI this is my first LINQ query so I have almost no idea what I am doing.
I am using Dapper to retrieve the data and put it into memory before running the query. The DemandXPCILStatus table was returning just over 30k records. I know I didn't post the rest of my query, but it makes pretty heavy use of LINQ, and I guess 30k records was just too many for it to perform well. I filtered the data from that table before putting it into memory, and that portion of the query went from about 4.5 seconds to about 2 ms.
I guess I was a little unclear on how much data LINQ could handle and map to complex objects. Now that I know, I fixed up my query, and it went from running and displaying in about 15 seconds to 1.2 seconds.
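The fix described above was to filter before loading. A complementary in-memory approach, sketched here under assumptions, is to group the link rows once with ToLookup so that each demand does a keyed lookup instead of rescanning the whole 30k-row list (13k demands times 30k rows is roughly 390 million comparisons per statement). The tuples and the ActiveStatusByDemand helper are hypothetical stand-ins for the Dapper-mapped classes, which the question does not show.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: for each demand ID, find the status text of its first
// active DemandXPCILStatus row, or null when there is none.
static Dictionary<int, string?> ActiveStatusByDemand(
    IEnumerable<int> demandIds,
    IEnumerable<(int DemandID, int PCILStatusID, bool Active)> links,
    IReadOnlyDictionary<int, string> statusById)
{
    // One pass over the link table: active rows grouped by DemandID.
    var activeByDemand = links.Where(l => l.Active).ToLookup(l => l.DemandID);

    var result = new Dictionary<int, string?>();
    foreach (int demandId in demandIds)
    {
        // A missing key yields an empty sequence, so statusId becomes null.
        int? statusId = activeByDemand[demandId]
            .Select(l => (int?)l.PCILStatusID)
            .FirstOrDefault();
        result[demandId] = statusId is int id && statusById.TryGetValue(id, out var status)
            ? status
            : null;
    }
    return result;
}

// Tiny made-up data set, only to show the shape of the call.
var demands = new List<int> { 1, 2, 3 };
var demandXPCILStatuses = new List<(int DemandID, int PCILStatusID, bool Active)>
{
    (1, 10, false), (1, 20, true), (2, 10, true),
};
var pcilStatuses = new Dictionary<int, string> { [10] = "Open", [20] = "Closed" };

var resolved = ActiveStatusByDemand(demands, demandXPCILStatuses, pcilStatuses);
foreach (int d in demands)
{
    string text = resolved[d] ?? "(none)";
    Console.WriteLine($"{d}: {text}");
}
// prints "1: Closed", "2: Open", "3: (none)"
```

Whether this beats filtering in the SQL query depends on how much of the table is needed. When only a fraction of the rows matter, filtering at the source, as the author did, avoids transferring them at all.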

Is it possible to combine 2 LINQ queries, each filtering data, before fetching the results of the queries?

I need to retrieve data from 2 SQL tables, using LINQ. I was hoping to combine them using a Join. I've looked this problem up on Stack Overflow, but all the questions and answers I've seen involve retrieving the data using ToList(), but I need to use lazy loading. The reason for this is there's too much data to fetch it all. Therefore, I've got to apply a filter to both queries before performing a ToList().
One of these queries is easily specified:
var solutions = ctx.Solutions.Where(s => s.SolutionNumber.Substring(0, 2) == yearsToConsider.PreviousYear || s.SolutionNumber.Substring(0, 2) == yearsToConsider.CurrentYear);
It retrieves all the data from the Solution table, where the SolutionNumber starts with either the current or previous year. It returns an IQueryable.
The thing that's tough for me to figure out is how to retrieve a filtered list from another table named Proficiency. At this point all I've got is this:
var profs = ctx.Proficiencies;
The Proficiency table has a column named SolutionID, which is a foreign key to the ID column in the Solution table. If I were doing this in SQL, I'd do a subquery where SolutionID is in a collection of IDs from the Solution table, where those Solution records match the same Where clause I'm using to retrieve the IQueryable for Solutions above. Only when I've specified both IQueryables do I want to then perform a ToList().
But I don't know how to specify the second LINQ query for Proficiency. How do I go about doing what I'm trying to do?
As far as I understand, you are trying to fetch Proficiencies based on some Solutions. This can be achieved in two different ways. I'll write both in LINQ query syntax, as it is more readable; you can convert them to lambda expressions later.
Solution 1
var solutions = ctx.Solutions
    .Where(s => s.SolutionNumber.Substring(0, 2) == yearsToConsider.PreviousYear || s.SolutionNumber.Substring(0, 2) == yearsToConsider.CurrentYear)
    .Select(s => s.ID); // the Solution key that Proficiency.SolutionID references
// solutions is still an IQueryable here; nothing runs until ToList()
var profs = (from prof in ctx.Proficiencies
             where solutions.Contains(prof.SolutionID)
             select prof).ToList();
or
Solution 2
var profs = (from prof in ctx.Proficiencies
             join sol in ctx.Solutions on prof.SolutionID equals sol.ID
             where sol.SolutionNumber.Substring(0, 2) == yearsToConsider.PreviousYear || sol.SolutionNumber.Substring(0, 2) == yearsToConsider.CurrentYear
             select prof).Distinct().ToList();
You can trace both queries in SQL Profiler to inspect the generated SQL. I'd go for the first solution, though: it generates a subquery, which should be faster, and it avoids Distinct, which is best left out unless you really need it.

How to retrieve data from very large datasets with optional parameters?

I have an app that retrieves data requested by the user. All parameters except Type are optional. If a parameter is not specified, all items are retrieved. If it is specified, only items corresponding that parameter are retrieved. For example, here I retrieve products by year of release (-1 is the default value, if the user hasn't specified one):
var products = context.Products.Where(p => p.type == Type).ToList();
if (!(Year == -1))
products = products.Where(p => p.year == Year).ToList();
This works perfectly fine for some of the years. E.g., if I search 2001, I get all entries needed. But since products has a limited size and only retrieves 1500 entries, later years are simply not retrieved, not in the products list, and it comes up as no data for that year, even though there is data in the DB.
How can I get around this problem?
One of the nice things about deferred execution on LINQ is it can help make code that has variable filtering rules a lot more neat and readable. If you're not sure what deferred execution is, in a nutshell it's a mechanism that only runs the LINQ query when you ask for the results rather than when you make the statements that comprise the query.
In essence this means we can have code like:
//always adults
var p = person.Where(x => x.Age > 18);
//we maybe filter on these
if(email != null)
p = p.Where(x => x.Email == email);
if(socialSN != null)
p = p.Where(x => x.SSN == socialSN);
var r = p.ToList(); //the query is only actually run now
The multiple calls to Where here are cumulative; they conceptually build up a where clause but do not execute the query until ToList is called. At that point, if a database is in use, the db sees the query with all its Where clauses and can leverage indexes and statistics.
If we were to use ToList after every Where, then the first Where would hit the db and its whole dataset would download to the client app, and the runtime would set about converting an enumerable to a list (a lot of copying and memory allocating). The subsequent Where would filter the list in the client app, enumerating it but then converting it to a list again. The big problem is that it's done in the memory of the client app as some naive unindexed loop, and all those millions of dollars of R&D Microsoft poured into making their SQL Server query optimizer pull huge amounts of data very quickly are wasted :)
Consider also that the first clause in my example, Age > 18, could match a huge set; with a million people spread across ages above 12, for example, a large share of the data satisfies that predicate. Email or SSN would match a far smaller set and is probably indexed. It's a contrived example, sure, but it hopefully illustrates the point about performance: by calling ToList() too early we end up downloading too much data.
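Applied to the products question, the same idea can be sketched as follows. This is only an illustration under assumptions: an in-memory list of tuples stands in for context.Products, BuildQuery and Tracked are hypothetical names, and the rowsRead counter exists only to make the deferral visible. Against a real IQueryable the composed filters would be translated into a single SQL query instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: composes the mandatory and optional filters without
// executing anything; the caller decides when to materialize.
static IEnumerable<(string Type, int Year)> BuildQuery(
    IEnumerable<(string Type, int Year)> products, string type, int year = -1)
{
    var filtered = products.Where(p => p.Type == type);   // always applied
    if (year != -1)
        filtered = filtered.Where(p => p.Year == year);   // only when specified
    return filtered;                                       // no ToList() here
}

// Made-up rows standing in for the Products table.
var source = new List<(string Type, int Year)>
{
    ("book", 2001), ("book", 2015), ("dvd", 2015), ("book", 2015),
};

int rowsRead = 0;

// Yields the source rows and counts how many are actually enumerated.
IEnumerable<(string Type, int Year)> Tracked()
{
    foreach (var row in source)
    {
        rowsRead++;
        yield return row;
    }
}

var pending = BuildQuery(Tracked(), "book", 2015);
Console.WriteLine(rowsRead);      // prints 0: nothing has run yet
var result = pending.ToList();    // the query executes here, once
Console.WriteLine(rowsRead);      // prints 4
Console.WriteLine(result.Count);  // prints 2
```

The key design choice is that ToList() appears exactly once, after every optional filter has been attached, so no intermediate result is truncated or downloaded before the year filter applies.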

Using LINQ against a watin table syntax issue

I'm trying to get around the fact that WatiN table access is very slow by using LINQ to search the table (I have yet to find out if this is actually faster). There are about 4,500 rows in the table I'm looking at, so performance is important.
Ideally from my code I would like to have a collection of TableRow objects based on the LINQ query but I'm struggling a bit with the syntax.
My code so far is:
var Rows = main.TableRows.Where(x => (x.TableCells[0].ToString() == "Investments") && (x.TableCells[1].ToString() == DistributionId) && (x.TableCells[2].ToString() == RiskNumber));
This does not return a TableRowCollection and I'm not sure how to get it to do this?
Alternatively if you know that this will not be faster and there is a faster/ more sensible way I would greatly appreciate being informed.

Linq query where in list, performance what is the best?

I have a simple linq query that gets a slug from a database for a product.
var query = from url in urlTable
where url.ProductId == productId &&
url.EntityName == entityName &&
url.IsActive
orderby url.Id descending
select url.Slug;
I am trying to optimize this, since it is run for every product and on a category page this is run x times the number of products.
I could do this (if I'm not mistaken): send in a list of products and run a single new query.
var query = from url in urlTable
where productList.Contains(url.ProductId) &&
url.EntityName == entityName &&
url.IsActive
orderby url.Id descending
select url.Slug;
But I have read somewhere that the performance of Contains is bad. Is there any other way to do this? What is the best method performance wise?
"But I have read somewhere that the performance of Contains is bad."
I believe you're mixing this up with string.Contains, which indeed is a bad idea on large data sets, because it simply can't use any index at all.
In any case, why are you guessing on performance? You should profile and see what's better for yourself. Also, look at the SQL produced by each of the queries and look at their respective query plans.
Now, with that out of the way, the second query is better, simply because it grabs as much as it can during one query, thus removing a lot of the overhead. It isn't too noticeable if you're only querying two or three times, but once you get into say a hundred, you're in trouble. Apart from being better in the client-server communication, it's also better on the server, because it can use the index very effectively, rather than looking up X items one after another. Note that that's probably negligible for primary keys, which usually don't have a logarithmic access time.
The second option is better. I would add the product-id to the result so you can differentiate between products.
var query = from url in urlTable
            where productList.Contains(url.ProductId) &&
                  url.IsActive
            orderby url.Id descending
            // include the product ID so each slug can be matched to its product
            select new { url.ProductId, url.Slug };
Please note that your list of product IDs is converted to SQL parameters, IN (@p1, @p2, @p3), and there is a maximum number of parameters per SQL query. For SQL Server the limit is about 2,100 parameters, so if you are querying for more products than that, this solution will not work.
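One way to stay under the parameter limit is to split the ID list into batches and run the Contains query once per batch. The sketch below is an assumption-laden illustration: it runs against an in-memory list of tuples in place of urlTable, GetSlugs and the batch size of 2000 are hypothetical choices, and Enumerable.Chunk requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: looks up active slugs for many product IDs in batches,
// so that no single query needs more than batchSize values in its IN list.
static List<string> GetSlugs(
    IEnumerable<(int Id, int ProductId, string Slug, bool IsActive)> urlTable,
    IEnumerable<int> productIds,
    int batchSize = 2000)
{
    var slugs = new List<string>();
    foreach (int[] batch in productIds.Distinct().Chunk(batchSize))
    {
        // Against a database, each iteration would be one round trip.
        slugs.AddRange(urlTable
            .Where(u => u.IsActive && batch.Contains(u.ProductId))
            .OrderByDescending(u => u.Id)
            .Select(u => u.Slug));
    }
    return slugs;
}

// Made-up rows; a batch size of 2 is used only to force two batches.
var urls = new List<(int Id, int ProductId, string Slug, bool IsActive)>
{
    (1, 1, "a", true), (2, 2, "b", true), (3, 3, "c", false), (4, 4, "d", true),
};

var found = GetSlugs(urls, new[] { 1, 2, 3, 4 }, batchSize: 2);
Console.WriteLine(string.Join(", ", found));
// prints "b, a, d"
```

Note that the ordering here applies within each batch only; if a global order matters, sort the combined list afterwards.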
var query = from productId in productList
join url in urlTable on productId equals url.ProductId
where url.IsActive
orderby url.Id descending
select url.Slug;
I believe this query would have a better performance.
