Optimizing A LINQ To Objects Query - c#

I'm trying to optimize the LINQ query below to improve its speed. The number of objects it searches could be in the tens of thousands.
var lQuery = from o in oEvents
where (o.oSalesEvent != null && o.oSalesEvent.OccurDate < oCalcMgr.OccurDate && (
(oCalcMgr.InclTransTypes == Definitions.TransactionTypes.SalesAll) ?
(o.oSalesEvent.EventStateID == ApprovedID || o.oSalesEvent.EventStateID == PendingID) :
o.oSalesEvent.EventStateID == ApprovedID)) &&
((oCalcMgr.InclTransTypes == Definitions.TransactionTypes.SalesAll) ?
(o.oSalesMan.oEmployment.EventStateID == ApprovedID || o.oSalesMan.oEmployment.EventStateID == PendingID) :
o.oSalesMan.oEmployment.EventStateID == ApprovedID)
select new { SaleAmount = o.SaleAmount.GetValueOrDefault(), CompanyID = o.oSalesEvent.CompanyID };
The query basically says: give me the sale amounts and company IDs from all sale events that occurred before a certain date. The sale event's status and the salesman's employment status must both be "approved", or may also be "pending" if specified.
As you can see, there's a date comparison and a couple of integer comparisons. Which integer comparison is used depends on whether a property matches a certain enum value.
I have some ideas of my own on how to go about the optimization, but I'd like to hear others' thoughts, especially from those with more insight into how LINQ translates this query behind the scenes.
Thanks

It seems to me that your biggest challenge is that you're doing multiple criteria checks in your LINQ statement, which will take a lot of time.
What about creating a new property on the objects in oEvents, say "IsEligable", and setting its value within the set accessors of your other properties? That is much faster than re-evaluating each property in the LINQ query every time.
Then, by the time you get to this part of your code, you could update your LINQ to be something along the lines of:
var lQuery = from o in oEvents
where (o.oSalesEvent != null && o.oSalesEvent.OccurDate < oCalcMgr.OccurDate && o.IsEligable == true)
select new { SaleAmount = o.SaleAmount.GetValueOrDefault(), CompanyID = o.oSalesEvent.CompanyID };
... I'm guessing that would speed up the execution, but it's just a thought...

This is as much to improve readability as to possibly speed it up, but give this a try:
// Date filter first; every query expression must end in a select (or group) clause.
var lQueryTemp = from o in oEvents
where (o.oSalesEvent != null && o.oSalesEvent.OccurDate < oCalcMgr.OccurDate)
select o;
if (oCalcMgr.InclTransTypes == Definitions.TransactionTypes.SalesAll)
{
lQueryTemp = from o in lQueryTemp
where (o.oSalesEvent.EventStateID == ApprovedID || o.oSalesEvent.EventStateID == PendingID) &&
(o.oSalesMan.oEmployment.EventStateID == ApprovedID || o.oSalesMan.oEmployment.EventStateID == PendingID)
select o;
}
else
{
lQueryTemp = from o in lQueryTemp
where (o.oSalesEvent.EventStateID == ApprovedID && o.oSalesMan.oEmployment.EventStateID == ApprovedID)
select o;
}
var lQuery = from o in lQueryTemp
select new { SaleAmount = o.SaleAmount.GetValueOrDefault(), CompanyID = o.oSalesEvent.CompanyID };
This might speed it up by pulling out both checks of oCalcMgr.InclTransTypes, which is effectively a constant for purposes of this query.
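A variation on the same idea, as a minimal sketch rather than a drop-in replacement: decide once which state IDs are acceptable and test membership inside the query, so the flag is never re-evaluated per element. The record types, the SalesQuery class, the includePending parameter and the ID values 1 and 2 are hypothetical stand-ins for the asker's real types and constants.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's types; only the members the query touches.
public record SalesEvent(DateTime OccurDate, int EventStateID, int CompanyID);
public record Employment(int EventStateID);
public record SalesMan(Employment oEmployment);
public record Event(SalesEvent oSalesEvent, SalesMan oSalesMan, decimal? SaleAmount);

public static class SalesQuery
{
    public const int ApprovedID = 1; // assumed value
    public const int PendingID = 2;  // assumed value

    public static List<(decimal SaleAmount, int CompanyID)> Run(
        IEnumerable<Event> oEvents, DateTime cutoff, bool includePending)
    {
        // Evaluate the "include pending" flag once, outside the per-element predicate.
        var allowed = includePending
            ? new HashSet<int> { ApprovedID, PendingID }
            : new HashSet<int> { ApprovedID };

        return (from o in oEvents
                where o.oSalesEvent != null
                      && o.oSalesEvent.OccurDate < cutoff
                      && allowed.Contains(o.oSalesEvent.EventStateID)
                      && allowed.Contains(o.oSalesMan.oEmployment.EventStateID)
                select (o.SaleAmount.GetValueOrDefault(), o.oSalesEvent.CompanyID)).ToList();
    }
}
```

Here includePending would be computed as oCalcMgr.InclTransTypes == Definitions.TransactionTypes.SalesAll. Whether this is measurably faster than the original is something to confirm with a profiler; the main gain is that the predicate becomes shorter and the branch is taken once.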

Related

Fastest way to search huge records using Linq query in AngularJs & C#

I have to perform a global search on a table: if the user enters one or more keywords and clicks the search button, the query should return every record that matches any of the entered keywords.
For example, with 2 keywords we have to search for both in every column of the table (like a LIKE clause in SQL, with the OR operator for multiple keywords) and the query should fetch the data.
I have around 200k records in the database.
First I call a function to load the data:
if ((Role)user.Role == Role.InternalAdministrator || (Role)user.Role ==
Role.InternalStaff)
{
listJobs = (
from d in db.Jobs
where d.TimeCreated.Value.Year >= 2020
select new JobModel()
{
AlternatePickupDelivery = d.AlternatePickupDelivery,
Branch = (
from b in db.Branches
where b.BranchId == d.ProcessingCity
select b.Branch1
).FirstOrDefault(),
ClientName = d.ClientName,
ClientId = d.ClientId,
ContactName = d.ContactName,
MatterReference = d.MatterReference,
JMSNumber = d.JmsNumber,
JobDescription = d.JobDescription,
JobId = d.JobId,
JobShortDescription = d.JobShortDescription,
OrderType = d.OrderType,
OrderTypeDisplay = (
from jt in db.JobTypes
where jt.Id == d.OrderType
select jt.JobTypeName
).FirstOrDefault(),
ProcessingCity = d.ProcessingCity ?? 0,
DisplayProcessingCity = (
from jt in db.ProcessingCities
where jt.ProcessingCityId == d.ProcessingCity
select jt.ProcessingCity1
).FirstOrDefault(),
Status = d.Status,
DisplayStatus = (
from jt in db.JobStatuses
where jt.Id == d.Status
select jt.JobStatusName
).FirstOrDefault(),
StatusDisplayOrder = (from js in db.JobStatuses
where js.Id == d.Status
select js.DisplayOrder).FirstOrDefault(),//d.JobStatus.DisplayOrder,
StatusLastModifiedBy = (
from u in db.Users
where (u.UserId == d.StatusLastModifiedById)
select u.FirstName + " " + u.LastName
).FirstOrDefault(),
StatusLastModifiedById = d.StatusLastModifiedById,
StatusLastModified = d.StatusLastModified ?? DateTime.UtcNow,
CreatedByDisplay = (
from u in db.Users
where (u.UserId == d.CreatedById)
select u.FirstName + " " + u.LastName
).FirstOrDefault(),
CreatedById = d.CreatedById,
ModifiedByDisplay = (
from u in db.Users
where (u.UserId == d.LastModifiedById)
select u.FirstName + " " + u.LastName
).FirstOrDefault(),
LastModifiedById = d.LastModifiedById,
TimeCreated = d.TimeCreated ?? DateTime.UtcNow,
TimeDelivered = (d.Status == (int)JMS4.Utilities.JobStatus.Delivered) ? d.StatusLastModified :
null,
TimeDue = d.TimeDue ?? DateTime.UtcNow,
TimeReady = d.TimeReady ?? DateTime.UtcNow,
TimeZoneId = timeZoneId.ToString(),
ExtClientId = d.ExtClientId,
Address = d.Address,
ReceivedBy = d.ReceivedBy,
ContactPhone = d.ContactPhone,
AfterHourContactNumber = d.AfterHoursContactNumber,
Email = d.Email,
CostEstimateNumber = d.CostEstimateNumber,
LastModifiedBy = d.LastModifiedBy,
MatterType = d.MatterType,
QaData = d.QaData,
InternalInstructions = d.InternalInstructions,
GlobalSearch = d.GlobalSearch
}
);
}
Then I call a second function if the search textbox contains any keywords to search for:
jobs = jobs.Where(x => x.JMSNumber.ToLower().Contains(keyword.ToLower())
|| (x.ClientName != null && x.ClientName.ToLower().Contains(keyword.ToLower()))
|| (x.MatterReference != null && x.MatterReference.ToLower().Contains(keyword.ToLower()))
|| (x.ContactName != null && x.ContactName.ToLower().Contains(keyword.ToLower()))
|| (x.JobShortDescription != null &&
x.JobShortDescription.ToLower().Contains(keyword.ToLower()))
|| (x.StatusLastModifiedBy != null &&
x.StatusLastModifiedBy.ToLower().Contains(keyword.ToLower()))
|| (x.Address != null && x.Address.ToLower().Contains(keyword.ToLower()))
|| (x.Email != null && x.Email.ToLower().Contains(keyword.ToLower()))
|| (x.LastModifiedBy != null &&
x.LastModifiedBy.ToString().ToLower().Contains(keyword.ToLower()))
|| (x.CostEstimateNumber != null &&
x.CostEstimateNumber.ToLower().Contains(keyword.ToLower()))
|| (x.ClientId != null && x.ClientId.ToString().ToLower().Contains(keyword.ToLower()))
|| (x.JobDescription != null && !String.IsNullOrEmpty(x.JobDescription.ToString()) &&
x.JobDescription.ToString().ToLower().Contains(keyword.ToLower()))
|| (x.CreatedByDisplay != null && !String.IsNullOrEmpty(x.CreatedByDisplay.ToString()) &&
x.CreatedByDisplay.ToString().ToLower().Contains(keyword.ToLower()))
|| (x.ModifiedByDisplay != null && !String.IsNullOrEmpty(x.ModifiedByDisplay.ToString()) &&
x.ModifiedByDisplay.ToString().ToLower().Contains(keyword.ToLower()))
|| (x.InternalInstructions != null &&
x.InternalInstructions.ToString().ToLower().Contains(keyword.ToLower()))
);
With these queries, it takes more than 3 minutes to fetch the records.
Please suggest how I can improve the search performance and optimize the query.
To optimise a query like this against a database, there are a few rules to try to follow:
Make sure the query is passed through to the database; do not operate in memory.
Remove or reduce the use of functions.
Don't bother comparing nulls.
Split the query into multiple parallel queries.
Improve the structure to optimise the query.
The general idea is that you want the comparison to be made directly against the index entries; functions or conversions applied to records in the database will not use the indexes. Databases are specifically optimised to query against indexes, so it is also important to create the necessary indexes on your search columns.
You have tagged this as linq-to-sql, so we assume that your query is being passed through to the DB; it is important that you make sure it is. The following code and advice will only work if the LINQ expression is a genuine IQueryable<T> that will be resolved into SQL.
If your database uses a case-insensitive collation, then you can drop all the .ToLower() function calls. You want to avoid function calls so that any indexes can be accessed directly.
Even though C# is case-sensitive by default, if the LINQ query is translated to SQL then it will obey the collation settings for a standard LIKE '%' + @param + '%' comparison.
Skip the null comparison. Just like the .ToLower() calls, it is not necessary in SQL to check the nullability of a field before executing a comparison on that field.
This is already a far better filter:
jobs = jobs.Where(x => x.JMSNumber.Contains(keyword)
|| x.ClientName.Contains(keyword)
|| x.MatterReference.Contains(keyword)
|| x.ContactName.Contains(keyword)
|| x.JobShortDescription.Contains(keyword)
|| x.StatusLastModifiedBy.Contains(keyword)
|| x.Address.Contains(keyword)
|| x.Email.Contains(keyword)
|| x.LastModifiedBy.Contains(keyword)
|| x.CostEstimateNumber.Contains(keyword)
|| x.ClientId.ToString().Contains(keyword)
|| x.JobDescription.Contains(keyword)
|| x.CreatedByDisplay.Contains(keyword)
|| x.ModifiedByDisplay.Contains(keyword)
|| x.InternalInstructions.Contains(keyword)
);
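To illustrate why the caveat about a genuine IQueryable<T> matters: if the same filter were evaluated in memory (LINQ to Objects), a null column would throw a NullReferenceException and Contains would be case-sensitive. The following is a minimal sketch of an in-memory equivalent, not a recommendation to search in memory; JobRow, InMemorySearch and ContainsIgnoreCase are hypothetical names and only three of the columns are shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down job row for illustration.
public record JobRow(string JMSNumber, string ClientName, string Email);

public static class InMemorySearch
{
    // Null-safe, case-insensitive substring test: a null field simply does not match,
    // which mirrors what the SQL LIKE comparison does for NULL columns.
    public static bool ContainsIgnoreCase(string field, string keyword) =>
        field != null && field.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0;

    public static List<JobRow> Filter(IEnumerable<JobRow> jobs, string keyword) =>
        jobs.Where(x => ContainsIgnoreCase(x.JMSNumber, keyword)
                     || ContainsIgnoreCase(x.ClientName, keyword)
                     || ContainsIgnoreCase(x.Email, keyword))
            .ToList();
}
```

If a helper like this turns out to be necessary to avoid exceptions, that is a hint the query is not being translated to SQL and the whole table is being pulled into memory first.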
Store values in a searchable format.
We can do better than the above. If you really need to search on numeric fields, like ClientId in this case, it helps to store the numeric values in a string-based column, because the search argument is a string.
The easiest way to implement a variant of a field in the database is to use a computed column; however, it needs to be a persisted (stored) computed column to realise the benefit for a search index. Make the computed column out of the expression:
CAST(ClientId as char(10))
The same rule applies to any other column that might need a function applied to it: you will see greater performance if you move the function evaluation to the time the record is written (INSERT or UPDATE), which happens far less frequently than reads via SELECT.
Normalize the structure. Most of your comparisons are on a user display name; if the query joins on to a user table, then you only have one seek for any user that matches instead of a separate seek for each user.
There is a clear related table here for the user who applied the data modifications. This can add a great deal of redundant information to the search query. Ideally we do not search across the user fields at all: any match there would bring up all records associated with that user, so it is not usually a good search candidate unless users do not edit many records. If you can, exclude the user fields from the general search and let the user pick from a list of users to scope the results, or search the users in parallel with the main search.
Now the search is much quicker (this assumes a new column called ClientIdString):
jobs = jobs.Where(x => x.JMSNumber.Contains(keyword)
|| x.ClientName.Contains(keyword)
|| x.MatterReference.Contains(keyword)
|| x.ContactName.Contains(keyword)
|| x.JobShortDescription.Contains(keyword)
|| x.Address.Contains(keyword)
|| x.Email.Contains(keyword)
|| x.CostEstimateNumber.Contains(keyword)
|| x.ClientIdString.Contains(keyword)
|| x.JobDescription.Contains(keyword)
|| x.InternalInstructions.Contains(keyword)
);
A hypothetical user search, which looks up the matching users first and then filters the jobs by them:
var userIds = db.Users.Where(u => u.UserName.Contains(keyword))
.Select(u => u.Id)
.ToList();
//Filter to only rows that match the user lookup
if (jobs.Any())
{
jobs = jobs.Where(x => userIds.Contains(x.StatusLastModifiedByUserId)
|| userIds.Contains(x.LastModifiedByUserId)
|| userIds.Contains(x.CreatedByUserId)
|| userIds.Contains(x.ModifiedByUserId)
);
}
If the schema is NOT already normalised, then I strongly suggest you at least normalise the users out into their own table; indexing the possible users is much more efficient than searching across 200K+ records.
It is also possible to write the query directly in SQL. Sometimes we can write much more efficient SQL by hand than we can achieve through LINQ. Don't feel bad about that; just recognise that it is one of many tools at your disposal.
Most LINQ providers give you an explicit mechanism for executing raw SQL that returns into a LINQ expression. The detail is out of scope here, but searching is a specific scenario where this is accepted.
There are other external options too, like SQL Server Full Text Search, MySQL FULLTEXT indexes, or even Microsoft Azure Search or Elastic Search. These external mechanisms can return the search content directly, or they might return references that you can use to access the records in your local DB.
Many NoSQL providers can be used to construct an efficient search index, the products I listed above are simply designed for searching and are likely to implement a lot of industry
Indexes
All of the above assumes that you have implemented adequate indexes on the underlying data store. Searching can have such a large impact on the user experience that it is worth putting in the effort to get it right.
Assessing and implementing indexes is out of scope for this question.

Linq Loop Compare Values ListBox ASP.NET C#

I am trying to condense my code by avoiding multiple foreach loops to accomplish this task. I have a listbox that is populated by Table A. I need to compare those values with Table B to populate another list.
Table A has a 1 to many relationship with Table B and while my solution worked for the time being, it is using quite a bit of memory so I need to condense it.
List<int> listProj = new List<int>();
var _tableB = from t in TableB
where t.StatID == 1 || t.StatID == 2
select t.ID;
var _tableA = from ListItem n in lstTableA.Items
where _tableB.Contains(int.Parse(n.Value)) && n.Selected == true
select n;
foreach (ListItem i in _tableA)
{
int affID = Convert.ToInt32(i.Value);
if (TableB.Where(t => t.ID == affID && t.StatID == 1 || t.StatID == 2).Any())
{
foreach (var item in TableB.Where(t => t.ID == affID && t.StatID == 1 || t.StatID == 2))
{
int pID = Convert.ToInt32(item.pID);
listProj.Add(pID);
}
}
}
The main problem is that these two loops are looping through quite a bit of records which is causing a memory leak. I feel that there is a way to grab many records at once and add them to the list, hence the one to many relationship between Table A and Table B.
I think this would give you the same result as the whole code above unless I'm making a mistake on the logic of your program.
This would return the project ids:
List<int> listProj = (from t in TableB
where (from n in lstTableA.Items.Cast<ListItem>().ToList()
where n.Selected == true
select n).Any(g => int.Parse(g.Value) == t.ID)
&& (t.StatID == 1 || t.StatID == 2)
select t.pID).ToList();
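The query above re-scans the selected list items for every row of Table B. A possible refinement, sketched here under the assumption that Table B is available as an in-memory sequence, is to collect the selected IDs into a HashSet once and then make a single pass over Table B. TableBRow, ProjectLookup and ProjectIds are hypothetical names introduced for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape for Table B: ID links back to Table A, pID is the project id.
public record TableBRow(int ID, int StatID, int pID);

public static class ProjectLookup
{
    public static List<int> ProjectIds(IEnumerable<string> selectedValues, IEnumerable<TableBRow> tableB)
    {
        // Parse the selected list-box values once; HashSet gives constant-time membership tests.
        var selectedIds = new HashSet<int>(selectedValues.Select(v => int.Parse(v)));

        // Single pass over Table B; the parentheses keep the status test separate from the ID test.
        return tableB
            .Where(t => (t.StatID == 1 || t.StatID == 2) && selectedIds.Contains(t.ID))
            .Select(t => t.pID)
            .ToList();
    }
}
```

In the page code, selectedValues could be produced with lstTableA.Items.Cast<ListItem>().Where(i => i.Selected).Select(i => i.Value).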

Retrieving a field that does not belong to the model

Consider this piece of LINQ code (please focus on var list2):
var list1 = ....... /* This query doesn't matter. It is shown only to clarify that it is used in the query below */
var list2 = dba.OrderForm
.Where(q => q.OrderPriority.OrderPriorityID == orderpriorityID
&& q.StockClass.StockClassID == stockclassID
&& dba.AuditTrailLog.Where(log => q.OrderID == log.ObjectID)
.Any(log => log.ColumnInfoID == 486
&& log.OldValue == "2"
&& log.NewValue == "3")
&& dba.AuditTrailLog.Where(log2 => q.OrderID == log2.ObjectID)
.Any(log2 => log2.ColumnInfoID == 487
&& log2.OldValue == "1"
&& log2.NewValue == "2")
&& list1.Contains(q.OrderID));
This way, list2 holds a list of records that belong to the OrderForm model. I need to pass it to another model called ViewResult.
What I need is the value of log2.ModificationDate, which belongs to the AuditTrailLog table but is not included in the OrderForm model:
List<ViewResult> vr = new List<ViewResult>();
foreach (OrderForm o in list2)
{
ViewResult r = new ViewResult();
r.NumOrden = o.FormNo;
r.Title = o.Title;
r.Com = o.OrderPriority.Descr;
r.OClass = o.StockClass.Descr;
r.RodT = /* <<------ Here is where I need to assign log2.ModificationDate */
vr.Add(r);
}
Thanks.
As I understand it, the AuditTrailLog relation is null when you fetch the data, and you want it filled with the related data.
You must include this table, like so
(this means a join is performed in SQL):
var list2 = dba.OrderForm.Include("AuditTrailLog")...
The relation between them matters: "one to many" or "many to one". Use AuditTrailLog or AuditTrailLogs according to your relation.
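An alternative to Include, shown only as a sketch: keep the matching log entries as a sequence instead of collapsing them with Any(...), and project the date straight into the result. The example runs against in-memory collections; OrderRow, AuditRow, ResultRow and OrderProjection are hypothetical simplified shapes, and taking the latest date when several entries match is an assumption, since the question does not say which entry is wanted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified shapes; only the members used below.
public record OrderRow(int OrderID, string FormNo, string Title);
public record AuditRow(int ObjectID, int ColumnInfoID, string OldValue, string NewValue, DateTime ModificationDate);
public record ResultRow(string NumOrden, string Title, DateTime? RodT);

public static class OrderProjection
{
    public static List<ResultRow> Build(IEnumerable<OrderRow> orders, IEnumerable<AuditRow> logs) =>
        (from o in orders
         // Same test as the second Any(...) in the question, kept as a sequence
         // so that ModificationDate stays reachable in the select clause.
         let hits = logs.Where(l => l.ObjectID == o.OrderID
                                 && l.ColumnInfoID == 487
                                 && l.OldValue == "1"
                                 && l.NewValue == "2")
         where hits.Any()
         select new ResultRow(
             o.FormNo,
             o.Title,
             // Assumption: when several entries match, the latest one is wanted.
             hits.Max(l => (DateTime?)l.ModificationDate))).ToList();
}
```

Against a real context the same shape would be written over dba.OrderForm and dba.AuditTrailLog with the remaining filters from the question added back; whether the provider translates it into a single SQL statement is worth checking in the generated SQL.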

How to improve Linq-To-Sql code

I'm using the StackExchange.Profiling.MiniProfiler class to profile an ASP.NET MVC application with Linq-To-Sql as the ORM.
I'm trying to reduce one action to a single SQL statement, so that I don't have any duplicates anymore.
So I changed my Linq-To-Sql code accordingly, but it didn't have any positive effect on the speed.
Then I checked the time that is needed for the SQL.
This shows the MiniProfiler:
When I fire up the exact same SQL in Management Studio it is super fast:
Here is the code:
from t in type
let tDoc = (from d in context.documents
where d.Key == t.No
&& d.RType == (int)RType.Art
&& d.AType == (int)AType.Doc
select d).FirstOrDefault(d => d.UseForThumb)
select new Time
{
Id = t.Id,
//... more simple mappings here
// then a complex one:
DocsCount = context.documents.Count(d =>
(d.Key == t.Id.ToString()
&& d.RType == (int)RType.Type
&& d.AType == (int)AType.Doc)
||
(d.Key == t.No
&& d.RType == (int)RType.Art
&& d.AType == (int)AType.Doc)),
// and another one
ThumbId = (tDoc != null && tDoc.FRKey.HasValue) ? tDoc.FRKey.Value : 0
};
What can be the reason for the huge difference? - Edit: There is no difference, I just misinterpreted SSMS :(
Anyway, my problem persists. What could I change to make it faster?
I read some time ago that the mapping in Linq-To-Sql has a performance problem. Is there a way to work around this?
I did some trial and error and changed the Linq-To-Sql code to this:
from t in types
let docs = context.documents.Where(d => (d.RKey == t.Id.ToString()
&& d.RType == (int)RType.Type
&& d.AType == (int)AType.Doc)
||
(d.RKey == t.No
&& d.RType == (int)RType.Art
&& d.AType == (int)AType.Doc))
let tDoc = docs.FirstOrDefault(d => d.RType == (int)RType.Art && d.UseForThumb)
let docsCount = docs.Count()
select new Time
{
Id = t.Id,
//... more simple mappings here
DocsCount = docsCount,
ThumbId = (tDoc != null && tDoc.FRKey.HasValue) ? tDoc.FRKey.Value : 0,
}
This made the query much, much faster.

linq - include where parameters

I have a strange question about Linq.
I have this query:
var results = (from p in hotsheetDB.Properties
where p.PCode == pCode
&& p.PropertyStatusID == propertyStatuses
orderby p.PropertyID descending
select new
{
PropertyId = p.PropertyID,
PCode = p.PCode,
PropertyTypeName = p.cfgPropertyType.Name,
FullAddress = p.Address1 + " " + p.Address2,
ZipCode = p.ZipCode.Code,
CityName = p.cfgCity.Name,
LivingSquareFeet = p.LivingSquareFeet,
LotSquareFeet = p.LotSquareFeet,
NumBedrooms = p.NumBedrooms,
NumBathrooms = p.NumBathrooms,
PropertyStatusName = p.cfgPropertyStatuse.Name
});
Notice the pCode and propertyStatuses parameters. They are input values from the user, who wants to search by pCode and/or by propertyStatuses.
So, when the user fills in only pCode, the query should return all records with that pCode and ANY propertyStatuses. But because propertyStatuses is in the query and is null, the query returns nothing (there is no record with a null propertyStatuses).
Therefore, the question: is there any way to include these where parameters only when they have values, without writing N separate queries for every combination? (I have multiple inputs.)
Thanks in advance..
You could change your where clause to make the parts which include null always return true.
For example:
where (pCode == null || p.PCode == pCode)
&& (propertyStatuses == null || p.PropertyStatusID == propertyStatuses)
I'm only guessing here but try:
where p.PCode == pCode &&
(p.PropertyStatusID == null || p.PropertyStatusID == propertyStatuses)
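The optional-filter idea can also be expressed by composing the query step by step: because an IQueryable<T> is not executed until it is enumerated, each Where can be appended only when its input has a value. This is a minimal sketch under assumed names; PropertyRow, PropertySearch and Find are hypothetical, and the parameter types (string and int?) are guesses at the asker's inputs.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down row for illustration.
public record PropertyRow(int PropertyID, string PCode, int PropertyStatusID);

public static class PropertySearch
{
    public static List<PropertyRow> Find(IQueryable<PropertyRow> source, string pCode, int? propertyStatus)
    {
        var query = source;

        // Append each filter only when the user actually supplied a value.
        if (!string.IsNullOrEmpty(pCode))
            query = query.Where(p => p.PCode == pCode);
        if (propertyStatus.HasValue)
            query = query.Where(p => p.PropertyStatusID == propertyStatus.Value);

        // Nothing has run yet; the query executes once, here.
        return query.OrderByDescending(p => p.PropertyID).ToList();
    }
}
```

With a real data context, source would be hotsheetDB.Properties and the anonymous-type projection from the question would be applied after the filters. The number of branches grows linearly with the number of inputs rather than with the number of combinations.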
