I have a table with a list of customers.
One customer has 0, 1 or more contract(s).
I must retrieve all the enabled customers, map each one to a DTO, and add the current contract to that DTO (if there is one).
For the moment, it is very slow (more than 10min).
CODE
List<CustomerOverviewDto> result = new List<CustomerOverviewDto>();
List<Customer> customers = context.Customers.Where(c => c.IsEnabled).ToList();
foreach (Customer customer in customers)
{
    CustomerOverviewDto customerDto = GetCustomer(customer);
    // Lazy-loads the contracts of this customer: one extra query per customer
    Framework.Contract contract =
        customer.Contracts.Where(c => c.ContractEndDate >= DateTime.Today && c.ContractStartDate <= DateTime.Today)
            .FirstOrDefault();
    if (contract != null)
    {
        SetContract(customerDto, contract);
    }
    result.Add(customerDto);
}
Use projection to only return the columns you work with by using "Select". If you have 36 columns this will give you better results.
customers = context.Customers.Where(c => c.IsEnabled).Select(cust => new Customer
{
    Id = cust.Id
}).ToList();
https://www.talksharp.com/entity-framework-projection-queries
After that, check the query plan for table scans or index scans. Try to avoid them by adding appropriate indexes.
I think the problem is the query that retrieves the contract inside the loop. It would be better to retrieve all the data with one query like this:
var date = DateTime.Today;
var query =
    from customer in context.Customers
    where customer.IsEnabled
    select new
    {
        customer,
        contract = customer.Contracts.FirstOrDefault(c => c.ContractEndDate >= date && c.ContractStartDate <= date)
    };
var result = new List<CustomerOverviewDto>();
foreach (var entry in query)
{
    CustomerOverviewDto customerDto = GetCustomer(entry.customer);
    if (entry.contract != null)
        SetContract(customerDto, entry.contract);
    result.Add(customerDto);
}
Ok so first of all, when you use .ToList(), you are executing the query right there, and pulling back every row that IsEnabled into the memory for working on. You want to do more on the database side.
result = context.Customers.Where(c => c.IsEnabled); //be lazy
Secondly, the query is only going to perform well and be optimised by the execution engine properly if it has indexes to work with.
Add some indexes on the fields that you are performing comparisons on.
Think about this line of code for example
customer.Contracts.Where(c => c.ContractEndDate >= DateTime.Today &&
c.ContractStartDate <= DateTime.Today).FirstOrDefault();
If you don't have a foreign key from customers to contracts, and you have no indexes on ContractStartDate and ContractEndDate, it is going to perform extremely poorly, and it is going to be run once for every customer that 'IsEnabled'.
It seems you only want to do something when the contract lookup returns a value. So you can add that condition to your initial query and include the contracts:
customers = context.Customers
    .Include(c => c.Contracts)
    .Where(c => c.IsEnabled
        && c.Contracts.Any(con => con.ContractEndDate >= DateTime.Today && con.ContractStartDate <= DateTime.Today))
    .ToList();
foreach (Customer customer in customers)
{
    CustomerOverviewDto customerDto = GetCustomer(customer);
    Framework.Contract contract =
        customer.Contracts.Where(c => c.ContractEndDate >= DateTime.Today && c.ContractStartDate <= DateTime.Today)
            .First();
    SetContract(customerDto, contract);
}
As I have no idea of what your domain model structure looks like or why you are not using navigation properties to map the CURRENT contract to the customer, you could do something like this.
You could do just 2 roundtrips to the database by materializing all the customers and contracts and then mapping them in memory to your DTO objects. Assuming you have CustomerId as FK and Customer.Id as PK.
List<CustomerOverviewDto> result = new List<CustomerOverviewDto>();
// Round trip 1: all enabled customers
var customers = context.Customers.Where(c => c.IsEnabled).ToList();
// Round trip 2: all contracts that are current today
var contracts = context.Contracts.Where(c => c.ContractEndDate >= DateTime.Today && c.ContractStartDate <= DateTime.Today).ToList();
foreach (Customer customer in customers)
{
    var customerDto = GetCustomer(customer);
    var contract = contracts.Where(c => c.CustomerId == customer.Id).FirstOrDefault();
    if (contract != null)
    {
        SetContract(customerDto, contract);
    }
    result.Add(customerDto);
}
I finally solved the problem by using 1 query and projection
context.Customers.Where(c => c.IsEnabled).Select(c => new CustomerOverviewDto{...}).ToList();
I directly retrieve the contract when creating the CustomerOverviewDto
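For readers who want to see the shape of such a projection, here is a minimal sketch. The entity and DTO members below (Name, CurrentContractEnd) are assumptions made up for illustration, since the original elides the DTO body, and the sample runs against an in-memory IQueryable rather than a real context.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity and DTO shapes; the real classes are not shown in the thread.
public class Contract
{
    public DateTime ContractStartDate { get; set; }
    public DateTime ContractEndDate { get; set; }
}

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool IsEnabled { get; set; }
    public List<Contract> Contracts { get; set; } = new List<Contract>();
}

public class CustomerOverviewDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime? CurrentContractEnd { get; set; }
}

public static class CustomerOverview
{
    // One query: filter, then project straight into the DTO,
    // picking the current contract (if any) inside the projection.
    public static List<CustomerOverviewDto> Build(IQueryable<Customer> customers, DateTime today)
    {
        return customers
            .Where(c => c.IsEnabled)
            .Select(c => new CustomerOverviewDto
            {
                Id = c.Id,
                Name = c.Name,
                // Null when the customer has no contract covering "today"
                CurrentContractEnd = c.Contracts
                    .Where(k => k.ContractStartDate <= today && k.ContractEndDate >= today)
                    .Select(k => (DateTime?)k.ContractEndDate)
                    .FirstOrDefault()
            })
            .ToList();
    }
}
```

With a real context you would pass context.Customers and DateTime.Today; capturing the date in a local first keeps the query parameterized on a single value.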
Related
While evaluating some queries we found a possible optimization. The idea is shown below, but I currently don't know how to implement it.
Current query:
public static List<Object> SampleQuerySales(int store_id)
{
    var query = (from clients in db.table1.Where(p => p.store_id == store_id)
                 from sales in db.table2.Where(q => q.customer_id == clients.customer_id)
                 select new Object {
                     ...
                 }).ToList();
    return query;
}
This returns all sales made, but only the latest sale (by OperationDate) on or before a reference datetime is required. Unsurprisingly, this became a bottleneck.
My idea was to make it similar to the query below, which is incorrect (it doesn't compile). How can I achieve this dataset?
var query = (from clients in db.table1.Where(p => p.store_id == store_id)
from sales in db.table2.Where(q => q.customer_id == clients.customer_id
&& q.OperationDate <= dateReference)
.OrderByDescending(s => s.OperationDate).FirstOrDefault() //error
select new Object {
...
}).Tolist();
Since you only want one value from table2, use let instead of from:
var query = (from client in db.table1.Where(p => p.store_id == store_id)
             let mostRecentSaleUpToDateReference = db.table2
                 .Where(q => q.customer_id == client.customer_id
                          && q.OperationDate <= dateReference)
                 .OrderByDescending(s => s.OperationDate)
                 .FirstOrDefault()
             select new Object {
                 ...
             }).ToList();
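To make the let approach concrete, here is a small self-contained sketch that runs in memory. The Client, Sale and ClientLatestSale classes are hypothetical stand-ins for table1, table2 and the elided result type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for table1 / table2 rows and the result shape.
public class Client
{
    public int CustomerId { get; set; }
    public int StoreId { get; set; }
}

public class Sale
{
    public int CustomerId { get; set; }
    public DateTime OperationDate { get; set; }
}

public class ClientLatestSale
{
    public int CustomerId { get; set; }
    public DateTime? LatestOperationDate { get; set; }
}

public static class LatestSales
{
    public static List<ClientLatestSale> Query(
        IEnumerable<Client> clients, IEnumerable<Sale> sales, int storeId, DateTime dateReference)
    {
        return (from client in clients.Where(p => p.StoreId == storeId)
                // "let" binds a single value per client instead of joining every sale
                let latest = sales
                    .Where(s => s.CustomerId == client.CustomerId && s.OperationDate <= dateReference)
                    .OrderByDescending(s => s.OperationDate)
                    .FirstOrDefault()
                select new ClientLatestSale
                {
                    CustomerId = client.CustomerId,
                    // A client with no sale on or before the reference date gets null
                    LatestOperationDate = latest == null ? (DateTime?)null : latest.OperationDate
                }).ToList();
    }
}
```

Clients without a matching sale are still returned, which differs from the original double from, where such clients drop out of the result.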
I have the following Entity Framework function that joins a table to a list. Each item in serviceSuburbList contains two ints, ServiceId and SuburbId.
public List<SearchResults> GetSearchResultsList(List<ServiceSuburbPair> serviceSuburbList)
{
var srtList = new List<SearchResults>();
srtList = DataContext.Set<SearchResults>()
.AsEnumerable()
.Where(x => serviceSuburbList.Any(m => m.ServiceId == x.ServiceId &&
m.SuburbId == x.SuburbId))
.ToList();
return srtList;
}
Obviously that AsEnumerable is killing my performance. I'm unsure of another way to do this. Basically, I have my SearchResults table and I want to find records that match serviceSuburbList.
If serviceSuburbList's length is not big, you can make several Unions:
var table = DataContext.Set<SearchResults>();
IQueryable<SearchResults> query = null;
foreach(var y in serviceSuburbList)
{
var temp = table.Where(x => x.ServiceId == y.ServiceId && x.SuburbId == y.SuburbId);
query = query == null ? temp : query.Union(temp);
}
var srtList = query.ToList();
Another solution is to use the Z.EntityFramework.Plus.EF6 library:
var srtList = serviceSuburbList.Select(y =>
ctx.Customer.DeferredFirstOrDefault(
x => x.ServiceId == y.ServiceId && x.SuburbId == y.SuburbId
).FutureValue()
).ToList().Select(x => x.Value).Where(x => x != null).ToList();
//all queries together as a batch will be sent to database
//when first time .Value property will be requested
I'm trying to recreate this SQL query in LINQ:
SELECT *
FROM Policies
WHERE PolicyID IN (SELECT PolicyID
                   FROM PolicyRegister
                   WHERE PolicyRegister.StaffNumber = @CurrentUserStaffNo
                     AND (PolicyRegister.IsPolicyAccepted = 0
                          OR PolicyRegister.IsPolicyAccepted IS NULL))
Relationship Diagram for the two tables:
Here is my attempt so far:
var staffNumber = GetStaffNumber();
var policyRegisterIds = db.PolicyRegisters
.Where(pr => pr.StaffNumber == staffNumber && (pr.IsPolicyAccepted == false || pr.IsPolicyAccepted == null))
.Select(pr => pr.PolicyID)
.ToList();
var policies = db.Policies.Where(p => p.PolicyID.//Appears in PolicyRegisterIdsList)
I think I'm close. I will probably make two lists and use Intersect() somehow, but I looked at my code this morning and thought there has to be an easier way to do this. LINQ is supposed to be a more readable database language, right?
Any help provided is greatly appreciated.
Just use Contains:
var policies = db.Policies.Where(p => policyRegisterIds.Contains(p.PolicyID));
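Also, if the Contains check ends up running in memory rather than being translated to SQL, it is better to store policyRegisterIds as a HashSet<T> instead of a list, so each lookup is O(1) instead of the O(n) of List<T>: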
var policyRegisterIds = new HashSet<IdType>(db.PolicyRegisters......);
But better still is to remove the ToList() and let it all happen as one query in the database:
var policyRegisterIds = db.PolicyRegisters.Where(pr => pr.StaffNumber == staffNumber &&
(pr.IsPolicyAccepted == false || pr.IsPolicyAccepted == null));
var policies = db.Policies.Where(p => policyRegisterIds.Any(pr => pr.PolicyID == p.PolicyID));
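As a sketch of how the unmaterialized subquery composes, the helper below keeps the ID query as an IQueryable and only calls ToList() on the outer query. The entity shapes are assumptions (in particular, StaffNumber is assumed to be a string), and the sample uses in-memory data instead of a real context.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shapes; StaffNumber is assumed to be a string here.
public class Policy
{
    public int PolicyID { get; set; }
}

public class PolicyRegister
{
    public int PolicyID { get; set; }
    public string StaffNumber { get; set; }
    public bool? IsPolicyAccepted { get; set; }
}

public static class PendingPolicies
{
    public static List<Policy> For(
        IQueryable<Policy> policies, IQueryable<PolicyRegister> registers, string staffNumber)
    {
        // No ToList() here: this stays a query and becomes part of the outer one
        var pendingIds = registers
            .Where(pr => pr.StaffNumber == staffNumber
                      && (pr.IsPolicyAccepted == false || pr.IsPolicyAccepted == null))
            .Select(pr => pr.PolicyID);

        // The whole thing is evaluated only once, at this ToList()
        return policies.Where(p => pendingIds.Contains(p.PolicyID)).ToList();
    }
}
```

Against a database provider, the inner query is expected to be folded into the outer statement as a subquery, so only the matching policies travel over the wire.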
I need to check for duplicate entries before saving entity to the database. Below is my current code
if (db.Product.Any(x => x.Code == entity.Code))
{
    error.Add("Duplicate code");
}
if (db.Product.Any(x => x.Name == entity.Name))
{
    error.Add("Duplicate name");
}
if (db.Product.Any(x => x.OtherField == entity.OtherField))
{
    error.Add("Duplicate other field");
}
The problem with the code above is that it makes 3 database calls to validate the entity. This table has millions of records and the app will be used by thousands of users, so it will hurt performance badly. I could make it one query, though:
if (db.Product.Any(x => x.Code == entity.Code || x.Name == entity.Name || x.OtherField == entity.OtherField))
{
    error.Add("Duplication found");
}
The problem with the second version is that I wouldn't know which field is the duplicate.
What is a better way of doing this? Should I depend only on a unique constraint in the database? The error from the database is ugly, however.
EDIT
I need to show all errors to the user if more than one field is a duplicate.
Consider this scenario: the duplicate fields are code and name. If I tell the user that the code already exists, he changes the code and tries to save again. Then the second error (name field) shows. The user has to hit save several times before the save succeeds.
If you have indexes on the fields Name, Code, and OtherField, then duplicate checking will not take too long, but it will still be 3 calls to the database instead of 1.
The usual solution in this case is to count the duplicates. If the count equals 0, there are no duplicates.
Here you'll find some hacks to do it.
Short example:
var counts =(
from product in db.Products
group product by 1 into p
select new
{
Name = p.Count(x => x.Name == name),
Code = p.Count(x => x.Code == code),
OtherField = p.Count(x => x.OtherField == otherFields)
}
).FirstOrDefault();
if (counts.Name > 0)
error.Add("Duplicate name");
if (counts.Code > 0)
error.Add("Duplicate code");
Update: it seems the problem can be solved with an even simpler method:
var duplicates =(
from product in db.Products
group product by 1 into p
select new
{
Name = p.Any(x => x.Name == name),
Code = p.Any(x => x.Code == code),
OtherField = p.Any(x => x.OtherField == otherFields)
}
).FirstOrDefault();
if (duplicates.Name)
error.Add("Duplicate name");
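You can do something like this. Note that a statement-bodied lambda cannot be converted to an expression tree, so this only compiles against an in-memory sequence (hence the AsEnumerable() call, which pulls the rows into memory):
string duplicateField = null;
bool validationResult = db.Product.AsEnumerable().Any(x => {
    if (x.Code == entity.Code) {
        duplicateField = "Code";
        return true;
    }
    // Other field checks here
    return false;
});
if (validationResult) {
    // Error in field <duplicateField>
}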
1- You can select the duplicate entity:
var product = db.Product.FirstOrDefault(x => x.Code == entity.Code
                                          || x.Name == entity.Name
                                          || x.OtherField == entity.OtherField);
if (product != null) // null means there are no duplicates
{
    if (product.Code == entity.Code)
    {
        error.Add("Duplicate code");
    }
    if (product.Name == entity.Name)
    {
        error.Add("Duplicate name");
    }
    if (product.OtherField == entity.OtherField)
    {
        error.Add("Duplicate other field");
    }
}
2- You can create a stored procedure for the insert and check for duplicates in it.
EDIT:
OK, you can write something like this
var duplicates = (from o in db.Products
select new
{
codeCount = db.Products.Where(c => c.Code == entity.Code).Count(),
nameCount = db.Products.Where(c => c.Name == entity.Name).Count(),
otherFieldCount = db.Products.Where(c => c.OtherField == entity.OtherField).Count()
}).FirstOrDefault();
This will select the number of duplicates for each field.
One thing to note: you should have unique constraints in the database anyway, because between your validation and your insert, another row with the same values may be inserted.
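A variation on the ideas above, sketched under assumptions: fetch only the three compared columns of every row that matches on any field, in a single query, and then work out in memory which fields collide. This also covers the case where the code and the name clash with two different rows. The Product shape and the DuplicateCheck helper are hypothetical, and the sample runs against in-memory data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shape matching the field names used in the question.
public class Product
{
    public string Code { get; set; }
    public string Name { get; set; }
    public string OtherField { get; set; }
}

public static class DuplicateCheck
{
    public static List<string> FindDuplicates(IQueryable<Product> products, Product entity)
    {
        // Single query: rows that collide on at least one field, three columns only
        var matches = products
            .Where(x => x.Code == entity.Code
                     || x.Name == entity.Name
                     || x.OtherField == entity.OtherField)
            .Select(x => new { x.Code, x.Name, x.OtherField })
            .ToList();

        // In memory: report every field that collides with some matched row
        var errors = new List<string>();
        if (matches.Any(m => m.Code == entity.Code)) errors.Add("Duplicate code");
        if (matches.Any(m => m.Name == entity.Name)) errors.Add("Duplicate name");
        if (matches.Any(m => m.OtherField == entity.OtherField)) errors.Add("Duplicate other field");
        return errors;
    }
}
```

If the columns are already unique, at most three rows can come back, so the in-memory step stays cheap. The unique constraints are still needed to close the race between validation and insert.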
I have a table named "Employees". The employeeId from this table MAY be in another table (a many-to-many join table) named "Tasks". The other field in the "Tasks" table is a taskId linked to the "TaskDetails" table. This table includes details such as budgetHours.
Using EF4, how do I write the WHERE statement so that it returns employees assigned to tasks where budgetHours is > 120 hours?
The WHERE statement in the following limits rows in the Employees table, but now I need to add the conditions on the TaskDetails table.
var assocOrg = Employees.Where(x => x.EmployeeTypeID == 2 && x.DepartmentName == "ABC").Select (x => x.EmployeeID);
Thanks!
If Employees has a navigation property named Tasks, try this:
var assocOrg = Employees.Where(x => x.EmployeeTypeID == 2 &&
x.DepartmentName == "ABC" &&
x.Tasks.Any(t => t.BudgetHours > 120))
.Select (x => x.EmployeeID);
If your table structure is as below,
TableName      Employee   Task           TaskDetails
ReferenceKeys  EmpID      EmpID/TaskID   TaskID/BudgetHours
then use,
Employee.Where(x => x.Task.EmpID == x.EmpID && x.Task.TaskDetails.TaskID == x.Task.TaskID && x.Task.TaskDetails.BudgetHours > 120).Select(x => x.EmpID)
Assuming you have a Tasks navigation property on the Employee entity, this should be straightforward:
var assocOrg = Employees.Where(x => x.Tasks.Any(t => t.BudgetHours > 120) && x.DepartmentName == "ABC").Select (x => x.EmployeeID);
Of course, this requires the Tasks property to be resolved at this point, either explicitly, via lazy-loading, or with .Include().
(kudos to @adrift for getting Tasks.Any() right... oops.)
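Since the question says budgetHours lives on TaskDetails rather than on the join table, the Any() predicate may need to go one navigation further. Here is a minimal sketch, assuming each Tasks row exposes a TaskDetail navigation property; all class and property shapes below are hypothetical, and the sample runs in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model: Employee -> Tasks (join rows) -> TaskDetail (holds BudgetHours).
public class TaskDetail
{
    public int TaskID { get; set; }
    public int BudgetHours { get; set; }
}

public class TaskAssignment
{
    public int EmployeeID { get; set; }
    public TaskDetail TaskDetail { get; set; }
}

public class Employee
{
    public int EmployeeID { get; set; }
    public int EmployeeTypeID { get; set; }
    public string DepartmentName { get; set; }
    public List<TaskAssignment> Tasks { get; set; } = new List<TaskAssignment>();
}

public static class EmployeeQueries
{
    public static List<int> WithLargeTasks(IQueryable<Employee> employees, int minHours)
    {
        return employees
            .Where(x => x.EmployeeTypeID == 2
                     && x.DepartmentName == "ABC"
                     // Any() is true as soon as one assigned task exceeds the budget
                     && x.Tasks.Any(t => t.TaskDetail.BudgetHours > minHours))
            .Select(x => x.EmployeeID)
            .ToList();
    }
}
```

If EF maps the many-to-many table away and exposes the task details directly as Employee.Tasks, the predicate collapses back to t.BudgetHours > 120 as in the answers above.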