I've been timing this with Stopwatch, and the query below turns out to be very expensive, even though I believe it is already close to optimal based on what I've read (replacing the foreach loop with for, using arrays instead of collections, and using an anonymous type so the whole table isn't pulled from the DB). Is there a way to make it faster? I need to fill the prices array, which needs to be nullable. Am I missing something?
public float?[] getPricesOfGivenProducts(string[] lookupProducts)
{
var idsAndPrices = from r in myReadings select
new { ProductId = r.ProductId, Price = r.Price };
float?[] prices = new float?[lookupProducts.Length];
for(int i=0;i<lookupProducts.Length;i++)
{
string id = lookupProducts[i];
if (idsAndPrices.Any(r => r.ProductId == id))
{
prices[i] = idsAndPrices.Where(p => p.ProductId == id)
.Select(a=>a.Price).FirstOrDefault();
}
else
{
prices[i] = null;
}
}
return prices;
}
It's likely that every time you call idsAndPrices.Any(r => r.ProductId == id) you are hitting the database, because you haven't materialized the result (calling .ToList() would partly fix it). That's probably the main cause of the bad performance. However, simply loading everything into memory still means you're scanning the list for a product ID every time (twice per product, in fact).
Use a Dictionary when you're trying to do lookups.
public float?[] getPricesOfGivenProducts(string[] lookupProducts)
{
var idsAndPrices = myReadings.ToDictionary(r => r.ProductId, r => r.Price);
float?[] prices = new float?[lookupProducts.Length];
for (int i = 0; i < lookupProducts.Length; i++)
{
string id = lookupProducts[i];
if (idsAndPrices.ContainsKey(id))
{
prices[i] = idsAndPrices[id];
}
else
{
prices[i] = null;
}
}
return prices;
}
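One caveat with ToDictionary: it throws an ArgumentException if two readings share a ProductId. Below is a minimal in-memory sketch of the same idea that tolerates duplicates. It assumes a hypothetical Reading class standing in for a row of myReadings, and it assumes the first reading per product is the price you want; neither assumption comes from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one row of myReadings.
public class Reading
{
    public string ProductId { get; set; }
    public float Price { get; set; }
}

public static class PriceLookupSketch
{
    public static float?[] GetPrices(IEnumerable<Reading> readings, string[] lookupProducts)
    {
        // Group first so duplicate ProductIds cannot make ToDictionary throw.
        // Assumption: the first reading per product carries the wanted price.
        Dictionary<string, float> priceById = readings
            .GroupBy(r => r.ProductId)
            .ToDictionary(g => g.Key, g => g.First().Price);

        var prices = new float?[lookupProducts.Length];
        for (int i = 0; i < lookupProducts.Length; i++)
        {
            // TryGetValue does one hash lookup instead of ContainsKey plus the indexer.
            if (priceById.TryGetValue(lookupProducts[i], out float price))
            {
                prices[i] = price;
            }
            // Otherwise prices[i] keeps its default value, null.
        }
        return prices;
    }
}
```

Calling PriceLookupSketch.GetPrices(readings, new[] { "b", "x" }) returns one slot per requested id, with null wherever no reading exists.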
To improve this further, we can identify that we only care about products passed to us in the array. So let's not load the entire database:
var idsAndPrices = myReadings
.Where(r => lookupProducts.Contains(r.ProductId))
.ToDictionary(r => r.ProductId, r => r.Price);
Now, we might want to avoid the 'return null price if we can't find the product' scenario. Perhaps the validity of the product id should be handled elsewhere. In that case, we can make the method a lot simpler (and we won't have to rely on having the array in order, either):
public Dictionary<string, float> getPricesOfGivenProducts(string[] lookupProducts)
{
return myReadings
.Where(r => lookupProducts.Contains(r.ProductId))
.ToDictionary(r => r.ProductId, r => r.Price);
}
And a note unrelated to performance: you should use decimal for money.
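To illustrate why, here is a small sketch comparing binary floating point with decimal. It uses double rather than float for the demonstration; float has the same kind of representation error, only with fewer digits. The class and method names are made up for this example.

```csharp
using System;

public static class MoneySketch
{
    // Adds ten payments of 0.10 using binary floating point.
    // 0.1 has no exact binary representation, so error accumulates.
    public static double SumAsDouble()
    {
        double total = 0;
        for (int i = 0; i < 10; i++)
        {
            total += 0.1;
        }
        return total;
    }

    // The same sum using decimal, which stores base-10 digits exactly.
    public static decimal SumAsDecimal()
    {
        decimal total = 0m;
        for (int i = 0; i < 10; i++)
        {
            total += 0.1m;
        }
        return total;
    }
}
```

SumAsDecimal() compares equal to 1.0m, while SumAsDouble() does not compare equal to 1.0.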
Assuming that idsAndPrices is an IEnumerable<T>, you should materialize it when you initialize it:
var idsAndPrices = (from r in myReadings select
new { ProductId = r.ProductId, Price = r.Price })
.ToList();
It's likely that the calls to:
idsAndPrices.Any(r => r.ProductId == id)
and:
idsAndPrices.Where(p => p.ProductId == id)
...are causing the IEnumerable<T> to be re-evaluated every time they are called.
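The re-evaluation can be made visible with a small in-memory sketch. The Source iterator below is a made-up stand-in for the database table: each time it is enumerated from the start, a counter goes up, which plays the role of one round trip.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Counts how many times the underlying source is enumerated from the start.
    public static int Enumerations;

    // Stand-in for a database table: every fresh enumeration is one "round trip".
    private static IEnumerable<int> Source()
    {
        Enumerations++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    public static int CountWithoutMaterializing()
    {
        Enumerations = 0;
        // Nothing runs yet: the query is only a description of the work.
        IEnumerable<int> query = Source().Select(x => x * 10);
        query.Any(x => x == 20); // enumerates the source
        query.Any(x => x == 30); // enumerates the source again
        return Enumerations;
    }

    public static int CountWithToList()
    {
        Enumerations = 0;
        // ToList runs the query once and keeps the results in memory.
        List<int> list = Source().Select(x => x * 10).ToList();
        list.Any(x => x == 20); // scans the list, not the source
        list.Any(x => x == 30);
        return Enumerations;
    }
}
```

CountWithoutMaterializing() reports two enumerations for two Any calls, while CountWithToList() reports one.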
Based on
using anonymous type not to take the whole table from DB
I assume myReadings is the database table and
var idsAndPrices =
from r in myReadings
select new { ProductId = r.ProductId, Price = r.Price };
is the database query.
Your implementation is far from optimal (I would say quite inefficient), because the above query is executed twice for each element of the lookupProducts array: once by the idsAndPrices.Any(...) call and once by the idsAndPrices.Where(...) call.
The best approach I see is to filter the database query as much as possible, and then use the most efficient LINQ to Objects method for correlating two in-memory sequences, which is a join (in your case a left outer join):
var dbQuery =
from r in myReadings
where lookupProducts.Contains(r.ProductId)
select new { ProductId = r.ProductId, Price = r.Price };
var query =
from p in lookupProducts
join r in dbQuery on p equals r.ProductId into rGroup
from r in rGroup.DefaultIfEmpty().Take(1)
select r?.Price;
var result = query.ToArray();
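Here is a self-contained in-memory sketch of that left outer join. It is a variation, not the exact query above: it uses a group join and takes the first match with FirstOrDefault instead of DefaultIfEmpty().Take(1). The rows parameter is an assumption standing in for the already-filtered database result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LeftJoinSketch
{
    // rows: already-filtered (ProductId, Price) pairs, e.g. the materialized database result.
    public static float?[] PricesFor(
        string[] lookupProducts,
        IEnumerable<KeyValuePair<string, float>> rows)
    {
        var query =
            from p in lookupProducts
            join r in rows on p equals r.Key into matches
            // Keep the first match, if any; the cast makes "no match" come out as null.
            select matches.Select(m => (float?)m.Value).FirstOrDefault();

        // The result has one entry per lookup id, in the same order as lookupProducts.
        return query.ToArray();
    }
}
```

The join builds a hash lookup over rows once, so the cost is roughly proportional to the sizes of the two sequences rather than their product.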
The Any and FirstOrDefault calls are both O(n), and one of them is redundant. You can get a 50% speed-up just by removing the Any call. FirstOrDefault will give you back null when nothing matches, so use it to get the product object (remove the Select). If you want to really speed it up, you should just loop through the products and check if prices[p.ProductId] != null before setting prices[p.ProductId] = p.Price.
There's a bit of extra code there:
var idsAndPrices = (from r in myReadings select
new { ProductId = r.ProductId, Price = r.Price })
.ToList();
for(int i=0;i<lookupProducts.Length;i++)
{
string id = lookupProducts[i];
prices[i] = idsAndPrices.FirstOrDefault(p => p.ProductId == id)?.Price; // null when no match
}
Better yet:
var dp = new Dictionary<string, float?>();
foreach (var reading in myReadings)
dp.Add(reading.ProductId, reading.Price);
for(int i=0;i<lookupProducts.Length;i++)
{
string id = lookupProducts[i];
if (dp.ContainsKey(id))
prices[i] = dp[id];
else
prices[i] = null;
}
I looked at some answers here about converting nested loops into a single LINQ query and tried them, but part of my code is not working. I need some guidance to fix this.
UPDATE 1:
I guess I wasn't clear in my explanation. The loops work fine and give the correct results; I am optimizing, and instead of two loops I want the same code converted to a single LINQ query.
Here is the code:
foreach (var ob in all_request_list.Where(x => x.StartDate != x.EndDate)) {
int consq_dates = ob.EndDate.DateDiff(ob.StartDate);
for (int i = 0; i <= consq_dates; i++) {
combined_list.Add(new { ShiftID = ob.ShiftID, SkillID = ob.SkillID, EmployeeID = ob.EmployeeID, AssignDate = ob.StartDate.AddDays(i), ProfileID = ob.ProfileID });
}
}
I have a problem passing the increment variable i to ob.StartDate.AddDays(i).
Any help will be appreciated.
Is this what you're looking for?
var items = from ob in all_request_list
where ob.StartDate != ob.EndDate
let consq_dates = ob.EndDate.DateDiff(ob.StartDate)
from i in Enumerable.Range(0, consq_dates + 1)
select new { ShiftID = ob.ShiftID, SkillID = ob.SkillID, EmployeeID = ob.EmployeeID, AssignDate = ob.StartDate.AddDays(i), ProfileID = ob.ProfileID };
combined_list.AddRange(items);
But: you have code that works, and you understand that code. Why do you want to change it? BTW: your two loops will likely be faster than the LINQ version.
You can use the following Linq:
var items = all_request_list
.Where(x => x.StartDate != x.EndDate)
.SelectMany(x => Enumerable.Range(0, x.EndDate.DateDiff(x.StartDate) + 1)
.Select(y => new { ShiftID = x.ShiftID, SkillID = x.SkillID, EmployeeID = x.EmployeeID, AssignDate = x.StartDate.AddDays(y), ProfileID = x.ProfileID }));
combined_list.AddRange(items);
What it does is create an IEnumerable<> of results for each item that passes the Where filter on all_request_list, using Enumerable.Range (this is the part that replaces your for loop), and then flatten those sequences into one with SelectMany.
It might be better than a for loop in terms of readability and maintainability, but keep in mind that LINQ is usually slower than plain loops (tl;dr: understand what LINQ does internally and what it will do in your case).
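A self-contained sketch of the Enumerable.Range plus SelectMany pattern, reduced to plain dates. The DateDiff extension from the question is not shown anywhere, so this sketch substitutes (end - start).Days, and it assumes the end date is never earlier than the start date. The type and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DateExpansionSketch
{
    // Returns every date from start to end inclusive, one per day.
    // Assumption: end is not earlier than start.
    public static List<DateTime> Expand(DateTime start, DateTime end)
    {
        int days = (end.Date - start.Date).Days; // stand-in for the DateDiff extension
        return Enumerable.Range(0, days + 1)     // 0, 1, ..., days (replaces the for loop)
            .Select(i => start.Date.AddDays(i))
            .ToList();
    }

    // Flattens several (start, end) requests into one list of dates.
    public static List<DateTime> ExpandAll(IEnumerable<(DateTime Start, DateTime End)> requests)
    {
        return requests
            .Where(r => r.Start != r.End)            // same filter as the original loop
            .SelectMany(r => Expand(r.Start, r.End)) // one inner sequence per request, flattened
            .ToList();
    }
}
```

The range variable i is available inside the Select lambda, which is what lets AddDays(i) work without an explicit loop counter.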
I don't know exactly what error you're getting, but it could be due to the fact that certain functions cannot be executed inside a linq statement, since it internally translates it to sql. Try this:
foreach (var ob in all_request_list.Where(x => x.StartDate != x.EndDate))
{
int consq_dates = ob.EndDate.DateDiff(ob.StartDate);
for (int i = 0; i <= consq_dates; i++)
{
var newDate = ob.StartDate.AddDays(i);
combined_list.Add(new { ShiftID = ob.ShiftID, SkillID = ob.SkillID, EmployeeID = ob.EmployeeID, AssignDate = newDate , ProfileID = ob.ProfileID });
}
}
If it still gives you an error, could you specify which error you're receiving (its name, type, message, and so on)?
How do you suppose I tackle this? Basically, I have this initial query:
var orders = (from order in _dbContext.Orders
join orderDetail in _dbContext.OrderDetails on order.ID equals orderDetail.OrderID
where order.StoreID == storeID
select new Order
{
ID = order.ID,
No = order.ID,
Type = "", // Notice that this is empty; this one needs updating
Quantity = order.Quantity,
// more properties here
}).AsQueryable();
After this query, I need to loop through the result and update the Type property based on different criteria like this:
string type = "";
foreach (OrderDetailDto order in orders)
{
if (order.UserID != null)
type = "UserOrder";
else if (order.UserID == null)
type = "NonUserOrder";
else if (order.Cook == null && (order.Option == "fiery"))
type = "FieryCook";
else if (check if this has corresponding records in another table) // this part I don't know how to effectively tackle
type = "XXX";
// Update.
order.Type = type;
}
The problem is that one of my criteria requires checking whether matching records exist in the database. I would use a JOIN, but if I have to loop through several hundred or several thousand records and run a JOIN against the database for each one just to get a single value, I think that would be very slow.
I can't do the JOIN on the initial query because I might do a different JOIN based on a different criterion. Any ideas?
You could just join all the lookup tables you might possibly need, using left joins:
from o in Orders
from c in Cooks.Where(x => x.OrderId == o.OrderId).DefaultIfEmpty()
from u in Users.Where(x => x.OrderId == o.OrderId).DefaultIfEmpty()
select new
{
Order = o,
Cook = c,
User = u
}
Or, depending on your usage patterns, you could build the required tables into local Lookups or Dictionaries, which give you constant-time lookups thereafter:
var userDict = Users.ToDictionary(x => x.UserId);
var userIdDict = Users.Select(x => x.UserId).ToDictionary(x => x);
var cooksLookup = Cooks.ToLookup(x => x.Salary);
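As a sketch of how such a local lookup answers the "does this order have matching records in another table" check without a query per order: the CookRow type, its properties, and the "HasCook"/"NoCook" labels below are all made up for illustration, and the cooks sequence is assumed to have been loaded from the database once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type for the other table.
public class CookRow
{
    public int OrderId { get; set; }
    public string Name { get; set; }
}

public static class LookupSketch
{
    public static string[] ClassifyOrders(int[] orderIds, IEnumerable<CookRow> cooks)
    {
        // One pass over the cooks. Unlike a Dictionary, a Lookup allows
        // a key to map to zero, one, or many rows.
        ILookup<int, CookRow> cooksByOrder = cooks.ToLookup(c => c.OrderId);

        // Contains is a hash probe, so no per-order query or list scan is needed.
        return orderIds
            .Select(id => cooksByOrder.Contains(id) ? "HasCook" : "NoCook")
            .ToArray();
    }
}
```

ToLookup is the safer choice when the key is not unique; ToDictionary would throw on the second row with the same OrderId.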
I am wondering if there is a better, more efficient way to re-code the linq syntax below to make the query run faster i.e. with a single call to the database. My database is located remotely which causes this to be quite slow:
var query = (from ticket in dataClassesDataContext.Tickets.Where(TicketsToShow.And(SearchVals))
select new
{
Priority = ticket.TicketPriority.TicketPriorityName,
Ticket = string.Format(TicketFormat, ticket.TicketID),
AssetId = ticket.Asset.Serial,
OpenDate = ticket.CheckedInDate,
OpenFor = CalculateOpenDaysAndHours(ticket.CheckedInDate, ticket.ClosedDate),
Account = ticket.Account.Customer.Name,
Description = ticket.Description.Replace("\n", ", "),
Status = ticket.TicketStatus.TicketStatusName,
Closed = ticket.ClosedDate,
Amount = GetOutstandingBalanceForTicket(ticket.TicketID), // <== THIS IS THE CAUSE
Paid = ticket.Paid,
Warranty = ticket.WarrantyRepair,
AssetLocation = GetAssetLocationNameFromID(ticket.Asset.LocationID, AssLocNames)
}).Skip(totalToDisplay * page).Take(totalToDisplay);
if (SortOrder.ToLower().Contains("Asc".ToLower()))
{
query = query.OrderBy(p => p.OpenDate);
}
else
{
query = query.OrderByDescending(p => p.OpenDate);
}//ENDIF
The main cause for the poor performance is the code in the function GetOutstandingBalanceForTicket below which calculates the sum of all items in an invoice and returns this as a total in a string:
public static string GetOutstandingBalanceForTicket(int TicketID)
{
string result = string.Empty;
decimal total = 0;
try
{
using (DataClassesDataContext dataClassesDataContext = new DataClassesDataContext(cDbConnection.GetConnectionString()))
{
var queryCustomerTickets = from ticket in dataClassesDataContext.Tickets
where
(ticket.TicketID == TicketID)
select ticket;
if (queryCustomerTickets != null)
{
foreach (var ticket in queryCustomerTickets)
{
var queryTicketChargeItems = from chargeItem in dataClassesDataContext.ProductChargeItems
where chargeItem.ChargeID == ticket.ChargeID &&
chargeItem.Deleted == null
select chargeItem;
foreach (var chargeItem in queryTicketChargeItems)
{
total += (chargeItem.Qty * chargeItem.Price);
}
}
}
}
}
catch (Exception ex)
{
}
return total.ToString("0.##");
}
Thank you in advance.
As you pointed out, this code is slow because it runs a separate query for each ticket.
To eliminate the multiple queries, you should look at applying an inner join between the TicketsToShow filter and the Tickets entity (on the ticket ID), using group by to provide the sum of the charges for each ticket.
This is well illustrated in the answers to "LINQ: Using INNER JOIN, Group and SUM".
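A minimal in-memory sketch of computing all ticket totals in one query. Note two deliberate simplifications that are assumptions of this example, not part of the question: it uses a group join (so tickets with no charges get a total of 0 rather than disappearing, as they would with an inner join), and it models Deleted as a bool instead of the nullable column in the original.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TicketTotalsSketch
{
    // tickets: (TicketID, ChargeID) pairs; items: charge lines with a Deleted flag.
    public static Dictionary<int, decimal> Totals(
        IEnumerable<(int TicketID, int ChargeID)> tickets,
        IEnumerable<(int ChargeID, int Qty, decimal Price, bool Deleted)> items)
    {
        var query =
            from t in tickets
            join c in items.Where(i => !i.Deleted)
                on t.ChargeID equals c.ChargeID into charges
            // charges is empty for a ticket without items, and Sum of an
            // empty decimal sequence is 0 in LINQ to Objects.
            select new
            {
                t.TicketID,
                Total = charges.Sum(item => item.Qty * item.Price)
            };

        return query.ToDictionary(x => x.TicketID, x => x.Total);
    }
}
```

The result can then be consulted per ticket while building the grid rows, instead of calling GetOutstandingBalanceForTicket once per row.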
Ideally you would eager-load everything in one go. EF does that with Include; as far as I know, the closest LINQ to SQL equivalent is DataLoadOptions.LoadWith. One thing you can do regardless is avoid the nested query. Since you already have access to the ticket table, you could issue a Sum() over the related charge items directly in your select. I can't verify whether this is an improvement, so treat the code below as an untested sketch.
//(from ticket in dataClassesDataContext.Tickets.Where(TicketsToShow.And(SearchVals))
var query = (from ticket in dataClassesDataContext.Tickets
//this would be where you could eager load if possible (not entirely required)
//.Include is an EF method used only as example
/*.Include(t => t.TicketPriority)//eager load required entities
.Include(t => t.Asset)//eager load required entities
.Include(t => t.Account.Customer)//eager load required entities
.Include(t => t.TicketStatus)//eager load required entities
.Include(t => t.ProductChargeItems)//eager load required entities
*/
.Where(TicketsToShow.And(SearchVals))
select new
{
Priority = ticket.TicketPriority.TicketPriorityName,
Ticket = string.Format(TicketFormat, ticket.TicketID),
AssetId = ticket.Asset.Serial,
OpenDate = ticket.CheckedInDate,
OpenFor = CalculateOpenDaysAndHours(ticket.CheckedInDate, ticket.ClosedDate),
Account = ticket.Account.Customer.Name,
Description = ticket.Description.Replace("\n", ", "),
Status = ticket.TicketStatus.TicketStatusName,
Closed = ticket.ClosedDate,
//Use Sum and the foreign relation instead of a nested query
Amount = ticket.ProductChargeItems.Where(pci => pci.Deleted == null).Sum(pci => pci.Qty * pci.Price),
Paid = ticket.Paid,
Warranty = ticket.WarrantyRepair,
AssetLocation = GetAssetLocationNameFromID(ticket.Asset.LocationID, AssLocNames)
}).Skip(totalToDisplay * page).Take(totalToDisplay);
if (SortOrder.ToLower().Contains("Asc".ToLower()))
{
query = query.OrderBy(p => p.OpenDate);
}
else
{
query = query.OrderByDescending(p => p.OpenDate);
}
I think you can make this query simpler. Something like this:
public static string GetOutstandingBalanceForTicket(DataClassesDataContext context, int TicketID)
{
// A single joined query replaces the two nested loops.
decimal total = (from ticket in context.Tickets
join chargeItem in context.ProductChargeItems on ticket.ChargeID equals chargeItem.ChargeID
where ticket.TicketID == TicketID && chargeItem.Deleted == null
select chargeItem).Sum(chargeItem => chargeItem.Qty * chargeItem.Price);
return total.ToString("0.##");
}
/*...*/
Amount = GetOutstandingBalanceForTicket(dataClassesDataContext, ticket.TicketID),
Now you can inline this method in your query.
It may contain syntax errors, because I wrote it in Notepad.
First, I'm grabbing ClientID. Then, I get all Invoices associated with that ClientID. I want to return data all ordered by InvoiceNumber, descending. Here's my code:
var rvInvoices =
(from i in db.QB_INVOICES_HEADER
where i.ClientID == cId
select i).ToList();
foreach (var itm in rvInvoices)
{
InvoiceModel cm = new InvoiceModel()
{
InvoiceNumber = itm.InvoiceNumber,
InvoiceSentDt = itm.InvoiceSentDt,
InvoiceDt = itm.InvoiceDt,
Amount = itm.Amount,
Term = itm.Term,
ClientName = itm.CI_CLIENTLIST.ClientName
};
listInvoices.Add(cm);
}
return listInvoices;
listInvoices.OrderByDescending(x => x.InvoiceNumber).ToList()
You should try something like this:
var rvInvoices =
(from i in db.QB_INVOICES_HEADER
where i.ClientID == cId
select i).OrderByDescending(x => x.InvoiceNumber);
And I don't see a reason you need to call .ToList().
You can do the ordering in one of three places:
1. In the initial query
2. In the foreach
3. In the return
Option 1:
var rvInvoices =
(from i in db.QB_INVOICES_HEADER
where i.ClientID == cId
select i).OrderByDescending(i => i.InvoiceNumber).ToList();
Option 2:
foreach (var itm in rvInvoices.OrderByDescending(i => i.InvoiceNumber))
Option 3:
return listInvoices.OrderByDescending(i => i.InvoiceNumber).ToList();
I would suggest taking option 1, since the sort then runs at the database level.
You should order them on the database instead of the client:
var rvInvoices = db.QB_INVOICES_HEADER
.Where(i => i.ClientID == cId)
.OrderByDescending(i => i.InvoiceNumber);
The method you currently have creates multiple lists, has an explicit foreach loop, and needs to have its output sorted. It can be done with just creating a single list, no explicit looping, and with the database doing the sorting for you:
return
(from i in db.QB_INVOICES_HEADER
where i.ClientID == cId
// have the database do the sorting
orderby i.InvoiceNumber descending
select i)
// break out of the DB query to make InvoiceModel
.AsEnumerable()
.Select(itm => new InvoiceModel()
{
InvoiceNumber = itm.InvoiceNumber,
InvoiceSentDt = itm.InvoiceSentDt,
InvoiceDt = itm.InvoiceDt,
Amount = itm.Amount,
Term = itm.Term,
ClientName = itm.CI_CLIENTLIST.ClientName
})
// only create one list as the last step
.ToList();
I have the following code. The function has a lot of Linq calls and I had help on putting this into place.
public IList<Content.Grid> Details(string pk)
{
IEnumerable<Content.Grid> details = null;
IList<Content.Grid> detailsList = null;
var data = _contentRepository.GetPk(pk);
var refType = this.GetRefType(pk);
var refStat = this.GetRefStat(pk);
var type = _referenceRepository.GetPk(refType);
var stat = _referenceRepository.GetPk(refStat);
details =
from d in data
join s in stat on d.Status equals s.RowKey into statuses
from s in statuses.DefaultIfEmpty()
join t in type on d.Type equals t.RowKey into types
from t in types.DefaultIfEmpty()
select new Content.Grid
{
PartitionKey = d.PartitionKey,
RowKey = d.RowKey,
Order = d.Order,
Title = d.Title,
Status = s == null ? null : s.Value,
StatusKey = d.Status,
Type = t == null ? null : t.Value,
TypeKey = d.Type,
Link = d.Link,
Notes = d.Notes,
TextLength = d.TextLength
};
detailsList = details
.OrderBy(item => item.Order)
.ThenBy(item => item.Title)
.Select((t, index) => new Content.Grid()
{
PartitionKey = t.PartitionKey,
RowKey = t.RowKey,
Row = index + 1,
Order = t.Order,
Title = t.Title,
Status = t.Status,
StatusKey = t.StatusKey,
Type = t.Type,
TypeKey = t.TypeKey,
Link = t.Link,
Notes = t.Notes,
TextLength = t.TextLength,
})
.ToList();
return detailsList;
}
The first uses one format for Linq and the second another. Is there some way that I could simplify and/or combine these? I would really like to make this code simpler but I am not sure how to do this. Any suggestions would be much appreciated.
Of course you can combine them. The LINQ keywords such as from, where and select get translated into calls to the same extension methods that you call in the second query, so effectively there's no difference.
If you really want to combine them, the quickest way is to put () around the first query, then append the method calls you use on details in the second query. Like this:
detailsList =
(from d in data // <-- The first query
// ...
select new Content.Grid
{
// ...
})
.OrderBy(item => item.Order) // <-- The calls from the second query
.ThenBy(item => item.Title)
.Select((t, index) => new Content.Grid()
{
//...
}).ToList();
But I think that would be ugly. Two queries are just fine IMO.
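One reason the second step is written with method calls: the Select overload that passes the element index has no query-syntax form, so the row numbering has to be done in method syntax either way. A minimal sketch of that step on simplified data (the tuple shapes and names are made up for this example, and ordinal string comparison is an arbitrary choice):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RowNumberSketch
{
    // Sorts by Order, then Title, and assigns 1-based row numbers.
    public static List<(int Row, string Title)> Number(
        IEnumerable<(int Order, string Title)> items)
    {
        return items
            .OrderBy(x => x.Order)
            .ThenBy(x => x.Title, StringComparer.Ordinal)
            // Indexed Select: the second lambda parameter is the position
            // of the element in the sorted sequence, starting at 0.
            .Select((x, index) => (Row: index + 1, Title: x.Title))
            .ToList();
    }
}
```

Because the index is only meaningful after sorting, the numbering Select has to come after OrderBy and ThenBy, which is why the original code is split into a join step and a sort-and-number step.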