I have a simple foreach loop that goes through the product IDs I have stored in a user's basket and looks up each product's details from the database.
As you can see from my code, what I have at present will return the very last item on screen - as the variable is overwritten within the loop. I'd like to be able to concat this so that I can display the product details for the items only in the basket.
I know I could do something very easy, like store only ProductIDs in the repeater I use and call the database in OnItemDataBound, but I'd like to make just one database call if possible.
Currently I have the following (removed complex joins from example, but if this matters let me know):
IQueryable productsInBasket = null;
foreach (var thisproduct in store.BasketItems)
{
    productsInBasket = (from p in db.Products
                        where p.Active == true && p.ProductID == thisproduct.ProductID
                        select new
                        {
                            p.ProductID,
                            p.ProductName,
                            p.BriefDescription,
                            p.Details,
                            p.ProductCode,
                            p.Barcode,
                            p.Price
                        });
}
BasketItems.DataSource = productsInBasket;
BasketItems.DataBind();
Thanks for your help!
It sounds like you really want something like:
var productIds = store.BasketItems.Select(x => x.ProductID).ToList();
var query = from p in db.Products
            where p.Active && productIds.Contains(p.ProductID)
            select new
            {
                p.ProductID,
                p.ProductName,
                p.BriefDescription,
                p.Details,
                p.ProductCode,
                p.Barcode,
                p.Price
            };
Jon's answer works just fine, but note that calling ToList() materializes the ID list: the query behind it is executed and the results are held in memory as an IEnumerable rather than an IQueryable. For your situation this may be OK, since you are retrieving products for a basket, where the number of products will probably be small.
I am, however, facing a similar situation, where I want to retrieve friends for a member. Friendship depends on which groups two members belong to - if they share at least one group, they are friends. I thus have to retrieve all memberships for all groups for a certain member, then retrieve all members from those groups.
The ToList approach is not applicable in my case, since it would execute the query each time I want to handle my friends in various ways, e.g. find stuff that we can share. Retrieving all members from the database, instead of just working on the query and executing it at the last possible moment, will kill performance.
Still, my first attempt at this situation was to do just this - retrieve all groups I belonged to (IQueryable), initialize a List as the result (IEnumerable), then loop over all groups and append all members to the result if they were not already in the list. Finally, since my interface required that an IQueryable be returned, I returned the list with AsQueryable.
This was a nasty piece of code, but at least it worked. It looked something like this:
var result = new List<Member>();
foreach (var group in GetGroupsForMember(member))
    result.AddRange(group.GroupMembers
        .Where(x => x.MemberId != member.Id && !result.Contains(x.Member))
        .Select(groupMember => groupMember.Member));
return result.AsQueryable();
However, this is BAD, since I add ALL shared members to a list, then convert the list to an IQueryable just to satisfy my post condition. I will retrieve all members that are affected from the database, every time I want to do stuff with them.
Imagine a paginated list - I would then just want to pick out a certain range from this list. If this is done with an IQueryable, the query is just completed with a pagination statement. If this is done with an IEnumerable, the query has already been executed and all operations are applied to the in-memory result.
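The pagination case can be sketched as follows. This is only an illustration, assuming a hypothetical Page helper that is not part of my original code. Skip and Take are appended to the IQueryable, so a database-backed provider can translate them into the query, and nothing runs until the result is enumerated.

```csharp
using System.Linq;

public static class PagingSketch
{
    // Hypothetical helper: appends Skip/Take to the query without executing it.
    // pageIndex is zero-based.
    public static IQueryable<T> Page<T>(IQueryable<T> source, int pageIndex, int pageSize)
    {
        return source.Skip(pageIndex * pageSize).Take(pageSize);
    }
}
```

With a hypothetical friends query, PagingSketch.Page(friends.OrderBy(m => m.Id), 0, 20).ToList() would ask for the first 20 rows only. Some providers (Entity Framework 6, for example) require an OrderBy before Skip.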
(As you may also notice, I navigate down the entity's relations (GroupMember => Member), which increases coupling and can cause all kinds of nasty situations further on. I wanted to remove this behavior as well.)
So, tonight, I took another round and ended up with a much simpler approach, where I select data like this:
var groups = GetGroupsForMember(member);
var groupMembers = GetGroupMembersForGroups(groups);
var memberIds = groupMembers.Select(x => x.MemberId);
var members = memberService.GetMembers(memberIds);
The two Get methods honor the IQueryable and never convert it to a list or any other IEnumerable. The third line just composes a LINQ projection on top of the IQueryable. The last line takes the member IDs and retrieves all members from another service, which also works exclusively with IQueryables.
This is probably still horrible in terms of performance, but I can optimize it further later on, if needed. At least, I avoid loading unnecessary data.
Let me know if I am terribly wrong here.
Related
I am having difficulty trying to use LINQ to query a sql database in such a way to group all objects (b) in one table associated with an object (a) in another table into an anonymous type with both (a) and a list of (b)s. Essentially, I have a database with a table of offers, and another table with histories of actions taken related to those offers. What I'd like to be able to do is group them in such a way that I have a list of an anonymous type that contains every offer, and a list of every action taken on that offer, so the signature would be:
List<'a>
where 'a is new { Offer offer, List<OfferHistories> offerHistories}
Here is what I tried initially, which obviously will not work
var query = (from offer in context.Offers
             join offerHistory in context.OffersHistories on offer.TransactionId equals offerHistory.TransactionId
             group offerHistory by offerHistory.TransactionId into offerHistories
             select { offer, offerHistories.ToList() }).ToList();
Normally I wouldn't come to SE with this little information but I have tried many different ways and am at a loss for how to proceed.
Please try to avoid .ToList() calls; only use them when really necessary. I have an important question: do you really need all columns of OffersHistories? Grouping a full object is very expensive, so try grouping only the necessary columns instead. If you really need all offerHistories for one offer, then I suggest writing a sub-select (this also costs some performance):
var query = (from offer in context.Offers
             select new
             {
                 offer,
                 offerHistories = (from offerHistory in context.OffersHistories
                                   where offerHistory.TransactionId == offer.TransactionId
                                   select offerHistory)
             });
P.S.: it's a good idea to create indexes for foreign key columns and for columns that are used in where and group by clauses; those will make the query faster.
I have an app that retrieves data requested by the user. All parameters except Type are optional. If a parameter is not specified, all items are retrieved. If it is specified, only items corresponding that parameter are retrieved. For example, here I retrieve products by year of release (-1 is the default value, if the user hasn't specified one):
var products = context.Products.Where(p => p.type == Type).ToList();
if (!(Year == -1))
    products = products.Where(p => p.year == Year).ToList();
This works perfectly fine for some of the years. E.g., if I search 2001, I get all entries needed. But since products has a limited size and only retrieves 1500 entries, later years are simply not retrieved, not in the products list, and it comes up as no data for that year, even though there is data in the DB.
How can I get around this problem?
One of the nice things about deferred execution on LINQ is it can help make code that has variable filtering rules a lot more neat and readable. If you're not sure what deferred execution is, in a nutshell it's a mechanism that only runs the LINQ query when you ask for the results rather than when you make the statements that comprise the query.
In essence this means we can have code like:
//always adults
var p = person.Where(x => x.Age > 18);
//we maybe filter on these
if(email != null)
    p = p.Where(x => x.Email == email);
if(socialSN != null)
    p = p.Where(x => x.SSN == socialSN);
var r = p.ToList(); //the query is only actually run now
The multiple calls to Where here are cumulative; they conceptually build up a where clause but do not execute the query until ToList is called. At that point, if a database is in use, the db sees the query with all its Where clauses and can leverage indexes and statistics.
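To see the deferral for yourself, here is a small sketch using plain LINQ to Objects, with a made-up array of ages and a counter (both are just for illustration). The same principle applies to database-backed queries, where the deferred work is the SQL round trip.

```csharp
using System.Linq;

public static class DeferredExecutionSketch
{
    // Returns how many times the predicate had run before and after ToList().
    public static (int Before, int After) CountPredicateCalls()
    {
        int calls = 0;
        var ages = new[] { 12, 25, 40 };

        // Building the query does not run the predicate.
        var adults = ages.Where(a => { calls++; return a > 18; });
        int before = calls;

        // Enumerating the query runs the predicate once per element.
        _ = adults.ToList();

        return (before, calls);
    }
}
```

The predicate has not run at all when the query is merely defined; it only runs once ToList enumerates the source.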
If we were to use ToList after every Where, then the first Where would hit the db and its whole dataset would download to the client app, and the runtime would set about converting an enumerable to a list (a lot of copying and memory allocating). The subsequent Where would filter the list in the client app, enumerating it but then converting it to a list again. The big problem is that it's done in the memory of the client app as a naive unindexed loop, and all those millions of dollars of R&D Microsoft poured into making the SQL Server query optimizer pull huge amounts of data very quickly are wasted :)
Consider also that the first clause in my example, Age > 18, could match a huge set; with a million people spread over ages from 12 upwards, for example, a large amount of data satisfies that predicate. Email or SSN would match a far smaller dataset, and is probably indexed. It's a contrived example, sure, but hopefully it illustrates the point about performance: by calling ToList() too early we end up downloading too much data.
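Applied to the products question, the same pattern might look like the sketch below. The Product class and the Filter helper are assumptions for illustration (the question does not show its model); -1 means "no year specified", as in the question.

```csharp
using System.Linq;

// Hypothetical model; the real entity will have more columns.
public class Product
{
    public string Type { get; set; }
    public int Year { get; set; }
}

public static class ProductFilterSketch
{
    // Composes the filters without executing the query.
    public static IQueryable<Product> Filter(IQueryable<Product> products, string type, int year = -1)
    {
        var query = products.Where(p => p.Type == type);

        // Optional filter: only added when the user specified a year.
        if (year != -1)
            query = query.Where(p => p.Year == year);

        return query; // the caller decides when to run it, e.g. with ToList()
    }
}
```

Calling ToList() once, on the value returned by Filter, lets the provider apply both conditions in a single query instead of filtering an already truncated in-memory list.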
We are working on a project in which many LINQ queries are not optimized, because when the project was started, every navigation property in the models was declared virtual.
My task is to optimize as many of the queries as possible, in order to improve the app's performance.
The problem is that if I use the Include function and remove all virtual properties from the models, a lot of things stop working, and the number of affected functions is huge.
So I wondered whether there is something resembling an "exclude" that would leave out the unnecessary sub-queries in some cases.
(Assuming your result set implements IEnumerable.)
My first choice would be:
ListMain.Except(ItemsToExclude);
Or, I would go with a negated "Contains" check as follows, filtering out the records to exclude. This may not be the best way out there, but it could work.
ListMain.Where(x => !ItemsToExclude.Contains(x));
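A minimal sketch of both options, using in-memory lists of integers as stand-ins for the real result sets (the class and method names are made up for illustration). The negated Contains check is written as a Where filter, since Contains tests a single item. One difference to be aware of: Except is a set operation, so it also removes duplicates from the main list, while the Where filter keeps them.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ExcludeSketch
{
    // Set difference: drops excluded items and collapses duplicates.
    public static List<int> WithExcept(IEnumerable<int> main, IEnumerable<int> exclude)
    {
        return main.Except(exclude).ToList();
    }

    // Filter: drops excluded items but keeps duplicates in the main list.
    public static List<int> WithNotContains(IEnumerable<int> main, IEnumerable<int> exclude)
    {
        var excluded = new HashSet<int>(exclude); // fast membership checks
        return main.Where(x => !excluded.Contains(x)).ToList();
    }
}
```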
I don't know if I got the question right, but to avoid loading certain attributes or related objects, you can make an additional Select() including only what's needed.
Example:
A simple ToListAsync() will bring the entire object from the table:
var resultList = await dbContext.ABTests.AsNoTracking().ToListAsync();
It results in the query:
SELECT [a].[Id], [a].[AssignedUsers], [a].[EndDate], [a].[Groups], [a].[Json], [a].[MaxUsers], [a].[Name], [a].[NextGroup], [a].[StartDate]
FROM [ABTests] AS [a]
(which includes all mapped fields of the ABTest object)
To avoid fetching everything, you can do as follows:
var resultList = await dbContext.ABTests.AsNoTracking().Select(x => new ABTest
{
    Id = x.Id,
    Name = x.Name
}).ToListAsync();
(supposing that you only want the fields Id and Name to be eagerly loaded)
The resulting SQL Query will be:
SELECT [a].[Id], [a].[Name]
FROM [ABTests] AS [a]
I am profiling my code because it is taking a long time to execute. It is generating a SELECT instead of a COUNT, and as there are 20,000 records it is very, very slow.
This is the code:
var catViewModel= new CatViewModel();
var catContext = new CatEntities();
var catAccount = catContext.Account.Single(c => c.AccountId == accountId);
catViewModel.NumberOfCats = catAccount.Cats.Count();
It is straightforward stuff, but the code that the profiler is showing is:
exec sp_executesql N'SELECT
[Extent1].xxxxx AS yyyyy,
[Extent1].xxxxx AS yyyyy,
[Extent1].xxxxx AS yyyyy,
[Extent1].xxxxx AS yyyyy // You get the idea
FROM [dbo].[Cats] AS [Extent1]
WHERE Cats.[AccountId] = @EntityKeyValue1',N'@EntityKeyValue1 int',@EntityKeyValue1=7
I've never seen this behaviour before, any ideas?
Edit: It is fixed if I simply do this instead:
catViewModel.NumberOfRecords = catContext.Cats.Where(c => c.AccountId == accountId).Count();
I'd still like to know why the former didn't work though.
So you have two completely separate queries going on here, and I think I can explain why they behave differently. Let's look at the first one:
// pull a single account record
var catAccount = catContext.Account.Single(c => c.AccountId == accountId);
// count all the associated Cat records against said account
catViewModel.NumberOfCats = catAccount.Cats.Count();
Assuming Cats has a 0..* relationship with Account and that you are relying on the framework's ability to lazily load related tables, your first access to catAccount.Cats results in a SELECT for all the Cat records associated with that account. The whole set is brought into memory, so the call to Count() simply checks the Count property of the in-memory collection (hence no COUNT in the generated SQL).
The second query
catViewModel.NumberOfRecords =
catContext.Cats.Where(c => c.AccountId == accountId).Count();
Is directly against the Cats table (which would be IQueryable<T>) therefore the only operations performed against the table are Where/Count, and both of these will be evaluated on the DB-side before execution so it's obviously a lot more efficient than the first.
However, if you need both Account and Cats, then I would recommend that you eager load the data on the fetch, so that you take the hit once, up front:
var catAccount = catContext.Account.Include(a => a.Cats).Single(...);
Most times, when somebody accesses a sub-collection of an entity, it is because there are a limited number of records, and it is acceptable to populate the collection. Thus, when you access:
catAccount.Cats
(regardless of what you do next), it is filling that collection. Your .Count() is then operating on the local in-memory collection. The problem is that you don't want that. Now you have three options:
check whether your provider offers some mechanism to make that a query rather than a collection
build the query dynamically
access the core data-model instead
I'm pretty confident that if you did:
catViewModel.NumberOfRecords =
catContext.Cats.Count(c => c.AccountId == accountId);
it will work just fine. Less convenient? Sure. But "works" is better than "convenient".
I have a model where a Product can have multiple PriceDrops. I'm trying to generate a list of products with the most recent price drops.
Getting the most recent price drops with the products loaded is easy enough, and I thought it would be the best way to start:
dlo.LoadWith<PriceDrop>(pd => pd.Product);
db.LoadOptions = dlo;
return db.PriceDrops.OrderBy(d=>d.CreatedTime);
Works great for a list of recent price drops, but I want a list of products. If I append a ".Select(d=>d.Product)" I get a list of Products back - which is perfect - but they are no longer associated with the PriceDrops. That is, if I call .HasLoadedOrAssignedValues on the products, it returns false. If I try to interrogate the Price Drops, it tries to go back to the DB for them.
Is there a way around this, or do I have to craft a query starting with Products and not use the Select modifier? I was trying to avoid that, because in some cases I want a list of PriceDrops, and I wanted to re-use as much logic as possible (I left out the where clause and other filter code from the sample above, for clarity).
Thanks,
Tom
Try loading the Products, ordered by their latest PriceDrop:
dlo.LoadWith<Product>(p => p.PriceDrops);
db.LoadOptions = dlo;
return db.Products.OrderBy(d => d.PriceDrops.Max(pd => pd.CreatedTime));
I understand from your question that you're trying to avoid this, why?
I think what you need here is the AssociateWith method, also on the DataLoadOptions class.
dlo.AssociateWith<Product>(p => p.PriceDrops.OrderBy(d=>d.CreatedTime));