I'm trying to get info from three tables: Clients, Invoices, and Transactions. Every client can have many invoices, and every invoice can have many transactions.
There are several hundred thousand invoices and transactions, with quite a few columns in each table. I only need a few of the columns, and it would take too long to grab every column from every table.
Here's what I'm attempting to do, but it does not work.
var Clients = cps.Clients
.Include(q => q.Invoices.Select(x=> new TInvoices
{
Invno = x.Invno,
// more columns...
})
.ThenInclude(q => q.Transactions).ToList();
I'd also like to do a select on the transactions as well.
Any thoughts on how to do this?
I was going to try some linq joins and such, but I have limited knowledge and understanding of that area. Open to suggestions or ideas!
Thanks!
Use Select instead of Include. Include always loads the whole record; there is no way to make it load only some columns.
var Clients = cps.Clients
.Select(c => new // or other concrete type
{
c.id,
// other properties
Invoices = c.Invoices.Select(i => new TInvoices
{
Invno = i.Invno,
// more columns...
Transactions = i.Transactions.ToList()
})
.ToList()
});
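To also project only a few transaction columns, as the question asks, the nested collection can get its own Select. This is a sketch only; the transaction column names here (Id, Amount) are placeholders for whatever columns you actually need:

```csharp
var clients = cps.Clients
    .Select(c => new // anonymous type; use a concrete DTO if you prefer
    {
        c.id,
        Invoices = c.Invoices.Select(i => new
        {
            i.Invno,
            // hypothetical transaction columns; substitute your real ones
            Transactions = i.Transactions
                .Select(t => new { t.Id, t.Amount })
                .ToList()
        }).ToList()
    })
    .ToList();
```

The whole shape stays a single projection, so EF can translate it into one query that fetches only the listed columns.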
I am currently coding with .NET Core 5 preview 3 and I am having an issue with filtering a list of best-matched customers.
Given these two different code samples how come they produce different results?
How can I fix the second sample to return the same results as the first sample?
Sample One (this works)
//This properly gives the top 10 best matches from the database
using (var context = new CustomerContext(_contextOptions))
{
customers = await context.vCustomer.Where(c => c.Account_Number__c.Contains(searchTerm))
.Select(c => new
{
vCustomer = c,
MatchEvaluator = searchTerm.Contains(c.Account_Number__c)
})
.OrderByDescending(c => c.MatchEvaluator)
.Select(c => new CustomerModel
{
CustomerId = c.vCustomer.Account_Number__c,
CustomerName = c.vCustomer.Name
})
.Take(10)
.ToListAsync();
}
Customer Id Results from sample one (these are the best results)
247
2470
247105
247109
247110
247111
247112
247113
247116
247117
Sample Two (This doesn't work the same even though it's the same code)
//This takes all customers from the database and puts them in a list so they can be cached and sorted later.
List<CustomerModel> customers = new List<CustomerModel>();
using (var context = new CustomerContext(_contextOptions))
{
customers = await context.vCustomer
.Select(c => new CustomerModel
{
CustomerId = c.Account_Number__c,
CustomerName = c.Name
})
.ToListAsync();
}
//This does not properly give the top 10 best matches from the list that was generated from the database
List<CustomerModel> bestMatchedCustomers = await Task.FromResult(
customers.Where(c => c.CustomerId.Contains(searchTerm))
.Select(c => new
{
Customer = c,
MatchEvaluator = searchTerm.Contains(c.CustomerId)
})
.OrderByDescending(c => c.MatchEvaluator)
.Select(c => new CustomerModel
{
CustomerId = c.Customer.CustomerId,
CustomerName = c.Customer.CustomerName
})
.Take(10)
.ToList()
);
Customer Id Results from sample two
247
1065247
247610
32470
324795
624749
762471
271247
247840
724732
You asked "why are they different", and to answer that you need to appreciate that databases have an optimizer that looks at the query being run and changes its data-access strategy according to various factors: how many records are being selected, whether indexes apply, what sorting is requested, and so on.
One of your queries selects the entire table into a client-side list and then uses the list to do the filtering and sorting; the other uses the database to do the filtering and sorting. To a database these are very different things. Scanning a whole table, you likely get the rows out in the order they're stored on disk, which could be effectively random. With a filter, the database might use an indexing strategy that includes or discards a large number of rows based on an index, or it might even use the index itself to retrieve the requested data. How it then sorts ties, if it does anything with them at all, might be completely different from how the client-side sort handles ties (it does nothing with them, actually). Either way, the important point is that the database plans and executes your two different queries differently. It sees different queries because your second version runs the database query without a where or order by.
Couple this with the fact that your sort key has extremely low cardinality (cardinality being how unique the values in a column are): your lead result, the one where the record equals the search term, scores 1 and EVERYTHING else scores 0. That means one record bubbles to the top, the rest are free to be ordered however the system doing the sorting likes, and then you take a subset of them.
...hence why one looks like X and the other like Y.
If you didn't take the subset, the two result sets would be in different orders, but everything in set 1 would be somewhere in set 2. It's just that one set is like 1 3 5 7 2 4 6 and the other is like 1 7 6 5 4 3 2; you're taking the first three results and asking "why is 1 3 5 different from 1 7 6?"
In terms of your code, I would have done something simple that also sorts in a stable fashion (rows always come out in the same order because there is no ambiguity/there are no ties), like:
await context.vCustomer
.Where(c => c.Account_Number__c.Contains(searchTerm))
.OrderBy(c => c.Account_Number__c.Length)
.ThenBy(c => c.Account_Number__c) //stable, if unique
.Take(10)
.Select(c => new CustomerModel
{
CustomerId = c.Account_Number__c,
CustomerName = c.Name
}
)
.ToListAsync();
If you sort the results by their length in chars, then 247 is a better match than 2470, which is better than 24711 or 12471, etc.
"Contains" can be quite performance penalising; perhaps consider StartsWith; theoretically at least, an index could still be used for that
ps: calling your var a MatchEvaluator makes things really confusing for people who know regex well btw
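A sketch of that StartsWith suggestion applied to the query above, under the assumption that the provider translates StartsWith into a prefix LIKE ('term%'), which can use an index, whereas Contains becomes '%term%', which generally cannot:

```csharp
customers = await context.vCustomer
    .Where(c => c.Account_Number__c.StartsWith(searchTerm)) // translates to LIKE 'searchTerm%'
    .OrderBy(c => c.Account_Number__c.Length) // shorter value = closer match to the term
    .ThenBy(c => c.Account_Number__c)         // stable tie-break, if unique
    .Take(10)
    .Select(c => new CustomerModel
    {
        CustomerId = c.Account_Number__c,
        CustomerName = c.Name
    })
    .ToListAsync();
```

Whether the index is actually used still depends on the database's plan, but at least a prefix match gives it the chance.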
You're ordering by the MatchEvaluator value, which is either 1 or 0, so everything after the single exact match is tied.
If I understood correctly, what you want is to order by how early the search term appears in the CustomerId, and then by the CustomerId itself:
List<CustomerModel> bestMatchedCustomers =
    await Task.FromResult(
        customers.Where(c => c.CustomerId.Contains(searchTerm))
                 .OrderBy(c => c.CustomerId.IndexOf(searchTerm))
                 .ThenBy(c => c.CustomerId)
                 .Take(10)
                 .ToList()
    );
I have a database that contains 3 tables:
Phones
PhoneListings
PhoneConditions
PhoneListings has a FK from the Phones table (PhoneID) and a FK from the PhoneConditions table (conditionID).
I am working on a function that adds a Phone Listing to the user's cart, and returns all of the necessary information for the user. The phone make and model are contained in the PHONES table, and the details about the Condition are contained in the PhoneConditions table.
Currently I am using 3 queries to obtain all the necessary information. Is there a way to combine all of this into one query?
public ActionResult phoneAdd(int listingID, int qty)
{
ShoppingBasket myBasket = new ShoppingBasket();
string BasketID = myBasket.GetBasketID(this.HttpContext);
var PhoneListingQuery = (from x in myDB.phoneListings
where x.phonelistingID == listingID
select x).Single();
var PhoneCondition = myDB.phoneConditions
.Where(x => x.conditionID == PhoneListingQuery.phonelistingID).Single();
var PhoneDataQuery = (from ph in myDB.Phones
where ph.PhoneID == PhoneListingQuery.phonePageID
select ph).SingleOrDefault();
}
You could project the result into an anonymous class, or a Tuple, or even a custom-shaped entity, in a single query; however, the overall database performance might not be any better:
var phoneObjects = myDB.phoneListings
.Where(pl => pl.phonelistingID == listingID)
.Select(pl => new
{
PhoneListingQuery = pl,
PhoneCondition = myDB.phoneConditions
.Single(pc => pc.conditionID == pl.phonelistingID),
PhoneDataQuery = myDB.Phones
.SingleOrDefault(ph => ph.PhoneID == pl.phonePageID)
})
.Single();
// Access phoneObjects.PhoneListingQuery / PhoneCondition / PhoneDataQuery as needed
There are also slightly more compact overloads of the LINQ Single and SingleOrDefault extensions which take a predicate as a parameter, which helps shorten the code slightly.
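For instance, the first query from the question collapses into the predicate overload like this (same identifiers as the question):

```csharp
// Equivalent to:
// (from x in myDB.phoneListings where x.phonelistingID == listingID select x).Single()
var phoneListing = myDB.phoneListings
    .Single(pl => pl.phonelistingID == listingID);
```

Single still throws if zero or more than one row matches; use SingleOrDefault if a missing row should yield null instead.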
Edit
As an alternative to multiple retrievals from the ORM DbContext, or to explicit manual joins, you can set up navigation relationships between the entities in your model via the join keys (usually the foreign keys in the underlying tables) and then specify the depth of the fetch with an eager load, using Include:
var phoneListingWithAssociations = myDB.phoneListings
.Include(pl => pl.PhoneConditions)
.Include(pl => pl.Phones)
.Single(pl => pl.phonelistingID == listingID);
This will return the entity graph in phoneListingWithAssociations (assuming foreign keys PhoneListing.phonePageID => Phones.phoneId and PhoneCondition.conditionID => PhoneListing.phonelistingID).
You should be able to pull it all in one query with a join, I think.
But as pointed out, you might not gain a lot of speed from this, as you are just picking the first match and then moving on, not really doing any inner comparisons.
If you know there exists at least one data point in each table, then you might as well pull them all at the same time; if not, deferring the "sub queries" as StuartLC does is nice.
var Phone = (from a in myDB.phoneListings
join b in myDB.phoneConditions on a.phonelistingID equals b.conditionID
join c in myDB.Phones on a.phonePageID equals c.PhoneID
where
a.phonelistingID == listingID
select new {
Listing = a,
Condition = b,
Data = c
}).FirstOrDefault();
FirstOrDefault because Single throws an error if more than one element exists.
I have three tables, two of which are in a many-to-many relationship.
Picture:
This is the data in the middle many-to-many table:
Edit:
I've gotten this far. I get the expected 4 rows back, but they are all the same result (I do need 4 rows, but they should be different results):
return this._mediaBugEntityDB.LotteryOffers
.Find(lotteryOfferId).LotteryDrawDates
.Join(this._mediaBugEntityDB.Lotteries, ldd => ldd.LotteryId, lot => lot.Id, (ldd, lot) =>
new Lottery
{
Name = lot.Name,
CreatedBy = lot.CreatedBy,
ModifiedOn = lot.ModifiedOn
}).AsQueryable();
My question is: how can I retrieve all the Lotteries via the many-to-many table, given only a LotteryOfferId?
What I want to achieve is to get data from the Lottery table by LotteryDrawDateId.
First I use the LotteryOfferId to get DrawDates from the middle table; from the middle table I get the drawDateIds to use in the LotteryDrawDate table. From that table I need to retrieve the Lottery rows via the LotteryId in the LotteryDrawDate table.
I can get this with plain SQL (LotteryOffersLotteryDrawDates is the middle table in the DB, not seen in the model):
select Name, Lotteries.CreatedBy, Lotteries.ModifiedOn, count(Lotteries.Id) as TotalDrawDates
from Lotteries
join LotteryDrawDates on Lotteries.Id = LotteryDrawDates.LotteryId
join LotteryOffersLotteryDrawDates on LotteryDrawDates.Id = LotteryOffersLotteryDrawDates.LotteryDrawDate_Id
where LotteryOffersLotteryDrawDates.LotteryOffer_Id = 19
group by Name, Lotteries.CreatedBy, Lotteries.ModifiedOn
But LINQ is a different story :P
I would like to do this with lambda expressions.
Thanks
db.LotteryOffer.Where(lo => lo.Id == <lotteryOfferId>)
.SelectMany(lo => lo.LotteryDrawDates)
.Select( ldd => ldd.Lottery )
.GroupBy( l => new { l.Name, l.CreatedBy, l.ModifiedOn } )
.Select( g => new
{
g.Key.Name,
g.Key.CreatedBy,
g.Key.ModifiedOn,
TotalDrawDates = g.Count()
} );
You can do this:
var query = from lo in this._mediaBugEntityDB.LotteryOffers
where lo.lotteryOfferId == lotteryOfferId
from ld in lo.LotteryDrawDates
group ld by ld.Lottery into grp
select grp.Key;
I did this in query syntax because (in my opinion) it is easier to see what happens. The main point is the grouping by Lottery, because you get a number of LotteryDrawDates, any of which can have the same Lottery.
If you want to display the counts of LotteryDrawDates per Lottery it's better to take a different approach:
from lot in this._mediaBugEntityDB.Lotteries.Include(x => x.LotteryDrawDates)
where lot.LotteryDrawDates
    .Any(ld => ld.LotteryOffers
        .Any(lo => lo.Id == lotteryOfferId))
select lot
Now you get Lottery objects with their LotteryDrawDates collections loaded, so afterwards you can access lottery.LotteryDrawDates.Count() without lazy loading exceptions.
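If you instead want the counts computed at the database level, a sketch in the same spirit (assuming a LotteryOffers navigation on LotteryDrawDate; the property names are guesses based on the model above):

```csharp
var lotteryCounts =
    from lot in this._mediaBugEntityDB.Lotteries
    where lot.LotteryDrawDates
        .Any(ld => ld.LotteryOffers.Any(lo => lo.Id == lotteryOfferId))
    select new
    {
        lot.Name,
        lot.CreatedBy,
        lot.ModifiedOn,
        // counted by the database, mirroring the SQL's count(Lotteries.Id)
        TotalDrawDates = lot.LotteryDrawDates.Count()
    };
```

This keeps the aggregation server-side, so no LotteryDrawDates collections need to be materialized at all.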
I have the following foreach statement:
foreach (var articleId in cleanArticlesIds)
{
var countArt = context.TrackingInformations.Where(x => x.ArticleId == articleId).Count();
articleDictionary.Add(articleId, countArt);
}
The database looks like this:
TrackingInformation(Id, ArticleId --some stuff
Article(Id, --some stuff
What I want is the count of each article id in the TrackingInformations table.
For example:
ArticleId:1 Count:1
ArticleId:2 Count:8
ArticleId:3 Count:5
ArticleId:4 Count:0
so I can have a dictionary<articleId, count>
Context is the Entity Framework DbContext. The problem is that this solution works very slowly (there are >10k articles in the db, and they should grow rapidly).
Try the next query to gather the grouped data, and then add the missing entries. You can try to skip the Select clause; I don't know whether EF handles ToDictionary well without it.
If you encounter the Select N+1 problem (a huge number of database requests), you can add a ToList() step between Select and ToDictionary, so that all required information is brought into memory first.
This all depends on your mapping configuration and environment, so to get good performance you need to play a little with different queries. The main approach is to aggregate as much data as possible at the database level, with few queries.
var articleDictionary =
    context.TrackingInformations
           .Where(trackInfo => cleanArticlesIds.Contains(trackInfo.ArticleId))
           .GroupBy(trackInfo => trackInfo.ArticleId)
           .Select(grp => new { grp.Key, Count = grp.Count() })
           .ToDictionary(info => info.Key, info => info.Count);

// Articles with no tracking rows are absent from the grouped result; add them with a zero count.
foreach (var missingArticleId in cleanArticlesIds)
{
    if (!articleDictionary.ContainsKey(missingArticleId))
        articleDictionary.Add(missingArticleId, 0);
}
If TrackingInformation is a navigable property of Article, then you can do this:
var result = context.Article.Select(a => new { a.id, Count = a.TrackingInformation.Count() });
Putting it into a dictionary is simple as well:
var result = context.Article
    .Select(a => new { a.id, Count = a.TrackingInformation.Count() })
    .ToDictionary(a => a.id, a => a.Count);
If TrackingInformation isn't a navigable property, then you can do:
var result = context.Article.GroupJoin(
        context.TrackingInformation,
        article => article.id,
        tracking => tracking.ArticleId,
        (article, trackings) => new { id = article.id, Count = trackings.Count() })
    .ToDictionary(a => a.id, a => a.Count);
What I want to do is basically what this question asks: SQL Server - How to display most recent records based on dates in two tables. The only difference is that I am using LINQ to SQL.
I have two tables:
Assignments
ForumPosts
These are not very similar, but they both have a "LastUpdated" field. I want the most recent records from the two tables joined together. However, I also need take/skip functionality for paging (and no, I don't have SQL 2012).
I don't want to build a new list (with ToList and AddRange) containing ALL my records just so I can order the whole set. That seems extremely inefficient.
My attempt:
Please don't laugh at my inefficient code... well, OK, a little (both because it's inefficient and because it doesn't do what I want when skip is more than 0).
public List<TempContentPlaceholder> LatestReplies(int take, int skip)
{
using (GKDBDataContext db = new GKDBDataContext())
{
var forumPosts = db.dbForumPosts.OrderBy(c => c.LastUpdated).Skip(skip).Take(take).ToList();
var assignMents = db.dbUploadedAssignments.OrderBy(c => c.LastUpdated).Skip(skip).Take(take).ToList();
List<TempContentPlaceholder> fps =
forumPosts.Select(
c =>
new TempContentPlaceholder()
{
Id = c.PostId,
LastUpdated = c.LastUpdated,
Type = ContentShowingType.ForumPost
}).ToList();
List<TempContentPlaceholder> asm =
assignMents.Select(
c =>
new TempContentPlaceholder()
{
Id = c.UploadAssignmentId,
LastUpdated = c.LastUpdated,
Type = ContentShowingType.ForumPost
}).ToList();
fps.AddRange(asm);
return fps.OrderBy(c=>c.LastUpdated).ToList();
}
}
Any awesome LINQ to SQL people who can throw me a hint? I am sure someone can join their way out of this!
First, you should be using OrderByDescending, since later dates have greater values than earlier dates and you want the most recent updates.
Second, what you are doing will work for the first page, but you need to take only the top take values from the merged list as well. That is, if you want the latest 20 entries from both tables combined: take the latest 20 from each, merge them, then take the latest 20 of the merged list.
The problem comes in when you attempt paging, because you would need to know how many elements from each list went into making up the previous pages. Your best bet is probably to merge first, then apply Skip/Take. I know you don't want to hear that, but the other solutions are more complex. Alternatively, you could take the top skip + take values from each table, merge them, skip the skip values, and apply take.
using (GKDBDataContext db = new GKDBDataContext())
{
var fps = db.dbForumPosts.Select(c => new TempContentPlaceholder()
{
Id = c.PostId,
LastUpdated = c.LastUpdated,
Type = ContentShowingType.ForumPost
})
.Concat(db.dbUploadedAssignments.Select(c => new TempContentPlaceholder()
{
Id = c.UploadAssignmentId,
LastUpdated = c.LastUpdated,
Type = ContentShowingType.ForumPost // adjust if assignments have their own type value
}))
.OrderByDescending( c => c.LastUpdated )
.Skip(skip)
.Take(take)
.ToList();
return fps;
}
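The alternative mentioned earlier, taking only the top skip + take rows from each table before merging, can be sketched as follows (same types as the question; depending on the LINQ to SQL provider, the final ordering over Concat may need an AsEnumerable() first):

```csharp
using (GKDBDataContext db = new GKDBDataContext())
{
    // Each source contributes at most skip + take rows, which is enough
    // to fill any page up to that depth after merging.
    var posts = db.dbForumPosts
        .OrderByDescending(c => c.LastUpdated)
        .Take(skip + take)
        .Select(c => new TempContentPlaceholder()
        {
            Id = c.PostId,
            LastUpdated = c.LastUpdated,
            Type = ContentShowingType.ForumPost
        });

    var assignments = db.dbUploadedAssignments
        .OrderByDescending(c => c.LastUpdated)
        .Take(skip + take)
        .Select(c => new TempContentPlaceholder()
        {
            Id = c.UploadAssignmentId,
            LastUpdated = c.LastUpdated,
            Type = ContentShowingType.ForumPost // adjust if assignments have their own type value
        });

    // Merge the two candidate sets, then page over the combined order.
    return posts.Concat(assignments)
        .OrderByDescending(c => c.LastUpdated)
        .Skip(skip)
        .Take(take)
        .ToList();
}
```

This bounds the per-table work while still producing the same page as merging the full tables would.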