NHibernate Group By parent entity without N+1 query? - c#

I have two tables that have a parent-child relationship. I want to count the records of the child table, grouping them by the parent entity and gather the results. So I want to see how many times each parent entity is referenced in the child table.
So if my parent table is Cats:
| Id | Name |
| 1 | Bob |
| 2 | Garfield |
and the child table is CatSkills:
| Id | Cat_Id | Skill |
| 1 | 1 | Land on feet |
| 2 | 2 | Eat lasagne |
| 3 | 2 | Escape diets |
I want to receive this:
| Id | Name | count of skills |
| 1 | Bob | 1 |
| 2 | Garfield | 2 |
I've tried with NHibernate LINQ, the query seems to be correct, but I get a "feature not supported" exception.
I tried with NHibernate QueryOver, there I get a N+1 problem:
var q = Session.QueryOver<CatSkill>()
.Fetch(s => s.Cat).Eager
.Select(Projections.ProjectionList()
.Add(Projections.Group<CatSkill>(s => s.Cat))
.Add(Projections.RowCount()))
.List<object[]>();
The above query works but will fetch all parent records in separate queries.
In other parts of experimenting I ended up with a SQL exception about how the referenced columns in the SELECT statement are not part of the GROUP BY clause.
Does anyone have an idea on how to implement this query? Thanks!
Update
The updated code, thanks to Radim, looks like this:
// a private class, just to make the query work
class CatDto : Cat
{
public int Count { get; set; }
}
// the actual query code
Cat parent = null;
CatSkill child = null;
CatDto dto = null;
// this is in fact a subselect, which will be injected into parent's SELECT
var subQuery = QueryOver.Of<CatSkill>(() => child)
.Where(() => child.Cat.ID == parent.ID)
.Select(Projections.RowCount());
// this is another subquery to filter out cats without skills
var skillFilterSubQuery = QueryOver.Of<CatSkill>(() => child)
.Where(() => child.Cat.ID == parent.ID /* && more criteria on child table here... */)
.Select(p => p.Cat);
// the alias here is essential, because it is used in the subselect
var query = session.QueryOver<Cat>(() => parent);
// I only want cats with skills
query = query.WithSubquery.WhereExists(skillFilterSubQuery);
query.SelectList(l => l
.Select(p => p.ID).WithAlias(() => dto.ID)
.Select(p => p.Name).WithAlias(() => dto.Name)
// annoying part: I have to repeat the property mapping for all needed properties of parent...
// see the parent.Count property
.Select(Projections.SubQuery(subQuery)).WithAlias(() => dto.Count));
query.TransformUsing(Transformers.AliasToBean<CatDto>());
return query.List<CatDto>();
So this gets rid of the N+1 problem but I have to map every property of the parent class (Cat in the example) manually to the DTO.
It would be nice if I could map it like .Select(s => s) but that throws an Exception saying it can't map the "" property.

An elegant way could be to directly query the parent Cat, and extend it with the required count - as a subselect.
Cat parent = null;
CatSkills child = null;
// this is in fact a subselect, which will be injected into parent's SELECT
var subQuery = QueryOver.Of<CatSkills>(() => child)
.Where(() => child.Cat.ID == parent.ID)
.Select(Projections.RowCount());
// the alias here is essential, because it is used in the subselect
var query = session.QueryOver<Cat>(() => parent);
query.SelectList(l => l
.Select(p => p.ID).WithAlias(() => parent.ID)
.Select(p => p.Name).WithAlias(() => parent.Name)
// see the parent.Count property
.Select(Projections.SubQuery(subQuery)).WithAlias(() => parent.Count)
);
query.TransformUsing(Transformers.AliasToBean<Cat>());
In this case we expect Cat to have a property
public virtual int Count { get; set; }
which is not mapped by NHibernate. If we cannot extend the C# object, we can create a CatDTO (with the same properties as the Cat entity, plus Count).

Related

How to merge multiple list by id and get specific data?

I have 3 lists with common IDs. I need to group by an object in one list and extract data from the other two. Here is an example to make this clearer:
table for groupNames:
| Id | Name |
|--------------|
| 1 | Hello |
| 2 | Hello |
| 3 | Hey |
| 4 | Dude |
| 5 | Dude |
table for countId:
| Id | whatever |
|---------------|
| 1 | test0 |
| 1 | test1 |
| 2 | test2 |
| 3 | test3 |
| 3 | test4 |
table for lastTime:
| Id | timestamp |
|-----------------|
| 1 | 1636585230 |
| 1 | 1636585250 |
| 2 | 1636585240 |
| 3 | 1636585231 |
| 3 | 1636585230 |
| 5 | 1636585330 |
and I'm expecting result in list like this
| Name | whateverCnt | lastTimestamp |
|---------------------------------------|
| Hello | 3 | 1636585250 |
| Hey | 2 | 1636585231 |
| Dude | 0 | 1636585330 |
So far I have something like this, but it doesn't work:
return groupNames
.GroupBy(x => x.Name)
.Select(x =>
{
return new myElem
{
Name = x.Name,
lastTimestamp = new DateTimeOffset(lastTime.Where(a => groupNames.Where(d => d.Name == x.Key).Select(d => d.Id).Contains(a.Id)).Max(m => m.timestamp)).ToUnixTimeMilliseconds(),
whateverCnt = countId.Where(q => (groupNames.Where(d => d.Name == x.Key).Select(d => d.Id)).ToList().Contains(q.Id)).Count()
};
})
.ToList();
Many thanks for any advice.
I think I'd mostly skip LINQ for this
class Thing{
public string Name {get;set;}
public int Count {get;set;}
public long LastTimestamp {get;set;}
}
...
var ids = new Dictionary<int, string>();
var result = new Dictionary<string, Thing>();
foreach(var g in groupNames) {
ids[g.Id] = g.Name;
result[g.Name] = new Thing { Name = g.Name };
}
foreach(var c in countId)
result[ids[c.Id]].Count++;
foreach(var l in lastTime){
var t = result[ids[l.Id]];
if(t.LastTimestamp < l.timestamp) t.LastTimestamp = l.timestamp;
}
We start off making two dictionaries (you could use ToDictionary for this). If groupNames is already a dictionary that maps id:name then you can skip making the ids dictionary and just use groupNames directly. This gives us fast lookup from ID to Name, but we actually want to collect results into a name:something mapping, so we make one of those too. Doing result[name] = thing always succeeds, even if we've seen name before. We could skip some object creation with a ContainsKey check here if you want.
Then all we need to do is enumerate our other N collections, building the result. The result we want is accessed from result[ids[some_id_value_here]] and it always exists if groupnames id space is complete (we will never have an id in the counts that we do not have in groupNames)
For counts, we don't care for any of the other data; just the presence of the id is enough to increment the count
For dates, it's a simple max algorithm of "if known max is less than new value, make known max = new value". If you know your dates list is sorted ascending you can skip that if too.
In your example, the safest would be a list of the last specified object and just LINQ query the other arrays of objects for the same id.
So something like
public IEnumerable<SomeObject> MergeListsById(
IEnumerable<GroupNames> groupNames,
IEnumerable<CountId> countIds,
IEnumerable<LastTime> lastTimes)
{
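// List<T> rather than IEnumerable<T>, because IEnumerable<T> has no Add method
var mergedList = new List<SomeObject>();
// plain foreach, because ForEach is defined on List<T>, not on IEnumerable<T>
foreach (var gn in groupNames)
{
mergedList.Add(new SomeObject {
Name = gn.Name,
// number of countIds rows with this id
whateverCnt = countIds.Count(ci => ci.Id == gn.Id),
// latest timestamp for this id, or the default value if there is none
lastTimeStamp = lastTimes.Where(lt => lt.Id == gn.Id).Select(lt => lt.timestamp).DefaultIfEmpty().Max()
});
}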
return mergedList;
}
Try it in a Fiddle or throwaway project and tweak it to your needs. A solution in pure LINQ is probably not desired here, for readability and maintainability sake.
And yes, as the comments say do carefully consider whether LINQ is your best option here. While it works, it does not always do better in performance than a "simple" foreach. LINQ's main selling point is and always has been short, one-line querying statements which maintain readability.
Well, having
List<(int id, string name)> groupNames = new List<(int id, string name)>() {
( 1, "Hello"),
( 2, "Hello"),
( 3, "Hey"),
( 4, "Dude"),
( 5, "Dude"),
};
List<(int id, string comments)> countId = new List<(int id, string comments)>() {
( 1 , "test0"),
( 1 , "test1"),
( 2 , "test2"),
( 3 , "test3"),
( 3 , "test4"),
};
List<(int id, int time)> lastTime = new List<(int id, int time)>() {
( 1 , 1636585230 ),
( 1 , 1636585250 ),
( 2 , 1636585240 ),
( 3 , 1636585231 ),
( 3 , 1636585230 ),
( 5 , 1636585330 ),
};
you can, technically, use the Linq below:
var result = groupNames
.GroupBy(item => item.name, item => item.id)
.Select(group => (Name : group.Key,
whateverCnt : group
.Sum(id => countId.Count(item => item.id == id)),
lastTimestamp : lastTime
.Where(item => group.Any(g => g == item.id))
.Max(item => item.time)));
Let's have a look:
Console.Write(string.Join(Environment.NewLine, result));
Outcome:
(Hello, 3, 1636585250)
(Hey, 2, 1636585231)
(Dude, 0, 1636585330)
But be careful: List<T> (I mean countId and lastTime) are not efficient data structures here. In the Linq query we have to scan them in order to get Sum and Max. If countId and lastTime are long, turn them (by grouping) into Dictionary<int, T> with id being Key
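The pre-grouping suggested above could look roughly like the following. This is only a sketch: it reuses the sample tuples from this answer, the names countById and maxTimeById are made up for illustration, and 0 is used as a stand-in for "no timestamp recorded", which is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data copied from the answer above.
var groupNames = new List<(int id, string name)>
{
    (1, "Hello"), (2, "Hello"), (3, "Hey"), (4, "Dude"), (5, "Dude"),
};
var countId = new List<(int id, string comments)>
{
    (1, "test0"), (1, "test1"), (2, "test2"), (3, "test3"), (3, "test4"),
};
var lastTime = new List<(int id, int time)>
{
    (1, 1636585230), (1, 1636585250), (2, 1636585240),
    (3, 1636585231), (3, 1636585230), (5, 1636585330),
};

// One pass over each list: id -> number of rows, and id -> latest timestamp.
// (countById and maxTimeById are hypothetical names.)
Dictionary<int, int> countById = countId
    .GroupBy(item => item.id)
    .ToDictionary(g => g.Key, g => g.Count());
Dictionary<int, int> maxTimeById = lastTime
    .GroupBy(item => item.id)
    .ToDictionary(g => g.Key, g => g.Max(item => item.time));

var result = groupNames
    .GroupBy(item => item.name, item => item.id)
    .Select(g => (
        Name: g.Key,
        // Dictionary lookups replace the repeated list scans.
        whateverCnt: g.Sum(id => countById.TryGetValue(id, out var c) ? c : 0),
        // Assumption: 0 means "no timestamp recorded for any id in this group".
        lastTimestamp: g.Max(id => maxTimeById.TryGetValue(id, out var t) ? t : 0)))
    .ToList();

Console.WriteLine(string.Join(Environment.NewLine, result));
```

Each source list is scanned once to build its dictionary, so the final query does constant-time lookups per id instead of rescanning countId and lastTime for every group.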

Return certain record based on criteria (2)

I asked this question previously, but missed a vital part of my problem.
Return certain record based on criteria
Take this list of results
Client | Date | YESorNO
-------------------------------
A1 | 01/01/2001 | NO
A1 | 01/01/2002 | NO
A1 | 01/01/2003 | YES
A1 | 01/01/2004 | NO
A1 | 01/01/2005 | NO
A1 | 01/01/2006 | NO
A1 | 01/01/2007 | YES
A1 | 01/01/2008 | YES
A1 | 01/01/2009 | YES
A2 | 01/01/2001 | NO
A2 | 01/01/2002 | NO
A2 | 01/01/2003 | YES
A2 | 01/01/2004 | NO
A2 | 01/01/2005 | YES
A2 | 01/01/2006 | YES
A3 | 01/01/2001 | NO
...etc...
The list is ordered chronologically and I cannot sort it in any way other than descending / ascending.
I cannot sort for Yes | NO and find the First() or Last() as this won't give me the required value.
I want to be able to return the first 'YES' after all 'NO's have been accounted for, per Client.
In the above example for Client[A1] row 7 is the record I want returned (on 01/01/2007).
Client[A2] - row 5 (01/01/2005) ..etc
My code is as follows
var query =
(
from m in db.MyTable
where m.Criteria == XYZ
select new
{
Client = m.Client,
Date = m.Date,
YESorNO = m.YESorNO
}
).OrderBy(x => x.Date);
Using .FirstOrDefault(x => x.YesOrNO == "YES") returns the 3rd record.
User @RenéVogt advised that
var result = query.AsEnumerable()
.TakeWhile(x => x.YESorNO == "YES")
.LastOrDefault();
would get the job done and it does, but I forgot to add that the query will be returning many Clients and I need the first 'YES' for each Client, therefore the above code won't suffice.
Iterating over my results would be hugely time consuming and whilst that is a solution I would prefer this logic to be within the database query itself (if possible)
Many thanks
What you have to do is group by client and then, starting from the end, find the last YES of each one. Something like this (ClientList is a List<>; you may have to change it depending on where your data is):
var query = ClientList.OrderBy(x => x.client).ThenBy(x => x.date).GroupBy(x => x.client);
foreach (var client in query)
{
var lastYES=client.Reverse().TakeWhile(x => x.YESorNO == "YES")
.LastOrDefault();
Console.WriteLine(String.Format("{0} {1}",client.Key,lastYES.date));
}
//Output: A1 01/01/2007 0:00:00
// A2 01/01/2005 0:00:00
Edit
Mansur Anorboev rightly suggested ordering by descending date, thus eliminating the need of Reverse, so the code would be:
var query = ClientList.OrderBy(x => x.client).ThenByDescending(x => x.date).GroupBy(x => x.client);
foreach (var client in query)
{
var lastYES=client.TakeWhile(x => x.YESorNO == "YES")
.LastOrDefault();
Console.WriteLine(String.Format("{0} {1}",client.Key,lastYES.date));
}
Edit 2
I was still not completely happy with my solution, as it uses a foreach. This does everything in one LINQ statement:
var query = ClientList.OrderBy(x => x.client)
.ThenByDescending(x => x.date)
.GroupBy(x => x.client, (key, g) => g.TakeWhile(x => x.YESorNO == "YES").LastOrDefault())
.ToList();
This returns a list with one element per client and with the correct date.
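For anyone who wants to try the idea on the question's data, here is a minimal in-memory sketch. The tuple shape, the Add helper and the nullable date are assumptions for illustration; it runs in LINQ to Objects and makes no claim about how a database provider would translate it. A client with no trailing YES gets a null date instead of a default value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory rows shaped like the question's table.
var rows = new List<(string client, DateTime date, string YESorNO)>();

void Add(string client, params string[] flags)
{
    // One row per year, starting at 01/01/2001, as in the question.
    for (int i = 0; i < flags.Length; i++)
        rows.Add((client, new DateTime(2001 + i, 1, 1), flags[i]));
}

Add("A1", "NO", "NO", "YES", "NO", "NO", "NO", "YES", "YES", "YES");
Add("A2", "NO", "NO", "YES", "NO", "YES", "YES");
Add("A3", "NO");

var firstYesAfterLastNo = rows
    .GroupBy(r => r.client)
    .Select(g => (
        Client: g.Key,
        // Walk each client's rows newest-first and keep the leading run of YES;
        // the last element of that run is the oldest YES after the final NO.
        Date: g.OrderByDescending(r => r.date)
               .TakeWhile(r => r.YESorNO == "YES")
               .Select(r => (DateTime?)r.date)
               .LastOrDefault()))
    .ToList();

foreach (var (name, when) in firstYesAfterLastNo)
    Console.WriteLine($"{name}: {when?.ToString("yyyy-MM-dd") ?? "no trailing YES"}");
```

Ordering inside each group, rather than before GroupBy, keeps the sketch independent of whether the grouping step preserves the outer sort order.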
I can provide a little sql query
;WITH cte AS (
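-- number the rows chronologically within each client, so that rn + 1 is the next record of the same client
SELECT *, ROW_NUMBER() OVER (ORDER BY Client, [Date]) AS rn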
FROM [dbo].[tblSkaterhaz]
)
,gte AS (
SELECT Client,max(rn) mx FROM cte
WHERE YesOrNo = 'NO'
GROUP BY Client
)
SELECT cte.* FROM gte
INNER JOIN cte on cte.Client = gte.Client and cte.rn = gte.mx + 1
This is not the requested LINQ solution, but it yields the required result. You can create a stored procedure and use it in your code.
NOTE: This is tested against the same table (and data) mentioned in question above
I hope this will be helpful for you.

How to SUM up results by column value in db query result

My database has a sales table with entries like so:
_____________________________________
| id | title_id | qty |
-------------------------------------
| 0 | 6 | 10 |
-------------------------------------
| 1 | 5 | 5 |
-------------------------------------
| 2 | 6 | 2 |
-------------------------------------
Title_id is Foreign key pointing to Titles table which is as follows:
_____________________________________
| id | title_id | title |
-------------------------------------
| 0 | 5 | Soda |
-------------------------------------
| 1 | 6 | Coffee |
-------------------------------------
I want to find the top 5 sold products, which means I need to sum the qty value for each product over all its entries in the sales table, then order the result by qty in descending order and limit the select to 5.
However, I'm new to C# ASP.NET and somewhat new to SQL. I don't know how to do this with LINQ.
This is my code so far:
var getIds = (from sale in db.sales
join tit in db.titles on sale.title_id equals tit.title_id
group sale by sale.qty into result
orderby result.Sum(i => i.qty) descending
select new Publication
{
PubID = sales.title_id, Title = tit.title
}
).Take(5);
Assuming you have a navigation property Sale.Title, something like this should do:
var tops =
db.Sales
.GroupBy( o => o.Title )
.Select( o => new { Title = o.Key, Sum = o.Sum( x => x.Quantity ) } )
.OrderByDescending( o => o.Sum )
.Take( 5 )
.ToList();
tops is then a list of an anonymous type with two properties: the Title object and the sum of the quantities.
You can then get the values like this:
foreach( var top in tops )
{
int titleId = top.Title.title_id;
string title = top.Title.title;
int sumOfQuantities = top.Sum;
...
}
If you just want the top Title objects, you can select them like this:
List<Title> topTitles = tops.Select( o => o.Title ).ToList();
var result= (from p in sales
let k = new
{
Name = p.Name
}
group p by k into t
// order by the summed quantity; the grouping key is available as t.Key
orderby t.Sum(p => p.Qty) descending
select new
{
Name = t.Key.Name,
Qty = t.Sum(p => p.Qty)
}).Take(5);
If there is more than one entry per item in the Sales table (i.e. in your example you have 'Coffee' 10 + 'Coffee' 2), then you need to GroupBy(), using the name as the key (or its related id if it's in another table), but not the qty.
var topSales = db.sales.GroupBy(x => x.title)
.Select(g => new
{
Title = g.Key,
Qty = g.Sum(x => x.qty)
})
.OrderByDescending(x => x.Qty)
.Select(x => new Publication
{
PubID = x.Title.title_id,
Title = x.Title.title1
})
.Take(5)
.ToList();
Note that I've omitted the join statement, assuming that you have a foreign key between sales.title_id -> title.id and that you are using LINQ to SQL. Also note that I've avoided the query syntax in favor of the chained method syntax; I think it's much clearer in this use case (although that is not always true, e.g. for cross-joins).
Also, SQL and LINQ have some similarities but don't let the names of clauses/methods fool you, LINQ is not SQL, IMHO, Microsoft just tried to make people comfortable by making it look similar ;)
EDIT: fixed GroupBy()
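To see the grouping on the question's sample rows without a database, here is a small LINQ to Objects sketch. The tuple shapes and the explicit Join against titles are assumptions for illustration (the answer above relies on a navigation property instead).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample rows from the question; the tuple shapes are an assumption.
var sales = new List<(int id, int title_id, int qty)>
{
    (0, 6, 10), (1, 5, 5), (2, 6, 2),
};
var titles = new List<(int id, int title_id, string title)>
{
    (0, 5, "Soda"), (1, 6, "Coffee"),
};

var topSales = sales
    .GroupBy(s => s.title_id)                               // group by product, not by qty
    .Select(g => (TitleId: g.Key, Qty: g.Sum(s => s.qty)))  // total quantity per product
    .OrderByDescending(x => x.Qty)
    .Take(5)
    // Look up the title text; in LINQ to Objects, Join keeps the outer order.
    .Join(titles, x => x.TitleId, t => t.title_id,
          (x, t) => (PubID: x.TitleId, Title: t.title, x.Qty))
    .ToList();

foreach (var sale in topSales)
    Console.WriteLine($"{sale.Title}: {sale.Qty}");
```

Grouping by title_id gives one group per product, so the two Coffee rows are summed to 12 before ordering.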
var result= (from p in sales
let k = new
{
Name = p.Name
}
group p by k into t
select new
{
Name = t.Key.Name,
Qty = t.Sum(p => p.Qty)
}).OrderByDescending(i => i.Qty).Take(5);
You need to look at GroupBy; this will give you what you need
http://code.msdn.microsoft.com/101-LINQ-Samples-3fb9811b

NHibernate - LINQ Query Many to Many Issue

I'm trying to upgrade an existing application to use NHibernate. My database has the following tables:
Sites:
- Id (PK)
- Name
Categories:
- Id (PK)
- Name
CategoriesSite
- CategoryId (PK)
- SiteId (PK)
- Active
For each category and site a record may or may not exist in the CategoriesSite table. If an item exists in the CategoriesSite table then it can turn the Category off by setting Active to false. If it doesn't then it assumes Active is true.
I'd like to create a LINQ query in NHibernate to filter for categories of a particular site (that are active). For example say I have the following data:
Sites:
Id | Name
1 | Site 1
2 | Site 2
Categories:
Id | Name
1 | Category 1
2 | Category 2
CategoriesSite:
CategoryId | SiteId | Active
1 | 1 | True
1 | 2 | True
2 | 1 | False
I could say:
var categories = session.Query<CategorySite>()
.Where(s => s.Site.Id == 2 && s.Active)
.Select(s => s.Category)
.ToList();
However this will only get Category 1 and not Category 2 which I'd like it to do. I was wondering if anyone has done anything similar and could suggest either a way to query this or offer any recommendations on how I can map this scenario better.
Without seeing the generated query, I can only guess but, try this instead:
var categories = session.Query<CategorySite>()
.Where(s => s.SiteId == 2 && s.Active) // not s.Site.Id
.Select(s => s.Category)
.ToList();
I think I've solved this. I added a one-to-many collection on Category for the list of CategorySites. This allows me to say:
var categories = session.Query<Category>()
.Where(c => !c.Sites.Any(s => s.Site.Id == 2 && !s.Active))
.ToList();
This means a category is excluded only when it has been explicitly set inactive for the site; it is still returned when no record exists in the CategoriesSite table.
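The "active unless explicitly switched off" rule can be checked against the question's sample data with plain LINQ to Objects. This is only a sketch: the tuple shapes and the ActiveCategoriesFor helper are hypothetical and do not involve NHibernate mappings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data from the question; the tuple shapes are an assumption.
var categories = new List<(int id, string name)>
{
    (1, "Category 1"), (2, "Category 2"),
};
var categorySites = new List<(int categoryId, int siteId, bool active)>
{
    (1, 1, true), (1, 2, true), (2, 1, false),
};

// Hypothetical helper: a category is kept unless a row for this site
// explicitly marks it as inactive. A missing row counts as active.
List<string> ActiveCategoriesFor(int siteId) => categories
    .Where(c => !categorySites.Any(cs =>
        cs.categoryId == c.id && cs.siteId == siteId && !cs.active))
    .Select(c => c.name)
    .ToList();

Console.WriteLine(string.Join(", ", ActiveCategoriesFor(2)));
Console.WriteLine(string.Join(", ", ActiveCategoriesFor(1)));
```

For site 2 both categories come back, because Category 2 has no row for that site; for site 1 only Category 1 does, because Category 2 is explicitly inactive there.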

LINQ GroupBy, whilst keeping all object fields

I've currently got this sample table of data:
ID | Policy ID | History ID | Policy name
1 | 1 | 0 | Test
2 | 1 | 1 | Test
3 | 2 | 0 | Test1
4 | 2 | 1 | Test1
Out of this, I want to group by the Policy ID and History ID (MAX), so the records I want to be kept are ID's 2 and 4:
ID | Policy ID | History ID | Policy name
2 | 1 | 1 | Test
4 | 2 | 1 | Test1
I've tried to do this in LINQ and stumbling on the same issue every time. I can group my entities, but always into a group where I have to re-define the properties, rather than have them kept from my Policy objects. Such as:
var policies = _context.Policies.GroupBy(a => a.intPolicyId)
.Select(group => new {
PolicyID = group.Key,
HistoryID = group.Max(a => a.intHistoryID)
});
This simply just brings out a list of objects which have "Policy ID" and "History ID" within them. I want all the properties returned from the Policies object, without having to redefine them all, as there are around 50+ properties in this object.
I tried:
var policies = _context.Policies.GroupBy(a => a.intPolicyId)
.Select(group => new {
PolicyID = group.Key,
HistoryID = group.Max(a => a.intHistoryID)
PolicyObject = group;
});
But this errors out.
Any ideas?
Group by composite key
_context.Policies.GroupBy(a => new {a.intPolicyId, *other fields*}).Select(
group=> new {
PolicyId = group.Key.intPolicyId,
HistoryId = group.Max(a => a.intHistoryID),
*other fields*
}
);
Another way: grab the history IDs, then join back with the rest of the data, something like this (it won't work out of the box and will require some refining)
var historyIDs = _context.Policies.GroupBy(a=>a.intPolicyId).Select(group => new {
PolicyID = group.Key,
HistoryID = group.Max(a => a.intHistoryID)
});
var finalData = from h in historyIDs
join p in _context.Policies on h.PolicyID equals p.intPolicyId
select new {h.HistoryID, *all other policy fields*}
And yet another way, even simpler and requiring less typing :):
var historyIDs = _context.Policies.GroupBy(a=>a.intPolicyId).Select(group => new {
PolicyID = group.Key,
HistoryID = group.Max(a => a.intHistoryID)
});
var finalData = from h in historyIDs
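// a LINQ join has a single equals, so both columns go into one composite key
join p in _context.Policies
on new { h.PolicyID, h.HistoryID } equals new { PolicyID = p.intPolicyId, HistoryID = p.intHistoryID }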
select p
Basically it's somewhat equivalent to the following SQL query:
select p.*
from Policy p
inner join (
select pi.policyId, max(pi.historyId) as historyId
from Policy pi
group by pi.policyId
) pp on pp.policyId = p.policyId and pp.historyId = p.historyId
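The join-back approach can be tried on the question's four sample rows in memory. This is a sketch under assumptions: policies is a plain list of tuples standing in for _context.Policies, and whether a given database provider translates the same query is not covered here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The question's sample table as tuples (an assumption; stands in for _context.Policies).
var policies = new List<(int ID, int intPolicyId, int intHistoryID, string PolicyName)>
{
    (1, 1, 0, "Test"), (2, 1, 1, "Test"), (3, 2, 0, "Test1"), (4, 2, 1, "Test1"),
};

// Step 1: the highest history ID per policy.
var maxHistory = policies
    .GroupBy(p => p.intPolicyId)
    .Select(g => new { PolicyID = g.Key, HistoryID = g.Max(p => p.intHistoryID) });

// Step 2: join back on both columns to recover the complete rows.
// The two anonymous key objects must use the same property names and types.
var finalData = (
    from h in maxHistory
    join p in policies
        on new { h.PolicyID, h.HistoryID }
        equals new { PolicyID = p.intPolicyId, HistoryID = p.intHistoryID }
    select p).ToList();

foreach (var p in finalData)
    Console.WriteLine($"{p.ID} | {p.intPolicyId} | {p.intHistoryID} | {p.PolicyName}");
```

The result keeps every field of the original rows, so nothing has to be re-declared in the projection.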
In LINQ to Objects, I'd do this as
var policies = _context.Policies
.GroupBy(a => a.intPolicyId)
.Select(g => g.OrderByDescending(p => p.intHistoryID).First());
but your _context implies there might be a database involved and I'm not 100% sure this will translate.
Basically it groups by the policy ID as you'd expect, then within each group orders by history ID and from each group selects the row with the highest history ID. It returns exactly the same type as is found in Policies.
